1 2 Previous Next 21 Replies Latest reply: Jun 26, 2008 6:31 PM by Dean Huxley
R Bryan Harrison Level 2 Level 2 (210 points)
I'm running Apache 2.2 under OS X Server 10.5.2. In the wake of Security Update 2008-002, Apache refuses to run, with the following log entries:

[Wed Mar 19 17:01:16 2008] [notice] Graceful restart requested, doing restart
[Wed Mar 19 17:01:16 2008] [error] (9)Bad file descriptor: aprpollsetpoll: (listen)
pipedlogspawn: unable to setup child process '/usr/sbin/rotatelogs "/Services/LaughingBoot/Sites/Logs/defaultsite/accesslog" 604800 -420': Too many open files
Unable to open logs
[Wed Mar 19 17:01:20 2008] [warn] pid file /var/run/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
[Wed Mar 19 17:01:20 2008] [error] (24)Too many open files: could not create /var/run/httpd.pid
[Wed Mar 19 17:01:20 2008] [error] httpd: could not log pid to file /var/run/httpd.pid

The last three lines repeat, ad infinitum.

I have forty-some sites, and if I disable all but 33 of them, the problem vanishes. Enable 34, and the above fit ensues.

As you can no doubt tell, this all a bit over my head. I'm pounding my head against the espresso machine and shortly will begin moaning about how God hates me.

Anyone care to at least point me in the right direction?

Thanks,
Bryan

G5 DP 2.5 ghz, etc., Mac OS X (10.5.2), OS X Server 10.5.2
  • 1. Re: 420 Too Many Open Files
    Leif Carlsson Level 5 Level 5 (4,950 points)
    How many files does it need to open?

    On my 10.5.2 user

    sysctl -a | grep files


    kern.maxfiles = 12288
    kern.maxfilesperproc = 10240
    kern.maxfiles: 12288
    kern.maxfilesperproc: 10240


    I don't know for sure, but maybe try increase those figures?

    sudo sysctl -w kern.maxfilesperproc=65536 (or some other lower figure)

    This change is temporary. You could set it in a file /etc/sysctl.conf that is read at boot.

    Just add the line:

    kern.maxfilesperproc=65536
  • 2. Re: 420 Too Many Open Files
    R Bryan Harrison Level 2 Level 2 (210 points)
    Thanks very much for the suggestion. Alas, increasing maxfiles and/or maxfilesperproc makes no difference at all.

    Baffling.

    -Bryan
  • 3. Re: 420 Too Many Open Files
    Leif Carlsson Level 5 Level 5 (4,950 points)
    lsof should list open files.

    For lack of finding another way to list open files i did this:

    sudo lsof > openfiles.txt

    Then I opened the openfiles.txt file in TextWrangler and choose "show line numbers" to see how (approximatly) many file were open.
  • 4. Re: 420 Too Many Open Files
    Jans Carton Level 1 Level 1 (0 points)
    We're seeing the same problem.

    Here's a sample from /var/log/httpd/error_log:

    pipedlogspawn: unable to setup child process '/usr/sbin/rotatelogs "/private/var/log/httpd/wwwjanscarton_com/accesslog" 86400 -300': Too many open files
    Unable to open logs

    Here's a sample from the console:

    3/27/08 2:17:09 PM com.apple.launchd[1] (org.apache.httpd[6748]) Stray process with PGID equal to this dead job: PID 6844 PPID 1 rotatelogs

    The two errors seem to be related.

    We're generating separate access and error log files for about 75 sites.

    Upping maxfilesperproc from 10240 to 65536 in sysctl didn't affect the problem.

    This command:

    sudo lsof | wc -l

    revealed that only 1236 files were open anyway. Far from the per process limit.

    Disabling log rotation in the site settings via Server Admin caused the problem to abate, but that's not a workable solution.
  • 5. Re: 420 Too Many Open Files
    Jans Carton Level 1 Level 1 (0 points)
    Gave the wrong path to our Apache error log above. It's actually:

    /var/log/apache2/error_log
  • 6. Re: 420 Too Many Open Files
    Jans Carton Level 1 Level 1 (0 points)
    Anybody else seeing this problem? It a pretty big issue for us.

    Any alternatives to the default log rotation?
  • 7. Re: 420 Too Many Open Files
    Jans Carton Level 1 Level 1 (0 points)
    This should give you a count of open files:

    sudo lsof | wc -l
  • 8. Re: 420 Too Many Open Files
    Hamish Mackellar Level 1 Level 1 (0 points)
    Yeah I got this problem too, I have to turn off my "busy" sites just to run the server. In my Activity Monitor program, while browsing the current processes in order of CUP % the Directory Process takes up all my remaining CPU power. Stopping the web service seems to clear the problem, and on restarting the web service allows for good running for about 15 mins then the web services die again. Stopping and starting the web server also takes about 10 mins. I find it fast to reboot "shutdown -r now" using terminal.

    I am wasting so muck time on this problem. Any feed back would be appreciated.

    System Log file:
    Apr 30 12:50:38 ns1 DirectoryService25: CFPropertyListCreateFromXMLData(): Old-style plist parser: missing semicolon in dictionary.
    Apr 30 12:50:42: --- last message repeated 3 times ---
    Apr 30 12:50:42 ns1 servermgrd50: * -WPSession createRecordWithType:attributes:inNode:error:: * Warning: the record of type dsRecTypeStandard:Config was created successfully, but OpenDirectoryFramework cannot find it. More info: recordName 'ServicesInformation11111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 1111111' in /Local/Default
  • 9. Re: 420 Too Many Open Files
    Hamish Mackellar Level 1 Level 1 (0 points)
    Ok, I may have the answer (for my case anyway). After loads of searching I couldnt find the file used by the DirectoryService program that was causing my problems. I did finally find the error log used by Directory Service. located: /Library/Logs/DirectoryService/DirectoryService.error.log if you issue the following command:
    tail -50 /Library/Logs/DirectoryService/DirectoryService.error.log

    You should get an out put with some minor errors or nothing at all. In my case I got the following:

    2008-05-05 22:49:03 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:49:03 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:49:05 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:49:05 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:50:02 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:50:02 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:50:07 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:50:07 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:50:11 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"
    2008-05-05 22:50:11 EST - T[0xF0103000] - CDSLocalPluginNode::FlushRecord(): fopen() returned NULL while attempting to open 'w' "/var/db/dslocal/nodes/Default/config/ServicesInformation1111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111.plist.temp"

    This error was causing Directory Service to use all my CPU power up and finally kill other services like web and mail. I moved all the files in folder /var/db/dslocal/nodes/Default/config/ using this comand:

    rm /var/db/dslocal/nodes/Default/config/ServicesInformation* /Desktop/Backup

    I did a restart and now my CPU activity is back around 15-20%. I assumed that Directory Service would rebuild the files if it needed them. I am wondering if I have killed my server? But its now working as required and much faster.

    Any feedback would be appreciated.

    Cheers,
    Hamish.
  • 10. Re: 420 Too Many Open Files
    Doug Whitmore Level 2 Level 2 (235 points)
    Your particular issue is a known one. Please feel free to write up another bug report, bug reports = escalations.
    Sorry I can't help you more right now, but we do know about the issue and agree that it is a serious one..
  • 11. Re: 420 Too Many Open Files
    ILBT Level 1 Level 1 (0 points)
    Does anyone have any solution to this problem yet? If you do, please post here as the 'too many open files' problem is crippling our web services!

    I just can't believe how many problems we are having since upgrading to Leopard! Both Client and Server software would have to be the most unstable and buggy software from Apple we I have ever used - and I've been using Apple since the Apple II days! Very disappointing!

    Just to confirm what we have tried as well (none have been successful):
    - Upping ulimit
    - Upping maxfiles/maxfilesperproc
    - Switching off 'Persistent Connections' seems to slow the problem for a few minutes
    - Increasing Max Persistent Connections
    - Decreasing Persistent Connection timeout seems to slow the problem a few minutes
    - Switching off all 'cache' modules in Apache2 slows the problem for a few minutes
    - Increasing the Maximum Simultaneous Connections
    - Increasing (or decreasing) the Maximum Spare Servers

    The '_www' process has ~1600 files open when the problem occurs

    Anyone with any help, please let us know what you did? Due to the time and changes we have implemented, it is almost impossible to go back to 10.3 Server (which was 100% more stable!)
  • 12. Re: 420 Too Many Open Files
    Paul Vail Level 1 Level 1 (135 points)
    I had a related problem.

    Background: 10.4 server running fine for a few years. Migrated to 10.5.0 in November. At no point in time would it allow the Apache migration from 1.3 to 2.x. The 10.4 server had been used strictly via the GUI -- never a CLI alteration to the httpd or any other config. Yet Apple's scripting could not perform the upgrade for Apache versioning. 10.5.1 and 10.5.2 releases behaved the same way. Upon 10.5.3 update, the migration tool magically worked (this past weekend) and the server has been functional. But there are similar errors in my logs. I didn't find them until stumbling across a particularly nasty one today.

    [Fri Jun 13 07:45:30 2008] [warn] No JkShmFile defined in httpd.conf. Using default /usr/logs/jk-runtime-status
    [Fri Jun 13 07:45:31 2008] [warn] The Alias directive in /etc/apache2/sites/0020any_80forums.domain1.com.conf at line 56 will probably never match because it overlaps an earlier Alias.
    [Fri Jun 13 07:45:31 2008] [warn] The Alias directive in /etc/apache2/sites/0052any_80domain2.com.conf at line 58 will probably never match because it overlaps an earlier Alias.
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8087/groups already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] The Alias directive in /etc/apache2/httpdteamsrequired.conf at line 41 will probably never match because it overlaps an earlier Alias.
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8086 already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8087/collaboration-config already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8087/wikihelp already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8087/groups already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] The Alias directive in /etc/apache2/httpdteamsrequired.conf at line 41 will probably never match because it overlaps an earlier Alias.
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8086 already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8087/collaboration-config already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] worker http://127.0.0.1:8087/wikihelp already used by another worker
    [Fri Jun 13 07:45:31 2008] [warn] The Alias directive in /etc/apache2/sites/0070any_80x.domain3.com.conf at line 56 will probably never match because it overlaps an earlier Alias.
    [Fri Jun 13 07:45:31 2008] [warn] The Alias directive in /etc/apache2/sites/0072any_80y.domain4.comt.conf at line 56 will probably never match because it overlaps an earlier Alias.
    [Fri Jun 13 07:45:32 2008] [warn] No JkShmFile defined in httpd.conf. Using default /usr/logs/jk-runtime-status
    [Fri Jun 13 07:45:32 2008] [notice] modscgipubsub: generating secret for digest authentication ...
    [Fri Jun 13 07:45:32 2008] [notice] modscgipubsub: done
    [Fri Jun 13 07:45:32 2008] [notice] Apache/2.2.8 (Unix) PHP/5.2.5 mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 mod_jk/1.2.23 modscgipubsub/1.11-pubsub configured -- resuming normal operations
    [Fri Jun 13 07:45:32 2008] [crit] (24)Too many open files: mod_rewrite: could not init rewrite log lock in child
    (and this last line is repeated five more times)
    [Fri Jun 13 07:51:54 2008] [warn] (13)Permission denied: connect to listener on [::]:80
    [Fri Jun 13 07:52:19 2008] [notice] caught SIGTERM, shutting down

    Today, I had Apache totally quit on me -- completely go deaf, 'though the rest of the server (listserve, ftp) would function. Mine might be a special case. Here's some detail I've gleaned from the logs.

    What is of interest here is that only a few of the domains (from the 80+) that migrated from 1.3 to 2.2.8 are listed above in the errors. In reviewing the logs and comparing them to the GUI settings, the differences between settings for domains that are not reported in error and those that are fall within these tab options:

    Web:Sites:Logging
    Web:Sites:Aliases section for URL Aliases and Redirects

    Those also happen to be the four most recently generated Site entries. If the new site was enabled, the server would crash and fail. If the site was then disabled (enable checkbox cleared), the server would start and run, serving sites properly.

    So something within those few default GUI settings (not matching the bulk of the sites that did work) may have been the cause -- or there are elements in the .conf files for each site that are not reflected in the GUI that differ. After some troubleshooting, I narrowed it down to the logging.

    If one of the 'bad' sites was changed to disable the 'Enable Access Log' by unchecking it, saving this change, then enabling the site, the server would restart gracefully.

    Next, if the 'bad' site adopted the same custom logging format string, was set to the same log roll (archiving) time frame as the others, and enabled, Apache2 would gracefully restart. Apache2 would only fail IF the logging for one site did not match the format for all of the others.

    I'm sure there are other causes for concern, but this is how I solved my issue with Apache crashing when a new site is added to the server.
  • 13. Re: 420 Too Many Open Files
    Mabel O'Farrell Level 3 Level 3 (975 points)
    You have too many sites and their attendant .conf files. You should consider using modvhostaliases along with a common log file for all the hosted sites with your virtual host information indicated the server-wide log file. You should also consider using a server-wide httpd.conf file instead of individual virtual site .conf files.
  • 14. Re: 420 Too Many Open Files
    Paul Vail Level 1 Level 1 (135 points)
    Perhaps he is using the GUI admin, and that it is poorly crafted to support what you dictate. There is nothing wrong with using the VH conf file structures -- Apache should be able to concatenate that without issue. What is the problem is how Apple has permitted the config of the logging. In a robust hosting environment, individual logging (and storing to a client-accessible directory for individual analysis) is not unusual at all. It should be expected -- and it should not die with only 30-some hosted sites. 800 or 1000, maybe, but 30? Unless he has other issues with his installation.

    What Apple should do is rewrite their admin (and their documentation) to either disable this or make it impossible to split out the logs when this 'file threshold' is approached, forcing the server to use one log format for all domains (or one common log and a common error log).

    Apple used to preach that it's the GUI that should do the work, and the underlying code should not get in the way of the user experience or the user's productivity. They have lost their way badly in the OSXS product, broken it in Leopard more than fixed it (relative to Tiger) with regard to Internet services, and they really need to recraft 10.5 to work properly. Not wait for 10.6 -- fix 10.5. Include the obvious management requirements:

    If you are going to have email services, they should support VH accounts and domains properly. Tiger worked, why doesn't Leopard?

    If you are going to support listserves, have them support VH. CompanyB doesn't want to have to ride on CompanyA's name in a listserve environment. The current Mailman can do it, but Apple has 1 yr old technology under the hood that hasn't kept up.

    If you are going to use PHP5 (or PHP4), then simple admin stuff like max file size management should be a given in the GUI, not a .ini file to be hacked out in an invisible directory via uploadmaxfilesize = . This has gone begging for nearly 4 years now.

    If you are going to support multiple accounts via the WGM, there should be a GUI method to point additional account to an existing home, or subdirectories within a home/Sites/ folder to allow fine-tuned FTP access. Many companies have different divisions work different subdirectories in a website, and one master account doesn't cut it.

    There's a whole lot more that is a mess in this product. It is NOT acceptable to say one needs to CLI to make it work. Why bother to sell a server product if the minute it is deployed, the admin has to be abandoned to get any functionality? I invite any viewer here to read the Apple Human Interface Guidelines from the 80s or early 90s. That philosophy is what should drive modern computing -- not a cobbled together half-functional admin GUI.

    Sorry to seem pissy, but there needs to be a wakeup call to Apple to get their server dev group in order (or let them do their work and stop pulling those resources over to the i-product of the moment). Jobs: It's the user experience that matters. Even here.
1 2 Previous Next