
For all of you having services failing to start

47166 Views 117 Replies Latest reply: Sep 6, 2012 5:48 PM by paulfromburwood RSS
  • pattieja1 Level 1 (0 points)

    I encountered this same issue but noticed the following log entry, which was generated shortly after the system killed postgresql:

     

    2011-08-08 12:33:15 MST LOG:  last_statrequest 2011-08-08 12:37:46.323678-07 is later than collector's time 2011-08-08 12:33:15.064742-07

     

    I checked my Date/Time settings and it wasn't set to use NTP, so I enabled that option, rebooted and postgresql came up just fine.  Now I'm having to troubleshoot database corruption issues (there appeared to be an auto migration that moved the pgsql directory out of the way and calDAV got messed up for authentication and all calendars are now missing, etc.).
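The log line quoted above is the PostgreSQL stats collector reporting a timestamp from the future, which fits a clock that jumped. A minimal sketch for checking how often that warning appears follows. The default log path is an assumption, so pass the real location as the first argument; `count_clock_skew` is a hypothetical helper name.

```shell
#!/bin/sh
# Count clock-skew warnings from the PostgreSQL stats collector in a log file.
# The default path is an assumption; pass the actual log file as $1.
count_clock_skew() {
    log_file="${1:-/Library/Logs/PostgreSQL.log}"
    # grep -c prints the number of matching lines (0 if none).
    grep -c "is later than collector's time" "$log_file"
}

# On OS X, systemsetup can report or enable network time from Terminal:
#   sudo systemsetup -getusingnetworktime
#   sudo systemsetup -setusingnetworktime on
```

A non-zero count suggests checking the Date/Time settings before looking elsewhere.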

     

    Hope that helps those who aren't getting this resolved.

  • Beno 44 Level 1 (15 points)
    Aug 11, 2011 7:25 PM (in response to pattieja1)

    Nothing has worked for me so far.

     

    If there are any other suggestions out there...

     

    Thanks

    Ben

  • Beno 44 Level 1 (15 points)
    Aug 11, 2011 11:17 PM (in response to Beno 44)

    Here are the logs specific to the PostgreSQL database startup:

     

    2011-08-12 13:30:29 EST LOG:  database system was shut down at 2011-08-12 13:15:10 EST

    2011-08-12 13:30:29 EST LOG:  database system is ready to accept connections

    2011-08-12 13:30:29 EST LOG:  autovacuum launcher started

    2011-08-12 13:30:30 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:30 EST LOG:  connection authorized: user=collab database=collab

    2011-08-12 13:30:32 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:32 EST LOG:  connection authorized: user=collab database=collab

    2011-08-12 13:30:35 EST ERROR:  duplicate key value violates unique constraint "global_settings_pkey"

    2011-08-12 13:30:35 EST DETAIL:  Key (key)=(com.apple.setting.schema_version) already exists.

    2011-08-12 13:30:35 EST STATEMENT:  INSERT INTO global_settings (key, value_int) VALUES ('com.apple.setting.schema_version', $1)

    2011-08-12 13:30:35 EST LOG:  unexpected EOF on client connection

    2011-08-12 13:30:36 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:36 EST LOG:  connection authorized: user=collab database=collab

    2011-08-12 13:30:36 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:36 EST LOG:  connection authorized: user=_devicemgr database=device_management

    2011-08-12 13:30:37 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:37 EST LOG:  connection authorized: user=collab database=collab

    2011-08-12 13:30:41 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:41 EST LOG:  connection authorized: user=caldav database=caldav

    2011-08-12 13:30:46 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:46 EST LOG:  connection authorized: user=_devicemgr database=device_management

    2011-08-12 13:30:56 EST LOG:  connection received: host=[local]

    2011-08-12 13:30:56 EST LOG:  connection authorized: user=_devicemgr database=device_management

    2011-08-12 13:31:04 EST LOG:  connection received: host=[local]

    2011-08-12 13:31:04 EST LOG:  connection authorized: user=_postgres database=postgres

    2011-08-12 13:31:07 EST LOG:  unexpected EOF on client connection

    2011-08-12 13:31:07 EST LOG:  unexpected EOF on client connection

    2011-08-12 13:31:07 EST FATAL:  terminating connection due to administrator command

    2011-08-12 13:31:07 EST LOG:  received smart shutdown request

    2011-08-12 13:31:07 EST LOG:  autovacuum launcher shutting down

    2011-08-12 13:31:07 EST LOG:  shutting down

    2011-08-12 13:31:07 EST LOG:  database system is shut down

     

    It appears that something is wrong with com.apple.setting.schema_version. I cannot find this file anywhere on the server.
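Going by the STATEMENT line in the log, com.apple.setting.schema_version looks like a row key in the global_settings table of the collab database rather than a file on disk. A minimal sketch of a read-only query for that row follows. The table and column names come from the logged INSERT; `schema_version_query` is a hypothetical helper, and the psql invocation in the comment assumes psql is installed and that the collab role can connect locally.

```shell
#!/bin/sh
# Print a read-only query for the row named in the log's DETAIL line.
# Table and column names are taken from the logged INSERT statement.
schema_version_query() {
    printf '%s\n' "SELECT key, value_int FROM global_settings WHERE key = 'com.apple.setting.schema_version';"
}

# Hypothetical invocation (assumes psql is available and the collab role
# can connect over the local socket, as the log lines suggest):
#   psql -U collab -d collab -c "$(schema_version_query)"
```

If the row is already there, the failed INSERT is consistent with a setup step being re-run against an existing database.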

     

    Thanks

  • George Wilde Level 1 (5 points)

    I have a new Mac Mini Server with Lion Server pre-installed. I did a brand new setup, not a migration from my previous Snow Leopard Server. Setup of all services went fine using the Profile Manager. However, upon reboot I lost several services and couldn't restart them manually. Unchecking the Dedicate option and rebooting allowed me to restart the services like you indicated, without losing any setup data. I only used the Server app (and not Server Administrator) in my setup, so it was a fairly straightforward installation. I have all services running in Server App other than Time Machine. I am using Remote Management with Apple Remote Desktop and I have Remote Login and File Sharing activated. Otherwise all other System Preferences are off.

     

    This is obviously a fairly serious bug in the Lion Server if you can't rely on it returning to its previous state upon reboot. I am also experiencing some periodic iCal, Address Book, and iChat lost connections with the Lion Server on a Snow Leopard client. The connections work fine for a while and then just refuse to connect.

     

    George Wilde

  • Teknologist

    Hi All,

     

    Taking some time out of my holiday to give some feedback. I can confirm turning dedicated server on/off didn't solve the problem. That was only a coincidence triggered by the needed reboot.

     

    First, it seems something is really broken in Lion Server. Judging from all the responses to this discussion and from similar posts, many people are hit by these service bugs.

     

    Second, I really think it has to do with Postgres or Open Directory (after setting up a new service that does an OD schema change).

     

    Third, it always breaks after a reboot.

     

    For me, the second thing (not enabling Xgrid and Podcast Producer at all) really did the trick. The server has been running for 15 days without a glitch, even after a few reboots. Nevertheless, Profile Manager's profiles install on Lion or iOS devices, but these can't enroll because of invalidly signed profiles (see my other threads). It seems to have something to do with certs not being OK in OD, but I don't know how to solve this.

     

    I suggest we have a little poll here:

     

    1) Who is using Open Directory?

    2) Who has Xgrid and/or Podcast Producer on?

    3) Do you have SSL certificates installed? Commercial or self-signed?

    4) Moreover, in what order did you configure services ?

     


    We need to hope that Apple will sort these out in 10.7.1/10.7.2. A 10.7.2 developer seed is out, but I am too afraid to test it on my server... ;-)

     

    Being an Apple user for more than a decade, I am really astonished that Apple has released this. I am used to having rock-solid software when it comes to Apple. I am really surprised.

  • Beno 44 Level 1 (15 points)

    I managed to get it going by re-running the Postgres initial install command and removing the collabd file that was causing Postgres to crash.

     

    After a restart, back to square one.

     

    Reinstalling with Time Machine brought the problem back straight away.

     

    I had to revert to 10.6.8.

     

    A problem on a server is never nice and when it affects ical/addressbook/web/wiki... it's not manageable.

     

    Looking forward to seeing if 10.7.1 brings some stability.

     

    Have a great day.

    Ben

  • rhearob Level 1 (5 points)

    I had the same problem starting this afternoon.  I was advised to activate the "Dedicate Resources" function and it caused all manner of breakage.  Postgres is now completely unstable and shuts down randomly.  The server was running fine with no issues.  Now it's a complete joke.

     

    Beno, how did you reinitialize Postgres?  Which collabd file was giving you issues?

  • Beno 44 Level 1 (15 points)
    Aug 15, 2011 4:16 AM (in response to rhearob)

    Hi rhearob,

     

    As mentioned I've had to revert to Snow Leopard temporarily.

     

    I found the command by running this in Terminal:

     

    locate postgre

    This returns lots of results, but one of them is stored in /usr/share/....rb or rbs

     

    The title is self-explanatory, something along the lines of initial Postgres install.

     

    When I ran this I got a fair number of error messages saying that files were already there, including the collabd file.

     

    I moved the collabd file to the desktop, re-ran the command, and it worked.

     

    After a restart it stopped working again.

     

    Sorry for not being more specific but, as mentioned, I no longer have the Lion install to look at.

     

    Let me know if you have any other questions.

  • Benezet

    I too am having all the issues discussed in this thread. I have noticed that PostgreSQL, in my case, is a casualty of the calendar service going down first.  Once iCal Server fails, just opening Server.app causes an error that shuts down Postgres (see below).

     

    2011-09-01 07:12:43 CDT LOG:  connection authorized: user=_postgres database=postgres <-- opened Server.app here

    2011-09-01 07:12:45 CDT LOG:  unexpected EOF on client connection

    2011-09-01 07:12:45 CDT LOG:  unexpected EOF on client connection

    2011-09-01 07:12:45 CDT FATAL:  terminating connection due to administrator command

    2011-09-01 07:12:45 CDT FATAL:  terminating connection due to administrator command

    2011-09-01 07:12:45 CDT FATAL:  terminating connection due to administrator command

    2011-09-01 07:12:45 CDT FATAL:  terminating connection due to administrator command

    2011-09-01 07:12:45 CDT FATAL:  terminating connection due to administrator command

    2011-09-01 07:12:45 CDT LOG:  received smart shutdown request

    2011-09-01 07:12:45 CDT LOG:  autovacuum launcher shutting down

    2011-09-01 07:12:45 CDT FATAL:  terminating connection due to administrator command

    2011-09-01 07:12:45 CDT LOG:  shutting down

    2011-09-01 07:12:45 CDT LOG:  database system is shut down

    2011-09-01 08:57:49 CDT LOG:  database system was shut down at 2011-09-01 07:12:45 CDT

     

    This in turn causes all the services which rely on the database to display the "Error reading settings" error. Fortunately I'm no longer losing the "collab" and "caldav" roles in Postgres and all the services will restore after two or three reboots (but never after just one). I have made the Apple enterprise support group aware of this issue. The senior advisors were stumped and have notified the engineers. Once I receive an answer I will post their solution back to this thread.
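In the excerpt above, the last authorized login before the shutdown request is user=_postgres. A minimal sketch for pulling that pairing out of a longer log follows, assuming the log uses the same line format as the excerpt; `last_login_before_shutdown` is a hypothetical helper name.

```shell
#!/bin/sh
# For every "smart shutdown" in a PostgreSQL log, print the most recent
# "connection authorized" line that preceded it.
last_login_before_shutdown() {
    awk '
        /connection authorized:/          { last = $0 }
        /received smart shutdown request/ { if (last != "") print last }
    ' "$1"
}

# Hypothetical use (pass the actual log file location):
#   last_login_before_shutdown /path/to/PostgreSQL.log
```

This only shows which login came last before each shutdown request; it does not prove that client sent the shutdown.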

  • orsmo Level 1 (0 points)

    I've discovered a way to recover from this condition reliably.  I have not discovered the true root cause of it, but I felt it best to share what I've learned sooner rather than later.  I'm going to launch into some exposition for a moment though, so if you just want the solution, skip ahead.

     

    I'm a Unix administrator with an extensive (nearly 20 year) background in running things like databases and web services (and quite a bit besides).  I'm a huge fan of Apple generally, and in particular I love Mac OS X.  I've given up Windows and even Linux on my own personal machines and I think that Lion is a fabulous release on the whole.  However, Server has been nothing but trouble at every step.

     

    This problem in particular, services suddenly failing to start after a reboot and then refusing to be started manually, is my biggest reason for that last statement.  I tried everything I could think of, but the only thing I ever seemed to find that worked was to reinstall Lion fresh, reinstall Server, and step through the entire configuration of every service from scratch.

     

    After doing this 5 times, I just couldn't take it anymore.  Thanks to this thread and some other information I found online and a very strong desire to figure this out, I set to work looking at all the possibilities I could find.

     

    The main problem seems to be that after a reboot, sometimes even if no obvious configuration changes have been made, PostgreSQL fails to start.  Since this is a dependency of a great many of the other services, either by being where they store their data, or where they store configuration information, this is a serious problem.  To make things worse, manually starting postgres with 'sudo serveradmin start postgres' not only didn't work, but didn't seem to log anything into the PostgreSQL.log file either.

     

    It looks like Apple decided to do something with service account names for what I'm guessing are security reasons.  The postgres account is really called _postgres.  In my effort to troubleshoot, I decided I'd simply 'su _postgres -c postgres [complete options from the settings]' and add the option to set the debug level to 5.  Unfortunately whatever the security of the underscore does, it appears to make it so su'd commands don't do a thing.  I mean that literally too.  You don't get an output of "foo" if you  "su _postgres -c echo foo".  So what to do?
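Two things may explain the silence, and both are assumptions rather than anything confirmed in this thread. First, service accounts are commonly given a non-interactive login shell, so su has no real shell to hand the command to. Second, without quotes, -c receives only "echo" and "foo" becomes a positional argument. The sketch below demonstrates the quoting effect with plain sh and checks the account's shell only where dscl exists.

```shell
#!/bin/sh
# Unquoted: -c receives only "echo"; "foo" becomes $0, so a blank line prints.
unquoted=$(sh -c echo foo)
# Quoted: the whole command string reaches the shell.
quoted=$(sh -c 'echo foo')
printf 'unquoted=[%s] quoted=[%s]\n' "$unquoted" "$quoted"

# On OS X, show which login shell the service account has. The _postgres
# account name comes from the post; the guard skips this on other systems.
if command -v dscl >/dev/null 2>&1; then
    dscl . -read /Users/_postgres UserShell || true
fi

# sudo -u runs the command directly instead of through the account's login
# shell, so this form is commonly used for service accounts:
#   sudo -u _postgres sh -c 'echo foo'
```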

     

    I knew I couldn't run postgres as root. It will complain about how horrible an idea that is and refuse to start.  It wants an unpriv'd account.  So I used a regular user account.  I did a "chown -R someuser /var/pgsql" and tried to start up postgres pointed at the directory and with debugging set to 5.  Aha!  It tried to start and in the debugging output complained about postmaster.pid existing and needing to be deleted!

     

    So I went to delete it only to discover that despite owning the file, I got permission denied.  Sure enough, "ls -le" revealed that the file had an ACL on it that denied everyone pretty much any access to the file.  I removed the ACL and tried again.  Still permission denied.  Oh! The directory had an ACL too that did essentially the same. ACL removed, try again.  Bingo! The next start of postgres made it past the pid file problem... only to blow up on permissions on another file deeper into the tree.
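To see how widespread the ACLs are before removing anything, the entries can be counted from a recursive long listing. This is a minimal sketch that assumes the OS X listing format, where each ACL entry sits on its own line under the file and starts with an index and a colon. `count_acl_entries` is a hypothetical helper name and the sample entry in the comment is illustrative only.

```shell
#!/bin/sh
# Count ACL entries in `ls -le` style output read from stdin.
# On OS X each entry appears under its file on a line such as
# " 0: group:everyone deny delete" (illustrative example).
count_acl_entries() {
    # grep -c prints the number of matching lines (0 if none).
    grep -c -E '^ *[0-9]+: '
}

# Hypothetical use on the server:
#   ls -leR /var/pgsql | count_acl_entries
```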

     

    "That's it! I've had it!" And so I recursively removed the ACLs from the entire /var/pgsql tree. Postgres started with my user account.

     

    OK, time to try the same with the _postgres account. I restored the original files from a Time Machine backup and started in on my workaround...

     

     

    ***BEGIN WORKAROUND***

     

    sudo su -

    cd /var

    chmod -R -a# 0 pgsql

    serveradmin start postgres

    exit

     

    ***END WORKAROUND***

     

    Postgres should start, and all of your configured services which depend on it should now too.
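A hedged variant of the workaround, for anyone who wants to preview the commands first: `chmod -a# 0` removes only the ACL entry at index 0, while the OS X `chmod -N` option removes the whole ACL, which may matter if files carry more than one entry. `reset_pgsql_acls` and the DRY_RUN variable are hypothetical names. The sketch prints the commands by default and only runs them when DRY_RUN=0 is set, as root.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default) or run the ACL reset from the workaround.
# chmod -N removes the entire ACL from each file; "-a# 0" removes only the
# entry at index 0.
reset_pgsql_acls() {
    data_dir="${1:-/var/pgsql}"
    run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }
    run chmod -R -N "$data_dir"
    run serveradmin start postgres
}

# Preview:  reset_pgsql_acls /var/pgsql
# Execute:  DRY_RUN=0 reset_pgsql_acls /var/pgsql   (as root)
```

Taking a copy of the data directory beforehand keeps the original ACLs recoverable if the reset does not help.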

     

    Of course all this begs the question, "Why is this happening in the first place?"  I'm not sure.  In one of my attempts to fix things, I did try using Disk Utility to repair permissions, but that didn't help.  Someone on the forum suggested permissions issues and mentioned a job that runs nightly to change permissions, but I've not investigated such a possibility.  I suspect that something like this is what is happening to cause it because some people report the problem on reboot even after making no changes to the config between reboots.  It is possible though that this is a red herring and that it only happens when a certain service is enabled, or when a certain option is set.  I really don't know.

     

    Moreover, I've only *just* put this fix into place.  It is entirely possible that I'll reboot my machine in a few hours or a few weeks and it will suffer the same problem again.  If anyone has a good clue as to the cause, I'd love to hear about it. In the meantime, at least I have a workaround when the problem recurs.

  • rhearob Level 1 (5 points)

    Actually for me the workaround is much simpler, and I think I have a clue to the root cause.  As I dug through all of the logs I kept noticing a huge number of certificate and SSL related errors.  My workaround was this:

     

    1.  Set all SSL certificates to "None" in Serveradmin.app

    2.  Restart Server

    3.  Once the server boots back up, reset your server certificate to the correct one(s).

     

    The problem with PostgreSQL not starting went away as did all of the related issues.

     

    There is some serious bad mojo in the way serveradmin writes out its config.  Hope this helps.

  • Brian Brumfield Level 1 (130 points)

    There are a number of threads open about this issue, but people seem to be hitting it from the perspective of whichever symptom it is that they notice. We seem to agree that postgres appears to be at the epicenter of the issue.

     

    In a nutshell, it seems that what is happening is that some services that rely on postgres are going into a race condition at shutdown/reboot, and as a result postgres is not shut down cleanly. Upon restart, postgres goes into a recovery condition, renames the old /var/pgsql directory to a "/var/pgsql-pre-recovery-[timestamp]" folder, and then tries to restart.

     

    This sabotages everything that relies on postgres ... podcast producer, certificates, xgrid, wiki, caldav, web, and so on.

     

    I've noticed that when this happens there are /etc/apache2, /etc/certificates and /var/pgsql "pre-recovery" directories created, and the system then has total amnesia about the previous configurations and data.

     

    Two things I've done that seem to work.

     

    1. Use serveradmin to stop postgres, then move /var/pgsql to /var/pgsql-recovery, and my latest /var/pgsql-pre-recovery-[timestamp] folder back to /var/pgsql ... then restart postgres. This restored everything.

     

    2. Before I rebooted, I manually stopped all of the services that I could think of that relied on postgres, and then postgres itself. I double-checked with a 'serveradmin fullstatus postgres' to make sure it was down gracefully, and then rebooted.
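The directory swap in step 1 can be previewed before touching anything. This is a minimal sketch that prints the commands rather than running them. It assumes the timestamp suffix sorts chronologically as text, which this thread does not confirm, and `latest_pre_recovery` and `print_restore_plan` are hypothetical helper names.

```shell
#!/bin/sh
# Print the last pre-recovery directory in glob (sorted) order.
# Assumes the timestamp suffix sorts chronologically as text.
latest_pre_recovery() {
    parent="${1:-/var}"
    newest=""
    for dir in "$parent"/pgsql-pre-recovery-*; do
        if [ -d "$dir" ]; then newest="$dir"; fi
    done
    printf '%s\n' "$newest"
}

# Print the swap described in step 1 without executing it.
print_restore_plan() {
    parent="${1:-/var}"
    newest=$(latest_pre_recovery "$parent")
    if [ -z "$newest" ]; then
        echo "no pre-recovery directory found" >&2
        return 1
    fi
    echo "serveradmin stop postgres"
    echo "mv $parent/pgsql $parent/pgsql-recovery"
    echo "mv $newest $parent/pgsql"
    echo "serveradmin start postgres"
}
```

Reviewing the printed plan first makes it easier to confirm that the chosen directory really is the latest copy.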

     

    This is probably overkill, but for the first time, I rebooted and DNS was up - and everything else came up cleanly - I just had to manually restart a few things (ical, wiki, profile manager, xgrid).
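The pre-reboot routine described above could be wrapped up roughly as follows. It is a sketch only: the service names in the usage comment are assumptions (`serveradmin list` prints the names available on a given server), and `stop_before_reboot` and DRY_RUN are hypothetical names. By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Stop the named dependent services first, then postgres, then show its
# status. Prints the commands unless DRY_RUN=0 is set.
stop_before_reboot() {
    run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }
    for service in "$@"; do
        run serveradmin stop "$service"
    done
    run serveradmin stop postgres
    run serveradmin fullstatus postgres
}

# Hypothetical use (service names are assumptions):
#   stop_before_reboot calendar wiki devicemgr
```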

     

    I submitted this as a bug report today as well. Good luck with the madness!

  • andrew2011 Level 1 (0 points)

    I have also had the 'Error reading settings' problem in Profile Manager, despite trying everything in the discussions and clean reinstalls (which work for a little while only).

     

    It seems that various different fixes work for some people but not others; and the underlying cause of the problem has not been resolved.

     

    There are now numerous threads on this problem (there are yet others with similar problems):

     

    https://discussions.apple.com/thread/3189397

    https://discussions.apple.com/thread/3195100

    https://discussions.apple.com/thread/3212015

    https://discussions.apple.com/thread/3208533

    https://discussions.apple.com/thread/3249062

    https://discussions.apple.com/thread/3199734

    https://discussions.apple.com/thread/3212304

     

    I have posted this in each to try and pull things together a bit.

     

    Does anyone know if Apple has acknowledged the issue and offered an official response?

  • Joe Pyrdek Level 1 (135 points)
    Oct 12, 2011 9:34 PM (in response to andrew2011)

    Got word from our Apple sales rep that 10.7.2 has been released for distribution.  I have not had a chance to download it so I can't report on what it may have fixed (or broken) with the new update.  Has anyone given it a try yet?  If so, what's it look like?
