-
All replies
-
Helpful answers
-
-
Aug 17, 2014 5:21 AM in response to Linc Davisby Kevin Neal,I'm just restarting the server again at the moment so I'll gather the logs and post in a few minutes.
To be honest I was hoping you would respond !
In the meantime here is a bit more info.
I make a daily clone of my boot drive using CCC this clone is working perfectly even though it performed a clone after the main boot drive started to show this error, yet if I try to restore the clone the error is still present on the newly cloned drive, which is completely baffling me, in the remote possibility that it is a hard wear issue I have just restored the working clone to a completely different drive, and the error is again present when I boot into it. Which leaves me in a pretty scary position of having just one good copy of the system which I can't restore.
This has happened to me a few time and before I have got back up and running by destroying OD and remaking certificates, but I've just tried that this time and it isn't working
Sorry for waffling I know you like short answers, but this has got me panicking as I'm off on holiday in a few days and leaving behind a company working on a dodgy server!
Ok logs ...
System Log
Aug 17 13:09:45 xserve.xxxxxxxxxxxx.co.uk sandboxd[2560] ([2548]): collabpp(2548) deny file-read-data /Library/Preferences/.GlobalPreferences.plist
Aug 17 13:09:48 --- last message repeated 1 time ---
Aug 17 13:09:48 xserve com.apple.launchd[1] (com.apple.collabauthd[2608]): Exited with code: 1
Aug 17 13:09:48 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 7 seconds
Aug 17 13:09:52 xserve.xxxxxxxxxxxx.co.uk genatsdb[2576]: *GENATSDB* FontObjects generated = 349
Aug 17 13:09:53 xserve.xxxxxxxxxxxx.co.uk com.apple.time[2492]: Interval maximum value is 946100000 seconds (specified value: 9223372036854775807).
Aug 17 13:09:57 xserve com.apple.launchd[1] (com.apple.collabauthd[2636]): Exited with code: 1
Aug 17 13:09:57 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 9 seconds
Aug 17 13:10:08 xserve com.apple.launchd[1] (com.apple.collabauthd[2659]): Exited with code: 1
Aug 17 13:10:08 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 9 seconds
Aug 17 13:10:19 xserve com.apple.launchd[1] (com.apple.collabauthd[2686]): Exited with code: 1
Aug 17 13:10:19 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 9 seconds
Aug 17 13:10:24 xserve.xxxxxxxxxxxx.co.uk BPPortsd[77]: Set Last IP Updatedate: 2014-08-17 12:10:24 +0000
Aug 17 13:10:30 xserve com.apple.launchd[1] (com.apple.collabauthd[2741]): Exited with code: 1
Aug 17 13:10:30 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 9 seconds
Aug 17 13:10:41 xserve com.apple.launchd[1] (com.apple.collabauthd[2768]): Exited with code: 1
Aug 17 13:10:41 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 9 seconds
Aug 17 13:10:45 xserve.xxxxxxxxxxxx.co.uk screensharingd[2781]: Authentication: SUCCEEDED :: User Name: artisan :: Viewer Address: 192.168.181.205 :: Type: DH
Aug 17 13:10:52 xserve com.apple.launchd[1] (com.apple.collabauthd[2792]): Exited with code: 1
Aug 17 13:10:52 xserve com.apple.launchd[1] (com.apple.collabauthd): Throttling respawn: Will start in 9 seconds
Collaboration Daemon Log
Aug 17 13:11:58 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSExecutionTimer:14 15e2000 +0ms] TIMER: 42ms ---> Executed batch BatchRequest{(
"Request{ContentService.entityForGUID:(df68acb4-309a-409b-af4c-3193b98b92e1)}",
"Request{ContentService.canI:entityWithGUID:(read,df68acb4-309a-409b-af4c-3193b 98b92e1)}",
"Request{ContentService.canI:entityWithGUID:(write,df68acb4-309a-409b-af4c-3193 b98b92e1)}",
"Request{ContentService.canI:entityWithGUID:(own,df68acb4-309a-409b-af4c-3193b9 8b92e1)}"
)}
Aug 17 13:11:58 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:30 15e2000 +0ms] Detaching thread to finish 1 deferred blocks
Aug 17 13:12:29 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSSessionService:152 155f000 +0ms] willExecuteRequest: Request{SessionService.enterMagicalAuthRealm()} [ROOT]
Aug 17 13:12:29 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:250 155f000 +63ms] TIMER: 63ms ---> Executed Request{SessionService.enterMagicalAuthRealm()}
Aug 17 13:12:29 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSContentService:152 15e2000 +0ms] willExecuteRequest: Request{ContentService.processNotificationsForBatch:(D6DE9649-2865-4D18-95B9-F9 1C79FA4A80-318-000000C4DA505D48)} [ROOT]
Aug 17 13:12:29 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Warning>: [CSContentService:47 15e2000 +15ms] Detected Magic Superuser Auth Token
Aug 17 13:12:29 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:250 15e2000 +1ms] TIMER: 1ms ---> Executed Request{ContentService.processNotificationsForBatch:(D6DE9649-2865-4D18-95B9-F9 1C79FA4A80-318-000000C4DA505D48)}
Aug 17 13:12:29 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:30 15e2000 +0ms] Detaching thread to finish 1 deferred blocks
Aug 17 13:14:54 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSSessionService:152 155f000 +0ms] willExecuteRequest: Request{SessionService.currentOrNewSession()} [ROOT]
Aug 17 13:14:54 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:250 155f000 +26ms] TIMER: 26ms ---> Executed Request{SessionService.currentOrNewSession()}
Aug 17 13:14:54 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSSettingsService:152 16e8000 +0ms] willExecuteRequest: Request{SettingsService.globalSettingForKey:(projectCreators)} [ROOT]
Aug 17 13:14:54 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:250 16e8000 +5ms] TIMER: 6ms ---> Executed Request{SettingsService.globalSettingForKey:(projectCreators)}
Aug 17 13:14:59 xserve.xxxxxxxxxxxx.co.ukcollabd[319] <Info>: [CSSessionService:152 15e2000 +0ms] willExecuteRequest: Request{SessionService.currentOrNewSession()} [ROOT]
Aug 17 13:14:59 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:250 15e2000 +72ms] TIMER: 72ms ---> Executed Request{SessionService.currentOrNewSession()}
Aug 17 13:14:59 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSSettingsService:152 16e8000 +0ms] willExecuteRequest: Request{SettingsService.globalSettingForKey:(projectCreators)} [ROOT]
Aug 17 13:14:59 xserve.xxxxxxxxxxxx.co.uk collabd[319] <Info>: [CSServiceDispatcher:250 16e8000 +5ms] TIMER: 6ms ---> Executed Request{SettingsService.globalSettingForKey:(projectCreators)}
-
Aug 17, 2014 5:30 AM in response to Kevin Nealby Kevin Neal,As far as I can tell everything else on the server is working normally except the wiki "Log in to access more services" and Profile manager web page, both just hang indefinitely instead of bringing up the login window
I have tried:
unloading and loading com.apple.collabauthd
Deleting Server.app and replacing it
Destroying OD and restoring from backup ( has worked in the past )
-
Aug 17, 2014 5:34 AM in response to Kevin Nealby Kevin Neal,This log may also be relevant:
[Sun Aug 17 12:45:56 2014] [notice] FastCGI: process manager initialized (pid 350)
[Sun Aug 17 12:45:56 2014] [warn] RSA server certificate CommonName (CN) `xserve.xxxxxxxxx.co.uk' does NOT match server name!?
[Sun Aug 17 12:45:56 2014] [warn] RSA server certificate CommonName (CN) `xserve.xxxxxxxxx.co.uk' does NOT match server name!?
[Sun Aug 17 12:45:56 2014] [warn] Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Sun Aug 17 12:45:58 2014] [notice] Apache/2.2.26 (Unix) DAV/2 PHP/5.3.28 mod_fastcgi/2.4.6 mod_ssl/2.2.26 OpenSSL/0.9.8y configured -- resuming normal operations
[Sun Aug 17 12:59:13 2014] [crit] (17)File exists: mod_rewrite: Parent could not create RewriteLock file /var/log/apache2/rewrite.lock
Configuration Failed
[Sun Aug 17 12:59:21 2014] [warn] RSA server certificate CommonName (CN) `xserve.xxxxxxxxx.co.uk' does NOT match server name!?
[Sun Aug 17 12:59:21 2014] [warn] RSA server certificate CommonName (CN) `xserve.xxxxxxxxx.co.uk' does NOT match server name!?
[Sun Aug 17 12:59:21 2014] [warn] Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Sun Aug 17 12:59:22 2014] [warn] module fastcgi_module is already loaded, skipping
Warning: DocumentRoot [/Library/Server/Web/Data/Sites/mail.xxxxxx.co.uk] does not exist
httpd: apr_sockaddr_info_get() failed for xserve.xxxxxxxx.co.uk
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[Sun Aug 17 12:59:31 2014] [notice] FastCGI: process manager initialized (pid 337)
[Sun Aug 17 12:59:32 2014] [warn] RSA server certificate CommonName (CN) `xserve.xxxxxxxxx.co.uk' does NOT match server name!?
[Sun Aug 17 12:59:32 2014] [warn] RSA server certificate CommonName (CN) `xserve.xxxxxxxxx.co.uk' does NOT match server name!?
[Sun Aug 17 12:59:32 2014] [warn] Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Sun Aug 17 12:59:33 2014] [notice] Apache/2.2.26 (Unix) DAV/2 PHP/5.3.28 mod_fastcgi/2.4.6 mod_ssl/2.2.26 OpenSSL/0.9.8y configured -- resuming normal operations
[Sun Aug 17 13:16:17 2014] [error] [client 192.168.181.205] (70007)The timeout specified has expired: proxy: error reading response
[Sun Aug 17 13:26:26 2014] [error] [client 192.168.181.205] (70007)The timeout specified has expired: proxy: error reading response
-
Aug 17, 2014 10:49 AM in response to Kevin Nealby Linc Davis,The story isn't entirely consistent, but the server does seem to have an SSL certificate with the wrong CN.
You must have a working DNS service, and the server's hostname must match its fully-qualified domain name. To confirm, select the server by name in the sidebar of the Server application window, then select the Overview tab. Click the Edit button on the Host Name line. On the Accessing your Server sheet, Domain Name should be selected. Change the Host Name, if necessary.
Check the validity of the server's certificates. The common name must match the hostname and domain name. Deselecting and then reselecting the certificate in Server.app has been reported to have an effect in some cases. Otherwise delete all certificates and create new ones.
-
Aug 17, 2014 11:42 AM in response to Linc Davisby Kevin Neal,Checked the common name and it does match the host name.
Checked DNS, host command shows the DNS resolves correctly both directions.
Deselecting and reselecting certificates has no effect.
I'll try deleting the certificates again but I tried that earlier and it didn't help.
The most frustrating thing is that the clone is working fine, even after multiple restarts, yet a block level restore results in a server with the collabauthd error
-
Aug 17, 2014 12:03 PM in response to Kevin Nealby Linc Davis,If that's really true, then you have a failing drive.
-
Aug 17, 2014 12:41 PM in response to Linc Davisby Kevin Neal,Thats what I thought, so I restored the clone to a completely different drive and the same thing happened, tried restoring using both CCC and disk utility.
Also I have just deleted the certificates and remade them but I'm still getting the errors
-
Aug 17, 2014 6:14 PM in response to Kevin Nealby Linc Davis,So everything works fine when you boot from this one volume. but not when you boot from either of two different block copies of it?
-
Aug 18, 2014 12:05 AM in response to Linc Davisby Kevin Neal,Yes strange as it sounds thats correct, I am booted from the working volume at the moment and everything is fine, but I have no way of making a reliable working copy of it as a backup.
I just tried a non block copy of it as well just incase. I've also tried making a backup of the OD of this working volume and then switch to one of the other volumes an destroy the OD and restoring with the backup OD, again not working.
I need to either find a way to make a good copy of the working volume or a way to recover from the error.
Long term I know the most likely solution is to start completely from scratch
-
-
Aug 18, 2014 1:29 PM in response to Linc Davisby Kevin Neal,They are Xserve drive bays, each on a separate drive
2009 Xserve running latest Mountain Lion with Server 2.2.2
-
Aug 18, 2014 1:59 PM in response to Linc Davisby Kevin Neal,I've just tried a slightly different approach, I booted up into the recovery partition, checked the drives and repaired permissions from there then restored again, this time when I booted up the newly restored drive was working, though I did notice wiki server and profile manager took a very long time to start up.
But at least this buys me some time now I have two usable drives.
Would love to get to the bottom of why these services suddenly fail, but as the logs don't seem to give much meaningful information, maybe a fresh start with Mavericks is a good option
-
Aug 18, 2014 3:51 PM in response to Kevin Nealby Linc Davis,After a block copy, the destination volume would have the same permissions as the source volume, so repairing them would have no effect. It's conceivable that you have one good drive bay and two bad ones. I don't have a better theory.