OS X Mavericks Server Slows Down, Trashes Swap/VM then Hangs Up
Mac mini 3,1 based OS X Mavericks 10.9.4 Server started slowing down then hanging up completely three weeks ago after a long period of great service. The interval between reboot and hang has been decreasing and is now about a day. It is a gently used file server, a DNS and DHCP, as well as doing device management in a small home office network. After a reboot it is OK, however, it progressively adds 1GB swap files, growing the total to about 32 GB, does a lot of paging, slows down, then dies. Disks are reporting fine, permissions repaired.
Nothing runs on it other than a limited set of Server services. Looking at logs, I can see the apspd process causes high level of wakeups, much CPU usage, and uses as much as 1GB+ of Real Mem, even after switching off all push notifications I could see in the Server.app.
Also, there is a large (100-200) number of postgres_real processes being run by user _devicem even though the number of managed devices is fewer than 10.
Here is a typical Activity Monitor top memory users a few hours before it will crash, generating a series of [KERNEL]: Swap File Error log messages.
I have run a log scanning script posted by others on this forum, here is the output, showing the important log lines, collected a short time after a reboot (the diagnostic script does not even finish running later on). Many thanks to anyone who can suggest the cause and a solution, otherwise I will perform a clean install and a restore of the Open Directory.
System Version: OS X 10.9.4 (13E28)
Kernel Version: Darwin 13.3.0
Boot Mode: Normal
Model: Macmini3,1
USB
Ellipse MAX (EATON)
System diagnostics
2014-09-10 AppleFileServer crash
2014-09-10 systemstats spin
2014-09-11 systemstats spin
2014-09-13 systemstats spin
2014-09-14 systemstats spin
2014-09-15 systemstats spin
2014-09-16 apspd spin
2014-09-17 apspd spin
2014-09-17 apspd spin
2014-09-17 servermgrd crash
Kernel messages
Sep 17 02:17:27 process systemstats[16205] caught causing excessive wakeups. Observed wakeups rate (per sec): 333; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 57803
Sep 17 02:19:31 (default pager): [KERNEL]: Swap File Error.
--- last message repeated 25 times ---
Sep 17 08:55:20 process apspd[112] thread 663 caught burning CPU! It used more than 50% CPU (Actual recent usage: 58%) over 180 seconds. thread lifetime cpu usage 90.047061 seconds, (89.542126 user, 0.504935 system) ledger info: balance: 90006772984 credit: 90006772984 debit: 0 limit: 90000000000 (50%) period: 180000000000 time since last refill (ns): 152762200054
Sep 17 09:20:13 process apspd[112] caught causing excessive wakeups. Observed wakeups rate (per sec): 327; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 60697
Sep 17 10:28:32 process devicemgrd[150] caught causing excessive wakeups. Observed wakeups rate (per sec): 1304; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 59429
Sep 17 10:30:53 process servermgrd[2535] caught causing excessive wakeups. Observed wakeups rate (per sec): 174; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 79393
Sep 17 10:31:51 process mds[47] caught causing excessive wakeups. Observed wakeups rate (per sec): 168; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 132369
Sep 17 10:32:00 process mds_stores[106] caught causing excessive wakeups. Observed wakeups rate (per sec): 215; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 137451
Sep 17 11:20:53 process opendirectoryd[28] caught causing excessive wakeups. Observed wakeups rate (per sec): 248; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 118062
Sep 17 11:40:12 process distnoted[27] caught causing excessive wakeups. Observed wakeups rate (per sec): 600; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 81038
Sep 17 11:48:09 process WindowServer[148] caught causing excessive wakeups. Observed wakeups rate (per sec): 241; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 84959
Sep 17 12:04:01 (default pager): [KERNEL]: Swap File Error.
--- last message repeated 25 times ---
Sep 17 12:41:07 process pkgutil[16606] caught causing excessive wakeups. Observed wakeups rate (per sec): 680; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 49099
Sep 17 12:45:41 (default pager): [KERNEL]: Swap File Error.
--- last message repeated 6 times ---
Sep 17 13:18:36 process apspd[110] thread 1764 caught burning CPU! It used more than 50% CPU (Actual recent usage: 77%) over 180 seconds. thread lifetime cpu usage 90.045854 seconds, (89.502392 user, 0.543462 system) ledger info: balance: 90007160696 credit: 90007160696 debit: 0 limit: 90000000000 (50%) period: 180000000000 time since last refill (ns): 116445461022
Sep 17 13:55:41 process apspd[110] caught causing excessive wakeups. Observed wakeups rate (per sec): 204; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 86240
Sep 17 14:22:36 process apspd[110] thread 592 caught burning CPU! It used more than 50% CPU (Actual recent usage: 66%) over 180 seconds. thread lifetime cpu usage 90.041730 seconds, (89.487447 user, 0.554283 system) ledger info: balance: 90002141218 credit: 90002141218 debit: 0 limit: 90000000000 (50%) period: 180000000000 time since last refill (ns): 135039338365
Sep 17 14:49:49 process apspd[110] caught causing excessive wakeups. Observed wakeups rate (per sec): 192; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 58411
Sep 17 15:33:49 process mds[47] caught causing excessive wakeups. Observed wakeups rate (per sec): 192; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 92736
Sep 17 15:37:24 process distnoted[26] caught causing excessive wakeups. Observed wakeups rate (per sec): 189; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 48891
Sep 17 15:40:00 process WindowServer[136] caught causing excessive wakeups. Observed wakeups rate (per sec): 267; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 76528
Sep 17 15:52:06 process distnoted[655] caught causing excessive wakeups. Observed wakeups rate (per sec): 169; Maximum permitted wakeups rate (per sec): 150; Observation period: 300 seconds; Task lifetime number of wakeups: 49859
Total CPU usage: user 42%, system 18%
CPU usage by process "apspd" with UID 0: 88.0%
DNS (not from DHCP): 127.0.0.1
Restricted user files: 39
Elapsed time (s): 574
Should I should post this in the Mavericks forum?
Thank you, and regards from Ireland.
Rafal
Mac mini, OS X Mavericks (10.9.4), OS X Server