I am using Nagios to monitor my servers (one Linux box and one Mac Mini Server with Mountain Lion, all upgrades installed).
For the Mac server, I have been using the check_osx_server plugin, which queries port 311 to get information from the OS X server. Since the last upgrade (I am not sure whether it started with the 10.8.2 upgrade or the latest supplemental update), the CPU Usage reported by the server has been very strange. The thing is, it is not Nagios that is reporting wrong numbers; the server itself is returning erroneous values. You can check this by pointing a browser at https://yourserver.com:311, selecting servermgr_info, choosing getHardwareInfo from the drop-down menu, and clicking Send Command. The XML document the server returns is where the check_osx_server plugin gets its numbers from.
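For anyone who wants to inspect the returned document outside of Nagios, here is a minimal Python sketch that parses an XML property list with the standard plistlib module and lists every integer it contains. The sample document and its key names (exampleCpuValue, exampleNested, exampleOther) are made up for illustration; they are not the real keys of the getHardwareInfo reply, so substitute the XML you saved from your own server.

```python
import plistlib

# Hypothetical sample plist. The key names below are invented for this
# sketch and are NOT the real keys returned by getHardwareInfo.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>exampleCpuValue</key>
    <integer>1234</integer>
    <key>exampleNested</key>
    <dict>
        <key>exampleOther</key>
        <integer>42</integer>
    </dict>
</dict>
</plist>
"""


def iter_integers(node, path=""):
    """Yield (path, value) for every integer found in a parsed plist."""
    if isinstance(node, bool):
        # bool is a subclass of int in Python; skip true/false entries.
        return
    if isinstance(node, int):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_integers(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_integers(value, f"{path}/{index}")


parsed = plistlib.loads(SAMPLE)
integers = dict(iter_integers(parsed))
for path, value in integers.items():
    print(path, value)
```

Running this against a saved copy of the real reply, once with Server.app closed and once with it open, should make it easy to see which values change and which stay frozen.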
What I have found so far:
1. When you first power up the server and check the CPU usage, you get a number with 20-something digits. This is crazy, because the reported number is supposed to be less than 10000: it is divided by 100 to give the CPU Usage percentage.
2. The number stays the same forever. Obviously the CPU Usage changes over time, and in the past that was reported accordingly. Now the number changes only if you run Server.app on the server or on a remote Mac. It then changes to a more meaningful value (four digits, since it has to be divided by 100 to get the CPU Usage percentage), but again it stays the same until you close Server.app and run it again. It also changes if you go to Stats in Server.app and switch the period from the default 4 hours to 1 hour, but not if you switch from 1 hour to 4 or 24 hours. Here is a sample of the XML file returned:
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
> "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
> <plist version="1.0">
There are more lines, but I have cut the file for brevity.
3. Again, as long as Server.app is running, the reported CPU Usage does NOT change. It changes only if you quit Server.app and run it again.
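As a stopgap on the monitoring side, the arithmetic and the symptoms described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names cpu_percent and looks_stuck are hypothetical, the upper bound of 10000 comes from the description in point 1, and this is not how check_osx_server itself is implemented.

```python
# Upper bound taken from point 1: the raw value should not exceed 10000,
# because it is divided by 100 to give a percentage.
MAX_RAW = 10000


def cpu_percent(raw):
    """Convert the raw value to a percentage, or return None if it is out of range."""
    if not 0 <= raw <= MAX_RAW:
        return None
    return raw / 100


def looks_stuck(samples):
    """Return True if there are at least two samples and they are all identical."""
    return len(samples) >= 2 and len(set(samples)) == 1


print(cpu_percent(1234))               # plausible four-digit value
print(cpu_percent(10**22))             # absurdly large value, rejected
print(looks_stuck([1234, 1234, 1234])) # same value on every poll
```

A check like this would at least let Nagios report UNKNOWN instead of a bogus percentage when the value is out of range or has not moved across several polls, although a genuinely idle server could also return identical values for a while.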
From the above, it seems clear that port 311 is no longer reporting the current CPU Usage, but some other value that is stored somewhere and is not updated unless you run Server.app.
Has anyone else noticed this behavior?