Q: 2 of my 8 Xsan clients are not mounting. Getting "fsmpm is not running".
OK, I need some help. Being new to Xsan and Xserve, I am now at a loss. Today 2 of my 8 workstations just stopped connecting to and mounting the volumes.
On each of those workstations I get "fsmpm is not running". When I look at the logs, this is what the client machines report:
Feb 29 12:20:18 om xsand[46]: kern.coredump: 1 -> 1
Feb 29 12:20:18 om xsand[46]: kern.corefile: '/cores/core.%P' -> '/cores/core.%N.%P'
Feb 29 12:20:18 om xsand[46]: kern.ipc.maxsockbuf: Result too large
--
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[116]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[117]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[122]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[120]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[118]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[115]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[121]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om com.apple.qmaster.qmasterd[76]: compressord[119]: installing XSanTest signal handler for SIGUSR1 (_xsanTestFlag = 0)
Feb 29 12:20:30 om sandboxd[140]: portmap(86) deny network-outbound /private/var/tmp/launchd/sock
--
Feb 29 12:20:32 om com.apple.launchd[1] (com.apple.SystemStarter): Failed to count the number of files in "/System/Library/StartupItems": No such file or directory
Feb 29 12:20:33 om servermgrd[52]: xsan: [52/2112D0] ERROR: open_fs_connection: fsmpm is not running, cannot get connection
--
Feb 29 12:20:33 om servermgrd[52]: xsan: [52/2112D0] ERROR: open_proxy_fs_connection: SNFS Name Service connection to 127.0.0.1 failed: The Xsan File System Services on 127.0.0.1 may be stopped.
Feb 29 12:20:33 om servermgrd[52]: xsan: [52/2112D0] ERROR: process_notifications: Cannot connect to FS
Feb 29 12:20:34 om xsand[46]: Synchronizing with fsmpm.
Feb 29 12:20:34 om configd[14]: network configuration changed.
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID4-Zim1 on device: /dev/rdisk6 (blk 0xe000010 raw 0xe000010) con: 1 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '6000393000021E5D01000000D52B0C53' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID3-Right on device: /dev/rdisk8 (blk 0xe000012 raw 0xe000012) con: 1 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '60003930000207D801000000D52CCF2A' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID5-Zim4 on device: /dev/rdisk10 (blk 0xe000014 raw 0xe000014) con: 1 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '600039300001E30401000000D52AE4D4' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID1-Right on device: /dev/rdisk12 (blk 0xe000016 raw 0xe000016) con: 1 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '600039300001CEF301000000E5CF72FE' Size: 4688443359 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID1-Left1 on device: /dev/rdisk14 (blk 0xe000018 raw 0xe000018) con: 1 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '6000393000010D6E01000000DA4EABCE' Size: 781383647 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID1-Left2 on device: /dev/rdisk15 (blk 0xe000019 raw 0xe000019) con: 1 lun: 1 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '6000393000010D6E0200000094A2C9EA' Size: 781383647 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID2-Right on device: /dev/rdisk13 (blk 0xe000017 raw 0xe000017) con: 2 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '600039300001CC4E01000000DC40FE22' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID2-Left on device: /dev/rdisk11 (blk 0xe000015 raw 0xe000015) con: 2 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '600039300001D0EE01000000D591CFC8' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID3-Left on device: /dev/rdisk7 (blk 0xe000011 raw 0xe000011) con: 2 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '600039300002083501000000D52CB1C0' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID4-Zim2 on device: /dev/rdisk5 (blk 0xe00000f raw 0xe00000f) con: 2 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '6000393000021EB801000000D52AF882' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: PortMapper: CVFS Volume RAID5-Zim3 on device: /dev/rdisk9 (blk 0xe000013 raw 0xe000013) con: 2 lun: 0 state: 0xf4 inquiry [APPLE Xserve RAID 1.51] controller # 'default' serial # '600039300001E3E701000000D636DB1E' Size: 8790767583 Sector Size: 512
Feb 29 12:20:34 om fsmpm[147]: Disk arb runloop starting
Feb 29 12:20:35 om _mdnsresponder[163]: /usr/libexec/ntpd-wrapper: scutil key State:/Network/Global/DNS not present after 30 seconds
Feb 29 12:20:35 om loginwindow[58]: Login Window Started Security Agent
Feb 29 12:20:35 om loginwindow[58]: Login Window - Returned from Security Agent
Feb 29 12:20:35 om loginwindow[58]: USER_PROCESS: 58 console
Feb 29 12:20:35 om _mdnsresponder[169]: sntp options: a=2 v=1 e=0.100 E=5.000 P=2147483647.000
Feb 29 12:20:35 om _mdnsresponder[169]: d=15 c=5 x=0 op=1 l=/var/run/sntp.pid f= time.apple.com
Feb 29 12:20:35 om _mdnsresponder[169]: sntp: getaddrinfo(hostname, ntp) failed with nodename nor servname provided, or not known
Feb 29 12:21:05 Om fsmpm[147]: NSS: Name Server '10.0.1.220' (10.0.1.220) heartbeat lost, unable to send message.
--
Mar 1 10:40:25 Om servermgrd[52]: xsan: [52/3701E10] ERROR: -[SANFilesystem(PrivateMethods) doXsandManagementRpcWithCommand:]: Unable to connect to xsand: No such file or directory
Mar 1 10:40:25 Om servermgrd[52]: xsan: [52/3701E10] ERROR: -[SANFilesystem mountVolumeNamed:writable:withOptions:]: unable to send 'mount' message to xsand
Mar 1 10:40:55 Om /usr/sbin/serialnumberd[246]: New xsan serial number permanently registered by pid 52.
--
Mar 1 10:56:53 Om fsmpm[312]: NSS: Name Server '10.0.1.220' (10.0.1.220) heartbeat lost, unable to send message.
Mar 1 10:58:23 Om xsand[309]: Failed to synchronize with fsmpm (error 2).
I have checked my metadata network switch and all the lights are green. My 6 other clients are running just fine, but I am afraid to reboot them in case that causes them to fail as well.
Any ideas??!?!?!? HELP!!!
Thanks,
Kevin
Xserve, Mac OS X (10.7.2), 4GB RAM, Apple 2 Port 4Gbps Fiber
Posted on Mar 1, 2012 9:33 AM
The fsnameservers config file was missing the IP for the backup MDC. I added it and everything is working once again.
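For anyone who hits the same thing, here is a rough sketch of how that check could be scripted. It assumes the file is plain text with one name server address per line and `#` starting a comment. The function names are made up for illustration, 10.0.1.220 is the name server from the log above, and 10.0.1.221 is only a placeholder for a backup MDC address, not the real one. You would read the text from wherever your Xsan version keeps the file; the exact path is not given here.

```python
def parse_fsnameservers(text):
    """Return the name server entries found in fsnameservers-style text.

    Assumption: one address or hostname per line, '#' starts a comment.
    """
    entries = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry:
            entries.append(entry)
    return entries


def missing_name_servers(text, expected):
    """Return the expected addresses that are not listed in the text."""
    listed = set(parse_fsnameservers(text))
    return [addr for addr in expected if addr not in listed]


if __name__ == "__main__":
    # Sample contents of a client file that only lists the primary MDC.
    sample = "10.0.1.220\n"
    # 10.0.1.221 is a hypothetical backup MDC address used for illustration.
    expected = ["10.0.1.220", "10.0.1.221"]
    for addr in missing_name_servers(sample, expected):
        print("missing name server entry:", addr)
```

Running the same comparison against the file from a working client and from a failing one should show quickly whether an entry has gone missing.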
Posted on Mar 1, 2012 2:37 PM