Q: Problems with Xsan
Hi,
I have a Promise VTrak 610f with 2 enclosures and 750 GB disks.
A few months ago I had a big problem with my Xsan because it became corrupted.
Now I'm trying to reinstall everything from the beginning, but yesterday I had a problem with the Xsan again. My metadata controllers (MCs) can't mount the volume that was created.
I have 4 Xserves: 2 of them are MCs and the other 2 are clients. I have 2 data LUNs of 6 TB each (12 TB in total) and 2 metadata LUNs (750 GB each, 1.5 TB in total).
Yesterday, one of the MCs disconnected from the Xsan and the other one took over, but the volume wasn't available and I don't know why. This time the system isn't in production; I'm running tests, but I don't want to have problems in the future.
I would like to know what I am doing wrong. Maybe I need more LUNs defined on my Promise?
The error reported on the MC when I try to mount the volume is the following:
[0202 08:51:16.519195] 0xa07b3720 (Debug) RSVD SUMMARY [192.168.0.208] reserved space max requested/0 MB accounted now/0 MB
[0202 08:51:16.519202] 0xa07b3720 (Debug) VOP SUMMARY [192.168.0.208] VopCapNegotiate cnt/1 avg/27+27 min/27+27 max/2727.
[0202 08:51:16.519213] 0xa07b3720 (Debug) VOP SUMMARY [192.168.0.208] VopClientId cnt/1 avg/25+24 min/25+24 max/2524.
[0202 08:51:16.519229] 0xa07b3720 (Debug) FSM RSVD SPACE current 4230 MB actual 0 MB max since boot 4230 MB since last stats 4230 MB
[0202 08:51:16.519232] 0xa07b3720 (Debug) FSM RSVD clients with files open for write: now 0 max 0 last 0
[0202 08:51:16.519236] 0xa07b3720 (Debug) FSM RSVD clients writing: now 0 empty 0 max 0 last 0
[0202 08:51:16] 0xb8e2f000 (*FATAL*) PANIC: aborting threads now.
Logger_thread: sleeps/131509 signals/2 flushes/14692 writes/14694 switches 2
Logger_thread: logged/168922 clean/168922 toss/0 signalled/2 toss_message/0
Logger_thread: waited/0 awakened/0
[0202 13:12:08] 0xa07b3720 (debug) NSS: sending message (type 2) to Name Server '192.168.0.207' (192.168.0.207:49153).
[0202 13:12:08] 0xa07b3720 (debug) NSS: sending message (type 2) to Name Server '192.168.0.206' (192.168.0.206:49154).
[0202 13:12:08] 0xa07b3720 INFO NSS: election initiated by 192.168.0.206:49154 (id 195.235.180.206) - admin request.
[0202 13:12:08] 0xa07b3720 (debug) NSS_VOTE2 to 192.168.0.206:49154
[0202 13:12:08] 0xa07b3720 (debug) startfssvote could not find FSS Cronos in master - vote aborted.
[0202 13:12:08] 0xa07b3720 (debug) NSS: removing vote inhibitor for FSS 'Cronos'.
[0202 13:12:09] 0xa07b3720 NOTICE PortMapper: Initiating activation vote for FSS 'Cronos'.
[0202 13:12:09] 0xa07b3720 (debug) Initiatenssvote for FSS Cronos
[0202 13:12:09] 0xa07b3720 (debug) NSS: sending message (type 2) to Name Server '192.168.0.207' (192.168.0.207:49153).
[0202 13:12:09] 0xa07b3720 (debug) NSS: sending message (type 2) to Name Server '192.168.0.206' (192.168.0.206:49154).
[0202 13:12:09] 0xa07b3720 INFO NSS: election initiated by 192.168.0.206:49154 (id 195.235.180.206) - admin request.
[0202 13:12:09] 0xa07b3720 (debug) NSS_VOTE2 to 192.168.0.206:49154
[0202 13:12:09] 0xa07b3720 (debug) startfssvote could not find FSS Cronos in master - vote aborted.
[0202 13:12:09] 0xa07b3720 (debug) NSS: removing vote inhibitor for FSS 'Cronos'.
Feb 2 13:12:10 hermes Xsan Admin[18617]: ERROR: Error starting volume…: Operation could not be completed. (SANTransactionErrorDomain error 100036.) (100036)
Feb 2 13:12:10 hermes servermgrd[94]: xsan: [94/103B80] ERROR: getfsm_processstats(Cronos): Unable to find pid of fsm
Feb 2 13:12:52 hermes servermgrd[94]: xsan: [94/103B80] ERROR: getfsm_processstats(Cronos): Unable to find pid of fsm
Feb 2 13:13:52 hermes servermgrd[94]: xsan: [94/103B80] ERROR: getfsm_processstats(Cronos): Unable to find pid of fsm
Feb 2 13:14:52 hermes servermgrd[94]: xsan: [94/103B80] ERROR: getfsm_processstats(Cronos): Unable to find pid of fsm
Could you please help me?
Thanks for everything.
Álvaro
Posted on Feb 2, 2011 4:19 AM