
unlabeled LUNs in XSan

This morning I had to restart my secondary metadata controller (from now on: fry), only to find that it wouldn't remount the XSan volume it was using (/Volumes/raid5; raid5 from now on).

It was weird, since the volume was mounted and working on the primary MDC. After checking the logs and making sure there was no firewall issue, etc., I realized that all the LUNs that make up my volume now appear as "UNLABELED" in the XSan Admin tool.

Even on the primary MDC (lila), the volume is mounted but the config in XSan Admin seems lost: no labeled LUNs and no LUNs in the volumes...

I don't dare reboot the primary MDC, since two months ago I already went through the pain of reinstalling and restoring all the data.

Any clues??

I'm starting to get REALLY disappointed with XSan performance and stability; the LUNs just became unlabeled out of the blue and now I'm afraid of another data loss.



G5 2.7x2 Mac OS X (10.4.6)

G5 2.7x2, Mac OS X (10.4.9)

Posted on Jul 3, 2007 7:58 AM


Jul 3, 2007 8:21 AM in response to Guillermo Nuez Marin

Some more info, a portion of the logs:

From the volume log on the client (secondary MDC), I get this when trying to mount it:

[0703 10:21:37.606514] 0x1801000 (Debug) sigwait handler starting
[0703 10:21:37] 0xa000ed88 (Info) Server Revision 2.7.201 Build 7.23 Built for Darwin 8.0 Created on Mon Nov 13 11:49:56 PST 2006
[0703 10:21:40] 0xa000ed88 (Info)
Configuration:
DiskTypes-15
Disks-15
StripeGroups-2
ForceStripeAlignment-1
MaxConnections-75
ThreadPoolSize-128
StripeAlignSize-8
FsBlockSize-131072
BufferCacheSize-32M
InodeCacheSize-8192
RestoreJournal-Disabled
RestoreJournalDir-None
[0703 10:21:40] 0xa000ed88 ( *FATAL*) Server could not find any Meta-Data devices!
[0703 10:30:12.954182] 0x1801000 (Debug) sigwait handler starting
[0703 10:30:12] 0xa000ed88 (Info) Server Revision 2.7.201 Build 7.23 Built for Darwin 8.0 Created on Mon Nov 13 11:49:56 PST 2006
[0703 10:30:15] 0xa000ed88 (Info)
Configuration:
DiskTypes-15
Disks-15
StripeGroups-2
ForceStripeAlignment-1
MaxConnections-75
ThreadPoolSize-128
StripeAlignSize-8
FsBlockSize-131072
BufferCacheSize-32M
InodeCacheSize-8192
RestoreJournal-Disabled
RestoreJournalDir-None
[0703 10:30:15] 0xa000ed88 ( *FATAL*) Server could not find any Meta-Data devices!
[0703 16:16:19.303688] 0x1801000 (Debug) sigwait handler starting
[0703 16:16:19] 0xa000ed88 (Info) Server Revision 2.7.201 Build 7.23 Built for Darwin 8.0 Created on Mon Nov 13 11:49:56 PST 2006
[0703 16:16:21] 0xa000ed88 (Info)
Configuration:
DiskTypes-15
Disks-15
StripeGroups-2
ForceStripeAlignment-1
MaxConnections-75
ThreadPoolSize-128
StripeAlignSize-8
FsBlockSize-131072
BufferCacheSize-32M
InodeCacheSize-8192
RestoreJournal-Disabled
RestoreJournalDir-None
[0703 16:16:21] 0xa000ed88 ( *FATAL*) Server could not find any Meta-Data devices!




From the system log I get these messages:

Jul 3 16:18:10 fry kernel[0]: CVFS 'raid5': FsBlk size 131072, bits 17, mask 0x1ffff
Jul 3 16:18:10 fry kernel[0]: CVFS 'raid5': Sector size 0, bits 0, mask 0x0
Jul 3 16:18:10 fry kernel[0]: Not all drives available on stripe group 1 for filesystem 'raid5'
Jul 3 16:18:10 fry kernel[0]: Could not mount filesystem raid5, cvfs error ' IO error' (3)
Jul 3 16:18:13 fry kernel[0]: CVFS 'raid5': Buffer Cache blksize 4096, #blocks min 256 max 8192
Jul 3 16:18:13 fry kernel[0]: CVFS 'raid5': request reserved space 0x2333333
Jul 3 16:18:13 fry kernel[0]: CVFS 'raid5': FsBlk size 131072, bits 17, mask 0x1ffff
Jul 3 16:18:13 fry kernel[0]: CVFS 'raid5': Sector size 0, bits 0, mask 0x0
Jul 3 16:18:13 fry kernel[0]: Not all drives available on stripe group 1 for filesystem 'raid5'
Jul 3 16:18:13 fry kernel[0]: Could not mount filesystem raid5, cvfs error ' IO error' (3)
Jul 3 16:18:14 fry servermgrd: xsan: [37/15375A0] ERROR: -[XsanAutomounter threadedAutomount:]: Error automounting 'raid5' (CANNOT MOUNTERROR)
Jul 3 16:18:22 fry servermgrd: xsan: [37/302DD0] ERROR: get fsm_processstats(raid5): Unable to find pid of fsm

(......)

Unable to find pid of fsm
Jul 3 17:28:22 fry servermgrd: xsan: [37/302DD0] ERROR: get fsm_processstats(raid5): Unable to find pid of fsm
Jul 3 17:28:55 fry kernel[0]: CVFS 'raid5': Buffer Cache blksize 4096, #blocks min 256 max 8192
Jul 3 17:28:55 fry kernel[0]: CVFS 'raid5': request reserved space 0x2333333
Jul 3 17:28:55 fry kernel[0]: CVFS 'raid5': FsBlk size 131072, bits 17, mask 0x1ffff
Jul 3 17:28:55 fry kernel[0]: CVFS 'raid5': Sector size 0, bits 0, mask 0x0
Jul 3 17:28:55 fry kernel[0]: Not all drives available on stripe group 1 for filesystem 'raid5'
Jul 3 17:28:55 fry kernel[0]: Could not mount filesystem raid5, cvfs error ' IO error' (3)
Jul 3 17:29:22 fry servermgrd: xsan: [37/302DD0] ERROR: get fsm_processstats(raid5): Unable to find pid of fsm
Jul 3 17:30:22 fry servermgrd: xsan: [37/302DD0] ERROR: get fsm_processstats(raid5): Unable to find pid of fsm


G5 2.7x2 Mac OS X (10.4.6)

Jul 5, 2007 4:14 AM in response to Donald Kok

Hi Donald, thanks for the reply.

I know I can lose everything (it would be the 2nd time already), that's why I'm afraid of trying. The only LUN I know for sure is the metadata one, since it has a different size. The others are all the same size... 😟

I was thinking that maybe there would be a file or something mapping the device to the LUN number, but the only thing I found was the LUN names in the XSan config files. For example, in my volume .cfg file:

# **************************************************************************
# A disk section for defining disks in the hardware configuration.
# **************************************************************************

[Disk metaLUN cab0101raid01]
Status Up
Type metaLUN cab0101raid01Type

[Disk cab0202lun03]
Status Up
Type cab0202lun03Type

[Disk cab0202lun02]
Status Up
Type cab0202lun02Type

[Disk cab0202lun01]
Status Up
Type cab0202lun01Type

[Disk cab0202lun00]
Status Up
Type cab0202lun00Type

[Disk cab0201lun03]
Status Up
Type cab0201lun03Type

[Disk cab0201lun02]
Status Up
Type cab0201lun02Type

[Disk cab0201lun01]
Status Up
Type cab0201lun01Type

[Disk cab0201lun00]
Status Up
Type cab0201lun00Type

[Disk cab0102lun03]
Status Up
Type cab0102lun03Type

[Disk cab0102lun02]
Status Up
Type cab0102lun02Type

[Disk cab0102lun01]
Status Up
Type cab0102lun01Type

[Disk cab0102lun00]
Status Up
Type cab0102lun00Type

[Disk cab0101lun02]
Status Up
Type cab0101lun02Type

[Disk cab0101lun00]
Status Up
Type cab0101lun00Type


# **************************************************************************
# A stripe section for defining stripe groups.
# **************************************************************************

[StripeGroup meta]
Status Up
Exclusive Yes
Metadata Yes
Journal Yes
Read Enabled
Write Enabled
MultiPathMethod Rotate
StripeBreadth 8
Node metaLUN cab0101raid01 0

[StripeGroup datos]
Status Up
Exclusive No
Metadata No
Journal No
Affinity datos
Read Enabled
Write Enabled
MultiPathMethod Rotate
StripeBreadth 8
Node cab0202lun03 0
Node cab0202lun02 1
Node cab0202lun01 2
Node cab0202lun00 3
Node cab0201lun03 4
Node cab0201lun02 5
Node cab0201lun01 6
Node cab0201lun00 7
Node cab0102lun03 8
Node cab0102lun02 9
Node cab0102lun01 10
Node cab0102lun00 11
Node cab0101lun02 12
Node cab0101lun00 13


I have all the LUN names, but I don't know whether this is the right order... 😟
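For reference, the node order the .cfg itself records can be pulled out with a few lines of Python. This is only a sketch: it assumes the file looks exactly like the excerpt above ("[StripeGroup name]" headers followed by "Node label index" lines), the function name is made up for illustration, and it says nothing about which physical device each label belongs to.

```python
import re

def stripe_group_nodes(cfg_text):
    """Return {stripe_group_name: [label, ...]} ordered by node index."""
    groups = {}
    current = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        header = re.match(r"\[(\w+)\s+(.+)\]$", line)
        if header:
            # Only StripeGroup sections carry Node lines; other sections reset state.
            current = header.group(2) if header.group(1) == "StripeGroup" else None
            if current is not None:
                groups[current] = []
            continue
        if current and line.startswith("Node "):
            # "Node <label> <index>": the index is always the last field,
            # so labels containing a space are kept whole.
            label, index = line[len("Node "):].rsplit(None, 1)
            groups[current].append((int(index), label))
    return {name: [label for _, label in sorted(nodes)]
            for name, nodes in groups.items()}

# Short sample copied from the excerpt above.
sample_cfg = """
[StripeGroup meta]
Status Up
Node metaLUN cab0101raid01 0

[StripeGroup datos]
Status Up
Node cab0202lun03 0
Node cab0202lun02 1
Node cab0202lun01 2
"""

for name, labels in stripe_group_nodes(sample_cfg).items():
    print(name, labels)
```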

Did you change the LUN names from the XSan Admin app?

Jul 5, 2007 5:21 AM in response to Guillermo Nuez Marin

Hi,
I label from the command prompt; in fact I do all of the administration from the prompt. No Xsan GUI on my systems.
You might have other places where you could find this info:
- the nicknames in the FC switch.
- the cvlabels file used when creating the labels. You can find this in /Library/Filesystems/Xsan/config on the system that created the LUNs.
- any kind of administration records you keep yourself. (duh)
Good luck
Donald

Jul 5, 2007 7:22 AM in response to Donald Kok

Got the results from cvlabel:

/dev/rdisk1 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011723' Serial#: '5000393000011723L0' Sectors: 1560207616. SectorSize: 512.
/dev/rdisk4 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011703' Serial#: '5000393000011703L0' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk8 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011513' Serial#: '5000393000011513L0' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk12 [APPLE Xserve RAID 1.51] MBR Controller#: '50003930000114AB' Serial#: '50003930000114ABL0' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk9 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011513' Serial#: '5000393000011513L1' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk5 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011703' Serial#: '5000393000011703L1' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk2 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011723' Serial#: '5000393000011723L1' Sectors: 781367296. SectorSize: 512.
/dev/rdisk14 [APPLE Xserve RAID 1.51] MBR Controller#: '50003930000114AB' Serial#: '50003930000114ABL1' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk10 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011513' Serial#: '5000393000011513L2' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk6 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011703' Serial#: '5000393000011703L2' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk15 [APPLE Xserve RAID 1.51] MBR Controller#: '50003930000114AB' Serial#: '50003930000114ABL2' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk3 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011723' Serial#: '5000393000011723L2' Sectors: 1560207616. SectorSize: 512.
/dev/rdisk7 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011703' Serial#: '5000393000011703L3' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk11 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011513' Serial#: '5000393000011513L3' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk13 [APPLE Xserve RAID 1.51] MBR Controller#: '50003930000114AB' Serial#: '50003930000114ABL3' Sectors: 1172043776. SectorSize: 512.

I see I get the 15 LUNs I have.
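To make a listing like this easier to work with, it can be parsed into records and grouped by controller. The sketch below assumes the line format is exactly what is pasted above; the regular expression and function names are made up for illustration and are not part of any Xsan tool.

```python
import re
from collections import defaultdict

# One line of the listing above; the pattern is inferred from that paste only.
LINE_RE = re.compile(
    r"^(?P<device>/dev/rdisk\d+)\s+\[(?P<model>[^\]]*)\]\s+\S+\s+"
    r"Controller#:\s+'(?P<controller>[^']+)'\s+"
    r"Serial#:\s+'(?P<serial>[^']+)'\s+"
    r"Sectors:\s+(?P<sectors>\d+)\.\s+"
    r"SectorSize:\s+(?P<sector_size>\d+)\."
)

def parse_listing(text):
    """Turn listing lines into dicts; lines that do not match are skipped."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        record = match.groupdict()
        record["sectors"] = int(record["sectors"])
        record["sector_size"] = int(record["sector_size"])
        records.append(record)
    return records

def group_by_controller(records):
    """Map each Controller# value to the records that report it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["controller"]].append(record)
    return dict(groups)

# Three lines copied from the output above.
sample = """\
/dev/rdisk1 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011723' Serial#: '5000393000011723L0' Sectors: 1560207616. SectorSize: 512.
/dev/rdisk4 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011703' Serial#: '5000393000011703L0' Sectors: 1172043776. SectorSize: 512.
/dev/rdisk2 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011723' Serial#: '5000393000011723L1' Sectors: 781367296. SectorSize: 512.
"""

for controller, recs in group_by_controller(parse_listing(sample)).items():
    print(controller, [(r["device"], r["serial"], r["sectors"]) for r in recs])
```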

I'll check whether I can find a way to match their controllers/serials to the physical disks, because I named all the LUNs following a pattern (cabinet/controller/lun). That way I would be able to re-create the cvlabels file.

I'm afraid it was my first XSan and I didn't think of backing up the /Library/Filesystems/Xsan directory just in case.

Jul 5, 2007 8:09 AM in response to Guillermo Nuez Marin

Ok, I think I worked it out.

Matching the "Controller#" column against the controllers' WWNNs, I can identify the first part of the LUN name: cab0101, cab0102, etc.

Now, since the first cabinet's first controller hosts the metadata LUN, and that one differs in size from all the other LUNs, I can assume that the last part of the "Serial#" (after the "L") is the LUN number, because:

From the cvlabel -l -s command I get:

/dev/rdisk2 [APPLE Xserve RAID 1.51] MBR Controller#: '5000393000011723' Serial#: '5000393000011723L1' Sectors: 781367296. SectorSize: 512.

From the RAID Admin utility I get the WWNN of the first cabinet's first controller:

50:00:39:30:00:01:17:23

and its size matches the metadata LUN I had (from the .cfg file):

[DiskType metaLUNcab0101raid01Type]
Sectors 763054K
SectorSize 512


Doing the math: 781367296 / 1024 = 763054

and checking again in RAID Admin I see it's on LUN 1, which matches the last part of the serial: L1
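The size check above can be repeated for any device with a couple of lines of Python. This is a sketch under one assumption taken from the arithmetic in this post: that the "K" in the .cfg DiskType entry means 1024 sectors. The helper names are made up for illustration.

```python
SECTORS_PER_K = 1024  # assumption: "763054K" in the .cfg means 763054 * 1024 sectors

def matches_cfg(sectors, cfg_k):
    """True if a raw sector count equals the .cfg figure expressed in K."""
    return sectors == cfg_k * SECTORS_PER_K

def size_gib(sectors, sector_size=512):
    """Capacity in GiB for a given sector count and sector size."""
    return sectors * sector_size / 2**30

# Sector counts copied from the listing for three of the devices.
listing = {
    "/dev/rdisk1": 1560207616,
    "/dev/rdisk2": 781367296,
    "/dev/rdisk4": 1172043776,
}

# Which of these devices has the size recorded for the metadata DiskType?
candidates = [dev for dev, sectors in listing.items() if matches_cfg(sectors, 763054)]
print(candidates)  # ['/dev/rdisk2']
```

A size match only narrows things down when the size is unique, as it is for the metadata LUN here; the data LUNs that share a sector count still need the controller and serial information to tell them apart.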



I assume that I can re-create the labels and bring it back to life, but I don't dare touch it unless the last machine goes down and I have no other choice.
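Put together, the naming rule can be sketched like this. Everything here is a guess to be checked by hand: only the cab0101 controller has actually been confirmed against RAID Admin in this thread, the "L" suffix rule is the assumption worked out above, and the function names are made up for illustration. Nothing in this code touches the disks.

```python
# Controller WWNN (hex, no colons) -> label prefix. Only this one entry is
# confirmed in the thread; the others would have to be read from RAID Admin.
KNOWN_PREFIXES = {"5000393000011723": "cab0101"}

def wwnn_to_hex(wwnn):
    """'50:00:39:30:00:01:17:23' -> '5000393000011723'."""
    return wwnn.replace(":", "").upper()

def candidate_label(controller, serial, prefixes=KNOWN_PREFIXES):
    """Guess a label as <prefix>lun<NN>, or None if the rule does not apply."""
    controller = controller.upper()
    prefix = prefixes.get(controller)
    marker = controller + "L"
    if prefix is None or not serial.upper().startswith(marker):
        return None
    lun_text = serial[len(marker):]
    if not lun_text.isdigit():
        return None
    return f"{prefix}lun{int(lun_text):02d}"

# Labels from the .cfg that belong to the first cabinet, first controller.
cfg_labels = {"cab0101lun00", "cab0101lun02", "metaLUN cab0101raid01"}

for serial in ("5000393000011723L0", "5000393000011723L1", "5000393000011723L2"):
    guess = candidate_label("5000393000011723", serial)
    # A guess that is missing from the .cfg needs a manual look: L1 is the
    # metadata LUN, which was named "raid01" rather than "lun01".
    print(serial, guess, "in cfg" if guess in cfg_labels else "check by hand")
```

The guess for L1 does not appear in the .cfg because the metadata LUN breaks the naming pattern, which is exactly the kind of exception a comparison against the config names is meant to catch.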

G5 2.7x2 Mac OS X (10.4.9)


Jul 5, 2007 11:35 PM in response to Guillermo Nuez Marin

Nice that you can trace the LUN names back to the rdisk names.
Now you can create a good cvlabels file and apply it to the LUNs. From the ADIC manual:
Creating a File System Server Using CLI
Use this procedure to install the file system server using CLI.
1 Install StorNext. For instructions, refer to the chapter in the StorNext Installation Guide for UNIX Users that applies to your operating system.
2 Write the list of system and FC disks to a file in a format recognized by the cvlabel command. Type:
/usr/cvfs/bin/cvlabel -c > /usr/cvfs/config/cvlabels
The created file displays an entry for each disk located by the /usr/cvfs/bin/cvlabel command.
CvfsDisk_UNKNOWN /dev/sdb # host 4 lun 1 sectors 639570752 ...
CvfsDisk_UNKNOWN /dev/sdc # host 4 lun 2 sectors 639570752 ...
CvfsDisk_UNKNOWN /dev/sdd # host 4 lun 3 sectors 639570752 ...
3 Edit the cvlabels file, which has a list of all system and FC disks visible on the machine. Edit the file to remove all the system disks and any FC disks you do not want labeled or are already labeled.
4 Label the FC drives. Type:
/usr/cvfs/bin/cvlabel /usr/cvfs/config/cvlabels
Note: Output example shown will differ from the output you will see, but will be similar in structure and information.

The paths and labels are not right for Xsan, of course.
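Step 3 of the excerpt (editing the generated file) can be rehearsed on a copy before anything is applied. The sketch below only rewrites text in memory: it assumes each entry has the "label device # comment" shape shown in the manual excerpt, the label "example_lun01" is a placeholder and not a real name from this SAN, and the function name is made up for illustration. It never calls cvlabel.

```python
def relabel(cvlabels_text, labels_by_device):
    """Keep only the devices listed in labels_by_device and swap in their labels."""
    out = []
    for line in cvlabels_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and pure comment lines
        fields = stripped.split(None, 2)
        if len(fields) < 2:
            continue  # not a "label device" entry
        device = fields[1]
        if device not in labels_by_device:
            continue  # system disk or a disk that should stay untouched
        rest = f" {fields[2]}" if len(fields) > 2 else ""
        out.append(f"{labels_by_device[device]} {device}{rest}")
    return "\n".join(out)

# Entries copied from the manual excerpt above (Linux device names).
generated = """\
CvfsDisk_UNKNOWN /dev/sdb # host 4 lun 1 sectors 639570752 ...
CvfsDisk_UNKNOWN /dev/sdc # host 4 lun 2 sectors 639570752 ...
"""

# Placeholder mapping: only /dev/sdb gets a label, /dev/sdc is dropped.
print(relabel(generated, {"/dev/sdb": "example_lun01"}))
```

Reviewing the rewritten text against the .cfg names before running the labeling command is the point of doing it this way, since labeling the wrong device is not something to discover afterwards.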
