James McMahan2

Q: Meta Data Lun Corrupted!!!!

Hello All,

Our Xsan went down today, and upon rebooting the metadata controller we discovered that the LUN containing the metadata had become corrupt. The two data LUNs showed up as CVFS format with labels, but the metadata LUN prompted an "unreadable, eject initialize ignore" message at startup and was listed as unlabeled in Xsan Admin. On the command line, it appeared alongside the other volumes as "unknown" format and unlabeled.

How do I repair this LUN? When I try to use the cvfsck command, it complains that the LUN "mylun2" (the name of the corrupted LUN) is unavailable. Is there a disk utility of sorts that repairs a volume (LUN) with a CVFS file system?

I see a few command line options but all come with serious warnings. So, I wish to seek guidance first.

A few more things.

1. The metadata LUN is a RAID 1 two-disk mirror.
2. The backup controller had not been updated from 1.1 to 1.2, and it had taken over metadata controlling earlier this week before the first crash. That crash was resolved with a restart, after which the main metadata controller resumed responsibility. Both controllers now have 1.2. I intend to have a conversation with the guy who only updated the one controller.
3. Both metadata controllers are running 10.4.3 Server.
4. Restarting all of the components of the SAN did not resolve the issue.


I really don't want to tell this client that all their data on the SAN is completely lost. They are a very good client and very good people. Any help that you could give would be THOROUGHLY appreciated.

Thank you for your time,
James L. McMahan Jr.

G5 1.8   Mac OS X (10.4.6)  

Posted on Apr 7, 2006 9:22 PM


  • by sapridyne,

    sapridyne Apr 7, 2006 11:21 PM in response to James McMahan2
    Level 1 (144 points)
    Rebuild your metadata LUN, restore your data from backup.

    Sapridyne
  • by James McMahan2,

    James McMahan2 Apr 7, 2006 11:51 PM in response to sapridyne
    Level 1 (5 points)
    Funny you should mention that. The SAN failed during a second attempt to back up the data to tape.

    So, what you are saying is that the data is lost if it isn't successfully backed up?

    Thanks for the response,
    James
  • by sapridyne,

    sapridyne Apr 8, 2006 7:21 AM in response to James McMahan2
    Level 1 (144 points)
    Unfortunately, yes.

    Sapridyne
  • by Will O'Neal,

    Will O'Neal Apr 8, 2006 12:38 PM in response to sapridyne
    Level 1 (65 points)
    Sapridyne-

    Are you trying to tell James that this entire SAN is toast? What is the point of a SAN with separate dual metadata controllers if it's not redundancy and security? We need to help get this SAN back up and running, and "reformat," while it's the standard response from Apple Support, doesn't work here. I wish I had a better answer for him, but even if I don't, you can bet reformatting would not be the number one suggestion I'd bring out.

    Will
  • by sapridyne,

    sapridyne Apr 9, 2006 6:18 AM in response to Will O'Neal
    Level 1 (144 points)
    Uh... okay. If you have a better solution, Will, go for it. The fact of the matter, though, is if your metadata is lost and you don't have a backup (bad idea), you don't have any other option but to start over. There isn't some magic utility to run to make things all better again.

    Xsan (or StorNext) is a cluster file system-based SAN, which is dependent on metadata stored on an Xserve RAID. Having redundant controllers isn't going to help with redundancy of his metadata -- it's going to help with redundancy of the controllers. It looks like he had an issue with the primary controller, which failed over to the secondary correctly. Why his primary controller failed, I do not know -- I haven't seen any logs (and even if I had, they might not be that much help to me at this juncture, anyway). The next problem he had was that his secondary controller didn't have the same version of Xsan as the primary, which probably made matters worse.

    At this point, I don't know what's up with his controllers, but he is stuck with a metadata LUN without a label on it, which is worthless to him. That's rare and unfortunate, but that's why you back up your data. If the metadata is lost, your data isn't accessible. A SAN without a backup isn't any better than a RAID set -- or even a single drive.

    Will, while I appreciate your desire to help people, you can't always pull a rabbit out of a hat. My response is probably the standard response from Apple because it's the truth. I've put up and maintained many Xsan environments and serve as a Senior Storage Engineer for a large company. I know what I'm talking about, but I am not trying to claim that I know all. If there is someone out there who has a better solution, this is the forum to offer those suggestions, where James can be helped and the rest of us can maybe learn something we didn't know before.

    But criticizing me for my response isn't going to help him, either. If you have a better solution, speak up. Otherwise, can it.

    Sapridyne
  • by James Knotts,

    James Knotts Apr 10, 2006 7:01 AM in response to sapridyne
    Level 1 (55 points)
    SAPRIDYNE is right. Running a SAN (which I do as well) isn't something you fly by the seat of your pants on. SAN admins go to great lengths to make sure that their data is backed up, redundant, and most importantly, standardized across the board. Even though Apple's products are 99% backwards compatible, there are certain things that HAVE to be kept the same, SANs being one.
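
To illustrate the point about keeping everything standardized, here is a minimal Python sketch that flags controllers whose reported Xsan version differs from the one you expect. The host names and the `controllers_out_of_sync` function are hypothetical, and the sketch assumes you have already collected each controller's version string by hand; it does not query Xsan itself.

```python
def controllers_out_of_sync(reported, expected):
    """Return the names of controllers whose version differs from `expected`.

    `reported` maps a controller name to the version string it reports.
    Both the names and the way versions are gathered are assumptions
    made for illustration only.
    """
    return sorted(
        host
        for host, version in reported.items()
        if version.strip() != expected
    )


if __name__ == "__main__":
    # Hypothetical example mirroring the situation described in this thread:
    # one controller was updated to 1.2, the other was left at 1.1.
    reported = {"mdc-primary": "1.2", "mdc-backup": "1.1"}
    for host in controllers_out_of_sync(reported, expected="1.2"):
        print(f"{host} is not running the expected Xsan version")
```

A check like this only helps if it runs before a failover happens, for example right after any update is applied to one controller.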
  • by drewfiero,

    drewfiero Apr 13, 2006 2:13 PM in response to sapridyne
    Level 1 (0 points)
    How do you back up meta data?

    Thank you Drew
  • by James McMahan2,

    James McMahan2 Apr 13, 2006 2:52 PM in response to drewfiero
    Level 1 (5 points)
    That is a very good question. I have learned a great deal after this failure. One would think the metadata backup process would be front and center in the documentation. That does not appear to be the case.

    Unfortunately, too late for me, I came across this knowledge base article.

    http://docs.info.apple.com/article.html?artnum=303371

    It refers to a command called "snmetadump", which takes a "snapshot" of the metadata. The article seems to use the terms snapshot and backup interchangeably. But it is the only solution that I have found.

    James L. McMahan Jr.
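
If you do take such snapshots, it seems sensible to keep dated copies somewhere off the SAN. Below is a minimal Python sketch of that idea, assuming the snapshot already exists as an ordinary file on disk. The `archive_snapshot` function, the file names, and the retention count are hypothetical, and the sketch deliberately does not invoke snmetadump, since its exact usage is not shown in this thread.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_snapshot(dump_file, archive_dir, keep=14, now=None):
    """Copy `dump_file` into `archive_dir` under a timestamped name.

    Only the newest `keep` copies are retained. The paths and the
    retention policy are illustrative assumptions, not Xsan behavior.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    dump_file = Path(dump_file)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # A sortable timestamp makes alphabetical order match age order.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = archive_dir / f"{dump_file.stem}-{stamp}{dump_file.suffix}"
    shutil.copy2(dump_file, target)

    # Delete everything except the newest `keep` copies.
    copies = sorted(archive_dir.glob(f"{dump_file.stem}-*{dump_file.suffix}"))
    for old in copies[:-keep]:
        old.unlink()
    return target
```

The archive directory should live on storage that does not depend on the SAN, otherwise the copies are lost along with the metadata LUN.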
  • by JotJot,

    JotJot Apr 18, 2006 5:45 AM in response to James McMahan2
    Level 1 (85 points)
    I would not want to rely on old metadata snapshots, especially since the utility dumps the metadata but nobody tells you how to restore it. Some years ago there was a LUN mirror feature in StorNext that would have allowed you to add more redundancy.

    Now you only have the backup left. Unfortunately, there is no restore feature available like the one in StorNextManager.

    jotjot
  • by sapridyne,

    sapridyne Apr 21, 2006 5:00 PM in response to JotJot
    Level 1 (144 points)
    I agree with JotJot. The best way to back up metadata is to back up your SAN. Maybe 1.5 or 2.0 (whenever they come out) will be more robust in this area.

    Sapridyne
  • by James McMahan2,

    James McMahan2 Apr 21, 2006 6:52 PM in response to sapridyne
    Level 1 (5 points)
    Oh, I agree. It is by no means an ideal solution. For my scenario, I look at it as: some hope is better than none at all. Those snapshots might have provided some hope.
  • by Rudolph Mallamas, Helpful

    Rudolph Mallamas Apr 23, 2006 7:58 PM in response to James McMahan2
    Level 1 (5 points)
    For those who can afford it, ADIC has a Storage Manager piece to the SAN puzzle that Apple did not license for Xsan. Xsan is just the file system, but Storage Manager is a complete data system. I am using SM with my Xsan and have been very pleased so far. It automatically backs up the metadata as well as all data to tape. It is not a cheap solution, but if the data is critical...
  • by James McMahan2,

    James McMahan2 Nov 20, 2010 6:07 PM in response to James McMahan2
    Level 1 (5 points)
    The answer was to rebuild the SAN and pull as much as possible from the backup.