ktwalker69

Q: symbolic links get corrupted by system process?

Greetings Folks,

 

This was posted in another forum, so I'm reposting two messages here:

 

I am having a problem with symbolic links getting corrupted. I have a new Mac Pro running 10.7.3. I have defined these symbolic links:

 

/Users/walker/G2S -> /Volumes/L2A/G2S [this is pointing to a different partition on the same JBOD RAID]

/home -> /Users

 

The second link was created after unmounting /home and removing it from the /etc/auto_master file.

 

Both symbolic links worked for several days.  But then for some reason, without a reboot, the links became corrupted:

 

> pwd

/Users/walker

> ls -al G2S

lrwxr-xr-x  1 walker  staff  16 Mar 24 03:08 G2S -> X??G???Gҡ?G???G

> cd G2S

G2S: No such file or directory.

 

Same nonsensical definition for /home link.  I repeat, this did not happen after a reboot.  It first happened on /home.  I thought that might have been related to a new OS handling of the "/home" label.  So I deleted the /home link and did a clean reboot.  The G2S link was created after that reboot, not before.

 

After the above two problems happened, I created a new symbolic link

 

/Users/walker/G2S2 -> /Volumes/L2A/G2S

 

I then did not use this new symbolic link in any of my processing scripts.  A few weeks went by, then this link somehow got corrupted too:

 

lrwxr-xr-x   1 walker  staff     16 Apr  2 17:22 G2S2 -> 꺄G???Gĺ?Gú?G

 

Does anyone here know how symbolic links are managed on a Mac (any process that controls their linking?), or have any information to help me figure out how to fix this?  For example, could it be due to bad RAM?  I have 32 GB.

 

Thank you,

Kris Walker

Mac Pro, Mac OS X (10.7.3)

Posted on Apr 20, 2012 3:47 PM


first Previous Page 12 of 16 last Next
  • by hstimer, Apr 14, 2013 1:58 PM, in response to daboulet

    I've seen the problem on 2 separate Mac Pro systems. One was a 3,1, I believe, and the other is a 2012 system. Both were configured with software RAID 10, a total of 3TB on the former and 4TB on the latter.

     

    The old machine had the problem with 10.7.x, and the new machine has the problem with 10.8.3.

     

    Neither machine suffered data corruption, just link corruption.

     

    Both machines were used as developer machines, so they had many symlinks from Homebrew, my own scripts, etc.

     

    Both machines had other links go bad -- frameworks, mail.

     

    I replaced all 4 of the disks on the new MacPro. Went from Seagates to WD, because I assumed my disks were going bad. Didn't help.

     

    Neither machine sleeps, and maybe 1 reboot a month.

     

    I have no solution, I'm just frustrated.

     

    daboulet, thanks for the tool, I'll give it a shot.

  • by hstimer, Apr 14, 2013 4:23 PM, in response to hstimer

    I used daboulet's tool and found the following errors (I had to use a few more backslashes to get find to work right):

     

    find / -xdev -type l -exec ./checkSymlink \{\} \;

     

    /Applications/Adobe InDesign CS6/Adobe InDesign CS6.app/Contents/MacOS/Required/Application UI.InDesignPlugin/Resources

    /Applications/Motion.app/Contents/Frameworks/TextFramework.framework/Resources

    /Applications/Motion.app/Contents/Frameworks/TextFramework.framework/Versions/Current

    /Applications/Numbers.app/Contents/Frameworks/SFControls.framework/Resources

    /Applications/Numbers.app/Contents/Frameworks/SFControls.framework/Versions/Current

    /Applications/Numbers.app/Contents/Frameworks/SFInspectors.framework/Resources

    /Applications/Utilities/Adobe Application Manager/core/Adobe Application Manager.app/Contents/Frameworks/adbeapecore.framework/Versions/Current

    /Applications/Utilities/Adobe Application Manager/core/PDApp.app

    /Applications/Utilities/Adobe Application Manager/UWA/UpdaterCore.framework/Versions/Current

    /System/Hidden/Versions/Current

    /System/Library/Frameworks/CoreFoundation.framework/Headers

    /System/Library/Frameworks/Security.framework/Headers

    /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bundle

    /Users/hans/Library/Containers/com.apple.iPhoto/Data/Library/Application Support/AddressBook

    /Users/hans/Library/Containers/com.spanning.contactscleaner/Data/Library/Application Support/AddressBook

    /Users/hans/Library/Preferences/Keyboard Maestro

    /Users/hans/Pictures/iChat Icons

    /usr/bin/llvm-gcc

    /usr/llvm-gcc-4.2/libexec/gcc/i686-apple-darwin11/4.2.1/as

    /usr/llvm-gcc-4.2/libexec/gcc/i686-apple-darwin11/4.2.1/ld

    /usr/llvm-gcc-4.2/libexec/gcc/i686-apple-darwin11/4.2.1/libllvmgcc.dylib

    /usr/local/bin/git

    /usr/local/Library/ENV/4.3/gcc

    /usr/local/Library/ENV/4.3/llvm-gcc
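
For anyone following along without daboulet's tool: a minimal stand-in, assuming all you need is a list of symlinks whose targets no longer resolve. This is not checkSymlink itself (it is not reproduced on this page), and the function name find_broken_links is made up for illustration. It cannot tell a corrupted link from one that was simply left dangling by an uninstall.

```shell
# List symlinks under a directory whose targets do not resolve.
# NOTE: a stand-in sketch, not daboulet's checkSymlink tool.
find_broken_links() {
    # -xdev   : stay on the starting filesystem
    # -type l : consider only symlinks
    # test -e follows the link, so it fails when the target is missing
    # or the stored path is garbage; only those links are printed.
    find "$1" -xdev -type l ! -exec test -e {} \; -print
}
```

Paste the function into a sh or bash session and run it as, for example, `find_broken_links /Applications`.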

     

    Disk Utility doesn't report any RAID errors.

     

    DiskUtility

    ---

    Verifying permissions for “Striped RAID Set 3”

    Permissions differ on “System/Library/Frameworks/CoreGraphics.framework/Headers”; should be lrwxrwxrwx ; they are lrwxr-xr-x .

    Permissions differ on “usr/include/arpa/nameser.h”; should be lrw-r--r-- ; they are lrwxr-xr-x .

    Permissions differ on “usr/bin/what”; should be -rwxr-xr-x ; they are -r-xr-xr-x .

     

     

    Permissions verification complete

     

     

    Repairing permissions for “Striped RAID Set 3”

    Permissions differ on “System/Library/Frameworks/CoreGraphics.framework/Headers”; should be lrwxrwxrwx ; they are lrwxr-xr-x .

    Repaired “System/Library/Frameworks/CoreGraphics.framework/Headers”

    Permissions differ on “usr/include/arpa/nameser.h”; should be lrw-r--r-- ; they are lrwxr-xr-x .

    Repaired “usr/include/arpa/nameser.h”

    Permissions differ on “usr/bin/what”; should be -rwxr-xr-x ; they are -r-xr-xr-x .

    Repaired “usr/bin/what”

     

     

    Permissions repair complete

     

     

    Verify Disk

    ---

    Verifying volume “Striped RAID Set 3”

    Checking file system

    Performing live verification.

    Checking Journaled HFS Plus volume.

    Checking extents overflow file.

    Checking catalog file.

    Checking multi-linked files.

    Checking catalog hierarchy.

    Checking extended attributes file.

    Checking volume bitmap.

    Checking volume information.

    The volume Striped RAID Set 3 appears to be OK.

     

     

    >df

    Filesystem                               512-blocks       Used  Available Capacity   iused     ifree %iused  Mounted on

    /dev/disk4                               7812714496 4271467072 3540735424    55% 266998690 221295964   55%   /

    devfs                                           410        410          0   100%       711         0  100%   /dev

    map -hosts                                        0          0          0   100%         0         0  100%   /net

    map auto_home                                     0          0          0   100%         0         0  100%   /home

     

     

     

     

    I paid for fileXray to see if I can get some insight into what is going on. They haven't sent me a link yet for downloading. I don't like Kagi.

  • by Ed Newman, Apr 14, 2013 6:52 PM, in response to hstimer

    Not seeing anything with fileXray; I also bought it. Changes do not appear to be written to disk. I'm also seeing a lot of broken symlinks, but not all are related to this issue (the Sparkle framework appears to have a link for French resources hardcoded to a person's home drive, and this is known about on the web). So far the changes are only to Application / Library symlinks and not data.

  • by hstimer, Apr 14, 2013 10:10 PM, in response to Ed Newman

    Do you use Homebrew? I have a theory that their heavy use of softlinks triggers an Apple bug.

     

    Some other ideas:

     

    * Virus trying to write directly to raw block device, but screws up

    * FS metadata bug -- when you get to a certain amount of metadata in the FS, then you get some corruptions.

     

    Maybe use filexray to discover which physical blocks are corrupted. Are they in the same area of the disk? Or are they spread out?

     

    btw - according to filexray, a softlink is just a file with a special bit set. So the link isn't stored in the inode.
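
One thing that is easy to check from the shell: a symlink has its own size, and on POSIX systems that size is normally the length in bytes of the stored target path. In the listing in the original post the size is 16, which is also the length of the intended target /Volumes/L2A/G2S. A small sketch that compares the two (the function name is made up for illustration, and some special filesystems may report sizes differently):

```shell
# Succeeds when the size reported for the symlink itself equals the
# length of the target string stored in it.
link_size_matches_target() {
    # ls -lnd lists the link itself; with numeric owner and group,
    # the size is always the fifth field.
    reported=$(ls -lnd "$1" | awk '{print $5}')
    # readlink prints the target plus a newline, so subtract one byte.
    stored=$(readlink "$1" | wc -c)
    [ "$reported" -eq "$((stored - 1))" ]
}
```

Run it as `link_size_matches_target ~/G2S && echo consistent`. A mismatch would only be a hint that something is off, not proof.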

     

    I just spent the last 60 minutes fixing links. But thanks to your tool, I can easily find which links to fix!

  • by daboulet, Apr 15, 2013 4:03 AM, in response to Ed Newman

    If the changes are not being written to disk then a reboot of the computer will make the corrupted symlinks "heal themselves".  I've not had the problem for about four months now and I am not sure whether or not a reboot 'fixed' the problem when I was having it although my recollection is that it did not.

     

    Simple experiment to try.

  • by hstimer, Apr 15, 2013 8:53 AM, in response to daboulet

    I rebooted, and the broken links I had before are still there.

     

    I'm going to run your tool before and after different actions to see if I can narrow down when it happens: app and system upgrades, brew updates/upgrades. I'm guessing that it might happen when links are being manipulated.
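
One way to do that before/after comparison, sketched under the assumption that recording each link's target as a line of text is enough. The function name snapshot_links and the file names in the usage note are placeholders, not part of any existing tool.

```shell
# Record every symlink under a directory with its current target,
# one "path -> target" line per link, in a stable order.
snapshot_links() {
    find "$1" -xdev -type l -exec sh -c '
        for link in "$@"; do
            printf "%s -> %s\n" "$link" "$(readlink "$link")"
        done' sh {} + | LC_ALL=C sort
}
```

Typical use would be `snapshot_links /usr/local > before.txt`, then the upgrade or install being tested, then `snapshot_links /usr/local > after.txt` and `diff before.txt after.txt`. Any link whose stored target changed shows up in the diff, including links the action legitimately rewrote, so the output still needs a human eye.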

     

    I ran clamxav across /Applications and /Users. It found a bunch of old junkmail that was infected, but nothing that mattered.

     

    I'm assuming that I'm not having disk corruption because I'm running a mirror and, if there were a problem with the disks, Disk Utility would spot it. I wonder if that is a safe assumption.

  • by daboulet, Apr 15, 2013 9:21 AM, in response to hstimer

    I'm going to assume that by disk mirroring you mean the style of mirroring that one can configure using Disk Utility (this seems pretty clear from your comment but I could really confuse things if you meant something different).

     

    If the Mac OS X kernel is writing foolish things to the filesystem from time to time, and if this is what is causing the problem, then you should end up with the problem on both copies of a disk mirror when the foolish thing gets written. On the other hand, if we're dealing with a hardware problem (seems very unlikely to me), then in practically all scenarios that I can think of only one copy of the mirror would be corrupted, and whether or not you even saw the corruption would depend on which copy the kernel happened to read the symlink's data from.

     

    From everything that I've seen on this thread and in my own experience, this feels like a kernel bug. It could be a bizarre interaction between the kernel and some hardware strangeness but I doubt that given the pretty wide range of systems that see the problem. It could be strictly a hardware defect but that seems unlikely again given the wide range of systems that see the problem.

     

    That leaves "kernel bug" as the only likely scenario. I could be wrong but that's my interpretation of the evidence.

  • by hstimer, Apr 15, 2013 9:51 AM, in response to daboulet

    Confirmed: DiskUtility Mirror

     

    I'm inclined to think it could be the kernel too, but I would like to consider all the options.

     

    I think this is the link code:

     

         http://www.opensource.apple.com/source/xnu/xnu-2050.18.24/bsd/hfs/hfs_link.c

     

    Of course, the bug could be elsewhere.

  • by etresoft, Apr 15, 2013 10:41 AM, in response to hstimer

    hstimer wrote:

     

    I'm inclined to think it could be the kernel too, but I would like to consider all the options.

     

    I think this is the link code:

     

         http://www.opensource.apple.com/source/xnu/xnu-2050.18.24/bsd/hfs/hfs_link.c

     

    Of course, the bug could be elsewhere.

    If this "bug" even exists, you would have to compare the 10.6.8 kernel to the 10.7.0 kernel. I don't think that reference to the source for HFS hard link logic will help much. You would have to check every single line in the entire kernel for a bug in existence since 10.7 that affects maybe a dozen people out of 40 million.

     

    Good luck with that. I would be more interested to know what is in /System/Hidden.

  • by hstimer, Apr 15, 2013 10:56 AM, in response to etresoft

    Yep, it is the hard link code. I would like to start with fileXray (if they ever send me a link to the download) to see if the data in the corrupt links is found in any other blocks on disk, or if it is just random. Additionally, it might be interesting to see if corrupt links are physically near each other.

     

    I thought Apple had a microkernel where the different sub-systems couldn't stomp on each other. If they have all the same risks as a monolithic kernel, then I don't know why they would bother with a microkernel.

  • by hstimer, Apr 15, 2013 11:09 AM, in response to etresoft

    I'm sure this bug affects many more people than 12 in 40 million. Very few people have the skills to figure out that their software is busted because symlinks are corrupted. I had been seeing this problem for over a year before I decided to even look into it. Before, I would just reinstall any applications that quit working right. It was only because my .gitconfig soft link was getting repeatedly stomped that I bothered to look into what could be going on.

  • by etresoft, Apr 15, 2013 12:03 PM, in response to hstimer

    hstimer wrote:

     

    I thought Apple had a microkernel where the different sub-systems couldn't stomp on each other. If they have all the same risks as a monolithic kernel, then I don't know why they would bother with a microkernel.

    Again, you are assuming that someone has proven the existence of a kernel bug to explain this. No such proof has been provided by anyone for almost two years since Lion was released. Such things require hard, definitive proof. Anyone who has made custom, low-level changes to their operating system is automatically unable to demonstrate said proof regardless of evidence. I'm still waiting for an explanation of "/System/Hidden".

  • by hstimer, Apr 15, 2013 12:38 PM, in response to etresoft

    Actually, I'm completely open to:

    * disk corruption, if someone can explain how a mirror raid doesn't report it

    * virus

    * bug in any number of tools that run under sudo and do raw device access

    * os bug

     

    I suspect that it is an OS bug, but right now I'm focusing on trying to reproduce it, which will hopefully help guide further efforts.

     

    I don't understand "Anyone who has made custom, low-level changes to their operating system is automatically unable to demonstrate said proof regardless of evidence."

  • by sydvicious, Apr 15, 2013 12:44 PM, in response to hstimer

    Disk First Aid checks any number of known corruptions. Having a symlink point to nowhere is not normally considered a corruption, so Disk First Aid would not detect it.

     

    As to causes, I could not find any pattern to my corruption. I noticed that my iPhone stopped syncing, and when I checked the Console, it told me MobileDevices.framework could not be loaded. That's when I found the broken symlink. I fixed it, but a few weeks later, it got corrupted again. I used my own tools and various tools posted here to find many more broken symlinks, and then other things started corrupting quite frequently. I finally gave up and repartitioned my boot disk so that it is no longer 3 TB. I have not seen any corruption since.

     

    From what I saw, the pathnames of the corrupt symlinks contained what looked like memory overwrites. Sometimes, they spelled text. Other times, they had data that looked regular.
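
A rough way to separate that kind of garbage from ordinary dangling links, assuming legitimate targets on the machine are plain ASCII paths (a link that intentionally points at a non-ASCII file name would be flagged as well). Both function names are made up for illustration.

```shell
# Succeeds if the link's stored target contains any byte outside
# printable ASCII (space through tilde).
target_looks_garbled() {
    # LC_ALL=C makes grep treat the target as raw bytes.
    readlink "$1" | LC_ALL=C grep -q '[^ -~]'
}

# Print the stored target as hex bytes for closer inspection
# (the final 0a is the newline that readlink adds).
dump_target() {
    readlink "$1" | od -An -tx1
}
```

For example, `target_looks_garbled ~/G2S && dump_target ~/G2S` prints the raw bytes only when the target looks suspicious, which makes it easier to see whether the overwrite spells text or looks like regular data.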

  • by hstimer, Apr 15, 2013 2:02 PM, in response to sydvicious

    I'm running TechTool Pro 6.0.6's surface scan to get a second opinion on the validity of the disks. Micromat has a special right now for $49 which includes both TechTool Pro and Checkmate (whatever that is).

     

    I've tried a number of ways to get the broken links to happen again: heavy use of Homebrew, uninstalling and reinstalling apps. So far everything is still good.

     

    I wish Apple hadn't given up on ZFS. It keeps a checksum for every block (stored in the parent block pointer rather than in the block itself), which helps you detect errors right away.
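
To illustrate the general idea of per-block checksums, here is a toy sketch that runs cksum over fixed-size blocks of an ordinary file. It is not how ZFS implements it (ZFS does this inside the filesystem with its own checksum algorithms); the function name block_checksums and the 4096-byte default are arbitrary choices for the example.

```shell
# Print "index checksum" for each fixed-size block of a file.
# Toy illustration only; cksum (a CRC) is just what is at hand.
block_checksums() {
    file=$1
    bs=${2:-4096}                          # block size in bytes
    size=$(wc -c < "$file")
    blocks=$(( ($size + $bs - 1) / $bs ))  # round up for a partial last block
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # Read exactly one block at offset i and checksum it.
        sum=$(dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null | cksum | cut -d' ' -f1)
        printf '%s %s\n' "$i" "$sum"
        i=$((i + 1))
    done
}
```

Saving the output once and diffing it against a later run shows which block numbers changed. That is the property being wished for here: a silent overwrite is noticed on the next check rather than when an application breaks.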

     

    Still no response from the filexray people; they have my money, and I don't have their product.
