Network home folder clients (10.8.2) freezing

9045 Views 45 Replies Latest reply: Mar 4, 2014 6:23 PM by ziondotcom
Nicholas Woolridge
Mar 4, 2013 12:12 PM

Hi all,

 

I have a Mac OS X Lion Server (10.7.5, all updates, Mac mini server with TB RAID attached) serving network home folders to Mac OS X Mountain Lion 10.8.2 clients.

 

Some of our users are experiencing freezes that manifest shortly after login. It appears that the shared volume is no longer available, based on the following system log entries. Again, this only happens to some users, but those users have it happen consistently...

 

Any insight?

 

Mar  4 13:46:47 BMC-CCT3110-Chloroplast.local KernelEventAgent[47]: tid 00000000 type 'afpfs', mounted on '/Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers', from '//erink@www.ourURL.foo.foo/BMCusers', not responding

Mar  4 13:46:47 BMC-CCT3110-Chloroplast.local KernelEventAgent[47]: tid 00000000 found 1 filesystem(s) with problem(s)

Mar  4 13:46:47 BMC-CCT3110-Chloroplast.local KernelEventAgent[47]: tid 00000000 received event(s) VQ_NOTRESP (1)

Mar  4 13:46:47 --- last message repeated 1 time ---

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: ASP_TCP Disconnect: triggering reconnect by bumping reconnTrigger from curr value 8 on so 0xffffff802d71c370

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: ASP_TCP asp_tcp_usr_control: invalid kernelUseCount 0

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect started /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers prevTrigger 8 currTrigger 9

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  doing reconnect on /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  posting to KEA EINPROGRESS for /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  Max reconnect time: 600 secs, Connect timeout: 15 secs for /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  connect to the server /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  Logging in with uam 10 /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  Restoring session /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: ASP_TCP ReplayPendingReqs: replaying slot 7 with reqID 51198 afpCmd 0x44 on so 0xffffff802d71c370

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  get the reconnect token

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: ASP_TCP Disconnect: triggering reconnect by bumping reconnTrigger from curr value 9 on so 0xffffff802d71c370

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: ASP_TCP asp_tcp_usr_control: invalid kernelUseCount 0

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect started /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers prevTrigger 9 currTrigger 10

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  doing reconnect on /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast.local KernelEventAgent[47]: tid 00000000 type 'afpfs', mounted on '/Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers', from '//erink@www.ourURL.foo.foo/BMCusers', not responding

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  posting to KEA EINPROGRESS for /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mar  4 13:46:47 BMC-CCT3110-Chloroplast kernel[0]: AFP_VFS afpfs_DoReconnect:  Max reconnect time: 600 secs, Connect timeout: 15 secs for /Network/Servers/www.ourURL.foo.foo/Volumes/User_partition/BMCusers

Mac OS X (10.6.7), OS X server
  • cafarom

    Did you figure out this issue? We have a similar problem.

  • ebrind Level 1 (15 points)

    Hello,

     

    I don't know if this will help, but here are the first things I check when a user cannot log in.

     

    1. Open workgroup manager, select the user, click Home.

     

    2. Click the server.local/Users/ entry, NOT the server.com/Users/ entry.

     

    3. Set the disk Quota

     

    4. Create Home Now

     

    5. Save

     

    [Screenshot: Screen Shot 2013-03-12 at 1.44.04 PM.jpg]

     

     

    6. Open Server.app and verify that the user's home folder is set to Custom.

     

    [Screenshot: Screen Shot 2013-03-12 at 1.51.48 PM.jpg]

     

    The user should then be able to log in.

     

    Hope this helps!

     

    Thanks,

     

    ebrind

  • ebrind Level 1 (15 points)

    Hello,

     

    Is the Home Directory Location correct in the Advanced Options of the user having the issue in Server.app?

     

    [Screenshot: Screen Shot 2013-03-12 at 3.13.22 PM.jpg]

  • cafarom Level 1 (10 points)

    Hi Nicholas,

     

    I've just determined the cause of our problem. Try disabling Spotlight on the affected computers:

     

    sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist

     

    See if that stops the freezing. If so, either rebuild the Spotlight index on the computer or leave Spotlight disabled.

     

    -Mark
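
A minimal shell sketch of the workaround above, together with the matching command to turn Spotlight back on. The `launchctl unload -w` line is the one quoted in this post; `launchctl load -w` on the same plist is the usual way to reverse it. The function name `spotlight_daemon` and the `DRY_RUN` variable are hypothetical helpers added for illustration, not part of OS X.

```shell
#!/bin/sh
# Path of the Spotlight metadata server launch daemon, as quoted in this thread.
MDS_PLIST=/System/Library/LaunchDaemons/com.apple.metadata.mds.plist

# spotlight_daemon on|off
# Hypothetical wrapper: loads or unloads the mds launch daemon.
# If DRY_RUN is set to a non-empty value, the command is printed instead of run.
spotlight_daemon() {
    case "$1" in
        off) ${DRY_RUN:+echo} sudo launchctl unload -w "$MDS_PLIST" ;;
        on)  ${DRY_RUN:+echo} sudo launchctl load -w "$MDS_PLIST" ;;
        *)   echo "usage: spotlight_daemon on|off" >&2; return 2 ;;
    esac
}
```

For example, `DRY_RUN=1 spotlight_daemon off` only prints the unload command, while `spotlight_daemon on` loads the daemon again.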

  • ServerBurninator
    Apr 21, 2013 5:38 PM (in response to cafarom)

    I can confirm that after 8 months of fighting this problem, this worked for us:

    sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist

     

    Cafarom, thank you so much!!

     

    File a bug report with Apple about this. I just did, here is my report, hope it helps you.

     

     

     

    Summary:

        Some network accounts freeze soon after logging in to an OS X server. The problem starts about 1 minute after login, once the applications from the previous session have reopened. Applications then hang one by one as they try to access the disk. This is fatal: the system must be hard-reset to recover.

      

        This seems to be Spotlight-related: disabling Spotlight on the client machines seems to "solve" the problem.

     

    Steps to Reproduce:

        Not all accounts seem to be affected, and I have not yet been able to isolate the trigger. Once a network account becomes "infected", nothing can be done to fix it and the problem persists across all client computers. That is, logging in with the infected account from any client machine on the network always produces the same degenerate behavior.

     

        Once an account becomes "infected", it can be temporarily cured by backing up its contents, deleting it, and recreating it on the server. However, the problem soon returns.

     

        All client machines and the server are running OS X 10.8.3, and the clients are bound to the server. The server is the Open Directory master and the DNS server for all machines on the network. No other servers or non-Mac machines are on the network. The server uses a self-signed certificate and has the following services running: Calendar, Contacts, DNS, File Sharing, Open Directory, Profile Manager, Websites.

     

        All network accounts use the following Home directory on the server: /Users/

     

    Expected Results:

        Upon logging in, applications respond normally, the system and applications do not freeze.

     

    Actual Results:

        Some network accounts always cause the system and all applications to freeze, starting about 1 minute after login.

     

    Workaround:

        Permanently disable Spotlight on all client machines with: sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist

     

        This "fixes" the problem, and the affected network accounts can then log in without freezing.

     

    Regression:

        Here is the error message from the client machine that accompanies the problem. It repeats verbatim on every login by an infected account. With the workaround applied, the message does not appear.

     

    4/21/13 3:52:50.937 PM KernelEventAgent[42]: tid 00000000 received event(s) VQ_NOTRESP (1)

    4/21/13 3:52:50.938 PM KernelEventAgent[42]: tid 00000000 type 'afpfs', mounted on '/Network/Servers/XXXXX/Users', from '//user@XXXXX/Users', not responding

    4/21/13 3:52:50.000 PM kernel[0]: ASP_TCP Disconnect: triggering reconnect by bumping reconnTrigger from curr value 0 on so 0xffffff8026a6d370

    4/21/13 3:52:50.000 PM kernel[0]: ASP_TCP asp_tcp_usr_control: invalid kernelUseCount 0

    4/21/13 3:52:50.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect started /Network/Servers/XXXXX/Users prevTrigger 0 currTrigger 1

    4/21/13 3:52:50.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect:  doing reconnect on /Network/Servers/XXXXX/Users

    4/21/13 3:52:50.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect:  posting to KEA EINPROGRESS for /Network/Servers/XXXXX/Users

    4/21/13 3:52:50.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect:  Max reconnect time: 600 secs, Connect timeout: 15 secs for /Network/Servers/XXXXX/Users

    4/21/13 3:52:50.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect:  connect to the server /Network/Servers/XXXXX/Users

    4/21/13 3:52:50.939 PM KernelEventAgent[42]: tid 00000000 found 1 filesystem(s) with problem(s)

    4/21/13 3:52:50.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect:  Logging in with uam 10 /Network/Servers/XXXXX/Users

    4/21/13 3:52:51.000 PM kernel[0]: AFP_VFS afpfs_DoReconnect:  Restoring session /Network/Servers/XXXXX/Users
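
One way to check whether the message above still appears after applying the workaround is to count the afpfs "not responding" reports in a saved copy of the log. This is a rough sketch: the function name `afp_not_responding` is made up for illustration, and it assumes the KernelEventAgent line format shown in this thread (single-quoted filesystem type and mount path).

```shell
#!/bin/sh
# afp_not_responding LOGFILE
# Prints "<count> <mount path>" for each AFP mount that KernelEventAgent
# reported as not responding. Splitting on single quotes makes field 2 the
# filesystem type and field 4 the mount point in lines such as:
#   ... type 'afpfs', mounted on '/Network/Servers/XXXXX/Users', from '...', not responding
afp_not_responding() {
    awk -F"'" '$2 == "afpfs" && /not responding/ { seen[$4]++ }
               END { for (m in seen) print seen[m], m }' "$1"
}
```

Running `afp_not_responding /var/log/system.log` on an affected client should list the home-folder mount; no output suggests the event was not logged in that file.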

  • cafarom Level 1 (10 points)

    I struggled with this for an equally long time. Glad I could help.

  • cafarom Level 1 (10 points)

    Yes, you can re-enable Spotlight as long as you rebuild the Spotlight index. See http://osxdaily.com/2012/01/17/rebuild-spotlight-index/

     

    Make sure that, in addition to rebuilding the local hard drive index, you also rebuild the network home directory index in the same fashion (drag the home directory into the Spotlight Privacy list and then remove it).

     

    I believe the issue occurs when the Spotlight indexer gets stuck on a file located on the server. You may find it crops up again; if so, you'll have to repeat this process.
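
The index rebuild described above can also be triggered from Terminal. A minimal sketch, assuming `mdutil -E` (erase and rebuild the index for a volume) behaves on your system as its man page describes; whether it accepts a network home mount the same way as a local volume is an assumption worth testing on one machine first. The function name `rebuild_spotlight_index` and the `DRY_RUN` variable are hypothetical.

```shell
#!/bin/sh
# rebuild_spotlight_index VOLUME...
# Asks Spotlight to erase and rebuild its index on each volume given.
# If DRY_RUN is set to a non-empty value, the commands are printed instead of run.
rebuild_spotlight_index() {
    for vol in "$@"; do
        ${DRY_RUN:+echo} sudo mdutil -E "$vol"
    done
}
```

For example, `rebuild_spotlight_index / /Network/Servers/XXXXX/Users` would cover the local disk and the network home share (the path is the redacted one from the log earlier in this thread). `mdutil -s <volume>` reports the indexing status afterwards.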

  • ServerBurninator Level 1 (0 points)

    I did not have any luck with rebuilding the Spotlight indexes either. In my case as well, the only way is to leave Spotlight permanently disabled, which, as you said, is a huge problem that Apple really needs to fix.

     

    Nicholas, Cafarom, if you haven't done so already, I would highly encourage you to submit your own bug reports here: https://bugreport.apple.com

     

    Apple prioritizes their work based on the number of bug reports received, so your added information should help them solve the problem faster.

  • cafarom Level 1 (10 points)

    Hmmm... I haven't had any problems after rebuilding the index.

     

    Are you sure you're rebuilding the entire index? You have to drag all network locations (including the user's home directory) as well as the entire local hard drive to the Privacy window.
