NFS bug in 10.6.8 with PCP Server: Xgrid clients get NFS timeouts/disconnects, jobs fail.
Nasty NFS bug in 10.6.8 on a machine running a PCP server: the Xgrid clients get NFS timeouts/disconnects, and jobs fail. Is anyone else experiencing this issue?
Jul 29 17:01:34 servername org.machx.snmp-data[46790]: Time to sleep 60 seconds
Jul 29 17:02:34 servername org.machx.snmp-data[46790]: done
Jul 29 17:02:41 servername KernelEventAgent[48]: tid 00000000 received event(s) VQ_NOTRESP (1)
Jul 29 17:02:41 servername KernelEventAgent[48]: tid 00000000 type 'nfs', mounted on '/Network/Servers/someserver.some.edu/Volumes/PCP2_Media/Podcast_Producer_Library', from 'someserver.some.edu:/Volumes/PCP2_Media/Podcast_Producer_Library', not responding
Jul 29 17:02:41 servername KernelEventAgent[48]: tid 00000000 found 1 filesystem(s) with problem(s)
Jul 29 17:02:58 servername KernelEventAgent[48]: tid 00000000 unmounting 1 filesystems
Jul 29 17:03:08 servername pcastaction[46809]: PodcastProducer::Actions::QTImport: FINISH
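For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch that picks out the KernelEventAgent "not responding" lines for NFS mounts. The regular expression is inferred only from the excerpt above, and the function name `find_nfs_stalls` is made up for illustration, so treat it as a starting point rather than a complete parser.

```python
import re

# Pattern inferred from the KernelEventAgent line in the excerpt above
# (an assumption: other OS versions may format this message differently).
NOT_RESPONDING = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"KernelEventAgent\[\d+\]: .*type '(?P<fstype>[^']+)', "
    r"mounted on '(?P<mountpoint>[^']+)', "
    r"from '(?P<source>[^']+)', not responding"
)


def find_nfs_stalls(lines):
    """Return (timestamp, source, mountpoint) for each NFS 'not responding' line."""
    stalls = []
    for line in lines:
        match = NOT_RESPONDING.search(line)
        if match and match.group("fstype") == "nfs":
            stalls.append(
                (
                    match.group("timestamp"),
                    match.group("source"),
                    match.group("mountpoint"),
                )
            )
    return stalls


# Two lines taken from the excerpt above, used as sample input.
sample = [
    "Jul 29 17:02:34 servername org.machx.snmp-data[46790]: done",
    "Jul 29 17:02:41 servername KernelEventAgent[48]: tid 00000000 type 'nfs', "
    "mounted on '/Network/Servers/someserver.some.edu/Volumes/PCP2_Media/Podcast_Producer_Library', "
    "from 'someserver.some.edu:/Volumes/PCP2_Media/Podcast_Producer_Library', not responding",
]

for timestamp, source, mountpoint in find_nfs_stalls(sample):
    print(timestamp, source, "->", mountpoint)
```

To run it against a real log, pass an open file instead of `sample`, for example `find_nfs_stalls(open("/var/log/system.log", errors="replace"))`. That path is an assumption; adjust it to wherever your system log is kept.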
* Mac Pro server running 10.6.8v2, with two Mac Pro Xgrid clients.
* When the clients access the NFS share, the server stops responding and all TCP connections die. After anywhere from a few seconds to a few minutes it responds again, but the jobs fail.
* Not surprisingly, the NFS disconnects have caused clients to kernel panic.
* I reverted to 10.6.7 and the issue subsided.
Is anyone else having PCP/NFS/Xgrid issues with 10.6.8 Server?
Any insight would be greatly appreciated.