PubSub agent and networked home dirs

This isn't exactly OS X server related, but the brain trust to answer my question is more likely to be in this forum than others. Quick background: I serve home directories off a ReadyNAS via netatalk, and manage an openldap directory via Workgroup Manager.

Some of my users are having odd problems which I think started with 10.6.2 or 10.6.3. If a client host goes to sleep, on wakeup there is an afp reconnect issue that eventually works itself out. However, the PubSubAgent becomes quite angry. Example:

4/13/10 10:28:20 PM kernel AFP_VFS afpfs_DoReconnect: Restoring session /Network/Servers/readynas.local/Users
4/13/10 10:28:20 PM KernelEventAgent[43] tid 00000000 received event(s) VQ_NOTRESP (1)
4/13/10 10:28:20 PM PubSubAgent[15149] SQL Error: SQLITE_IOERR[10.15]: disk I/O error in "select feed from Subscriptions where refreshInterval>=0 and feed>0 group by feed"
4/13/10 10:28:20 PM PubSubAgent[15149] SQLite::Exception "SQLITE_IOERR[10.15]: disk I/O error in "select feed from Subscriptions where refreshInterval>=0 and feed>0 group by feed"" caught in void Foundation::AbstractTimer::_fire()
4/13/10 10:28:21 PM kernel AFP_VFS afpfs_DoReconnect: Primary Reconnect failed 5 on /Network/Servers/readynas.local/Users
4/13/10 10:28:21 PM kernel AFP_VFS afpfs_DoReconnect: trying Secondary Reconnect on /Network/Servers/readynas.local/Users
4/13/10 10:28:21 PM kernel AFP_VFS afpfsCheckOpenFiles: File <Database.sqlite3> can not be reopened due to deny modes or byte range locks.
4/13/10 10:28:21 PM kernel AFP_VFS afpfsCheckOpenFiles: File <WebpageIcons.db> can not be reopened due to deny modes or byte range locks.
4/13/10 10:28:21 PM kernel ASP_TCP CancelOneRequest: cancelling slot 23 error 35 reqID 61046 flags 0x29 afpCmd 0x22 so 0x7e02000
4/13/10 10:28:21 PM kernel ASP_TCP CancelOneRequest: cancelling slot 24 error 35 reqID 61047 flags 0x29 afpCmd 0x7A so 0x7e02000
4/13/10 10:28:21 PM kernel AFP_VFS afpfs_DoReconnect: get the reconnect token
4/13/10 10:28:23 PM PubSubAgent[15149] SQL Error: SQLITE_IOERR[10.15]: disk I/O error in "select feed from Subscriptions where refreshInterval>=0 and feed>0 group by feed"
4/13/10 10:28:23 PM PubSubAgent[15149] SQLite::Exception "SQLITE_IOERR[10.15]: disk I/O error in "select feed from Subscriptions where refreshInterval>=0 and feed>0 group by feed"" caught in void Foundation::AbstractTimer::_fire()

The PubSubAgent's SQLite errors repeat every 3 seconds, during which time Safari is crippled: it takes forever to start, then periodically freezes. I verified that the sqlite3 db is fine; I can even run the queries that are failing. If I delete the db, the problem occurs again after the next sleep/wake cycle. It appears to have something to do with afpfsCheckOpenFiles losing a lock on the db.
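
For anyone wanting to repeat that kind of check, here is a minimal Python sketch. It is not necessarily how the check above was done. It runs SQLite's integrity check and then the exact query from the log. The function name `check_pubsub_db` is made up for illustration, and the path to Database.sqlite3 is left as a parameter because the log only shows the file name.

```python
import sqlite3

# Query copied verbatim from the PubSubAgent log lines above.
FAILING_QUERY = (
    "select feed from Subscriptions "
    "where refreshInterval>=0 and feed>0 group by feed"
)


def check_pubsub_db(path):
    """Return (integrity_result, feed_ids) for the SQLite file at `path`.

    `check_pubsub_db` is a hypothetical helper, not part of any Apple tool.
    """
    conn = sqlite3.connect(path)
    try:
        # PRAGMA integrity_check returns the single row "ok" for a healthy file.
        integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]
        # Re-run the statement that PubSubAgent reports as failing.
        feeds = [row[0] for row in conn.execute(FAILING_QUERY)]
    finally:
        conn.close()
    return integrity, feeds
```

If this succeeds while PubSubAgent keeps logging SQLITE_IOERR, that suggests the file itself is intact and the trouble is with the agent's already-open handle.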

Anyone have advice?

Core Duo 2.16 MBP, Mac OS X (10.6.3), 2g iPhone;C2D Mini;3 x g4 Mini;AE 802.11n;AE dualband 802.11n;3 x AX 802.11n

Posted on Apr 13, 2010 10:21 PM

Question marked as Best reply

Apr 14, 2010 5:42 AM in response to Daryn Sharp

Netatalk doesn't support "Primary Reconnects", only "Secondary Reconnects". The difference is that a primary reconnect maintains state while a secondary one doesn't. State here refers to things like open files and their associated locks. You get the idea?
The only viable workaround is probably to avoid sleep mode on machines whose users have network homes.

Apr 14, 2010 5:23 PM in response to slowfranklin

Ah, thanks! Primary reconnects are the replay cache, right? If so, that's an AFP 3.3 feature but netatalk is only 3.2 and doesn't seem to advertise a replay cache according to a packet sniff.

Seems like two bugs:
1) The 10.6.x afp client is erroneously trying to use a feature that netatalk doesn't advertise.
2) More importantly, PubSubAgent/sqlite3 isn't recovering correctly from losing a file descriptor.

I'd say my first viable option is a launchd login process for network users that uses IORegisterForSystemPower to kill PubSubAgent on wakeup. After that I might try to implement a replay cache in netatalk, although the more I think about it, the less it looks like an easy task...
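
A rough sketch of the kill-on-wake idea, with a caveat: IORegisterForSystemPower is a C-level IOKit call, so this Python version swaps in a cruder technique and does not use it. It polls the wall clock and treats a large gap between polls as a sleep/wake cycle. The names `woke_from_sleep`, `kill_pubsub_agent` and `watch`, the 2-second interval and the 5-second slack are all illustrative assumptions.

```python
import subprocess
import time


def woke_from_sleep(prev_wall, now_wall, interval, slack=5.0):
    """True if far more wall-clock time passed than one polling interval."""
    return (now_wall - prev_wall) > interval + slack


def kill_pubsub_agent():
    # killall is a standard Mac OS X tool; a non-zero exit status only
    # means no matching process was found, so it is ignored here.
    subprocess.call(["killall", "PubSubAgent"])


def watch(interval=2.0, slack=5.0, on_wake=kill_pubsub_agent,
          clock=time.time, sleep=time.sleep, max_polls=None):
    """Poll the wall clock and call on_wake() after each suspected wake.

    clock, sleep and max_polls exist only so the loop can be exercised
    without actually sleeping; by default it runs until killed.
    """
    polls = 0
    prev = clock()
    while max_polls is None or polls < max_polls:
        sleep(interval)
        now = clock()
        # While the machine sleeps this process is suspended, so the
        # wall clock jumps ahead by much more than `interval`.
        if woke_from_sleep(prev, now, interval, slack):
            on_wake()
        prev = now
        polls += 1
```

Calling `watch()` from a script started by a per-user launchd job would run it for the whole login session. A manual clock change or a large NTP step would also trigger it, which is one reason a real power notification would be the better long-term fix.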
