AFP server issue - very high CPU load

Hello,

I googled and searched this forum for a long time but found no solution.

My problem is that my OS X 10.5.4 server, with about 30 network home directory users, has an issue with the AFP server. The AFP server process uses all 8 cores of this newest Intel Xserve with 14 GB of RAM installed. When this happens, all users get a spinning wheel, and the incoming network traffic drops to a few KB.

OK, so all users shut down their clients and I restart the server, and about 30 minutes later I have the same problem.

I have dumped the network traffic with Wireshark, and there I see some TCP retransmissions.
Now I need someone who can help me analyse the Wireshark capture, because I can't handle that myself.

So if there is someone out there who can help me, please send an email to support@premedia.at so that I can send you the Wireshark log.

Thank you in advance.

Macbook Pro, Mac OS X (10.5.4)

Posted on Aug 29, 2008 2:47 AM

279 replies

Feb 10, 2009 2:52 PM in response to Kery O

Could you test whether the CPU gets pegged when there are NO Leopard clients attached at all?

A single Leopard client (10.5.6) is able to completely saturate the CPU of our G5 Xserves (10.4.11), resulting in spinning beach balls for the rest of the users running 10.4.11.
When only Tiger clients are connected, the CPU hardly goes above 30%, no matter how much traffic they create.
But as soon as a single Leopard client connects, spikes start to appear. These spikes become wider and wider until they reach a point where they hog the CPU for several minutes.
As soon as the Leopard client disconnects or quits all Adobe CS2 apps, the CPU drops to something like 20%...

Feb 10, 2009 2:54 PM in response to Manfred Rumpl

OK, I've had the AFP problem AND the logout freeze problem, so I'm posting my solution in both threads because I think the problems are directly connected. Why? Well, if fewer users are connected to the server, the chance that a machine freezes when a user logs out really decreases.

First of all, my "hood":

1x Mac Pro, running 10.5.6 Server
LDAP OD master, 300 users (all with home directories on the server)
Services: AFP, OD, Software Update, iCal
No external RAID
1x DNS/mail server (Red Hat)

50x machines (all Intel, but a mix of Mac Pros, Mac minis, etc.) running 10.5.6
All with CS3, Office 2004, etc.


The mess:
After updating the server and clients to 10.5.6, the problems started with machines freezing at logout. The AFP problem was not noticed in the beginning because the users didn't recognize the slow behaviour at first.
So at first I was searching for the logout problem. After days of searching, I realized that the logout freezes and the high AFP load are maybe not two different problems, but two sides of the same coin.


Possible solutions/workarounds:

- Increase RAM
No solution, and it can't be a serious suggestion: my server runs with 6 GB RAM 😀
- Kill the AFP process by hand or with a script
No solution, because in an environment like mine lots of angry users will lose files and their mail clients will go crazy.
- Turn off Spotlight
I've tried this. Well, it did not have the effect I was hoping for, and it costs Spotlight usability; no solution.
- Quit applications, go to standby, wake up, then log out...
Try to use this in an environment with 300+ users. Hehe. No solution.
- Disabled "Allow Host Cache Flushing"
No effect.
- Disabled auto-disconnect in AFP after idle time
No effect.
- Disabled Kerberos for AFP authentication
No effect, and not an option because of security issues.

First of all, the solutions I've posted below were brought together by searching, reading, and testing.
To anyone who finds his own hints in here: thanks so far. "Charles Wiles", Rob@Bis, etc.
I don't want to adorn myself with borrowed plumes.


Working solution:

Setting new thresholds
I do it like this: I send the following Unix commands to all machines over ARD:
defaults write /Library/Preferences/com.apple.AppleShareClient -dict-add afp wanthreshold -int 1000
defaults write /Library/Preferences/com.apple.AppleShareClient -dict-add afp wanquantum -int 131702


checking it by sending:
defaults read /Library/Preferences/com.apple.AppleShareClient


If the systems return the correct data, fine. 😀
You could set these thresholds in the global file, but I decided to set them this way.

My AFP load was reduced to a maximum of 60% with 80-100 users in daily business in our environment.

But I still had the logout freeze, so there seemed to be another problem.

After searching, the problem seemed to be a combination of fonts and some kind of heavy load through caches.

Get rid of the fonts
I've done the following steps:
1. Delete all the useless fonts in the font folder of the users. For 300 users, well, that's a lot of work. But you can make it easy.
Go to the terminal on the server and use the following command to delete the fonts in each user's ~/Library/Fonts/ folder:
sudo rm -r */Library/Fonts/*

Make sure you're in the right directory. In my environment we have three folders containing users, named studis, doz, and verw, so I have to do the job three times.
You may run into problems if you have many users: you may get an "argument list too long" error, which is a Unix limitation. You can write a script to avoid the problem, but if you're not familiar with Unix, use the easy way and do it in steps. Use it this way:
sudo rm -r [a-z]*/Library/Fonts/*

There have to be square brackets around the letters, and an asterisk before and after the /Library/Fonts/.
You can then use the command from the first step with [a-c], then [d-f], etc. Easy workaround. 😀
You can also do this with the MCX redirector in WGM, but if you're not familiar with it, better use this method. For me it's best because I'm a fan of "hardcoded" solutions. Kill the fonts the right way. 😀
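
For the scripted route mentioned above, a minimal sketch could look like the following. It is an illustration only, not a script from this thread: the function name clear_user_fonts and its list/delete switch are made up for this example. The idea is that a for loop handles one home directory at a time, so no single command line has to carry every user's font files.

```shell
#!/bin/sh
# Hypothetical helper, not from the original post.
# $1 = folder that contains the user home directories (e.g. studis)
# $2 = "delete" to really remove files; anything else only lists them
clear_user_fonts() {
    base=$1
    mode=${2:-list}
    # The loop visits one home directory per iteration, so no single
    # command line can grow into an "argument list too long" error.
    for home in "$base"/*/; do
        fonts="${home}Library/Fonts"
        [ -d "$fonts" ] || continue      # this user has no Fonts folder
        if [ "$mode" = "delete" ]; then
            # Remove everything inside Fonts but keep the folder itself.
            find "$fonts" -mindepth 1 -delete
        else
            # Dry run: print what "delete" would remove.
            find "$fonts" -mindepth 1
        fi
    done
}
```

Run as root, one would first call clear_user_fonts /path/to/studis to review the list and then clear_user_fonts /path/to/studis delete, where /path/to is a placeholder for wherever the user folders live. Unlike the * glob in the commands above, find also picks up hidden files inside the Fonts folders.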
After all that, don't forget to check whether every home directory has the .TemporaryItems folder, because if not, MS Office will always give you a "Word cannot save this document due to a naming or permissions error on the destination volume."
You can see this invisible folder if you navigate through your folders in the terminal as above. If the folder doesn't exist:
$ sudo mkdir .TemporaryItems

in the right directory. And don't forget to check that the permissions for the folder are right.
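
With many home directories, the check for .TemporaryItems can be looped as well. The following is a rough sketch under stated assumptions: ensure_temporary_items is a hypothetical name, and the function only creates missing folders and reports them. Ownership and permissions are deliberately left alone, since the post does not say which settings are the right ones, so they still need to be checked by hand afterwards.

```shell
#!/bin/sh
# Hypothetical helper, not from the original post.
# $1 = folder that contains the user home directories.
# Creates .TemporaryItems in every home directory that lacks it and
# prints the path of each folder it had to create.
ensure_temporary_items() {
    for home in "$1"/*/; do
        [ -d "$home" ] || continue       # glob matched nothing
        dir="${home}.TemporaryItems"
        if [ ! -d "$dir" ]; then
            mkdir "$dir" && printf '%s\n' "$dir"
        fi
    done
}
```

The printed list doubles as a to-do list for the permissions check: a folder created by root is owned by root until someone changes it.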

Next I changed the fonts folder inside the Microsoft program folder itself. This folder contains the fonts that MS Office installs again, for example after an update of the Office package. To avoid reinstalling the problem fonts, I threw all fonts out of this fonts folder except: Arial Rounded Bold, Wingdings 2, Wingdings 3, MS PGothic.ttf. (BTW: the font format is not the problem.)
Why these fonts? Well, I tried deleting them as well, but then Word, PowerPoint, etc. always produced errors and annoying popups.
I did this the easy way, using ARD and the copy command: I took my local Microsoft fonts folder, containing only the four fonts, and copied it to the machines with the replace option.
BTW: even if you're not using Office, the user fonts folder can be a problem. Fonts inside this folder seem to conflict with something in the logout process. So try clearing this folder.

Redirect the folders
After that, I thought about making the redirecting safe. If you're familiar with WGM and the MCX redirect, you can use that. But the easy way seems to be the ready-made NHR scripts. Install them on all machines, that's it.

I made these steps seven days ago, and the system runs stable and speedy. The machine runs at 40% load with 80 simultaneous connections, there is no freeze at logout, and the whole behaviour of my machines seems to be much faster. Well, my students said that. 😀


So, step by step:
1. Change the thresholds
2. Delete the fonts folders
3. Install NHR to decrease the AFP load

Don't forget to do this on the server as well, because mine has Office installed, and after all the work I was at home, restarting the server over VPN and ... well, it froze because I'd forgotten to delete the fonts and do all the other things. 😉

P.S.: I gave up trying to get the correct commands and good text formatting into the Apple forum. If anybody knows how to format the text correctly, please let me know. Sorry for that. 😟

Feb 12, 2009 6:38 AM in response to Perry K Lund

Perry, make sure the script is owned by root and the script is in root's crontab. This command can only be run by a user with super-user privileges or root. When running from the command line, you need to su to root to run the command or 'sudo kill -9 [process number]' in another user's shell.

My script is running fine and without it, we'd be running into big problems. I have sent the output of the script to a log file and the following is the log from just the last couple of days...

2009-02-10 07:00 CPU @ 100.0 % - Killed AppleFileServer
2009-02-10 12:00 CPU @ 0.0 %
2009-02-10 15:00 CPU @ 0.0 %
2009-02-10 17:00 CPU @ 99.0 %
2009-02-10 19:00 CPU @ 199.1 % - Killed AppleFileServer
2009-02-10 21:00 CPU @ 0.0 %
2009-02-11 07:00 CPU @ 100.0 % - Killed AppleFileServer
2009-02-11 12:00 CPU @ 199.4 % - Killed AppleFileServer
2009-02-11 15:00 CPU @ 199.0 % - Killed AppleFileServer
2009-02-11 17:00 CPU @ 99.2 %
2009-02-11 19:00 CPU @ 598.0 % - Killed AppleFileServer
2009-02-11 21:00 CPU @ 0.0 %
2009-02-12 07:00 CPU @ 99.0 %
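
The script itself is not shown in this post. As a rough sketch of how such a cron-driven watchdog could look, assuming a kill threshold of 100% (inferred from the log above, where 99.2 was spared and 100.0 was killed) and with all function names made up for this example:

```shell
#!/bin/sh
# Sketch of a watchdog in the spirit of the log above; not the poster's script.
PROC_NAME="AppleFileServer"   # process name as it appears in the log
THRESHOLD="100.0"             # assumption: inferred from the log, not stated

# Print the summed %CPU of all processes whose command contains $1.
proc_cpu() {
    ps -A -o pcpu= -o comm= | awk -v name="$1" '
        index($0, name) { total += $1 }
        END { printf "%.1f\n", total + 0 }'
}

# Succeed when the CPU reading $1 is at or above the threshold $2.
over_threshold() {
    awk -v cpu="$1" -v max="$2" 'BEGIN { exit !(cpu + 0 >= max + 0) }'
}

# Format one log line. $1 = timestamp, $2 = CPU reading, $3 = "killed" or empty.
log_line() {
    if [ "${3:-}" = "killed" ]; then
        printf '%s CPU @ %s %% - Killed %s\n' "$1" "$2" "$PROC_NAME"
    else
        printf '%s CPU @ %s %%\n' "$1" "$2"
    fi
}

# One pass: measure, kill if over the threshold, log. Killing needs root.
watchdog_pass() {
    cpu=$(proc_cpu "$PROC_NAME")
    now=$(date '+%Y-%m-%d %H:%M')
    if over_threshold "$cpu" "$THRESHOLD"; then
        killall -9 "$PROC_NAME" && log_line "$now" "$cpu" killed
    else
        log_line "$now" "$cpu"
    fi
}

# Only act when called with --run, so the functions can be tried out safely.
if [ "${1:-}" = "--run" ]; then
    watchdog_pass
fi
```

To match the times in the log, root's crontab could call the script with --run at 7, 12, 15, 17, 19 and 21 o'clock and append the output to a log file; that schedule is read off the log, not confirmed by the poster.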

Feb 12, 2009 2:47 PM in response to md2298

I plan on upgrading over the weekend. From the Read Me:

Description: A race condition in AFP Server may lead to an infinite loop. Enumerating files on an AFP server may lead to a denial of service. This update addresses the issue through improved file enumeration logic. This issue only affects systems running Mac OS X v10.5.6.

Anyone have an idea what the phrase "file enumeration logic" might mean?

Feb 13, 2009 4:25 PM in response to Russell Hoorn

I did the following to my misbehaving server and things improved dramatically.

(Received advice from Apple Support)

1. For the time being please (if possible) decrease the amount of
automounts to 3 for testing purposes.

(I reduced mine from 4 to 2)

2. Would it be possible to distribute half the home folders to another
drive or RAID device? If so please consider doing that.

(I split my homes over 2 drives)

3. What is the users workflow? What applications do the users access?

(Told them it varied a lot)

4. Please disable spotlight indexing on the nethomes if it is enabled.

(It wasn't)

5. Remove the odpac.bundle from /System/Library/KerberosPlugins/KerberosAuthDataPlugins and reboot the server.

(Odd.. but I did it)

6. Are any accounts mobile users? If so, what sync preferences are
configured?

None were.

Things improved dramatically.... down to two brief (2 min spikes) over the past 2 days. CPU hovers around 100% instead of nearly 800%.

Feb 14, 2009 12:06 PM in response to npynenberg

After the update to both server and clients I can still recreate the problem. This is getting really depressing! I tried following the Apple advice posted by npynenberg including removing the odpac.bundle. That was either no help or not much help, I can't really tell right now. For my setup, a sure-fire way to get AFS to peg itself to 100% is to start up two or three MS Office 2008 apps one right after the other (Word, Excel, Powerpoint etc.) on one or more network home clients. The users have been told to go slow, but as everyone here knows, there are many other situations that get us into this state.

APPLE please work harder on this problem!
I'm starting to lose my faith...

<Edited by Moderator>

Feb 16, 2009 1:37 PM in response to tech79

We have tested the system heavily today after installing the latest security patch to the server and we updated all clients to Leopard (we had 30, 10.4.11 clients running previously). We had 56 Leopard Workstations and 20 Windows XP clients login simultaneously and the AFP process went to a max of 636% and then fell back to between 8% and 50% which is what we would have expected. Logins took between 15 seconds and 60 with the simultaneous login challenge, but the process released and came down. Previously it would go to 800% and then peg at about 300% continuously with clients seeing slow performance and the spinning beach ball. I am cautiously optimistic that the patch has worked.

