6 Replies Latest reply: Feb 5, 2013 4:01 AM by Paul_Cossey
kittonian Level 1

We have a brand new Mac Pro (12 core with 64GB of RAM) running OS X Lion Server in a corporate environment. The server is running only the file sharing and software update services, and we have around 40 users who need to be connected over AFP at all times. This company runs 24/7 and we have an Xsan environment using an ATTO Celerity 8Gb 4-channel Fibre Channel card (84EN) along with a 6-port 10Gb Ethernet card. The Ethernet card is configured in a link aggregation bond using ports 1-4. The idea is that clients who do not have Fibre Channel cards installed in their machines can still reach the SAN via Ethernet and this file server. They connect to the share over AFP, and of course the share is the SAN. It's a single mount point and everyone has read/write access.


The issue is that this machine keeps crashing (multiple times per day) and I cannot find any reason why. Syslog shows nothing of value, and I've called Apple Enterprise Support, who also brought nothing to the table.


We initially had both SMB and AFP file sharing activated, but as soon as a Windows 7 client connected, the machine was brought down. So I disabled SMB via Terminal (sudo serveradmin stop smb) and deactivated it for the share point via the Server app. That at least allows the machine to stay up for 4-6 hours before crashing again.


This is seemingly the simplest of setups for file sharing and I would've thought that this beast of a machine would be able to handle being a file server without issue for far more than 40 clients. I'm seeing high CPU usage, which Apple support told me was perfectly normal (around 60% on the kernel_task process and around 55% on the AppleFileServer process). It also seems to consume all 64GB of memory, though it shows 60GB as inactive, but at the same time it's paging in and out.
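For anyone who wants to check the same memory picture on their own server, here is a rough sketch that reads the output of `vm_stat` and reports inactive memory alongside the paging counters. The sample text is made up to resemble the numbers above (about 60GB inactive), not captured from this machine, and the helper names `parse_vm_stat` and `gib` are just illustrative.

```python
import re

# Illustrative sample only: values chosen to mimic ~60 GiB inactive on a
# 64 GiB machine. It is NOT real output from the server in this thread.
SAMPLE = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                         262144.
Pages active:                       524288.
Pages inactive:                   15728640.
Pages wired down:                   262144.
Pageins:                           1000000.
Pageouts:                           250000.
"""


def parse_vm_stat(text):
    """Return (page_size, counters) parsed from vm_stat-style text."""
    header = re.search(r"page size of (\d+) bytes", text)
    # Assumption: fall back to 4 KiB pages if the header line is missing.
    page_size = int(header.group(1)) if header else 4096
    counters = {}
    for line in text.splitlines():
        # Counter lines look like 'Pages inactive:     12345.'
        m = re.match(r'^"?([^":]+)"?:\s+(\d+)\.?\s*$', line)
        if m:
            counters[m.group(1).strip()] = int(m.group(2))
    return page_size, counters


def gib(pages, page_size):
    """Convert a page count to GiB."""
    return pages * page_size / 2**30


page_size, counters = parse_vm_stat(SAMPLE)
print(f"inactive: {gib(counters['Pages inactive'], page_size):.1f} GiB")
print(f"pageins: {counters['Pageins']}, pageouts: {counters['Pageouts']}")
```

On a live machine the text would come from running `vm_stat` in Terminal instead of `SAMPLE`. The pagein and pageout counters are cumulative since boot, so it takes two samples a few minutes apart to see whether the box is actively paging while that much memory sits inactive.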


Virtually all of the clients are running Lion (10.7.4), and the server itself is also running 10.7.4. There are a few Ethernet-connected clients running 10.6 along with two running 10.5. As I mentioned, I disabled SMB, so there are no Windows computers connecting to this machine at this time (though it would be nice to get that functionality back if AFP can be stabilized).


None of this makes any sense to me and I'm hoping someone can shed some light on this issue. This company simply cannot be down, especially not multiple times per day. The only way to get things back up and running after a crash is to hard boot the machine via the power button, as you cannot perform a restart or a shutdown. Once the machine comes back up everything is back in working order for a few more hours until it happens again.

Mac Pro, Mac OS X (10.7.4), 12 Core, 64GB, 2xSSD, ATTO 84EN
  • Good-heart Level 1

    Gut reaction: memory problem.


    What does Hardware Test tell you?


    Do you have the means to test the memory modules individually?

  • kittonian Level 1

    I can always pull memory if hardware test tells me something is wrong, but testing the modules one at a time and waiting to see whether each one produces a crash is not something I can do in this situation.


    I also did a ton more research over the weekend and found that File Sharing in System Prefs was enabled, as was the File Sharing server in the Server app. I turned off the one in System Prefs and we'll see what happens.


    I will run the hardware test to see if it shows me anything interesting if it goes down again today.

  • kittonian Level 1

    All the memory is fine. The server rarely if ever goes down when there are only around 10-12 users connected. When there are 20+ users connected and working heavily it goes down often. When I say working heavily, I mean they are transferring huge files to the SAN (100GB+), sometimes 5 at a time per user, and there are a bunch of others who are reading large video files at a minimum of 220MB/sec from the SAN.
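
    To put that load in perspective, here is a back-of-the-envelope sketch of how many 220MB/sec readers fit in the nominal bandwidth on each side of the server. The port speeds are nominal line rates, and it assumes all four Fibre Channel ports and all four bonded Ethernet ports can be used in aggregate, which may not hold in practice. The name `max_streams` is just for illustration.

```python
# Nominal line rates in MB/s; real-world throughput is lower once protocol
# overhead, AFP, and storage latency are accounted for.
FC_8G_PORT_MB_S = 800      # nominal per-direction rate of one 8 Gb Fibre Channel port
TEN_GBE_PORT_MB_S = 1250   # 10 Gb/s divided by 8 bits per byte, before overhead
STREAM_MB_S = 220          # per-reader rate quoted in the post


def max_streams(port_mb_s, ports, stream_mb_s=STREAM_MB_S):
    """Whole number of streams that fit in the aggregate nominal bandwidth."""
    return (port_mb_s * ports) // stream_mb_s


# Assumption: all 4 ports on each card contribute to one aggregate pool.
san_streams = max_streams(FC_8G_PORT_MB_S, 4)     # Fibre Channel side, to the SAN
lan_streams = max_streams(TEN_GBE_PORT_MB_S, 4)   # bonded Ethernet side, to clients

print(f"SAN side: {san_streams} streams, LAN side: {lan_streams} streams")
```

    On those nominal figures the Fibre Channel side tops out at roughly 14 such readers before any of the large uploads are counted. Link aggregation also typically pins each client connection to a single port, so the bond raises the total across many clients but not the ceiling for any one of them. None of this would explain a hard hang on its own, but it does line up with the trouble appearing only under heavy load.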


    Though this worked on Snow Leopard without any issues, Lion just doesn't seem to be able to handle it. The odd thing is, on Snow Leopard there was only a single 1Gb Ethernet connection to a NAS system, whereas with Lion we have a much more powerful machine with a 6-port 10Gb Ethernet card and a 4-channel 8Gb Fibre Channel card to a true SAN. You would think that the newer setup with Lion would handle far more users with ease.


    So far, Lion's file serving performance has been very disappointing.

  • Kaminparis Level 1

    I hope this helps. Apple has confirmed a significant bug in 10.7.4 that breaks SMB. Their support says they are working on it, but they do not know when the fix will be released. It has broken SMB at all the sites I support, including my own location, where Lion Server has the updated version as well. Only AFP works. The suggestion is to restore 10.7.3 from a backup or wait until the fix is released. SMB does work, but only when admin authentication is used. Good luck!

  • kittonian Level 1

    I appreciate the advice. We had already traced SMB as a serious problem and turned it off. This is now all happening with only a single share and only AFP enabled. It also doesn't seem to be related to how many users are connected but could still be related to significant load being placed on the box.


    Whatever it is I'm at a loss because on a much slower Snow Leopard machine this worked just fine. Please keep the suggestions coming.

  • Paul_Cossey Level 1

    Did you ever come across a resolution to this?


    We have a very similar problem. 10.7.4 eats all the RAM and puts it into inactive. We have an extra 12GB on order, but after reading this post I'm not sure it will fix the issue at hand!


    Thanks Paul