Mavericks mail server stops distributing group email after a few hours of usage

Ever since our upgrade, I've been having problems with group mail on our Mavericks mail server.


The server will distribute group email for a few hours, then it simply stops working.
No error message, no bounces, and the server keeps the incoming mail in its incoming queue.


Restarting the mail services, either through Server.app or from the command line with serveradmin mail stop; serveradmin mail start, is the only way I can get it to resume distributing the email. I suspect postfix itself is hiccuping, but either way this is a reproducible problem.


Mail filtering is on, score set to 6, 10 MB limit, Spamhaus is active.


Looking for tips on how to stabilize this service short of setting up a cron job to restart the services every few hours.

Mac mini, OS X Mavericks (10.9), OpenLDAP and Server.app user

Posted on Nov 1, 2013 10:43 AM

40 replies

Apr 16, 2014 12:21 PM in response to Lynda Leung

We experienced the same issue. After much troubleshooting, the process that appears to be causing the headaches is "list_server_mgr", which spawns the agents responsible for delivering the messages. The quick and dirty solution (until it is fixed in Server.app) is to run a script that executes "killall list_server_mgr" at whatever interval you choose. The managing service then starts up again and spawns the processes that deliver the messages. This is less heavy-handed than taking the entire Mail environment offline and bringing it back up again.
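A minimal sketch of what such a script might look like. The function name kick_list_manager and the KICK_CMD override are hypothetical names added for illustration (the override only exists so the function can be dry-run without killing anything). It assumes it is run as root.

```shell
#!/bin/bash
# Hypothetical helper, not part of Server.app.
# KICK_CMD is an assumption: it lets you swap in a harmless command for a dry run.
KICK_CMD="${KICK_CMD:-killall list_server_mgr}"

kick_list_manager() {
    # $KICK_CMD is intentionally unquoted so it splits into command + arguments.
    if $KICK_CMD >/dev/null 2>&1; then
        echo "kill signal sent to list_server_mgr"
    else
        echo "list_server_mgr not running or could not be killed"
        return 1
    fi
}
```

To run it at an interval, something like "while true; do kick_list_manager | logger; sleep 3600; done" would do. The one-hour interval is an arbitrary example, not a recommendation from this thread.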


Here's hoping for a fix well before Server.app v3.2.

Apr 17, 2014 11:52 AM in response to Miggl

Joey Appleseed's report caused me to look a bit more into the list_server_mgr aspect of things. My List Server Log shows the following:



Apr 17 11:44:58 genealabs.com list_server_post[11758] <Info>: message: 1397760298.343242.DED78BC8F0360958.msg from: xxxxxxx@gmail.com posted to: info (size=11639)

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Info>: list agent wake from sleeping

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Info>: connect to: 127.0.0.1 [port 25]

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Error>: SMTP write returned error: (null)

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Error>: write command: "

.

" failed

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Info>: disconnect to: 127.0.0.1

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Error>: message deliver error: 1396990516.086732.3583944666D7530B.msg to list: info

Apr 17 11:45:01 genealabs.com list_server_agent[10789] <Info>: list agent sleeping for 300 seconds


This tells me there is something wrong in this area, but I'm not sure where to go from here. I have tested that port 25 is live and well using telnet 127.0.0.1 25, as well as testing the connection from outside of my network.


Any suggestions on how to troubleshoot this error? I've googled for it, but haven't seen anything posted on it.
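One thing worth noting in the log above is that the message that fails delivery (1396990516...) is not the one that was just posted (1397760298...). A rough sketch for pulling that kind of detail out of the List Server Log follows. Both function names are hypothetical, and the log is read from standard input because the log file's path is not given in this thread.

```shell
#!/bin/bash
# Hypothetical helper: reads list server log lines on stdin and prints how many
# <Error> lines were logged by list_server_agent.
count_agent_errors() {
    grep 'list_server_agent\[' | grep -c '<Error>'
}

# Hypothetical helper: prints the file name of every message that failed delivery,
# based on lines of the form "message deliver error: <file>.msg to list: <name>".
failed_messages() {
    sed -n 's/.*message deliver error: \([^ ]*\) to list: .*/\1/p'
}
```

Piping the saved log through "failed_messages | sort | uniq -c" would show whether the same message keeps failing on every wake-up of the agent.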

Apr 25, 2014 8:41 AM in response to Joey Appleseed

It appears that list_server_mgr spawns some number of list_server_agent processes when there are group messages to be sent out. These agents stick around if there is always some mailing list activity. However, if there is a dearth of messages for a few minutes, the agents terminate automatically.



list_server_mgr creates two pipes to communicate with the agent processes. In my case, when the agents terminate, the pipes in list_server_mgr are not closed, and become orphaned in the FD list (you can observe this with lsof). After 11 agent processes have exited, all mailing list activity stops. When mailing list activity is dead, try running this command, and count the number of PIPE lines:

lsof -p `ps -A | grep 'list_server_mgr' | grep -v grep | awk '{print $1}'`


If I see 22 of those, and if all of the agents have exited, I think it is time to kick the mailing list process. I've created a script called listkicker.sh on my server, and run it as root in the background like this:

sudo ./listkicker.sh &


The code for the script is as follows:


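#!/bin/bash
# Watches list_server_mgr for leaked pipes and restarts it once delivery has stalled.

echo "Starting mailing list kicker with process ID $$" | logger

while true
do
    # PID of the list manager (first match only, in case ps reports more than one).
    mgr_pid=`ps -A | grep 'list_server_mgr' | grep -v grep | awk '{print $1}' | head -n 1`

    # Skip this pass if the manager is not running, so lsof is never called without a PID.
    if [ -z "$mgr_pid" ]
    then
        sleep 60
        continue
    fi

    # Each exited agent leaves two orphaned pipes behind; 22 means 11 agents have exited.
    numpipes=`lsof -p "$mgr_pid" | grep PIPE | wc -l | awk '{print $1}'`

    if [ "$numpipes" -gt 21 ]
    then
        # Only restart when no agents are still delivering mail.
        numagents=`ps -A | grep -v grep | grep list_server_agent | wc -l | awk '{print $1}'`

        if [ "$numagents" -lt 1 ]
        then
            echo "Restarting list manager due to stupid bug." | logger
            killall list_server_mgr
        else
            echo "Not restarting list manager with ${numagents} agent(s) running." | logger
        fi
    fi

    sleep 60
done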

Apr 25, 2014 8:46 AM in response to Dustin Wenz

Thanks, Dustin, this might come in handy. In my case, I have found that certain incoming emails can hold up the queue, so that no further messages are delivered. Removing the offending messages one at a time from the inbound queue frees things up. I have sent a sample of these messages to Apple's Server Engineering team. They have acknowledged the problem and said they are working on it.


I hope this includes all issues listed in this thread, as I'm guessing we are seeing more than one issue here.


As long as I keep the inbox clog-free I am receiving list emails and don't have to run any scripts.


I am monitoring the following folder for "clogs":

/Library/Server/Mail/Data/listserver/messages/inbound/<list guid>/


And keeping an eye on these folders:

  • /Library/Server/Mail/Data/listserver/messages/hold/
  • /Library/Server/Mail/Data/listserver/messages/error/


I put the emails I need to remove to unclog the system in a bad_emails folder on my desktop, and send that folder to the engineering team via the Server > Provide Server Feedback ... menu item.
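For anyone who would rather not check those folders by hand, here is a rough sketch of a check for messages that have been sitting around too long. The function name stale_messages and the 30-minute default are assumptions made for illustration. What counts as "stuck" on your server is a judgment call.

```shell
#!/bin/bash
# Hypothetical helper: prints every file under the given directory whose
# modification time is more than the given number of minutes in the past.
# The 30-minute default is an arbitrary assumption.
stale_messages() {
    local dir="$1"
    local minutes="${2:-30}"
    find "$dir" -type f -mmin +"$minutes" -print
}
```

For example, stale_messages /Library/Server/Mail/Data/listserver/messages/inbound 30 would list anything older than half an hour beneath the inbound folder, across all list GUIDs. This only finds candidates. It does not move or delete anything.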

Jun 1, 2014 10:19 PM in response to Lynda Leung

Apparently Server version 3.1.2 has fixed this issue.
I've been running it for a couple of days without my crontab script, and group mail is still being delivered.
Also, old messages that had been caught in the queue are being flooded out to the recipients. So that's something to consider and warn your groups about when you do this update.


Anyways, case closed, and about freakin' time.
