SSH hangs + local terminal prompt blank

This has been happening more and more often, and is driving me up the wall. It's affecting all of our 10.6 server machines and rebooting is the only way to fix it (which is unacceptable in a production environment).

After the machine has been running for a while, SSH becomes unresponsive: ssh from a remote host hangs at "debug1: Entering interactive session."

When trying to open up a terminal window locally on the machine, it gives no prompt. Repairing permissions, deleting prefs, trying multiple accounts... all do the exact same thing once this starts and the only option is to ARD in and reboot or reboot via LOM.

Any ideas?

Multiple - MacPro/MacBook Air/Xserve, Mac OS X (10.5.6)

Posted on Nov 24, 2010 4:28 PM

Question marked as Best reply

Posted on Nov 24, 2010 5:07 PM

I do not have an answer. I can only suggest some diagnostic approaches.

When this happens, can you start Applications -> Utilities -> Activity Monitor, then select one of the apparently hung processes, and use Activity Monitor -> Sample Process to see what the process in question is doing?

Another thing to try. Compare the failed ssh -v -v -v output against a successful ssh -v -v -v, and see if the next item after "Entering interactive session" is helpful. On my system, I get:

debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug2: channel 0: request shell confirm 1
debug2: fd 3 setting TCP_NODELAY
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel_input_status_confirm: type 99 id 0
debug2: PTY allocation request accepted on channel 0
debug2: channel 0: rcvd adjust 2097152
debug2: channel_input_status_confirm: type 99 id 0
debug2: shell request accepted on channel 0

Now the question is, what does a "callback start" entail? Does it involve sshd making a network request back to the client? If so, is it possible that the server is having problems opening new connections? Is it possible that sshd is using the client's DNS name, and running into issues with DNS lookup (or maybe reverse DNS lookup)? I'm just guessing, but it seems to me that something involved in "Entering interactive session" or "callback start" is being blocked.
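The comparison of a good and a hung session can be made mechanical. A minimal sketch in Python 3 (which is not part of a stock 10.6 install, so this would run on an admin workstation), assuming both runs were captured to text; the function name is illustrative, and the sample lines are the two logs quoted in this thread:

```python
import re

# Debug lines after "Entering interactive session." from a working session.
GOOD = """\
debug2: callback start
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug2: channel 0: request shell confirm 1
debug2: fd 3 setting TCP_NODELAY
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel_input_status_confirm: type 99 id 0
debug2: PTY allocation request accepted on channel 0
debug2: channel 0: rcvd adjust 2097152
debug2: channel_input_status_confirm: type 99 id 0
debug2: shell request accepted on channel 0
""".splitlines()

# The same stretch of output from the session that hangs.
HUNG = """\
debug2: callback start
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug2: channel 0: request shell confirm 1
debug2: fd 3 setting TCP_NODELAY
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
""".splitlines()


def first_missing_step(good_log, hung_log):
    """Return the first line in good_log that hung_log never reaches, or None.

    Digits are masked so differing fds, ids or window sizes still match.
    """
    def normalize(line):
        return re.sub(r"\d+", "N", line.strip())

    seen = {normalize(line) for line in hung_log if line.strip()}
    for line in good_log:
        if line.strip() and normalize(line) not in seen:
            return line.strip()
    return None


print(first_missing_step(GOOD, HUNG))
```

With the two logs from this thread it prints "debug2: channel_input_status_confirm: type 99 id 0". In other words, the hung client sent its pty and shell requests but never saw the server confirm them, which would suggest the stall is on the server side.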

Does the server use any kind of network authentication method (LDAP, Microsoft Domain controller, Unix NIS, etc...)?

On the server is there anything in the /var/log/* files related to sshd?
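A quick way to pull those lines out, sketched in Python 3 (not installed on a stock 10.6 system). The glob pattern and both function names are assumptions for illustration; which file sshd actually logs to depends on the syslog configuration, and some logs are only readable as root.

```python
import glob


def sshd_lines(lines):
    """Yield the lines that mention sshd, with the trailing newline removed."""
    for line in lines:
        if "sshd" in line:
            yield line.rstrip("\n")


def scan_logs(pattern="/var/log/*.log"):
    """Print sshd-related lines from every readable file matching pattern."""
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, errors="replace") as handle:
                for hit in sshd_lines(handle):
                    print(f"{path}: {hit}")
        except OSError:
            # Not readable by this user, or not a regular file: skip it.
            continue
```

Calling scan_logs() around the time of a hang, and again on a healthy machine, gives two listings to compare.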

It is possible to start sshd with its own debugging output. I think you can either run it manually with a debugging option (sshd -d), or modify /etc/sshd_config to include a LogLevel debugging option (see: man sshd_config).
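Before changing the config it helps to know what level is currently in effect. A rough sketch in Python 3, assuming the file is passed in as text; the function name is made up for this example, and stopping at the first Match block is a simplification rather than a full parser:

```python
def effective_log_level(config_text):
    """Return the global LogLevel that sshd would take from this config text.

    sshd uses the first value it finds for a keyword, keywords are
    case-insensitive, and the documented default is INFO.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.replace("=", " ").split()
        if not parts:
            continue
        keyword = parts[0].lower()
        if keyword == "match":
            break  # simplification: ignore per-connection override blocks
        if keyword == "loglevel" and len(parts) > 1:
            return parts[1].upper()
    return "INFO"
```

For example, effective_log_level(open("/etc/sshd_config").read()) on the server. Setting LogLevel to DEBUG3 gives the most detail but is noisy, so it is probably best left on only while chasing the hang.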

Is it possible that the server is running out of process slots? By any chance are there a lot of zombie processes hanging around on the server that might be tying up process slots? (I do not know if there is a Mac OS X process limit, I'm just thinking of generic Unix problems I've run into in the past).
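Zombies show up with a state beginning with Z in ps output, so they are easy to count. A minimal sketch in Python 3.7 or later (not part of a stock 10.6 install); the function names are illustrative, and the ps invocation uses the portable -A and -o options:

```python
import subprocess


def find_zombies(ps_output):
    """Return (pid, command) pairs whose state starts with Z.

    Expects the output of: ps -A -o pid=,stat=,comm=
    """
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[1].startswith("Z"):
            command = fields[2] if len(fields) > 2 else ""
            zombies.append((int(fields[0]), command))
    return zombies


def snapshot():
    """Run ps once and return (total process count, list of zombies)."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid=,stat=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    total = sum(1 for line in out.splitlines() if line.strip())
    return total, find_zombies(out)
```

The total from snapshot() can then be compared with the value reported by "sysctl kern.maxproc" to see how close the machine is to its process limit.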

Is it possible that the server is running out of allowed file structures? Too many concurrent open files? (again, I do not know if there is a Mac OS X open file limit, I'm just thinking of previous Unix systems I've worked on).
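Mac OS X does enforce open-file and process limits, both per process and system wide. A small sketch using Python 3's standard resource module to show the per-process values for whatever account runs it; the helper name is an assumption for illustration:

```python
import resource


def describe_limit(name):
    """Return a readable soft/hard summary for an RLIMIT_* constant name."""
    soft, hard = resource.getrlimit(getattr(resource, name))

    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={show(soft)} hard={show(hard)}"


# Open files and processes are the two limits raised in this thread.
for limit in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
    print(describe_limit(limit))
```

These are only the limits inherited by the current process. The system-wide ceilings live in sysctl (kern.maxfiles and kern.maxfilesperproc), so both are worth checking when the hang occurs.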

Since ssh is network based, what about a TCP/IP socket limit? (again, wild speculation). I do know that TCP over IPv4 only has 65K worth of port numbers. If by some strange situation there was an open socket on every available port, that would jam things up (I highly doubt this to be the case, but I'm trying to cover as many bases as I can with my guesses).
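Socket exhaustion is easy to rule in or out by counting TCP sockets per state. A minimal Python 3 sketch that tallies the last column of "netstat -an" output; the function name is illustrative, and it assumes the usual six-column layout for TCP lines:

```python
from collections import Counter


def tcp_state_counts(netstat_output):
    """Count TCP sockets per state (last column) in netstat -an output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: proto recv-q send-q local remote state
        if len(fields) >= 6 and fields[0].lower().startswith("tcp"):
            counts[fields[-1]] += 1
    return counts
```

A total anywhere near 65536 (2 to the 16th, the size of the port number space), or a large pile of sockets stuck in one state such as CLOSE_WAIT, would be worth a closer look.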

Bottom line is that there is most likely a resource that is either blocked, and thus not allowing anything else that needs to go through that resource to continue, or there is a resource constraint, and sshd processes are blocked waiting for the constrained resource to have some free units. The hard part is trying to figure out which resource it is. Hopefully, the Activity Monitor -> Sample Process output will help with that.

You might also want to file a Bug with Apple at:

BugReporter
<http://bugreporter.apple.com>
Free ADC (Apple Developer Connection) account needed for BugReporter.
Anyone can get a free account at:
<http://developer.apple.com/programs/register/>

They may post back additional things to try in order to figure out what is going on.

And finally, since this involves Mac OS X Server, you might want to try the Mac OS X 10.6 Server forum:
<http://discussions.apple.com/category.jspa?categoryID=264>

Message was edited by: BobHarris

Nov 24, 2010 5:05 PM in response to BobHarris

Here's the -vvv result:


debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug2: channel 0: request shell confirm 1
debug2: fd 3 setting TCP_NODELAY
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768


Right now, no directory services... Open Directory's set to standalone and all accounts are local accounts. DNS has been excellent and there are no errors at all in the bind logs. I'll look into the rest once I get the issue to come up again (right now, upgrading to 10.6.5 to see if that'll help)

Nov 24, 2010 5:43 PM in response to Alex Geis

Alex Geis wrote:
(right now, upgrading to 10.6.5 to see if that'll help)


Please don't do that. If the problem you describe is clearly listed here: http://support.apple.com/kb/HT4249
or here: http://support.apple.com/kb/HT1222

then the update would fix it. Otherwise, you are just making the problem worse by turning it into a moving target.

I think a likely candidate would be DNS reverse resolution for ssh validation.

I just did a Google search for:
"debug2: channel 0: open confirm rwindow 0 rmax 32768" hang
and I think that it is.
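
Reverse resolution can be timed directly from the server. A minimal Python 3 sketch, assuming it is given the address of a client that experiences the hang; the function name is made up for this example, and a lookup that truly hangs will block this call as well:

```python
import socket
import time


def timed_reverse_lookup(ip_address):
    """Return (hostname or None, seconds taken) for a reverse lookup."""
    start = time.monotonic()
    try:
        hostname = socket.gethostbyaddr(ip_address)[0]
    except OSError:
        hostname = None  # no PTR record, or the resolver gave up
    return hostname, time.monotonic() - start
```

A result that takes several seconds, or comes back as None for a client that should resolve, would support the reverse DNS theory. A fast, correct answer makes it less likely.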

Nov 24, 2010 6:40 PM in response to etresoft

K, so far: updated to 10.6.5, booted off an external HD and did disk/permissions repairs, and did some research and found that the Directory Access app was moved to CoreServices after 10.4. While it's an OD standalone, I did notice LDAPv3 was checked in DA, so I turned that off. Also double-checked DNS: reverse is fine on both NICs, as are normal lookups, and there are 0 errors in /Library/Logs/named.log.

All's good for now, logins are a little zippier than usual...
