Unable to use basic shell commands over ssh

Hi,

Our sysadmin just replaced the ancient Slackware box we had been using as our primary server at work with a brand new CentOS box.

On the new box, I get strange "hangs" while trying to do basic commands such as `ls` and `man` and `top`. By "hang" I mean, the command will display some output (not necessarily all of the output), and then simply stop, refusing to accept any input (including control-c), and the only way out is to close the tab in Terminal.app.

`ls` works fine for small directory listings (only one or two lines of output), but for large directories it will display a few lines and then stop with the "hang" symptom.

`top` doesn't display any output at all; it just "hangs" as soon as I try to run it.

`man` usually works, but from time to time it will also "hang". I haven't been able to reproduce this reliably, but it has happened 4 or 5 times in the last hour, so worth including.

Here's the output I'm getting:

Last login: Sun Jan 23 22:30:58 on ttys000
~@ma% ssh crema.coc
Last login: Sun Jan 23 22:31:02 2011 from 192.168.removed.removed
abhi@icrema ~$ ls /home/coc/www | wc -l
1265
abhi@icrema ~$ ls /home/coc/www
00 removed.com.au
removed removed.com.au-svn
removed.com lang_test
3rd-party-libs removed.com.au
removed.com removed.com.au
.8080 removed.com.au-new
removed.com removed.com.au
abcode.info removed.new
abhi-goldcoins removed.com
abhitest201008 removed.com.au
abhitest.com removed

As you can see, there are 1265 files in the directory, but it's only showing 22 of them and then there's no more output, and it won't accept any inputs.

Has anyone else seen this issue? Some things I've tried to fix the problem:

- change from bash to tcsh on the server
- download/install iTerm and try with that
- play with various "declare terminal as..." settings in Terminal.app's preferences
- google searched, but couldn't find anything

None of these fixed the problem; it occurs in iTerm as well. But my colleagues are able to log in with my account on their Windows workstations (with PuTTY) and have no problems. Unfortunately I'm having issues getting my Windows virtual machine to connect to the VPN on our new server (setting up a new server takes time...), so I wasn't able to test in PuTTY myself. But they say it works.

This is pretty much a virgin CentOS install, and our sysadmin is responsible for all the servers in a small hosting company, so it's pretty safe to assume he hasn't done anything wrong... and I've never had this problem on the dozens of other servers he's set up. I don't think he did anything different, except maybe it's not quite the same version of CentOS.

Before I ask him to look into it*, has anyone else seen this problem/know a fix?

* sysadmin doesn't know much about macs, and he's got a ton of other work to do as part of setting up this new server, so don't want to bother him if possible

Mac Mini Server, Mac OS X (10.6.6)

Posted on Jan 23, 2011 4:48 AM

Question marked as Best reply

Jan 23, 2011 5:06 AM in response to Abhi Beckert

Have you had any problems connecting to any other machines? I work all day connecting to a variety of machines including RedHat, CentOS, and Sun with no problems. It is definitely a configuration issue on either your machine or that particular server. It actually sounds like more of a VPN problem on the server. There is no such thing as a virgin Linux install. Each individual Linux machine is a unique operating system.

Jan 23, 2011 6:30 AM in response to Abhi Beckert

While you could enable and use the diagnostics within ssh to look for differences...

ssh -vvv user@example.com

between the working and the not working configurations, I'd expect results from examinations over in VPN-land. VPNs are a very common trigger for this sort of behavior, particularly if there's NAT involved, blocked ports, whatever. The VPN implementations between Windows and Mac differ, which means firewalls can sometimes pass all of one set of VPNs, and can scrozzle others. And then there's the whole fun of the VPN through NAT, where some of the configurations can get parallel connections tangled. See if (for instance) screen-sharing via the VPN also shows this same minute-and-dropped behavior.

Jan 23, 2011 8:52 AM in response to Abhi Beckert

The part of your question that's relevant to this forum is whether the problem is on the client (Mac) end or the server end, or perhaps in the network. Can you connect normally to other SSH servers from that client? Can other clients connect to that server? If the first answer is "yes" and the second "no," there's nothing wrong with your Mac.

Jan 23, 2011 12:44 PM in response to MrHoffman

What exactly do you mean by "minute-and-dropped behavior"? There's no "one minute" time delay.

I am connecting to the VPN over NAT, and with the new server we have switched to a different VPN setup. I don't see how that could fail reproducibly with a couple of shell commands while everything else works perfectly (including other shell commands), but we will investigate it.

Message was edited by: Abhi Beckert

Jan 23, 2011 12:44 PM in response to Linc Davis

Yes, I am able to connect to other servers, I do it every day.

Everyone else can log in to the same server with my username/password account just fine. They are all using Windows or Linux.

My colleagues who use Macs were not available at the time (we did the server upgrade at midnight on Sunday). I was hoping to get this resolved by the time I start work today (Monday). It's going to be impossible to get any work done without being able to run `ls`.

When they come online I'll ask them to test for me.

Jan 23, 2011 12:46 PM in response to Abhi Beckert

Just like others, I live all day long in ssh sessions to Linux (RedHat), Solaris (SPARC and X86), AIX, as well as to other Macs. My ssh sessions are 100% reliable.

My normal terminal is iTerm, and the fact that you have tried BOTH Terminal and iTerm would eliminate the terminal emulator.

While logged in from home I go over a Cisco AnyConnect VPN connection and that works without any issues. However, there are many VPN implementations/protocols, so your mileage may vary with respect to VPN.

While I'm not hopeful, I would at least try a very minimum TERM setting on the remote system

export TERM=vt100

You cannot get much simpler than vt100, just in case the centos is trying to send complex escape sequences. Then again, if there is something sending strange escape sequences, and ignores the TERM environment variables, then nothing is going to stop it.

PS1="Simple Prompt> "

You might try resetting your PS1 prompt (sh, bash, ksh, zsh shells), just in case the system has put some strange escape sequences in your prompt. tcsh uses

set prompt="Simple Prompt> "

NOTE: I'm not sure this has anything to do with terminal escape sequences, but this is one way to make sure that neither your TERM setting as seen on the remote system nor your prompt is involved.

As suggested, try "ssh -v -v -v" and see if any ssh debug messages show up when your problems occur.

Can you ssh into another system at work, and then from that system ssh into the centos system? Does that work or fail?

If you 'ls', 'man', 'top' redirecting output into a file, and then 'cat' that file, do you get the same behavior?

Can you scp a large file to/from the centos system. scp is going to go over an ssh connection, but it will NOT have any terminal interactions, so if this is terminal emulator related, then scp would work perfectly. However, I think this is not terminal emulator related, so I would expect scp to fail as well.

However, if scp works, then there is one other possibility, and that is the pty driver on centos, as this is the code that pretends to be a tty interface over the network. A failure here could make it look like a problem with your terminal emulator. However, I'm not really thinking there are any problems with the pty, as that is long standing stable tech, so I do not expect it to be broken. But I'm trying to cover all bases.
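
For anyone following along, the scp test suggested above could be set up roughly like this. It is only a sketch: `make_test_file`, the `/tmp/mtu-test.bin` path, and `user@example.com` are placeholders, not anything from the actual server. Random data is used so that ssh compression, if it happens to be enabled, cannot shrink the transfer.

```shell
#!/bin/sh
# make_test_file PATH SIZE_KB
# Create a file of SIZE_KB kilobytes of random data (hypothetical helper).
make_test_file() {
    dd if=/dev/urandom of="$1" bs=1024 count="$2" 2>/dev/null
}

# Example (run by hand; the host and paths are placeholders):
#   make_test_file /tmp/mtu-test.bin 1024
#   scp /tmp/mtu-test.bin user@example.com:/tmp/
#   scp user@example.com:/tmp/mtu-test.bin /tmp/mtu-test-back.bin
```

If the copy stalls the same way `ls` does, the terminal emulator is very unlikely to be the cause, since scp never touches it.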

Jan 23, 2011 2:18 PM in response to Abhi Beckert

Abhi Beckert wrote:
What exactly do you mean by "minute-and-dropped behavior"? There's no "one minute" time delay.


I think he means "working one minute and dropped the next". What you describe sounds very much like a problem with lower-level networking protocols. I have encountered a few networks that seem to be heavily optimized for only web browsing. They expect you to make a single network connection for each request and then close it immediately. A web browser would work normally in such a situation. An ssh client would fail.

Another possibility is something even lower, related to network packets. Your connections do seem to be failing after a single packet.

I suspect it is a problem with the VPN. What VPN is it? Do you have the most recent client version? Are your settings identical to what Windows clients are using?

All of the above are just guesses. I can assure you that MacOS X can connect to any server OS you want to use. I can also assure you that either Terminal or iTerm will work fine. MacOS X uses the same ssh software as the Linux server is using. That leaves the VPN. As much as I can assure you that MacOS X, Terminal, and ssh work perfectly, I can assure you that most VPNs, especially corporate ones, are going to be flaky - especially on MacOS X.

Jan 23, 2011 8:25 PM in response to Abhi Beckert

The big question for me is whether the problem only shows up under SSH, or whether other protocols are affected too.

From the description I'm thinking it's a low-level networking issue, something to do with the volume of data to send, which implies a MTU or MSS related issue - small chunks of data (such as login banners) are fine, but as soon as you exceed one packet's worth of data (typically about 1500 bytes) the subsequent packets are dropped.

As already suggested, ssh -vvv may help, but if it is a low-level networking issue then I'd expect other protocols to be affected too - can you load large files via the web server? what about other file sharing protocols? I'd expect them to all fail, too, in which case it's a problem for the server admin to sort out (but at least knowing to look at the network stack will give him a pointer).
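
One way to check the MTU theory directly is to send pings with the Don't Fragment bit set and see which sizes get through. The sketch below only prints the commands to run. The helper names `payload_for_mtu` and `probe_cmd`, the host `server.example`, and the list of candidate sizes are all assumptions for illustration. It relies on `-D` being the Don't Fragment option of the Mac OS X `ping` and `-M do` being the Linux equivalent.

```shell
#!/bin/sh
# ICMP payload size that produces an IP packet of exactly $1 bytes:
# 20-byte IPv4 header + 8-byte ICMP header = 28 bytes of overhead.
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# probe_cmd HOST MTU
# Print (do not run) a one-shot ping with Don't Fragment set.
# Darwin uses -D; anything else is assumed to take the Linux form "-M do".
probe_cmd() {
    host=$1
    mtu=$2
    case "$(uname -s)" in
        Darwin) df_flag="-D" ;;
        *)      df_flag="-M do" ;;
    esac
    echo "ping $df_flag -c 1 -s $(payload_for_mtu "$mtu") $host"
}

# Example (host and sizes are placeholders):
#   for mtu in 1500 1460 1400 1300 1270; do probe_cmd server.example "$mtu"; done
```

Running the printed commands from largest to smallest should show roughly where packets stop getting through: the largest size that still gets a reply is a reasonable upper bound for the interface MTU.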

Jan 24, 2011 2:42 AM in response to BobHarris

BobHarris wrote:
While logged in from home I go over a Cisco AnyConnect VPN connection and that works without any issues. However, there are many VPN implementations/protocols, so your mileage may vary with respect to VPN.


I'm not sure what's running on the server, but I'm using the VPN feature built into System Preferences -> Network, with "VPN Type" set to PPTP. The settings are exactly the same as on my iPhone, which works just fine.

BobHarris wrote:
While I'm not hopeful, I would at least try a very minimum TERM setting on the remote system

export TERM=vt100

You cannot get much simpler than vt100, just in case the centos is trying to send complex escape sequences. Then again, if there is something sending strange escape sequences, and ignores the TERM environment variables, then nothing is going to stop it.


I had already tried this, with no luck.

BobHarris wrote:
As suggested, try "ssh -v -v -v" and see if any ssh debug messages show up when your problems occur.


I also tried this, again, no luck.

BobHarris wrote:
Can you ssh into another system at work, and then from that system ssh into the centos system? Does that work or fail?


I may be able to try this at some point, but not right now. We recently moved nearly all of our servers out of our small datacentre into a proper one, to get more bandwidth and reliable power. This is pretty much the only server left (our development/staging server, where we need gigabit ethernet for those who work in the office).

BobHarris wrote:
If you 'ls', 'man', 'top' redirecting output into a file, and then 'cat' that file, do you get the same behavior?


Yes. This works fine, and cat of the output works fine too.

BobHarris wrote:
Can you scp a large file to/from the centos system. scp is going to go over an ssh connection, but it will NOT have any terminal interactions, so if this is terminal emulator related, then scp would work perfectly. However, I think this is not terminal emulator related, so I would expect scp to fail as well.

However, if scp works, then there is one other possibility, and that is the pty driver on centos, as this is the code that pretends to be a tty interface over the network. A failure here could make it look like a problem with your terminal emulator. However, I'm not really thinking there are any problems with the pty, as that is long standing stable tech, so I do not expect it to be broken. But I'm trying to cover all bases.


It looks like HTTP is failing too. I'm unable to access any of our intranet websites, they time out trying to load the page, even though I can ping them (initially I thought this was a DNS issue, so did not mention it as I have fairly complicated local DNS).

Jan 24, 2011 2:47 AM in response to etresoft

etresoft wrote:
Another possibility is something even lower, related to network packets. Your connections do seem to be failing after a single packet.

I suspect it is a problem with the VPN. What VPN is it? Do you have the most recent client version? Are your settings identical to what Windows clients are using?


We're also now convinced it's a VPN issue, as HTTP to intranet sites is unreliable / broken. I'm using exactly the same settings as on my iPhone, and it works perfectly there. I'm using PPTP as mentioned in my last reply.

etresoft wrote:
All of the above are just guesses. I can assure you that MacOS X can connect to any server OS you want to use.


Of course. I've been doing programming for linux servers from my mac for nearly 10 years, and never had any problems before. I knew from the start it's likely a problem on the server's end, but that doesn't change the fact that this problem only occurs when connecting from a mac (windows, linux, iOS can all connect without issues), so Mac specific advice is needed.

Jan 24, 2011 3:00 AM in response to Camelot

Camelot wrote:
From the description I'm thinking it's a low-level networking issue, something to do with the volume of data to send, which implies a MTU or MSS related issue - small chunks of data (such as login banners) are fine, but as soon as you exceed one packet's worth of data (typically about 1500 bytes) the subsequent packets are dropped.


I worked from the office instead of from home today (1 hour 30 minute commute each way... ugh!), but tomorrow I'm going to work from home and try to get the issue solved. We'll focus on MTU and so on, hopefully we can get it going now.

Thanks everyone for your advice! It's really helped, and I will be sure to post the solution when we find it. In the meantime, if you have any more ideas... that'd be great. 😉 I'm a bit out of my depth, and our sysadmin is much more familiar with windows/linux than mac os x and bsd style unix.

I found out today that the new server wasn't quite ready to be activated. He was halfway through building it when the old server suddenly started to show early warning signs of RAID failure. The last time the RAID in that server failed, it was a nightmare to get everything restored from backups and running smoothly. So we moved to the new server prematurely and a bunch of stuff is broken. Managing our dev server isn't really his job; he's supposed to be running all the servers our clients pay us for, so this has left us temporarily short-staffed.

I'm the only employee who needs VPN access to work, and I'm trying to use up as little of his time as possible. It looks like he's going to be working 16-hour days even without my problem.

Jan 24, 2011 7:16 AM in response to Abhi Beckert

When VPNs and IP routing and NAT go sideways due to a configuration error, it seemingly takes about a minute or so for the routers or the VPN servers or the NAT to get sufficiently tangled and confused, which is then followed by connection loss, a VPN drop, or an infinite stall if a keep-alive isn't enabled.

This is a classic footprint of a VPN-level or NAT-level failure, in my experience. Multipath IP routing errors tend to be a little more up front about the disconnection; you don't ever hear back.

L2TP can get tangled with NAT, for instance. (This is also why I tend to prefer to establish the tunnel to the firewall and an associated VPN server, rather than VPN pass-through at the firewall. A VPN server at the firewall is easier to manage and easier to configure and easier to maintain. Yes, I've done this successfully with pass-through, too. It's possible, but the results can tend to be rather more squirrelly and the configuration can be far more delicately balanced.)

There's also the more subtle parallel tunnel "fun", where everything works until somebody else raises a tunnel, and then things go sideways. Until you realize there's a second tunnel getting involved, and a routing issue with that second tunnel, this case can be particularly puzzling.

Jan 24, 2011 1:39 PM in response to Abhi Beckert

Knowing you're on VPN strongly points to that as being the cause. It's almost certainly a MTU issue - the standard ethernet packet is 1500 bytes but when you're on a VPN connection, there's a 40-byte per packet overhead for all the VPN headers, reducing the MTU to 1460 bytes.

Normally this isn't a problem - the client and the server typically transmit their MTU when they initiate a connection, and the lowest one wins. However, in this case your MTU isn't getting reset by the VPN so your client is telling the server it can handle 1500-byte packets when, in reality, it can't, and large packets are getting truncated and/or dropped - that's why small transactions (< 1460 bytes) work, but large ones fail.

Until the tech works out the server-side issues your easiest solution is to fix it yourself - you can change your MTU in System Preferences -> Network -> (interface) -> Advanced -> Ethernet. Change it from the default (1500) to 1460 or lower and see how that goes.
This will have a slightly negative impact on other connections, but it beats the alternative if you need to talk to this server.

Jan 26, 2011 8:45 PM in response to Abhi Beckert

It was MTU related. We were able to temporarily solve the problem with:

sudo ifconfig ppp0 mtu 1270


And more permanently by installing this shell script at /etc/ppp/ip-up:

#!/bin/sh
# pppd runs this script when the link comes up; $1 is the interface name (e.g. ppp0).

/sbin/ifconfig "$1" mtu 1270


I'm not sure if the command line approach is different from setting it in System Preferences, but it works. I have not tried setting it in the GUI, but I'm sure that would work too, for anyone else who's encountering this problem. 🙂

Thanks everyone for your time solving this.
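
To confirm that the MTU change took effect after the VPN connects, something like the following could be used. This is a minimal sketch: `parse_mtu` is a hypothetical helper, and it assumes `ifconfig` prints the value either as `mtu 1270` (Mac OS X style) or `MTU:1270` (older Linux net-tools style).

```shell
#!/bin/sh
# parse_mtu: read ifconfig output on stdin and print the first MTU value found.
# Accepts "mtu 1270" and "MTU:1270" spellings; prints nothing if neither appears.
parse_mtu() {
    sed -n 's/.*[Mm][Tt][Uu][: ]\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example (run by hand while the VPN is up; ppp0 is the interface used above):
#   ifconfig ppp0 | parse_mtu
```

If the number printed is not the one set in /etc/ppp/ip-up, the script is probably not being run or is not executable.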

Message was edited by: Abhi Beckert
