Level 6

10,225 points

Disable Self-Tuning TCP

I need to disable this or, at the very least, take control of it. It's made doing anything on the internet a miserable, lengthy process. I had no trouble with Tiger and nothing but trouble with Leopard. It's throttling my downstream on both my Macs to useless levels.

If I can't disable or modify this I'm going to go back to Tiger to get things done. I own a business. I can't have this time-wasting nonsense and I need a solution now. Much of my work is on the web. I'm paying for high speed access and getting low speed throughput.

The DNS settings are in place, no firewall setting improves this, I've reset all the Airports. Nothing changes. Downloading anything crawls to low speed but ONLY on the Macs.

Does anyone have a clue? Can I archive and install back to Tiger?

iMac G5 2.0 + MacBook 2.0, Mac OS X (10.5.1)

Posted on Nov 22, 2007 8:24 AM

37 replies

Nov 25, 2007 1:06 PM in response to Ken.

Very mysterious! I have no obvious, magic solution. For what it's worth, here's what I would try to narrow the problem down to one component.

1. Run an ethernet cable to one of the Macs, turn off Airport on that Mac and see what happens with successive downloads.

2. While you have Airport off on the other Mac, see if the problem still occurs on the Mac with Airport still on.

Ken. Author

Level 6

10,225 points

Nov 25, 2007 2:24 PM in response to Cardiakke

It gets better. I've checked several of the Leopard and Tiger computers here at work and none have sysctl.conf. I suspect that this config file is not completely in control of this situation.

I don't know how long I can keep Leopard at this rate. Reloading and reconfiguring the system fresh with Tiger would take the better part of a day. I've been tinkering with this for weeks now. It's a deal breaker. I can't waste time with wonky network issues.

Nov 25, 2007 2:41 PM in response to Ken.

I have an iMacG5 that I upgraded to Leopard and it has sysctl.conf. My MacBook where I did an Erase & Install of Leopard does not have that file. Both using Airport and I am getting the same speed on downloads from both machines.

Ken. Author

Level 6

10,225 points

Nov 25, 2007 6:28 PM in response to Cardiakke

I'm beginning to think this goes a lot deeper than sysctl.conf. Something is very wrong in the TCP/IP and DHCP. I had my computers off today since I was at the office and could use Tiger there (I work on Sunday). I came home, and once again, networking lost track of the Airport Extreme Base and Airport Express. That should not be, ever.

I dug up a copy of the old Apple Broadband Tuner patch. That restores the sysctl.conf. It seems to work at bootup and for the first connection. After that something seems to override it.

I went to download the latest Linux build. (At least the G5 can run Linux). The time to complete the download climbed to an astounding 53 hours for 700 MB as the speed dropped. Fortunately, there was a torrent feed. That works fine.

Whatever the problem is it's a big one.

Ken. Author

Level 6

10,225 points

Nov 30, 2007 3:59 AM in response to Ken.

I resolved the problem. I pulled out all the Airport base stations and replaced them with a Dlink router. Wired, full speed is back and then some. Wireless is about 90% of the wired speed. It varies about 10%.

Here is where the problems really show up. X simply will not remember that the base is using WPA2 and falls back to WPA unless Network is locked. Even then it forgets. The preferred network simply does not appear and must be manually entered. Adding to the charm of it all the firewall randomly forgets allowed incoming connections. If I leave the Network panel up and running and unlocked it can't be reselected. I have to quit System Preferences, restart it and then Networking reverts back to WPA for wireless. WPA2 does not stick, nor does the preferred network.

These are minor issues but they pretty much confirm what I'd thought all along: the firewall and networking are a mess. So four weeks into the problem and I'm back in business with minor network issues (so far). What kills me is the cost of the Base stations. The Dlink cost 20% of the amount I spent on them and is working far better than they ever did.

mreckhof

Level 2

405 points

Nov 30, 2007 5:55 AM in response to Ken.

Let me explain what is going on with sysctl.conf. Mostly from memory so I don't spend all day on this, so forgive me if there are a couple minor mistakes.

You normally don't need to adjust sysctl variables. Therefore, not having a sysctl.conf is not unusual. The kernel is configured 'out of the box' with the correct settings for 'most' users.

That is why a clean install does not have this and an upgraded box does. The Apple Broadband tuner does some measurements and sets those values for you based on what it finds.

What window sizes you use, buffer sizes, etc. are largely dependent upon what kind of network you sit on and more importantly, what sort of latencies you have between you and what you're testing against.

This is typically called the 'Long Fat Pipe' problem and has become more of an issue over the last few years. As network engineers, we routinely have to 'tune out' latency to improve the efficiency of TCP. You have to remember that when this protocol was written, we didn't have Gbit pipes spanning 40+ ms (or at all) and people wanting to actually send a TB of data across it at full Gbit speeds. In addition, there have always been folks wanting to tune out every spare bit over their 225ms PPP link over a 33.6 kbps modem.

Another description of the 'problem' and a simple calculator to help calculate what to set these values to is here:

http://en.wikipedia.org/wiki/Bandwidth-delay_product

Without going into an entire seminar on TCP/IP, the major factors that you are dealing with are:

Latency (time between point A and point B)

Bandwidth (how much you want each individual transfer to use - you might not want a single TCP session to be able to eat up an entire network pipe - for reasons I'll explain in a minute)

Window size (Goes by many names, but RWIN is one of the values that impacts this)

The math works out as:

Bandwidth * Delay = amount of data to keep 'in flight' (Window size)

So if I want to fill a 15 Mbit pipe (my verizon FIOS speed) and I have 40ms of latency between me and the end node, I need to have 75,000 bytes of data 'in flight' at any one time.
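For anyone who wants to plug in their own numbers, here is a minimal sketch of that arithmetic. It assumes the 40ms figure is treated as the round-trip delay and that bandwidth is given in bits per second; the function name is just illustrative.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the pipe full."""
    # Bandwidth is in bits per second, so divide by 8 to get bytes.
    return bandwidth_bps * rtt_seconds / 8


# The FIOS example from above: 15 Mbit/s with 40 ms of delay.
fios = bdp_bytes(15_000_000, 0.040)
print(f"{fios:,.0f} bytes in flight")
```

Swap in your own line speed and the latency you measure to the far end to get a rough target window size.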

So what do I mean by 'in flight'? One of the reasons why TCP/IP (as opposed to UDP/IP) is reliable is that it has a method to acknowledge receipt of packets. The window size is one of the things that is used to determine how often data needs to be acknowledged.

If you have an 8KB window (which is typically what the effective window is of NetBIOS when you talk to a Windows server for instance), then every 8KB the received data has to be acknowledged or the sender will quit sending data.

Therefore, if there is more than 1ms of latency between me and the server, I'm not going to be able to fill that 100Mbit link I have since I have to wait for the other side to ack the packets I sent - so I can't keep that pipe full. This only gets worse when you get up to Gbit and higher speeds.
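The same relationship can be turned around to show the ceiling a fixed window puts on a single session. This is a rough sketch that ignores protocol overhead and assumes exactly one window is sent per round trip; the function name is made up for illustration.

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP session's rate: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds


# An 8 KB effective window at a few different round-trip times.
for rtt_ms in (0.5, 1, 10, 40):
    rate = max_throughput_bps(8 * 1024, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>4} ms -> {rate / 1e6:8.1f} Mbit/s")
```

By this estimate an 8 KB window already falls short of 100 Mbit once the round trip reaches about 1 ms, and it gets much worse at WAN latencies.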

What's worse is that the 'window' value in a TCP/IP packet can only be a 16-bit number. Which means it can only be as high as 65535.

So let's go back to that FIOS problem of 15 Mbps and 40ms latency. We need 75,000 bytes of data in flight and we can only do 65,535. So without enabling a new 'feature' of TCP, we're not going to get that 15 Mbps (In a single tcp session).

That's where the window scaling comes into play - rfc1323 ( http://www.ietf.org/rfc/rfc1323.txt). This is enabled/disabled with the sysctl variable net.inet.tcp.rfc1323. You can enable/disable this with sysctl -w net.inet.tcp.rfc1323=1 (enable) and/or set the value you want in /etc/sysctl.conf.

Window scaling adds a multiplier to the value set for the TCP window. So now we can get past 65535 and fill that pipe - but only if we ask for the right size value.
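To make the multiplier concrete: in RFC 1323 it is a power of two (a shift count, capped at 14) applied to the 16-bit window field. The sketch below, with names of my own choosing, finds the smallest shift that can express a desired window. It is only an illustration of the arithmetic, not how any particular kernel picks its value.

```python
MAX_UNSCALED_WINDOW = 65_535  # largest value the 16-bit window field can hold
MAX_SHIFT = 14                # largest shift count RFC 1323 allows


def window_scale_shift(desired_window_bytes: int) -> int:
    """Smallest shift such that (65535 << shift) covers the desired window."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < desired_window_bytes:
        shift += 1
        if shift > MAX_SHIFT:
            raise ValueError("window too large even with maximum scaling")
    return shift


for window in (63_762, 75_000, 358_400):
    print(f"{window:>7} bytes -> shift {window_scale_shift(window)}")
```

A shift of 0 means the window fits without scaling at all; anything above that only works if both ends negotiated window scaling during the handshake.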

The value of 358,400 provided earlier should be good up to what... almost 9 MB/s (roughly 72 Mbps) over 40ms. So you should be good there.

So this is great - we can now get a pile of data UNACKNOWLEDGED out on the wire at once and we're keeping our pipe full.

Then a packet gets dropped.

Without yet another feature, the receiver is only able to acknowledge the last bit of data that is received. This means that the entire window size (70KB+ of data in this case) has to be RE-SENT.

What's worse is that we no longer trust that the network has the carrying capacity to support this massive amount of data we're shoving down the pipe, so as a good network citizen, the window size gets auto-magically cut in half and we (the sender machine) then have to 'slide' it back to where it was until we're comfortable that it's not going to cause more congestion problems.

That's where SACK (Selective Acknowledgements) comes into play and sysctl value net.inet.tcp.sack.

As long as both sides of the connection support SACK (you'll be surprised at how many do not), we can now specify both the start and end of the data that we received. So the sender can now retransmit only the data that was lost rather than the entire window.
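Here is a toy model of the difference, following the simplified description above (without SACK, everything after the last in-order byte gets re-sent). The byte ranges are half-open (start, end) pairs and the numbers are made up: a 75,000 byte window with one 1,500 byte segment lost near the beginning.

```python
def cumulative_ack(received):
    """Highest in-order byte offset the receiver can acknowledge without SACK."""
    ack = 0
    for start, end in sorted(received):
        if start > ack:  # found a hole, cannot acknowledge past it
            break
        ack = max(ack, end)
    return ack


def missing_ranges(received, window_end):
    """Holes the sender can identify when the receiver reports SACK blocks."""
    gaps, cursor = [], 0
    for start, end in sorted(received):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < window_end:
        gaps.append((cursor, window_end))
    return gaps


window_end = 75_000
received = [(0, 3_000), (4_500, 75_000)]  # bytes 3000-4499 were dropped

without_sack = window_end - cumulative_ack(received)
with_sack = sum(end - start for start, end in missing_ranges(received, window_end))
print(f"re-send without SACK: {without_sack} bytes")
print(f"re-send with SACK:    {with_sack} bytes")
```

In this model one lost segment costs 72,000 bytes of retransmission without SACK and 1,500 bytes with it.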

That said, other behaviors such as congestion avoidance (the reduction and re-growth of the window size) will still take place.
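The halve-and-regrow pattern looks roughly like this. It is a deliberately simplified sketch: real TCP stacks also have slow start, fast recovery and other refinements, and the capacity figure, segment size and function name here are all assumptions picked for illustration.

```python
def simulate_window(start_window, capacity, rounds, segment=1460):
    """Toy congestion avoidance: grow one segment per round trip, halve on loss."""
    window = start_window
    history = []
    for _ in range(rounds):
        history.append(window)
        if window > capacity:
            # More data in flight than the path can carry: treat as a drop
            # and cut the window in half (never below one segment).
            window = max(segment, window // 2)
        else:
            # No loss this round trip: creep back up by one segment.
            window += segment
    return history


# Path that can carry about 75,000 bytes in flight, starting just below it.
print(simulate_window(70_000, 75_000, 6))
```

The window climbs until it overshoots what the path can hold, drops to half, and then has to work its way back up one segment at a time. That slow climb is why packet loss hurts so much on long fat links.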

There are a number of extensions that have been enabled that try to 'self tune' these parameters and help recover more gracefully from packet loss on these long fat links.

One of them that you'll also see OS X now doing is adding a timestamp option to its packets (also part of RFC1323). The sender puts a timestamp in each packet it sends. When the other side acknowledges it, it adds its own timestamp and echoes the sender's back. With this, we now know how long the round trip between point A and point B takes (without having to ping it and manually adjust things) and can adjust ourselves.
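As a rough sketch of how the timestamp option yields a round-trip estimate: the sender subtracts the echoed timestamp from its current clock, then smooths the samples. The 1/8 weight below is the conventional smoothing factor from the standard TCP retransmission timer calculation; whether Leopard does exactly this is an assumption on my part, and the function names are illustrative.

```python
def rtt_sample(now_ms, echoed_timestamp_ms):
    """Round-trip time implied by a timestamp the peer echoed back to us."""
    return now_ms - echoed_timestamp_ms


def smooth_rtt(samples, alpha=0.125):
    """Exponentially weighted average of RTT samples (first sample seeds it)."""
    srtt = None
    for sample in samples:
        if srtt is None:
            srtt = sample
        else:
            srtt = (1 - alpha) * srtt + alpha * sample
    return srtt


# Hypothetical clock readings: (time the ACK arrived, timestamp it echoed).
acks = [(1_040, 1_000), (1_098, 1_050), (1_141, 1_100)]
samples = [rtt_sample(now, echoed) for now, echoed in acks]
print(samples, smooth_rtt(samples))
```

The point is that every acknowledged packet becomes a free latency measurement, so the stack can keep its estimate current without any extra probing.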

So this gets to the root of what I think you may be seeing here.

If you are not able to get full speed out of the self tuning protocols AND when you force set the values (basically kick-starting the self tuning values) your transfer rates fall back down, you may have packet loss issues.

If you are, the default behavior is the correct one: it grows the window up to the point where the drops start, then backs off, maintaining a balance. Kick-starting, by contrast, may cause a large number of drops, drastically reducing your throughput and delaying that growth back up to where it optimally should be.

You may want to do a network trace and see if you have packets getting lost.

sudo tcpdump -ni en0 host x.x.x.x

... where x.x.x.x is the host you are having issues with. You may also need to change en0 to en1 if you're using Airport and not wired network. netstat -in will show you which interface to use for sure - check for your current IP address.

If SACK is supported (you'll see 'sackOK' on both initiating connections (marked with an S for SYN)), it'll be pretty easy to detect packet loss.

For more info on how to read these traces, you can check out this thread:

http://discussions.apple.com/message.jspa?messageID=5994347#5994347

I hope this helps.

mreckhof

Level 2

405 points

Nov 30, 2007 6:04 AM in response to mreckhof

Oh, one VERY important thing I forgot to mention that you'll notice in the traces for sure.

Just because we set a window size of say 300KB doesn't mean:

a) That we have 300KB of data to send in one chunk.
b) That the application's buffers are large enough to take advantage of the full network buffers and windows.

Therefore, if you recall I mentioned an 8KB 'effective' window when I mentioned NetBIOS, the application may flag data for 'Push' and force the receiver to process that data immediately and not wait for a window's worth of data to show up. Depending on how much data gets down the wire before the app forces this push, you may see a reduction in speed as well.

That's why different apps using the same protocol may work better than others. In addition, different protocols may work better than others (Say FTP vs. SSH vs. HTTP).

Ken. Author

Level 6

10,225 points

Nov 30, 2007 7:40 AM in response to mreckhof

That's great information.

I tracked this down, once again, to the Airport drivers themselves, keychain problems, Network preference issues and WPA2 issues. If I set the G5 to use Airport it pings the router like crazy. This does not show on the Airport Base but it does on the Dlink where I can see which computer is accessing what port. The Network pref panel does not lock onto WPA2 Personal and reverts to WPA, even if the panel is locked. Even though the router is set as preferred and listed, it falls off the list, has to be added manually again and then is rejected several times with the error "incompatible security" before finally accepting the password and connecting.

Keychain crashed several times this morning when I tried to verify the contents. The new firewall would pop up asking if I would allow the browser incoming traffic even though this has been previously allowed.

There have been other little notable things. My wireless signal is significantly better without the Airports. Noise levels are much less. Oddly, there are 2 Apple certificates in Keychain marked "not trusted".

I would guess that the errors that Disk Utility is throwing with ARD and SUID cannot be helping any either. Is Disk Utility fixing anything or failing outright?

Whatever the problems are I cannot wait for Apple to get around to sorting this out like the last few times. It's been a month and they're not even acknowledging the issues.

Dave Cook

Level 2

355 points

Nov 30, 2007 8:34 AM in response to mreckhof

Awesome post. Thanks for taking the time to write that down for us.

rnae1992

Level 1

0 points

Nov 30, 2007 1:40 PM in response to Dave Cook

I don't know how many times I've opened that sysctl.conf file in the past three days, changed the sockthreshold to 0 and the send and receive windows to different values only to continue with a sluggish internet connection.

I bought Leopard the day it came out, installed it on top of Tiger and as best I can remember (it's been a long week trying to sort this out) the Internet connection worked pretty much as before. But a lot of the iLife and iWork programs were crashing way too frequently and I wrote it off as the computer just being bloated. So Tuesday I got the bright idea to format the drive and do a clean install of Leopard. I have an Airport Extreme base station I bought earlier this year, but I connect my iMac to it via ethernet cable. The Airport is mostly a very expensive ethernet switch that allows us to share a USB printer! After the clean install of Leopard, my Internet connection slowed to a crawl (not good for BellSouth/AT&T's 6.0 Extreme DSL service).

As I said, I tried playing with settings in the sysctl.conf file. That didn't work. I spent an hour on the phone with AT&T tech support and they couldn't find anything wrong. My D-Link modem/router was connecting to them fine. The support guy finally gave up and said he would send a technician out. The technician tested the line and it was fine. He gave us a new modem (the D-Link was cheap and had to be reset a lot) and his laptop tested fine with the new modem. My wife's older Mac Mini has been fine since we got the new modem (Power PC machine, Leopard installed on top of Tiger). But on my iMac, still the same old problems. I tried everything I could find in these threads in all of my free time over a two or three day period. I'm using Cox's DNS servers because someone suggested them. I tried plugging my iMac directly into the modem. Nothing worked. I finally gave up at 4 this morning and made the decision to go back to Tiger. I'm running on Tiger now as I write this and I'm back to normal with my Internet connection. I'm getting 6 Mbps down and 428 Kbps up.

That said, I'm really itching to reinstall Leopard despite all of the hassle of the past week. It has a lot of useful features I was getting used to. Like I said, the Mac Mini is running Leopard fine, which doesn't make sense. So my question: does someone with more IT knowledge out there than I have think installing Leopard on top of Tiger vs. a straight clean install of Leopard would make a difference as far as the Internet speed is concerned? I believe it was after I did a clean install that my Internet connection went in the toilet.

mreckhof

Level 2

405 points

Nov 30, 2007 2:19 PM in response to rnae1992

I have a macbook santa rosa that shipped with Leo and a macbook from last year that shipped with tiger and was upgraded to Leopard.

I have had exactly the same issue with both systems (DNS + Safari favicon hangs) - both of which are cosmetic only and don't really impact transfer rates. In other words, once data starts flowing, it flows just fine at 5101 kbps down / 1792 kbps up at 100ms latency. Unfortunately, that's just barely low enough to not need window scaling (comes out to a required window size of 63,762 bytes, under the 65,535 limit).

So in my professional opinion, on a network with no packet loss, high speed, and high latencies, Leopard is working just as well as Tiger did on both an upgraded and native platform.

But the question still remains - why then are folks reporting such crappy performance? I would LOVE to be able to take over one of these systems and do some tests and network traces to boil it down to a root issue. It pretty much has to be one of the following:

1) The DNS issue not truly being resolved / fixed correctly

2) Something unique to a website, such as the favicon issue, and not an 'across the board' issue

3) Packet loss for any number of reasons - of which, Leo could potentially not be handling it as well as Tiger due to some of the auto-tuning mechanisms or firewall issues.

4) Packet delays / out of order packets / unnecessary packet reassembly due to any number of reasons.

5) An actual bug or driver issue.

Honestly, I'd put my money on 1-4.

So if anyone wants to volunteer a system that isn't performing right and let me log into it, let me know!

Ken. Author

Level 6

10,225 points

Nov 30, 2007 3:07 PM in response to mreckhof

Some of us have higher speed connections and that may be where it is showing up the most. There's a slew of things going on here. I don't want to rehash what I've written previously.

If I had to guess, this is acting like a buffer overrun. The receive buffer is filling up and not passing data fast enough to the hard drive cache or ram or both. Then a resend loop starts and drops the speed to a negotiated slower speed. The trouble with this theory is it would implicate hardware but it must be software or software/hardware. This did not happen with Tiger.

There are issues with WPA2 yet again. Set the router for WPA2, set Network Prefs for WPA2 and it jumps back to WPA. These are two different encryption algorithms. WPA2 simply does not stick.

Adding further to this mess, I have my complete router and DNS settings entered manually in Network Prefs. I have my network set as the preferred network, so turning on Airport should connect reliably. It doesn't. It hunts, and it misses every time.

Message was edited by: Ken.

mreckhof

Level 2

405 points

Nov 30, 2007 3:09 PM in response to Ken.

When WPA2 fails, do you get messages like the following in dmesg?

en1: Group TKIP MIC failure reported!
en1: Group TKIP MIC failure reported!
en1: TKIP countermeasures enabled.

This is where the interface then disables itself for a minute (unless you force re-enable it). Moving things back to WPA made this stop.

Ken. Author

Level 6

10,225 points

Nov 30, 2007 7:18 PM in response to mreckhof

Funny you should go there. I never looked there but had a similar bug hunting trip. Dmesg ties into syslogd which becomes activated by launchd. I use a Wacom tablet and that sent the OS into non-stop exception errors, a soft crash. The cpu would race up and launchd wrote to syslogd. All the while the OS did not crash but recorded tens of thousands of lines of errors. Not a single line ended up in the crash logs, where you'd normally look. It took me days to sort that out.

Now here's an interesting little side effect of taking down the Airport Bases: the G5 is now silent again. Since Leopard the fans were whining nonstop. Not full on but just enough that you knew the system was loading up a bit.

As for WPA2, this mess goes back before Leopard. It's high time Apple sorted this out. X has had wonky connectivity issues since 10.0.

mreckhof

Level 2

405 points

Dec 1, 2007 8:08 AM in response to Ken.

You guys got me curious enough that I picked up a Buffalo Wireless-N router to see if different models cause different problems.

Sure enough, I'm seeing hangs, random drops with Airport, few other things.

WPA2 was by far the worst. I told it WPA2/AES and did not allow it to fall back to WPA. It would work for a while, then go away - usually saying that it thought my WLAN had been compromised and that it was going to disable for a minute. A stop/start of Airport brought it right back up again (rinse, repeat).

I then set it for WPA/WPA2 TKIP/AES, which I assume means WPA uses TKIP and WPA2 uses AES, and again had similar issues. This time, though, is where the TKIP errors came up. So I don't see how I could have been using WPA2, or it would have been using AES in the first place.

So, now I'm at WPA/AES and it seems to be stable. No more errors. Going to try WPA/TKIP later and see what it does.

But yes - I'm starting to think something is wrong here as well. But then again, I'm also running Wireless-N draft right now. I'll need to do all the tests in G-Only to know that it's not an issue with something that isn't fully standardized yet - and I can't blame Apple for that.
