ashvartsman

Q: Mavericks and Failed ARP causing network drops!

I have been racking my brain over why, after upgrading to Mavericks, we see dropped packets every 30-60 seconds on our corporate network.  Here is an example from ping:

 

64 bytes from 10.11.12.13: icmp_seq=135 ttl=63 time=3.705 ms

64 bytes from 10.11.12.13: icmp_seq=136 ttl=63 time=3.473 ms

64 bytes from 10.11.12.13: icmp_seq=137 ttl=63 time=3.811 ms

64 bytes from 10.11.12.13: icmp_seq=138 ttl=63 time=4.110 ms

Request timeout for icmp_seq 139

Request timeout for icmp_seq 140

Request timeout for icmp_seq 141

Request timeout for icmp_seq 142

Request timeout for icmp_seq 143

64 bytes from 10.11.12.13: icmp_seq=144 ttl=63 time=5.417 ms

64 bytes from 10.11.12.13: icmp_seq=145 ttl=63 time=3.587 ms

64 bytes from 10.11.12.13: icmp_seq=146 ttl=63 time=3.744 ms

64 bytes from 10.11.12.13: icmp_seq=147 ttl=63 time=3.486 ms

64 bytes from 10.11.12.13: icmp_seq=148 ttl=63 time=3.466 ms
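To put a number on each drop, ping output like the above can be scanned for runs of consecutive timeouts. This is only a minimal sketch: the function name `loss_bursts` is made up for illustration, and it assumes the macOS `ping` line formats shown above. With ping's default one-second interval, a burst of five lost sequence numbers is roughly a five-second outage.

```python
import re

TIMEOUT_RE = re.compile(r"Request timeout for icmp_seq (\d+)")

def loss_bursts(ping_lines):
    """Return (first_seq, last_seq) for each run of consecutive timeouts."""
    bursts = []
    start = prev = None
    for line in ping_lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            seq = int(match.group(1))
            if start is None:
                start = seq          # a new burst begins here
            prev = seq
        elif start is not None and "icmp_seq=" in line:
            bursts.append((start, prev))  # a reply arrived, so the burst is over
            start = prev = None
    if start is not None:
        bursts.append((start, prev))      # capture ended in the middle of a burst
    return bursts

# Excerpt of the ping output quoted in the post.
sample = """\
64 bytes from 10.11.12.13: icmp_seq=138 ttl=63 time=4.110 ms
Request timeout for icmp_seq 139
Request timeout for icmp_seq 140
Request timeout for icmp_seq 141
Request timeout for icmp_seq 142
Request timeout for icmp_seq 143
64 bytes from 10.11.12.13: icmp_seq=144 ttl=63 time=5.417 ms
"""

for first, last in loss_bursts(sample.splitlines()):
    print(f"lost icmp_seq {first}-{last} ({last - first + 1} packets)")
```

Feeding it a longer capture (for example `ping 10.11.12.13 > ping.log`) would show whether the bursts recur on the 30-60 second cycle described in the post.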

 

 

I think I have found a strange ARP issue that is causing it.  In our corporate environment, we run GLBP (Gateway Load Balancing Protocol) on Cisco gear.  As a result, the gateway address floats between two devices, so its MAC address changes.  It looks something like this in the ARP table:

 

efl-ashvartsman:~ ashvartsman$ arp -a

? (10.224.165.1) at 0:7:b4:2:cb:2 on en0 ifscope [ethernet]

efl-ashvartsman:~ ashvartsman$ arp -a

? (10.224.165.1) at 0:7:b4:2:cb:1 on en0 ifscope [ethernet]

 

On my Mountain Lion machine, the OS sends a broadcast ARP request and gets a response with the new MAC address immediately.

 

258          26.783206000          Apple_78:29:dd          Broadcast          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.55
259          26.786929000          Cisco_e0:ff:40          Apple_78:29:dd          ARP          60          10.224.165.1 is at 00:07:b4:02:cb:01

 

This happens seamlessly in the background and no packet loss is observed.  However, it looks like Mavericks is doing something completely different, and WRONG.  It sends out 5 UNICAST requests to the MAC address it had before (ARP should always be broadcast!!!).  These fail all 5 times, and only then does it make a BROADCAST attempt.  It looks like the capture below.  This causes roughly a 5-second network outage on the machine.

 

394          67.052366000          Apple_b9:a6:b2          Cisco_02:cb:02          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.225

395          68.053450000          Apple_b9:a6:b2          Cisco_02:cb:02          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.225

396          69.053595000          Apple_b9:a6:b2          Cisco_02:cb:02          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.225

397          70.053893000          Apple_b9:a6:b2          Cisco_02:cb:02          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.225

398          71.054363000          Apple_b9:a6:b2          Cisco_02:cb:02          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.225

399          72.054466000          Apple_b9:a6:b2          Broadcast          ARP          42          Who has 10.224.165.1?  Tell 10.224.165.225

400          72.058079000          Cisco_e0:ff:40          Apple_b9:a6:b2          ARP          60          10.224.165.1 is at 00:07:b4:02:cb:01
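The sequence in this capture can be modeled in a few lines. This is a sketch of the behavior as observed on the wire, not Apple's actual implementation: the function name, the `unicast_limit` parameter, and the one-second `retry_interval` are assumptions inferred from the timestamps above, and the round-trip time of the final reply is ignored.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

def refresh_gateway_entry(cached_mac, live_mac, unicast_limit=5, retry_interval=1.0):
    """Simulate re-validating one ARP cache entry.

    Returns (log, outage_seconds). Each log entry is (kind, destination MAC)
    for one ARP request; outage_seconds is the time spent waiting on
    unanswered requests before one could be answered.
    """
    log = []
    elapsed = 0.0
    for _ in range(unicast_limit):
        log.append(("unicast", cached_mac))
        if cached_mac == live_mac:
            return log, elapsed      # the old owner is still there and replies
        elapsed += retry_interval    # no reply; wait before the next attempt
    # Unicast attempts exhausted: fall back to a broadcast, which the
    # current owner of the gateway address can answer.
    log.append(("broadcast", BROADCAST))
    return log, elapsed

# Stale entry, as in the capture: the gateway moved from ...cb:02 to ...cb:01.
log, outage = refresh_gateway_entry("00:07:b4:02:cb:02", "00:07:b4:02:cb:01")
print(len(log), "requests,", outage, "seconds without a gateway")

# With no unicast attempts, the first request is already a broadcast.
log, outage = refresh_gateway_entry("00:07:b4:02:cb:02", "00:07:b4:02:cb:01",
                                    unicast_limit=0)
print(len(log), "request,", outage, "seconds without a gateway")
```

Under these assumptions, five unanswered unicast requests one second apart account for the roughly five-second gap seen in the ping output, while a broadcast-first refresh has no such gap.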

 

 

Here is the arp table during this period:

 

macsccmtest:~ administrator$ arp -a

? (10.224.165.1) at (incomplete) on en1 ifscope [ethernet]

? (10.224.165.220) at f0:b4:79:21:4c:ec on en1 ifscope [ethernet]

 

 

My hunch is that Apple did this to try to reduce bandwidth utilization on the network but it will cause BIG problems on corporate networks that use GLBP or any other protocol to provide redundancy across multiple devices!

 

Anyone else seeing this?  Everyone in my office who has moved to Mavericks can replicate this behavior.

OS X Mavericks (10.9)

Posted on Oct 25, 2013 11:12 AM

  • by commorancy, Aug 16, 2014 9:20 PM, in response to MacStadium

    Wow. I'm just a little taken aback by this response. Apple doesn't have a testing lab internally? Apple had to rely upon a lab that MacStadium put together? Concerned. And, specifically, I'm concerned that if Apple doesn't have a testing lab, how the heck are they testing releases? Though, I realize you can't possibly build a lab configuration for every possible hardware config. Knowing that this issue exists, Apple should have been able to put a lab together to replicate the problem and determine a cause. That it took an outside third party 9 months to get Apple to notice, again worrisome.

     

    I'm glad that Apple has finally acknowledged the issue and was able to replicate the problem thanks to MacStadium. And I'm glad to know a fix is on the way. That it required this level of third party intervention is troubling on so many levels.

  • by Hector Castillo, Aug 20, 2014 9:38 AM, in response to MacStadium

    Hi MacStadium, we are experiencing a couple of issues and were wondering if they're caused by the same thing you describe in this post.  We are using Mavericks clients and server and have been fighting this since Mavericks came out.  The network users' home folders are hosted on the server.  The two issues we are facing are that the Mail app keeps asking for the password (and does not save it when re-entered), and that in the Server app users do not get disconnected from file sharing when they log out.  The Mail app issue only happens after switching users on the same client machine.  The only fix we have found is to reboot the client on logout.  We have found a post that mentions something about the Ethernet adapter.  Here is the link to the post: discussions.apple.com/thread/5547625?start=45&tstart=0

    Do you think using the USB to ethernet adapter on our server will help us with either of these issues?  Any feedback or insight would be greatly appreciated.

  • by bitgeeky, Aug 20, 2014 4:04 PM, in response to ashvartsman

    I had similar issues with my MacBook Pro (late 2013) and dug into this problem at length. Here is a detailed explanation and solution:

    http://pankajmalhotra.com/ARP-and-ethernet-issues-with-osx-mavericks/

     

    Hope you find it useful.

     

    Cheers !

    Bitgeeky

  • by abbfromff, Sep 3, 2014 3:45 AM, in response to ashvartsman

    After the last update (July-August 2014) on OS X 10.9.4, MacBook Pro, the same problem appeared. I am losing 3-50% of packets. Ethernet is affected on both of my (large, fast) work networks but not at home. Setting net.link.ether.inet.arp_unicast_lim to zero does not help.

     

    Surely, I don't have to stress that the guys in the IT support are not in the least concerned by the woes of a Mac user in a company with >500 Windows users. So, please please Apple, fix this.
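For anyone who wants to try the sysctl mentioned above, here is a small sketch of how it is checked and set. The sysctl name comes from this thread; the helper names are made up for illustration, and, as reported above, the change does not help on every network. A value set with `sysctl -w` is a runtime change only and is lost on reboot.

```python
import platform
import subprocess

# Name taken from the thread; whether it exists depends on the OS X version.
ARP_UNICAST_LIM = "net.link.ether.inet.arp_unicast_lim"

def sysctl_set_command(name, value):
    """Build the argv that sets a sysctl at runtime (must be run as root)."""
    return ["sysctl", "-w", f"{name}={value}"]

def read_sysctl(name):
    """Return the current value as a string, or None if it cannot be read."""
    if platform.system() != "Darwin":
        return None  # not macOS, so there is nothing to query
    result = subprocess.run(["sysctl", "-n", name],
                            capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None

if __name__ == "__main__":
    print("current value:", read_sysctl(ARP_UNICAST_LIM))
    print("to skip the unicast ARP attempts, run as root:")
    print("  " + " ".join(sysctl_set_command(ARP_UNICAST_LIM, 0)))
```

The script only prints the command rather than running it, so it can be reviewed before anything on the machine is changed.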

  • by DDV, Sep 24, 2014 6:09 PM, in response to ashvartsman

    While this sysctl solution works, it should be noted that the RFC states that sending unicast ARP messages is perfectly valid. See Section 2.3.2.1 of RFC 1122, "ARP Cache Validation," for details.

     

    The onus is on the network vendors to fix their bugs, and not on Apple to work around this issue.

     

    -DDV

     

    (Senior Network Administrator for the past 15 years)

  • by abbfromff, Sep 25, 2014 12:30 AM, in response to DDV

    Thank you for the update; it is still valuable information in a way.

    Certainly what you say is correct: vendors should fix this bug. Please note that the sysctl solution DOES NOT always work; jwsullivan, codythedog and I have all reported that it doesn't.


    And now, try to see it from a user's point of view. The network vendors did not change anything this summer; I simply installed the next update for my OS. And now my network doesn't work, while my admins and Cisco couldn't care less. If I had known that the Ethernet connection at work was not going to function on my laptop, I would never have suggested that my employer buy it for my... work.



  • by DDV, Oct 7, 2014 11:32 AM, in response to abbfromff

    Thanks for your reply. It is good to have both sides represented!

     

    I just received news that my network vendor (Arista Networks) has reproduced this problem on a very specific hardware platform (which we use) and is implementing a fix.

     

    Good news for us, but not much help to those using Cisco or other vendors' equipment, nor to the users on said equipment.

     

    -Dave

  • by abbfromff, Oct 30, 2014 10:07 AM, in response to abbfromff

    An update: the problem is gone after installing Yosemite.

  • by bkoch709, Dec 18, 2014 2:26 PM, in response to abbfromff

    Apple should certainly fix this!

    In an attempt to provide some form of quality of service, they've filled the network with unnecessary traffic as well. I'm in IT at a medical school where every student in the D.O. program gets an iPad and a MacBook Pro, so our wireless network is saturated with all of this unnecessary traffic, and we are also getting lots of complaints about dropped connections in well-covered areas. This has been such a headache, as OS X doesn't have readily accessible network tweaking tools to remedy this. I hope they fix this soon.
