81158 Views, 149 Replies. Latest reply: Apr 9, 2010 4:44 PM by jice0
I've done some more research, and there is something completely broken about DNS in Snow Leopard.
I've double and triple checked that I have only one DNS server configured via DHCP, and none configured manually. This DNS server is on a private lan with NAT to the outside world. The DNS server my SL box queries responds with private IP addresses. I've blocked outgoing dns queries on port 53 to the outside network, so there's no way mDNSresponder is rolling over to some other DNS server outside my lan. But even so, 15 to 60 seconds after flushing the dns cache, my local cache will get poisoned with outside addresses for some internal resources.
It certainly appears that SL is querying some hidden DNS server on a nonstandard port. If I block all access to the outside network at my firewall, my SL box never gets any bad DNS info. As soon as I unblock all access except for port 53, the problem comes back within 60 seconds, every time.
Do you have another device or devices on your network that may be responding to Bonjour requests for information about other systems with external addresses?
You may want to try using the third-party Bonjour Browser to see if that might be the case.
I don't see anything with Bonjour Browser that would be relaying or responding to DNS requests. Here are the only services that devices are supplying via Bonjour:
libvirt, scanner, afpovertcp, http, ipp, nfs, printer, pdl-datastream, rfb, smb, sftp-ssh, ssh, and workstation
Also, the problem goes away completely when I block outside access to my SL box at the firewall. I'm still allowing it to communicate with all the other devices on my lan, and still allowing all other devices on the lan outside access; so it's not relaying the queries through another computer.
Well I can guarantee you it's not querying some random DNS server to get addresses, so they're coming from some (otherwise) legal source; I suspect one or more of your systems is responding to an mDNS request query with its external address.
Multicast DNS (mDNS) is a protocol that uses APIs similar to those of the normal unicast Domain Name System but is implemented differently. Each computer on the LAN stores its own list of DNS resource records (e.g., A, MX, SRV), and when an mDNS client wants to know the IP address of a PC given its name, the PC with the corresponding A record replies with its IP address. The mDNS multicast address is 224.0.0.251 for IPv4 and ff02::fb for IPv6 link-local addressing.
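For anyone who wants to see what such a query looks like on the wire, here is a minimal Python sketch that builds a one-question DNS packet of the kind an mDNS client would multicast. The host name peach.local is only an illustration, and the function names are made up for this example. Note that mDNS runs over UDP port 5353 rather than 53, so a firewall rule or packet capture limited to port 53 will not see it.

```python
import struct

MDNS_GROUP_V4 = "224.0.0.251"  # IPv4 multicast group used by mDNS
MDNS_PORT = 5353               # mDNS uses UDP port 5353, not 53

def encode_name(name: str) -> bytes:
    """Encode a host name as DNS labels: a length byte before each label, then a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += struct.pack("B", len(raw)) + raw
    return out + b"\x00"

def build_query(name: str, qtype: int = 1) -> bytes:
    """Build a one-question DNS query. qtype 1 is A, 28 is AAAA."""
    # Header: ID 0, flags 0, one question, no answer/authority/additional records
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Question: encoded name, query type, class 1 (IN)
    question = encode_name(name) + struct.pack("!2H", qtype, 1)
    return header + question

# Hypothetical host name, used only for illustration
packet = build_query("peach.local")
print(len(packet), packet.hex())
```

To actually send it, you could open a UDP socket and call sendto(packet, (MDNS_GROUP_V4, MDNS_PORT)); that step is left out here so the sketch has no network side effects.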
Is it always the same host or hosts that show an external address?
You could test this by blocking multicast packets with your firewall.
Isn't mDNS only supposed to resolve names in the .local domain? I'm using fully-qualified domain names, and still having the problem. The server I'm trying to reach is a Linux server, and had the Avahi daemon running, which responds to the mDNS queries; but that server isn't even aware of its external IP address to put in an mDNS response, since the firewall at the border of my network is performing NAT.
The only devices that know anything about the external address are the border firewall and the DNS servers outside the network. The border firewall isn't running anything that would respond to an mDNS query, and I've blocked port 53 so devices inside the network couldn't query external DNS servers even if they tried. The internal DNS server only has the internal address associated with the server in question.
If mDNS can resolve domains other than .local, I guess it could be the culprit; but I can't figure out how to identify where the problem could be. And it certainly seems like a big problem if any device can respond to any mDNS query with an invalid response.
Is there any way to tell SL to always look at the response from the local DNS server first, and only fall back to mDNS if the DNS server doesn't respond, or doesn't have a record?
There's no way to configure mDNS that I'm aware of.
When you block port 53 you're blocking both TCP and UDP, correct?
How is your network set up when you're doing these tests? Do local systems have port 53 to the outside world blocked, but still have access to port 53 on your local DNS server?
If so, it's got to be a bug in your firewall's DNS server that is occasionally sending external DNS data back to your internal hosts.
I am blocking both TCP and UDP port 53 at the firewall so that the internal computers can't access any DNS server except for my internal DNS.
I don't think it's a bug in my DNS server, because it's the same server I've been using for over a year and I've never had this problem before. Windows and Linux computers on the same network, using the same DNS server, don't exhibit this problem either.
It really seems like something is providing false information via mDNS, which would explain why it only happens to the mac, and still happens when port 53 is blocked. I just don't understand why SL would be using mDNS to resolve a fully-qualified domain name in the first place.
If you can make this recur fairly easily, run the following command in a Terminal window on your Mac:
tcpdump -n -i interface port 53 > /tmp/nslog 2>&1
where interface is the network interface in use (e.g. en0, en1, etc.)
When you see that DNS has returned external entries, control-C the tcpdump command and examine /tmp/nslog; you should see exactly where the external entries came from.
For example, I use OpenDNS for name resolution (22.214.171.124), so my execution of the command above shows entries like:
08:13:03.541428 IP 192.168.1.2.63891 > 126.96.36.199.53: 731+ A? www.amazon.com. (32)
08:13:03.570861 IP 188.8.131.52.53 > 192.168.1.2.63891: 731 1/0/0 A 184.108.40.206 (48)
08:13:59.573499 IP 192.168.1.2.59325 > 220.127.116.11.53: 9568+ A? www.kernel.org. (32)
08:13:59.582707 IP 192.168.1.2.61246 > 18.104.22.168.53: 38193+ AAAA? www.kernel.org. (32)
08:13:59.602798 IP 22.214.171.124.53 > 192.168.1.2.59325: 9568 5/0/0 CNAME www.geo.kernel.org., CNAME pub.geo.kernel.org., CNAME pub.us.kernel.org., A 126.96.36.199, A 188.8.131.52 (125)
08:13:59.777205 IP 184.108.40.206.53 > 192.168.1.2.61246: 38193 3/0/0 CNAME www.geo.kernel.org., CNAME pub.geo.kernel.org., CNAME pub.us.kernel.org. (93)
Note the sequence number after the destination IP address; that is the serial number of the request and of the answer. Other queries and replies may be interleaved, so you MUST match request serial numbers.
Note also that the log can get pretty big if the system in question is making a lot of queries.
I'm also assuming that the Mac in question does not have more than one active network, e.g. Ethernet and AirPort are not active at the same time.
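If the log gets long, matching serial numbers by eye is tedious. Here is a rough Python sketch that pairs each query with its reply by client endpoint and serial number. It assumes the log lines look like the tcpdump output shown above. The sample lines use made-up documentation addresses (192.0.2.x, 203.0.113.x) and example.com names, not real captures, and the function name is just for this example.

```python
import re

# timestamp, "IP", source endpoint, destination endpoint, serial number (+flags), rest
LINE = re.compile(r"^(\S+) IP (\S+) > (\S+): (\d+)\S* (.*)$")

def match_queries(lines):
    """Pair DNS queries with replies, keyed by (client endpoint, serial number)."""
    pending = {}
    pairs = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a DNS line in the expected shape
        _ts, src, dst, qid, rest = m.groups()
        if dst.endswith(".53"):
            # client -> server: remember the question until its reply shows up
            pending[(src, qid)] = rest
        elif src.endswith(".53"):
            # server -> client: look up the question with the same serial number
            question = pending.pop((dst, qid), None)
            if question is not None:
                pairs.append((question, src, rest))
    return pairs

# Made-up sample lines with interleaved queries and replies
SAMPLE = """\
08:13:03.541428 IP 192.168.1.2.63891 > 192.0.2.53.53: 731+ A? www.example.com. (32)
08:13:03.550112 IP 192.168.1.2.59325 > 192.0.2.53.53: 9568+ AAAA? www.example.com. (32)
08:13:03.570861 IP 192.0.2.53.53 > 192.168.1.2.59325: 9568 1/0/0 CNAME other.example.com. (60)
08:13:03.590233 IP 192.0.2.53.53 > 192.168.1.2.63891: 731 1/0/0 A 203.0.113.10 (48)
"""

pairs = match_queries(SAMPLE.splitlines())
for question, server, answer in pairs:
    print(f"{question}  <-  {server}: {answer}")
```

Run it against the real log with match_queries(open("/tmp/nslog")) and look for the reply that carries the external address; the server column tells you who sent it.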
OK, that shed some light on the situation. The server I've been having trouble with is named peach. My internal DNS has an A record for peach with the internal address. The external DNS has a CNAME record for peach that points to apple. Apple only has an A record in the external DNS.
When SL queries the internal DNS for the A record for peach, it gets the internal IP address as expected. But when it queries the internal DNS for an AAAA record for peach, the internal DNS has no AAAA records, so it forwards the request to the external DNS, which responds with the CNAME record referring to apple. Apple has no A record in the internal DNS, so its A record request gets forwarded to the external DNS, which responds with the external IP address.
So, it seems like there are three different problems:
1) My internal DNS only has an A record and not an AAAA record for peach. I'm not using IPv6 anywhere on my internal network, so this doesn't seem like it's a real configuration problem.
2) My SL box has a valid IPv4 address for peach that it uses for a short period of time. It doesn't seem like it should make another query when it already has valid information.
3) My SL box has IPv6 disabled. I would think that it shouldn't query AAAA records at all.
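The lookup chain described in this post can be reproduced with a small Python simulation. This is only a sketch of the behavior as observed in the capture, not of how mDNSResponder is actually implemented. The addresses are made up (10.0.0.5 for the internal one, 203.0.113.7 for the external one), and the function names are invented for the example.

```python
# Internal zone: only an A record for peach (hypothetical internal address)
INTERNAL = {("peach", "A"): "10.0.0.5"}

# External zone: peach is an alias for apple (hypothetical external address)
EXTERNAL = {
    ("peach", "CNAME"): "apple",
    ("apple", "A"): "203.0.113.7",
}

def internal_dns(name, rtype):
    """Answer from the internal zone if possible, otherwise forward to the external DNS."""
    if (name, rtype) in INTERNAL:
        return (rtype, INTERNAL[(name, rtype)])
    # Forwarded: the external DNS returns a CNAME if it has one for this name
    if (name, "CNAME") in EXTERNAL:
        return ("CNAME", EXTERNAL[(name, "CNAME")])
    if (name, rtype) in EXTERNAL:
        return (rtype, EXTERNAL[(name, rtype)])
    return None

def client_lookup(name):
    """Ask for both A and AAAA, and follow a CNAME that comes back for the AAAA query."""
    a_answer = internal_dns(name, "A")
    aaaa_answer = internal_dns(name, "AAAA")
    if aaaa_answer and aaaa_answer[0] == "CNAME":
        # The alias is chased with a fresh A query, which is forwarded as well
        return internal_dns(aaaa_answer[1], "A")
    return a_answer

print(internal_dns("peach", "A"))  # ('A', '10.0.0.5')
print(client_lookup("peach"))      # ('A', '203.0.113.7')
```

The A query on its own returns the internal address, but once the AAAA query pulls in the external CNAME, the lookup ends at the external address.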
I've been having the exact same problem others here have described -- a local DNS server, and my SL client will sometimes not be able to resolve local hosts (e.g., I can dig or nslookup, but when I try to connect via ping or ssh, it fails to resolve the name.) Restarting the mDNSResponder fixes it -- this can be accomplished PC-style with a reboot (bwahahaha!), or preferably, via the following command:
sudo killall -HUP mDNSResponder
I've only had to try this a few times, but it's worked for me.
3) My SL box has IPv6 disabled. I would think that it shouldn't query AAAA records at all.
Unfortunately the appropriate IETF specs mandate that compliant DNS clients query for both A and AAAA records regardless of whether the system is using IPv6, as AAAA records can contain CNAME information that is to be used even on IPv4 networks.
The problem you've run into is that if a valid response to a query for an AAAA record is received, that information is to take precedence over the data returned in response to a query for an A record.
So the bottom line is that either you will need to set up your internal server to return an AAAA record for your internal hosts, or your external DNS should be configured not to return an AAAA record for queries originating inside your network.
I came to a similar conclusion. Unfortunately, neither my internal nor external DNS give me any control over AAAA records; so I made sure that any A records on my internal DNS had A records in the external DNS; and not to mix and match A records and CNAME records in the internal and external DNS for the same host.
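For anyone with more than a handful of hosts, the "don't mix and match" rule above could be checked with a few lines of Python. This is a sketch only: it assumes you can export each zone into a dict of host name to (record type, value), and the host names and addresses below are invented.

```python
def find_mismatches(internal, external):
    """Return hosts that have an A record internally but a different record type externally."""
    problems = []
    for host, (rtype, _value) in sorted(internal.items()):
        if rtype != "A":
            continue
        ext = external.get(host)
        # A host that exists only internally is not flagged; only type conflicts are
        if ext is not None and ext[0] != "A":
            problems.append((host, ext[0]))
    return problems

# Hypothetical zone exports: host -> (record type, value)
internal_zone = {
    "peach": ("A", "10.0.0.5"),
    "pear": ("A", "10.0.0.6"),
    "plum": ("A", "10.0.0.7"),
}
external_zone = {
    "peach": ("CNAME", "apple"),
    "pear": ("A", "203.0.113.8"),
    "apple": ("A", "203.0.113.7"),
}

print(find_mismatches(internal_zone, external_zone))  # [('peach', 'CNAME')]
```

Any host it reports is one where a forwarded AAAA query could come back with an external alias, as happened with peach.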
It's a subtle difference in behavior between IPv4 only name resolution and a system that is designed for, but isn't necessarily using, IPv6; and I'm wondering if a lot of people in this thread, particularly those with VPNs, are having this problem and having trouble identifying what's causing it.
I'd think that a quick summary of this behavior somewhere prominent in these discussion boards might be helpful.
Hey folks... first of all, I'm a newbie on all this network stuff.
I just bought my MacBook a couple of days ago and it already came with Snow Leopard. Like everyone in this thread, I had the same DNS problem.
Fortunately, I solved the issue, at least on my home network.
What I did was access my modem (is this what you guys call ISP?) configuration to search for a DNS number to try on the airport DNS list. So, I put that number in the airport configuration and now it's working fine (I found this DNS number at TCP/IP info).
The dns number I entered manually in airport configuration is totally different from those automatically provided by DHCP... Why?
Will I have to do this with all connections? Or is this a problem only with my router or modem?
As you guys can tell, I'm not an expert... so if anyone could explain all this to me...
Brian Dantes wrote:
A Cisco Anyconnect VPN split DNS configuration is a legitimate use case completely busted by this Snow Leopard bug. A typical setup looks something like this:
domain : vpn.domain
nameserver : <vpn-resolver-ip>
nameserver : <standard-isp-resolver-ip>
order : 1
The intent is that non-VPN hosts will be cascaded from the vpn-resolver back to the standard-isp-resolver, and VPN hosts will be resolved by the vpn-resolver. However, due to this bug, the former still works but the latter intermittently fails, because the system sometimes sends VPN hostname lookups to the standard-isp-resolver first.
The only workarounds are to keep bouncing the mDNSResponder or move away from a split-DNS policy.
I really hope Apple fixes this in 10.6.3 which I hear rumblings is imminent. This is a terrible bug.
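To make the split-DNS failure quoted above concrete, here is a toy Python model of ordered resolver fallback. It rests on two assumptions that are not stated in the thread: the ISP resolver gives a definitive "no such name" answer for VPN-only names, and the VPN resolver declines names outside vpn.domain so that the client moves on to the next server. The host names and addresses are invented.

```python
NXDOMAIN = "NXDOMAIN"

# Hypothetical data held by each resolver
VPN_HOSTS = {"wiki.vpn.domain": "10.1.2.3"}
PUBLIC_HOSTS = {"www.example.com": "203.0.113.20"}

def vpn_resolver(name):
    """Answer for names under vpn.domain; decline everything else (assumption)."""
    if name.endswith(".vpn.domain"):
        return VPN_HOSTS.get(name, NXDOMAIN)
    return None  # no answer, so the client tries the next resolver

def isp_resolver(name):
    """Answer public names; anything unknown is a definitive NXDOMAIN (assumption)."""
    return PUBLIC_HOSTS.get(name, NXDOMAIN)

def resolve(name, resolvers):
    """Ask each resolver in order and stop at the first one that gives any answer."""
    for resolver in resolvers:
        answer = resolver(name)
        if answer is not None:
            return answer
    return NXDOMAIN

intended = [vpn_resolver, isp_resolver]  # order : 1 puts the VPN resolver first
swapped = [isp_resolver, vpn_resolver]   # what the bug effectively does at times

print(resolve("wiki.vpn.domain", intended))  # 10.1.2.3
print(resolve("www.example.com", intended))  # 203.0.113.20
print(resolve("wiki.vpn.domain", swapped))   # NXDOMAIN
print(resolve("www.example.com", swapped))   # 203.0.113.20
```

Under these assumptions, public names resolve in either order, while VPN names fail whenever the ISP resolver is asked first, which matches the symptom described.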
I've commented earlier in this thread on a different symptom of this bug, but I'm also seeing this same VPN bug as described by Brian Dantes. My particular VPN solution is a SonicWall/Aventail SSL VPN. When I connect to the VPN, my 10.6.2 box at home should start resolving using the additional DNS servers that it gets via the VPN, in order to resolve internal 10.x.x.x addresses and such. Instead, they don't resolve at all (except with dig, which works properly via another resolution mechanism, as we know).
Killing mDNSResponder works, of course, and I can do that for myself (as annoying as that is), but there's no way that my users are going to figure that out if we upgrade them to Snow Leopard. For now, this is a bug that is forcing us to stay with 10.5 and not upgrade.
I sure hope that 10.6.3 fixes this silliness: it's a dumb bug that should have been fixed long before now.