Chrome Networking: DNS Prefetch & TCP Preconnect (igvita.com)
75 points by igrigorik on June 4, 2012 | 27 comments


This is actually quite fascinating. The lengths the Chrome team is going to, in the name of page-rendering speed, are really astonishing.


This got me thinking...

When TLS, cookies or same origin policy are not a concern, why not resolve resource domains on the server side and output for example <img src="hxxp://10.0.0.1/picture.jpg"> to avoid that extra client side DNS request completely?
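
A minimal sketch of what that page-generation step might look like, in Go; the asset host "static.example.com" is made up, and a real generator would cache the lookup rather than repeat it per request:

  package main

  import (
      "fmt"
      "net"
  )

  func main() {
      // Resolve the asset host once, server side.
      addrs, err := net.LookupHost("static.example.com")
      if err != nil {
          // Resolution failed: fall back to the plain hostname.
          fmt.Println(`<img src="http://static.example.com/picture.jpg">`)
          return
      }
      // Emit the IP literal so the client skips its own DNS lookup.
      fmt.Printf("<img src=\"http://%s/picture.jpg\">\n", addrs[0])
  }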


Yup, we can do that too! I didn't mention it in the writeup, but you can also insert <link rel="dns-prefetch" href="http://www.domain.com/"> hints into the head of the document to tell the browser to pre-resolve certain domains.

An often overlooked reason for supporting this is to pre-resolve hostnames behind a redirect. If you have http://a.com/resource, which redirects to http://b.com/resource, and you know this relationship while you're generating the page, then injecting a dns-prefetch link can help hide the cost of the extra DNS lookup when the browser is following the 3xx.
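
A hypothetical sketch of that injection step in Go, where "b.com" stands in for a redirect destination known ahead of time:

  package main

  import (
      "fmt"
      "io"
      "os"
  )

  // Emit one hint per hostname the page is known to hit behind a 3xx.
  func writePrefetchHints(w io.Writer, hosts []string) {
      for _, h := range hosts {
          fmt.Fprintf(w, "<link rel=\"dns-prefetch\" href=\"//%s\">\n", h)
      }
  }

  func main() {
      writePrefetchHints(os.Stdout, []string{"b.com"})
  }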


DNS pre-fetching is a great way for spammers to determine whether or not an email has been read when the recipient is using webmail with a browser which supports it. That said, I believe DNS pre-fetching is disabled by default over HTTPS in all current browsers which support it.

EDIT: This is one of the things that a website I developed tests for: https://emailprivacytester.com/
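
For context, a sketch of roughly how such a tracker can work (hostnames invented): embed a resource on a per-recipient subdomain, then watch the query log on the sender's authoritative nameserver:

  package main

  import (
      "crypto/rand"
      "fmt"
  )

  func main() {
      // Unique per-recipient subdomain; any DNS query for it at the
      // sender's authoritative server signals an open (or a prefetch).
      token := make([]byte, 8)
      if _, err := rand.Read(token); err != nil {
          panic(err)
      }
      fmt.Printf("<img src=\"http://%x.track.example.com/pixel.gif\">\n", token)
  }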


This only works when there's only one site hosted at 10.0.0.1. When an HTTP request is made, it looks like this (not trying to be pedantic here, I'm sure you know how HTTP requests are made):

  GET /picture.jpg HTTP/1.1
  Host: host.sample.com
  [Other header: value]...
Many times, the Host header is an essential part of the equation. If, for instance, you have a single IP serving as an endpoint to many different sites, you need the Host header to decide where you'll look for the resource "picture.jpg".

That could of course be solved with something like

  <img src="hxxp://host.sample.com/picture.jpg" ip="10.0.0.1">
  or even
  <img src="hxxp://10.0.0.1/picture.jpg" host="host.sample.com">


Or you could just "virtual host" by initial URL path, like "hxxp://10.0.0.1/host.sample.com/picture.jpg" or "hxxp://10.0.0.1/p2ezQ/picture.jpg".
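
A minimal sketch of that path-based scheme, assuming an invented /srv/<site> layout on disk:

  package main

  import (
      "net/http"
      "path/filepath"
      "strings"
  )

  func main() {
      http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
          // /host.sample.com/picture.jpg -> site "host.sample.com",
          // file "picture.jpg". Real code must also reject traversal.
          parts := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/"), "/", 2)
          if len(parts) != 2 || strings.Contains(parts[1], "..") {
              http.NotFound(w, r)
              return
          }
          http.ServeFile(w, r, filepath.Join("/srv", parts[0], parts[1]))
      })
      http.ListenAndServe(":80", nil)
  }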


Of course... This is actually a much better solution than introducing an img tag attribute. It would be more visible at the template level and (at least in the first example) requires minimal configuration.

Out of curiosity: do you do (or know someone who does) this?

Do you think the functionality is worth the code to implement it?


No, I don't know anyone who does this. This is just a curiosity for me.

No idea if it's worth it. Measure. Build a test case that implements a limited version or emulates this somehow. Test it against some group for a few weeks or months. Did it benefit you, make no difference, or harm your goals? Did it give you something unexpected that you can turn to your benefit?


Generally, nothing forces one DNS name to resolve to one IP, or one IP to correspond to one DNS name. So you're losing both a simple way to do load balancing and virtual hosts.

That being said, both can be worked around in other ways. You definitely lose a way to do geographical load balancing, though (as in a CDN).


This only moves the logic that would otherwise live in the CDN's DNS server into your web-serving stack.

You could still implement crude geographical load balancing by simply looking at the client IP address, for example at your load balancer/reverse proxy or at the server producing the HTML itself.

Nothing forces you to serve the same IP address to every client. For media streaming, you can round-robin the IPs to balance load. For serving CSS, images, scripts, etc. you'd want to hash some relatively static client-derived attribute to pick a server, to avoid defeating client-side caching.

At its simplest, map the first octet of the client IP to the optimal geographical load-balancing IP(s). I'm sure there'd be some sub-optimal choices, but it'd still be a reasonably good approximation. This works because the first octet determines at least whether a given IP is assigned to RIPE, ARIN, APNIC, a specific organization, etc. You get roughly country-level accuracy, at worst continent-level.

Or use a full-fledged geo-IP database to find an optimal server IP for each client IP.
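
A minimal sketch of the first-octet idea in Go, with an invented octet-to-region table and hostnames:

  package main

  import (
      "fmt"
      "net"
  )

  // Invented table: first octet of the client IP -> regional asset host.
  var regionByOctet = map[byte]string{
      62: "eu.assets.example.com", // 62/8 is RIPE space
      24: "us.assets.example.com", // 24/8 is ARIN space
  }

  func assetHost(clientIP string) string {
      ip := net.ParseIP(clientIP).To4()
      if ip == nil {
          return "assets.example.com" // IPv6 or unparsable: use default
      }
      if h, ok := regionByOctet[ip[0]]; ok {
          return h
      }
      return "assets.example.com"
  }

  func main() {
      fmt.Println(assetHost("62.30.1.2")) // eu.assets.example.com
  }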

Of course this isn't very practical, because it requires more maintenance and is not effortless like the traditional CDN DNS-resolution-based methods. Maybe some CDN vendor could offer this in a more managed, easily integrable fashion.

I'm sure it could be made to work, if the potential latency win is worth the effort for you...


That breaks NAT64+DNS64, for one. (T-Mobile says IPv6-native with NAT64+DNS64 for legacy [IPv4] connections is the future on their network.)


[deleted]


You're missing the point. Chrome does respect your global DNS configuration.

The article is about how Chrome cleverly makes DNS requests (to your resolver of choice) as early as possible.


"after much deliberation the <bold>Chrome team is now experimenting with building its own DNS resolver</bold>"

With all due respect, I think you have missed the point.


That would not be a welcome change for me.

I use some hosts file overrides for two reasons:

1) testing at $dayjob

2) to block some spammy things

If their in-browser resolver did not take those overrides into consideration and either of those were impacted, it would be bye-bye Chrome.


Just tested it in Canary builds and the async DNS resolver does parse and respect /etc/hosts.


Great news! Thanks for testing, igrigorik.


The Chromium experimental DNS resolver should respect that. If it does not, it's a bug and please file it at new.crbug.com.


"after much deliberation".

Why did the team deliberate so much?

I will let the readers ponder this.

There are already some very good stub resolvers and resolver libraries available to users (e.g. dnsqr and the djbdns library). I have a hard time believing Google is going to do better than djb.

Of course I have no problem with them or anyone else writing another one. Have at it. The more attention brought to name resolution the better -- because it can be so easily abused for questionable purposes, it is something that deserves user oversight.

But why does Google need to place theirs _inside the browser_? That is a very curious design decision.


The original comment in this thread seems to have been deleted, so I can't tell what was said. The primary reasons for implementing our own DNS resolver include:

* Being able to fully instrument it. As the article mentions, we have internal debugging pages like about:net-internals, which rely on this instrumentation.

* Being able to run experiments. Google Chrome releases often run A/B experiments to play around with different configurations to see which has better performance and what not. This is harder to do with a 3rd-party library.

As Ilya notes, a fuller discussion can be found in the G+ post's comments section.

Note: I'm a Chromium developer on our network stack. I'm also the author of the G+ post linked to in the article.


I remember reading that thread some time ago. Are you the engineer who was rude to the journalist?


I think you have someone else in mind. Perhaps one of the commenters on her article?


I linked to Will's post in the article, definitely worth a read: https://plus.google.com/103382935642834907366/posts/FKot8mgh...

Check the comments, there are some very good discussions in there with Daniel Stenberg about c-ares and other resolvers.


My hypothesis: they want to turn Chrome into more of a full OS (Chrome OS/Chromebooks) and bring more things in-house. I also recall something about some resolvers having trouble when IPv6 is enabled but not actually functional; possibly they are trying to make such things a bit more seamless.

I, currently at least, would still prefer that the OS handle name resolution.


Interestingly, though off topic: Go has its own DNS resolver that it uses when the getaddrinfo C API isn't available.


You mean the Go tools have their own resolver?

The blog at miek.nl says that Go had a DNS library written in Go in the provided samples for a while, then they removed it. It would be interesting to know why.


Go's net package has its own resolver. See http://golang.org/src/pkg/net/dnsclient.go and http://golang.org/src/pkg/net/dnsclient_unix.go

But yes, it does not appear to be used anywhere right now, as the DNS lookups on each platform seem to end up at cgo calls.
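
For reference, later Go releases expose the choice explicitly via net.Resolver's PreferGo flag, which forces the pure-Go DNS client instead of the cgo/getaddrinfo path. A minimal sketch:

  package main

  import (
      "context"
      "fmt"
      "net"
  )

  func main() {
      // PreferGo forces the pure-Go DNS client over cgo/getaddrinfo.
      r := &net.Resolver{PreferGo: true}
      addrs, err := r.LookupHost(context.Background(), "golang.org")
      if err != nil {
          panic(err)
      }
      fmt.Println(addrs)
  }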


Interesting to read the comments. They should just rewrite the C resolver library for UNIX. Let's face it, the Plan 9^W^WGo team would probably produce a more elegant result than what we're using now, which has had its share of bugs over the years.



