The single most frequent question I get from people discovering Varnish is "Do you support request collapsing?". And I must admit, it always catches me off guard, because of course, it has been doing so for years, and it's a caching server, so that's a must-have, right? (Yes, yes it is.)
For this reason, in this post, we are going to review some of the little features we take for granted but that also make Varnish a great origin shield.
A content delivery network (CDN) is a set of geographically distributed servers that will cache requests directed at your server(s), that is, the origin(s). The benefit is that you can serve much higher volumes of traffic, all across the globe, even with a small origin and little bandwidth because the CDN will reduce the traffic to your server by a few orders of magnitude.
But sometimes, that's not enough, notably because a CDN is rarely perfect: different data centers rarely share their caches, for example, so each of them needs to fetch a copy from the origin. And if you are using multiple CDNs, each CDN needs to fetch its own copy of each request. In the end, the CDN(s) may still bring your origin down if the traffic reduction factor isn't good enough.
The solution is to add more caching! We can place an extra layer of caching right in front of the origin to shield it from redundant requests, so that even if you are using 5 CDNs with 100 data centers each, your origin only sees one request per object. And it turns out that Varnish is a pretty good candidate for this.
We collapse requests so your setup doesn't
The big bad wolf of content distribution is the "thundering herd" effect, triggered by a massive number of requests arriving at the same time. Think Black Friday, but with HTTP requests: systems get flooded and choke on the incoming traffic, either slowing to a crawl or crashing entirely. In practice, that usually happens during big online events, such as the Olympic Games, football matches, or, well, Black Friday.
However, that kind of traffic is usually highly cacheable, and Varnish implements request coalescing (or collapsing) to save the day. The idea is dead simple: if 20 users request the same content at the same time, fetch the data only once from the origin. There are quite a few interesting technical details to it, but I won't bore you with that today, don't worry. The important point is that Varnish does all the hard work for you, transparently, so that the thundering herd doesn't trample your origin.
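Coalescing is on by default and needs no configuration; in practice, the only knob you touch is the one to opt out of it for content you know is uncacheable, where waiting on another client's in-flight fetch only adds latency. A minimal VCL sketch (the backend address and the /live path are hypothetical):

```vcl
vcl 4.1;

backend origin {
    .host = "origin.example.com";  # hypothetical origin
    .port = "80";
}

sub vcl_recv {
    # Request coalescing is the default: concurrent misses for the
    # same object wait on a single backend fetch. For endpoints we
    # know are uncacheable, waiting on the "busy" object is wasted
    # time, so we tell the lookup to ignore it.
    if (req.url ~ "^/live/") {
        set req.hash_ignore_busy = true;
    }
}
```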
HEAD, range requests and compression

It's now time to go one step beyond (it's Madness, I know!) and learn about normalization, which makes request collapsing even more powerful. Consider four requests coming in for the same URL, but:
- request 1 would like the data to be compressed using gzip (accept-encoding: gzip)
- request 2 only wants the last 10 bytes of data (range: bytes=-10)
- request 3 is only interested in the metadata of the URL (a HEAD request)
- request 4 asks for the first 512 bytes, and would like them compressed with brotli (accept-encoding: br, range: bytes=0-511)
These will yield four different results, even though the object behind them is actually the same. And, as you can guess, Varnish can help here, needing only one request to the backend.
For cacheable requests, Varnish will normalize them:
- set the method to GET to grab the full object.
- unset the range header, for the same reason.
- force the accept-encoding header to gzip or brotli to save bandwidth and cache space.
With this, Varnish has all the information it needs and can tweak the response object on-the-fly to please the client.
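Varnish performs this normalization natively, but to make it concrete, here is roughly what the built-in behavior amounts to if you were to spell it out by hand in VCL (a redundant sketch, not something you need to write):

```vcl
sub vcl_recv {
    # Collapse the accept-encoding header to a single canonical
    # value so all variants hash to the same cache entry.
    # (HEAD is looked up like GET by the core; the body is simply
    # not sent back to the client.)
    if (req.http.accept-encoding ~ "br") {
        set req.http.accept-encoding = "br";    # brotli-capable client
    } elsif (req.http.accept-encoding ~ "gzip") {
        set req.http.accept-encoding = "gzip";
    } else {
        unset req.http.accept-encoding;         # plain body
    }
}

sub vcl_backend_fetch {
    # Fetch the whole object once, so that any future range
    # request can be sliced out of the cached copy.
    unset bereq.http.range;
}
```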
An important point is that the "accept-encoding" header isn't an order, it's a declaration of capability, and a server is free to ignore it when compression doesn't make sense. So Varnish will ask for compressed data, and if it doesn't get any, it will store and transmit the object uncompressed. That's typically the case for images, which are already compressed.
And of course, these are just out-of-the-box defaults that can be modified if your origin needs special care. For example, if your origin doesn't know how to compress text files, Varnish can do it for you, saving bandwidth and cache space.
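That last case is a one-liner: compress compressible responses on their way into the cache, so the shield stores and transmits the smaller copy. A sketch (the content-type pattern is an assumption, adjust it to your origin):

```vcl
sub vcl_backend_response {
    # The origin sent compressible text uncompressed: gzip it
    # before inserting it into the cache.
    if (beresp.http.content-type ~ "^(text/|application/(json|javascript))") {
        set beresp.do_gzip = true;
    }
}
```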
Access control

Even though your origin is "behind" a CDN, most of the time it's still reachable on the open internet, so it's pretty important to make sure only the CDN can request content from it. Otherwise, you're in for a repeat of the Maginot Line.
On this front, Varnish is extremely open and lets you (or the CDN) decide how to restrict access. To name a few schemes:
- shared secret in a header: possibly the simplest of them all, we just need to check that a header has the right value
- JWT: this one is getting popular thanks to a simple standard, tooling that is easy to pick up, and the ability to sign or encrypt tokens with a symmetric or asymmetric key
- custom tokens: they are legion, but they're almost all about extracting some data from the request (which Varnish was born to do) combined with a signing mechanism (easy to do with the right VMOD). This means that we can integrate virtually any token with just a few lines of configuration.
- IP ACL: an oldie but a goodie; we just need to check the client IP against one or more IP ranges, and deny access if we don't get a match.
- reverse-DNS: a new trick up Varnish's sleeve, it can now reverse-lookup an IP and make sure it matches a list of trusted domains. It's generally used for bot validation, but it's also valuable here when ACLs won't do because of dynamic IPs.
- external API: after all, we don't need to do everything ourselves, and we can ask a third-party API endpoint to validate the request for us. And of course, we can cache the response to save time the next time it pops up.
And thanks to configuration composability, we can combine schemes: require an ACL match and a valid JWT token, or a shared secret in a header or a successful reverse-DNS lookup. The sky's the limit!
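To show how little configuration composing two of these schemes takes, here is a sketch combining an IP ACL with a shared-secret header (the address ranges, header name, and secret are all hypothetical placeholders):

```vcl
vcl 4.1;

# Hypothetical address ranges published by the CDN.
acl cdn_nodes {
    "192.0.2.0"/24;
    "198.51.100.0"/24;
}

sub vcl_recv {
    # The request must come from a known CDN range AND carry the
    # shared secret; anything else is rejected outright.
    if (client.ip !~ cdn_nodes || req.http.x-shield-token != "s3cret") {
        return (synth(403, "Forbidden"));
    }
}
```

Swapping the "AND" for an "OR", or the secret check for a JWT or reverse-DNS validation, is just a matter of editing that one condition.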
A lot more in store
As you can imagine, this is far from a complete list of features, but we had to draw the line somewhere, and I believe it highlights two important facets of Varnish:
- powerful and sensible default behavior is key to performance
- customization is an integral part of the experience to cover new and highly specific cases