Recently I had someone ask me if it's possible to have Varnish Cache support brotli compression. After giving it some thought, my answer was yes, Varnish Cache can serve brotli encoded responses and it can do so without native support for brotli.
For those who don't know what brotli is, it's a new compression format from Google that promises both speed and higher compression ratios. Firefox, Chrome, and Opera released support for brotli in 2016, while IE, Safari, and curl (libcurl) currently do not support it, and it's not known when they will.
Brotli Encoding Support
First, let's examine how to add support for a new encoding using just VCL. This example requires a backend that supports serving brotli encoded responses.
sub vcl_recv
{
    if (req.http.Accept-Encoding ~ "br" && req.url !~
            "\.(jpg|png|gif|gz|mp3|mov|avi|mpg|mp4|swf|wmf)$") {
        set req.http.X-brotli = "true";  # mark the request as brotli capable
    }
}

sub vcl_hash
{
    if (req.http.X-brotli == "true") {
        hash_data("brotli");  # separate cache entry for brotli responses
    }
}

sub vcl_backend_fetch
{
    if (bereq.http.X-brotli == "true") {
        set bereq.http.Accept-Encoding = "br";  # ask the backend for brotli
        unset bereq.http.X-brotli;
    }
}
The above VCL snippet uses hashing to support brotli for any given request. In vcl_recv, we check that the client supports brotli and that the request isn't for content that is already compressed (brotli encoding would add little to no value for these file types). If the request passes these conditions, we set the header X-brotli to true. This signals Varnish Cache to add a new hash entry for the request (the hash_data call in vcl_hash) and to tell the backend we support brotli encoding (the Accept-Encoding header set in vcl_backend_fetch). Varnish will now store an additional brotli encoded object in cache, and clients that support brotli compression will get a brotli encoded response!
We could also have done this using traditional Content-Encoding and Vary support. However, it's worth looking into how Varnish Cache natively supports different encodings first.
Native Varnish Cache Encodings
Varnish Cache only gained gzip support in 3.0. Before this, Varnish Cache relied on the backend for encoding support. Typically the backend (or Varnish Cache) would add a "Vary: Accept-Encoding" header to the response, which told Varnish Cache to store a separate variant of the object for each distinct Accept-Encoding request header. This allowed Varnish Cache to store multiple encodings for each request and deliver the matching encoding to the client.
When Varnish Cache released native gzip encoding support in 3.0, this changed. The first change was that Varnish Cache now always requests gzip encoding from the backend, so it stores a single compressed copy of the object in cache. Your backend can override this by responding with no encoding, and you can in turn force compression in VCL by setting beresp.do_gzip = true. Regardless, this significantly reduces the cache storage size because we no longer have multiple encodings stored in cache. The only extra work for Varnish Cache is supporting clients that do not accept gzip, which it does by decompressing the response on delivery. This is an extremely positive trade-off since clients that do not accept gzip are a very slim minority.
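As a rough illustration of forcing compression, the sketch below sets beresp.do_gzip for text-like responses. The Content-Type pattern is an assumption chosen for this example, not a list that Varnish Cache prescribes.

    sub vcl_backend_response
    {
        # Assumed set of compressible content types; adjust for your site.
        if (beresp.http.Content-Type ~ "^(text/|application/(json|javascript|xml))") {
            # Ask Varnish Cache to gzip the object before storing it.
            set beresp.do_gzip = true;
        }
    }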
What this means is that Varnish Cache is hardcoded to ignore certain encodings: it will force gzip when it thinks it's appropriate, and it will totally ignore varying on Accept-Encoding. It's also worth noting that ESI assembly is done using a highly optimized gzip algorithm. This is why we cannot follow standard Vary procedures when introducing brotli support in Varnish Cache. We must either explicitly hash on it, as done in the above VCL, or we must introduce a new brotli specific header, like X-brotli, and Vary on that.
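The second option, varying on a brotli specific header, might look roughly like the sketch below. It is untested and rests on a few assumptions: it replaces the vcl_hash approach rather than combining with it, it keeps X-brotli on the backend request so the header is present when the variant is stored, and it discards any X-brotli header sent by the client.

    sub vcl_recv
    {
        # Assumption: never trust a client-supplied X-brotli header.
        unset req.http.X-brotli;
        if (req.http.Accept-Encoding ~ "br" && req.url !~
                "\.(jpg|png|gif|gz|mp3|mov|avi|mpg|mp4|swf|wmf)$") {
            set req.http.X-brotli = "true";
        }
    }

    sub vcl_backend_fetch
    {
        if (bereq.http.X-brotli == "true") {
            # Ask the backend for brotli; X-brotli stays on the request.
            set bereq.http.Accept-Encoding = "br";
        }
    }

    sub vcl_backend_response
    {
        # Add X-brotli to Vary so each value gets its own variant.
        if (beresp.http.Vary) {
            if (beresp.http.Vary !~ "X-brotli") {
                set beresp.http.Vary = beresp.http.Vary + ", X-brotli";
            }
        } else {
            set beresp.http.Vary = "X-brotli";
        }
    }

With this in place, requests with and without X-brotli should be stored as separate variants of the same object, so no hash_data call is needed. One caveat is that clients will see X-brotli in the Vary header unless it is rewritten on delivery, for example in vcl_deliver.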
Native Brotli Support?
Supporting brotli compression natively in Varnish Cache is not too tall of an order. Varnish Cache has internal support for delivery and fetch processors. Gzip, ESI, Edgestash, and SSL/TLS are all implemented as such and adding a new processor for brotli would be relatively straightforward. However, this leads us to some higher level questions. For example, what is the default encoding Varnish Cache uses for storage in its cache? Do we transcode between brotli and gzip on delivery or do we store multiple encodings per request? Delivery transcoding is very CPU expensive and not very Varnish-like since we like to perform as little work as possible when delivering an object. Storing multiple encodings makes more sense since this is what a cache does, but this will have an additional storage cost. Maybe Varnish Cache makes a single backend request and transcodes multiple responses into cache on insert? Or maybe outsource content encoding to the backend again and forgo native support?
It's too early for us to answer these questions given brotli's relative newness to the scene and only time will tell how native brotli support in Varnish Cache unfolds.
Would you like to learn more about Varnish Cache and what it can do? Read more in the Varnish Book.
Photo (c) 2014 Mike Mozart used under Creative Commons license.