April 15, 2024
5 min read time

Video caching is easy with Varnish

A few days ago we looked at how Varnish is great at video streaming, and we touched on the high-level concepts that make your video streams fly with Varnish. And it's all good and fine, but how do we actually implement it? Wouldn't that be nice to know?

It turns out that it's actually super easy to set up and configure Varnish to cache video streams, like, really easy. Let me show you how!

Foreword: Show us the code!

Before we begin, I want to point out that this article is based on our minimal Video Streaming CachEr Reference Architecture (VISCERA) on GitHub. It's a great resource if you want to start hacking on your VCL right away, as it provides both live and VOD streams with very little setup.

In this article though, we're just going to focus on the VCL part of it, and see how little of it is needed to get a working cache layer.

VCL: Origins

Let's start at the beginning: the Varnish Configuration Language is actually a tiny programming language that will help us dictate Varnish's behavior. It may sound intimidating, but fret not, it's a piece of cake.

For now, we only need to tell Varnish two things:

  • the VCL syntax we are using (we'll use the latest one, 4.1)
  • where our origin, or backend, is located, so we can fetch uncached data from it

It looks like this:

vcl 4.1;

backend default {
    .host = "";
}

Easy, right? That is, if you have a well-behaved backend that sends meaningful Cache-Control headers. But let's assume that you don't, partly because that's usually not the case, and partly because I want to write more VCL!
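For reference, a well-behaved origin would attach something like this to each response (the header values here are purely illustrative); out of the box, Varnish reads Cache-Control and uses s-maxage, or failing that max-age, as the object's TTL:

HTTP/1.1 200 OK
Content-Type: application/vnd.apple.mpegurl
Cache-Control: public, max-age=10

With headers like these, the minimal VCL above is genuinely all you need.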

Maximum caching for VOD

VOD assets are pretty nice to cache for a basic reason: they don't change. All the chunks are known in advance and the manifests just list them, so we can virtually cache them forever. Note that Varnish will automatically evict less popular content to make room for new data if we fill up our allocated storage, so there's no need to be shy: let's cache everything for one year!

Everything? Well, maybe not, just the requests that returned a 200 status code. The others, we'll cache for a few seconds, so that users can't hammer the origin by asking for non-existent resources.

In our VCL, it translates into a new function vcl_backend_response that gets executed when we receive a BackEnd RESPonse (beresp):

vcl 4.1;

backend default {
    .host = "";
}

sub vcl_backend_response {
    if (beresp.status != 200) {
        set beresp.ttl = 5s;
    } else {
        set beresp.ttl = 1y;
    }
}

And boom, that's it, VOD: handled! Let's tackle live now.

Live gotchas

VOD was straightforward; live is going to require a bit more thought, though not that much more. As you probably know, HLS and DASH live streaming is akin to that scene in Wallace and Gromit where Gromit lays rail pieces in front of the train as it moves forward.


New chunks are added to the manifest, and old ones are removed, meaning the manifest only keeps a relatively short list of "active" video segments. So, we can't just cache everything forever like we did for VOD.

Instead, we want the manifest cached only for a short while, so any changes to it are reflected promptly in the cache, while the segments themselves can stay cached for a while longer.

Note: it seems like we should be able to cache the segments for a very long time: after all, they won't change, they'll just stop being referenced by the manifest. However, the origin will often recycle names (for example, after data999.ts comes data000.ts again), so we need data000.ts to be out of the cache by the time its name is reused.

For the manifest, we also need to get rid of the grace period, which defaults to 10 seconds and would let Varnish serve a stale manifest while fetching a fresh one, so we'll just zero it.

The only questions left are about identifying different requests:

  • live data? We can check whether the URL of the BackEnd REQuest (bereq) starts with /live/
  • manifest? Again, we check the path and see if it contains "m3u8", the manifest extension for HLS
  • segments? Easy, we'll assume they're all the non-manifest requests!

Tying everything together:

vcl 4.1;
backend default {
    .host = "";
}

sub vcl_backend_response {
    if (beresp.status != 200) {          # non-200
        set beresp.ttl = 5s;
    } else if (bereq.url ~ "^/live/") {  # live assets
        if (bereq.url ~ "\.m3u8") {      # manifest
            set beresp.grace = 0s;
            set beresp.ttl = 1s;
        } else {                         # segments
            set beresp.ttl = 5m;
        }
    } else {                             # VOD assets
        set beresp.ttl = 1y;
    }
}

We're done!

Going further

With just a few lines, we built a configuration that handles both live and VOD HLS, all through HTTP-centric concepts. That is pretty neat, but there's a lot more we'll talk about in the future.

For example, if you have a large VOD catalog, you will probably want highly performant disk caching. We can also think about clustering and sharding, and of course there's the ever-important question of sizing and pre-fetching.
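As a quick teaser for the disk-caching topic: with open-source Varnish, pointing the cache at a file-backed storage is a single flag on the varnishd command line (the listen address, paths, and size below are made-up examples; adjust them to your setup and catalog):

varnishd -a :6081 -f /etc/varnish/default.vcl -s file,/var/lib/varnish/storage.bin,500G

This trades some raw speed against RAM-only storage (-s malloc) for a much larger cache, which is usually the right call for a big VOD library.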

Also, as we've seen here and in the previous post, it's all just HTTP to Varnish, and Varnish is very good at HTTP. That means it's trivial to build hybrid caching platforms that handle video AND APIs, AND load balancing, AND reporting: it's all a matter of pulling in the right building blocks.