Speed is one of the most crucial aspects of any online experience. And today, speed isn’t just a nice-to-have—it’s a necessity, especially when it comes to video delivery. Just as a runner may train for a race with different techniques to improve their speed, engineers can use various strategies to make HTTP requests load faster. But wouldn’t it be even better to get a head start?
Let's focus on video delivery. Most video traffic on the internet relies on HTTP, and for the best video experience, we want to use something fast like a warm cache. This is where we can use one of the great features of Varnish to speed up the process of our video delivery. We are going to enable content prefetching.
Content prefetching is a powerful technique used to optimize web performance by proactively loading resources before they are explicitly requested by a user. And when it comes to video delivery, prefetching can be massively beneficial. For video, the order of the chunks is known by the origin server, and if someone starts a video and requests chunk 1, it’s safe to assume they will need chunk 2 soon, and so on. Using Varnish, we can prefetch the next video chunks by taking advantage of Common Media Server Data (CMSD), an HTTP extension that can help us by adding metadata to the response headers. Notably, it proposes that the origin add a nor (next object response) attribute to its response. The nor attribute is described as:
The URL-encoded relative path to one or more objects which can reasonably be expected to be requested by a media client consuming the current response. This key will typically be added by the origin. An intermediate server MAY use this key to perform a prefetch action.
We can take a look at an example below:
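As an illustration only, here is a minimal Python sketch of what such a header might look like and how the `nor` paths could be pulled out of it. The header value and chunk paths are hypothetical, and the parser is deliberately naive: it assumes comma-separated `key=value` pairs with a quoted, pipe-separated `nor` string and is not a full structured-fields parser.

```python
from urllib.parse import unquote


def parse_nor(cmsd_static: str) -> list[str]:
    """Return the decoded paths listed under the nor key, or [] if absent."""
    for item in cmsd_static.split(","):
        key, _, value = item.strip().partition("=")
        if key == "nor":
            # Drop the surrounding quotes, split on the pipe symbol,
            # then URL-decode each individual path.
            return [unquote(p) for p in value.strip('"').split("|") if p]
    return []


# Hypothetical CMSD-Static value an origin might attach to the response for chunk 1.
header = 'nor="/1080/chunk_2.mp4|/1080/chunk_3.mp4|/1080/chunk_4.mp4"'
print(parse_nor(header))
```

The paths are split on the pipe before decoding, so an encoded pipe inside a path would not be mistaken for a separator.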
In the case of our definition, Varnish is the intermediate server, and we are definitely going to do that prefetch action. Fetching an object before it’s even been requested is going to give us a massive head start on our future client requests. By strategically preloading these resources, Varnish can significantly reduce load times and the chances of buffering, even during peak traffic periods.
Choosing whether to implement this feature will affect not only your user experience but also your origin servers. Depending on the use case, a user may not request every object listed in the nor value; for video delivery, however, it is safe to assume that playback is going to be very predictable. Even better, VCL, or Varnish Configuration Language, is powerful, and we can use it to take advantage of these CMSD headers to prefetch only the values we choose.
Now that we have seen a CMSD-Static header with a nor value, we will walk through how Varnish uses it and then look at the code.
We have seen above that our nor value is a list of relative paths to objects, separated by | (the pipe symbol). In the full code, we use VCL to prefetch the first three nor values. You can take a look at the code in the repository. If you choose to adjust the VCL, the number of objects you prefetch is up to you. To request more objects, follow the same logic and increment the necessary values. For more information, you may want to take a look at the documentation for the different vmods imported at the top of the VCL file.
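The VCL itself is not reproduced here, but the idea of capping the number of prefetched objects can be sketched in a few lines of Python. The function name and the default limit of three are assumptions chosen to mirror the description above, not the actual implementation.

```python
def select_prefetch_targets(nor_value: str, limit: int = 3) -> list[str]:
    """Split a pipe-separated nor value and keep at most `limit` paths."""
    paths = [p for p in nor_value.split("|") if p]
    return paths[:limit]


# Hypothetical nor value listing four upcoming chunks; only three are kept.
nor_value = "/1080/chunk_2.mp4|/1080/chunk_3.mp4|/1080/chunk_4.mp4|/1080/chunk_5.mp4"
print(select_prefetch_targets(nor_value))
```

Raising `limit` warms the cache further ahead at the cost of more origin requests per client request, which is the trade-off to weigh when tuning this value.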
Now how does it work?
To help break down content prefetching, let's visualize it with a diagram:
First, a client makes a request for a video chunk, which is received by Varnish. Varnish does not have the object, so it fetches it from the Origin. The Origin sends Varnish the video chunk, with a CMSD header carrying a nor value attached to the response. Once Varnish receives the video chunk, it delivers it to the Client. At the same time, it processes the CMSD header so it can issue the prefetch requests. Varnish does so by taking the base of the URL from the request for chunk one, in this case https://example.com, and appending each nor value. So for the second chunk we get https://example.com/1080/chunk_2.mp4, which we set as the URL for our prefetch request to the Origin. When the Origin gets our prefetch request, it sends Varnish the requested chunk and, knowing what comes next, attaches another CMSD header for the next few chunks. As a result, one prefetch can lead to another, so a single Client request can end up with Varnish requesting the entire video, in a sort of recursive prefetching. Now, when the Client needs the next video chunk, our cache is warm and Varnish is ready to send the response immediately.
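To make the recursive behaviour concrete, here is a toy Python simulation of the flow just described. Everything in it is hypothetical: `origin_fetch` stands in for a real origin, the video is assumed to have six chunks, and prefetches run synchronously in a loop, whereas Varnish would deliver the first chunk to the client without waiting for them.

```python
from urllib.parse import urlsplit

TOTAL_CHUNKS = 6        # hypothetical video length
NOR_PER_RESPONSE = 3    # upcoming chunks the toy origin advertises per response


def origin_fetch(url: str) -> tuple[str, list[str]]:
    """Toy origin: return (body, nor paths) for a URL ending in chunk_N.mp4."""
    n = int(url.rsplit("_", 1)[1].split(".")[0])
    upcoming = range(n + 1, min(n + NOR_PER_RESPONSE, TOTAL_CHUNKS) + 1)
    return f"data-{n}", [f"/1080/chunk_{i}.mp4" for i in upcoming]


def base_of(url: str) -> str:
    """Return scheme and host, e.g. https://example.com."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def handle_request(url: str, cache: dict[str, str], origin_hits: list[str]) -> str:
    """Serve from cache, or fetch from the origin and prefetch what nor advertises."""
    if url in cache:
        return cache[url]
    pending = [url]
    while pending:
        current = pending.pop(0)
        if current in cache:
            continue  # already fetched or prefetched, no second origin request
        body, nor_paths = origin_fetch(current)
        origin_hits.append(current)
        cache[current] = body
        # Each response may advertise further chunks, so prefetches chain.
        pending.extend(base_of(current) + path for path in nor_paths)
    return cache[url]


cache: dict[str, str] = {}
origin_hits: list[str] = []
handle_request("https://example.com/1080/chunk_1.mp4", cache, origin_hits)
print(len(origin_hits))  # 6: one client request warmed every chunk
handle_request("https://example.com/1080/chunk_2.mp4", cache, origin_hits)
print(len(origin_hits))  # still 6: chunk 2 was served from the cache
```

The `if current in cache` check is what keeps the chain from fetching the same chunk twice, even though consecutive responses advertise overlapping lists.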
Not only is this logic powerful, but it's also super simple to implement. For Varnish Enterprise users, this is as simple as a single include statement in your VCL. One line for all of the functionality, and it’s purely dependent on whether or not these CMSD headers are included from the backend. As a result, there will be no effect on any existing logic. Implementing the “cmsd-static.vcl” solution into your VCL can be as simple as this:
And just like that, we have implemented a prefetching solution into our environment! Our clients are happy, our managers are happy, and you are looking at getting a raise and promotion ;)