May 31, 2021

Handling large files with Varnish Transit Buffer


Transit Buffer is a feature available in Varnish Enterprise release 6.0.8r2. Read more here.


Caching and uncacheable content


In the world of caching, a cacheable object is something that the cache has been programmed to keep in memory for a certain duration, for example an index.html file. Delivering cached objects is extremely fast, and fast delivery of cached content is an HTTP cache’s primary purpose. Once the duration passes, we say that the object expires, and it is removed from the cache. The next request for that object makes the cache fetch it again from the source, a backend server, and the cycle begins anew.


So, is there any content that you wouldn’t want cached? Definitely: content with cookies or any other private information, but also very large files. Caching a large file tends to evict thousands of smaller files from the cache to make room. Those smaller files are there for a reason: they have been requested before, and they may be popular content. Caching large objects can therefore reduce the performance of the whole site.
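
The caching rules described above can be sketched in VCL. This is a minimal fragment and not a recommended policy: the 100 MB threshold and the 5 minute TTL are hypothetical values chosen for illustration, and the size check assumes the backend sends a Content-Length header (chunked responses without one fall through to the TTL). It is meant to be merged into a VCL file that already declares a backend.

```vcl
vcl 4.1;

import std;

sub vcl_backend_response {
    # Hypothetical threshold: treat responses over 100 MB as too large to cache.
    # std.integer() returns the fallback (0) when the header is missing or invalid.
    if (std.integer(beresp.http.Content-Length, 0) > 104857600) {
        # Do not store the object; it is still delivered to the client.
        set beresp.uncacheable = true;
        return (deliver);
    }

    # Hypothetical duration: keep everything else for five minutes, then expire it.
    set beresp.ttl = 5m;
}
```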




Handling large files


Varnish has a state called vcl_pass, which allows you to serve objects directly, without using the cache. This feature is very useful when sending big, often multi-gigabyte, files such as large video segments or ISO files. During a pass, Varnish still handles the delivery to the client itself, but it fetches the object directly from the backend server every time.


This is an example of letting Varnish handle delivery of content that is always taken directly from a backend server:


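vcl 4.1;

backend default {
   # Set .host to the address of your backend server.
   .host = "";
   .port = "1234";
}

sub vcl_recv {
   # Bypass the cache: fetch every request directly from the backend.
   return (pass);
}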


One issue with this feature was that Varnish would not pace the transfer, and backend servers are often much faster at delivering content than clients are at receiving it. As a result, content would by and large be pulled in full into Varnish and then slowly sent to the client, taking up a lot of memory in the meantime. If a slow client requests a 200GB file, Varnish can end up holding 190GB+ of it early in the transfer. This has been our experience.

One workaround for this has been to use vcl_pipe. When piping transfers, Varnish no longer controls the delivery, and it is actually the backend server that does this, directly to the client. If there are any weaknesses in the delivery process of the backend, they are now exposed. Pipe also requires you to have logic in both vcl_pipe and vcl_backend_fetch. In the absolute worst case, a backend could decide to keep the connection open and parse any subsequent requests from the client, effectively bypassing the Varnish cache.
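
As a rough illustration of the pipe workaround, the fragment below sends matching requests to pipe mode. The URL rule for ISO images is a hypothetical example, not something prescribed by Varnish, and the fragment assumes a backend is declared elsewhere in the same VCL file.

```vcl
vcl 4.1;

sub vcl_recv {
    # Hypothetical rule: hand ISO downloads over to pipe mode.
    if (req.url ~ "\.iso$") {
        return (pipe);
    }
}

sub vcl_pipe {
    # From here on Varnish only shuffles bytes between client and backend;
    # any request adjustments for piped traffic have to be made in this subroutine.
    return (pipe);
}
```

Requests that do not match the rule fall through to the built-in vcl_recv and are handled as usual.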


Transit Buffer

There is now a new feature in Varnish Enterprise called Transit Buffer. It allows you to specify how far ahead a private backend transaction can be relative to the client receiving. Enabling it is very straightforward:


sub vcl_backend_response {
    set beresp.transit_buffer = 1M;
}


When this is enabled, Varnish will pause receiving from the backend as soon as it holds 1MB of data that has not yet been sent to the client. This feature works for regular streaming delivery with private fetches, that is, any time a single client is the only receiver of a fetch, and it is disabled if that changes.
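
For reference, here is one way the pass configuration and the Transit Buffer setting might be combined into a single file. It is a sketch under assumptions: the backend address is a hypothetical placeholder, and the check on bereq.uncacheable is an illustrative guard rather than a requirement of the feature. beresp.transit_buffer requires a Varnish Enterprise release that ships Transit Buffer.

```vcl
vcl 4.1;

backend default {
    # Hypothetical backend address; replace with your own server.
    .host = "127.0.0.1";
    .port = "1234";
}

sub vcl_recv {
    # Always fetch from the backend instead of serving from the cache.
    return (pass);
}

sub vcl_backend_response {
    # Limit how far the backend may run ahead of the client.
    # The guard keeps the setting scoped to private (uncacheable) fetches
    # in case vcl_recv is later changed to cache some content.
    if (bereq.uncacheable) {
        set beresp.transit_buffer = 1M;
    }
}
```

With this in place, a slow client downloading a very large file should keep roughly the buffer size in flight inside Varnish, rather than most of the file.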


Transit Buffer is available in Varnish Enterprise release 6.0.8r2, which you can find more about here.