One of Varnish's key strengths is its ability to quickly look up an object by hash in memory or on disk and return it as fast as possible. However, that object must originate from somewhere—usually a backend that Varnish fetches from. We can also directly insert cache content (with a POST request, for example), using synthbackend.
This opens up a lot of interesting applications, like using Varnish as a key/value store or even an object store. One of the simplest ways to use this is to serve live video streams.
To understand why this makes sense, let’s first look at how it works. This minimal example will accept a POST request, store the request body in cache under the request URL, and serve it back on subsequent GET requests.
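A minimal sketch of such a VCL, assuming Varnish Enterprise’s vmod-synthbackend, could look like the following. The `synthbackend.mirror()` call and the request-body buffering via `std.cache_req_body()` are taken from that vmod’s documentation, so double-check them against your version; the `X-Insert` helper header is just an illustrative way to flag inserts on the backend side.

```vcl
vcl 4.1;

import std;
import synthbackend;

# there is no real origin: everything is pushed into cache
backend default none;

sub vcl_recv {
    if (req.method == "POST") {
        # buffer the request body so the synthetic backend can use it
        std.cache_req_body(100MB);
        # flag this request as a cache insert for the backend side
        set req.http.X-Insert = "yes";
        # force a miss so a new push always replaces the stored object
        set req.hash_always_miss = true;
        return (hash);
    }
}

sub vcl_backend_fetch {
    if (bereq.http.X-Insert) {
        # the "backend" simply mirrors the request body back,
        # and that response gets cached under the request URL
        set bereq.backend = synthbackend.mirror();
    }
}
```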
After the cache insert, Varnish follows the default route of delivering the cache hit, which in this case will send the response back to the requester. Of course, the interesting thing is that when we GET that endpoint, we now retrieve the content we previously pushed.
Here’s a simple test of that behavior:
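For instance, with curl (assuming a Varnish instance listening on localhost; adjust the address and path to your setup):

```shell
# push some content into the cache with a POST...
curl -X POST -d "hello world" http://localhost/hello.txt

# ...then read it straight back out of the cache with a GET
curl http://localhost/hello.txt
```

The second command should return the body we just pushed, even though no backend ever produced it.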
But how is that useful for video delivery?
Of course, it all depends on your encoder, but if it is based on ffmpeg or similar, you can set it to push video with POST (or PUT) requests straight to a web server or a cache.
Here’s an example ffmpeg command you can try to stream a test video into this test instance:
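Something along these lines, for example. The synthetic test-pattern input, the segment length, and the URL are illustrative choices; the important parts are the HLS muxer and the `-method POST` option, which makes ffmpeg upload the playlist and segments over HTTP:

```shell
# encode a synthetic test pattern and push it as an HLS stream,
# segment by segment, straight into the Varnish cache
ffmpeg -re -f lavfi -i testsrc=size=1280x720:rate=30 \
    -c:v libx264 -preset veryfast -g 60 \
    -f hls -hls_time 4 -hls_list_size 5 \
    -method POST \
    http://localhost/live/master.m3u8
```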
You can then play it back while it’s actively encoding:
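For example with ffplay (any HLS-capable player, such as VLC, works just as well; the URL matches the push example above):

```shell
# play the live manifest while the encoder is still pushing segments
ffplay http://localhost/live/master.m3u8
```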
What happens is that ffmpeg (the first, long command) pushes video segments, and continuously updates the master.m3u8 manifest file, straight into the cache, and the stream is immediately playable by clients, as if there were a backend.
The huge benefit here is that new data is available immediately. In a regular setup, we’d have to wait for the manifest TTL to expire before the caching layer realizes it needs to refresh it (that works, of course, but this is objectively better, with no need to actively purge the cache).
To make this more useful, we can make a couple of quality-of-life changes to this minimal example: set an explicit TTL on pushed objects, and handle misses cleanly in `vcl_backend_error`.
With those, our VCL evolves to:
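A sketch of the evolved VCL, under the same vmod-synthbackend assumptions as before (the `X-Insert` helper header and the 404 body are illustrative choices):

```vcl
vcl 4.1;

import std;
import synthbackend;

backend default none;

sub vcl_recv {
    if (req.method == "POST") {
        # buffer the request body for the synthetic backend
        std.cache_req_body(100MB);
        # flag this request as a cache insert
        set req.http.X-Insert = "yes";
        # force a miss so a new push always replaces the stored object
        set req.hash_always_miss = true;
        return (hash);
    }
}

sub vcl_backend_fetch {
    if (bereq.http.X-Insert) {
        # cache the pushed request body under the request URL
        set bereq.backend = synthbackend.mirror();
    }
}

sub vcl_backend_response {
    # pushed objects linger for 2 hours before being evicted
    set beresp.ttl = 2h;
}

sub vcl_backend_error {
    # there is no real backend: a miss just means the object
    # was never pushed, or has already expired
    set beresp.status = 404;
    set beresp.body = "not found";
    return (deliver);
}
```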
Now the pushed stream segments will linger in cache for two hours before they are deleted, and our live video will remain playable and updated live for as long as the encoder keeps pushing.
Depending on the specs and the network card of the Varnish node, you can now serve live video to anywhere from hundreds to tens of thousands of users.
This basis can easily be extended, but with a few more features the VCL becomes a bit too long for such a blog post; we have it available on GitHub for you to check out if you’d like.
Live video streaming is obviously only one of many use cases, but blog posts are short and the possibilities are infinite. Whenever you need fast access to ephemeral content, this concept provides a quick and efficient way to deliver it, without having to organize a whole pipeline for pushing the content and then remembering to garbage-collect it.
That’s it from me; I hope you enjoyed reading this post as much as I enjoyed crafting the code. Varnish is an amazing toolbox that allows us, and you the users, to solve caching and load-balancing problems that may seem outside our usual perimeter, in a quick and elegant way.