August 17, 2021
3 min read time

Two-Minute Tech Tuesdays - Grace Mode

This week’s episode is about Grace mode, Varnish’s interpretation of stale-while-revalidate.




Grace mode allows Varnish to deliver slightly stale content to clients while it fetches a fresh version from the backend. If Varnish receives a client request for an expired object, it can still return a cache hit with the stale copy and, at the same time, send an asynchronous revalidation request to the origin server to fetch the latest version of the object.

It's pretty much a balancing act between keeping the content up-to-date and maintaining a decent level of performance. The default Grace value is 10 seconds, which means you can serve stale content for up to 10 seconds past the expiration time of an object.

Here's how that works: as long as the sum of the remaining TTL and the Grace value is greater than zero, we can still serve cache hits. So when you're dealing with a positive TTL, Grace is irrelevant because the content is fresh. It's a cache hit, regardless. But if the TTL drops to zero, and there's still some Grace left, we can still serve the response as a cache hit, knowing that it is stale content. That also applies when the TTL drops below zero. As long as there's enough Grace, you're fine: it's a cache hit with a stale object. But at a certain point, the TTL will be so low that not even the Grace value can keep the sum above zero. The result is a cache miss, and your asynchronous revalidation requests become synchronous fetches that the client has to wait for.
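
To make the rule above concrete, here is a minimal Python sketch of the decision. It is only a model of the behavior described in this post, not Varnish's actual implementation, and the function name `classify_lookup` is made up for illustration.

```python
def classify_lookup(remaining_ttl: float, grace: float) -> str:
    """Model the lookup outcome from remaining TTL and Grace, both in seconds.

    Hypothetical helper: it mirrors the rule described in the text,
    not Varnish's real code.
    """
    if remaining_ttl > 0:
        # Fresh object: Grace does not matter.
        return "fresh hit"
    if remaining_ttl + grace > 0:
        # Expired, but still within Grace: serve stale, refresh in the background.
        return "stale hit, asynchronous revalidation"
    # Expired and out of Grace: the client waits for the backend.
    return "miss, synchronous fetch"


# Default Grace of 10 seconds, with the TTL counting down past zero.
for ttl in (30, 0, -5, -10, -15):
    print(ttl, classify_lookup(ttl, 10))
```

With the default 10 seconds of Grace, an object that expired 5 seconds ago is still served as a stale hit, while one that expired 10 or more seconds ago results in a miss.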

There are plenty of ways to set the Grace value in Varnish. You can do this on a system-wide basis by overriding one of the runtime parameters. In this example, we're passing -p default_grace=100 to varnishd to set the default Grace to 100 seconds.


/usr/sbin/varnishd \
-a :80 \
-a localhost:8443,PROXY \
-p feature=+http2 \
-f /etc/varnish/default.vcl \
-s malloc,265m \
-p default_grace=100


However, you can still override this value on a per-object basis. 


Cache-Control: public, max-age=3600, stale-while-revalidate=100


You can use the Cache-Control header, more specifically its stale-while-revalidate directive, to override the Grace value on a per-object basis. In this case, the Grace is set to 100 seconds.
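
As a rough illustration of how such a header maps onto a TTL and a Grace value, here is a simplified Python sketch. The helper `ttl_and_grace` is hypothetical and only looks at max-age and stale-while-revalidate. Varnish's own header handling covers more cases (for example s-maxage), which this sketch deliberately ignores.

```python
def ttl_and_grace(cache_control: str, default_grace: float = 10.0):
    """Derive (ttl, grace) in seconds from a Cache-Control header value.

    Simplified, hypothetical model: only max-age and
    stale-while-revalidate are considered.
    """
    directives = {}
    for part in cache_control.split(","):
        # "max-age=3600" -> ("max-age", "3600"); "public" -> ("public", "")
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value
    ttl = float(directives["max-age"]) if "max-age" in directives else None
    # Fall back to the system-wide default when the directive is absent.
    grace = float(directives.get("stale-while-revalidate", default_grace))
    return ttl, grace


print(ttl_and_grace("public, max-age=3600, stale-while-revalidate=100"))
print(ttl_and_grace("public, max-age=60"))
```

The first call uses the header from this post and yields an hour of TTL with 100 seconds of Grace. The second has no stale-while-revalidate directive, so the assumed default of 10 seconds applies.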

You can do the same in VCL by setting beresp.grace in the vcl_backend_response subroutine. In this case, it would be an hour of Grace.


vcl 4.1;

sub vcl_backend_response {
    set beresp.grace = 1h;
}


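Even though we override the Grace value on a per-object basis, we can still choose to only use a limited amount of Grace on a per-request basis, based on specific criteria defined in vcl_recv. In this example, objects are stored with six hours of Grace, but as long as the backend is healthy, a request only accepts 10 seconds of it.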


vcl 4.1;

import std;

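sub vcl_recv {
    # While the backend is healthy, accept at most 10 seconds of stale content.
    if (std.healthy(req.backend_hint)) {
        set req.grace = 10s;
    }
}

sub vcl_backend_response {
    # Keep objects around for up to 6 hours past their TTL.
    set beresp.grace = 6h;
}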
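
The interplay between the two settings can be sketched in Python as follows. This assumes that a per-request Grace acts as an upper limit on the Grace stored with the object, so the smaller of the two applies. The helper name `effective_grace` is hypothetical and the code is a model, not Varnish's implementation.

```python
from typing import Optional

SIX_HOURS = 6 * 60 * 60  # Grace stored with the object, in seconds


def effective_grace(object_grace: float, request_grace: Optional[float] = None) -> float:
    """Grace that applies to one request (assumption: req.grace caps the object's Grace)."""
    if request_grace is None:
        # No per-request limit was set, e.g. because the backend is unhealthy.
        return object_grace
    return min(object_grace, request_grace)


# An object that expired 60 seconds ago.
remaining_ttl = -60

# Healthy backend: the request accepts only 10 seconds of Grace.
print(remaining_ttl + effective_grace(SIX_HOURS, 10) > 0)

# Unhealthy backend: the full 6 hours of Grace is available.
print(remaining_ttl + effective_grace(SIX_HOURS) > 0)
```

Under these assumptions, the object that expired a minute ago is no longer served stale while the backend is healthy, but it is still served when the backend is down.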


That was Grace mode, and it's quite a significant feature that allows you to revalidate your content without exposing the potential latency to the end-user. 

We'll be back next week with more content, and another video. In the meantime, check out our previous videos for more helpful tips: