September 11, 2015

Conditional requests versus cache invalidation

If your content ever changes, you’ll need some way to make sure the updated content reaches your users. The traditional way of doing this is to devise some sort of cache invalidation: you hook it into your CMS, and when content changes an HTTP PURGE or REFRESH call goes out to the Varnish Cache servers and the stale content is discarded. This is all pretty nice. There is one problem: if you ever lose a PURGE request, your content will stay stale for as long as your TTL allows. You have a few options:

  1. Make sure you don’t lose a PURGE
  2. Learn how to care less (nobody cares if the lolcats are slightly stale)
  3. Do something else. 

Read on:

There is an alternative to cache invalidation. You can have Varnish extend the validity of the content by making conditional requests to the backend. Let’s say you have a cached response that has just gone stale. Varnish can be configured to keep it around beyond the TTL. However, since Varnish isn’t allowed to serve content that is older than the TTL, it will need to verify that the object hasn’t changed before extending its validity.

To do so, Varnish makes a conditional request to the backend. Let’s look at how that works. First, the initial request (edited for brevity):

GET / HTTP/1.1
Host: www.varnish-software.com

The server responds (also edited):

HTTP/1.1 200 OK
Age: 1766
Last-Modified: Tue, 21 Jul 2015 12:05:01 GMT
Content-Encoding: gzip
(..)
(body)

To re-request this a couple of seconds later, we could make a conditional request like this:

GET / HTTP/1.1
Host: www.varnish-software.com
If-Modified-Since: Tue, 21 Jul 2015 12:05:01 GMT

And if the object has indeed not been modified, the server should respond with an HTTP 304:

HTTP/1.1 304 Not Modified
Age: 1762
Last-Modified: Tue, 21 Jul 2015 12:05:01 GMT

The response has no body. This tells Varnish that the object is unaltered. Varnish then copies the stale object into a new one, merging the headers from the stored response with those of the 304 response and keeping the body unaltered.
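
The merge step can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not Varnish's actual implementation: the function name `merge_on_304` is hypothetical, headers are modelled as a plain dict (real header names are case-insensitive), and headers carried by the 304 are simply assumed to override the stored ones.

```python
def merge_on_304(stored_headers, stored_body, revalidation_headers):
    """Build a refreshed cache object from a stale one and a 304 response.

    Assumption: headers present in the 304 response override the stored
    ones. The body is reused as-is because a 304 carries none.
    """
    merged = dict(stored_headers)        # copy, so the stale object is untouched
    merged.update(revalidation_headers)  # newer header values win
    return merged, stored_body


# Headers taken from the example exchange above.
stale_headers = {
    "Last-Modified": "Tue, 21 Jul 2015 12:05:01 GMT",
    "Content-Encoding": "gzip",
    "Age": "1766",
}
headers_304 = {
    "Last-Modified": "Tue, 21 Jul 2015 12:05:01 GMT",
    "Age": "1762",
}

new_headers, new_body = merge_on_304(stale_headers, b"(body)", headers_304)
```

Headers that only exist on the stored object, such as `Content-Encoding` here, survive the merge, which is why the 304 can stay so small.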

Tunables
You’ll have to decide on two tunables: the TTL, and how long past its TTL to keep the stale object. There is probably no reason to keep content around for less than an hour unless you are running very low on memory, and even then you can rely on the LRU (or LFU if you are using MSE) to evict objects when space runs out. The TTL should probably be in the range of 3 to 60 seconds. If you are all right with data being potentially 20 seconds out of date, then that should be your number. And as with all TTL calculations, the more traffic you have, the lower it makes sense to go.
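
To make the interplay of the two tunables concrete, here is a small Python sketch of how an object's age relates to them. The function name `object_state` and the state labels are hypothetical, and the sketch ignores grace and other details of Varnish's real expiry logic. In Varnish 4 and later the two values would normally be set in VCL through `beresp.ttl` and `beresp.keep`.

```python
def object_state(age, ttl, keep):
    """Classify a cached object by its age in seconds.

    "fresh"      -> may be served straight from cache
    "revalidate" -> past the TTL but still kept; needs a conditional request
    "gone"       -> past TTL + keep; must be fetched in full again
    """
    if age < ttl:
        return "fresh"
    if age < ttl + keep:
        return "revalidate"
    return "gone"


TTL = 20      # seconds: the staleness you are willing to accept
KEEP = 3600   # seconds: how long past the TTL the object is kept around

for age in (5, 45, 4000):
    print(age, object_state(age, TTL, KEEP))
```

With these example values an object that is 45 seconds old is no longer served as-is, but it is still available as the basis for a conditional request.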

Why is this useful?
If your CMS is very efficient at responding with a 304 whenever the content hasn’t changed, you have a good candidate for a setup that doesn’t rely on purges. Keeping the last-modified data in a fast key/value datastore is probably a good idea. There will be a performance cost to pay, naturally: revalidating an object adds several milliseconds to the delivery time. Having Varnish deliver content straight from memory will of course always be significantly faster, as memory access times are measured in nanoseconds whereas network access is measured in milliseconds.
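
One way a backend could answer revalidation requests cheaply is sketched below in Python. Everything named here is an assumption made for illustration: `LAST_MODIFIED` is a plain dict standing in for a fast key/value store, and `render_page` and `handle_get` are hypothetical placeholders rather than part of any real CMS. The point is only that the 304 path never touches the expensive rendering step.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical stand-in for a fast key/value store: path -> last change time.
LAST_MODIFIED = {
    "/": datetime(2015, 7, 21, 12, 5, 1, tzinfo=timezone.utc),
}


def render_page(path):
    """Placeholder for the expensive CMS rendering step."""
    return b"(body)"


def handle_get(path, request_headers):
    """Return (status, headers, body), answering 304 without rendering."""
    modified = LAST_MODIFIED[path]
    headers = {"Last-Modified": format_datetime(modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            if modified <= parsedate_to_datetime(since):
                return 304, headers, b""  # unchanged: no body, no rendering
        except (TypeError, ValueError):
            pass  # unparseable or zone-less date: fall through to a full response
    return 200, headers, render_page(path)


status, headers, body = handle_get(
    "/", {"If-Modified-Since": "Tue, 21 Jul 2015 12:05:01 GMT"}
)
```

A request without `If-Modified-Since`, or with a date older than the stored timestamp, falls through to the full 200 response.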

The upshot is a more robust system: as long as your CMS is doing what it is supposed to do, you’ll know that the content you serve is never more than N seconds out of date, where N is the TTL you chose.

Image is (c) 2009 Jannes Pokele used under CC license.