June 22, 2017

Varnish Web Developer Wiki Highlights: Cache Invalidation in Varnish with Examples

At the risk of sounding repetitive, cache invalidation is routinely cited, thanks to a now-immortal statement from Phil Karlton, as one of the two hardest things in computer science.

And it's true: keeping a cache coherent with its origin in real time is hard, and that makes cache invalidation a challenge. It's one we've been working on for just about as long as Varnish has existed.

To delve further into how exactly the different components of cache invalidation work with Varnish, here's an excerpt from the Varnish Wiki on this very subject. 

As usual, I'd like to remind you that the Wiki isn't just ours - it's also yours. I encourage you not only to take a look at it but also to contribute to it based on your own Varnish experience.

HTTP purge

acl purge {
        "localhost";
        "x.x.x.x"/24;
}

sub vcl_recv {
        # this code below allows PURGE from localhost and x.x.x.x

        if (req.method == "PURGE") {
                if (!client.ip ~ purge) {
                        return(synth(405,"Not allowed."));
                }
                return (purge);
        }
}

# Note the return (purge): vcl_recv ends here and processing continues in
# vcl_hash and then vcl_purge, without falling through to the built-in vcl_recv.
# To read more on this go to https://www.varnish-cache.org/docs/trunk/users-guide/purging.html

PURGE an article from the backend

The VCL is the same as in the HTTP purge section above. When an article changes, the backend application sends an HTTP PURGE request for the article's URL to Varnish, from an address that the purge ACL allows.
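
How the backend issues the PURGE is not shown in the wiki excerpt. As a minimal sketch, assuming Varnish listens on 127.0.0.1:6081 and the site is served as www.example.com (both placeholders), a Python backend could send the request with the standard library's http.client. The helper names purge_request and send_purge are made up for this example.

```python
import http.client


def purge_request(path, site):
    """Describe a PURGE for one URL as (method, path, headers)."""
    # The Host header matters: by default Varnish hashes on URL and host,
    # so it should match what real clients send.
    return "PURGE", path, {"Host": site}


def send_purge(path, site, varnish_host="127.0.0.1", varnish_port=6081):
    """Send the PURGE to Varnish and return the HTTP status code.

    varnish_host and varnish_port are assumptions; use your own listen address.
    """
    method, url, headers = purge_request(path, site)
    conn = http.client.HTTPConnection(varnish_host, varnish_port, timeout=5)
    try:
        conn.request(method, url, headers=headers)
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status
    finally:
        conn.close()
```

With the purge VCL in place, a call such as send_purge("/articles/42", "www.example.com") would be expected to return 200 from an allowed address and 405 from any other.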

Purge with restart

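After a purge, vcl_purge can turn the request into a GET and restart it, so Varnish re-runs the VCL state machine with the changed request and fetches a fresh copy from the backend right away. The X-Purger header marks that response as the result of a purge.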

acl purgers {
    "127.0.0.1";
    "192.168.0.0"/24;
}

sub vcl_recv {
    # allow PURGE from localhost and 192.168.0...
    if (req.restarts == 0) {
        unset req.http.X-Purger;
    }

    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Purging not allowed for " + client.ip));
        }
        return (purge);
    }
}

sub vcl_purge {
    set req.method = "GET";
    set req.http.X-Purger = "Purged";
    return (restart);
}

sub vcl_deliver {
    if (req.http.X-Purger) {
        set resp.http.X-Purger = req.http.X-Purger;
    }
}

Source: http://book.varnish-software.com/4.0/chapters/Cache_Invalidation.html?highlight=vcl_recv

Accessed: 17th August 2016

Softpurge

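  • Reduces TTL to 0
  • Allows Varnish to serve stale objects

import softpurge;

# Assumes vcl_recv returns (hash) for PURGE requests so that they reach vcl_hit.
sub vcl_hit {
    if (req.method == "PURGE") {
        # Expire the object but keep it available for grace
        softpurge.softpurge();
    }
}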

source: https://github.com/varnish/varnish-modules/blob/master/docs/vmod_softpurge.rst

Accessed: 17th August 2016
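
The difference between a regular purge and a soft purge is easier to see in a toy model. The following Python sketch is not how Varnish is implemented; CachedObject, hard_purge, soft_purge and stale_copy_available are hypothetical names, and times are plain seconds.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: str
    ttl: float    # seconds of freshness left; <= 0 means stale
    grace: float  # seconds a stale copy may still be served


def hard_purge(cache, key):
    """Remove the object outright, as a regular purge does."""
    cache.pop(key, None)


def soft_purge(cache, key):
    """Expire the object but keep it, and its grace, in the cache."""
    if key in cache:
        cache[key].ttl = 0.0


def stale_copy_available(cache, key):
    """True if an expired object can still be served under grace."""
    obj = cache.get(key)
    return obj is not None and obj.ttl <= 0 and obj.ttl + obj.grace > 0


cache = {
    "/a": CachedObject("article A", ttl=60.0, grace=3600.0),
    "/b": CachedObject("article B", ttl=60.0, grace=3600.0),
}
hard_purge(cache, "/a")  # "/a" is gone; the next request must wait for the backend
soft_purge(cache, "/b")  # "/b" is stale but can be served while it is refreshed
```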

Purge call

Purge call to X-Headers

Banning

Examples in the varnishadm command line interface:

ban req.url ~ /foo
ban req.http.host ~ example.com && obj.http.content-type ~ text
ban.list

Example in VCL:

ban("req.url ~ /foo");

Example of VCL code to act on HTTP BAN request method:

sub vcl_recv {
    if (req.method == "BAN") {
        ban("req.http.host == " + req.http.host +
            " && req.url == " + req.url);
        # Throw a synthetic page so the request won't go to the backend.
        return(synth(200, "Ban added"));
    }
}

source: http://book.varnish-software.com/4.0/chapters/Cache_Invalidation.html

To inspect the current ban-list, issue the ban.list command in the CLI:

0xb75096d0 1318329475.377475    10      obj.http.x-url ~ test0
0xb7509610 1318329470.785875    20C     obj.http.x-url ~ test1

Lurker-friendly bans

The following snippet shows an example of how to preserve the context of a client request in the cached object:

sub vcl_backend_response {
   set beresp.http.x-url = bereq.url;
}

sub vcl_deliver {
   # The X-Url header is for internal use only
   unset resp.http.x-url;
}
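
The reason for copying the URL into the object is that the ban lurker works through the cache in the background, without a client request, so it can only evaluate ban expressions that refer to obj.* fields. A rough way to check an expression before submitting it is sketched below in Python. The function names are made up for this example, and the parsing is deliberately naive: it assumes conditions are joined by && and that the regex values themselves do not contain &&.

```python
def ban_fields(expression):
    """Return the field name on the left-hand side of each condition."""
    conditions = [c.strip() for c in expression.split("&&")]
    return [c.split()[0] for c in conditions if c]


def is_lurker_friendly(expression):
    """A ban is lurker-friendly when no condition tests a req.* field."""
    return all(not field.startswith("req.") for field in ban_fields(expression))
```

For instance, is_lurker_friendly("obj.http.x-url ~ ^/blog") is true, while any expression that tests req.url is not.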

Now imagine that you just changed a blog post template, so every cached blog post needs to be invalidated. For this you can issue a ban such as:

$ varnishadm ban 'obj.http.x-url ~ ^/blog'

Since it uses a lurker-friendly ban expression, the ban inserted in the ban list will be gradually evaluated against all cached objects until all blog posts are invalidated. The snippet below shows how to insert the same expression into the ban list in the vcl_recv subroutine:

sub vcl_recv {
   if (req.method == "BAN") {
      # Assumes the X-Ban header is a regex;
      # this might be a bit too simple.
      ban("obj.http.x-url ~ " + req.http.x-ban);
      return(synth(200, "Ban added"));
   }
}

Purge and ban together example

sub vcl_recv {
  if (req.method == "PURGE") {
      return (purge);
  }

  if (req.method == "BAN") {
      ban("obj.http.x-url ~ " + req.http.x-ban-url +
          " && obj.http.x-host ~ " + req.http.x-ban-host);
      return (synth(200, "Ban added"));
  }

  if (req.method == "REFRESH") {
      set req.method = "GET";
      set req.hash_always_miss = true;
  }
}

sub vcl_backend_response {
  set beresp.http.x-url = bereq.url;
  set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
  # We remove the resp.http.x-* HTTP header fields,
  # because the client does not need them
  unset resp.http.x-url;
  unset resp.http.x-host;
}
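
To see which cached objects the BAN branch would invalidate, here is a small Python model of the two-condition expression. It is only a sketch: Varnish evaluates bans lazily (at lookup time or through the ban lurker) rather than by filtering a list, it uses PCRE rather than Python's re module, and CachedObject, matches_ban and apply_ban are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class CachedObject:
    x_url: str   # copy of bereq.url stored by vcl_backend_response
    x_host: str  # copy of bereq.http.host stored by vcl_backend_response


def matches_ban(obj, url_regex, host_regex):
    """Both conditions must match, mirroring the && in the ban expression."""
    # The ~ operator is an unanchored regex match, hence re.search.
    return (re.search(url_regex, obj.x_url) is not None
            and re.search(host_regex, obj.x_host) is not None)


def apply_ban(cache, url_regex, host_regex):
    """Return the objects that survive the ban."""
    return [obj for obj in cache if not matches_ban(obj, url_regex, host_regex)]


cache = [
    CachedObject("/blog/hello", "www.example.com"),
    CachedObject("/blog/hello", "shop.example.com"),
    CachedObject("/about", "www.example.com"),
]
# Equivalent to sending X-Ban-Url: ^/blog and X-Ban-Host: ^www\.
survivors = apply_ban(cache, r"^/blog", r"^www\.")
```

Only the blog post on www.example.com is dropped; the same URL on another host and the /about page stay cached.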

Force cache miss

sub vcl_recv {
  set req.hash_always_miss = true;
}

Setting req.hash_always_miss to true causes Varnish to look the object up in the cache but ignore any copy it finds. This is a useful way to do a controlled refresh of a specific object: if the server is down, the cached object is left untouched. Depending on the Varnish version, it might leave extra copies in the cache. It is particularly useful for refreshing slowly generated content.

source: http://book.varnish-software.com/4.0/chapters/Appendix_G__Solutions.html#solution-write-a-vcl-program-using-purge-and-ban

Xkey (formerly known as Hashtwo)

The idea behind Xkey is that you can use any arbitrary string for cache invalidation. You can then key your cached objects on, for example, product ID or article ID. In this way, when you update the price of a certain product or a specific article, you have a key to evict all those objects from the cache.

Xkey can be used to support Surrogate Keys in Varnish in a very flexible way.

On Debian or Ubuntu:

apt-get install varnish-modules

On Red Hat Enterprise Linux:

yum install varnish-modules

Finally, you can use this VMOD by importing it into your VCL code:

import xkey;

VCL example code for xkey:

import xkey;

backend default { .host = "192.0.2.11"; .port = "8080"; }

acl purgers {
    "203.0.113.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ purgers) {
            return (synth(403, "Forbidden"));
        }
        set req.http.n-gone = xkey.purge(req.http.key);
        # or: set req.http.n-gone = xkey.softpurge(req.http.key)

        return (synth(200, "Invalidated "+req.http.n-gone+" objects"));
    }
}

Objects are tagged with their keys through the xkey response header. Normally the backend is responsible for setting this header. If you were to do it in VCL, it would look something like this:

sub vcl_backend_response {
  set beresp.http.xkey = "secondary_hash_key";
}
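
To make the idea concrete, here is a toy Python model of a surrogate-key index. It is a sketch of the concept, not of the VMOD's internals: SurrogateKeyIndex and its methods are hypothetical, and it assumes that one xkey header can carry several whitespace-separated keys.

```python
from collections import defaultdict


class SurrogateKeyIndex:
    """Cache keyed by URL, with a secondary index from key to URLs."""

    def __init__(self):
        self.objects = {}               # url -> cached body
        self.by_key = defaultdict(set)  # surrogate key -> set of urls

    def insert(self, url, body, xkey_header):
        """Store an object and register every key from its xkey header."""
        self.objects[url] = body
        for key in xkey_header.split():
            self.by_key[key].add(url)

    def purge(self, key):
        """Evict every object tagged with key; return how many were removed."""
        removed = 0
        for url in self.by_key.pop(key, set()):
            # An object may already be gone if another key evicted it.
            if self.objects.pop(url, None) is not None:
                removed += 1
        return removed


index = SurrogateKeyIndex()
index.insert("/product/1", "shoe page", "product-1 category-shoes")
index.insert("/product/2", "boot page", "product-2 category-shoes")
index.insert("/category/shoes", "listing", "category-shoes")
```

Purging product-1 evicts a single page, while purging category-shoes evicts everything else that mentions the category, which mirrors the count that the VCL above reports in its synthetic response.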

source: http://book.varnish-software.com/4.0/chapters/Cache_Invalidation.html

A complete Grace example

# grace mode

import std;  # needed for std.healthy()

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # normal hit
        return (deliver);
    }
    # We have no fresh fish. Let's look at the stale ones.
    if (std.healthy(req.backend_hint)) {
        # Backend is healthy. Limit age to 10s.
        if (obj.ttl + 10s > 0s) {
            set req.http.grace = "normal(limited)";
            return (deliver);
        } else {
            # No candidate for grace. Fetch a fresh object.
            return(fetch);
        }
    } else {
        # backend is sick - use full grace
        if (obj.ttl + obj.grace > 0s) {
            set req.http.grace = "full";
            return (deliver);
        } else {
            # no graced object.
            return (fetch);
        }
    }
}

sub vcl_backend_response {
    set beresp.ttl = 10s;
    set beresp.grace = 1h;
}

sub vcl_recv {
    # initial state
    set req.http.grace = "none";
}

sub vcl_deliver {
    # copy to resp so we can tell from the outside.
    set resp.http.grace = req.http.grace;
}


# source: https://gist.github.com/perbu/93803707dbcdbc345da0
# blogpost: https://info.varnish-software.com/blog/grace-varnish-4-stale-while-revalidate-semantics-varnish

Source: https://info.varnish-software.com/blog/grace-varnish-4-stale-while-revalidate-semantics-varnish
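
The branching in vcl_hit above can be hard to follow at a glance. The Python function below restates the same decision table so that it can be tried with concrete numbers. It is a sketch for reasoning only: grace_decision is a made-up name, and ttl and grace are plain seconds, with a negative ttl meaning the object expired that long ago.

```python
def grace_decision(ttl, grace, backend_healthy, healthy_limit=10.0):
    """Return (action, grace_header) following the vcl_hit logic above."""
    if ttl >= 0:
        # Normal hit: the object is still fresh.
        return "deliver", "none"
    if backend_healthy:
        # Healthy backend: accept objects that are at most healthy_limit stale.
        if ttl + healthy_limit > 0:
            return "deliver", "normal(limited)"
        return "fetch", "none"
    # Sick backend: use the object's full grace period.
    if ttl + grace > 0:
        return "deliver", "full"
    return "fetch", "none"
```

With the 1h grace set in vcl_backend_response, an object that expired 30 seconds ago is refetched while the backend is healthy, but served stale while it is sick.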
