In this week's 2-Minute Tech Tuesday we'll be talking about "purging", which is Varnish's built-in mechanism to remove non-expired objects from the cache.
Purging is one of several mechanisms Varnish offers to invalidate the cache. When you put content in the cache, you need a way to get it out again when required.
Every object stored in the cache has a Time-To-Live (TTL) associated with it, and until that TTL expires, Varnish serves that version of the content to the client without revalidating with the origin.
But what if the origin already has updated content that clients cannot see yet, because Varnish is still serving the cached version? In that case, the origin needs to connect back to Varnish and actively purge the content from the cache. It does so by sending an HTTP PURGE request to Varnish:
PURGE /products HTTP/1.1
Host: example.com
This is an HTTP request that uses the custom PURGE method. When Varnish is properly configured for it, Varnish acknowledges the request, removes the object from the cache, and sends back a 200 status code.
This is especially important for time-sensitive sites, for example, media websites that have breaking news to announce. It’s crucial to have the latest content out there in the cache.
In terms of the implementation, we rely on Varnish Configuration Language (VCL). We hook into the vcl_recv subroutine and add an if-conditional that matches the request method. If the request method equals PURGE, we know we're dealing with a purge request, so we can call the built-in purge logic via return (purge). This removes the object from the cache and acknowledges the purge request with an HTTP 200 response. The VCL looks like this:
vcl 4.1;

sub vcl_recv {
    # Treat the custom PURGE method as a cache invalidation request.
    if (req.method == "PURGE") {
        return (purge);
    }
}
Varnish sends its response back to the client, which could well be the origin that triggered the cache invalidation. From a security point of view, however, this setup leaves you quite vulnerable: anyone who thinks to send purge requests to your server could end up emptying your cache. That's why you need a layer of security, implemented as an access control list (ACL) in VCL.
We name this access control list purge, and it can contain hostnames, IP addresses, or subnets. Then it's just a matter of adding another if-conditional that checks whether the client IP matches the purge ACL. When it doesn't, we return a synthetic 405 Method Not Allowed response, telling the client that it is not allowed to perform this call.
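The ACL check described above could be sketched roughly as follows. The entries in the purge ACL (localhost and the 192.168.55.0/24 subnet) and the "Not allowed" reason phrase are assumptions chosen for illustration, not values from the original setup. A complete VCL file would also need a backend definition, which is left out here.

```vcl
vcl 4.1;

# Assumed entries: replace with the hosts and subnets
# that are actually allowed to purge in your environment.
acl purge {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        # Reject purge requests from clients that are not in the ACL.
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed"));
        }
        # Client is trusted: remove the object from the cache.
        return (purge);
    }
}
```

Placing the ACL check inside the PURGE conditional means regular GET and HEAD traffic is unaffected. Only requests that use the PURGE method are tested against the list.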
Come back next Tuesday, when we’ll cover banning in two minutes or less. Stay tuned!