There are "things" I don't quite like, and "yet another" was already one of those even before I first browsed the web. While looking for a punny, catchy title, this is ironically what came to my mind. But the irony's not on me, it sadly is on the topic at hand. Also, it's one of those topics that regularly comes around, so having a blog post to refer to might come in handy. But you may ask, why yet another post on a solved problem? Well, why not?
Space: the final frontier. These are the voyages of the Varnish Project. Its continuing mission: to explore strange new requirements, to seek out new patches and new features, to boldly go where no Varnish user has gone before.
Let's go to the cloud and find them dynamic backends.
I remember very well my first encounter with Varnish. It was this thing with a weird name that worked differently from the tools I had used before. Trying to understand why it was so unfamiliar quickly settled things and Varnish's architecture suddenly became an eye-opener.
When I give a Varnish training and we reach the topic of cache invalidation, I like to use the following quote: "With great power comes great responsibility". Getting things into your cache is only one side of your policy; getting things removed from your cache (invalidation) is another.
I like to emphasize the word “policy”, and make the backends responsible for providing useful information (but that’s a topic for another blog post). When the backend tells you all you need to know, it allows you to not have to handle specific cases in your VCL. This way you can keep a minimal cache policy in Varnish, containing only value-added features, such as robust invalidation schemes, or cache poisoning mitigation.
What is cache poisoning?
Cache poisoning (or cache pollution) is a kind of denial-of-service attack that consists of abusing the behavior of a cache and forcing it to keep junk instead of relevant data. If you manage to fill a cache with enough junk, you can significantly slow down the happy path to legitimate content.
Take Varnish for instance. If you start making random requests at a high rate, you can fill the cache with 404 (Not Found) responses if the backends allow Varnish to cache them. For very high traffic websites, however, it would most likely have little effect on fresh content and instead hurt the long tail.
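To make the "hurt the long tail" effect more tangible, here is a minimal sketch in Python. It is a toy model, not Varnish's actual storage or eviction code: it assumes a plain LRU cache with a fixed number of slots, and every name in it (`LRUCache`, `simulate`, the `/hot/`, `/tail/` and `/junk/` URLs, the capacity of 100) is made up for illustration. Hot URLs are requested every round, each long-tail URL is requested only once after warm-up, and the attacker adds a configurable number of never-seen-before URLs per round, standing in for cacheable 404 responses.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny LRU cache standing in for a size-bounded HTTP cache (toy model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def fetch(self, url):
        """Return True on a hit; on a miss, store the response and return False."""
        if url in self.entries:
            self.entries.move_to_end(url)  # mark as most recently used
            return True
        self.entries[url] = "response"  # e.g. a cacheable 404 for a junk URL
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return False


def simulate(junk_per_round):
    """Return (hot_hits, tail_hits) after 50 rounds of traffic."""
    cache = LRUCache(capacity=100)  # hypothetical size, small on purpose
    hot = [f"/hot/{i}" for i in range(10)]
    tail = [f"/tail/{i}" for i in range(50)]

    # Warm the cache with all the legitimate content.
    for url in hot + tail:
        cache.fetch(url)

    hot_hits = tail_hits = 0
    junk_id = 0
    for round_no in range(50):
        # Fresh, popular content is requested every round.
        for url in hot:
            hot_hits += int(cache.fetch(url))
        # Each long-tail URL is requested only once, in its own round.
        tail_hits += int(cache.fetch(tail[round_no]))
        # The attacker requests URLs nobody has asked for before.
        for _ in range(junk_per_round):
            cache.fetch(f"/junk/{junk_id}")
            junk_id += 1
    return hot_hits, tail_hits


if __name__ == "__main__":
    print("no attack:  ", simulate(junk_per_round=0))
    print("with attack:", simulate(junk_per_round=20))
```

In this model the hot URLs keep hitting with or without the attack, because they are refreshed more often than the junk can push them out, while the long-tail hits collapse once junk starts evicting entries that are rarely requested. Real caches differ in many ways (object sizes, TTLs, eviction details), so treat the numbers as an illustration of the mechanism rather than a prediction.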