Varnish Cache is versatile. To date we’ve seen it used as a website cache, API gateway/manager, API cache, CDN reverse proxy and in a few other roles. One role that hasn't received as much attention is origin protection, or to use the slightly fancier term, “Origin Shield”.
To understand what origin protection is, let’s have a look at what CDNs do. They basically have two roles:
- To cache static or semi-static content as close to the user as possible.
- To keep established TCP or SSL sessions open between the user and the origin. This is particularly important for SSL as the cost of establishing a new SSL connection is significant, both in latency and computational intensity.
However, they might not always cache as well as you’d like. They are shared resources, and your responses might be dropped from cache if the pressure on the CDN increases or if the CDN starts to struggle. Some CDNs will automatically start to pass more and more requests to the origin if they are running out of IO capacity. In addition, you might have a lot of CDN servers, and depending on the configuration they might all go directly to the origin for the content. So if the CDN you are using is powered by thousands of servers, they might all ask the origin for the same content, over and over again.
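The fan-out effect described above can be illustrated with a small simulation. This is only a sketch, not a model of any real CDN: the `Origin` and `Shield` classes, the node count and the URLs are all hypothetical, and every edge node is assumed to start with a cold cache.

```python
class Origin:
    """Hypothetical origin server that counts how often it is asked for content."""

    def __init__(self):
        self.requests = 0

    def fetch(self, url):
        self.requests += 1
        return f"body of {url}"


class Shield:
    """Hypothetical caching layer in front of the origin (the role Varnish would play)."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def fetch(self, url):
        # Only the first request for a URL reaches the origin.
        if url not in self.cache:
            self.cache[url] = self.origin.fetch(url)
        return self.cache[url]


def simulate(upstream, edge_nodes=1000, urls=("/a", "/b", "/c")):
    """Every (cold) edge node asks its upstream once for every URL."""
    for _ in range(edge_nodes):
        for url in urls:
            upstream.fetch(url)


unshielded = Origin()
simulate(unshielded)

shielded = Origin()
simulate(Shield(shielded))

print(unshielded.requests)  # 3000: every edge node hits the origin for every URL
print(shielded.requests)    # 3: the shield fetches each URL once
```

The sketch ignores expiry and eviction, so the real-world saving will be smaller, but the shape of the effect is the same: the origin sees one request per object rather than one per edge node.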
Having a reverse proxy cache like Varnish at your origin might be a good investment if you have a bit of traffic. Even when the CDN is working quite efficiently, we often see cache hit rates of well above 60% on cache nodes deployed behind a CDN.
In addition, we’ve seen that even quite reputable CDNs might fail from time to time, especially when facing huge traffic spikes or a traffic pattern with a longer and fatter tail than usual. Having the capability of serving a big chunk of traffic out of origin can mean the difference between staying up and going down. Remember that when serving content from cache on an established connection there is almost no limit to the amount of traffic that a single Varnish Cache server can deliver.
It is worth mentioning that when you have multiple origin servers behind Varnish, you should take advantage of Varnish’s capabilities as a load balancer. Features such as grace and saint mode (available in Varnish 3.0 and in the upcoming 4.1 release) can significantly add to the resilience of your infrastructure.
So you want to protect your origin server. Now what?
There isn’t much to it. The tricky bit is figuring out what to cache and what not to cache. If you are using the Cache-Control header or the Surrogate-Control header to set TTL, Varnish will automatically do the right thing.
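To make the idea of a header-driven TTL concrete, here is a simplified sketch of how a shared cache might derive a TTL from Cache-Control. It is not Varnish's actual implementation: the function name `ttl_from_cache_control` is hypothetical, the 120-second fallback is an assumption for the example, and Surrogate-Control and Expires are deliberately left out.

```python
def ttl_from_cache_control(header, default_ttl=120):
    """Return a TTL in seconds for a shared cache, or 0 if the response
    should not be stored. Simplified: handles only a few directives."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip().strip('"')

    # Assumption for this sketch: treat these directives as "do not cache".
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        return 0

    # s-maxage is aimed at shared caches, so it takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except ValueError:
                return 0  # malformed value: play it safe and do not cache

    return default_ttl  # no explicit lifetime given


print(ttl_from_cache_control("public, max-age=60, s-maxage=600"))  # 600
print(ttl_from_cache_control("max-age=300"))                       # 300
print(ttl_from_cache_control("private, max-age=300"))              # 0
print(ttl_from_cache_control(""))                                  # 120
```

The first call shows why `s-maxage` is useful behind a CDN: browsers can be given a short lifetime while the shared cache at the origin holds the object for longer.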
Compared to a CDN, you can allow your Varnish installation to cache a bit more: you have better control over the content, and if you have some sort of cache invalidation mechanism in place, you can invalidate anything cached there within a millisecond or so.
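One common invalidation mechanism is an HTTP PURGE request sent to the cache. The sketch below only builds such a request. It assumes your VCL has been written to accept PURGE (Varnish's default configuration does not handle it) and to restrict it to trusted clients. The host name, port and path are hypothetical.

```python
import urllib.request


def build_purge_request(cache_host, path):
    """Build (but do not send) an HTTP PURGE request for one cached object."""
    url = f"http://{cache_host}{path}"
    return urllib.request.Request(url, method="PURGE")


# Hypothetical cache address and object path, for illustration only.
req = build_purge_request("varnish.example.internal:6081", "/products/42")
print(req.get_method(), req.full_url)

# To actually send it (requires a reachable cache that accepts PURGE):
# urllib.request.urlopen(req, timeout=2)
```

Hooking a call like this into your publishing workflow lets the origin cache hold content for a long time and still drop it as soon as it changes.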
Image is (c) 2009 Craig Loftus, used under CC license.