Managing dynamic CDN content can be challenging and costly. Large company or small, dynamic content is something every organization has to deal with. This blog post goes over the ideas and practices around using Varnish Cache to manage dynamic CDN content.
Dynamic content can be defined as any content that can change, whether on a regular schedule or unpredictably. Generally, dynamic content cannot rely on practices like URL fingerprinting, because its URL must stay the same while the content behind it changes. Something as simple as your homepage can be classified as dynamic content: it will change over time, and its URL cannot easily be fingerprinted. Caching your website on a CDN is considered standard practice, so dynamic content caching practices must be applied to maximize both content responsiveness and performance.
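For contrast, URL fingerprinting works by embedding a hash of the content in the URL, so every change produces a new URL and the old one can be cached for a long time. The following is a minimal Python sketch of the idea; the `fingerprint_url` helper and the eight-character digest length are illustrative assumptions, not part of any particular tool.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprint_url(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return `path` with a short content hash inserted before the extension.

    Hypothetical helper: real build tools differ in naming and hash choice.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    # "/static/app.css" becomes "/static/app.<digest>.css"
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = fingerprint_url("/static/app.css", b"body { color: black; }")
new_url = fingerprint_url("/static/app.css", b"body { color: navy; }")
# The two URLs differ, so caches treat the updated stylesheet as a new resource.
```

A stylesheet can be referenced under a new URL whenever its bytes change, but a homepage has to stay reachable at the same address, which is why it needs a different strategy.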
So how do we cache dynamic content? There are generally two strategies: direct cache invalidation and TTL management.
In this first installment of a three-part series (here are parts two and three), I’ll discuss direct cache invalidation.
Direct cache invalidation
Direct cache invalidation is a good strategy if your dynamic content is updated very infrequently or unpredictably. You can then use a longer max-age or TTL (time to live) and invalidate your content only when an update is made. If Varnish Cache is your only caching layer, this is very simple. Varnish Cache provides several facilities for invalidating content, such as banning and purging, plus Varnish Plus Enhanced Cache Invalidation for more advanced use cases. The results are immediate: the content is instantly removed. These facilities can be integrated into your backend so that content is automatically invalidated when it is updated.
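As a rough illustration of wiring invalidation into a backend, the Python sketch below builds an HTTP `PURGE` request for an updated path. It assumes the Varnish VCL has been configured to accept `PURGE` from the backend (Varnish only purges if the VCL handles that method), and the address `http://127.0.0.1:6081`, the host `www.example.com`, and the helper names are all hypothetical.

```python
import urllib.request


def build_purge_request(varnish_base: str, path: str, host: str) -> urllib.request.Request:
    """Build (but do not send) a PURGE request for one cached path."""
    url = varnish_base.rstrip("/") + path
    req = urllib.request.Request(url, method="PURGE")
    # The built-in Varnish hash uses the URL and the Host header,
    # so the request carries the public site's Host value.
    req.add_header("Host", host)
    return req


def purge(varnish_base: str, path: str, host: str, timeout: float = 2.0) -> int:
    """Send the PURGE to Varnish and return the HTTP status code.

    Assumes the VCL answers PURGE; otherwise the request is treated
    like any other unknown method.
    """
    req = build_purge_request(varnish_base, path, host)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Hypothetical values for a local Varnish instance fronting www.example.com.
request = build_purge_request("http://127.0.0.1:6081/", "/index.html", "www.example.com")
```

A publish hook in the backend could call `purge(...)` for each URL affected by an update, so the long TTL never serves stale content from Varnish itself.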
However, cache invalidation at the CDN level is trickier, for the simple reason that not all CDNs are equal. Some may not support direct cache invalidation; others may not guarantee immediate invalidation. The invalidation API also varies between CDNs, with no guarantee of feature parity. When using multiple CDNs (which is common practice for low-cost, high-performance, global distribution), the problem becomes more acute. More often than not, you are limited to standardizing on a single CDN vendor or forgoing a CDN altogether.
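To make the parity problem concrete, here is a small Python sketch of what a backend faces when it fans one invalidation out to several CDNs. The three CDN classes are entirely hypothetical stand-ins that only record calls in memory; a real adapter would wrap each vendor's own API, which this sketch does not attempt to model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InstantPurgeCdn:
    """Hypothetical CDN whose invalidation takes effect right away."""
    name: str
    purged: List[str] = field(default_factory=list)

    def invalidate(self, path: str) -> str:
        self.purged.append(path)
        return "immediate"


@dataclass
class QueuedPurgeCdn:
    """Hypothetical CDN that accepts the request but applies it later."""
    name: str
    queue: List[str] = field(default_factory=list)

    def invalidate(self, path: str) -> str:
        self.queue.append(path)
        return "eventual"


@dataclass
class NoPurgeCdn:
    """Hypothetical CDN with no invalidation facility at all."""
    name: str

    def invalidate(self, path: str) -> str:
        return "unsupported"


def invalidate_everywhere(cdns, path: str) -> Dict[str, str]:
    """Ask every CDN to invalidate `path` and report what each one promised."""
    return {cdn.name: cdn.invalidate(path) for cdn in cdns}


fleet = [InstantPurgeCdn("cdn-a"), QueuedPurgeCdn("cdn-b"), NoPurgeCdn("cdn-c")]
report = invalidate_everywhere(fleet, "/index.html")
```

Even with a uniform wrapper, the report mixes three different guarantees for the same update, so the backend cannot know when every edge has stopped serving the old content.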
Finally, we must consider client-side browser caching. In HTTP/1.1, there is no way for a server to invalidate a resource that has been cached by the browser. The browser itself decides when to re-request a cached resource, which you cannot control. HTTP/2 adds new facilities for servers to push content updates to clients, but orchestration of these facilities between your backend, caching layers, and CDNs may not be guaranteed, uniform, or even possible.
For all of these reasons, direct cache invalidation is commonly avoided as an impractical way to manage dynamic content at the CDN level. However, we can still leverage the flexibility of Varnish Cache to manage a more rigid CDN, or even a fleet of heterogeneous CDNs.
In the next installment, I’ll discuss TTL management for managing dynamic CDN content. The third installment brings it all together.
Image (c) 2013 Peter Trimming used under Creative Commons license.