As we launch into the tenth-anniversary festivities for Varnish Cache, we thought it would be a great time to dig into its history and take a look back at how your favorite web app accelerator started.
Varnish Cache is a web application accelerator, also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Varnish Cache is really, really fast. It typically speeds up delivery of your web application's content significantly; depending on your architecture, that can mean orders of magnitude.
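To make the "install it in front of" part concrete, here is a minimal VCL sketch (the backend name, host and port are placeholders for your own origin server):

```vcl
vcl 4.0;

# Hypothetical origin server: Varnish fetches cache misses from here
# and serves repeat requests for cached objects directly from memory.
backend default {
    .host = "origin.example.com";
    .port = "8080";
}
```

With just this, Varnish listens for HTTP requests, caches whatever the origin marks as cacheable and answers repeat requests without touching the backend at all.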
Varnish was designed specifically to replace Squid, a client-side proxy that can be adapted and used as a web accelerator. Its main design goal was to increase the scalability and capacity of content-heavy dynamic websites as well as heavily consumed APIs. Such sites run on web servers such as Apache or nginx, the origin servers that create the web content to be served. Varnish's job is not to create content, but to make content delivery lightning fast. It follows the Unix philosophy of "Do One Thing And Do It Well" (DOTADOIW) and thus focuses exclusively on the HTTP protocol.
Fast forward ten years and Varnish is used today by high-profile websites including Wikipedia, online newspapers such as The New York Times, The Guardian and The Hindu, and social media and content sites such as Facebook, Twitter, Vimeo and Tumblr. More than 13% of the top 10K sites and more than 2.3 million sites across the whole web run it, according to BuiltWith.
But how did the project start?
It all started in Oslo, Norway. Once upon a time there was a Dane of FreeBSD fame, Poul-Henning Kamp (PHK), and a couple of Norwegians: Anders Berg, sysadmin of Norway's biggest online newspaper, vg.no, and Dag-Erling Smørgrav, consultant at what was at the time Norway's biggest FOSS consultancy, Linpro.
Poul-Henning just published a nice write-up about how Varnish itself, and more specifically VCL, was designed, starting in February 2006. Anders did a Q&A session on how it all started. In this post I'm going to pick up from there and try to give you the context in which this amazing piece of software came about. So let me tell you the story.
Web performance was not a priority the first decade of the web
HTTP is the underlying building block of today's intertwined online world. It is the protocol your mobile apps use and the one your cable TV provider uses to stream your flicks, and the web itself runs as the vastest application on top of it.
The surfing patterns of today bear no resemblance whatsoever to what we understood as the web back in the mid-2000s. Until then, the focus of almost all web applications was providing users with features, so development moved in the direction of richer, more interactive applications. But they were slow.
Why did Varnish happen?
The rise of online media consumption, at-scale video consumption online (read YouTube) and the arrival of mobile internet access turned the tables. It showed us that we needed to think about the delivery of web content anew: Since the web was introduced we focused on features, but nobody focused on actual web performance.
In the early 2000s, to manage a site with millions of visitors a day you had two choices, provided you had the budget: buy an F5 BIG-IP, NetApp NetCache or a NetScaler (later Citrix) appliance, or run Inktomi's Traffic Server (later Yahoo's, now living at the Apache Foundation). If you lacked that budget, you could set up servers with Squid, NetCache's remote cousin and originally a client-side cache, and run it in reverse proxy mode, effectively turning it into a rather inefficient web accelerator. The more users you had, the more servers you needed. It was not uncommon to have tens of servers, which was neither technically effective nor cheap. Remember that budget part?
By 2005 new hardware technology had made inroads into the server room and memory was cheaper than ever. So when the good people at vg.no, a Norwegian media property, paired 64-bit computing, cheap RAM and advances in network hardware and made some calculations on a napkin, they knew that something could be done. They knew that the future belonged to open source, so the resulting software was to have an open and liberal license. And so they set out to build it.
Demand-driven enterprise open source software
They allocated some budget (yes, you need that everywhere for everything) and, only after some persuasion, they managed to contract Poul-Henning Kamp to design and develop the software. At the time he was one of the main contributors to the FreeBSD kernel. They also contracted a Norwegian open source consultancy called Linpro, to be responsible for the Linux port and to run the project long-term. Later it became the Nordics’ biggest FLOSS consultancy company, Redpill-Linpro. In 2010 Varnish Software was eventually spun off to solely focus on software development of Varnish and a suite of commercial software products and support.
And it has been developed in this fashion ever since. Varnish is demand-driven enterprise software. All major features have a great story behind them, a need they solved for someone somewhere and they have all been financially sponsored. Features such as streaming, ESI, load balancing directors and countless VMODs such as surrogate keys, cURL, memcached and many others come to mind. Besides VG, we count sponsors as varied as Facebook, LiveJournal, Globo, Ustream and many more.
It is also worthwhile to mention that lots and lots (and lots) of minor community contributions have made it rock solid over the last ten years. Be it feedback in our IRC channel, bug reports, emails to our lists, code patches or github pull requests, it has all been valuable.
Super stable core. Modularized development. HTTP/2.
The Varnish core has been in constant evolution since the very beginning. As we know that we cannot solve all problems, our approach has always followed the good advice of having a strong and stable core while providing escape hatches.
Even in our first release a decade ago, when the software could more or less only run at vg.no, you already had the ability to extend it using inline C. With the release of Varnish 2.0 more features were added, but it became obvious that extending Varnish needed a modularized approach. Varnish 3.0 introduced Varnish Modules (VMODs), which could add functionality to VCL. Varnish 4.0 turned traffic directors into VMODs, allowing for even more flexible and extensible load balancing, while Varnish 4.1 introduced the ability to create VMODs that implement their own custom transport towards the backend, since HTTP over TCP doesn't fit all use cases. The sum of all these changes effectively allows users to turn Varnish into an advanced HTTP engine, going well beyond mere caching.
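As a small taste of what a VMOD looks like in practice, here is a sketch using the std VMOD that ships with Varnish (the backend details are placeholders):

```vcl
vcl 4.0;

import std;  # the standard VMOD bundled with Varnish

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Write a line to the Varnish shared memory log for each request
    std.log("incoming request: " + req.url);
}
```

The same import mechanism is how third-party VMODs for cURL, memcached and the rest plug into VCL.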
As the HTTP protocol has evolved, so has our software. Architectural changes introduced in 2013 have laid down the path for future changes and we are currently working towards support of HTTP/2, at least the good parts of it.
Building great solutions on top of Varnish
Varnish is not only a cache. In its first years it was used to cache static objects, normalize HTTP headers, rewrite URLs or similar things.
Nowadays it is used for load balancing, API metering, implementing some of the biggest paywalls, building SSO gateways and WAFs, and caching the objects of major CDNs.
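Header normalization and URL rewriting remain among the most common tricks, and a typical sketch in VCL (the specific rules here are just illustrative examples) looks like this:

```vcl
sub vcl_recv {
    # Normalize Accept-Encoding so equivalent requests share one cached object
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } else {
            unset req.http.Accept-Encoding;
        }
    }

    # Rewrite www.example.com to example.com so both hostnames
    # map to the same cache entries
    if (req.http.host ~ "^www\.") {
        set req.http.host = regsub(req.http.host, "^www\.", "");
    }
}
```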
And I am sure the future will bring even greater possibilities for users of Varnish, many of which we are already seeing today.
Some users turn Varnish into a static web server fetching objects directly from the file system, which is rather fast and interesting in itself, but not as exotic as using Varnish as a cache with a FastCGI backend. Other things that come to mind are the ability to reshape and change content on delivery, as is currently done with gzip and ESI. There are solutions in production doing object decryption, image optimization and video pseudo-decoding. I would not recommend that for everyone, but I would not be surprised if facilities for this become available to users in the future.
Celebrate with us all the way to Varnish 5.0
As PHK wrote a couple of weeks ago, Varnish will be moving to a time-based release schedule. So you can expect the next major release to arrive on September 15th.