October 6, 2016
8 min read time

How it's made: Varnish 5.0 and HTTP/2

On September 15th, the Varnish Cache project released the fifth major version of Varnish, including one of its most anticipated features: HTTP/2 support. So it's time to look in more detail at what that really means for you.

The boring context section

As you may have noticed, Varnish is a tad late to the HTTP/2 party. That's because we didn't jump onto the hype wagon early on, owing to a certain lack of "technical enthusiasm" regarding the specification.

HTTP/2 has problems. Some are in the protocol itself, some are in the current implementations. But then again, the "current" HTTP version is also a bag of hurt, and over the course of 25 years we managed to build decent implementations and make the protocol ubiquitous. So, let's work with the H/2 we have instead of the one we want, right?

However, for this to happen properly, some (read: "a lot of") work needed to happen in Varnish to decouple the protocol, session and request bits in order to have a maintainable and extensible code base. That happened over the course of the last two years, and to be honest, we probably didn't need that amount of work just to make H/2 work.

But it will be needed for H/3, or QUIC, or whatever comes next, so the move toward the next big thing should be much less painful. To give you an idea, the actual H/2 work in Varnish really happened in three or four months, which is, all in all, quite fast considering how big H/2 is.

Sooooo, "experimental" support, eh?

That adjective is probably a bit too scary, but that's what's in the release notes, so let's go with it. The main reason for it being "experimental" is that development was frozen within a week of the release. As you may know, the Varnish Cache project switched to a twice-yearly release schedule; 5.0 was the first release to adhere to it, and, well, we made it in time, but barely.

This means that while the code is there, has been tested and works, it hasn't been battle-tested in a heavy-duty production environment. On the other hand, this is exactly why we are now releasing earlier and more often: so that people can test stuff and help us progress.

However, we are absolutely not in the "no need for QA, we have users!" way of thinking. H/2 support is announced as experimental, and it is treated as such: you have to explicitly enable it at runtime, or even on-the-fly using varnishadm (no compile-time options, this is truly the 21st century!). For more info about this, read Lasse's post here.
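In practice, flipping it on is a one-liner. The http2 feature flag is the mechanism 5.0 exposes, but the listener, VCL path and port below are just examples, so double-check with param.show feature on your own build:

    # enable it at startup
    varnishd -a :80 -f /etc/varnish/default.vcl -p feature=+http2

    # or flip it on a running instance
    varnishadm param.set feature +http2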

H/2 requires TLS! Where's my TLS?

This question is, sadly, based on a false premise. In the wake of the Snowden revelations, the IT world went through (some are not out of it yet) a "let's encrypt everything" phase, and there were talks of mandating TLS to encapsulate every H/2 connection.

However, that didn't make it into the final RFC, and one could argue that HTTP and TLS don't even belong to the same OSI layer. One would of course be ill-advised to do so, as respecting the OSI model is simply something people don't do, ever. In that final RFC, you may open an H/2 session in one of three ways:

  1. Use TLS and ALPN to tell the server you want some sweet H/2.
  2. Upgrade from H/1, using the normal upgrade path, setting no fewer than 3 headers (see the sketch after this list)!
  3. Blast a "connection preface" at the server, hoping it doesn't misinterpret that as an invalid H/1 request.
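To make options 2 and 3 a bit more concrete, here is roughly what they look like on the wire (paraphrased from RFC 7540; the base64 SETTINGS payload is elided):

    # option 2: plain-text upgrade from H/1, three headers needed
    GET / HTTP/1.1
    Host: example.com
    Connection: Upgrade, HTTP2-Settings
    Upgrade: h2c
    HTTP2-Settings: <base64url-encoded SETTINGS frame payload>

    # option 3: the client-side connection preface, sent as raw bytes
    PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n
    (immediately followed by a SETTINGS frame)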

Side note: while H/1 has an upgrade mechanism, this doesn't exist anymore in H/2. The reasoning behind that is sane: you can only open an H/2 connection by negotiating it first, and if you negotiated it, you could have negotiated whatever other protocol you needed.

Consequence of the side note: if you are using WebSockets, you are stuck with upgrading from H/1.

All this to say that Varnish won't encrypt your HTTPS connections, because that's not its job, but we are not leaving you naked in the wild either: Varnish supports the PROXY protocol (see Lasse's article for a direct approach to the matter), so you can plug in whatever SSL/TLS terminator you want.

I would personally (and professionally too!) recommend hitch, first because it's written by Dag, who's awesome, but also because it's super lightweight. Unlike other terminators that parse HTTP requests/responses, it absolutely doesn't care about what it's transporting, meaning less parsing and fewer resources used.
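As an illustration only (the ports, paths and certificate name below are made up, and you should check the options against your hitch version's documentation), a minimal hitch-in-front-of-Varnish setup could look like this:

    # /etc/hitch/hitch.conf -- terminate TLS, speak PROXY to Varnish
    frontend = "[*]:443"
    backend  = "[127.0.0.1]:8443"
    pem-file = "/etc/hitch/example.com.pem"
    write-proxy-v2 = on        # pass the real client address along

    # matching varnishd listener that accepts the PROXY protocol
    varnishd -a :80 -a 127.0.0.1:8443,PROXY -f /etc/varnish/default.vcl

Note that for the TLS + ALPN path to H/2, your terminator also has to advertise h2 during the handshake; whether and how hitch does that depends on its version, so check before relying on it.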

Your site might not get better, but it won't get dead-er

H/2 was finalized last year, and since then people have started using it (implementations started long before that, obviously). And, well, let's say that not everything is great.

There are multiple talks about H/2 performance and whether it's actually better; I'll leave that aside for this blog post. What I'd like to stress instead is that H/2 brings a handful of new attack vectors to the table, opening faulty implementations to worlds of pain.

Bugs happen, and none of us is safe from them, be it as users or as developers, and Varnish is no exception. Yet we acknowledge this issue and try to create a coding environment where mistakes aren't devastatingly crippling. The fact that Varnish has had no major CVE in 10 years is a good sign that it works (having a FreeBSD paranoid security nut as architect helps a lot, for sure). Following are just a few things that make Varnish a little bit safer.

Asserts, asserts everywhere

The Varnish code is written in C, a very powerful language, but coding in C is akin to strapping a plane engine to a skateboard: you'll go super fast, but any mistake might end up with you in a splatter pattern on a wall if you're not careful.

And so, the Varnish codebase is written very defensively, with assert calls killing the program if something goes wrong internally. As counter-intuitive as it sounds, it's actually the right thing to do: you have no idea how far things have gone wrong, so it's better to just scrap the whole thing and start with a clean slate.
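To give a flavor of what that looks like, here is an illustrative sketch in the spirit of that style (not a verbatim excerpt from the Varnish source): important structs carry a magic number, and functions assert their inputs before doing anything else.

    #include <assert.h>
    #include <stddef.h>

    /* Each important struct carries a magic number, so a stray or
     * recycled pointer is caught the moment it is used, not three
     * crashes later. */
    #define SESSION_MAGIC 0x29d1f3f5

    struct session {
        unsigned    magic;  /* must always be SESSION_MAGIC */
        int         fd;
    };

    static void
    session_feed(struct session *sp, const char *buf, size_t len)
    {
        /* Die immediately on a NULL or mistyped pointer. */
        assert(sp != NULL);
        assert(sp->magic == SESSION_MAGIC);
        /* A zero length means the caller is already confused:
         * better to stop here than to continue with bad state. */
        assert(len > 0);
        (void)buf;          /* actual parsing would start here */
    }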

Also, as we are purposefully "suiciding" Varnish in those cases, we have the facilities (panic) to debug fairly quickly because a good deal of information is available to us right away. And since only the worker process will crash, it will get respawned within moments by the management process, resuming work.

The take-away here is that bugs should, at worst, reset the worker process, and not, say, allow arbitrary code execution. And even if that were to happen, the worker process drops its privileges at startup to run as a regular user anyway, greatly limiting the attacker's possibilities.

The sky is the limit, but let's be reasonable

While the asserts (and coding practices as a whole) are good, generic coding hygiene, Varnish has a few strong opinions regarding HTTP security specifically. The most important one is that it won't put up with everything thrown at it, and the administrator should be aware of this.

If you have some crazy requests with massive cookie headers, or if your API requires calls with 100 headers just for arguments, chances are Varnish will cause you a few problems. This is due to a few parameters defining what is considered legitimate traffic. You can find them all with varnishadm param.show and adjust them with param.set (an example follows this list), but here are a few of them:

  • http_req_hdr_len: maximum size of a single request header. (default: 8k)
  • http_req_size: maximum total size of the request headers. (default: 32k)
  • timeout_idle: time allowed to receive the complete request headers. (default: 5s)
  • and more.
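For instance, inspecting and (carefully) bumping one of these limits on a running instance goes like this:

    # look at the current value and its documentation
    varnishadm param.show http_req_size

    # raise it on the fly; add -p http_req_size=64k to varnishd to make it stick
    varnishadm param.set http_req_size 64k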

This may seem limiting, and you'll probably want to raise some of them because sometimes you DO need 16KB-long cookies (or rather, your backend framework does). On the other hand, those restrictive defaults probably save a few websites by throwing out obvious attacks.

The issue with H/2 is that the number of possible attacks has grown by an order of magnitude. For example, the big selling point of the new protocol is that you can multiplex requests, meaning one client can request 1000 objects at once. No need for friends to DoS servers anymore! Obviously, the H/2 designers thought about that, and a server can limit the number of concurrent requests, but that limit is only supposed to become active once the client acknowledges the setting. See the problem?

Side note: by the way, the RFC recommends NOT setting a limit on concurrent streams, and, if you really must, setting it no lower than 100. That's a huge jump compared to the 6 simultaneous connections typical of H/1.

To avoid these sorts of issues, Varnish 5.0 has a few additional parameters to try and protect you from the new vectors. The defaults are currently, at best, a clever guesstimate, and any feedback based on production use would be of great value.
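If you're curious about which knobs your build exposes for this, listing the parameters and filtering on the h2 prefix is a quick way to find them (the exact names and set of parameters depend on your Varnish version, so treat this as a pointer rather than a reference):

    varnishadm param.show | grep h2_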

Not sharing is caring

In a similar fashion, Varnish also has an interesting, and defensive, memory model. Roughly, each task is handled by a thread, and each thread gets its own memory workspace; all the needed work, on headers for example, will happen in that space.

The performance aspect is interesting: memory locality is much better, and CPUs love that. But it also means that if one task needs more memory than planned, it will just fail, instead of eating memory indefinitely.
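To illustrate the idea (this is a stripped-down sketch of the concept, not Varnish's actual workspace code): the allocator hands out slices of one fixed, per-task buffer and simply says no once that buffer is exhausted, rather than reaching for malloc again.

    #include <stddef.h>

    /* A per-task scratch area: one contiguous buffer, a cursor, no free(). */
    struct workspace {
        char    *buf;   /* start of the reserved memory */
        size_t   len;   /* total size of the workspace  */
        size_t   used;  /* how much has been handed out */
    };

    /* Return a chunk of the workspace, or NULL when the task is over
     * budget.  The caller is expected to fail the task cleanly, not
     * to fall back on the heap. */
    static void *
    ws_alloc(struct workspace *ws, size_t size)
    {
        if (ws->len - ws->used < size)
            return (NULL);          /* over budget: fail the task */
        void *p = ws->buf + ws->used;
        ws->used += size;
        return (p);
    }

In real Varnish, the sizes of those per-task workspaces are governed by parameters such as workspace_client and workspace_backend.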

Again, this can be seen as limiting, but one failed task is probably better than a whole server being brought to its knees. Plus, it's easy, using varnishlog, varnishstat and friends, to figure out what is going on, whether the requests are legitimate, and whether you should make the workspaces larger.

The bright future

My take on this may seem a bit grim, with H/2 having problems and opening a few cans of worms here and there. And yet, things will get better as we learn how to work with it, implement things more smartly, and develop new patterns based on these new possibilities.

And Varnish is ideally architected for the things to come. Think for example about push promises (sending a resource to a client before it knows it needs it): this is the kind of complex decision that is subject to a lot of inputs and heuristics, so we'll see a lot of different approaches. While one may prevail and get included in Varnish, it's more plausible that we'll see VMODs tackling this problem, allowing you to use one policy, or the other, or both!

And H/2 will no doubt continue H/1's journey as "the" internet protocol, whether we like it or not, delivering all kinds of content and catering to all sorts of use cases. As Varnish is an awesome HTTP toolbox, we are sure to see some interesting (read "crazy") setups with it, totally unrelated to content distribution.

The future is going to be fun!