Life as a reverse-proxy can be tough, you know? On one side, you have users in a hurry coming to you to get their content fast, and on the other side, you have backend servers that need to be protected because if you just overwhelm them, who's going to serve new content?
Up until a few days ago, the open-source Varnish project offered only one strict and uncompromising answer. But now… now we have a second option: connection queueing!
What we had before
First, let's understand the actual problem, and how it was solved previously. The issue at hand is spiky traffic. Sometimes there are more requests than your backend(s) can handle. This can happen in a bunch of cases, for example:
- There's an influx of users, and they all want content that isn't, or can't be, in cache
- You brought a new node online, and you have no cache warming, no persistence and no sharding, so for a while, that server needs to build its cache
- You lost a backend from your pool, and the others need to pick up the slack
Backends are getting better and better at handling load gracefully, but some still struggle, so Varnish has a solution for that: the max_connections parameter in the backend definition:
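As a sketch, a capped backend definition could look like this (the host, port, and the cap of 30 are placeholder values):

```
backend default {
    .host = "192.168.0.1";
    .port = "8080";
    # never open more than 30 simultaneous connections to this backend
    .max_connections = 30;
}
```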
In this example, Varnish will still do connection pooling, reusing idle connections and creating new ones if all the existing ones are busy. BUT it will refuse to create a 31st connection, capping the pool size.
And so far, if you were the client needing that 31st connection, well, you wouldn't be getting your content. Varnish would start the backend fetch, realize that it can't/won't create a new connection, and therefore simulate a connection failure.
It's a bit rude, but the reasoning makes a lot of sense: it needs to protect the backend, and waiting on a connection will tie up resources on the off chance that things get less busy. So instead, Varnish just simulates a backend error and goes on with its life.
Good things come to those who wait
It's nice to be principled and all that, but life is made of compromises...and of knowing your environment. Sometimes, it IS actually better to tie up some resources for a little while to avoid delivering error messages (I heard that users usually don't like them too much). And Varnish is all about empowering users, so we needed a new option. Well, two actually: backend_wait_timeout and backend_wait_limit.
Starting with Varnish Enterprise 6.0.11r4, when max_connections is reached, connections will now be queued until a slot is available. Of course we can't wait forever, so the two new (very intuitively named) parameters impose limits on the queue:
- backend_wait_timeout restricts how long a task can stay in the queue before failing
- backend_wait_limit ensures we do not have too many tasks in the queue, failing tasks that don't fit in it
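For illustration, these global parameters can be adjusted at runtime through the standard varnishadm CLI (the values below are assumptions for the sketch, not recommendations):

```
# wait at most 5 seconds for a connection slot to free up
varnishadm param.set backend_wait_timeout 5

# allow at most 100 tasks to sit in the queue at any one time
varnishadm param.set backend_wait_limit 100
```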
Connection queueing is activated as soon as those global parameters are set. But if you want a more fine-grained approach, you can also use their per-backend equivalents: .wait_timeout and .wait_limit:
Of course, you can set truly huge numbers to get a virtually unlimited wait capability if you want to, which I usually don't recommend because an infinite backlog isn’t nice. But that's the beauty of freedom, you do what YOU need and want.
Didn't you say something about open source?
I absolutely did! backend_wait_timeout and backend_wait_limit made their debut last year in the Enterprise version of Varnish, but now it's time to contribute them to the open-source version. Our engineers Walid and Darryl have been hard at work building the feature and aligning the code base for open-source inclusion, and the Pull Request just got merged a few days ago!
The road ahead
There are also a handful of enhancements we are ironing out to reduce resource usage and improve ergonomics, but those are topics for another blog post.
In the meantime, you can test this feature right now with Varnish Enterprise, or wait until September for the next open-source version!