This blog post is part one of a two-part series. (Find part two here.)

Did you know that Quentin Tarantino's Kill Bill was originally supposed to be one four-hour movie? Later, he decided to make it a two-part deal, allowing him to give each part a very different feel and tone. The same thing happened to George R.R. Martin with A Storm of Swords: the book grew so big that he eventually had to split it into two volumes, so we don't have to read a 1,300-page book.

Well, it turns out that after George and Quentin, it's my turn. I originally planned an article about backends, probes and directors, but as I wrote and wrote, I realized it was becoming too large, so I did what they did: I split it in two! This piece tackles the backend and probe side of things, including their monitoring, and the second part will take care of the whole director and load-balancing chunk of it.

We are going to cover a lot of ground here, and even though we won't touch Varnish internals and this is only half of the original post, it's still a big one. So, grab a sandwich, buckle up and let's go!

Baby got backends

Defining your backends

Varnish is a cache, and as such, it won't create content (most of the time, anyway); it must fetch it from somewhere, namely a server. A backend is a definition explaining to Varnish how to establish and maintain a connection to such an HTTP server to get this content.

The quickest way to define a backend is to put it directly into the command line, like this:

/usr/bin/varnishd -b 127.0.0.1:8080 ..

This tells us, as you can probably guess, that our backend is on the same machine (IP 127.0.0.1) and on port 8080. It's super easy, and very practical to quickly protect a flaky web service (or a prototype server that can't handle the load yet).

However, you can only specify IP:PORT this way, which can be quite limiting; plus, for reasons that will become obvious later, you can only define one backend this way. If we want more backends, we have to turn to our good friend the VCL (Varnish Configuration Language) and do something like this:

backend mybackend {
    # mandatory:
    .host = "127.0.0.1";
    #optional:
    .port = "8080";
    .host_header = "mybackend.example.com";
    .connect_timeout = 2s;
    .ssl = 1;
}

This defines the same backend, but it offers a lot more configurability, as you can see, and there are more: 

  • .host_header: the value to set the Host header to when sending requests to this backend. That can be important, as quite a few web servers index their virtualhosts on this header.
  • .connect_timeout, .first_byte_timeout and .between_bytes_timeout: these are exactly the same as the global ones (found using varnishadm param.show) but specific to this backend. It's useful to give a little more leeway to a slower server, for example.
  • .probe: don't sweat it for now, we'll see that below.
  • .max_connections: how many connections you authorize Varnish to open to this backend. Remember that connections are pooled and reused, so your backend should have an easier life anyway, but sometimes you need to protect it even more. Be careful though: this is a hard limit, and if it is reached, Varnish won't use this backend, even if it means returning an error to the user.
  • .proxy_header: this is currently only in master, and will be part of 5.0 in September; it allows you to use the proxy protocol to talk to this backend, and you can specify the version of the protocol to use (1 or 2).
  • .ssl: should Varnish use https to talk to the backend? Note that this parameter is badly named: TLS is of course supported, while actual SSL will be avoided most of the time. This one is an exclusive Varnish Cache Plus feature.

I often say this, but it bears repeating: don't overtune! Varnish gives you a lot of switches and knobs, which doesn't mean you should touch them all, especially the timeouts. The defaults are pretty reasonable and should work in most cases. You should only change them if you have identified a problem with your backend that tweaking them could solve.

New year's and backend resolutions are hard

Now, before we go further, there's one thing you should know about the .host field: it is resolved to an IP only once (at startup for the command line, and at load time for the VCL).

For IPs, that is not a problem (and of course, both IPv4 and IPv6 are accepted), but if you give a domain name, it must resolve to exactly one IP, otherwise Varnish won't start (command line) or won't load the configuration (VCL).

This seems pretty restrictive, but deciding to be simple and strict allows for better code, fewer heuristics and more predictability. If you need to resolve domain names on-the-fly and/or manage pools of resolved IPs, you can look at vmod-named and vmod-gwist. But if you don't know yet what a director is, keep reading, and be sure to catch the second part of this post.

This can probe useful

Are you in( a )band?

Some reverse-proxies do in-band probing, meaning that they'll use the normal traffic to determine if a server is up and running ("healthy" in Varnish terminology) or down and non-responsive ("sick"). Typically, it goes like this if your backend is unresponsive:

  • client: Hi, can you give me "/otter.html", please?
  • cache: Sure, I don't have it, lemme ask my backend.
  • cache (to backend): Hi backend, can you send "/otter.html" over, please?
  • backend: ...
  • cache: Dude?
  • backend: ...
  • cache: Man, you're taking too long here, I'm gonna have to time out.
  • backend: ...
  • cache (to client): I'm sorry sir, I can't help you, here, have a 503.

Of course, I'm ignoring the fail-over capabilities here, but it's not the point: the inconvenience of this system is that you have to wait for a user request to discover that your server is dead.

Varnish on the other hand does out-of-band probing, meaning it will periodically poll the server and update the status "in the background". So, with Varnish, the previous conversation becomes:

  • client: Hi, can you give me "/otter.html", please?
  • cache: Sorry sir, my backend is down, can't help you. Please, take this 503.

In the end, the user still receives a 503, but much faster, because we are not piggybacking on their request to poll our backend and waiting for a timeout. It also decouples the fetching options from the probing options, allowing finer tuning.

Allowing a few missteps

Let's look deeper into how it works. At regular intervals, a minimal HTTP request (using "/" as the URL) is sent, and if the backend responds quickly enough and with the correct status code, that probe is OK. Then we look at the last few probes (the window) and check whether enough of them (the threshold) are good: if so, the backend is healthy and will be used; otherwise it's sick and will be avoided.

Here's an example with a window of 5 and a threshold of 3: a timeline of probe results with the matching backend health.

[Figure: timeline of probe results and the resulting backend health, with window = 5 and threshold = 3]

As soon as the last 5 probes contain fewer than 3 OKs, the backend becomes sick. As you can see, this makes the probe more robust to flip-flopping, as a single "OK" or "NOK" isn't enough to change the health.

Recycle, reuse

In VCL, declaring and using a probe is done like so:

probe myprobe {
    .window = 5;
    .threshold = 3;
}

backend mybackend1 {
    .host = "127.0.0.1";
    .probe = myprobe;
}

You can of course reuse myprobe for multiple backends.

Backends with no probe will be considered healthy, UNLESS there's a probe named "default", in which case they will use it. Names have meaning!
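To illustrate (a minimal sketch, the values are arbitrary): the backend below gets probed without ever referencing the probe, simply because of the probe's name.

probe default {
    .window = 5;
    .threshold = 3;
}

backend mybackend2 {
    .host = "127.0.0.1";
    # no .probe field needed: the probe named "default" applies automatically
}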

Finally, if you just need a quick and dirty probe, you can use the anonymous syntax:

backend mybackend1 {
    .host = "127.0.0.1";
    .probe = {
        .window = 5;
        .threshold = 3;
    }
}

Live, learn and adapt

Defining good probes is critical because you want them to reflect the health of your server as accurately and as responsively as possible. Unfortunately, these two goals are often contradictory: choosing a large window will smooth out erroneous probes, but it does so by sacrificing reaction time.

Finding the right parameters is an iterative process, as the probe mechanism is only an approximation of the backend's real health, but the probes' versatility means you can get very close easily. Regarding responsiveness, you already know about .window and .threshold, but there's also .interval (the time between the starts of two consecutive probes) and the obvious .timeout. Be wary of this last one: it is way too common to see backends going sick because a sysadmin decided that "0.1 seconds should be enough time for any server to reply".

Above, I urged you not to overtune; there is however a parameter that I'd like you to consider carefully when creating a probe: .url, a.k.a. the location in the probe request. It's very important to choose it well because, again, a probe can only approximate the health of a backend, but it must try to do so correctly. So, select a location that causes your server to do some actual work resembling regular requests. For example, choosing a static file (super easy to serve) for a PHP server is not a good idea, because the server could be choking under dynamic requests and still serve your probe without effort.

And because you need a probe reflecting your workload, the minimal request can be too minimal, for example because you need to add an extra header to include a token, or because you need to send a specific body. In those not-rare-enough cases, the .request field will allow you to specify the whole request, headers AND body.

And if we do this for the request, it'd be unfair not to do it for the response, wouldn't it? That's why you can use .expected_response to specify the return code you want for an OK probe. However, if you do use this field, please send me a mail, or find me on Twitter; I'd really like to know more about your backend.
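Putting it all together, here's a sketch of a more thorough probe (the path, host and token header are made up for the example; Varnish joins the .request strings with CRLF, and the "Connection: close" line ensures the backend terminates the response):

probe myprobe {
    # full custom request instead of the minimal "GET /"
    .request =
        "GET /health HTTP/1.1"
        "Host: www.example.com"
        "X-Probe-Token: s3cr3t"
        "Connection: close";
    # hypothetical backend answering health checks with a 204
    .expected_response = 204;
    .interval = 5s;
    .timeout = 2s;
    .window = 5;
    .threshold = 3;
}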

Now what?

Take the hint!

Now that our backends are defined, it's time to use them! I will dutifully avoid directors here, as they will be the subject of the second post, so what we can do in VCL is a bit limited, but still very useful.

We'll play with these two backends:

# look, minimal definitions:
backend alpha { .host = "192.168.0.101"; }
backend bravo { .host = "192.168.0.102"; }

The first thing you can do in VCL is assign a backend to a request, so that if a miss, pass or pipe occurs, Varnish will go to it to process the request:

set req.backend_hint = alpha;

Combined with the rest of the VCL syntax, you can route requests according to their Host header, like a plain old web server would:

sub vcl_recv {
    if (req.http.host == "alpha.example.com") {
        set req.backend_hint = alpha;
    } else if (req.http.host == "bravo.example.com") {
        set req.backend_hint = bravo;
    } else {
        return (synth(404, "Host not found"));
    }
}

Of course, you can use any element of the request to make your choice, using the same syntax: it could also be the extension of the requested file, or the language asked for by the user. One prominent case these days is API routing (something the API engine helps you do easily), where you use Varnish as a single entry point and route requests to dedicated API servers based on the beginning of the URL (i.e., the API endpoint), as sketched below.
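For example, a minimal routing sketch (the /users and /orders prefixes are hypothetical endpoints, and we reuse the alpha and bravo backends from above):

sub vcl_recv {
    # dispatch on the URL prefix instead of the Host header
    if (req.url ~ "^/users/") {
        set req.backend_hint = alpha;
    } else if (req.url ~ "^/orders/") {
        set req.backend_hint = bravo;
    } else {
        return (synth(404, "Unknown endpoint"));
    }
}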

And what happens when no req.backend_hint is set? The answer is obvious: the default backend will be used. BUT be careful, it may not be named "default": if no backend was actually named "default", the default one is the first one defined.

Going back to that point we left unexplained at the start, this is why you can't specify both a VCL file and a backend, or two backends, on the command line: you'd have one or two anonymous backends and no way to reference them.

Say "aaaaaaaah"

Routing is fine and all, but alpha or bravo may be down, and sending a request to a dead backend is a bit dumb: Varnish won't even try, and will simply generate an error. Instead, we may want to send a custom message.

In that kind of case, we'll rely on the trusty vmod-std to check the health of the backend (note the import):

import std;

sub vcl_recv {
    if (req.http.host == "alpha.example.com") {
        set req.backend_hint = alpha;
    } else if (req.http.host == "bravo.example.com") {
        set req.backend_hint = bravo;
    } else {
        return (synth(404, "Host not found"));
    }
    if (!std.healthy(req.backend_hint)) {
        return (synth(503, "No healthy backend"));
    }
}

We just added a check after our previous code, generating a tailored message if the selected backend is not up. This example is a bit anticlimactic because of how easy and logical it is, but that's what configuration SHOULD be anyway.

Another cool example of std.healthy features grace. A graced object is an object whose TTL has expired but that we decided to keep around to deliver in case our backend is unavailable:

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # A pure unadulterated hit, deliver it
        return (deliver);
    }
    if (!std.healthy(req.backend_hint) && (obj.ttl + obj.grace > 0s)) {
        return (deliver);
    } else {
        return (fetch);
    }
}

What we do here is simple but can use some explaining:

  1. we are in vcl_hit, so we found a matching object
  2. we check its TTL, which is a countdown timer: if it's not yet negative, the object is still valid, so we deliver it.
  3. the TTL has expired, but the grace may still be okay. If that's the case AND the backend is down, we deliver the cached object. It's a bit stale, but it's better than nothing.
  4. finally, if the grace is gone or the backend is up, we fetch a new object to give to the client.
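For this to kick in, objects need some grace to begin with; a minimal sketch (the one-hour value is arbitrary, pick whatever staleness your content can tolerate):

sub vcl_backend_response {
    # keep expired objects around for up to one hour past their TTL
    set beresp.grace = 1h;
}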

Keeping an eye on all that

It would be nice to know a bit more about what's going on, wouldn't it? Fortunately, we have some tools at our disposal that do just that.

But before I show you the wonders that are varnishadm, varnishlog and varnishstat, I have to tell you a bit about VCL and backends and temperature.

In the old days, backends were managed by Varnish directly, but to allow a bunch of nice features, they are now part of the VCL's responsibility, and this means that every time a new VCL is loaded or discarded, its backends are created or destroyed.

Why does it matter? Because while there can only be one...(pause for dramatic and nerd effect)...active VCL, old VCLs have to finish serving their ongoing requests before being discarded, so you may have multiple backends with the same name, representing the same machine, across multiple VCLs. And that's why you will see the names of your backends prefixed with the name of the VCL responsible for them.

varnishadm

Our first stop is varnishadm because while it doesn't allow or show much, the little it does is invaluable.

Getting the health of your servers is as easy as "varnishadm backend.list"; add "-p" (and optionally a glob to filter on backend names) for a detailed probe report, like this:

gquintard@rayorn:~/$ varnishadm backend.list -p *.bravo
Backend name                   Admin      Probe
boot.bravo                     probe      Sick 0/5
 Current states  good: 0 threshold: 3 window: 5
 Average response time of good probes: 0.002031
 Oldest ================================================== Newest
 444444444-4444--4444------4444444444444444444444444---444------- Good IPv4
 XXXXXXXXX-XXXX--XXXX------XXXXXXXXXXXXXXXXXXXXXXXXX---XXX------- Good Xmit
 RRRRRRRRR-RRRR--RRRR------RRRRRRR--RRRRRRRRRR----RR---RRR------- Good Recv
 HHHHHHHHH-HHHH--HHHH------HHHHHHH--HHHHHHHHHH----HH---HHH------- Happy

Here you get a more detailed report, notably containing some history. The last four lines tell you whether the probing went wrong, and at which point of the exchange. I used a Python http.server as my backend, and I either stopped it or froze it (using ctrl+z); you can clearly see which is which here.

By the way, for a backend without a probe (like boot.alpha), the "Admin" field still defiantly says "probe". It just means you are not forcing the backend's status and are letting the probe, if any, do its job.

Which means you CAN force the status! And it's very easy to do:

varnishadm backend.set_health boot.* sick

With this, I can set all backends belonging to the "boot" VCL to sick, which may sound stupid: why would you want your backends to not respond? But it's actually pretty clever and useful, because a sick backend won't be used anymore; you can therefore use this to drain its connections before a maintenance operation. And once it's done, you just have to revert:

varnishadm backend.set_health boot.* auto

And that's it!

varnishlog

Then there's the ever-useful varnishlog, which is not very surprising when you consider the amount of information it already contains.

But even if you're a regular Varnish user, you may have never seen a line about probes in a varnishlog output - why is that?

Simply because varnishlog, by default, groups lines by transaction, and probe requests are not linked to any transaction, so you won't see them unless you add "-g raw" to your command. The command I recommend is this:

varnishlog -g raw -i Backend_health

It produces lines resembling:

0 Backend_health - boot.alpha Still healthy 4--X-RH 5 3 5 0.001904 0.001956 HTTP/1.0 200 OK

That's quite a lot of information, so here's a little guide on how to read it:

  • 0 Backend_health -: consider that invariant in this context, but you can find more information with "man vsl".
  • boot.alpha: the name of the backend, prefixed with the VCL's name as mentioned earlier.
  • Still healthy: the status of the backend, you can also get "Went sick" for example.
  • 4--X-RH: these seven symbols represent bits (all 0s is "-------", all 1s is "46xXrRH"), and they mean the same things as in the varnishadm output; the only new ones are:
    • "6": like "4", but for an IPv6 connection
    • "x": the transmission failed
    • "r": the reception failed
  • 5 3 5: number of good probes in the last window, threshold and window, easy, right? One good idea is to check regularly that the first and last numbers are the same, i.e., to verify that the probe isn't too flaky.
  • 0.001904 0.001956: response time and average response time. It's again good to check that these two don't diverge too much (i.e., small variance), and that they don't go too near the timeout limit (a.k.a. dangerously close to failing).
  • HTTP/1.0 200 OK: the backend response, so you can see how it failed.

That is A LOT of data! Thankfully, it's quite readable (and parsable by, say, awk) too, and all of it is useful both to understand why a backend was marked sick and to analyze whether your probes are correctly tuned.
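If you want to track this programmatically, here's a hedged one-liner (the field positions are counted from the sample line above, assuming the usual two-word status such as "Still healthy"); it prints the backend name, the last response time and the rolling average:

varnishlog -g raw -i Backend_health | awk '{print $4, $11, $12}'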

Sure, you won't use this command every day, but knowing it's available and that Varnish is not a black box is pretty reassuring, don't you think?

varnishstat

And lastly, we have our varnishstat, ready to give us some metrics about our backends. As usual, the data is absolute, accumulated since the start of Varnish, so if you want "per period" info, you'll need to compare two data points, one at each end of the period.

The fields we'll have a look at follow this pattern: "VBE.vclname.bename.property".

  • VBE: won't change, this is the stat section we're in: Varnish BackEnd.
  • vclname.bename: this is the full backend name, again
  • property: well, that's the property name, but you already guessed that, and we have:
    • happy: number of OK probes
    • bereq_hdrbytes/bereq_bodybytes/beresp_hdrbytes/beresp_bodybytes: how many header/body bytes were sent to and received from that backend.
    • pipe_hdrbytes/pipe_out/pipe_in: volume of data sent and received when piping.
    • conn/req: number of connections and requests used on that backend.

I'd like to direct your attention to conn: that's a gauge, meaning it reports the current number of connections open to that backend. If you set the backend to sick and are waiting for it to finish handling old requests before restarting it (or just killing it, who knows?), this is the field you want to watch as it goes down to 0. But be careful and remember that "that" backend may exist multiple times across VCLs.
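A quick sketch of that watch (using the boot VCL and alpha backend from our examples; varnishstat's -f option filters fields by glob):

# check the connection gauge; repeat until it reaches 0
varnishstat -1 -f VBE.boot.alpha.conn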

If you want a pipeable output of varnishstat, you can use "varnishstat -1", or you can use "varnishstat -j" for the JSON output that notably includes descriptions of all counters (the same descriptions are available via "man varnish-counters").
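For scripting, a hedged example with jq (assuming Varnish 4.x, where the counters sit at the top level of the JSON object, each carrying a "value" field):

# extract the number of OK probes for boot.alpha from the JSON output
varnishstat -j | jq '."VBE.boot.alpha.happy".value'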

To be concluded

Aaaaaaaaaaand we are mostly done for today! You now have all the tools to define backends and set up probes to monitor this little crowd. Also, note that I only showed you "Varnish" tools, but as the whole idea is to be pluggable, if you already have a tool able to read the Varnish shared memory logs and stats, chances are you can just use it to get your backend/probe information.

Stay tuned for the second part in a few days; it will be almost exclusively about directors, how they work and how to use them better!


Photo (c) 2010 Clever Cupcakes used under Creative Commons license.