Two-Minute Tech Tuesdays - Backend Health

In this week's episode of Two-Minute Tech Tuesday, we'll talk about backend health: backend logs, health probing, and what to do when your backend goes down.

VCL health probe

A backend's health can be checked through a VCL health probe. As you can see in the VCL code below, the probe has a name and a collection of properties: the URL, the timeout, the polling interval, the evaluation window, and the threshold, which is the minimum number of successful probes within a window for the backend to be considered healthy.

vcl 4.1;

probe health {
    .url = "/";
    .timeout = 2s;
    .interval = 5s;
    .window = 5;
    .threshold = 3;
}

backend origin {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = health;
}

That probe is assigned to the backend through the .probe property. The varnishlog command, more specifically varnishlog -g raw -i backend_health, reads the Varnish shared memory logs and shows the backend health in real time.

The log shows whether the backend is still healthy, went sick, or is back healthy. A collection of probe bits gives you more insight. In the first line of the output below, 4---X-RH stands for an IPv4 connection (4) with a good transmission (X), good reception (R), and a happy backend (H).

The following field shows the number of successful probes within the window. The next two fields are the configured threshold and the configured window size. The logs also show how long the probe took, the average response time, and the server's response.

0 Backend_health - boot.origin1 Still healthy 4---X-RH 5 3 5 0.013721 0.010522 HTTP/1.1 200 OK
0 Backend_health - boot.origin1 Still healthy -------- 4 3 5 0.000000 0.010522 Open error 110
0 Backend_health - boot.origin1 Still healthy -------- 3 3 5 0.000000 0.010522 Open error 110
0 Backend_health - boot.origin1 Went sick -------- 2 3 5 0.000000 0.010522 Open error 110
0 Backend_health - boot.origin1 Still sick -------- 1 3 5 0.000000 0.010522 Open error 110
0 Backend_health - boot.origin1 Still sick -------- 0 3 5 0.000000 0.010522 Open error 110
0 Backend_health - boot.origin1 Still sick 4---X-RH 1 3 5 0.029645 0.015303 HTTP/1.1 200 OK
0 Backend_health - boot.origin1 Still sick 4---X-RH 2 3 5 0.020088 0.016499 HTTP/1.1 200 OK
0 Backend_health - boot.origin1 Back healthy 4---X-RH 3 3 5 0.014341 0.015959 HTTP/1.1 200 OK
0 Backend_health - boot.origin1 Still healthy 4---X-RH 4 3 5 0.007238 0.013779 HTTP/1.1 200 OK

One way of dealing with a sick backend is by serving stale content. This, of course, assumes that the requested content is already stored in the cache. We can configure this through VCL. Besides the boilerplate VCL code that contains the probes and the backend definitions, we can define some logic in vcl_recv and check whether or not the backend is healthy through std.healthy().

vcl 4.1;

import std;

probe health {
    .url = "/";
    .timeout = 2s;
    .interval = 5s;
    .window = 5;
    .threshold = 3;
}

backend origin {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = health;
}

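sub vcl_recv {
    # Backend is healthy: only accept objects up to 10s past their TTL
    if (std.healthy(origin)) {
        set req.grace = 10s;
    }
}

sub vcl_backend_response {
    # Keep objects for up to 24h past their TTL so they can be served stale
    set beresp.grace = 24h;
}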

When the backend is healthy, req.grace limits staleness to 10 seconds, which overrides the 24 hours of grace that vcl_backend_response sets on every object. When the backend is sick, that limit is not applied, so cached content can be served up to 24 hours beyond its TTL.

Another solution, which also caters to content that is not stored in the cache, is defining a secondary backend. We can define the backends separately in VCL and use a concept called "directors" to expose multiple backends as a single backend.

vcl 4.1;

import directors;

probe health {
    .url = "/";
    .timeout = 2s;
    .interval = 5s;
    .window = 5;
    .threshold = 3;
}

backend origin {
    .host = "origin1.example.com";
    .port = "80";
    .probe = health;
}

backend origin2 {
    .host = "origin2.example.com";
    .port = "80";
    .probe = health;
}

sub vcl_init {
    new director = directors.fallback();
    director.add_backend(origin);
    director.add_backend(origin2);
}

sub vcl_recv {
    set req.backend_hint = director.backend();
}

We import the directors VMOD and initialize a director in vcl_init. We choose a fallback strategy and assign it to a directors object called director. Through the .add_backend() method, we can add multiple backends.

Because it's a fallback strategy, the first defined backend will take precedence and when it becomes unhealthy, the secondary will take over.

In vcl_recv, we assign the director's backend on a per-request basis, and the director takes backend health into account. However, this is not always efficient because the secondary backend is only used when the primary is down.

We can also use a round-robin strategy for an active-active distribution, which means we're actually load-balancing while health is still taken into account.
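As a minimal sketch of that active-active setup, the fallback example can be adapted by swapping in the round-robin director from the same directors VMOD. The hostnames origin1.example.com and origin2.example.com are the placeholder values from the previous example, not real endpoints, and the probe settings are simply reused.

```vcl
vcl 4.1;

import directors;

probe health {
    .url = "/";
    .timeout = 2s;
    .interval = 5s;
    .window = 5;
    .threshold = 3;
}

backend origin {
    .host = "origin1.example.com";   # placeholder hostname
    .port = "80";
    .probe = health;
}

backend origin2 {
    .host = "origin2.example.com";   # placeholder hostname
    .port = "80";
    .probe = health;
}

sub vcl_init {
    # Round-robin instead of fallback: backends take turns,
    # and a backend that the probe marks as sick is skipped
    new director = directors.round_robin();
    director.add_backend(origin);
    director.add_backend(origin2);
}

sub vcl_recv {
    # Let the director pick a backend for each request
    set req.backend_hint = director.backend();
}
```

Compared to the fallback director, both backends receive traffic while they are healthy, so the second backend is not idle until the first one fails.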

Join us next week for another Two-Minute Tech Tuesday topic, presented to you in 2 minutes or less!
