This post was initially titled "Top 6 Varnish mistakes", to echo Espen's blog. Even though his material is three years old, the content is still relevant. Plus, there's some healthy competition between colleagues going on here, and I can't just mimic Espen if I hope to beat him, so I had to do something different.
 
Instead, I'm going to compete on brutality: this post isn't about explaining the most common mistakes, but rather the ones that can really kill your service, one way or another. I know it sounds a bit scary, and it is, but that's for your own good, really. So buckle up, and let's see what I have in my Little Shop of Errors.

thread_pool_add_delay REALLY adds delay

If your installation started with Varnish 4.0+ AND you followed Espen's advice to not trust everything you read on the interwebz, you are probably safe. On the other hand, if your installation was migrated from Varnish 3.0, have a very careful look at your Varnish command line and search for thread_pool_add_delay. More lazily, if
ps aux | grep varnishd | grep thread_pool_add_delay
returns something, we need to act! It may have been a great idea to add it at some time, but not any more.
 
See, that parameter was created with a noble goal in mind: try and not crush your kernel by asking it to create threads too rapidly. Indeed, contrary to other software with a fixed number of workers, Varnish will happily spawn and destroy threads to handle the traffic load, roughly using one thread per request (client or backend).
 
But traffic can spike very quickly (imagine a live football match being streamed, or the opening of a sales period), leading Varnish to create a boatload of threads in a short amount of time. A few years ago, kernels weren't that great at this, so thread_pool_add_delay was added to pace ourselves and not flood the kernel: it defines the minimal amount of time between two thread creations. Two milliseconds was deemed a sensible default at the time, to give you a frame of reference.
 
As kernels got better and users needed more and more performance, thread_pool_add_delay became a limitation rather than a safeguard, and "the web" started advocating for reducing that number. People started adding this to their command line options:
-p thread_pool_add_delay=1
Websites flew even faster, performance was had and life was grand.
 
Fast-forward to 4.0, when we decided to tidy up the parameters a little bit. Some used seconds, others milliseconds, which was messy, so we used the major release to clean that up, settling on seconds for all time parameters. Since we broke the ABI (expected for a major release) but not the API, this:
-p thread_pool_add_delay=1
was still "working", but meant a whole different thing. In the 4.0 world, it said "only create one thread every second, at most". And since Varnish only tries to create threads when it really needs them, you understand why it can be catastrophic to have it wait for more than a fraction of a second. Thankfully, the fix is easy: just remove the parameter from your command line options, the default is now 0, and you probably shouldn't touch it ever again.
 
Note: if you want to understand the meaning of a certain parameter, you can use "varnishadm param.show PARAMETER"; notably, it will give you the default value of the option as well as a detailed description of it.
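
For instance, to inspect the parameter we just discussed, run this on the Varnish host:

varnishadm param.show thread_pool_add_delay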

default_ttl is more of a fallback than a default

Vocabulary: TTL stands for "Time To Live" and is the period during which an object should be served from cache.

I've encountered quite a few users expecting that setting default_ttl to 120 was a blanket instruction telling Varnish to keep all objects for 2 minutes. Welp, guess what? That's not quite the case.

Let's have a look at what varnishadm has to say about it:

    default_ttl
        Value is: 120.000 [seconds] (default)
        Minimum is: 0.000

        The TTL assigned to objects if neither the backend nor the VCL
        code assigns one.

        NB: This parameter is evaluated only when objects are created.
        To change it for all objects, restart or ban everything.

There's a very important sentence here: "The TTL assigned to objects if neither the backend nor the VCL code assigns one." Varnish tries to be a good HTTP citizen, and it will trust the backend, reading its "Expires", "Age" and "Cache-Control" headers (among others) to compute the TTL according to the RFC. default_ttl is only used if we weren't able to get ANY information from the backend. If you wish to set a blanket TTL, you have to do it in VCL:

sub vcl_backend_response {
    set beresp.ttl = 120s;
}

One notable consequence is that changing "max-age" in beresp doesn't impact the TTL for Varnish, because the TTL is computed only once, before we enter vcl_backend_response, as can be seen in the varnishlog:

- Timestamp Beresp: 1510666680.699230 0.000801 0.000719
- BerespProtocol HTTP/1.0
- BerespStatus 200
- BerespReason OK
- BerespHeader Server: SimpleHTTP/0.6 Python/3.6.3
- BerespHeader Date: Tue, 14 Nov 2017 13:38:00 GMT
- BerespHeader Content-type: text/html; charset=utf-8
- BerespHeader Content-Length: 858
- TTL RFC 120 10 0 1510666681 1510666681 1510666680 0 0
- VCL_call BACKEND_RESPONSE
- BerespHeader X-Url: /
- BerespHeader X-Host: localhost

Varnish first decides on the TTL (the default 120s applies here because the backend stays mum about what we should do), and only then does it execute the VCL code that has a chance of overriding the TTL.
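
To make that concrete, here's a minimal sketch of the difference (the 300-second value is just an illustration):

sub vcl_backend_response {
    # Rewriting the header here does NOT recompute the TTL:
    # beresp.ttl keeps the value decided before this subroutine ran.
    set beresp.http.Cache-Control = "max-age=300";

    # To actually change how long Varnish keeps the object, set beresp.ttl explicitly.
    set beresp.ttl = 300s;
}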

Note: if the TTL line is a bit obscure, have a look at "man vsl", it lists all the tags and fields you'll find in the varnishlog.

default_grace will kill your live streaming

Vocabulary, again: grace is the time during which, after the TTL, Varnish is allowed to deliver an object. If that happens, a new object fetch will be triggered asynchronously to freshen the cache. That way we can always deliver something immediately without waiting for the backend, at the cost of delivering something slightly stale.

default_grace is the same as default_ttl, but for grace (duh!): if the backend gives us no indication, we set the grace to that value, and the default default_grace is 10s. Easy enough.

The issue for live streaming is that "slightly stale" isn't good enough. Live streaming relies on a manifest file that keeps the same URL but whose content is renewed every few seconds to point to new chunks. By delivering a manifest that can be up to 10 seconds late, you hide the newer chunks from the player, which may react badly.

If you are doing live streaming, the best thing to do is to add "-p default_grace=0" to your command line so that the backend can still override it if needed, but you can also set beresp.grace in vcl_backend_response.
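
If you'd rather handle it in VCL, a minimal sketch could look like this (the manifest URL pattern is only an illustration, adjust it to your packager):

sub vcl_backend_response {
    # Manifests must never be served stale, or players will miss the newest chunks.
    if (bereq.url ~ "\.m3u8$") {
        set beresp.grace = 0s;
    }
}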

Builtin

This one was present in Espen's entry, and with good reason. Failure to understand the built-in VCL can destroy your performance, notably because of cookies, but not only because of them.

When you load a VCL, Varnish will append its own built-in version to it. It makes sure that even an empty VCL can still work. The idea is that if your subroutine doesn't return with a decision (pass or lookup, for example), Varnish can turn to the built-in to get it.

You can output the built-in VCL using

varnishd -x builtin

But you can also find it here: https://github.com/varnishcache/varnish-cache/blob/4.1/bin/varnishd/builtin.vcl. And your distribution probably also packages a commented version somewhere in /usr/share/varnish/.

Let's have a look at the most frequent and traumatic annoyances. In vcl_recv for example, you have these lines:

if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
}

If a request has an Authorization header or a Cookie one, the request is almost certainly customized, so Varnish backs off by default to avoid caching private information. If your CMS uses trackers such as Google Analytics, all requests will probably arrive with a cookie, meaning they won't be cached, which can be a danger to your backend.
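
One common workaround is to strip the tracking cookies you don't care about before the built-in vcl_recv sees them. Here's a sketch, assuming Google Analytics-style cookie names (adapt the list to your own trackers):

sub vcl_recv {
    if (req.http.Cookie) {
        # Remove _ga/_gid/_gat cookies from the Cookie header.
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|; )(_ga|_gid|_gat[^=]*)=[^;]*", "");
        # If nothing meaningful is left, drop the header entirely so the
        # built-in logic can cache the request.
        if (req.http.Cookie ~ "^[; ]*$") {
            unset req.http.Cookie;
        }
    }
}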

Another example is vcl_backend_response:

sub vcl_backend_response {
    if (bereq.uncacheable) {
        return (deliver);
    } else if (beresp.ttl <= 0s ||
      beresp.http.Set-Cookie ||
      beresp.http.Surrogate-control ~ "no-store" ||
      (!beresp.http.Surrogate-Control &&
        beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
      beresp.http.Vary == "*") {
        # Mark as "Hit-For-Miss" for the next 2 minutes
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}

If the backend replies with a Set-Cookie header, a zero or negative TTL, or a no-cache instruction, Varnish will mark the object as uncacheable ("Hit-For-Miss") for 120 seconds, to disable request coalescing.

There are various ways to fix this, from correctly configuring the backend, to cleaning the request or response before it reaches the built-in, to bypassing the built-in altogether. The built-in is here to make sure things don't fail spectacularly, security-wise; the trade-off, as you can see, is performance, but it's only a default configuration, and you can override it to fit your setup perfectly.
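
As an illustration of the "cleaning the response before it reaches the built-in" approach, here's a sketch that ignores Set-Cookie on static assets (the URL pattern and TTL are assumptions, not a recommendation):

sub vcl_backend_response {
    if (bereq.url ~ "\.(css|js|png|jpg|gif|ico|woff2?)$") {
        # Static assets rarely need cookies; dropping the header lets the
        # built-in logic cache them instead of marking them Hit-For-Miss.
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1h;
    }
}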

Note: There's an interesting discussion about the built-in VCL and cookies on the varnish-misc mailing-list archive, I warmly encourage everyone to read it: https://varnish-cache.org/lists/pipermail/varnish-misc/2017-July/026023.html.

Varnishlog

varnishlog is an amazing tool. Let me stress that point: varnishlog is probably the greatest and most useful tool in the Varnish arsenal. Yeah, that much!

It delivers an amount of information about HTTP transactions that is, to my knowledge, unparalleled; you get:

  • All the headers of requests and responses, both on the client and backend sides
  • Timing information
  • Error messages
  • TTL, Grace, Keep information
  • Backend info (which one was used, probe states)
  • IPs and ports used for each transaction

And frankly, I can't live without it. It's quite normal that people would want to store its output on disk. But this amount of information comes at a price; let's compare the same request viewed via varnishncsa and via varnishlog:

# varnishncsa
::1 - - [14/Nov/2017:07:38:00 -0600] "GET http://localhost/ HTTP/1.1" 200 858 "-" "curl/7.56.1"
# varnishlog
* << Request >> 6 
- Begin req 5 restart
- Timestamp Start: 1510666680.699664 0.000045 0.000000
- ReqStart ::1 52104
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader User-Agent: curl/7.56.1
- ReqHeader Accept: */*
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- Hit 3
- VCL_call HIT
- VCL_return deliver
- RespProtocol HTTP/1.0
- RespStatus 200
- RespReason OK
- RespHeader Server: SimpleHTTP/0.6 Python/3.6.3
- RespHeader Date: Tue, 14 Nov 2017 13:38:00 GMT
- RespHeader Content-type: text/html; charset=utf-8
- RespHeader Content-Length: 858
- RespHeader X-Url: /
- RespHeader X-Host: localhost
- RespProtocol HTTP/1.1
- RespHeader X-Varnish: 6 3
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish (Varnish/5.1)
- VCL_call DELIVER
- VCL_Log vcs-key:localhost
- VCL_Log vcs-key:localhost/
- VCL_Log vcs-key:TOTAL
- RespHeader X-Cache: HIT
- VCL_return deliver
- Timestamp Process: 1510666680.699681 0.000062 0.000017
- RespHeader Accept-Ranges: bytes
- RespHeader Connection: keep-alive
- Timestamp Resp: 1510666680.699699 0.000080 0.000018
- ReqAcct 73 0 73 299 858 1157
- End

As you can expect, the latter is going to consume a vast amount of disk space, which has brought many servers to their knees because the admins were not ready for it and the disks got full super fast.

To alleviate that, you can of course use logrotate to clean older logs, or use logstash to push them off the machine, but maybe you just don't need to log EVERYTHING, do you? The varnishlog output is awesome for debugging, but on a day-to-day basis, you probably only need a summary, such as what varnishncsa provides (did you know you can add fields to its format line? Read "man varnishncsa" for more!).
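
For example, here's a varnishncsa invocation extended with a couple of Varnish-specific fields (the log path is an assumption):

varnishncsa -a -w /var/log/varnish/access.log \
    -F '%h %l %u %t "%r" %s %b %{Varnish:hitmiss}x %{Varnish:time_firstbyte}x'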

One popular trick is to let varnishncsa collect all the transactions and use that as the primary log, and to set up varnishlog to only record problematic transactions. Here's an example:

varnishlog -q "Timestamp:Resp[3] > 2.0 or RespStatus != 200" -g request -w log.bin

This one filters only the requests that took more than two seconds to be delivered, or whose status code wasn't 200. I'm also grouping by request so that I get the associated backend request. And, VERY important, I'm not just outputting text, I'm writing to a binary file (log.bin) so that I can filter it again later:

varnishlog -r log.bin -q "RespStatus == 503" -g request

Note: upon receiving a SIGHUP, varnishlog and varnishncsa will reopen their output file, creating a new one if needed, making sure the binary file stays usable. That means they are fully compatible with logrotate's default behavior. Ain't life good?
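
A logrotate entry taking advantage of that could look like this (paths and process names are assumptions, adapt them to your distribution's packaging):

/var/log/varnish/access.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    postrotate
        # SIGHUP makes varnishncsa (or varnishlog) reopen its output file.
        pkill -HUP -x varnishncsa || true
    endscript
}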

The file storage has no persistence

The primary cache storage in Varnish is the "malloc" one, a memory storage that's extremely fast. It has one tiny little flaw though: if your server reboots, or Varnish is restarted (not reloaded), you lose your cache. For this reason, users turned to the "file" storage, thinking that since the data is stored on disk, they would be able to reuse it after a restart. Spoiler: they can't; as the title says, the file storage offers no persistence either.

That can be a serious issue because if a server with a clean cache gets thrown in the deep end of the pool directly (i.e., full traffic, no ramp-up period), its hit ratio is going to be terrible, resulting in high backend traffic, potentially crushing the backends.
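
For reference, the storage backend is picked with varnishd's -s option; here's a sketch of the two classic setups (sizes and paths are just examples):

# malloc: fastest, but the cache is gone after a restart
varnishd -a :6081 -f /etc/varnish/default.vcl -s malloc,2G

# file: backed by a file on disk, yet still NOT reusable across restarts
varnishd -a :6081 -f /etc/varnish/default.vcl -s file,/var/lib/varnish/storage.bin,10G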

Obviously, the remedy is to have persistence, and this is solved with the Massive Storage Engine (MSE). MSE also stores data on disk, avoids the performance issues linked to fragmentation, and features optional persistence through the use of a journal, much like file systems do, to keep track of what's actually on disk and what's reusable, so that a dirty power-off won't corrupt your cache.

The more you know

These mistakes can do a lot of damage, but at the same time they are very easy to avoid once you know about them, which is why I hope this helped at least some of you dodge a bullet or two. All of this can be found in the documentation, of course, but remember that if you need some help, we are always lurking on IRC and Twitter, as well as listening on the varnish-misc mailing list, so I invite you to join us there so we can assist you!

You can also watch our on-demand webinar, 10 Varnish Cache mistakes and how to avoid them, to get more insight into avoiding fatal Varnish mistakes.
