July 26, 2018
4 min read time

Using obj.hits, clearly a hit and miss affair

VCL is awesome, but it's also a bit frightening to newcomers, and most of the time people are more interested in something that "just works"™ than in "doing the right thing"™, so they google their way out of it.

Problem is that the first thing that sort of works isn't necessarily the right thing, and the proliferation of one particular code snippet is what prompted the writing of this post. The snippet in question? This guy:

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}

Recognize it? Have it somewhere in your code, maybe? If so, read on, and let's try to do the right thing, and hopefully, get a better setup thanks to it. Don't worry, this is a short one, so hop in.


The smallest hill to die on

First, to be sure we are on the same page, let's explain what the snippet does:

# vcl_deliver is called right before delivering an object,
# allowing us to tweak it one last time
sub vcl_deliver {
    # read the "hits" property of the object, i.e.
    # how many times a cache lookup has found it
    if (obj.hits > 0) {
        # a non-zero value means it was already in the cache,
        # so it's a hit; add an X-Cache header to the
        # response
        set resp.http.X-Cache = "HIT";
    } else {
        # otherwise, it's a miss, and we write the header as such
        set resp.http.X-Cache = "MISS";
    }
}

Now, what is wrong with that code? For one, it's very black-and-white: for example, if you decide to bypass the cache (i.e. "return(pass)"), you get a "MISS", which is wrong.

Note: You'll surely argue that if we mean that a miss is just a request that wasn't in the cache, then it's ok. I'll respond that if we redefine the word "window" as "agree", then I window with you.
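
To make the pass case concrete, here's a minimal sketch (the "/admin" URL is just a made-up example) of a perfectly normal bypass:

sub vcl_recv {
    # never cache the admin area
    if (req.url ~ "^/admin") {
        return (pass);
    }
}

With the obj.hits snippet above, every one of these responses gets labeled "MISS", even though the cache was deliberately never asked.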

But there's "worse": what happens when you purge? Or simply return a synthetic object? Remember that those objects go through vcl_synth, not vcl_deliver. In all the setups I've seen sporting this code, the vcl_synth case was never covered, meaning you will sometimes get an X-Cache header and sometimes not.
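
For instance, a pattern like this one (with a made-up "/ping" endpoint) never reaches vcl_deliver at all:

sub vcl_recv {
    # answer health checks directly from Varnish, no backend involved
    if (req.url == "/ping") {
        return (synth(200, "pong"));
    }
}

The response is built in vcl_synth, so the obj.hits snippet never runs and the reply ships without any X-Cache header.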

So yeah, it's not the end of the world if you get crappy debugging headers, and most of the time you'll remember the gotchas. Most of the time...

Let's write some code

Thankfully, we can do better. Here's what I offer:

sub vcl_recv {
    # wipe any x-cache header coming from the client,
    # we want a clean slate
    unset req.http.x-cache;
}

sub vcl_hit {
    set req.http.x-cache = "hit";
}

sub vcl_miss {
    set req.http.x-cache = "miss";
}

sub vcl_pass {
    set req.http.x-cache = "pass";
}

sub vcl_pipe {
    set req.http.x-cache = "pipe uncacheable";
}

sub vcl_synth {
    # synthetic responses skip vcl_deliver, so mark the
    # response directly
    set resp.http.x-cache = "synth synth";
}

sub vcl_deliver {
    # tag the object we are delivering as cacheable or not
    if (obj.uncacheable) {
        set req.http.x-cache = req.http.x-cache + " uncacheable";
    } else {
        set req.http.x-cache = req.http.x-cache + " cached";
    }
    # uncomment the following line to show the information in the response
    # set resp.http.x-cache = req.http.x-cache;
}

Admittedly, it's a bit longer, but it's also more complete: it covers all the paths to vcl_deliver and handles the synthetic case. Note that the focus here is the type of object we found (or not) in the cache, not which routines we went through; hence we are not doing anything in vcl_purge, because from there you can only either restart or go to vcl_synth, and both are already covered.
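
As an illustration, a classic purge setup like this one (the "purgers" ACL and its content are placeholders, adapt them to your own network) is already covered:

acl purgers {
    "localhost";
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Not allowed"));
        }
        return (purge);
    }
}

The purge itself ends in vcl_synth (the built-in vcl_purge returns synth(200, "Purged")), and so does the 405 rejection, so both pick up the "synth synth" marker without any extra work.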

Note: no, we are not reflecting the marker in the response for piped requests (in pipe mode we can't touch the response at all), but pipe is bad and you shouldn't use it unless forced to do so.

A light VSL query example

Ok, we now have a slightly better debugging header, and we could stop here, but let's take the opportunity to have some fun with Varnish's excellent logging capabilities. Since we are marking all our responses (piped ones excepted), we can use the marker in varnishncsa through a custom specifier: %{VSL:ReqHeader:x-cache[1]}x

If it makes no sense to you, fret not! It should be completely clear once we have a look at one response with varnishlog:

...
- VCL_return deliver
- RespProtocol HTTP/1.0
- RespStatus 200
- RespReason OK
- RespHeader Server: SimpleHTTP/0.6 Python/3.6.6
- RespHeader Date: Thu, 19 Jul 2018 23:49:48 GMT
- RespHeader Content-type: text/html; charset=utf-8
- RespHeader Content-Length: 2431
- RespProtocol HTTP/1.1
- RespHeader X-Varnish: 32770 3
- RespHeader Age: 5
- RespHeader Via: 1.1 varnish (Varnish/5.2)
- VCL_call DELIVER
- ReqHeader x-cache: hit cached   <-- Here's the line we care about
- VCL_return deliver
- Timestamp Process: 1532044194.515186 0.000046 0.000046
...

With that output in mind, the specifier is easier to decode:

  • %{...}x: extended variable (ok, you have to sort of know that one)
  • VSL: that's Varnish Shared Log, so we are going to look at record lines directly
  • ReqHeader: more specifically, the ones of the ReqHeader type
  • x-cache: even more specifically, the ones starting with "x-cache:" (that's us!); if there is more than one, we'll pick the last occurrence
  • [1]: we pick the first word; in the above case, that's "hit"

Note: if you need more, have a look at man varnishncsa.

With this we can log what happens to our requests:

# varnishncsa -F "%{VSL:ReqHeader:x-cache[1]}x %U %s"

hit /favicon.ico 200
miss /index.html 200
hit /logo.png 304
pass /admin.php 403
...

And with a simple awk command you can filter all passes or all misses, which wasn't as easy before.
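
For instance, a quick sketch (reusing the same format string as above) to keep only the misses:

varnishncsa -F "%{VSL:ReqHeader:x-cache[1]}x %U %s" | awk '$1 == "miss"'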

On that point, I realize veterans can craft a VSL query (man vsl-query) that isolates passes, misses and such, but really, isn't this much nicer and easier to remember?

Note: As pointed out in the comments, you can also use %{X}i (or %{X}o for the RespHeader lines); however, you will lose the ability to split the header.
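
For example, a variant along these lines would print the whole request header instead of just its first word:

varnishncsa -F "%{x-cache}i %U %s"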

Spreading the good word

Obviously, this article is part of my long and ongoing crusade to end cargo-culting and explain the good practices related to Varnish, but I also appreciate that people don't necessarily want to read my rambling when what they need is just that little snippet of code.

So, you can find the TL;DR of this post in the tutorial section of the documentation website, here. And if you would like to see some more posts about best practices or how to approach certain issues, be sure to comment below!