July 23, 2015
2 min read time

Ratelimiting the varnishlog

Varnish is typically very busy, running several thousands of transactions per second. Combine this with the rather extreme verbosity of varnishlog, and you have a firehose of information that can be rather hard to manage. 

Varnish Cache 4.0, with its log query language and filters goes some way in helping to manage the flow and help you find the things you’re looking for. Having a varnishlog instance running alongside your varnishd processes can help. Let’s have a look at what you’ll need in order to keep an eye on what is happening.
$ varnishlog -g request -q 'RespStatus>= 500'

This command will only output log if the response status is 500 or above. You can add other conditions you want to log as well.

If something should happen to your infrastructure so that all your backends start failing all your transactions varnish will start to spit out tens of thousands of log lines per second of error messages. If you are then piping the logs into any kind of log management system you’ll most likely run that system into the ground.

The key is to use some sort of throttling. Your log system might have a way to throttle an overly verbose system but even if it can throttle you’ll probably have better control doing the throttling in Varnish itself.

About a year back we wrote the vsthrottle VMOD. There was a throttle VMOD available already but a review of the code made it clear that we couldn’t support it. So one of our engineers, Dag, wrote a new one that so far seems to work great.

In order to throttle we’ll add a header to those error messages that are over the quota and we’ll instruct varnishlog to ignore those transactions so they won’t find their way into the log management. The vsthrottle VMOD can throttle based on an arbitrary string so we can have different kinds of quotas for different kind of error messages.
The VCL needed would be some changes in vcl_deliver. Everything passes through vcl_deliver so this is a good place to put stuff logic that gets applied to everything Varnish does.

vcl 4.0;
import std;
import vsthrottle;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # disable caching.
    return(pass);
}

sub vcl_deliver {
    if (resp.status >= 500) {
       if (vsthrottle.is_denied(""+ resp.status , 1, 1s)) {
          std.log("throttled");
       }
    }
}

The VCL will track different HTTP status codes. If a single HTTP status above 500 occurs more than once a second it will get a VSL log line attached to it - “throttled”. So, if you have a storm of 503 errors and those get throttled a single 505 will still be logged.

The last bit is to modify the varnishlog command line so it suppresses log entries that are throttled.

$ varnishlog -g request -q 'RespStatus>= 500 and not VCL_log eq "throttled"'

There are lots of different use cases for throttling. You can probably come up with quite a few of them. If anyone uses this kind of throttling I would appreciate it you could share your experience in the comment section below.

Happy throttling.

Image is (c) 2013 Stone65 used under CC license.