In this week’s episode of Two-Minute Tech Tuesday, we'll talk about varnishncsa, a tool that you can use to produce web server logs in Varnish.
Let's address the NCSA part first, it's short for National Center for Supercomputing Applications and it's the organization that invented the NCSA HTTPD web server back in 1993. Because they were early on in the game, they also invented the corresponding log format, which is now an industry standard.
We call it the NCSA common log format and contains the following fields:
- the host or client IP
- the log identifier
- the authenticated user
- the date and time
- the first line of the request
- the status code
- the bytes transferred
172.17.0.1 - - [04/May/2022:11:57:56 +0000] "GET / HTTP/1.1" 200 45
There's also an extended log format called the NCSA combined log format. It contains the same information but also has the referer and the user agent.
172.17.0.1 - - [04/May/2022:11:57:56 +0000] "GET / HTTP/1.1" 200 45 "http://localhost/test.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36"
It will come as no surprise that varnishncsa can produce the same style of logging output, which is particularly interesting, given the position of Varnish. Because Varnish sits in front of the web server and receives all the incoming requests, it can reliably produce web server logs. However, the server can't, because if the proper caching is applied, not every request ends up on the server.
Varnish Shared memory Logs
All these logs are stored in the VSL (Varnish Shared memory Logs). This is a circular memory buffer in shared memory, stored inside the varnishd runtime process. This shared memory can be consumed by tools like varnishncsa or varnishlog.
Let's take a look at varnishlog returns, in terms of output:
** << Request >> 1752690
-- Begin req 1752689 rxreq
-- Timestamp Start: 1651675446.691433 0.000000 0.000000
-- Timestamp Req: 1651675446.691433 0.000000 0.000000
-- ReqStart 172.21.0.1 55962 http
-- ReqMethod GET
-- ReqURL /
-- ReqProtocol HTTP/1.1
-- ReqHeader Host: localhost
-- ReqHeader User-Agent: curl/7.79.1
-- ReqHeader Accept: */*
-- ReqHeader X-Forwarded-For: 172.21.0.1
-- VCL_call RECV
-- VCL_return hash
-- VCL_call HASH
-- VCL_return lookup
-- ExpBan 1679159 banned lookup
-- VCL_call MISS
-- VCL_return fetch
-- Link bereq 1752691 fetch
-- Timestamp Fetch: 1651675446.693975 0.002542 0.002542
-- RespProtocol HTTP/1.1
-- RespStatus 200
-- RespReason OK
-- RespHeader Content-Type: application/json; charset=utf-8
-- RespHeader Content-Length: 567
-- RespHeader ETag: W/"237-ulyqMICHeGkCSm+psitDJ2K3qqg"
-- RespHeader Date: Wed, 04 May 2022 14:44:06 GMT
-- RespHeader X-Varnish: 1752690
-- RespHeader Age: 0
-- RespHeader Via: 1.1 varnish (Varnish/6.0)
-- VCL_call DELIVER
-- VCL_return deliver
-- Timestamp Process: 1651675446.693981 0.002548 0.000006
-- RespHeader Accept-Ranges: bytes
-- RespHeader Connection: keep-alive
-- Timestamp Resp: 1651675446.694030 0.002597 0.000049
-- ReqAcct 73 0 73 273 567 840
-- End
*** << BeReq >> 1752691
--- Begin bereq 1752690 fetch
--- VCL_use boot
--- Timestamp Start: 1651675446.691561 0.000000 0.000000
--- BereqMethod GET
--- BereqURL /
--- BereqProtocol HTTP/1.1
--- BereqHeader Host: localhost
--- BereqHeader User-Agent: curl/7.79.1
--- BereqHeader Accept: */*
--- BereqHeader X-Forwarded-For: 172.21.0.1
--- BereqHeader Accept-Encoding: gzip
--- BereqHeader X-Varnish: 1752691
--- VCL_call BACKEND_FETCH
--- VCL_return fetch
--- BackendOpen 32 boot.default 172.21.0.2 8080 172.21.0.3 36102
--- BackendStart 172.21.0.2 8080
--- Timestamp Bereq: 1651675446.691704 0.000143 0.000143
--- Timestamp Beresp: 1651675446.693860 0.002299 0.002156
--- BerespProtocol HTTP/1.1
--- BerespStatus 200
--- BerespReason OK
--- BerespHeader Content-Type: application/json; charset=utf-8
--- BerespHeader Content-Length: 567
--- BerespHeader ETag: W/"237-ulyqMICHeGkCSm+psitDJ2K3qqg"
--- BerespHeader Date: Wed, 04 May 2022 14:44:06 GMT
--- BerespHeader Connection: keep-alive
--- BerespHeader Keep-Alive: timeout=5
--- TTL RFC 120 10 0 1651675447 1651675447 1651675446 0 0 cacheable
--- VCL_call BACKEND_RESPONSE
--- VCL_return deliver
--- Storage malloc s0
--- Fetch_Body 3 length stream
--- BackendReuse 32 boot.default
--- Timestamp BerespBody: 1651675446.693945 0.002384 0.000085
--- Length 567
--- BereqAcct 145 0 145 214 567 781
--- End
There's tons of information, but varnishncsa only returns the core information under the form of a single log line:
%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"
While varnishncsa returns output in the NCSA combined format, you can extend the format using a variety of shortcodes. Here's a table containing all the possibilities:
This ranges from request and response headers but also internal VCL and VSL variables. The options are truly limitless. Through the -F option, we can specify a custom log format. For example, inject the Varnish "handling" variable which returns whether or not the request resulted in a cache hit, a cache miss, or a cache bypass.
varnishncsa -c -b -F '%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" "%{Varnish:handling}x"'
If you want to use varnishncsa in production, you might want to consider daemonizing the process. If you use the official packages, there's already a system service for varnishncsa allowing you to run it as a daemon.
Join us next week for another Two-Minute Tuesday topic, presented to you in 2 minutes or less!