Do you wish you knew what device or browser your clients are using? Do you need to make on-the-fly decisions based on this information (i.e. you can’t afford to do offline log processing)?
If so, you probably need device detection, an ancient and arcane technique, and today, we’re going to learn three different ways to do it inside Varnish, starting from the simplest one and working our way up to the most efficient.
Let’s talk about the past
With age, you accumulate experience and…baggage (I’m being polite here). An example for HTTP would be the user-agent header. It was initially a nice idea, allowing clients to announce themselves to the server and indicating “what” they were, i.e. which browser, which version, etc.
As an example, here’s the user-agent my browser usually sends:
You can clearly tell that I use Firefox version 126.0 on an x86_64 architecture and that I’m running Linux (BTW I use Arch).
Back to the history lesson: in the olden times, HTML/CSS/JS compatibility wasn’t great and web developers often had to special-case their code to serve different browsers, so they used the user-agent header to detect browsers and versions and apply the relevant special behaviors needed for each user.
As browsers became more compatible and more unified, the special behaviors also started breaking them, so they updated the values they sent as user-agent to trick the server into giving them the “right” behavior. Of course, some did this poorly, forcing servers and web developers to be more clever in their detection and catch liars and apply the proper fixes. So of course the clients upped their game too…
Rinse and repeat for a few years and we are now left with a giant mess of very diverse user-agent headers that are usually ignored by everybody because we have better ways of knowing who’s compatible with what. However, we can still use the header to suss out some properties.
Introducing UA-Parser
The UA-Parser is a community project that started years ago, with the goal of tracking as many user agent headers as possible, mapping them to more useful categories. Once a new user agent pops up, the contributors add it to the database, increasing coverage iteratively.
Given a string, the database can tell you about the client itself, the operating system and hopefully the device. In my case, here’s what it returns:
No luck on the device, but it managed to split the string pretty accurately! Remember that not all UA strings look like this, so there’s quite a lot of knowledge going into this.
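To give a feel for what "splitting the string" means, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the header value is a generic Firefox-on-Linux stand-in (not the exact one my browser sends), the two patterns are hand-written rather than taken from the UA-Parser database, and the "Other" fallback is simply this sketch's choice.

```python
import re

# Illustrative stand-in for a Firefox-on-Linux header (assumption, not the real one).
UA = "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"

# Hand-written patterns for this sketch; the real database holds many more rules.
BROWSER_RE = re.compile(r"(Firefox)/(\d+)\.(\d+)")
OS_RE = re.compile(r"(Linux|Windows NT|Mac OS X)")


def parse(ua: str) -> dict:
    """Return family/version/OS fields, using "Other" when no pattern matches."""
    result = {"family": "Other", "major": None, "minor": None, "os": "Other"}
    browser = BROWSER_RE.search(ua)
    if browser:
        result["family"], result["major"], result["minor"] = browser.groups()
    os_match = OS_RE.search(ua)
    if os_match:
        result["os"] = os_match.group(1)
    return result


print(parse(UA))
```

Two regexes are enough for one friendly header; the real work is in covering the thousands of unfriendly ones.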
There are a few cool things to like about UA-Parser: it’s free and community-based, it’s regularly updated, and the point that interests us most here: the database is just a YAML file listing regular expressions:
This means an enterprising individual can easily convert that information into VCL:
The regexes are a tiny bit scarier because they had to be adjusted to Varnish’s tastes, but other than that, it’s a fairly straightforward mapping.
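To make the mapping concrete, here is a rough Python sketch of how a list of regex rules can be turned into a VCL if/elsif chain. It is not the actual tool: the two rules are made up, the `x-ua-family` header name is hypothetical, and it skips the regex adjustments mentioned above. It only relies on plain VCL features (`~` matching, `elsif`, and `{"..."}` long strings).

```python
# Hypothetical rules in the spirit of the UA-Parser YAML (regex + family name).
RULES = [
    {"regex": r"(Firefox)/(\d+)\.(\d+)", "family": "Firefox"},
    {"regex": r"(Chrome)/(\d+)\.(\d+)", "family": "Chrome"},
]


def vcl_string(text: str) -> str:
    """Quote text as a VCL long string so embedded double quotes are allowed."""
    if '"}' in text:
        raise ValueError("text contains the long-string terminator")
    return '{"' + text + '"}'


def generate(rules, sub_name="uap_detect", header="req.http.x-ua-family"):
    """Emit a VCL subroutine that tests each regex in order (first match wins)."""
    lines = [f"sub {sub_name} {{"]
    for i, rule in enumerate(rules):
        keyword = "if" if i == 0 else "} elsif"
        pattern = vcl_string(rule["regex"])
        lines.append(f"    {keyword} (req.http.user-agent ~ {pattern}) {{")
        lines.append(f"        set {header} = {vcl_string(rule['family'])};")
    if rules:
        # Fallback branch of this sketch when nothing matched.
        lines.append("    } else {")
        lines.append(f"        set {header} = {vcl_string('Other')};")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


print(generate(RULES))
```

Each rule costs two lines plus a shared closing brace, which gives an idea of how quickly the generated file grows with the number of rules.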
If you want to try it yourself, head over to the toolbox repository: we’ve built a handy Go tool that downloads the UAP database and produces the corresponding VCL. The VCL uses no vmod at all, so you can use it with a stock Varnish installation.
It works well, but I must warn you: the resulting VCL is around 7k lines long, leading to a very beefy C file that gcc spent 15 seconds compiling on my laptop. It usually doesn’t matter since VCL updates are atomic and graceful anyway, but it’s worth mentioning.
UA-Parser: rewritten
Coincidentally, a few years ago, a Varnish Enterprise customer needed to check host names against thousands of regexes, probably 20k or 30k of those, and we noticed that repeated access to a header could be an issue performance-wise[1]. This is the reason we built vmod-rewrite (part of Varnish Enterprise), which has since turned into a staple in our toolbox, both for its speed and ergonomics.
In the UAP case, there are “only” 1200 regexes or so, which isn’t as dramatic as the 20k mentioned above, but we figured we could also create a tool that generates vmod-rewrite rulesets and check whether it would result in a speed boost. Spoiler alert: it does; check the results in the last section.
The generated VCL looks like this:
It’s probably as scary as the previous code, but now each case fits on one line, and there’s no crazy long if-else statement: vmod-rewrite handles this for us.
Oh, and the VCL load time is back to near instantaneous, so that’s pretty great too.
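If you are wondering why a ruleset can beat a long if-else chain, the general idea can be sketched in a few lines of Python. To be clear, this is a conceptual illustration under my own assumptions, not vmod-rewrite's code or API: the value is read once and passed in, the patterns are compiled ahead of time, and repeated inputs are answered from a cache. The rule list and the cache size are made up.

```python
import re
from functools import lru_cache

# Hypothetical (pattern, label) pairs; a generated ruleset would hold ~1200 of them.
RULESET = [
    (re.compile(r"Firefox/(\d+)"), "Firefox"),
    (re.compile(r"Chrome/(\d+)"), "Chrome"),
]


@lru_cache(maxsize=1024)
def classify(user_agent: str) -> str:
    """Scan the precompiled rules in order; repeated inputs come from the cache."""
    for pattern, label in RULESET:
        if pattern.search(user_agent):
            return label
    return "Other"  # fallback label chosen for this sketch
```

Calling `classify()` twice with the same string only walks the rule list once, and the caller fetches the header a single time instead of once per comparison.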
DeviceAtlas, Rise!
Confession time: I really wanted to play with UA-Parser because I think it’s a cool community project, and I can never resist building tools to generate VCL…
…BUT…
…there’s a faster, cleaner and more comprehensive solution for device detection, and we’ve had it for years: it’s DeviceAtlas.
DeviceAtlas provides device databases that are updated daily and cover more than 200 device properties such as screen size, touch capability or whether the user is a bot. You can get a subscription directly from them, or as an add-on to Varnish Enterprise which gives you access to vmod-deviceatlas3.
The vmod allows you to load a database and just query it when you need it, asking for whichever property you are interested in:
It’s clean, to-the-point, and as we’ll see in the next section, it’s also very, very fast.
I know speed is relative, but still…
As promised, let’s talk about performance and how we tested those implementations. First we built a BIG VCL, about 72k lines of setting a user agent header and then calling a device detection function:
This amounts to more than 18k calls to uap_detect. Those calls happen for every single request and should cover a wide range of user agents, avoiding a match on the first regex and preventing caching in vmod-rewrite[2].
This VCL is then tested with 4 different implementations of uap_detect:
- noop: does nothing, we’ll use it as baseline
- pure VCL: uses UA-Parser data and the giant if-else statement
- rewrite: uses UA-Parser data and vmod-rewrite
- DeviceAtlas: uses DeviceAtlas data and vmod
Before starting we just plugged varnishlog into awk (awk is amazing, don’t @me) to give us an average of processing time after each request:
Note: this only includes VCL time, so we can focus on actual processing and exclude HTTP parsing and transmissions.
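For readers who don't speak awk, the same running-average idea looks roughly like this in Python. Treat it as a sketch under assumptions: Timestamp records have the shape "label: absolute, since start, since last", but which label and which column best isolate VCL time is a guess here ("Process" and the time since start), and the sample lines are made up rather than taken from the benchmark.

```python
def running_averages(lines, label="Process"):
    """Yield the mean of the matching Timestamp durations seen so far."""
    total = 0.0
    count = 0
    for line in lines:
        fields = line.split()
        # Expected shape: ... Timestamp <label>: <absolute> <since_start> <since_last>
        if "Timestamp" not in fields:
            continue
        pos = fields.index("Timestamp")
        if len(fields) < pos + 5 or fields[pos + 1] != label + ":":
            continue
        total += float(fields[pos + 3])  # seconds since the start of the task
        count += 1
        yield total / count


SAMPLE = [  # made-up log lines, not measurements from this article
    "-   Timestamp      Start: 1716200000.000000 0.000000 0.000000",
    "-   Timestamp      Process: 1716200000.200000 0.200000 0.200000",
    "-   Timestamp      Process: 1716200001.400000 0.400000 0.400000",
]
print(list(running_averages(SAMPLE)))
```

Feeding it `sys.stdin` instead of `SAMPLE` turns it into a filter you can put at the end of a pipe, printing an updated average after each request.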
Finally, we just ran 100 requests sequentially through curl:
These were run on my laptop, and on a single core, so expect much better performance in production. The big value here is how the different setups fare against each other, all other things being equal. And boy oh boy, those are numbers:
- noop: 0.015 s/req
- pure VCL: 12.2 s/req
- rewrite: 7.5 s/req
- DeviceAtlas: 0.21 s/req
I mean, it’s not even funny. To start, our baseline shows Varnish setting and unsetting the header 18000 times in 15 milliseconds, like it’s nothing.
The pure VCL implementation scores a decent 12.2 s/req, which translates to ~1500 decisions per second. Again: that’s one CPU, on my laptop.
Then vmod-rewrite, not to be outdone, raises the bar to ~2400 decisions per second. That’s a 60% boost, pretty nifty.
But finally, DeviceAtlas waltzes in with a shattering 85k decisions per second??? Did I mention it was fast?
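If you want to check the math, the conversion from seconds per request to decisions per second is a single division. The sketch below assumes exactly 18,000 calls per request (the text says "more than 18k") and ignores the tiny baseline overhead, so the results are approximate.

```python
CALLS_PER_REQUEST = 18_000  # "more than 18k" in the text, rounded down here

SECONDS_PER_REQUEST = {
    "pure VCL": 12.2,
    "rewrite": 7.5,
    "DeviceAtlas": 0.21,
}


def decisions_per_second(seconds_per_request: float) -> float:
    """Detection calls completed per second of VCL processing time."""
    return CALLS_PER_REQUEST / seconds_per_request


for name, seconds in SECONDS_PER_REQUEST.items():
    print(f"{name}: {decisions_per_second(seconds):,.0f} decisions/s")
```

The rounded figures quoted above (~1500, ~2400, 85k) come straight out of this division.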
Wrapping up
Here you go, three ways to go about device detection, from the most frugal to the absolute fastest. If you are interested in leveraging UA-Parser, be sure to check the GitHub repository and let us know what you think about it!
[1] for the technically-inclined: reading the header means calling an accessor function, and that can be costly as we do a linear search through the header table.
[2] this is a Varnish vmod built for performance, did you really expect us to NOT include caching?