One of the two biggest changes in Varnish 4.0 is how the threads work. Varnish uses threads for doing all the heavy lifting and it seems to be working out quite well. In Varnish 3.0 one thread would service each client, doing whatever that client wanted it to do. Within reason, obviously. These are very decent threads.
The thread would deliver from cache, fetch content from the backend, pipe, etc.
Anyway, in Varnish Cache 4.0 there are now basically two kinds of threads: backend threads and frontend threads.
This changes a lot of semantics about how we interact with clients and backends, opening up a couple of new ways for Varnish to do content delivery.
We can do asynchronous fetches
When an object expires, the default solution today with Varnish Cache 3.0 is to enable grace mode and then let one client take the hit, waiting for the new object to arrive. If the object takes 15 seconds to arrive, that poor user will probably get bored and go do something else. With 4.0 in place the stale object gets delivered and it will be refreshed in the background right away. Grace is a lot more graceful :-)
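As a rough sketch of what this can look like in VCL 4.0: the backend address and the TTL and grace values below are placeholders picked for illustration, not recommendations.

```vcl
vcl 4.0;

# Hypothetical backend; host and port are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Illustrative values: the object is fresh for two minutes and may
    # then be served stale for up to an hour while it is being refreshed.
    set beresp.ttl = 2m;
    set beresp.grace = 1h;
}
```

With something like this in place, a client that hits an expired object within the grace window should get the stale copy instead of waiting for the backend.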
The new thread organization supports streaming delivery. When a client asks a backend thread to fetch something from the backend, delivery to the client starts as soon as data arrives. In addition, other threads can now attach to that same backend thread and deliver the same content.
For you as a user this means your VCL has to change, as the VCL closely mirrors how Varnish works. There are now certain VCL subroutines that are clearly backend thread only: vcl_backend_fetch, vcl_backend_response and vcl_backend_error.
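A minimal sketch of two backend-side subroutines in VCL 4.0, vcl_backend_fetch and vcl_backend_response, might look like this. The backend address and the X-Fetched-By header are made up for the example, and streaming is, as far as we know, already on by default in 4.0, so the last setting is only there to make it visible.

```vcl
vcl 4.0;

# Hypothetical backend; host and port are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_fetch {
    # Runs on the backend side, just before the request goes to the backend.
    # bereq is the backend request; this header name is invented for the example.
    set bereq.http.X-Fetched-By = "varnish";
}

sub vcl_backend_response {
    # Runs on the backend side once the response headers have arrived.
    # Pass data on to waiting clients while it is still being fetched.
    set beresp.do_stream = true;
}
```

Note that these subroutines work on bereq and beresp rather than on the client-side req and resp objects, which reflects the split between frontend and backend threads.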
Varnish Cache 3.0 has one thread that is responsible for accepting new connections. Turns out this can sometimes be a bit of a bottleneck on systems with multiple CPUs when the requests start coming in at several tens of thousands per second.
For 99 percent of you it won’t matter, but for those of you who need those last bits of performance squeezed out of your iron, this will be the thing. In 4.0, getting more than one acceptor is simple. You start up Varnish with a certain number of thread pools. One thread in each pool will take it upon itself to be the acceptor and will listen for incoming requests.
So, for top performance you might want to look into setting the number of pools equal to the number of CPU cores you have.