Microservices emerged as a pattern some years ago. Initially it was an even fuzzier and vaguer pattern than it is today. One of Varnish Cache's earliest supporters, Amedia, decided to redesign their infrastructure around a microservice pattern, and they did something a bit different from most others: they put Varnish in the middle of their microservices. Instead of having the microservices talk directly to each other, they connected everything to Varnish and let Varnish proxy the connections.
There were a couple of obvious wins:
- Simplify endpoints.
The endpoints don’t have to worry about whether other endpoints are up or down: Varnish probes the endpoints and takes malfunctioning ones out of circulation.
- No distributed service directory.
Today, this might not be such a big deal, as there are a number of well-functioning distributed service catalogs; that wasn’t the case back in 2008. Even today, though, a distributed service directory adds a bit of complexity.
- Stateless microservices and central caching.
This is perhaps the biggest win. Since Amedia could leverage Varnish to cache service responses, there was no need to add caching to each individual service. They pushed it one step further by requiring all services to be stateless. You’ll want stateless services mainly for two reasons: 1) they scale up and down flawlessly, and 2) they are much simpler to debug, especially when you can replay the exact input data that made a service misbehave, and varnishlog easily captures all of that data.
- Easy cache invalidation.
Imagine having 25 microservices with discrete caching systems built into each one. Now imagine something changes in your data and you need to invalidate all data derived from the now-invalid data. All the tools available in Varnish for invalidating cached content (purges, bans, x-key) make this a walk in the park.
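As a rough sketch of what central invalidation can look like from a service's point of view, here is a small Python example. It assumes a VCL that accepts the HTTP `PURGE` method (not shown) and illustrates building a ban expression for `varnishadm ban`; the host names and URL patterns are made up for illustration, not part of Amedia's setup:

```python
# Sketch: invalidating centrally cached content in Varnish from a service.
# Assumes the VCL in front recognizes the PURGE method (not shown here).
import http.client


def purge(host, path, port=80):
    """Send an HTTP PURGE for a single cached URL to the Varnish instance."""
    conn = http.client.HTTPConnection(host, port)
    conn.request("PURGE", path)
    status = conn.getresponse().status
    conn.close()
    return status


def ban_expression(url_pattern):
    """Build a ban expression matching all cached objects whose URL matches
    a regex, suitable for `varnishadm ban <expression>`."""
    return f"req.url ~ {url_pattern}"


# Example: after an article changes, drop everything derived from it.
# purge("varnish.internal", "/articles/1234")          # one object
# print(ban_expression("^/articles/1234"))             # or a whole subtree
```

A single purge or ban at the caching layer replaces 25 service-local invalidation mechanisms, which is the point of the pattern.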
Detecting slow elements in your microservices architecture
Seven years later, Amedia have no regrets about their architecture. However, as the number of microservices grew, they ran into some of the traditional problems of distributed architectures. If a single service on a single server suddenly develops a cold and starts adding 20-50 ms of latency to every other request, does your monitoring help you track it down? Or worse: what if only three percent of requests slow down?
Amedia were not the first to encounter this problem. Twitter ran into it some years ago and came up with Zipkin. Zipkin is a rather heavy tool: it relies on a Java library to gather data about each transaction, then collects everything in a database and makes sense of it. Since it depends on the JVM, it is at odds with an important promise of microservices - that you should be free to pick the technology for each service independently. If something makes sense to develop and deploy on .NET, that is what you want to use.
Since there is already a compelling argument for having Varnish in a microservice architecture, and logging in Varnish is essentially free from a performance perspective, why not take advantage of it? With a tiny bit of help from the endpoints, Varnish can know how much time each request takes, which service calls depend on which other calls, what special circumstances occurred during execution of a call, and a few other data points. That is enough to create the waterfall diagrams that make Zipkin so useful.
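To illustrate the kind of processing involved - this is a simplified sketch, not Zipnish's actual data model, and the field names are assumptions - here is how per-request timing records with parent/child links can be assembled into the tree behind a waterfall view:

```python
# Sketch: turning flat timing records (id, parent id, start, end) into
# an indented waterfall listing. Field names are illustrative.
from collections import defaultdict


def build_waterfall(records):
    """Group records by parent. Returns (roots, children) where children
    maps a record id to its child records sorted by start time."""
    children = defaultdict(list)
    roots = []
    for r in records:
        if r["parent"] is None:
            roots.append(r)
        else:
            children[r["parent"]].append(r)
    for kids in children.values():
        kids.sort(key=lambda r: r["start"])
    return roots, children


def render(roots, children, depth=0):
    """Render each call as 'id duration', indented by nesting depth."""
    lines = []
    for r in sorted(roots, key=lambda r: r["start"]):
        duration_ms = (r["end"] - r["start"]) * 1000.0
        lines.append("  " * depth + f"{r['id']} {duration_ms:.1f} ms")
        lines.extend(render(children.get(r["id"], []), children, depth + 1))
    return lines
```

Feeding this two records, a frontpage call that fans out to an article service, yields the familiar nested view: the parent spans the whole duration, the child is indented beneath it.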
At this point we met with Amedia and we decided this was cool enough for us to develop into a solution. And we immediately realized that it makes perfect sense to develop it as open source.
Half a year later we’re done - and we're ready to release Zipnish 1.0.
The Zipnish architecture
A process forks off and starts consuming the Varnish log, listening for transactions. It leverages the new Varnish logging API introduced in Varnish Cache 4.0. Since Shohei Tanaka was nice enough to write Python bindings, we decided to use Python together with the Twisted event library. Data is stored in MySQL.
Initially we simply reused Zipkin’s whole presentation layer. However, since it was written in Scala, we were worried about not attracting developers; while Scala is most likely wonderful and all, it doesn’t really see much use. So we redid the presentation backend in Python. The frontend is more or less the original Zipkin frontend.
Liberal open source license
We've released everything so far under the FreeBSD license.
Packages will show up here: https://github.com/varnish/zipnish/tree/master/builds
There will be an introductory webinar on Zipnish on the 13th of January 2016. Why wait? Sign up now.
This image is (c) Andy Morffew 2012 and used under CC License.