Why did we create the Massive Storage Engine (MSE), how does it work and who should be using it?
Companies that traffic in large volumes of data – for example, CDNs and anyone delivering video – grapple with the same kinds of challenges: how do you manage all this data and maintain performance, availability and scalability… and, in the event of an outage or restart, avoid the time-consuming wait for the data/content to repopulate?
We’ve given a lot of thought to the why, and invite you to watch our whiteboard session about the creation of the Varnish MSE to learn more.
Traditionally, Varnish has written large caches to the file backend. As the sheer amount of data started to increase, the issues we encountered there pushed us to write the Massive Storage Engine. Some of the key points and issues you’ll hear about:
- How a memory-mapped file affects performance: the CPU and the OS waste time and effort managing it, and with large files everything becomes more fragmented over time, so efficiency suffers.
- How to avoid fragmentation and “expand the holes”: data needs to be organized efficiently, so MSE is built around a least frequently used (LFU) data matrix instead of the traditional least recently used (LRU) list. Because Varnish is a cache and we can retrieve content from the backend at any time, we can sacrifice some of the objects in the cache to achieve less fragmentation and higher performance.
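To make the difference between the two eviction policies concrete, here is a minimal Python sketch. It is not MSE's actual implementation and it does not model on-disk layout or fragmentation at all. The class names, the `demo` function and the tiny capacity are assumptions chosen purely for illustration. It only shows that a least recently used list and a least frequently used policy can pick different victims for the same traffic.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that evicts the object touched longest ago (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        evicted = None
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            # The oldest entry sits at the front of the ordered dict.
            evicted, _ = self.items.popitem(last=False)
        self.items[key] = value
        return evicted


class LFUCache:
    """Toy cache that evicts the object with the fewest hits (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.hits = {}

    def get(self, key):
        if key not in self.items:
            return None
        self.hits[key] += 1  # count every cache hit
        return self.items[key]

    def put(self, key, value):
        evicted = None
        if key not in self.items and len(self.items) >= self.capacity:
            # Sacrifice the least frequently requested object.
            evicted = min(self.items, key=lambda k: self.hits[k])
            del self.items[evicted]
            del self.hits[evicted]
        self.items[key] = value
        self.hits.setdefault(key, 0)
        return evicted


def demo():
    """Run the same traffic through both caches and return each victim."""
    victims = []
    for cache in (LRUCache(2), LFUCache(2)):
        cache.put("popular", b"...")
        for _ in range(3):
            cache.get("popular")      # requested often, but early on
        cache.put("rare", b"...")
        cache.get("rare")             # requested once, but most recently
        victims.append(cache.put("new", b"..."))
    return tuple(victims)


if __name__ == "__main__":
    lru_victim, lfu_victim = demo()
    print("LRU evicts:", lru_victim)
    print("LFU evicts:", lfu_victim)
```

In this toy run the LRU cache drops the frequently requested object simply because it was touched earlier, while the LFU cache keeps it and drops the one that was requested only once. Either way the evicted object can be fetched from the backend again, which is what makes it safe for a cache to sacrifice objects.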
MSE: Keeps on running for big workloads
Varnish Massive Storage Engine makes sure that fragmentation is kept to a minimum for as long as the storage runs.
Ultimately, MSE is great for meeting the needs of anyone who wants to combine a large cache with high performance.
Watch now
Join our CTO and founder, Per Buer, as he gives a comprehensive explanation of the why behind MSE, the issues it addresses and why you need it.
Photo (c) 2014 Nuclear Regulatory Commission used under Creative Commons license.