January 21, 2025

Boost Object Storage Performance with Conditional Requests

HTTP has been around a long time. Like, it's now at the age where it's probably considering buying a red convertible. But with age comes experience, and HTTP has learned a few tricks along the way, including the one that will interest us today: **conditional requests**.

It's one of those features that is so ingrained into HTTP and Varnish itself that I tend to wrongfully take it for granted. And in the context of object storage, that feature can save you time, bandwidth and money. So let's use this short article to revisit the basics.

Saving resources

HTTP is centered around a request-response concept: the client sends a question, and the server answers with a response. Easy.

The problem is that sometimes you check for a resource - like a homepage background, firmware file or directory listing - only to find it hasn't changed since the last time you checked, yet you're still stuck with a full question/response exchange.

But HTTP before version two lacked a good way to interrupt a response, so you only had about three choices:

  • Just read the response, consuming both time and bandwidth
  • Interrupt the connection, which is not only rude but also means you need to take the time to reestablish the connection right after
  • Send a HEAD request first to check the resource headers, then potentially commit to actually downloading the resource with a GET. That means an extra round-trip, and nobody likes that, plus it's not atomic either


That's three possible compromises, none of them great. The solution was to extend HTTP to support conditional requests: ask the server to send you the full response only if the object has changed.

In practice, it would look like this:

    # our first request:
    > GET / HTTP/1.1
    > Host: example.com

    # to which the server answers:
    < HTTP/1.0 200 OK
    < Last-Modified: Fri, 12 Jul 2024 17:14:04 GMT
    < Content-Length: 9001
    (followed by a body with over 9000 bytes)

    # we can now come back with the "If-Modified-Since" header
    # our second request:
    > GET / HTTP/1.1
    > Host: example.com
    > If-Modified-Since: Fri, 12 Jul 2024 17:14:04 GMT

    # to which the server answers:
    < HTTP/1.0 304 Not Modified
    < Last-Modified: Fri, 12 Jul 2024 17:14:04 GMT
    < Content-Length: 9001
    (however, there's no body after it, we just saved 9001 bytes!)

Note the 304 status code, telling the client to not expect a body. And if the resource had changed, then we'd have received a regular 200 OK response, containing the new body.
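
To make the server's side of that exchange concrete, here is a minimal Python sketch of the decision it has to make. The `respond` function and its arguments are hypothetical names chosen for illustration, not part of any real server; it only assumes the resource has a known modification time.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers, body, last_modified):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since.

    `last_modified` must be a timezone-aware UTC datetime.
    """
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Content-Length": str(len(body)),
    }
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            since = None  # unparseable date: ignore the condition
        if since is not None:
            if since.tzinfo is None:
                # treat a date without a usable zone as UTC
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers, b""  # unchanged: headers only
    return 200, headers, body
```

Calling `respond({}, body, when)` yields a 200 with the full body, while repeating the call with the returned `Last-Modified` value as `If-Modified-Since` yields a 304 and an empty body.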

(Almost) nothing has changed

We came up with conditional requests because bandwidth was limited at the time and these optimizations provided a significant boost. Interestingly, as bandwidth capabilities grew, so did the sizes of our objects. Where we only sent a few kilobytes of HTML and CSS we now send megabytes of bundled JS frameworks or gigabytes of video game patches.

The point is: conditional requests are more relevant than ever, especially for object storage which often deals with large files that need to be delivered fast and at scale.

We've also pushed the concept to support faster updates. If you look at the example above, you'll notice Last-Modified and If-Modified-Since are dates with precision down to the second. But what if your resource changed three times within a second?

The solution was yet another HTTP extension: the ETag response header and its If-None-Match request counterpart. In short, instead of passing dates, you pass an opaque version identifier, typically a hash. Every version of the object gets a different ETag, so a request can uniquely advertise the one it received last.
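
As a rough illustration of the ETag variant, the Python sketch below derives an ETag from the body and checks it against an If-None-Match header. The helper names are made up for this example, and using SHA-256 is just one possible choice: a server is free to build its ETags any way it likes.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body; the double quotes are part of the value."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _strip_weak(tag: str) -> str:
    """If-None-Match uses weak comparison, so a leading W/ is ignored."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """True when the client's If-None-Match covers the current ETag."""
    if if_none_match.strip() == "*":
        return True
    # Simplification: split on commas, which is fine for hex digests but
    # would break on ETags that contain a comma inside the quotes.
    candidates = [_strip_weak(t) for t in if_none_match.split(",")]
    return _strip_weak(current_etag) in candidates


def status_for(request_headers: dict, body: bytes) -> int:
    """304 if the client already holds this exact version, 200 otherwise."""
    inm = request_headers.get("If-None-Match")
    if inm is not None and etag_matches(inm, make_etag(body)):
        return 304
    return 200
```

Because the identifier changes with the content rather than with the clock, three updates within the same second still produce three distinct ETags.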

Varnish is just boringly good

Varnish likes to support HTTP features, especially the ones that can save resources, so of course it supports conditional requests, both in the If-Modified-Since and If-None-Match variants.

On the client side, Varnish will happily send back 304 Not Modified responses when appropriate. There's nothing for you to do to activate it; it's on by default. However, that's not even the best part!

The best part is that conditional requests are also supported when issuing requests to the origin. Varnish will automatically revalidate resources it already has in its cache thanks to conditional requests, saving time and bandwidth on the origin side too! And just like on the client side, it's on by default.
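
The idea behind origin-side revalidation can be sketched in a few lines of Python. This is a toy model, not how Varnish is implemented: `Origin` and `RevalidatingCache` are hypothetical classes, the cache holds a single object, and it revalidates on every fetch instead of tracking TTLs.

```python
class Origin:
    """Hypothetical origin that honors If-None-Match."""

    def __init__(self, body: bytes, etag: str):
        self.body, self.etag = body, etag
        self.bytes_sent = 0  # body bytes actually transferred

    def get(self, headers: dict):
        if headers.get("If-None-Match") == self.etag:
            return 304, {"ETag": self.etag}, b""
        self.bytes_sent += len(self.body)
        return 200, {"ETag": self.etag}, self.body


class RevalidatingCache:
    """Keeps one object around and revalidates it instead of refetching."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self.stored = None  # (etag, body) from the last 200 response

    def fetch(self) -> bytes:
        headers = {}
        if self.stored is not None:
            # advertise the version we already hold
            headers["If-None-Match"] = self.stored[0]
        status, resp_headers, body = self.origin.get(headers)
        if status == 304:
            return self.stored[1]  # origin confirmed our copy is current
        self.stored = (resp_headers["ETag"], body)
        return body
```

Fetching the same unchanged object twice through `RevalidatingCache` transfers the body from the origin only once; the second round-trip costs a handful of headers.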

What if the origin doesn't support it?

To answer a conditional request, you do need some data from the backend, namely the Last-Modified and ETag headers that If-Modified-Since and If-None-Match are compared against. This can be a problem if your origin doesn't send either of those.

Well, no, there's really no problem because you can fake both headers on-the-fly, so you always have something to send to the clients. For this to work, there is *some* configuration involved, but see for yourself, it's very light:

    import xbody;

    # beresp is only available in vcl_backend_response
    sub vcl_backend_response {
        # no Last-Modified header? just use the current time
        if (!beresp.http.last-modified) {
            set beresp.http.last-modified = now;
        }
        # no etag, ask Varnish to hash the body as it receives it
        if (!beresp.http.etag) {
            xbody.hash_body(sha1);
        }
    }

    sub vcl_deliver {
        # set the ETag header to the hash we computed during fetching
        # ({"""} is a long string holding a single double quote)
        if (!resp.http.etag) {
            set resp.http.etag = {"""} + xbody.get_hash() + {"""};
        }
    }

Et voilà! Note that xbody is a Varnish Enterprise feature, but you can elect to implement only the Last-Modified part with open-source Varnish Cache. Just remove the xbody and etag bits from the example above:

    # beresp is only available in vcl_backend_response
    sub vcl_backend_response {
        # no Last-Modified header? just use the current time
        if (!beresp.http.last-modified) {
            set beresp.http.last-modified = now;
        }
    }

Big objects mean big savings

Varnish makes conditional requests easy, and it’ll work on all kinds of traffic. However, as you can imagine, the larger your objects are, the more bandwidth and time you save by not having to repeatedly transfer them. We recently talked about how Varnish can save you money when delivering S3 content, and conditional requests, while not essential, are a very nice bonus since most S3-compatible storage services support them.