Writing VMODs... in Rust

In this post, we're going to have a look at how we can write VMODs without writing a line of C. And, as our beloved CTO once said, I'm a "damn hipster", so we are going to use Rust to avoid C. Doing so, we'll have to touch on some VMOD-specific, language-agnostic points that should be interesting to anyone wanting to dive into the VMOD world.

C'est la VMOD

A VMOD is a "Varnish MODule", a plugin that can be loaded and then used by the VCL to do... stuff. If that sounds generic, it's because it is: VMODs can do pretty much anything.

Using them is very simple. You must first import them, once, at the beginning of your VCL:

import cookie;

and then use the provided functions, prefixing them with the name of the VMOD plus a ".":

cookie.parse(req.http.cookie);
set resp.http.id = cookie.get("SESSID");

That's pretty easy, and it can greatly simplify your VCL and make it less error-prone. If you are not convinced, please let me remind you that parsing querystrings and cookies using regex isn't pleasant, safe or smart.

What's the plan?

Nothing's better than an example, except maybe cake, and pictures of otters. But since I have neither cake nor otter pictures to give you, an example will have to do.

We are going to build a pretty simple VMOD that can actually be useful: we'll create a little hashmap, counting occurrences of strings. The functions will be:

INT push(STRING) # create an entry for STRING if needed,
                 # and increment it by one
INT peek(STRING) # return the recorded number of occurrences
                 # of STRING
INT pull(STRING) # decrement the entry for STRING if it exists,
                 # and is more than 0.

And since alliterations are fun, let's call the VMOD "pupa".
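
To pin down what these three functions should do before touching any Varnish plumbing, here is a minimal sketch in plain Rust. The Counter type and its method names are made up for illustration, and the return values (the new count for push and pull, 0 for an unknown string) are assumptions, since the signatures above only promise an INT.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the state kept by the pupa VMOD.
#[derive(Default)]
pub struct Counter {
    counts: HashMap<String, i64>,
}

impl Counter {
    /// Create the entry if needed, increment it, return the new count.
    pub fn push(&mut self, key: &str) -> i64 {
        let count = self.counts.entry(key.to_string()).or_insert(0);
        *count += 1;
        *count
    }

    /// Return the recorded count, or 0 for a string we never saw.
    pub fn peek(&self, key: &str) -> i64 {
        self.counts.get(key).copied().unwrap_or(0)
    }

    /// Decrement the entry if it exists and is above 0,
    /// then return the new count (0 otherwise).
    pub fn pull(&mut self, key: &str) -> i64 {
        match self.counts.get_mut(key) {
            Some(count) if *count > 0 => {
                *count -= 1;
                *count
            }
            _ => 0,
        }
    }
}
```

Pushing "otter" twice and pulling it three times would give 1, 2, then 1, 0, 0: the count never goes below zero, and pulling an unknown string doesn't create an entry.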

In vmod we trust

Let's go back to our cookie example:

cookie.parse(req.http.cookie);
set resp.http.id = cookie.get("SESSID");

Here, get returns a string, while parse doesn't return anything. The interesting thing is that Varnish knows about this. For instance, it won't let you load this:

set req.http.foo = cookie.parse();

or, more simply, it won't allow:

set req.http.foo = cookie.gimmesome();

and will return "Symbol not found: 'cookie.gimmesome'" because that function simply doesn't exist.

So this means that a VMOD advertises the functions it offers, and VCC (the VCL to C compiler) uses that information to avoid crazy invalid function calls.

What's in a name^H^H^H^Hvmod?

A VMOD is actually a pretty simple shared object containing one agreed upon structure, listing what the VMOD can do.

This is a tried technique, used for example by the old NPAPI browser plugins (Flash, anyone?). When pointed at the shared library, the program will dlopen it and look for one particular symbol; in our case, the structure is named Vmod_$NAME_Data.

What is actually in the structure isn't really important to us because we won't write it directly (C devs are masochists, but we have our limits). The struct, and the file containing it are generated from a .vcc file, let's call it vmod-pupa.vcc, and its content is here.

It contains some info about the VMOD itself, but what really interests us here are the five lines starting with a "$":

$Module pupa 3 Pupa VMOD
$Event init_function
$Function INT .push(PRIV_VCL, STRING)
$Function INT .peek(PRIV_VCL, STRING)
$Function INT .pull(PRIV_VCL, STRING)

Some info about that:

  • $Module gives the name of the module, among other things.
  • $Event registers an init function that will handle various events.
  • $Function declares a function that is usable by the VCL.

Using these, vmodtool will create vcc_if.c and vcc_if.h, containing:

VCL_INT vmod_push(VRT_CTX, struct vmod_priv *, VCL_STRING); 
VCL_INT vmod_peek(VRT_CTX, struct vmod_priv *, VCL_STRING); 
VCL_INT vmod_pull(VRT_CTX, struct vmod_priv *, VCL_STRING); 
vmod_event_f init_function;

And, for your information, here are the relevant bits from vrt.h:

#define VRT_CTX         const struct vrt_ctx *ctx
typedef const char * VCL_STRING;
typedef long VCL_INT;
typedef int vmod_event_f(VRT_CTX,
                            struct vmod_priv *,
                            enum vcl_event_e);

This gives us the function prototypes to implement. So, let's do it!

Wait, wait, wait! What's that $Event function about?

Right, I haven't told you about that yet. The event function is called by Varnish when certain things occur in the life of the VMOD, to allow it to allocate or free resources, for example. Besides the usual context, an event function receives two parameters: a pointer to its own private struct, which will be available during the whole life of the VMOD, and the type of the event.

The types of events are defined in vcl.h, and are well explained in the documentation, so I just annotated the ones we'll use:

enum vcl_event_e {
    VCL_EVENT_LOAD,
    VCL_EVENT_WARM, /* the VCL using the vmod could be
                     * loaded soon, so let's allocate resources
                     * to be ready when that comes */
    VCL_EVENT_USE,
    VCL_EVENT_COLD, /* the vcl isn't used anymore,
                     * and we should have the smallest
                     * memory footprint possible */
    VCL_EVENT_DISCARD,
};

And the private struct is in vrt.h:

struct vmod_priv {
    void *priv;             /* pointer to your content */
    int len;                /* used for BLOB types, disregard it */
    vmod_priv_free_f *free; /* function to call at the end
                             * of life of the struct to free
                             * the priv pointer
                             */
};

Rusting away!

Everything is now crystal clear: we have to implement the init function, respecting the prototype, allocating the hashmap on WARM events, and freeing it on COLD ones.

The VMOD expects this prototype:

int init_function(VRT_CTX, struct vmod_priv *, enum vcl_event_e);

Translated to Rust, we get:

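use std::collections::HashMap;
use std::os::raw::{c_int, c_void};
use std::sync::Mutex;

// mimic the C enum
#[repr(C)] // tell Rust to make it C compatible
pub enum VclEvent {
    Load = 0,
    Warm,
    Use,
    Cold,
    Discard,
}
// mimic vmod_priv
#[repr(C)]
pub struct vmod_priv {
    // *mut, because Box::from_raw needs a mutable pointer
    prv : *mut Mutex<HashMap<String, c_int>>,
    len : c_int,
    free : *const c_void
}

#[no_mangle]
pub extern "C" fn init_function(_ : *const c_void,
    prv : &mut vmod_priv,
    ev : VclEvent ) -> c_int {
    match ev {
        VclEvent::Warm => {
            let hash = Mutex::new(HashMap::new());
            prv.prv = Box::into_raw( Box::new(hash));
        },
        VclEvent::Cold => {
            if !prv.prv.is_null() {
                // rebuild the Box so that Rust frees the hashmap
                unsafe { drop(Box::from_raw(prv.prv)); }
                // and don't keep a dangling pointer around
                prv.prv = std::ptr::null_mut();
            }
        },
        _ => ()
    }
    0
}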

The struct/enum definitions should be easily understood, but the function merits some explanation:

  • #[no_mangle]: tells the compiler to keep the name of the function as-is in the shared library. By default, rustc mangles symbol names to keep them unique, but we need Varnish to find the function under its plain name when it opens the library.
  • pub extern: the function should be public, and callable from other languages using the C calling convention.
  • _ : *const c_void: this is the VRT_CTX, except we won't care about it, so we just tell Rust that it's a pointer, and don't name it. The underscore placeholder works like in Python, for example.
  • match: this is a switch/case statement, with some extra niceties, like checking that all cases are covered.
  • Mutex::new(HashMap::new()): one fun thing about Rust is that you don't lock code, you lock data. Here, we create a Mutex protecting a new hashmap.
  • Box::into_raw( Box::new(hash)): this line is a bit foxy: Box::new moves the data onto the heap, and Box::into_raw hides it behind a dumb pointer (instead of a clever reference). This trick allows Rust to leak memory, which is normally bad, but exactly what we want here.
  • Box::from_raw(prv.prv): retrieves the data behind our pointer and makes an object out of it. Since nothing holds on to that object, Rust destroys it automatically, freeing the hashmap.
  • 0: Rust is expression-oriented, and the last expression of a scope will bubble up to the parent scope, so we are actually returning 0 here, telling Varnish that everything is ok.
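
If the into_raw/from_raw pair still feels like magic, it can be tried in isolation. This is only a sketch: Payload, stash, release and the DROPPED flag are hypothetical names, with Payload standing in for our Mutex-wrapped hashmap so that we can observe the moment Rust frees it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to true when a `Payload` is dropped, so the free is observable.
pub static DROPPED: AtomicBool = AtomicBool::new(false);

/// Hypothetical stand-in for the Mutex<HashMap<..>> of the VMOD.
pub struct Payload(pub u32);

impl Drop for Payload {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

/// Move the value onto the heap and hand out a raw pointer.
/// From here on, Rust will not free the value on its own.
pub fn stash(value: Payload) -> *mut Payload {
    Box::into_raw(Box::new(value))
}

/// Rebuild the Box from the raw pointer and drop it, freeing the value.
///
/// # Safety
/// `ptr` must come from `stash` and must not be used afterwards.
pub unsafe fn release(ptr: *mut Payload) {
    unsafe { drop(Box::from_raw(ptr)) }
}
```

Between stash and release, the pointer can sit in a C struct for as long as needed; the destructor only runs once release turns it back into a Box.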

Also, note that in the event function, we don't need to lock, as we are guaranteed that no pupa function is in use (we wouldn't transition from/to WARM otherwise), and that no other event function is running.
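
Outside of the event function, though, several Varnish threads may call into the VMOD at once, and that's where "locking data" pays off. Here is a small sketch of the idea using plain threads; count_concurrently is a made-up helper, and Arc is only there because this example shares the map between Rust threads rather than through a raw pointer.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Count `key` `per_thread` times from each of `threads` threads,
/// all sharing one map behind a Mutex, and return the final count.
pub fn count_concurrently(key: &str, threads: usize, per_thread: u32) -> u32 {
    let shared = Arc::new(Mutex::new(HashMap::<String, u32>::new()));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&shared);
            let key = key.to_string();
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The only way to reach the map is through lock();
                    // the guard unlocks when it goes out of scope.
                    let mut map = shared.lock().unwrap();
                    *map.entry(key.clone()).or_insert(0) += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let map = shared.lock().unwrap();
    let total = map.get(key).copied().unwrap_or(0);
    total
}
```

There is no way to forget the lock here: the hashmap simply isn't reachable without going through the Mutex, so the compiler enforces what a C comment would only politely request.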

I pull, you push

The three functions, as you may guess, will be pretty similar, so we'll just cover the pull function here, but feel free to hit me with any questions you may have.

Let's have a look at that match statement again. I told you it had some nice features, but didn't go too deep into them. One very cool aspect, which we'll use, is destructuring. Match is able to give you direct access to the elements of a struct or of an enum, like so:

struct Point {
    x: i32,
    y: i32,
}

let origin = Point { x: 0, y: 0 };

match origin {
    Point { x, y } => println!("({},{})", x, y),
}

And that's exactly what we need, because asking the hashmap for an entry returns one of two possible variants: Occupied (the entry exists) or Vacant (it doesn't), and match gives us direct access to the entry inside.

Our function looks like this:

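use std::collections::hash_map::Entry::{Occupied, Vacant};
use std::os::raw::{c_char, c_long};

#[no_mangle]
pub unsafe extern "C" fn vmod_pull(_ : *const c_void,
    prv : &vmod_priv,
    input : *const c_char) -> c_long { // VCL_INT is a C long

    let mut hash = (&*prv.prv).lock().unwrap();

    let key = conv(input);

    let count = match hash.entry(key) {
        Occupied(mut entry) => {
            if *entry.get() == 0 { 0 }
            else {
                *entry.get_mut() -= 1;
                *entry.get()
            }
        },
        Vacant(_entry) => 0
    };
    // widen the stored c_int to what Varnish expects
    c_long::from(count)
}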

Not a lot of new stuff, but still, there are some cool bits:

  • (&*prv.prv).lock().unwrap(): get back our hashmap, then lock it, and trust that locking went fine with .unwrap().
  • conv(input): input is a C string, and we need a Rust String, and those are not the same type, at all! String is more akin to a C++ string in that it's heap-allocated and can grow/shrink on demand. Rust actually has quite a few string types (String, str, CString, CStr), but that's a subject for another day.
  • Occupied(mut entry): told you it would come in handy! Note that entry is requested as mutable (mut) here, but not for the Vacant block (we don't even use it). Rust is not permissive at all and won't give you write access unless you ask for it.
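
The conv helper isn't shown in this post, so here is one way it could be written. Take it as a guess under stated assumptions rather than the actual code: it copies the C string into an owned String, maps a NULL pointer to the empty string, and replaces invalid UTF-8 instead of failing.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical version of the `conv` helper: copy a NUL-terminated
/// C string into an owned Rust String.
///
/// # Safety
/// `input` must be null or point to a valid NUL-terminated string.
pub unsafe fn conv(input: *const c_char) -> String {
    if input.is_null() {
        // A VCL STRING can be NULL (an unset header, for example);
        // treating it as "" is an assumption of this sketch.
        return String::new();
    }
    // Invalid UTF-8 sequences are replaced with U+FFFD rather than
    // rejected, so the conversion cannot fail.
    unsafe { CStr::from_ptr(input) }.to_string_lossy().into_owned()
}
```

Copying the string costs an allocation per call, but it means the hashmap owns its keys and never points into memory that Varnish may reuse once the request is done.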

But, does it compile?

Of course not! It can't because we haven't told the build system about our Rust file (src/pupa.rs), so it has no reason to care about it.

First, let's compile that Rust file on its own. To do so, we'll take the easy way out, and use Cargo (the Rust packager) instead of rustc (the Rust compiler) directly. All it amounts to is writing a Cargo.toml file:

[package]
name = "vmr-pupa"
version = "0.1.0"
authors = ["Guillaume Quintard <your@email.com>"]

[dependencies]
libc = "*"

[lib]
name = "pupa"
crate-type = ["staticlib"]

Everything is self-explanatory, but the most important line is the last one, specifying that we want a static library so we can use it in our VMOD.

And now, we can compile this Rust file using "cargo build" and check that everything is fine.

But it isn't plugged into the general build system yet. It turns out that, like most open-source projects, VMODs tend to be compiled using autotools. This is terrifying, I know, but once in place, it does the job pretty well, so, if it ain't broken, don't fix it, right?

We need to add a few lines to src/Makefile.am:

# add the lib created by cargo to the dependency list
# of the vmod:
libvmod_pupa_la_LIBADD = ../target/debug/libpupa.a

# libtool being an annoying pest,
# it won't add libpupa.a to the link command,
# so we need to do it ourselves.
libvmod_pupa_la_LDFLAGS += -Wc,../target/debug/libpupa.a

# finally, let make know how to create target/debug/libpupa.a
../target/debug/libpupa.a: pupa.rs
    cargo build

And that's it! Or is it?

Friends don't let friends not test

It is true that the VMOD compiles, but does it work? Like, really really work good? To know, there's only one way: test, with our good friend varnishtest.

A varnishtest post is already planned on this blog, so I won't go too much into the details, but look at that test! It creates the client and the server, and runs Varnish with your VMOD on an isolated instance. There's no need to deploy, reload or change anything in an already established Varnish.

Simply place your vtc (Varnish Test Case) files in src/tests and they will be picked up by the next make check, courtesy of autotools.

And if you wish to only run one test:

varnishtest foo.vtc -Dvmod_topbuild=path_to_build_dir

Read That Fine Manual

One last step remains on our trip to release a VMOD: the documentation. "Read the code!" isn't very user friendly, and you may not have the time to write up a blog post every time you release a new version.

Fortunately, vmodtool and autotools are there for you! Do you remember the vcc file we wrote to declare our functions? It's also used to generate the man page of the VMOD.

Describe the functions below their declarations, and you're set. If you're feeling verbose, know that the vcc file is read mostly as reStructuredText and thus supports titles, lists and other formatting tools.

Wrapping up

We are at the end of this post, and looking back, it was quite long, but I hope you also found it quite painless.

Writing a VMOD is actually pretty easy... once you know where to begin :-) Obviously, the reason it's easy is that we have a solid and sane base that automates the boring stuff, and I didn't come up with it. The pupa vmod is actually a small fork of libvmod-example that already includes everything to get you started.

So, what will your next VMOD be?

Image (c) 2007 by Chi King used under Creative Commons license.

Topics: VMODs, varnishtest, Rust

06/04/16 13:14 by Guillaume Quintard
