07. Key-Value Server: get, set, del

7.1 Request-response message format

Starting with the current protocol, let’s replace the “msg” part with something meaningful: a KV store with just get, set, del commands.

┌─────┬──────┬─────┬──────┬────────
│ len │ msg1 │ len │ msg2 │ more...
└─────┴──────┴─────┴──────┴────────
   4B   ...     4B   ...

A Redis request is a list of strings, just like a Linux command. Representing a list as a chunk of bytes is the task of (de)serialization. We’ll use the same length-prefixed scheme as the outer message format.

┌────┬───┬────┬───┬────┬───┬───┬────┐
│nstr│len│str1│len│str2│...│len│strn│
└────┴───┴────┴───┴────┴───┴───┴────┘
  4B   4B ...   4B ...

nstr is the number of items in the list, followed by each item. Each string item is again prefixed with its length. It may be tempting to just concatenate null-terminated strings or delimit them with spaces, but a delimited format will only cause problems, because the data may contain the delimiter.

For now, the response is just an integer status code and another string. Some Redis commands can return complex data types other than a single string, which we will implement later.

┌────────┬─────────┐
│ status │ data... │
└────────┴─────────┘
    4B     ...

7.2 Handle requests

What to do?

3 steps to handle a request:

  1. Parse the command.
  2. Process the command and generate a response.
  3. Append the response to the output buffer.
static bool try_one_request(Conn *conn) {
    // ...
    // got one request, do some application logic
    std::vector<std::string> cmd;
    if (parse_req(request, len, cmd) < 0) {
        conn->want_close = true;
        return false;   // error
    }
    Response resp;
    do_request(cmd, resp);
    make_response(resp, conn->outgoing);
    // ...
}

Step 1: Parse the request command

Length-prefixed data is trivial to parse:

static int32_t
parse_req(const uint8_t *data, size_t size, std::vector<std::string> &out) {
    const uint8_t *end = data + size;
    uint32_t nstr = 0;
    if (!read_u32(data, end, nstr)) {
        return -1;
    }
    if (nstr > k_max_args) {
        return -1;  // safety limit
    }

    while (out.size() < nstr) {
        uint32_t len = 0;
        if (!read_u32(data, end, len)) {
            return -1;
        }
        out.push_back(std::string());
        if (!read_str(data, end, len, out.back())) {
            return -1;
        }
    }
    if (data != end) {
        return -1;  // trailing garbage
    }
    return 0;
}
┌────┬───┬────┬───┬────┬───┬───┬────┐
│nstr│len│str1│len│str2│...│len│strn│
└────┴───┴────┴───┴────┴───┴───┴────┘
  4B   4B ...   4B ...

We have added some helper functions so we don’t have to deal with array indexes all the time. This makes the code less error-prone.

static bool read_u32(const uint8_t *&cur, const uint8_t *end, uint32_t &out) {
    if (cur + 4 > end) {
        return false;
    }
    memcpy(&out, cur, 4);
    cur += 4;
    return true;
}

static bool
read_str(const uint8_t *&cur, const uint8_t *end, size_t n, string &out) {
    if (cur + n > end) {
        return false;
    }
    out.assign(cur, cur + n);
    cur += n;
    return true;
}

const uint8_t *&cur is a reference to pointer, the pointer is adjusted after consuming some data from it. C++ references are just pointers with a different syntax, you can just use const uint8_t **cur instead.

Step 2: Process the command

Define the Response type:

struct Response {
    uint32_t status = 0;
    std::vector<uint8_t> data;
};

The KV store is faked with an STL map:

// placeholder; implemented later
static std::map<std::string, std::string> g_data;

static void do_request(std::vector<std::string> &cmd, Response &out) {
    if (cmd.size() == 2 && cmd[0] == "get") {
        auto it = g_data.find(cmd[1]);
        if (it == g_data.end()) {
            out.status = RES_NX;    // not found
            return;
        }
        const std::string &val = it->second;
        out.data.assign(val.begin(), val.end());
    } else if (cmd.size() == 3 && cmd[0] == "set") {
        g_data[cmd[1]].swap(cmd[2]);
    } else if (cmd.size() == 2 && cmd[0] == "del") {
        g_data.erase(cmd[1]);
    } else {
        out.status = RES_ERR;       // unrecognized command
    }
}

We’ll replace it with our own hashtable in the next chapter. But why not just use std::unordered_map? Besides for learning purposes, there are more specific requirements for a production-grade KV store; an STL container over a network is just a toy.

Step 3: Serialize the response

This is trivial:

static void make_response(const Response &resp, std::vector<uint8_t> &out) {
    uint32_t resp_len = 4 + (uint32_t)resp.data.size();
    buf_append(out, (const uint8_t *)&resp_len, 4);
    buf_append(out, (const uint8_t *)&resp.status, 4);
    buf_append(out, resp.data.data(), resp.data.size());
}

However, there is room for improvement: the response data is copied twice, first from the key value to Response::data, then from Response::data to Conn::outgoing. Exercise: Optimize the code so that the response data goes directly to Conn::outgoing.

7.3 What’s next?

We have something working. But making stuff work is not the hard part of programming, here come the more advanced topics:

  • Coding low-level data structures beyond toy code.
  • Applying data structures to real-world problems, such as timers, sorted sets.

Source code: