Build Your Own Redis with C/C++
build-your-own.org eBook·Paperback
⟵ prev Contents next ⟶

ℹ️ New Book: Build Your Own Database

04. Protocol Parsing

4.1 Overview

Our server will be able to process multiple requests from a client, so we need to implement some sort of “protocol” to at least split requests apart from the TCP byte stream. The easiest way to split requests apart is to declare how long the request is at the beginning of the request. Let’s use the following scheme.

+-----+------+-----+------+--------
| len | msg1 | len | msg2 | more...
+-----+------+-----+------+--------

The protocol consists of 2 parts: a 4-byte little-endian integer indicating the length of the request, and the variable-length request.

Starting from the code in the last chapter, the server’s loop is modified to handle multiple requests:

    while (true) {
        // accept
        struct sockaddr_in client_addr = {};
        socklen_t socklen = sizeof(client_addr);
        int connfd = accept(fd, (struct sockaddr *)&client_addr, &socklen);
        if (connfd < 0) {
            continue;   // error
        }

        // only serves one client connection at once
        while (true) {
            int32_t err = one_request(connfd);
            if (err) {
                break;
            }
        }
        close(connfd);
    }

The one_request function parses only one request and responds, until something bad happens or the client connection is lost. Our server can only handle one connection at a time until we introduce the event loop in later chapters.

4.2 IO Helpers

Adding two helper functions before listing the one_request function:

static int32_t read_full(int fd, char *buf, size_t n) {
    while (n > 0) {
        ssize_t rv = read(fd, buf, n);
        if (rv <= 0) {
            return -1;  // error, or unexpected EOF
        }
        assert((size_t)rv <= n);
        n -= (size_t)rv;
        buf += rv;
    }
    return 0;
}

static int32_t write_all(int fd, const char *buf, size_t n) {
    while (n > 0) {
        ssize_t rv = write(fd, buf, n);
        if (rv <= 0) {
            return -1;  // error
        }
        assert((size_t)rv <= n);
        n -= (size_t)rv;
        buf += rv;
    }
    return 0;
}

Two things to note:

  1. The read() syscall just returns whatever data is available in the kernel, or blocks if there is none. It’s up to the application to handle insufficient data. The read_full() function reads from the kernel until it gets exactly n bytes.
  2. Likewise, the write() syscall may return successfully with partial data written if the kernel buffer is full, we must keep trying when the write() returns fewer bytes than we need.

4.3 The Parser

The one_request function did the heavy work:

const size_t k_max_msg = 4096;

static int32_t one_request(int connfd) {
    // 4 bytes header
    char rbuf[4 + k_max_msg + 1];
    errno = 0;
    int32_t err = read_full(connfd, rbuf, 4);
    if (err) {
        if (errno == 0) {
            msg("EOF");
        } else {
            msg("read() error");
        }
        return err;
    }

    uint32_t len = 0;
    memcpy(&len, rbuf, 4);  // assume little endian
    if (len > k_max_msg) {
        msg("too long");
        return -1;
    }

    // request body
    err = read_full(connfd, &rbuf[4], len);
    if (err) {
        msg("read() error");
        return err;
    }

    // do something
    rbuf[4 + len] = '\0';
    printf("client says: %s\n", &rbuf[4]);

    // reply using the same protocol
    const char reply[] = "world";
    char wbuf[4 + sizeof(reply)];
    len = (uint32_t)strlen(reply);
    memcpy(wbuf, &len, 4);
    memcpy(&wbuf[4], reply, len);
    return write_all(connfd, wbuf, 4 + len);
}

For convenience, we put a limit on the maximum request size and use a buffer large enough to hold one request. Endianness used to be a consideration when parsing protocols, but it is less relevant today so we are just memcpy-ing integers.

4.4 The Client

The client code for making requests and receiving responses:

static int32_t query(int fd, const char *text) {
    uint32_t len = (uint32_t)strlen(text);
    if (len > k_max_msg) {
        return -1;
    }

    char wbuf[4 + k_max_msg];
    memcpy(wbuf, &len, 4);  // assume little endian
    memcpy(&wbuf[4], text, len);
    if (int32_t err = write_all(fd, wbuf, 4 + len)) {
        return err;
    }

    // 4 bytes header
    char rbuf[4 + k_max_msg + 1];
    errno = 0;
    int32_t err = read_full(fd, rbuf, 4);
    if (err) {
        if (errno == 0) {
            msg("EOF");
        } else {
            msg("read() error");
        }
        return err;
    }

    memcpy(&len, rbuf, 4);  // assume little endian
    if (len > k_max_msg) {
        msg("too long");
        return -1;
    }

    // reply body
    err = read_full(fd, &rbuf[4], len);
    if (err) {
        msg("read() error");
        return err;
    }

    // do something
    rbuf[4 + len] = '\0';
    printf("server says: %s\n", &rbuf[4]);
    return 0;
}

4.5 Testing

Test our server by sending multiple commands:

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        die("socket()");
    }

    // code omitted ...

    // multiple requests
    int32_t err = query(fd, "hello1");
    if (err) {
        goto L_DONE;
    }
    err = query(fd, "hello2");
    if (err) {
        goto L_DONE;
    }
    err = query(fd, "hello3");
    if (err) {
        goto L_DONE;
    }

L_DONE:
    close(fd);
    return 0;
}

Running the server and the client:

$ ./server
client says: hello1
client says: hello2
client says: hello3
EOF

$ ./client
server says: world
server says: world
server says: world

4.6 More on Protocol Design

There are many decisions to be made when designing protocols. You can learn the tradeoffs by looking at existing protocols.

4.6.1 Text vs. Binary

The first decision in protocol design is text vs binary. Text protocols have the advantage of being human-readable, making debugging easier. A notable example is the HTTP protocol (not the newer one).

A disadvantage of text protocols is their complexity, even the simplest text protocol requires more work and is more error-prone to parse. You can try to compare the real Redis protocol with the protocol of this book.

Why are text protocols harder to parse? Because they consist of variable-length strings, the text parsing code involves a lot of length calculations, bound checks, and decisions. Let’s say you want to parse an integer in decimal text “1234”, for every byte, you have to check for the end of the buffer and whether the integer has ended. This is in contrast to the simplicity of a fixed-width binary integer.

The above example leads to a protocol design tip: avoid unnecessary variable-length components. The fewer of them, the less parsing, and the fewer of security bugs. Let’s say you want to include a string in your protocol, consider whether a fixed-length string is acceptable, or whether the string is needed at all.

4.6.2 Streaming Data

Our protocol includes the length of the message at the beginning. However, real-world protocols often use less obvious ways to indicate the end of the message. Some applications support “streaming” data continuously without knowing the full length of the output. A good example is the “Chunked Transfer Encoding” from the HTTP protocol.

Chunked encoding wraps incoming data into a message format that starts with the length of the message. The client receives a stream of messages, until a special message indicates the end of the stream.

There is also an alternative but bad way to do this: Use a special character (delimiter) to indicate the end of the stream. The problem is that the payload data encoding can not contain the delimiter, you need some “escape” mechanism, which complicates things.

4.7 Further Considerations

The protocol parsing code requires at least 2 read() syscalls per request. The number of syscalls can be reduced by using “buffered IO”. That is, read as much as you can into a buffer at once, then try to parse multiple requests from that buffer. Readers are encouraged to try this as an exercise as it may be helpful in understanding later chapters.

There are some common beginner mistakes when designing or implementing protocols:

Mistake 1: Not handling the return value of read and write.

These two functions can return fewer bytes than you expected, see the notes on the read_full helper. This mistake is also common with an event loop.

Mistake 2: No way to indicate the end of the message.

People often believe that the read and write syscalls are “messages” instead of byte streams, resulting in an unparsable protocol. Early versions of HTTP also allow this flaw: an HTTP connection without the Content-Length header or chunked encoding cannot be used for multiple requests.

Mistake 3: Unnecessary complexities.

See the section on the protocol design.

Source code:

( Report an Error | Ask a Question) @ build-your-own.org

See also:
codecrafters.io offers “Build Your Own X” courses in many programming languages.
Including Redis, Git, SQLite, Docker, and more.
Check it out

⟵ prev Contents next ⟶