05. A Simple Network Protocol

5.1 Message Echo Server

Parsing HTTP is a non-trivial amount of work, so let’s take a smaller step first. We will implement a simpler protocol to illustrate the most important function of a protocol: splitting a byte stream into messages.

Our protocol consists of messages separated by '\n' (the newline character). The server reads messages and sends back replies using the same protocol.

client      server
------      ------

msg1\n  ==>
       <==  Echo: msg1\n
msg2\n  ==>
       <==  Echo: msg2\n
quit\n  ==>
       <==  Bye.\n

5.2 Dynamic Buffers

The Need for Dynamic Buffers

Messages do not come from the socket by themselves; we need to store the incoming data in a buffer, then we can try to split messages from it.

A Buffer object in Node.JS is a fixed-size chunk of binary data. It cannot grow by appending data. What we can do is concatenate 2 buffers into a new one.

buf = Buffer.concat([buf, data]);

However, this is an anti-pattern that can lead to O(n²) time complexity.

// pseudo code! bad example!
while (need_more_data()) {
    buf = Buffer.concat([buf, get_some_data()]);
}

Each time you append new data by concatenation, all of the old data is copied. For example, appending n bytes one byte at a time copies 1 + 2 + ... + (n−1) bytes in total, which is O(n²). To amortize the cost of copying, dynamic arrays are used:

Coding a Dynamic Buffer

The buffer is allocated larger than the appended data needs, so that the extra capacity amortizes the cost of copying. The DynBuf type stores the actual data length, which can be smaller than the buffer’s capacity.

// A dynamic-sized buffer
type DynBuf = {
    data: Buffer,
    length: number,
};

The copy() method copies data between buffers; its syntax is src.copy(dst, dst_offset, src_offset). We use it to append data.
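
For example, copying a 3-byte buffer into the middle of a zeroed 8-byte buffer:

// copy all of src into dst, starting at offset 2 of dst
const src = Buffer.from('abc');
const dst = Buffer.alloc(8);
src.copy(dst, 2, 0);    // dst: <Buffer 00 00 61 62 63 00 00 00>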

// append data to DynBuf
function bufPush(buf: DynBuf, data: Buffer): void {
    const newLen = buf.length + data.length;
    if (buf.data.length < newLen) {
        // grow the capacity ...
    }
    data.copy(buf.data, buf.length, 0);
    buf.length = newLen;
}

Buffer.alloc(cap) creates a new zeroed buffer of the given size; this is how we resize the buffer. The capacity has to grow exponentially so that the amortized cost of appending is O(1). We’ll use the power-of-two series for the buffer capacity.

// append data to DynBuf, growing the capacity as needed
function bufPush(buf: DynBuf, data: Buffer): void {
    const newLen = buf.length + data.length;
    if (buf.data.length < newLen) {
        // grow the capacity by the power of two
        let cap = Math.max(buf.data.length, 32);
        while (cap < newLen) {
            cap *= 2;
        }
        const grown = Buffer.alloc(cap);
        buf.data.copy(grown, 0, 0, buf.length);    // copy only the data
        buf.data = grown;
    }
    data.copy(buf.data, buf.length, 0);
    buf.length = newLen;
}
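
As a quick check of the growth behavior (the exact capacities below assume the minimum of 32 used above):

// capacity grows 0 -> 32 -> 64 as data is appended
const b: DynBuf = {data: Buffer.alloc(0), length: 0};
bufPush(b, Buffer.from('hello'));
console.log(b.length, b.data.length);   // 5 32
bufPush(b, Buffer.alloc(40));
console.log(b.length, b.data.length);   // 45 64

With doubling, the total amount of copying over n appended bytes is O(n), so each appended byte costs O(1) amortized.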

5.3 Implementing a Message Protocol

Step 1: The Server Loop

At a high level, the server should be a loop.

  1. Parse and remove a complete message from the incoming byte stream.
    • Append some data to the buffer.
    • Continue the loop if the message is incomplete.
  2. Handle the message.
  3. Send the response.

async function serveClient(socket: net.Socket): Promise<void> {
    const conn: TCPConn = soInit(socket);
    const buf: DynBuf = {data: Buffer.alloc(0), length: 0};
    while (true) {
        // try to get 1 message from the buffer
        const msg: null|Buffer = cutMessage(buf);
        if (!msg) {
            // need more data
            const data: Buffer = await soRead(conn);
            bufPush(buf, data);
            // EOF?
            if (data.length === 0) {
                // omitted ...
                return;
            }
            // got some data, try it again.
            continue;
        }
        // omitted. process the message and send the response ...
    } // loop for messages
}

A socket read is not aligned to any message boundary. What we do is append data to the buffer until it contains at least one complete message.

The cutMessage() function tells us whether the buffer contains a complete message, and if so, removes and returns it.
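
To try the loop end to end, serveClient() can be wired to a listening socket. A minimal sketch, assuming the soInit/soRead/soWrite helpers from the earlier chapters and an arbitrary port of 1234:

// sketch: accept connections and serve each one
const server = net.createServer({pauseOnConnect: true});
server.on('connection', async (socket: net.Socket) => {
    try {
        await serveClient(socket);
    } catch (exc) {
        console.error('exception:', exc);
    } finally {
        socket.destroy();
    }
});
server.listen({host: '127.0.0.1', port: 1234});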

Step 2: Split Messages

The cutMessage() function tests if the message is complete using the delimiter '\n'.

function cutMessage(buf: DynBuf): null|Buffer {
    // messages are separated by '\n'
    const idx = buf.data.subarray(0, buf.length).indexOf('\n');
    if (idx < 0) {
        return null;    // not complete
    }
    // make a copy of the message and move the remaining data to the front
    const msg = Buffer.from(buf.data.subarray(0, idx + 1));
    bufPop(buf, idx + 1);
    return msg;
}

Then it makes a copy of the message data, because it will be removed from the buffer.

The message is removed by moving the remaining data to the front.

// remove data from the front
function bufPop(buf: DynBuf, len: number): void {
    buf.data.copyWithin(0, len, buf.length);
    buf.length -= len;
}

buf.copyWithin(dst, src_start, src_end) copies data within the same buffer; the source and destination ranges can overlap, like memmove() in C. This way of handling buffers is not very optimal; more on this later.
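
As a quick sanity check of these helpers, consider two pipelined messages arriving in a single buffer:

// two messages are cut one by one; then the buffer is empty
const b: DynBuf = {data: Buffer.alloc(0), length: 0};
bufPush(b, Buffer.from('asdf\n1234\n'));
console.log(cutMessage(b)?.toString());    // 'asdf\n'
console.log(cutMessage(b)?.toString());    // '1234\n'
console.log(cutMessage(b));                // null; no complete message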

Step 3: Handle Requests

Requests are handled and responses are sent in the server loop.

    while (true) {
        // try to get 1 message from the buffer
        const msg = cutMessage(buf);
        if (!msg) {
            // omitted ...
            continue;
        }

        // process the message and send the response
        if (msg.equals(Buffer.from('quit\n'))) {
            await soWrite(conn, Buffer.from('Bye.\n'));
            socket.destroy();
            return;
        } else {
            const reply = Buffer.concat([Buffer.from('Echo: '), msg]);
            await soWrite(conn, reply);
        }
    } // loop for messages

Our message echo server is now complete. Test it with the socat command.
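
For example, assuming the server listens on 127.0.0.1:1234:

socat tcp:127.0.0.1:1234 -

Typing 'hello' should produce 'Echo: hello', and 'quit' should produce 'Bye.' before the connection is closed.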

5.4 Discussion: Pipelined Requests

When we remove a message from the buffer, we move the remaining data to the front. You may wonder how there can be any data left in the buffer, since in a request-response scenario, the client waits for the response before sending more requests. This is to support an optimization.

Reduce Latency by Pipelining Requests

Consider a typical modern web page that involves many scripts and style sheets. It takes many HTTP requests to load the page, and each request adds at least one roundtrip time (RTT) to the load time. If we can send multiple requests at once, without waiting for the responses one by one, the load time can be greatly reduced. On the server side, this shouldn’t make any difference, because a TCP connection is just a byte stream.

 client            server
 ------            ------

 |req1|  ==>
 |req2|  ==>
 |req3|  ==>  <==  |res1|
  ...         <==  |res2|
                    ...

This is called pipelining. It’s a common way to reduce RTTs in request-response protocols.

This is why we keep the remaining data in the buffer after cutting a message off: there can be more than 1 message in it.

Support Pipelined Requests

While you can make pipelined requests to many well-implemented network servers, such as Redis and NGINX, some less common implementations are problematic! Web browsers do not pipeline HTTP requests because of such buggy servers; they may use multiple concurrent connections instead.

But if you treat TCP data strictly as a continuous stream of bytes, pipelined messages are indistinguishable from non-pipelined ones, because the parser doesn’t depend on the size of the buffered data; it just consumes messages one by one.

So pipelined messages are a way to check the correctness of a protocol parser. A server that treats a socket read as a “TCP packet” will easily fail this test.

You can test the pipelined scenario with the following command line.

echo -e 'asdf\n1234' | socat tcp:127.0.0.1:1234 -

The server will probably receive both messages in a single 'data' event, and it handles them correctly.
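
The expected output is the two replies:

Echo: asdf
Echo: 1234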

Deadlock by Pipelining

A caveat about pipelining: pipelining too many requests can lead to deadlocks. Both the server and the client can be sending at the same time, and if both of their send buffers are full, both get stuck writing; since neither side reads while stuck, neither buffer can drain.

5.5 Discussion: Smarter Buffers

Removing Data from the Front

There is still O(n²) behavior in our buffer code: whenever we remove a message from the buffer, we move the remaining data to the front. This can be triggered by pipelining many small messages.

To fix this, the data movement has to be amortized, which can be done by deferring it. We can keep the remaining data temporarily in place until the wasted space in the front reaches a threshold (such as 1/2 of the capacity).

This fix requires us to keep track of the beginning of the data, so you’ll need a new method for retrieving the real data. The code is left as an exercise to you.
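
A possible sketch of this variant, not the only way to do it: DynBuf gains a start offset, and the bufData() helper (a new name introduced here) retrieves the real data. bufPush() and cutMessage() would need matching updates to work on the [start, length) range.

// a variant of DynBuf that tracks the beginning of the data
type DynBuf = {
    data: Buffer,
    length: number,     // offset of the end of the data
    start: number,      // offset of the beginning of the data
};

// retrieve the real data
function bufData(buf: DynBuf): Buffer {
    return buf.data.subarray(buf.start, buf.length);
}

// remove data from the front, deferring the data movement
function bufPop(buf: DynBuf, len: number): void {
    buf.start += len;
    // compact only when the wasted space reaches 1/2 of the capacity
    if (2 * buf.start >= buf.data.length) {
        buf.data.copyWithin(0, buf.start, buf.length);
        buf.length -= buf.start;
        buf.start = 0;
    }
}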

Using a Fixed-Size Buffer without Reallocations

Sometimes it’s possible to use a large enough buffer without resizing if the message size in the protocol is limited to a reasonably small value.

For example, many HTTP implementations have a small limit on header size as there is no legitimate use case for putting lots of data in the header. For these implementations, one buffer allocation per connection is enough, and the buffer is also sufficient for reading the payload if it doesn’t need to store the entire payload in memory.

The advantages of this are one buffer allocation per connection and bounded, predictable memory use. These are not very relevant for Node.JS apps, but they are desirable in environments with manual memory management or constrained hardware.

5.6 Conclusion: Network Programming Basics

A summary of what we have learned so far:

  • A TCP connection is a byte stream with no message boundaries; splitting the stream into messages is the protocol’s job.
  • Incoming data is accumulated in a buffer; a dynamic buffer must grow exponentially so that appending has O(1) amortized cost.
  • A complete message is cut from the front of the buffer; the leftover data may already contain the next messages.
  • Pipelining is a latency optimization, and a correct byte-stream parser handles pipelined messages for free.