07. Code A Basic HTTP Server

Our HTTP server is based on the message echo server from the previous chapter, with the generic “message” replaced by HTTP messages.

7.1 Start Coding

The code is broken into small steps and follows a top-down approach.

Step 1: Types and Structures

Our first step is to define the structure for HTTP messages based on our understanding of HTTP semantics.

// a parsed HTTP request header
type HTTPReq = {
    method: string,
    uri: Buffer,
    version: string,
    headers: Buffer[],
};

// an HTTP response
type HTTPRes = {
    code: number,
    headers: Buffer[],
    body: BodyReader,
};

We use Buffer instead of string for the URI and header fields. Although HTTP is mostly plaintext, there is no guarantee that the URI and header fields are ASCII or UTF-8 strings. So we just leave them as bytes until we need to parse them.
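
For example, decoding arbitrary bytes as UTF-8 is lossy, while the latin1 encoding maps bytes to characters one-to-one. This is also why the code later in this chapter uses 'latin1' whenever it needs a string from these bytes:

const b = Buffer.from([0x68, 0x69, 0xff]);  // 0xff is not valid UTF-8
b.toString('utf8');     // 'hi\uFFFD' -- the 0xff byte is lost
b.toString('latin1');   // 'hiÿ' -- all byte values survive the round trip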

The BodyReader type is the interface for reading data from the body payload.

// an interface for reading/writing data from/to the HTTP body.
type BodyReader = {
    // the "Content-Length", -1 if unknown.
    length: number,
    // read data. returns an empty buffer after EOF.
    read: () => Promise<Buffer>,
};

The payload body can be arbitrarily long and may not even fit in memory, so we read it incrementally via the read() function instead of holding it in a single Buffer. The read() function follows the convention of the soRead() function: the end of data is signaled by an empty Buffer.

And when using chunked encoding, the length of the body is not known, which is another reason why this interface is needed.

Step 2: The Server Loop

The server loop follows the pattern from the previous chapter, except that the cutMessage() function only parses the HTTP header; the payload body is read while handling the request, or discarded after the request is handled. This way, we never store the entire payload body in memory.

async function serveClient(conn: TCPConn): Promise<void> {
    const buf: DynBuf = {data: Buffer.alloc(0), length: 0};
    while (true) {
        // try to get 1 request header from the buffer
        const msg: null|HTTPReq = cutMessage(buf);
        if (!msg) {
            // need more data
            const data = await soRead(conn);
            bufPush(buf, data);
            // EOF?
            if (data.length === 0 && buf.length === 0) {
                return; // no more requests
            }
            if (data.length === 0) {
                throw new HTTPError(400, 'Unexpected EOF.');
            }
            // got some data, try it again.
            continue;
        }

        // process the message and send the response
        const reqBody: BodyReader = readerFromReq(conn, buf, msg);
        const res: HTTPRes = await handleReq(msg, reqBody);
        await writeHTTPResp(conn, res);
        // close the connection for HTTP/1.0
        if (msg.version === '1.0') {
            return;
        }
        // make sure that the request body is consumed completely
        while ((await reqBody.read()).length > 0) { /* empty */ }
    } // loop for IO
}

The HTTPError is a custom exception type defined by us. It is used to generate an error response and close the connection. Note that it exists only to make our code simpler by deferring the handling of the unhappy cases. You probably don’t want to throw exceptions around like this in production code.
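
A minimal definition that matches how it is used in this chapter:

// an exception type that carries an HTTP status code
class HTTPError extends Error {
    code: number;
    constructor(code: number, message: string) {
        super(message);
        this.code = code;
    }
}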

async function newConn(socket: net.Socket): Promise<void> {
    const conn: TCPConn = soInit(socket);
    try {
        await serveClient(conn);
    } catch (exc) {
        console.error('exception:', exc);
        if (exc instanceof HTTPError) {
            // intended to send an error response
            const resp: HTTPRes = {
                code: exc.code,
                headers: [],
                body: readerFromMemory(Buffer.from(exc.message + '\n')),
            };
            try {
                await writeHTTPResp(conn, resp);
            } catch (exc) { /* ignore */ }
        }
    } finally {
        socket.destroy();
    }
}

Step 3: Split the Header

The HTTP header ends with '\r\n\r\n', which is how we determine its length.

In theory, there is no limit to the size of the header, but in practice there is, because we are going to parse and store the header in memory, and memory is finite.

// the maximum length of an HTTP header
const kMaxHeaderLen = 1024 * 8;

// parse & remove a header from the beginning of the buffer if possible
function cutMessage(buf: DynBuf): null|HTTPReq {
    // the end of the header is marked by '\r\n\r\n'
    const idx = buf.data.subarray(0, buf.length).indexOf('\r\n\r\n');
    if (idx < 0) {
        if (buf.length >= kMaxHeaderLen) {
            throw new HTTPError(413, 'header is too large');
        }
        return null;    // need more data
    }
    // parse & remove the header
    const msg = parseHTTPReq(buf.data.subarray(0, idx + 4));
    bufPop(buf, idx + 4);
    return msg;
}

Parsing is also easier when we have the complete data. That’s another reason why we waited for the full HTTP header before parsing anything.

Step 4: Parse the Header

To parse an HTTP header, we can first split the data into lines by CRLF since we have the complete header in the buffer. Then we can process each line individually.

// parse an HTTP request header
function parseHTTPReq(data: Buffer): HTTPReq {
    // split the data into lines
    const lines: Buffer[] = splitLines(data);
    // the first line is `METHOD URI VERSION`
    const [method, uri, version] = parseRequestLine(lines[0]);
    // followed by header fields in the format of `Name: value`
    const headers: Buffer[] = [];
    for (let i = 1; i < lines.length - 1; i++) {
        const h = Buffer.from(lines[i]);    // copy
        if (!validateHeader(h)) {
            throw new HTTPError(400, 'bad field');
        }
        headers.push(h);
    }
    // the header ends with an empty line
    console.assert(lines[lines.length - 1].length === 0);
    return {
        method: method, uri: uri, version: version, headers: headers,
    };
}

The first line is simply 3 pieces separated by spaces. The rest of the lines are header fields. Although we’re not trying to parse the header fields here, it’s still a good idea to do some validation on them.

The splitLines(), parseRequestLine(), and validateHeader() functions are not very interesting; you can code them yourself according to the RFCs.
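
For reference, minimal sketches of these helpers might look like the following. The validation is deliberately loose; a production server should follow the exact grammar in the RFCs (RFC 9112).

// split the header into lines, treating CRLF as the line terminator
function splitLines(data: Buffer): Buffer[] {
    const lines: Buffer[] = [];
    let start = 0;
    while (start < data.length) {
        let idx = data.indexOf('\r\n', start);
        if (idx < 0) {
            idx = data.length;  // tolerate a missing final CRLF
        }
        lines.push(data.subarray(start, idx));
        start = idx + 2;
    }
    return lines;
}

// parse `METHOD URI HTTP/x.x` into its 3 pieces
function parseRequestLine(line: Buffer): [string, Buffer, string] {
    const sp1 = line.indexOf(' ');
    const sp2 = line.lastIndexOf(' ');
    if (sp1 < 0 || sp1 === sp2) {
        throw new HTTPError(400, 'bad request line');
    }
    const method = line.subarray(0, sp1).toString('latin1');
    // copy the URI, since the source buffer will be reused; keep it as bytes
    const uri = Buffer.from(line.subarray(sp1 + 1, sp2));
    const version = line.subarray(sp2 + 1).toString('latin1');
    if (!version.startsWith('HTTP/')) {
        throw new HTTPError(400, 'bad HTTP version');
    }
    return [method, uri, version.slice('HTTP/'.length)];    // e.g. '1.1'
}

// a loose sanity check on a `Name: value` field
function validateHeader(h: Buffer): boolean {
    const idx = h.indexOf(':');
    // require a non-empty field name without whitespace
    return idx > 0 && !h.subarray(0, idx).includes(' ');
}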

Step 5: Read the Body

Before handling the request, we must first construct the BodyReader object which will be passed to the handler function. There are 3 ways to read the payload body, as we mentioned earlier.

// BodyReader from an HTTP request
function readerFromReq(
    conn: TCPConn, buf: DynBuf, req: HTTPReq): BodyReader
{
    let bodyLen = -1;
    const contentLen = fieldGet(req.headers, 'Content-Length');
    if (contentLen) {
        bodyLen = parseDec(contentLen.toString('latin1'));
        if (isNaN(bodyLen)) {
            throw new HTTPError(400, 'bad Content-Length.');
        }
    }
    const bodyAllowed = !(req.method === 'GET' || req.method === 'HEAD');
    const chunked = fieldGet(req.headers, 'Transfer-Encoding')
        ?.equals(Buffer.from('chunked')) || false;
    if (!bodyAllowed && (bodyLen > 0 || chunked)) {
        throw new HTTPError(400, 'HTTP body not allowed.');
    }
    if (!bodyAllowed) {
        bodyLen = 0;
    }

    if (bodyLen >= 0) {
        // "Content-Length" is present
        return readerFromConnLength(conn, buf, bodyLen);
    } else if (chunked) {
        // chunked encoding
        throw new HTTPError(501, 'TODO');
    } else {
        // read the rest of the connection
        throw new HTTPError(501, 'TODO');
    }
}

Here we need to look at the Content-Length field and the Transfer-Encoding field. The fieldGet() function is for looking up the field value by name. Note that field names are case-insensitive. The implementation is left to the reader.

function fieldGet(headers: Buffer[], key: string): null|Buffer;
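
For reference, a minimal implementation might look like this. It is paired with the strict parseDec() helper used above, since the built-in parseInt() would silently accept trailing garbage (parseInt('12ab') === 12):

// look up a field value by its case-insensitive name
function fieldGet(headers: Buffer[], key: string): null|Buffer {
    for (const h of headers) {
        const idx = h.indexOf(':');
        const name = h.subarray(0, idx).toString('latin1');
        if (name.toLowerCase() === key.toLowerCase()) {
            // strip the optional whitespace around the field value
            const value = h.subarray(idx + 1).toString('latin1').trim();
            return Buffer.from(value, 'latin1');
        }
    }
    return null;    // not found
}

// parse a non-negative decimal integer strictly; NaN on bad input
function parseDec(s: string): number {
    return /^[0-9]+$/.test(s) ? parseInt(s, 10) : NaN;
}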

We will only implement the case where the Content-Length field is present; the other cases are left for later chapters.

// BodyReader from a socket with a known length
function readerFromConnLength(
    conn: TCPConn, buf: DynBuf, remain: number): BodyReader
{
    return {
        length: remain,
        read: async (): Promise<Buffer> => {
            if (remain === 0) {
                return Buffer.from(''); // done
            }
            if (buf.length === 0) {
                // try to get some data if there is none
                const data = await soRead(conn);
                bufPush(buf, data);
                if (data.length === 0) {
                    // expect more data!
                    throw new Error('Unexpected EOF from HTTP body');
                }
            }
            // consume data from the buffer
            const consume = Math.min(buf.length, remain);
            remain -= consume;
            const data = Buffer.from(buf.data.subarray(0, consume));
            bufPop(buf, consume);
            return data;
        }
    };
}

The readerFromConnLength() function returns a BodyReader that reads exactly the number of bytes specified in the Content-Length field. Note that the data from the socket goes into the buffer first, then we drain data from the buffer. This is because the buffer may already hold body data that arrived together with the request header, and with pipelined requests it may even hold bytes beyond the current body, which must be left untouched for the next request.

The remain variable is state captured by the read() closure to keep track of the remaining body length.

Step 6: The Request Handler

We can now handle the request according to its URI and method. Here we will show you 2 sample responses.

// a sample request handler
async function handleReq(req: HTTPReq, body: BodyReader): Promise<HTTPRes> {
    // act on the request URI
    let resp: BodyReader;
    switch (req.uri.toString('latin1')) {
    case '/echo':
        // http echo server
        resp = body;
        break;
    default:
        resp = readerFromMemory(Buffer.from('hello world.\n'));
        break;
    }

    return {
        code: 200,
        headers: [Buffer.from('Server: my_first_http_server')],
        body: resp,
    };
}

If the URI is '/echo', we simply set the response payload to the request payload. This essentially creates an echo server in HTTP. You can test this by POSTing data with the curl command.

curl -s --data-binary 'hello' http://127.0.0.1:1234/echo

The other sample response is a fixed string 'hello world.\n'. To do this, we must first create the BodyReader object.

// BodyReader from in-memory data
function readerFromMemory(data: Buffer): BodyReader {
    let done = false;
    return {
        length: data.length,
        read: async (): Promise<Buffer> => {
            if (done) {
                return Buffer.from(''); // no more data
            } else {
                done = true;
                return data;
            }
        },
    };
}

The read() function returns the full data on the first call and returns EOF after that. This is useful for responding with something small that already fits in memory.

Step 7: Send the Response

After handling the request, we can send the response header and the response body if there is one. In this chapter, we will only deal with the payload body of known length; the chunked encoding is left for later chapters. All we need to do is to add the Content-Length field.

// send an HTTP response through the socket
async function writeHTTPResp(conn: TCPConn, resp: HTTPRes): Promise<void> {
    if (resp.body.length < 0) {
        throw new Error('TODO: chunked encoding');
    }
    // set the "Content-Length" field
    console.assert(!fieldGet(resp.headers, 'Content-Length'));
    resp.headers.push(Buffer.from(`Content-Length: ${resp.body.length}`));
    // write the header
    await soWrite(conn, encodeHTTPResp(resp));
    // write the body
    while (true) {
        const data = await resp.body.read();
        if (data.length === 0) {
            break;
        }
        await soWrite(conn, data);
    }
}

The encodeHTTPResp() function encodes a response header into a byte buffer. The message format is almost identical to the request message, except for the first line.

status-line = HTTP-version SP status-code SP [ reason-phrase ]

Encoding is much easier than parsing, so the implementation is left to the reader.
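
Still, a minimal sketch might look like this. It omits the optional reason-phrase, keeping the trailing space required by the grammar above:

// encode a response header into a byte buffer
function encodeHTTPResp(resp: HTTPRes): Buffer {
    // the status-line with an empty reason-phrase, e.g. `HTTP/1.1 200 `
    const parts: Buffer[] = [Buffer.from(`HTTP/1.1 ${resp.code} \r\n`)];
    for (const h of resp.headers) {
        parts.push(h, Buffer.from('\r\n'));
    }
    parts.push(Buffer.from('\r\n'));    // the header ends with an empty line
    return Buffer.concat(parts);
}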

Step 8: Review the Server Loop

There is still work to be done after sending the response. We can provide some compatibility for HTTP/1.0 clients by closing the connection immediately, since the connection cannot be reused anyway.

And most importantly, before continuing the loop to the next request, we must make sure that the request body is completely consumed, because the handler function may have ignored the request body and left the parser at the wrong position.

async function serveClient(conn: TCPConn): Promise<void> {
    const buf: DynBuf = {data: Buffer.alloc(0), length: 0};
    while (true) {
        // try to get 1 request header from the buffer
        const msg: null|HTTPReq = cutMessage(buf);
        if (!msg) {
            // omitted ...
            continue;
        }

        // process the message and send the response
        const reqBody: BodyReader = readerFromReq(conn, buf, msg);
        const res: HTTPRes = await handleReq(msg, reqBody);
        await writeHTTPResp(conn, res);
        // close the connection for HTTP/1.0
        if (msg.version === '1.0') {
            return;
        }
        // make sure that the request body is consumed completely
        while ((await reqBody.read()).length > 0) { /* empty */ }
    } // loop for IO
}

Our first HTTP server is now complete.

7.2 Testing

The simplest test case is to make requests with curl. The server should greet you with “hello world”. You can also POST data to the '/echo' path and the server should echo the data back.
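
For example, any path other than '/echo' returns the greeting:

curl -s http://127.0.0.1:1234/hello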

curl -s --data-binary 'hello' http://127.0.0.1:1234/echo

Large HTTP Body

The curl command can also post data from files. We can post a really big file to verify that our server uses only constant memory and does not trigger an OOM.

curl -s --data-binary @a_big_file http://127.0.0.1:1234/echo | sha1sum
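
The output should match the checksum of the original file:

sha1sum a_big_file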

Connection Reuse & Pipelining

Another important thing to test is the ability to handle multiple requests per connection. You can test this either interactively via socat, or automatically via shell scripting.

(cat req1.txt; sleep 1; cat req2.txt) | socat tcp:127.0.0.1:1234,crnl -

Note the crnl option in the socat command; it ensures that lines end with CRLF instead of just LF, as required by HTTP.

If you remove the sleep 1 in the above script, you will also be testing pipelined requests.
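
The request files hold ordinary HTTP messages with LF line endings (which crnl converts to CRLF). For example, req1.txt could be a bodyless request like this, ending with the blank line that terminates the header:

GET /hello HTTP/1.1
Host: 127.0.0.1:1234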

7.3 Discussion: Nagle's Algorithm

Optimization: Combining Small Writes

When sending the response, we used the encodeHTTPResp() function to create a byte buffer of the header before writing the response to the socket. Some people may skip this step and write to the socket line by line.

// Bad example!
// (`status` stands for the reason phrase of the response)
await soWrite(conn, Buffer.from(`HTTP/1.1 ${resp.code} ${status}\r\n`));
for (const h of resp.headers) {
    await soWrite(conn, h);
    await soWrite(conn, Buffer.from('\r\n'));
}
await soWrite(conn, Buffer.from('\r\n'));

The problem with this is that it generates many small writes, causing TCP to send many small packets. Not only does each packet have a relatively large space overhead, but more computation is required to process more packets. People saw this optimization opportunity and added a feature to the TCP stack known as “Nagle’s algorithm” — the TCP stack delays transmission to allow the send buffer to accumulate data, so that multiple consecutive small writes can be combined.

Premature Optimization

However, this is not a good optimization. Many newer network protocol designs, such as TLS, have put a lot of effort into reducing RTTs because many performance problems are latency problems. Adding delays to TCP to combine writes now looks like anti-optimization. And the intended optimization goal can easily be achieved at the application level instead; applications can simply combine small data themselves without delays.

Well-written applications should manage buffers carefully, either by explicitly serializing data into a buffer, or by using some buffered IO interfaces, so that Nagle’s algorithm is not needed. And high-performance applications will want to minimize the number of syscalls, making Nagle’s algorithm even more useless.

What People Actually Do in Practice

When developing networked applications:

  1. Avoid small writes by combining small data before writing.
  2. Disable Nagle’s algorithm.

Nagle’s algorithm is often enabled by default. This can be disabled using the noDelay flag in Node.js.

const server = net.createServer({
    noDelay: true,  // TCP_NODELAY
});
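
For a socket that is already connected, the same can be done with socket.setNoDelay(true).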

7.4 Discussion: Buffered Writer

Alternative: Make Buffering Semi-Transparent

Instead of explicitly serializing data into a buffer, as we do with the response header, we can also add a buffer to the TCPConn type and change the way it works.

// append data to an internal buffer
function soWrite(conn: TCPConn, data: Buffer): Promise<void>;
// flush the buffer to the runtime
function soFlush(conn: TCPConn): Promise<void>;

In the new scheme, the soWrite() function is changed to append data to an internal buffer in TCPConn, and the new soFlush() function is used to actually write the data. The buffer size is limited, and the soWrite() function can also flush the buffer when it is full.

This style of IO is very popular and you may have seen it in other programming languages. For example, stdio in C has a built-in buffer that is enabled by default, so you must call fflush() when appropriate.

Alternative: Add a Buffered Wrapper

Alternatively, you can leave the TCPConn as is, and add a separate wrapper type like this:

type BufferedWriter = {
    write: (data: Buffer) => Promise<void>,
    flush: () => Promise<void>,
    // ...
};

function createBufferedWriter(conn: TCPConn): BufferedWriter;

This is similar to the bufio.Writer in Golang. This scheme is more flexible than adding buffering to the socket code, because the buffered wrapper is also applicable to other forms of IO. And the Go standard library was designed with well-defined interfaces (io.Writer), making the buffered writer a drop-in replacement for the unbuffered writer.
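
A sketch of such a wrapper, built on the soWrite() from our socket code (the semi-transparent soFlush() scheme above can be implemented in the same way; the buffer size here is an arbitrary choice):

// a buffered wrapper over soWrite(); assumes a single writer at a time
function createBufferedWriter(conn: TCPConn, size = 8 * 1024): BufferedWriter {
    const buf = Buffer.alloc(size);
    let len = 0;    // the number of buffered bytes
    const flush = async (): Promise<void> => {
        if (len > 0) {
            await soWrite(conn, Buffer.from(buf.subarray(0, len)));    // copy
            len = 0;
        }
    };
    return {
        write: async (data: Buffer): Promise<void> => {
            if (len + data.length > buf.length) {
                await flush();  // make room first
            }
            if (data.length >= buf.length) {
                await soWrite(conn, data);  // too large to buffer; bypass
            } else {
                len += data.copy(buf, len); // copy() returns the bytes copied
            }
        },
        flush: flush,
    };
}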

There are many more good ideas to steal from the Go standard library. One of them is that the bufio.Writer is not just an io.Writer, but also exposes its internal buffer so that you can write to it directly! This can eliminate temporary buffers and extra data copies when serializing data.