Build Your Own Web Server From Scratch In JavaScript
03. Code A TCP Server
Our first step is to get familiar with the socket API, so we will code a simple TCP server in this chapter.
3.1 TCP Quick Review
Layers of Protocols
Network protocols are divided into different layers, where the higher layer depends on the lower layer, and each layer provides different capacities.
 top
  /\    | App |    message or whatever
  ||    | TCP |    byte stream
  ||    | IP  |    packets
  ||    | ... |
bottom
The layer below TCP is the IP layer. Each IP packet is a message with 3 components:
- The sender’s address.
- The receiver’s address.
- The message data.
Communicating with packet-based schemes is not easy. There are lots of problems for applications to solve:
- What if the message data exceeds the capacity of a single packet?
- What if the packet is lost?
- Out-of-order packets?
To make things simple, the next layer is added on top of IP packets. TCP provides:
- Byte streams instead of packets.
- Reliable and ordered delivery.
A byte stream is simply an ordered sequence of bytes. Applications need to make sense of those bytes. That is what the application protocol layer is for. If the application wants messages, the protocol must be designed to split the bytes into messages.
UDP is on the same layer as TCP, but is still packet-based like the lower layer. UDP just adds port numbers over IP packets.
TCP Byte Stream vs. UDP Packet
The most important difference between a byte stream and a packet stream is that there are no boundaries within a byte stream. When you receive from a UDP socket, each receive corresponds to a single UDP packet, which also corresponds to an underlying IP packet (if not fragmented), which ultimately corresponds to a message from the sender.
However, when you receive from a TCP socket, the data you receive does not correspond in any way to the underlying IP packets! Each time you receive from a TCP socket, the data may come from multiple IP packets, or it may be only part of one IP packet. Nor does it correspond to the sender’s socket API calls.
Byte Stream vs. Packet: DNS as an Example
To help you understand the implication of the byte stream, let’s use the DNS protocol (domain name to IP address lookup) as an example.
DNS runs on UDP: the client sends a single request message and the server responds with a single response message. Each DNS message is encapsulated in a UDP packet.
Due to the drawbacks of packet-based protocols, e.g., the inability to use large messages, DNS is also designed to run on TCP. In TCP, there is no such thing as a “message”; when sending a DNS message over TCP, a 2-byte length field is added in front of the original DNS message so that the server or client can tell which part of the byte stream is which message. This 2-byte length field is the simplest example of an application protocol on top of TCP.
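The length-prefix idea can be sketched in a few lines. This is a minimal illustration, not a DNS implementation: encodeMessage and MessageSplitter are hypothetical names, and the 2-byte length is assumed to be big-endian (network byte order).

```typescript
// Prepend a 2-byte big-endian length field to a message.
function encodeMessage(msg: Buffer): Buffer {
    if (msg.length > 0xffff) {
        throw new Error('message too large for a 2-byte length field');
    }
    const header = Buffer.alloc(2);
    header.writeUInt16BE(msg.length, 0);
    return Buffer.concat([header, msg]);
}

// Hypothetical helper: accumulates bytes from the stream and splits off complete messages.
class MessageSplitter {
    private buf: Buffer = Buffer.alloc(0);

    // Feed one received chunk; returns the messages that are now complete.
    push(chunk: Buffer): Buffer[] {
        this.buf = Buffer.concat([this.buf, chunk]);
        const messages: Buffer[] = [];
        while (this.buf.length >= 2) {
            const len = this.buf.readUInt16BE(0);
            if (this.buf.length < 2 + len) {
                break;  // the message is not complete yet, wait for more data
            }
            messages.push(this.buf.subarray(2, 2 + len));
            this.buf = this.buf.subarray(2 + len);
        }
        return messages;
    }
}
```

The receiver calls push() with whatever chunk it happens to get. A chunk may contain half a message or several messages, and the splitter returns only the complete ones.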
TCP Starts with a Handshake
To establish a TCP connection, there should be a client and a server (ignoring the simultaneous case). The server waits for the client at a specific address (IP + port); this step is called bind & listen. Then the client can connect to that address. The “connect” operation involves a 3-step handshake (SYN, SYN-ACK, ACK), but this is not our concern because the OS does it transparently. After the OS completes the handshake, the connection can be accepted by the server.
TCP is Bidirectional & Full-Duplex
Once established, the TCP connection can be used as a bi-directional byte stream, with 2 channels, one for each direction. Many protocols are request-response like HTTP/1.1, where a peer is either sending a request/response or receiving a response/request. But TCP isn’t restricted to this mode of communication. Each peer can be sending and receiving at the same time (e.g. WebSocket); this is called full-duplex communication.
Each channel can be terminated independently. A peer can terminate its sending channel by sending the FIN flag, which tells the remote peer that no more data will arrive in that direction. The remote peer is notified of the termination when reading from the channel. A TCP connection is normally closed when both channels are terminated, which involves actions from both peers.
3.2 Socket Primitives
The socket API comes in different shapes in different languages and libraries. You are likely to get confused if you jump into the API documentation without knowing the basics.
The OS Refers to Sockets by Handles
A socket is an opaque handle that often represents a connection or something related to IO. When you create a TCP connection, the connection is managed by your operating system and you use the socket handle to refer to the connection in socket APIs. On Linux, a socket handle is simply a file descriptor (fd). In Node.js, socket handles are wrapped into JS objects with methods on them.
Listening Socket & Connection Socket
A TCP server listens on a particular address (ip address + port number) and accepts client connections from that address. The listening address is represented by a socket handle. And when you accept a new client connection, you get the socket handle of the TCP connection.
Now you know that there are 2 types of sockets. The first is a listening socket, which you can bind to an address and listen on. The second is a connection socket, which you get by accepting a client connection from the listening socket.
Read from & Write to a Socket
For a TCP socket, you can read data from it and write data to it. For the read side, there are ways to know when the transmission is ended. The end of transmission is often called the end of file (EOF), which is signaled by the TCP FIN flag.
For the write side, there are ways to end the transmission. Closing a socket is a common way to end a connection normally, and this causes the TCP FIN to be sent. You can also end your side of the transmission while keeping the other side open; this is called a half-open connection (more on this later).
All socket handles must be closed, just like all other types of handles.
List of Socket Primitives
In summary, there are several socket primitives that you need to know about.
- Listening socket:
  - bind & listen
  - accept
  - close
- Connection socket:
  - read
  - write
  - close
3.3 Socket API in Node.js
We will introduce the socket API with a small exercise: a TCP server that reads data from clients and writes the same data back. This is called an “echo server”.
Step 1: Create A Listening Socket
All the networking stuff is in the net module.
import * as net from "net";
Different types of sockets are represented as JS objects. The net.createServer() function creates a listening socket whose type is net.Server. net.Server has a listen() method to bind and listen on an address.
let server = net.createServer();
server.listen({host: '127.0.0.1', port: 1234});
Step 2: Accept New Connections
The next thing is the accept primitive for getting new connections. Unfortunately, there is no accept() function that simply returns a connection.
Here we need some background knowledge about IO in JS. There are 2 styles of handling IO in JS. The first style uses callbacks: you request something to be done and register a callback with the runtime, and when the thing is done, the callback is invoked.
function newConn(socket: net.Socket) {
    console.log('new connection', socket.remoteAddress, socket.remotePort);
    // ...
}

let server = net.createServer();
server.on('connection', newConn);
server.listen({host: '127.0.0.1', port: 1234});
In the above code listing, server.on('connection', newConn) registers the callback function newConn. The runtime will automatically perform the accept operation and invoke the callback with the new connection as an argument of type net.Socket. This callback is registered once, but will be called for each new connection.
Step 3: Error Handling
The 'connection' argument is called an event, which is something you can register callbacks on. There are other events on a listening socket. For example, there is the 'error' event, which is emitted when an error occurs.
server.on('error', (err) => { throw err; });
Here we simply throw the exception and terminate the program. You can test this by running 2 servers on the same address and port; the second server will fail.
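The two-server experiment can also be done inside a single process. The sketch below is only an illustration: bindTwice is a hypothetical helper, and port 0 asks the OS to pick a free port so the example does not depend on port 1234 being available. Instead of throwing, the 'error' callback reports the error code.

```typescript
import * as net from "net";

// Start one server on an OS-chosen port, then try to bind a second server to the same port.
// Resolves with the error code reported by the second server, if any.
function bindTwice(): Promise<string | undefined> {
    return new Promise((resolve) => {
        const first = net.createServer();
        first.listen({host: '127.0.0.1', port: 0}, () => {
            const addr = first.address() as net.AddressInfo;
            const second = net.createServer();
            second.on('error', (err: NodeJS.ErrnoException) => {
                first.close();
                resolve(err.code);  // expected to be 'EADDRINUSE'
            });
            second.on('listening', () => {
                // Not expected to happen; clean up so the process can exit.
                second.close();
                first.close();
                resolve(undefined);
            });
            second.listen({host: '127.0.0.1', port: addr.port});
        });
    });
}
```

Calling bindTwice().then(console.log) should print the error code of the failed listen() call.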
As this book is not a manual, we will not list everything here. Read the Node.js documentation to find out other potentially useful events.
Step 4: Read and Write
Data received from the connection is also delivered via callbacks. The relevant events for reading from a socket are the 'data' event and the 'end' event. The 'data' event is invoked whenever data arrives from the peer, and the 'end' event is invoked when the peer has ended the transmission.
socket.on('end', () => {
    // FIN received. The connection will be closed automatically.
    console.log('EOF.');
});
socket.on('data', (data: Buffer) => {
    console.log('data:', data);
    socket.write(data); // echo back the data.
});
The socket.write() method sends data back to the peer.
Step 5: Close The Connection
The socket.end() method ends the transmission and closes the socket. Here we call socket.end() when the data contains the letter “q” so we can easily test this scenario.
socket.on('data', (data: Buffer) => {
    console.log('data:', data);
    socket.write(data); // echo back the data.

    // actively close the connection if the data contains 'q'
    if (data.includes('q')) {
        console.log('closing.');
        socket.end();   // this will send FIN and close the connection.
    }
});
When the transmission is ended from either side, the socket is automatically closed by the runtime. There is also the 'error' event on net.Socket that reports IO errors. This event also causes the runtime to close the socket.
Step 6: Test It
Here is the complete code for our echo server.
import * as net from "net";

function newConn(socket: net.Socket) {
    console.log('new connection', socket.remoteAddress, socket.remotePort);

    socket.on('end', () => {
        // FIN received. The connection will be closed automatically.
        console.log('EOF.');
    });
    socket.on('data', (data: Buffer) => {
        console.log('data:', data);
        socket.write(data); // echo back the data.

        // actively close the connection if the data contains 'q'
        if (data.includes('q')) {
            console.log('closing.');
            socket.end();   // this will send FIN and close the connection.
        }
    });
}

let server = net.createServer();
server.on('error', (err) => { throw err; });
server.on('connection', newConn);
server.listen({host: '127.0.0.1', port: 1234});
Start the echo server by running node echo_server.js. And test it with the nc or socat command.
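If neither tool is at hand, a small client written with the same net module can serve as a test. This is a sketch under assumptions: echoOnce is a hypothetical helper, and it relies on the server ending the connection after echoing, which the echo server above does only when the message contains the letter “q”.

```typescript
import * as net from "net";

// Connect to an echo server, send one message, and resolve with everything
// received before the server ends the connection.
function echoOnce(port: number, message: string): Promise<string> {
    return new Promise((resolve, reject) => {
        const chunks: Buffer[] = [];
        const socket = net.createConnection({host: '127.0.0.1', port: port}, () => {
            socket.write(message);  // runs once the connection is established
        });
        socket.on('data', (data: Buffer) => { chunks.push(data); });
        socket.on('end', () => resolve(Buffer.concat(chunks).toString()));
        socket.on('error', reject);
    });
}
```

With the echo server running, echoOnce(1234, 'hello q').then(console.log) should print the echoed message.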
3.4 Discussion: Half-Open Connections
It is possible to end one side of the TCP transmission with a FIN while still receiving from the other side. Node.js calls this unidirectional use of TCP half-open (TCP literature more often calls it half-closed). Not many applications are coded this way. Sockets in Node.js do not support half-open by default and are closed when either side of the transmission is ended with a FIN. To support TCP half-open, an additional flag is required.
let server = net.createServer({allowHalfOpen: true});
When the allowHalfOpen flag is enabled, you are responsible for closing the connection, as socket.end() no longer closes the connection, only ends your side of transmission. Use socket.destroy() to close the socket manually.
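As an illustration of why half-open can be useful, here is a sketch of a server that replies only after the client has finished sending. The name createCountingServer and the reply format are made up for this example.

```typescript
import * as net from "net";

// A server that keeps its write side open after the client's FIN:
// it counts the received bytes and replies once the client has finished sending.
function createCountingServer(): net.Server {
    return net.createServer({allowHalfOpen: true}, (socket: net.Socket) => {
        let total = 0;
        socket.on('data', (data: Buffer) => { total += data.length; });
        socket.on('end', () => {
            // The peer sent FIN, but we can still write because of allowHalfOpen.
            socket.end(`received ${total} bytes\n`);    // write the reply, then send our FIN
        });
        socket.on('error', () => socket.destroy());
    });
}
```

A client can send its data, call end() to signal EOF, and still read the reply afterwards. Without the flag, the runtime would end the server's side as soon as the client's FIN arrived.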
3.5 Discussion: The Event Loop & Concurrency
JS Code Runs Within the Event Loop
As you can see, callbacks are needed to do anything in our echo server. This is how an event loop works. It’s a mechanism of the Node.js runtime that is invisible to the programmer. The runtime roughly does something like this:
// pseudo code!
while (running) {
    let events = wait_for_events(); // blocking
    for (let e of events) {
        do_something(e);    // may invoke callbacks
    }
}
The runtime polls for IO events from the OS, such as a new connection arriving, a socket becoming ready to read, or a timer expiring. This step is blocking. Then the runtime reacts to the events and invokes the callbacks that the programmer registered earlier. This process repeats after all events have been handled, thus it’s called the event loop.
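To make the dispatch step concrete, here is a toy model of such a loop. It is not how Node.js is implemented: ToyEventLoop is a hypothetical class, events come from an in-memory queue instead of the OS, and the loop exits when the queue is empty instead of blocking.

```typescript
type Callback = (payload: string) => void;

// A toy, single-threaded event loop: callbacks are registered up front,
// and events are dispatched one at a time from a queue.
class ToyEventLoop {
    private handlers = new Map<string, Callback[]>();
    private queue: {name: string, payload: string}[] = [];

    // Register a callback for an event name.
    on(name: string, cb: Callback): void {
        const list = this.handlers.get(name) ?? [];
        list.push(cb);
        this.handlers.set(name, list);
    }

    // Stand-in for the OS reporting an IO event.
    post(name: string, payload: string): void {
        this.queue.push({name, payload});
    }

    // Dispatch events until the queue is empty.
    run(): void {
        while (this.queue.length > 0) {
            const e = this.queue.shift()!;
            for (const cb of this.handlers.get(e.name) ?? []) {
                cb(e.payload);  // a slow callback delays every event behind it
            }
        }
    }
}
```

Callbacks may post further events while they run. Those are appended to the queue and handled in later iterations, never in the middle of the current callback.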
JS Code and Runtime Share a Single OS Thread
The event loop is single-threaded; execution is either in the runtime code or in the JS code (callbacks or the main program). This works because when a callback returns, or awaits, control is yielded back to the runtime. This implies that any JS code is expected to finish in a short time because the event loop is halted while JS code is executing.
Concurrency in Node.js is Event-Based
To help you understand the implication of the event loop, let’s now consider concurrency. A server can have multiple connections simultaneously, and each connection can emit events. While you are processing an event for one of the connections, the single-threaded runtime cannot do anything for the other connections until you yield back to the runtime. The longer you process an event, the longer everything else is delayed.
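The delay is easy to observe with a timer. In this sketch (measureTimerDelay is a hypothetical helper), a timer is scheduled to fire immediately, but the busy loop keeps the thread occupied, so the timer callback cannot run until the loop is done.

```typescript
// Schedule a 0 ms timer, then hog the thread for `busyMs` milliseconds.
// Resolves with how long the timer callback actually had to wait.
function measureTimerDelay(busyMs: number): Promise<number> {
    return new Promise((resolve) => {
        const start = Date.now();
        setTimeout(() => resolve(Date.now() - start), 0);
        // Busy-wait: the runtime cannot dispatch the timer while this loop runs.
        while (Date.now() - start < busyMs) {
            // spin
        }
    });
}
```

Calling measureTimerDelay(200).then(console.log) should print a number of at least 200, even though the timer was requested with a 0 ms delay. In a server, every other connection would be stalled in the same way.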
3.6 Discussion: Asynchronous vs. Synchronous
Blocking & Non-Blocking IO
How do we avoid blocking the event loop? To answer that, we first need to know what blocks it. One way to block the event loop is to run CPU-intensive code. A common solution to this problem is to voluntarily yield to the runtime. Another solution is to move the CPU-intensive code out of the event loop via multi-threading or multi-processing. These topics are beyond the scope of this book. There is another common cause of blocking: IO.
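Voluntary yielding can look like the following sketch, which breaks a long computation into chunks. The name sumWithYield and the chunk size are arbitrary choices for illustration; setImmediate() is used here as one way to hand control back to the runtime.

```typescript
// Sum the integers in [0, n) while yielding to the event loop every `chunk` iterations.
async function sumWithYield(n: number, chunk: number = 100000): Promise<number> {
    let total = 0;
    for (let i = 0; i < n; i++) {
        total += i;
        if ((i + 1) % chunk === 0) {
            // Hand control back to the runtime so pending events can be processed.
            await new Promise<void>((resolve) => setImmediate(resolve));
        }
    }
    return total;
}
```

The result is the same as a plain loop, but other callbacks get a chance to run between chunks. The cost is extra latency for the computation itself.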
The OS provides both a blocking mode and a non-blocking mode for network IO. In blocking mode, the calling OS thread simply blocks until the result is ready. In non-blocking mode, the call returns immediately whether or not the result is ready, and there is a separate mechanism for being notified of readiness (which is what the event loop uses).
IO in Node.js is Asynchronous
Most Node.js library functions related to IO are either callback-based or promise-based. Promises can be viewed as another way to manage callbacks. These are also described as asynchronous, meaning that the result is delivered via a callback. These APIs do not block the event loop because the JS code doesn’t wait for the result; instead, the JS code yields to the runtime, and when the result is ready, the runtime invokes the callback to continue your program.
The opposite is the synchronous API, which blocks the calling OS thread to wait for the result. For example, let’s take a look at the documentation of the fs module, where file APIs are available in all 3 types.
// promise
filehandle.read([options]);
// callback
fs.read(fd[, options], callback);
// synchronous, do not use!
fs.readSync(fd, buffer[, options]);
The synchronous API is what you do NOT use in network applications since it blocks the event loop.
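The three styles can be compared side by side. The sketch below uses the readFile family (fs.promises.readFile, fs.readFile, fs.readFileSync) rather than the lower-level read calls listed above, because they need less setup. The function names are hypothetical, and makeScratchFile only exists to create a file to read; it uses synchronous calls for brevity.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Promise-based: `await` yields to the event loop until the read completes.
async function readWithPromise(file: string): Promise<string> {
    return await fs.promises.readFile(file, 'utf8');
}

// Callback-based: returns immediately; the runtime invokes `done` later.
function readWithCallback(file: string, done: (err: Error | null, text?: string) => void): void {
    fs.readFile(file, 'utf8', (err, text) => done(err, text));
}

// Synchronous: blocks the event loop until the read completes.
function readBlocking(file: string): string {
    return fs.readFileSync(file, 'utf8');
}

// Hypothetical helper that creates a scratch file in the OS temp directory.
function makeScratchFile(text: string): string {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'io-styles-'));
    const file = path.join(dir, 'sample.txt');
    fs.writeFileSync(file, text);
    return file;
}
```

All three return the same contents. The difference is what the rest of the program can do while the read is in progress.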
The Event Loop is Also Used for GUIs
IO is more than disk files and networking. In GUI systems, user input from the mouse and keyboard is also IO. And event loops are not unique to the Node.js runtime; Web browsers and all other GUI applications also use event loops under the hood. You can transfer your experience in GUI programming to network programming and vice versa.
3.7 Discussion: Promise-Based IO
As we mentioned before, there is another style of writing IO code. The alternative style uses Promises instead of callbacks. The advantage of promise-based APIs is that you can await on them and get the result, thus avoiding breaking your program into tiny callbacks that are scattered all over the place.
A hypothetical promise-based API for the accept primitive looks like this:
// pseudo code!
while (running) {
    let socket = await server.accept();
    newConn(socket);    // no `await` on this
}
And the hypothetical API for the read and write primitives looks like this:
// pseudo code!
async function newConn(socket) {
    while (true) {
        let data = await socket.read();
        if (!data) {
            break;  // EOF
        }
        await socket.write(data);
    }
}
The above pseudo code appears to be synchronous, but it does not block the event loop. The advantage may not be clear at this point, since our program is very simple. We will move on to the promise-based style in later chapters.
Some Node.js APIs, but not all of them, are available in both callback-based and promise-based styles. However, with some effort, callback-based APIs can be converted to promise-based ones, as we will see in the next chapter.
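To give a rough idea of what such a conversion involves, here is a minimal sketch that wraps the next 'data', 'end', or 'error' event of an emitter in a promise. It is not the design used in the next chapter: readOnce is a hypothetical name, and the sketch ignores events that fire while no read is pending, which a real implementation would have to handle.

```typescript
import { EventEmitter } from "events";

// Wrap the next 'data' / 'end' / 'error' event of an emitter (such as a net.Socket)
// in a promise. Resolves with the data, or with null on EOF.
function readOnce(source: EventEmitter): Promise<Buffer | null> {
    return new Promise((resolve, reject) => {
        // Remove all three callbacks once any of them has fired.
        const cleanup = () => {
            source.off('data', onData);
            source.off('end', onEnd);
            source.off('error', onError);
        };
        const onData = (data: Buffer) => { cleanup(); resolve(data); };
        const onEnd = () => { cleanup(); resolve(null); };
        const onError = (err: Error) => { cleanup(); reject(err); };
        source.on('data', onData);
        source.on('end', onEnd);
        source.on('error', onError);
    });
}
```

With a wrapper like this, the body of the hypothetical read loop above becomes let data = await readOnce(socket).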