03. Code A TCP Server
Our first step is to get familiar with the socket API, so we will code a simple TCP server in this chapter.
3.1 TCP Quick Review
Layers of Protocols
Network protocols are divided into different layers, where a higher layer depends on the lower layers, and each layer provides different capabilities.
  top
   /\     | App |    message or whatever
   ||     | TCP |    byte stream
   ||     | IP  |    packets
   ||     | ... |
  bottom
The layer below TCP is the IP layer. Each IP packet is a message with 3 components:
- The sender’s address.
- The receiver’s address.
- The message data.
Communication with a packet-based scheme is not easy. There are lots of problems for applications to solve:
- What if the message data exceeds the capacity of a single packet?
- What if the packet is lost?
- Out-of-order packets?
To make things simple, the next layer is added on top of IP packets. TCP provides:
- Byte streams instead of packets.
- Reliable and ordered delivery.
A byte stream is simply an ordered sequence of bytes. A protocol, rather than the application, is used to make sense of these bytes. Protocols are like file formats, except that the total length is unknown and the data is read in one pass.
UDP is on the same layer as TCP, but is still packet-based like the lower layer. UDP just adds port numbers over IP packets.
TCP Byte Stream vs. UDP Packet
The key difference: boundaries.
- UDP: Each read from a socket corresponds to a single write from the peer.
- TCP: No such correspondence! Data is a continuous flow of bytes.
TCP simply has no mechanism for preserving boundaries.
- TCP send buffer: This is where data is stored before transmission. Multiple writes are indistinguishable from a single write.
- Data is encapsulated as one or more IP packets; packet boundaries have no relationship to the original write boundaries.
- TCP receive buffer: Data is available to applications as it arrives.
The No. 1 beginner trap in socket programming is “concatenating & splitting TCP packets” because there is no such thing as “TCP packets”. Protocols are required to interpret TCP data by imposing boundaries within the byte stream.
Byte Stream vs. Packet: DNS as an Example
To help you understand the implications of the byte stream, let’s use the DNS protocol (domain name to IP address lookup) as an example.
DNS runs on UDP, the client sends a single request message and the server responds with a single response message. A DNS message is encapsulated in a UDP packet.
| IP header | IP payload                   |
            \............................../
             | UDP header | UDP payload    |
                          \................/
                           | DNS message  |
Due to the drawbacks of packet-based protocols, e.g., the inability to use large messages, DNS is also designed to run on TCP. But TCP knows nothing about “messages”, so when sending DNS messages over TCP, a 2-byte length field is prepended to each DNS message so that the server or client can tell which part of the byte stream is which message. This 2-byte length field is the simplest example of an application protocol on top of TCP. This protocol allows for multiple application messages (DNS) in a single TCP byte stream.
| len1 | msg1 | len2 | msg2 | ...
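The framing scheme above can be sketched in TypeScript. Note that `encodeMsg` and `decodeMsg` are hypothetical helper names for illustration, not library functions:

```typescript
// Hypothetical helpers illustrating length-prefixed framing:
// each message is preceded by a 2-byte big-endian length,
// the same scheme DNS-over-TCP uses.
function encodeMsg(msg: Buffer): Buffer {
    const header = Buffer.alloc(2);
    header.writeUInt16BE(msg.length, 0);    // the 2-byte length field
    return Buffer.concat([header, msg]);
}

// Try to cut one complete message from the front of the buffered stream
// data; return null if the message has not fully arrived yet.
function decodeMsg(buf: Buffer): { msg: Buffer, rest: Buffer } | null {
    if (buf.length < 2) {
        return null;                        // incomplete length field
    }
    const len = buf.readUInt16BE(0);
    if (buf.length < 2 + len) {
        return null;                        // incomplete message body
    }
    return { msg: buf.subarray(2, 2 + len), rest: buf.subarray(2 + len) };
}

// two messages written back-to-back arrive as one undivided byte stream,
// yet the length prefixes let the receiver recover the boundaries
const stream = Buffer.concat([
    encodeMsg(Buffer.from('hello')),
    encodeMsg(Buffer.from('world')),
]);
const first = decodeMsg(stream)!;
const second = decodeMsg(first.rest)!;
console.log(first.msg.toString(), second.msg.toString());   // hello world
```

The receiver must buffer incoming bytes and retry decoding as more data arrives, which is exactly why `decodeMsg` returns null rather than throwing on partial input.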
TCP Starts with a Handshake
To establish a TCP connection, there should be a client and a server (ignoring the simultaneous case). The server waits for the client at a specific address (IP + port); this step is called bind & listen. Then the client can connect to that address. The “connect” operation involves a 3-step handshake (SYN, SYN-ACK, ACK), but this is not our concern because the OS does it transparently. After the OS completes the handshake, the connection can be accepted by the server.
TCP is Bidirectional & Full-Duplex
Once established, the TCP connection can be used as a bi-directional byte stream, with 2 channels, one for each direction. Many protocols are request-response like HTTP/1.1, where a peer is either sending a request/response or receiving a response/request. But TCP isn’t restricted to this mode of communication. Each peer can send and receive at the same time (e.g. WebSocket); this is called full-duplex communication.
TCP Ends with 2 Handshakes
A peer tells the other side that no more data will be sent with the FIN flag, then the other side ACKs the FIN. The remote application is notified of the termination when reading from the channel.
Each direction (channel) can be terminated independently, so the other side also performs the same handshake to fully close the connection.
3.2 Socket Primitives
The socket API comes in different shapes in different languages and libraries. You are likely to get confused if you jump into the API documentation without knowing the basics.
Applications Refer to Sockets by Opaque OS Handles
When you create a TCP connection, the connection is managed by your operating system, and you use the socket handle to refer to the connection in the socket API. In Linux, a socket handle is simply a file descriptor (fd). In Node.js, socket handles are wrapped into JS objects with methods on them.
Any OS handle must be closed by the application to terminate the underlying resource and recycle the handle.
Listening Socket & Connection Socket
A TCP server listens on a particular address (IP + port) and accepts client connections from that address. The listening address is also represented by a socket handle. And when you accept a new client connection, you get the socket handle of the TCP connection.
Now you know that there are 2 types of socket handles.
- Listening sockets. Obtained by listening on an address.
- Connection sockets. Obtained by accepting a client connection from a listening socket.
End of Transmission
Send and receive are also called read and write. For the write side, there are ways to tell the peer that no more data will be sent.
- Closing a socket terminates a connection and causes the TCP FIN to be sent. Closing a handle of any type also recycles the handle itself. (Once the handle is gone, you cannot do anything with it.)
- You can also shutdown your side of the transmission (also send FIN) while still being able to receive data from the peer; this is called a half-open connection, more on this later.
For the read side, there are ways to know when the peer has ended the transmission (received FIN). The end of transmission is often called the end of file (EOF).
List of Socket Primitives
In summary, there are several socket primitives that you need to know about.
- Listening socket:
- bind & listen
- accept
- close
- Connection socket:
- read
- write
- close
3.3 Socket API in Node.js
We will introduce the socket API with a small exercise: a TCP server that reads data from clients and writes the same data back. This is called an “echo server”.
Step 1: Create A Listening Socket
All the networking stuff is in the net module.

import * as net from "net";

Different types of sockets are represented as JS objects. The net.createServer() function creates a listening socket whose type is net.Server. net.Server has a listen() method to bind and listen on an address.

let server = net.createServer();
server.listen({host: '127.0.0.1', port: 1234});
Step 2: Accept New Connections
The next thing is the accept primitive for getting new connections. Unfortunately, there is no accept() function that simply returns a connection.
Here we need some background knowledge about IO in JS. There are 2 styles of handling IO in JS; the first style uses callbacks: you request something to be done and register a callback with the runtime, and when the thing is done, the callback is invoked.
function newConn(socket: net.Socket): void {
console.log('new connection', socket.remoteAddress, socket.remotePort);
// ...
}
let server = net.createServer();
server.on('connection', newConn);
server.listen({host: '127.0.0.1', port: 1234});
In the above code listing, server.on('connection', newConn) registers the callback function newConn. The runtime will automatically perform the accept operation and invoke the callback with the new connection as an argument of type net.Socket. This callback is registered once, but will be called for each new connection.
Step 3: Error Handling
The 'connection' argument is called an event, which is something you can register callbacks on. There are other events on a listening socket. For example, there is the 'error' event, which is invoked when an error occurs.

server.on('error', (err: Error) => { throw err; });
Here we simply throw the exception and terminate the program. You can test this by running 2 servers on the same address and port; the second server will fail.
As this book is not a manual, we will not list everything here. Read the Node.js documentation to find out other potentially useful events.
Step 4: Read and Write
Data received from the connection is also delivered via callbacks.
The relevant events for reading from a socket are the 'data' event and the 'end' event. The 'data' event is invoked whenever data arrives from the peer, and the 'end' event is invoked when the peer has ended the transmission.
socket.on('end', () => {
    // FIN received. The connection will be closed automatically.
    console.log('EOF.');
});
socket.on('data', (data: Buffer) => {
    console.log('data:', data);
    socket.write(data); // echo back the data.
});
The socket.write() method sends data back to the peer.
Step 5: Close The Connection
The socket.end() method ends the transmission and closes the socket. Here we call socket.end() when the data contains the letter “q” so we can easily test this scenario.
socket.on('data', (data: Buffer) => {
    console.log('data:', data);
    socket.write(data); // echo back the data.

    // actively close the connection if the data contains 'q'
    if (data.includes('q')) {
        console.log('closing.');
        socket.end(); // this will send FIN and close the connection.
    }
});
When the transmission is ended from either side, the socket is automatically closed by the runtime. There is also the 'error' event on net.Socket that reports IO errors. This event also causes the runtime to close the socket.
Step 6: Test It
Here is the complete code for our echo server.
import * as net from "net";

function newConn(socket: net.Socket): void {
    console.log('new connection', socket.remoteAddress, socket.remotePort);
    socket.on('end', () => {
        // FIN received. The connection will be closed automatically.
        console.log('EOF.');
    });
    socket.on('data', (data: Buffer) => {
        console.log('data:', data);
        socket.write(data); // echo back the data.

        // actively close the connection if the data contains 'q'
        if (data.includes('q')) {
            console.log('closing.');
            socket.end(); // this will send FIN and close the connection.
        }
    });
}

let server = net.createServer();
server.on('error', (err: Error) => { throw err; });
server.on('connection', newConn);
server.listen({host: '127.0.0.1', port: 1234});
Start the echo server by running node --enable-source-maps echo_server.js. And test it with the nc or socat command.
3.4 Discussion: Half-Open Connections
Each direction of a TCP connection is ended independently, and it is possible to make use of the state where one direction is closed and the other is still open; this unidirectional use of TCP is called TCP half-open. For example, if peer A half-closes the connection to peer B:
- A cannot send any more data, but can still receive from B.
- B gets EOF, but can still send to A.
Not many applications make use of this. Most applications treat EOF the same way as being fully closed by the peer, and will also close the socket immediately.
The socket primitive for this is called shutdown. Sockets in Node.js do not support half-open by default, and are automatically closed when either side sends or receives EOF. To support TCP half-open, an additional flag is required.
let server = net.createServer({allowHalfOpen: true});
When the allowHalfOpen flag is enabled, you are responsible for closing the connection, because socket.end() will no longer close the connection, but will only send EOF. Use socket.destroy() to close the socket manually.
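To make this concrete, here is a small self-contained sketch of a half-open exchange. The function name, port choice (ephemeral), and message strings are all made up for illustration: the client sends its FIN first, yet still receives a reply, because the server was created with allowHalfOpen.

```typescript
import * as net from "net";

// demo(): a throwaway illustration of TCP half-open. The server allows
// half-open, so after the client's FIN arrives, the server's direction
// of the connection is still usable for sending a reply.
function demo(): Promise<string> {
    return new Promise((resolve) => {
        const server = net.createServer({allowHalfOpen: true}, (socket: net.Socket) => {
            socket.on('data', () => {});    // drain whatever the client sent
            socket.on('end', () => {
                // the client's FIN arrived, but our direction is still open
                socket.write('reply after your EOF');
                socket.end();               // now send our own FIN
            });
        });
        server.listen(0, '127.0.0.1', () => {
            const port = (server.address() as net.AddressInfo).port;
            const client = net.connect({port: port, host: '127.0.0.1'});
            let got = '';
            client.on('connect', () => client.end('hello'));    // data + FIN
            client.on('data', (data: Buffer) => { got += data.toString(); });
            client.on('end', () => {
                server.close();
                resolve(got);   // data received after we half-closed
            });
        });
    });
}

demo().then((got) => console.log('received after half-close:', got));
```

Here the server calls socket.end() after writing, since both directions are then finished; socket.destroy() is for abandoning the socket without completing the handshake.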
3.5 Discussion: The Event Loop & Concurrency
JS Code Runs Within the Event Loop
As you can see, callbacks are needed to do anything in our echo server. This is how an event loop works. It’s a mechanism of the Node.js runtime that is invisible to the programmer. The runtime does something like this:
// pseudo code!
while (running) {
    let events = wait_for_events(); // blocking
    for (let e of events) {
        do_something(e); // may invoke callbacks
    }
}
The runtime polls for IO events from the OS, such as a new connection arriving, a socket becoming ready to read, or a timer expiring. Then the runtime reacts to the events and invokes the callbacks that the programmer registered earlier. This process repeats after all events have been handled, thus it’s called the event loop.
JS Code and Runtime Share a Single OS Thread
The event loop is single-threaded; execution is either on the runtime code or on the JS code (callbacks or the main program). This works because when a callback returns, or awaits, control is back to the runtime, so the runtime can emit events and schedule other tasks. This implies that any JS code is expected to finish in a short time because the event loop is halted when executing JS code.
Concurrency in Node.JS is Event-Based
To help you understand the implication of the event loop, let’s now consider concurrency. A server can have multiple connections simultaneously, and each connection can emit events.
While an event handler is running, the single-threaded runtime cannot do anything for the other connections until the handler returns. The longer you process an event, the longer everything else is delayed.
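A tiny experiment makes this delay visible (the durations are arbitrary): a 10 ms timer fires at roughly 100 ms instead, because a busy loop keeps the thread away from the event loop.

```typescript
const start = Date.now();
let firedAfter = 0;
setTimeout(() => {
    firedAfter = Date.now() - start;
    // fires at ~100 ms instead of 10 ms: the callback could not run
    // while the busy loop below was monopolizing the thread.
    console.log(`timer fired after ${firedAfter} ms`);
}, 10);

// simulate a long-running event handler with a ~100 ms busy wait
const deadline = Date.now() + 100;
while (Date.now() < deadline) {
    // spinning; the event loop is halted during this time
}
```

In a real server, the "busy loop" would be your own slow event handler, and the delayed timer would be every other connection's IO.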
3.6 Discussion: Asynchronous vs. Synchronous
Blocking & Non-Blocking IO
It’s vital to avoid staying in the event loop for too long. One way to cause such trouble is to run CPU-intensive code. This can be solved by …
- Yielding to the runtime voluntarily.
- Moving the CPU-intensive code out of the event loop via multi-threading or multi-processing.
These topics are beyond the scope of this book, and our primary concern is IO.
The OS provides both blocking mode and non-blocking mode for network IO.
- In blocking mode, the calling OS thread blocks until the result is ready.
- In non-blocking mode, the OS immediately returns if the result is not ready (or is ready), and there is a way to be notified of readiness (for event loops).
The Node.JS runtime uses non-blocking mode because blocking mode is incompatible with event-based concurrency. The only blocking operation in an event loop is polling the OS for more events when there is nothing to do.
IO in Node.js is Asynchronous
Most Node.js library functions related to IO are either callback-based or promise-based. Promises can be viewed as another way to manage callbacks. These are also described as asynchronous, meaning that the result is delivered via a callback. These APIs do not block the event loop because the JS code doesn’t wait for the result; instead, the JS code returns to the runtime, and when the result is ready, the runtime invokes the callback to continue your program.
The opposite is the synchronous API, which blocks the calling OS thread to wait for the result. For example, let’s take a look at the documentation of the fs module; file APIs are available in all 3 types.
// promise
filehandle.read([options]);
// callback
fs.read(fd[, options], callback);
// synchronous, do not use!
fs.readSync(fd, buffer[, options]);
Synchronous APIs are what you do NOT use in network applications since they block the event loop. They exist for some simple use cases (like scripting) that do not depend on the event loop at all.
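To see the three styles side by side, here is a small runnable sketch. The temp-file path and its contents are ours, used only for illustration:

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// set up a throwaway file to read back in all three styles
const file = path.join(os.tmpdir(), 'demo_3styles.txt');
fs.writeFileSync(file, 'hello');

// 1. synchronous: blocks the calling thread until the data is read.
//    acceptable in scripts, wrong in a network server.
const sync: string = fs.readFileSync(file, 'utf8');
console.log('sync:', sync);

// 2. callback-based: returns immediately; the result arrives later
//    via the callback, invoked by the event loop.
fs.readFile(file, 'utf8', (err, data) => {
    if (err) throw err;
    console.log('callback:', data);
});

// 3. promise-based: same non-blocking behavior, but awaitable.
(async () => {
    const data = await fs.promises.readFile(file, 'utf8');
    console.log('promise:', data);
})();
```

All three produce the same data; the difference is purely in how control flow returns to your program.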
Event-Based Programming Beyond Networking
IO is more than disk files and networking. In GUI systems, user input from the mouse and keyboard is also IO. And event loops are not unique to the Node.js runtime; Web browsers and all other GUI applications also use event loops under the hood. You can transfer your experience in GUI programming to network programming and vice versa.
3.7 Discussion: Promise-Based IO
As we mentioned before, there is another style of writing IO code. The alternative style uses Promises instead of callbacks. The advantage of promise-based APIs is that you can await on them and get the result, thus avoiding breaking your program into tiny callbacks scattered all over the place.
A hypothetical promise-based API for the accept primitive looks like this:
// pseudo code!
while (running) {
    let socket = await server.accept();
    newConn(socket); // no `await` on this
}
And the hypothetical API for the read and write primitive looks like this:
// pseudo code!
async function newConn(socket) {
    while (true) {
        let data = await socket.read();
        if (!data) {
            break; // EOF
        }
        await socket.write(data);
    }
}
The above pseudo code appears to be synchronous, but without blocking the event loop. The advantage may not be clear at this point, since our program is very simple.
Some Node.js APIs, but not all of them, are available in both callback-based and promise-based styles. However, with some effort, callback-based APIs can be converted to promise-based ones, as we will see in the next chapter.
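As a preview, here is one hypothetical way to wrap the callback-based accept into a promise. The name soAccept is made up; the wrapper actually used in this book is built in the next chapter:

```typescript
import * as net from "net";

// soAccept(): resolve with the next incoming connection, turning the
// 'connection' event into a promise. A production version would also
// remove the stale 'error' listener after resolving.
function soAccept(server: net.Server): Promise<net.Socket> {
    return new Promise((resolve, reject) => {
        server.once('connection', resolve);
        server.once('error', reject);
    });
}

// usage: accept a single connection, then shut everything down.
const server = net.createServer();
server.listen(0, '127.0.0.1', async () => {
    const port = (server.address() as net.AddressInfo).port;
    const client = net.connect({port: port, host: '127.0.0.1'});
    const socket = await soAccept(server);
    console.log('accepted from', socket.remoteAddress);
    socket.destroy();
    client.destroy();
    server.close();
});
```

The key idea: one event occurrence becomes one resolved promise, which is why the loop in the pseudo code calls the wrapper once per connection.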