Introduction
Database systems are one of the foundations of modern computing. When you hear the word “database”, you probably think of successful database products such as MySQL, Postgres, SQLite, ClickHouse, and so on. They are ubiquitous in software systems.
Understanding various database systems is important for a software developer. However, it’s also important to understand the ideas behind them. “Build Your Own Database” is my new book that tries to illustrate those ideas.
Why Learn Databases?
One of the reasons SQL is so popular is that it can largely be used as a black box: feed a SQL statement into a DB and it will just do what you mean. The do-what-I-mean power combined with the English-like language makes SQL a good human interface.
But sometimes the do-what-I-mean capacity is just not enough. Applications also require the DB to perform operations in a reasonable amount of time, using a reasonable amount of resources. This leads to the topic of indexing.
Learning how to use indexes in SQL queries is usually the first step toward understanding databases. But it is only the first step, not the last. SQL is to databases what Excel is to computers: it’s just an interface for doing your work, and real understanding comes from what lies underneath the interface.
Taking database indexes as an example, one can spend time learning what indexes do and the rules for using them. One can do real work with that knowledge, yet still have only vague ideas like “indexes are fast, don’t scan the table”.
Or one can learn data structures like B-trees and the general concepts of sorting and searching. With that knowledge, an index is just a straightforward application of data structures. Such a developer’s idea of databases is more concrete: they can tell the big-O complexity of database operations, and what can and cannot be done in production. That is much more useful.
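To make the contrast concrete, here is a minimal Python sketch (not taken from the book) of an index as nothing more than keys kept in sorted order. The table contents and the function names `scan_lookup`, `index_lookup`, and `index_range` are made up for illustration, and a sorted list stands in for the B-tree a real database would use.

```python
import bisect

# Hypothetical table: (id, name) rows stored in no particular order.
rows = [(30, "carol"), (10, "alice"), (40, "dave"), (20, "bob")]

def scan_lookup(rows, key):
    """Full table scan: up to n comparisons, i.e. O(n)."""
    for row in rows:
        if row[0] == key:
            return row
    return None

# A toy "index": keys in sorted order, each paired with its row position.
index = sorted((row[0], pos) for pos, row in enumerate(rows))
index_keys = [k for k, _ in index]

def index_lookup(rows, key):
    """Binary search over the sorted keys: O(log n) comparisons."""
    i = bisect.bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        return rows[index[i][1]]
    return None

def index_range(rows, lo, hi):
    """Range query lo <= key <= hi: binary search for the bounds,
    then read the matching entries in key order."""
    start = bisect.bisect_left(index_keys, lo)
    end = bisect.bisect_right(index_keys, hi)
    return [rows[pos] for _, pos in index[start:end]]
```

Both lookups return the same row, but only the indexed version stays cheap as the table grows, and only the sorted structure can answer a range query without touching every row. The sorted list here makes inserts O(n); avoiding that cost is one reason databases use B-trees instead.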
And indexes are not the only topic you can learn about databases. There are others: persistence, concurrency, and so on. Do you treat them as black boxes too? Time to move on!
The Book
The goal of this book is to illustrate important ideas around database systems. However, it’s not a book that talks about theories. It is one of the “build your own X from scratch” books.
If you have ever tried to understand the inner workings of a popular database, you may have found yourself overwhelmed by information. There are countless books and blog posts explaining various aspects and details, yet you still do not quite get the big picture. This is because real-world software projects are complex, and understanding them takes a lot of time.
But it’s possible to build a simplified version of a database: one that covers just enough to teach you some important aspects, yet is simple enough to build yourself in your spare time. My previous book “Build Your Own Redis” is an example of this. And there are good reasons for the “from scratch” method.
The mini database in the book does not merely emulate the interface of a database; it also covers important topics of database systems. I have chosen three major topics for this book: persistence, indexing, and concurrency.
The mini database implementation is incremental. You go from a B-tree to a KV store, then a relational DB, all in small steps.
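As a rough illustration of how those layers can stack, here is a minimal Python sketch. It is an assumption about the general shape of such a design, not the book's actual code: `SortedKV` is a hypothetical in-memory stand-in for a B-tree-backed KV store (no persistence), and `Table` is a hypothetical relational layer that stores each row as one KV pair keyed by table name and primary key.

```python
import bisect

class SortedKV:
    """Stand-in for a B-tree: keys kept in a sorted in-memory list."""

    def __init__(self):
        self.keys = []
        self.vals = []

    def set(self, key, val):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.vals[i] = val  # overwrite an existing key
        else:
            self.keys.insert(i, key)  # keep the keys sorted
            self.vals.insert(i, val)

    def get(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.vals[i]
        return None

class Table:
    """Toy relational layer: the first column is the integer primary key."""

    def __init__(self, kv, name, columns):
        self.kv = kv
        self.name = name
        self.columns = columns

    def _key(self, pk):
        # Zero-padded so that string order matches numeric order.
        return f"{self.name}/{pk:08d}"

    def insert(self, row):
        pk = row[self.columns[0]]
        self.kv.set(self._key(pk), [row[c] for c in self.columns[1:]])

    def select(self, pk):
        vals = self.kv.get(self._key(pk))
        if vals is None:
            return None
        return dict(zip(self.columns, [pk] + vals))

kv = SortedKV()
users = Table(kv, "users", ["id", "name", "age"])
users.insert({"id": 2, "name": "bob", "age": 31})
```

The relational layer never touches the sorted structure directly; it only encodes rows into keys and values. That separation is what allows each layer to be built and tested in a small step.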
The book is still a work in progress.
Read the draft here.