05. B+Tree Deletion and Testing
5.1 High-level interfaces
We’ll add the interfaces to use the B+tree as a KV.
// insert a new key or update an existing key
func (tree *BTree) Insert(key []byte, val []byte) error
// delete a key and returns whether the key was there
func (tree *BTree) Delete(key []byte) (bool, error)
Most of the details are introduced with the tree insertion, so there’s not much more to learn from the deletion. Skip this chapter if you know the principle.
Handle the root node
There is some extra work in maintaining the root node for tree insertions.
- Create the root node if the tree is empty.
- Add a new root if the root node is split.
func (tree *BTree) Insert(key []byte, val []byte) error {
// 1. check the length limit imposed by the node format
if err := checkLimit(key, val); err != nil {
return err // the only way for an update to fail
}
// 2. create the first node
if tree.root == 0 {
:= BNode(make([]byte, BTREE_PAGE_SIZE))
root // later...
.root = tree.new(root)
treereturn nil
}
// 3. insert the key
:= treeInsert(tree, tree.get(tree.root), key, val)
node // 4. grow the tree if the root is split
, split := nodeSplit3(node)
nsplit.del(tree.root)
treeif nsplit > 1 {
:= BNode(make([]byte, BTREE_PAGE_SIZE))
root // later ...
.root = tree.new(root)
tree} else {
.root = tree.new(split[0])
tree}
return nil
}
Sentinel value
There is a trick when creating the first root: we inserted an empty key. This is called a sentinel value, it’s used to remove an edge case.
if tree.root == 0 {
:= BNode(make([]byte, BTREE_PAGE_SIZE))
root .setHeader(BNODE_LEAF, 2)
root// a dummy key, this makes the tree cover the whole key space.
// thus a lookup can always find a containing node.
(root, 0, 0, nil, nil)
nodeAppendKV(root, 1, 0, key, val)
nodeAppendKV.root = tree.new(root)
treereturn nil
}
Examine the lookup function nodeLookupLE
, you’ll see
that it won’t work if the key is out of the node’s range. This is fixed
by inserting an empty key, which is the lowest possible key by sort
order, so that a lookup will always find a position.
Grow the tree
The new root points to the split nodes of the old root. This won’t happen often for the tree height is O(log N).
if nsplit > 1 { // the root was split, add a new level.
:= BNode(make([]byte, BTREE_PAGE_SIZE))
root .setHeader(BNODE_NODE, nsplit)
rootfor i, knode := range split[:nsplit] {
, key := tree.new(knode), knode.getKey(0)
ptr(root, uint16(i), ptr, key, nil)
nodeAppendKV}
}
5.2 Merge nodes
Node update functions
Since we’re copy-on-write, updating a node means copying data to a new node.
// remove a key from a leaf node
func leafDelete(new BNode, old BNode, idx uint16)
// merge 2 nodes into 1
func nodeMerge(new BNode, left BNode, right BNode)
// replace 2 adjacent links with 1
func nodeReplace2Kid(new BNode, old BNode, idx uint16, ptr uint64, key []byte)
Merge conditions
Deleting may result in empty nodes, which can be merged with a
sibling if it has one. shouldMerge
returns which sibling
(left or right) to merge with.
// should the updated kid be merged with a sibling?
func shouldMerge(
*BTree, node BNode, idx uint16, updated BNode,
tree ) (int, BNode) {
if updated.nbytes() > BTREE_PAGE_SIZE/4 {
return 0, BNode{}
}
if idx > 0 {
:= BNode(tree.get(node.getPtr(idx - 1)))
sibling := sibling.nbytes() + updated.nbytes() - HEADER
merged if merged <= BTREE_PAGE_SIZE {
return -1, sibling // left
}
}
if idx+1 < node.nkeys() {
:= BNode(tree.get(node.getPtr(idx + 1)))
sibling := sibling.nbytes() + updated.nbytes() - HEADER
merged if merged <= BTREE_PAGE_SIZE {
return +1, sibling // right
}
}
return 0, BNode{}
}
Deleted keys mean unused space within nodes. In the worst case, a mostly empty tree can still retain a large number of nodes. We can improve this by triggering merges earlier — using 1/4 of a page as a threshold instead of the empty node, which is an arbitrary soft limit on the minimum node size.
5.3 B+tree deletion
This is similar to insertion, just replace splitting with merging.
// delete a key from the tree
func treeDelete(tree *BTree, node BNode, key []byte) BNode
// delete a key from an internal node; part of the treeDelete()
func nodeDelete(tree *BTree, node BNode, idx uint16, key []byte) BNode {
// recurse into the kid
:= node.getPtr(idx)
kptr := treeDelete(tree, tree.get(kptr), key)
updated if len(updated) == 0 {
return BNode{} // not found
}
.del(kptr)
tree// check for merging
new := BNode(make([]byte, BTREE_PAGE_SIZE))
, sibling := shouldMerge(tree, node, idx, updated)
mergeDirswitch {
case mergeDir < 0: // left
:= BNode(make([]byte, BTREE_PAGE_SIZE))
merged (merged, sibling, updated)
nodeMerge.del(node.getPtr(idx - 1))
tree(new, node, idx-1, tree.new(merged), merged.getKey(0))
nodeReplace2Kidcase mergeDir > 0: // right
:= BNode(make([]byte, BTREE_PAGE_SIZE))
merged (merged, updated, sibling)
nodeMerge.del(node.getPtr(idx + 1))
tree(new, node, idx, tree.new(merged), merged.getKey(0))
nodeReplace2Kidcase mergeDir == 0 && updated.nkeys() == 0:
(node.nkeys() == 1 && idx == 0) // 1 empty child but no sibling
assertnew.setHeader(BNODE_NODE, 0) // the parent becomes empty too
case mergeDir == 0 && updated.nkeys() > 0: // no merge
(tree, new, node, idx, updated)
nodeReplaceKidN}
return new
}
Even if a node becomes empty, it may not be merged if it has no siblings. In this case, the empty node is propagated to its parent and merged later.
5.4 Test the B+tree
The data structure only interacts with the rest of the DB via the 3 page management callbacks. To test the B+tree, we can simulate pages in memory.
type C struct {
tree BTreemap[string]string // the reference data
ref map[uint64]BNode // in-memory pages
pages }
func newC() *C {
:= map[uint64]BNode{}
pages return &C{
: BTree{
tree: func(ptr uint64) []byte {
get, ok := pages[ptr]
node(ok)
assertreturn node
},
new: func(node []byte) uint64 {
(BNode(node).nbytes() <= BTREE_PAGE_SIZE)
assert:= uint64(uintptr(unsafe.Pointer(&node[0])))
ptr (pages[ptr] == nil)
assert[ptr] = node
pagesreturn ptr
},
: func(ptr uint64) {
del(pages[ptr] != nil)
assertdelete(pages, ptr)
},
},
: map[string]string{},
ref: pages,
pages}
}
C.pages
is a map of allocated pages. It’s used to
validate pointers and read pages. The pointers are actually in-memory
pointers, and the B+tree code doesn’t care.
To test the B+tree, we first need to update it under various scenarios and then verify the result. The verification is generic, there are 2 things to verify:
- The structure is valid.
- Keys are sorted.
- Node sizes are within limits.
- The data matches a reference. We used a map to capture each update.
func (c *C) add(key string, val string) {
.tree.Insert([]byte(key), []byte(val))
c.ref[key] = val // reference data
c}
The test cases are left as an exercise. The next thing is B+tree on disk.