B-Trees
A B-tree is a balanced tree designed to be stored on a disk. Disks can only read and write (large)fixed-size blocks of data at once. A B-tree stores multiple keys in each of its nodes so that (1) a single diskread can access a lot of keys, and (2) the branching factor of the tree is very high (in practice, 1000+), sothat a tree of small height can store a huge number of keys, any of which can be accessed with just a fewdisk operations
Formally, a B-tree is a rooted treeT(with root root[T]), each of whose nodes can contain a variablenumber of keys. and the following properties:
- Each nodexof the tree contains the following data:
- n(x) – the number of keys in nodex
- leaf(x) – a boolean, true iffxis a leaf of the tree
- keysk1(x). . . kn(x)(x), stored in sorted order
- child pointersc1(x). . . cn(x)+1(x). Note that a non-leaf node withnkeysalwayshasn+1 children.
- The equivalent to the BST property for B-trees is as follows. Letci(x) be the pointer to the subtreethat lies between keyski−1(x) andki(x). Then (assuming no duplicate keys) every key in the subtreerooted atci(x) lies strictly betweenki−1(x) andki(x).
For example, ifk2(x) = 5, andk3(x) = 17, all the keys in the subtree rooted atc3(x) would lie between6 and 16. The keys in the subtree rootedc1(x) are all< k1(x), while those in the subtree rooted atcn(x)+1(x)are all> kn(x)(x). 3. A B-tree has some heighth(which we define as the total number of levelsincluding the root). Everyleaf of the tree is at the same depthh. 4. A B-tree is parameterized by a valuet≥2, called itsminimum degree. Every B-tree with minimumdegreetmust satisfy the following twpdegree invariants: (a)min-degree: Each node of the treeexcept the rootmust contain at leastt−1 keys (and hencemust have at leasttchildren if it is not a leaf). The root must have at least 1 key (two children). (b)max-degree: Each node of the tree must contain no more than 2t−1 keys. A node with exactly2t−1 keys is called “full.” (In practice, this limit derives from the size of the disk block used tostore a node. Note that 2t−1 is always an odd number.)