0x370 Database

This page will be used to describe the concepts and implementations of software.


This section summarizing databases related topics. Most of contents here are my note from the database course.

Data models


  • SQL: Relational database
  • NoSQL: Key/Value, Graph, Document, Column-family, Array/Matrix


  • fast for simple queries
  • easy to scale horizontally

Document Model

Document models are to store all information for a single object in the same place. Encoding examples are XML and JSON. There are schemaless or schema-on-read

When should use document model :

  • self-contained
  • no reference
  • no duplication
  • tree structure (one-to-many relationship)


Relational Model

Relational models are to represent data as an unordered set of tuples and define operational primitives over the set.

Relational model is independent of query language implementation

When should use relation model

  • many-to-many relationship

Relation structure

define structure how to represent data in the relational model

  • relation is an unordered set that contain relationship of attributes (tuples)
  • A tuple is a set of attribute values
  • A relation’s primary key uniquely identifies a single tuple
  • A foreign key specifies that attribute

Relational algebra

define interface how to retrieve and manipulate tuples in a relation. Input is relation, output is relation

  • select: choose a subset of the tuples from a relation that satisfies a selection predicate
  • projection: generate a relation contains specified attributes
  • union, intersection, difference, product
  • join: generate a relation that contains all tuples that are a combination of two tuples with common attributes

Storage Management

One solution to manage storage is using file mapping (mmap), it works well for reading, but not good when there are multiple writers. Instead, database tends to manage storage by themselves using database pages (different page size for different db)

Heap file

A heap file is an unordered collection of pages. There are two styles of heap file.

  • Linked List: there is a header managing pointers to data pages and free pages
  • Page Directory: track the location and free slots for each page

Page Layout

Each page contains a header (page size, checksum, version…). There are two types of layouts: Tuple oriented layout or log oriented layout.

Tuple Oriented (Slotted Tuples)


Reference: CMU 15445 Lecture 3

Log oriented

append the new log to pages (hbase, cassandra, leveldb …)



  • build index
  • compaction

Tuple Storage

N-ary Storage Model (Row model)

Reference: CMU 15445 Lecture 4

Decomposition Storage Model (Column Model)

Reference: CMU 15445 Lecture 4

BigQuery is an example of this type. (For example, BigQuery processed data looks like to be proportional to the number of columns: select one column usually cost much less data compared with select all columns)

Large value:

  • make them into overflow page and store pointer
  • make them into external file and store path

Memory Management

Reference: CMU 15445 Lecture 5

Buffer Pool is a large memory range as an array of fixed size pages. An array entry (slot) is called a frame. DBMS may have multiple buffer pools.

Page table keep tracks of pages that are currently in buffer pool, also maintains meta-data per page such as dirty flag, pin, reference counter for concurrency issue.


  • scan sharing: decide which page to evict based on access pattern of multiple queries
  • pre-fetching: prefetch page based on query
  • buffer pool bypass: bypass memory pool to reduce overhead
  • OS page cache: O_DIRECT to bypass OS cache (one exception is postgre)

Buffer Replacement Policies

The followings are the most common policies. Enterprise db usually have much better policies than open source db

  • LRU (Least Recently Used): maintain a timestamp, evict pages by timestamp order. The disadvantage of LRU is the sequential flooding for sequential scan, the most recent page is the most useless.
  • clock: approximation of LRU, set to 1 when a page accessed, evict page whose when it is 0. Same problem as LRU
  • LRU-K: take into account history of the last K references as timestamps and compute the interval between subsequent accesses, the interval helps to predict when the page will get accessed next time. Typical K is 1.
  • Localization: restrict memory to its private buffer rather than polluting all memory.
  • Priority Hints: provide hints whether a page is important or not.

Dirty pages need to be taken account (e.g: background writing)


file-based serverless database. SQLite does not compete with client/server databases. SQLite competes with fopen().