Building a database from scratch - 2
In Part 1 we looked into why you might consider building a database from scratch, and the architecture and requirements for the database we’re going to build in this series.
In this part of the series, we dive into Data Structures and Object Storage.
Let’s get started!
Let’s take a brief moment to reflect on what we discussed in Part 1. Our requirements for a specific component in a larger system led us to the conclusion that a clean solution would be to effectively build a custom PubSub-esque database that receives UDP packets containing samples in real-time and ships aggregated blocks upstream periodically to a RESTful API via HTTP.
Because we’re working with what is essentially a binary key-value store, we need to decide how to pack our objects into binary blobs for storage.
Binary Blobs
If we look at other solutions in the space, we see that MongoDB uses BSON (which describes itself as a “Binary JSON” format), and some others use solutions built around protobuf (which for some inexplicable reason refers to itself as “XML, but smaller, faster and simpler”). Neither of the self-descriptions is even remotely accurate, but ignoring that, what are they doing?
Protobuf
Well, in the case of protobuf, it takes what is essentially a C struct definition, generates a custom packing format for each object defined, and provides an ORM-like interface to those objects. This is pretty cool if you want a flat object, but things get complicated if you need a multi-dimensional array or nested objects.
BSON
BSON, however, defines a more general key-value document, where the type, name, and data are packed together. This allows a document to be loaded with zero prior knowledge of its format while enabling both nested documents and multi-dimensional arrays, which is awesome! However, this additional complexity means it is considerably slower to parse and pack, and uses significantly more space.
So how are we going to progress? Well, we’re in the fortunate position of having complete control over this project, as we’re building both the process that stores the data and the process that reads it. So we can go a little lower-level and define everything directly in Python’s ctypes. This gives us even better speed and size than protobuf while providing flexibility similar to BSON’s. Nothing is free, though, so we’re paying the cost of reduced flexibility in areas such as automated migrations from an older format to a newer one.
Building the structures
So let’s get coding. The object we’re going to store, if represented as a dictionary, would look like this:
```json
{
  "ident": "37d5edb2-ebce-429f-82bb-631f161f0ab5",
  "sample": 0.05570,
  "timestamp": 1626013792.467065
}
```
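As an aside (this producer is a hypothetical sketch, not part of the series’ code), a record like this can be generated with just the standard library. Note that a stringified UUID is always 36 characters, which matters for the fixed-length string type we define below:

```python
import time
import uuid

# Hypothetical producer of a sample record, for illustration only.
record = {
    "ident": str(uuid.uuid4()),  # always 36 characters
    "sample": 0.05570,
    "timestamp": time.time(),
}
```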
The first thing we need to do is import a few things from `ctypes`. In this case, we need to store `float` objects (the timestamp and the sample) and a `string` object (the ident). The floats are easy, as this is a native type in C, namely `float`, but in this case we’ll make it a `double` in case we need the extra precision in the future. The string is a little more complicated, as in C a string is defined as an array of individual `char` (byte) objects. So to get started, we need to import the `Structure` base class, the `Array` base class, `c_double`, and `c_char`.
```python
from ctypes import Structure, Array, c_double, c_char
```
To build our string analogue we need to define an array with a fixed length (much like a fixed-width column in a SQL database) and a fixed content type (mixed-type arrays are supported but more complicated, as you need to define `union` types).
```python
class Ident(Array):
    _length_ = 36  # a stringified UUID is exactly 36 characters
    _type_ = c_char
```
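A quick illustration of how this array behaves (this check is an addition, not from the original article; ctypes char arrays expose `.value` and `.raw` accessors):

```python
ident = Ident()
ident.value = b"37d5edb2-ebce-429f-82bb-631f161f0ab5"
print(ident.value)     # b'37d5edb2-ebce-429f-82bb-631f161f0ab5'
print(len(ident.raw))  # 36, the fixed length regardless of content
```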
Next, define the object structure:
```python
class Sample(Structure):
    _fields_ = [
        ('ident', Ident),
        ('sample', c_double),
        ('timestamp', c_double)
    ]
```
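As a sanity check (an addition to the original walkthrough), we can ask ctypes how big a packed `Sample` is. The raw field sizes sum to 36 + 8 + 8 = 52 bytes, but note that on most 64-bit platforms the default alignment rules pad the struct; adding `_pack_ = 1` to the class would force the unpadded 52-byte layout:

```python
from ctypes import sizeof

print(sizeof(Ident))     # 36
print(sizeof(c_double))  # 8
print(sizeof(Sample))    # 52 plus any platform alignment padding
```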
At this point, you could technically get started using LMDB, but at some point you’re going to want a plaintext representation so you can easily check that things are working as expected. So let’s define a mixin class that adds the property `as_dict`. This property will parse the object we just defined and convert it into a regular Python dictionary:
```python
class AsDictMixin:
    @property
    def as_dict(self):
        d = {}
        for (key, _) in self._fields_:
            value = getattr(self, key)
            if isinstance(value, AsDictMixin):
                # Nested structure: recurse into its own as_dict
                d[key] = value.as_dict
            elif isinstance(value, bytes):
                # Char arrays come back as bytes; decode them to str
                d[key] = value.decode()
            else:
                d[key] = value
        return d
```
The `as_dict` property iterates over the `_fields_` list, identifies what each value represents, then copies it into the returned dictionary `d`, making any necessary conversions along the way. You’ll notice that we look for a `bytes` object here and `decode` it as UTF-8. This is because, while Python doesn’t know how to define our string analogue on its own (as it needs the length), it can do the conversions for us once we’ve defined it.
To use this mixin class, simply add it after `Structure` in the list of base classes. Finally, we need to tell Python how to build the string representation of our object by defining the `__repr__` magic method. So now our `Sample` object should look like this:
```python
class Sample(Structure, AsDictMixin):
    _fields_ = [
        ('ident', Ident),
        ('sample', c_double),
        ('timestamp', c_double)
    ]

    def __repr__(self):
        fields = ', '.join(f"{key}={val}" for key, val in self.as_dict.items())
        return f"{self.__class__.__name__}({fields})"
```
Writing and Reading
Okay, so we’ve got our data structure now, how do we actually use it?
The first thing we need to do is open an `Environment` (what LMDB calls its primary database instance). Let’s start by defining a 1GiB environment; given that each record is 52 bytes [1], this means we’d be able to store approximately 20 million records.
```python
import lmdb

Ki = 1024
Mi = 1024 * Ki
Gi = 1024 * Mi

env = lmdb.Environment("example.lmdb", map_size=Gi, subdir=True, readonly=False, metasync=True, sync=True,
                       map_async=False, mode=0o755, create=True, readahead=True, writemap=True, meminit=True,
                       max_readers=126, max_dbs=2, max_spare_txns=1, lock=True)
```
You can find out what each of the arguments does over at https://lmdb.readthedocs.io/en/release/.
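If you want to verify the environment settings at runtime, py-lmdb also exposes a couple of introspection helpers (shown here as a quick aside):

```python
print(env.info())  # map_size, last_pgno, last_txnid, max_readers, num_readers
print(env.stat())  # page size, B-tree depth, page counts, number of entries
```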
Now, to read or write to the LMDB environment, you need a transaction.
In the code above, where we opened the environment, we set `max_readers` to 126, meaning we can have at most 126 concurrent read transactions open. LMDB is copy-on-write (COW): each read transaction holds a consistent snapshot of the database, so the longer you keep read transactions open, the more old pages LMDB has to retain, and the more storage and RAM you consume.
Write transactions, however, are serialised, so if a write transaction is already open, the next thread/task/process that wants one has to wait in the queue for the write lock. This is why it’s imperative to use context managers to ensure that the write transaction gets closed properly (even in the case of an unexpected failure); failing to do so can result in a deadlock. Given this limitation (which is technically present in all ACID databases), it’s important to consider carefully where you open your write transactions.
To open a transaction we call `env.begin()`; for a write transaction we supply the argument `write=True`. So let’s open a write transaction and write our first object:
ident = "37d5edb2-ebce-429f-82bb-631f161f0ab5"
measurement = 0.05570
ts = 1626013792.467065
input_sample = Sample(ident=ident.encode(), sample=measurement, timestamp=ts)
with env.begin(write=True) as txn:
txn.put(input_sample.ident, bytes(input_sample))
So what happened here? We created an instance of our ctypes struct `Sample`, then opened the write transaction, converted the object to bytes, and stored that in the database using the `ident` field of the object as the database key.
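To make the earlier point about context managers concrete, here’s a sketch (not from the original article) showing that an exception inside the `with` block aborts the transaction and releases the write lock instead of deadlocking:

```python
try:
    with env.begin(write=True) as txn:
        txn.put(b"some-key", b"some-value")
        raise RuntimeError("simulated failure")
except RuntimeError:
    # The context manager aborted the transaction on the way out,
    # so the write lock is free for the next writer.
    pass
```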
Let’s read this object back:
```python
with env.begin() as txn:
    raw_bytes = txn.get(ident.encode())

output_sample = Sample.from_buffer_copy(raw_bytes)
```
Here we opened a read transaction, collected the raw bytes that we stored earlier, and used the `from_buffer_copy` method inherited from the ctypes `Structure` class to create an instance of the object from our raw data.
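As a final check (an addition to the walkthrough), we can confirm the round trip preserved our data:

```python
print(output_sample)
# Sample(ident=37d5edb2-ebce-429f-82bb-631f161f0ab5, sample=0.0557, timestamp=1626013792.467065)
assert output_sample.as_dict == input_sample.as_dict
```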
The code
If you want to take a look at the final code from this part of the tutorial, it’s available here.