Log Structured Merge Trees

LSM trees are designed to achieve higher throughput and are used as the storage engine of various DB such as HBase, Cassandra, LevelDB, SQLite. As the name suggests, writes are made to log files in append-only mode. These logs files are merged and compacted in the background and indexed for efficient search.

To discuss how LSTM work’s let’s understand few basics:

  1. Log storage
  2. Segment
  3. Segment Merge
  4. Memtable
  5. SSTable
  6. WAL

Log storage – As the name suggests it is an append-only file where any new data is appended to the file. The file contains key-value pairs. Where key is the record key and value is the data. Since the file is append only, the log file can contain multiple records for the same key as an update to the existing the key.

Segment – Continuous appending to the log file, can make file size big and eventually running out of disk space. One good solution is to break log into segments of a certain size.

Segment Merge –  Since the files size are small these files can be merged at a regular time for compaction. The merging and compaction can be done as background thread while regular operations are performed. After the merging process is completed switch can be done to read from the new segments, replacing the old segments(which can be deleted). During the merge process multiple records with the same key are converged into a single record.


Source : Wikipedia

SSTable – Sorted String Table is a concept borrowed from Google BigTable which stores a set of immutable row fragments in sorted order based on row keys. It is a log file but with keys sorted.

Write Ahead Log – Write ahead log is a technique where the request received for write is first written into a separate file residing on the disk. The helps from crash recovery where the data from the file can be replayed and inserted.

Write Amplification – Log structured indexes also rewrite data multiple times due to repeated compaction and merging of SSTables. This effect of one write to the database resulting in multiple writes to the disk over course of the database’s lifetime – is known as write amplification.

How does read/write works for LGSTM?

  • When a write request comes, it is added to an in-memory balanced tree data structure (eg red-black tree). This in memory tree, is sometimes referred as memtable.
  • Along with request added to in-memory balanced tree the request is also logged in a WAL (on disk). The log helps to recover in case system or the process crashes.
  • When the memtable gets bigger than some threshold it is flushed to disk as a SSTable file. This is done efficiently because tree already maintains a sorted key-value pair. The becomes the most recent segment of the database. During the operation of flushing to the disk, write operation continues as normal. Even with compaction reads will still need to visit many files. Most implementations void this through the use of a Bloom filter. Bloom filters are a memory efficient way of working out whether a file contains a key.
  • To server the read request, first key is searched in memtable then the most recent on-disk segment and so on in the next older segment.
  • The merging and compaction of the segment happens in the background.

Advantages of  LSM

  • LSM trees are typically able to sustain higher write throughput than B-trees, partly because it has lower write amplification and partly because they sequentially write in the form of compact SST tables files rather than having to overwrite several pages in the tree.
  • LSM can be compressed better and thus often produce smaller files on disk than B-trees.
  • B-tree storage engines leave some disk space unused due to fragmentation: when a page is split or when a row cannot fit into an existing page. Since LSM are not page oriented and writes are sequential the storage overhead is low.

Disadvantage of LSM

  • The compaction process of LSM can sometime interfere with the performance of ongoing reads and writes. When writing to an empty database, the full disk bandwidth can be used for initial write but bigger the database gets, the more disk bandwidth is required for compaction.
  • An advantage of B-trees is that each key exists in exactly one place in the index whereas a log-structured storage engine may have multiple copies of the same key in different segments.

How does Https work?

We know that sending a data over the internet is like sending a box via courier with no digital lock and can be opened by anyone. (One can argue about hacking the lock but that’s not the problem we are trying to solve).

To send the data securely, https was invented which is nothing but

HTTP +SSL (Secure Socket Layer)


SSL uses cryptography to encrypt the text.

Let discuss the basics of cryptography.

1) Encryption – The process of converting a text to a random string is called Encryption. To convert the plain text to the random text, a key is used.
2) Key – The key is used to encrypt and decrypt the data.
3) Decryption – The process of converting a random string to text is called Decryption. To convert the random text to the plain text a key is used.

To encrypt the text A and decrypt back to text A the key should be common. The key is called symmetric key.

Symmetric Key An encryption system in which the sender and receiver of a message share a single, common key that is used to encrypt and decrypt the message. Symmetric-key systems are simpler and faster, but their main drawback is that the two parties must somehow exchange the key in a secure way.

Asymmetric key The keys are simply large numbers that have been paired together but are not identical (asymmetric). One key in the pair can be shared with everyone; it is called the public key. The other key in the pair is kept secret; it is called the private key. Either of the keys can be used to encrypt a message; the opposite key from the one used to encrypt the message is used for decryption.

Now we are aware of the basics of cryptography, lets look into how HTTPS works

  • When a user clicks on an https link browser makes a TCP connection on https port 443 with the server.
  • After a connection is successful SSL handshake starts between browser and server.

The series of exchange between server and client can be categorized into 3 groups.

  • Hello: Client sends a hello message which contains details such as Highest SSL version, Ciphers algorithm it supports, Compression algorithm, Random key – this is later used to generate a symmetric key. Server responds with a hello message containing SSL version, Cipher to be used, sessionid, random data – this data will later be used in generation of key.
  • Certificate Exchange: After server hello message, the server sends a digital certificate. The certificate contains the public key assigned for the browser. The certificate also helps to set the identity of the browser with the server. The digital signature on the certificate is someone vouching for the fact that a particular public key belongs to a particular individual or organization.In order to be trusted by the average web browser, certificates have to be signed by a trusted Certificate Authority (CA). CAs are companies that perform manual inspection and review, to make sure that the applying entity is both:
    1. a real person or business that exists in the public record
    2. in control of the domain, they’re applying for a signed certificate for

    Once the CA verifies that the applicant is real and really owns the domain, the CA will “sign” the site’s certificate, essentially putting their stamp of approval on the fact that this site’s public key really belongs to them and should be trusted. The browser comes preloaded with a list of trusted CAs.

  • Key Exchange: After receiving the digital certificate, the browser generates a symmetric key. It sends this key by encrypting it with the server public key. Since this message is encrypted using server public key, it can only be decrypted by its private key which only resides on the server.

Once the symmetric key is exchanged the browser can start interacting with the server by sending encrypted messages securely.