17 Jan 2017

Exploring LMDB

This week I'm continuing my tour of interesting C libraries by writing a Scheme binding to OpenLDAP's Lightning Memory-mapped Database (LMDB). Here's a quick overview of LMDB, and a few things I liked about it.

What is LMDB?

LMDB is a key-value store written by Howard Chu for the OpenLDAP project. LDAP workloads are usually very read-heavy, so the design of LMDB is read-optimised (rather than write-optimised). Depending on your workloads and the shape of your data, you might want to use this as an alternative to LevelDB, TokyoCabinet, BerkleyDB or other embedded key-value stores.

What I liked

  • It's written in C (not C++ like LevelDB)
    • I find C APIs easier to work with, and much easier to write Scheme bindings for
  • Ordered map interface
    • keys are ordered lexicographically by default
    • custom comparator functions available
  • Reader/writer transactions
    • This can be really convenient over LevelDB's atomic updates using batches
    • Read transactions are very cheap and don't block writers
  • Support for multiple named databases in an LMDB environment
    • This is something I'd often fake in LevelDB using sublevel (in Node, or my version in Scheme), but this handles 90% of my sublevel needs
    • A transaction can cover all named databases
  • LMDB environments may be opened by multiple processes on the same host
    • Great for Node, Python, CHICKEN Scheme etc. where the easiest way to handle parallel work is to spawn multiple processes
  • Ability to store sorted duplicates
    • You have the option of storing multiple sorted values per key (configured when opening a named database)
    • I wasn't sure what I'd use this for at first, but found some use cases surprisingly quickly. Clearly, I'm evaluating key-value stores from a 'Blub programmer' perspective.
  • Memory mapped, allowing zero copy lookup and iteration
    • I haven't exposed this in my Scheme bindings yet, but it's easy to do
  • Maintenance requires no background threads
  • Fast program start-up times
    • Database load times are very fast, which makes it a good candidate for the small command-line programs I occasionally like to write. I've also found Sqlite to be good. LevelDB introduces a more noticeable start-up delay (at least in my compiled Scheme programs).
  • Surprisingly good write performance for a read-optimised database
    • While the Copy-On-Write B+tree design of LMDB can't compete with the Log-structured merge (LSM) tree design of databases like LevelDB for high insert volume, in my small (and probably not representative) insert tests I've found it to be surprisingly quick.
    • On very large values you may see better performance from LMDB due to lower write amplification. In an LSM design like LevelDB the full size of most data is written once to the log, then again to the db. In a COW design like LMDB it must make copies of every page in the path from the leaf back to the root of the tree (the log of the number of keys), plus write the data. If the data is sufficiently large, this actually becomes the more economical trade-off.

Oddities

  • Databases have a max size
    • You set the memory map size manually - it defaults to some surprisingly small value like 1MB, which is probably good because it alerts you to this and makes you think about the appropriate value early on
    • It can be increased dynamically by calling mdb_env_set_mapsize, and this change will take effect in other processes too
    • It cannot be decreased (without copying the data to a new database)
  • Databases only grow, never shrink
    • This is a restriction of LMDB, but if you think about it, it's generally true in practice as well - databases rarely shrink!
    • Having worked with CouchDB and ZODB I'm fairly comfortable with databases that like to increase in size
    • Like CouchDB and ZODB, LMDB can use compaction to write out a smaller version of a database
    • Unlike CouchDB, LMDB can reclaim the space used by deleted data - that makes it easier to use in high-churn situations. LMDB does not suffer from the same growth problem as CouchDB, where a delete would increase the database size (until compaction)

Disclaimer

I've only written some Scheme bindings and toy programs, and these are my notes so far. All I'm saying is that LMDB has some interesting characteristics, and maybe you should consider it's suitability for your own project. Note a distinct lack of benchmarks on this page. They would be pointless coming from me.

Bindings as exploration

I've written several CHICKEN Scheme bindings now (LevelDB, CommonMark, WiringPi, plus others yet to be released) and I cannot recommend it enough. It's a fun way to explore interesting open-source projects (including all the nooks and crannies of the API), and learn about the language you use at the same time (FFI's are where many programmers first meet language internals). All while producing something for useful your community, which is relatively small, and achievable in a few evenings. Give it a try!

References