LimDB

Fast, in-process key-value store with a table-like interface persisted to disk using lmdb.

Why?

Memory-mapped files are one of the fastest ways to store data but are not safe to access concurrently. Lmdb is a proven and mature solution to that problem, offering full ACID compliance, a common standard for database reliability, while keeping most of the speed.

Leveraging the excellent nim-lmdb interface, LimDB makes a largish subset of lmdb features available in an interface familiar to Nim users who have experience with a table.

While programming with LimDB feels like using a table, it is still very much lmdb. Some common boilerplate is automated and LimDB is clever about bundling lmdb's moving parts, but there are no bolted-on bits or obscuring of lmdb's behavior.

What's new?

Since version 0.2, LimDB has added:

  • withTransaction code block syntax for safe transactions with overridable readonly/readwrite auto-selection
  • Support for all Nim system types except ref
  • convenient initialization syntax for multiple databases of different types
  • transactions spanning multiple databases
  • ultra-shorthand syntax for quick throwaway programs
  • readonly transactions (0.2 used only writable ones)

Simple Usage

Provide LimDB with a local storage directory and then use it like you would use a table. After inserting the element, it's on disk and can be accessed even after the program is restarted, or concurrently by different threads or processes.

import limdb
let db = initDatabase("myDirectory")
db["foo"] = "bar"  # that's it, foo -> bar is now on disk
echo db["foo"]     # prints bar

Now, if you comment out the write, you can run the program again and read the value from disk.

That's it. If you just need to quickly save some data, you can stop reading here and start programming.

Transactions

If you have more than one read or write to do, it is usually a good idea to group them all into a so-called "transaction", because:

  • Your data will not change between different reads, even if there are unrelated writes going on
  • All writes will be done if successful, none if there is an error

This ensures consistency.

Transactions in LimDB are done using a simple block structure.

If there is an exception raised in your code, the writes in the block don't happen at all.

You can use that on purpose if you're not sure if everything you are going to write will be valid, for example when interacting with a user through a form.
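The block structure and the all-or-nothing behavior described above can be sketched as follows. This is a minimal sketch that uses only the withTransaction form shown elsewhere in this document. It assumes that an exception raised inside the block propagates to the caller after the writes are discarded; the error message is made up for illustration.

```nim
import limdb

let db = initDatabase("myDirectory")

# Both writes are applied together when the block finishes normally.
db.withTransaction t:
  t["foo"] = "bar"
  t["fuz"] = "buz"

# An exception inside the block discards the writes made in it.
# Assumption: the exception reaches the caller after the rollback.
try:
  db.withTransaction t:
    t["foo"] = "changed"
    raise newException(ValueError, "form input was invalid")
except ValueError:
  echo "nothing was written"

echo db["foo"]  # still "bar" if the rollback behaves as described above
```

Raising on invalid input is a simple way to validate a whole form before any of it reaches the disk.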

Data Types

By default, keys and values are strings, but you can use any Nim system data type except ref.

Add a tuple to use separate types for the keys and values.

Or just a type if both are the same.

Objects and named or unnamed tuples work fine as long as they don't contain a ref.
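As a rough sketch of the two forms described above, the type specification is assumed to go into the second argument of initDatabase, as in the named-database examples in this document. The directory and variable names are hypothetical. Each database gets its own directory so that the default database is never opened with two different type combinations.

```nim
import limdb

# Separate key and value types: int keys, float values.
let scores = initDatabase("scoresDirectory", (int, float))
scores[3] = 3.3
echo scores[3]

# One type used for both keys and values.
let links = initDatabase("linksDirectory", int)
links[1] = 2
echo links[1]
```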

It's also possible to serialize objects to string and store them like that, if you prefer.

See Custom Data Types below if you want to natively add your own.

Caution!

It is recommended to hard-code the data types and the database if possible, making sure each database is only used with the data types that were already written to it. Confusing them can lead to garbage output or data loss.

Named Databases

If you need more than one database, you can put many in the same directory and refer to them by name.

The default database, the one used in the examples above, also has a name, an empty string "", but it should only be used if it's the only one.

Use a named tuple to provide names and types for the databases you want. You will get back a named tuple with the same keys containing your database objects.

import limdb

let db = initDatabase("myDirectory", (foo: int, bar: (float, string)))

db.foo[1] = 15
db.bar[5.5] = "fuz"

Note

If you already stored data in the default database, and now want to use named databases, migrate your data to a named database before adding more because the default database is used internally in this case.

Multi-Database Transactions

If you need to make consistent reads and/or writes to several databases, you can give withTransaction a tuple containing database objects. It can be one you got from initDatabase, or you can make your own.

A tuple containing a transaction object for each database will be placed into the transaction variable that you can use in the block to make changes, just like with the single database transaction above.

import limdb

let db = initDatabase("myDirectory", (foo: int, bar: (int, string), fuz: float))

db.withTransaction t:
  t.foo[1] = 12
  t.bar[2] = "buz"
  t.fuz[3.3] = 4.4

(db.foo, db.fuz).withTransaction t:
  t[0][2] = 3
  t[1][4.4] = 5.5

(a: db.bar, b: db.fuz).withTransaction t:
  t.a[3] = "fizz"
  t.b[6.6] = 8.8

Ultra-Shorthand

If you want to use a quick shorthand at the expense of some code readability, call tx instead of withTransaction t. Your transaction or transactions will be placed into a tx variable.

import limdb

let db = initDatabase("myDirectory")

db.tx:
  tx["foo"] = "bar"
  tx["fuz"] = "buz"
  echo tx["foo"]

db.tx:
  echo tx["fuz"]

Note

The LimDB author recommends using this for quick throwaway code and exploratory programming, renaming to the more verbose withTransaction as programs get longer and mature.

Explicit Read/Write

By default, LimDB looks into your withTransaction or tx block and checks whether there are any write calls in it, choosing readwrite or readonly mode accordingly.

If you want to make it clear a code block will not make any database changes, you can use an explicit readonly transaction.

import limdb

let db = initDatabase("myDirectory")
db["foo"] = "bar"

db.withTransaction readonly as t:
  echo t["foo"]
  t["fuz"] = "buz"  # raises IOError

db.tx ro:
  echo tx["foo"]
  tx["fuz"] = "buz"  # raises IOError

If you really want a readwrite transaction that doesn't write for some reason, you can have it.
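A minimal sketch of what that might look like, assuming the readwrite keyword (which this document uses with initTransaction) is accepted by withTransaction in the same position as readonly:

```nim
import limdb

let db = initDatabase("myDirectory")
db["foo"] = "bar"

# Explicit readwrite mode, mirroring the `readonly as t` form above.
# Assumption: `readwrite` is accepted here just like `readonly`.
db.withTransaction readwrite as t:
  echo t["foo"]  # only reads, but the transaction is opened writable
```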

Note

Automatic selection of the transaction mode requires Nim 1.4 or greater. On Nim 1.2 or lower, transaction blocks are readwrite by default, so if you are sticking to an older Nim version, use explicit readonly blocks to get a performance benefit.

Iterators

While you can access any data using the keys, you might want all of the data or not know the keys. You can use the usual keys, values and pairs iterators with a LimDB database. They can be used standalone on a database or as part of a transaction.

You can also use mvalues and mpairs to modify values on the go.

import limdb
import strutils  # provides the % format operator used below

let db = initDatabase[string, string]("myDirectory")
db.withTransaction t:
  t["foo"] = "bar"
  t["fuz"] = "buz"

# iterate directly on the database
for key in db.keys:
  echo key
# prints:
# foo
# fuz

# iterate as part of a transaction
db.withTransaction t:
  for value in t.values:
    echo value
# prints:
# bar
# buz

for key, value in db:
  echo "$# -> $#" % [key, value]

# prints:
# foo -> bar
# fuz -> buz

# modify values in place, selecting by value
for value in db.mvalues:
  if value == "buz":
    value = "buzz"

# modify values in place, selecting by key
db.withTransaction t:
  for key, value in t.mpairs:
    if key == "foo":
      value = "barz"

for key, value in db:
  echo "$# -> $#" % [key, value]

# prints:
# foo -> barz
# fuz -> buzz

Derived database

For many use cases, a single centralized call to initDatabase for the whole program is a nice, readable and safe way of setting up your read and write needs, and may be all you need.

Sometimes you might still prefer or need to open databases as you go along.

You can get more database objects (or tuples of several) from existing ones by calling initDatabase again, passing an existing database instead of a directory on disk.
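The following is only a sketch of the idea. It assumes the derived call accepts the same type specification as the directory-based call, with the existing database object taking the place of the directory. The database name baz and the variable more are hypothetical.

```nim
import limdb

# One central call opens the first set of named databases.
let db = initDatabase("myDirectory", (foo: int, bar: (float, string)))
db.foo[1] = 15

# Later on, derive a further named database from an existing one.
# Assumption: the existing database replaces the directory argument.
let more = initDatabase(db.foo, (baz: float))
more.baz[1.5] = 2.5
```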

Caution!

It's harder to make sure you open each named database with the right types when deriving databases, especially programmatically or at run-time. This can cause garbage output or data corruption, so use with care.

Custom data types

If you need different data types, the simplest way is to convert them to a supported data type before entering them and after retrieving them.

If you have complex data structures, you can also use your favorite serialization library to serialize them to string before saving them as key or value.

If you want to have more syntactic convenience, you can add your own types to LimDB by implementing toBlob and fromBlob as a proc or template.

The safe-and-easy way is to pre-process your type into one of the data types supported by LimDB. This is mainly for convenience; it doesn't run any faster than converting manually.

You can also implement your type manually for more speed and control. In this case, you also need to supply a compare template or procedure that returns -1 if the a argument is smaller than the b argument, 1 if the a argument is larger, or 0 if they are equal.

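template toBlob(a: MyType): Blob =
  # hand lmdb a pointer to the value's bytes without copying
  Blob(mvSize: sizeof(a), mvData: cast[pointer](a.addr))

proc fromBlob(b: Blob): MyType =
  # copy the stored bytes back into a MyType value
  result = cast[ptr MyType](b.mvData)[]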

proc compare(a, b: MyType): int =
  # assuming here that <, > and == are implemented for MyType
  if a < b:
    -1
  elif a > b:
    1
  else:
    0

let db = initDatabase[string, DateTime]("myDirectory")
db["now"] = now()

echo db["now"].fromUnixTime  # prints datetime

Caution!

You are responsible for ensuring memory safety if you work with Blob types directly.

Manual transactions

If you want more control, you can begin, commit and reset transactions manually.

If you call initTransaction and then reset it later, that's equivalent to calling a withTransaction block in readonly mode.

If you call initTransaction and then commit it later, that's equivalent to calling a withTransaction block in readwrite mode.

Transactions are in readwrite mode by default, but can be set readonly for much better performance.

import limdb
let db = initDatabase("myDirectory")

# readwrite is the default mode
let t1 = db.initTransaction
t1["foo"] = "bar"
t1["fuz"] = "buz"
t1.commit()

# readwrite can be set explicitly
let t2 = db.initTransaction readwrite
t2["foo"] = "another bar"
t2["fuz"] = "another buz"
t2.reset()  # foo and fuz remain unchanged

# readonly transaction
let t3 = db.initTransaction readonly
echo t3["foo"]
echo t3["fuz"]
t3.reset()  # reset readonly transactions when done

Caution!

You need to reset or commit readwrite transactions immediately after writing, or all further readwrite transactions will block forever.

Readonly transactions are more forgiving but still eventually need to be reset to avoid resource leak.

It's usually safer and more convenient to use the withTransaction syntax instead.
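If you do manage a transaction manually, one way to make sure it is always released is to pair it with try/finally. This is a minimal sketch that uses only the initTransaction and reset calls shown above; the pattern itself is a suggestion, not something LimDB prescribes.

```nim
import limdb

let db = initDatabase("myDirectory")
db["foo"] = "bar"

let t = db.initTransaction readonly
try:
  echo t["foo"]
finally:
  t.reset()  # runs even if the read above raises
```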

Deployment

LimDB relies on nim-lmdb for low-level calls, which in turn uses dynamic linking. Static linking or compiling the C sources in is not supported.

Linux

The resulting program will depend on liblmdb0.so, which you can install using the system's package manager and should declare as a dependency when you distribute your program.

Windows

The program will depend on liblmdb.dll. A working 64-bit version taken from msys2 can be downloaded from this project and should be placed in the same directory as your executable.

OSX

The program depends on liblmdb.dylib. The easiest way to get it is to install it via homebrew.

To distribute it with your program, you can change its baked-in location to the binary file directory like so:

install_name_tool -id "@loader_path/liblmdb.dylib" liblmdb.dylib

For other systems, run the program to find out the file name of the required library, then build or install it for that platform.

Improvement Areas Of Interest

  • Patch nim-lmdb to allow static linking and including the C sources
  • Allow auto-unpacking of multi-database transaction variables, e.g. (db1, db2).withTransaction t1, t2 readonly
  • Document how many copies are made when accessing and writing; there aren't many, and no more than in LMDB code in C
  • Useful iterators: keysFrom, keysBetween, other common usage of lmdb cursors
  • Map lmdb's multiple-values-per-key feature to something Nimish, perhaps iterators or seqs

Migrating from 0.2

0.2 code works unchanged and performance is improved for reads directly on the database object.

For transactions, it's recommended to use the new readonly parameter for initTransaction calls where appropriate.

Consider switching to the safer withTransaction syntax.

If you still need them, the 0.2 docs are still available.

Why is it called LimDB?

LimDB was originally named LimrodDB after the ancient king Nimrod's younger sibling, Limrod, who didn't make it into the history books because he was short. It was later renamed LimDB for marketing reasons.

By a wild coincidence, it also sounds a little like a vaguely pleasing jumble of Nim and LMDB.