I have a program that is projected to use a few GB of LMDB disk space (it's a blockchain, and we're moving away from LevelDB because it lacks the ACID guarantees I need for some future plans). Is it possible to run that program with that database on a Raspberry Pi (with more than 1 GB of memory) without adding more swap, considering that adding swap is a task for advanced users?
Currently, when I run that program with mdb_env_set_mapsize(1 << 30), i.e. a 1 GB mapsize, it returns error 12, which is ENOMEM (out of memory). But it works if I reduce the size to 512 MB.
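For reference, this is roughly the sequence of calls involved; the "./db" path and the error printing are just for illustration:

```c
#include <stdio.h>
#include <lmdb.h>

int main(void)
{
    MDB_env *env;
    int rc;

    rc = mdb_env_create(&env);
    if (rc) {
        fprintf(stderr, "mdb_env_create: %s\n", mdb_strerror(rc));
        return 1;
    }

    /* 1 GB mapsize; the mmap itself happens inside mdb_env_open,
       which is where error 12 (ENOMEM) shows up on the Pi */
    rc = mdb_env_set_mapsize(env, 1UL << 30);
    if (rc == 0)
        rc = mdb_env_open(env, "./db", 0, 0664);  /* "./db" must exist */
    if (rc)
        fprintf(stderr, "open failed: %s\n", mdb_strerror(rc));

    mdb_env_close(env);
    return rc ? 1 : 0;
}
```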
So what's the right way to handle such memory issues in LMDB when the database size keeps increasing?
Answer
The maximum size of memory that can be memory-mapped depends on the size of the virtual address space, which is dictated by the CPU's memory management unit. A 32-bit CPU gives each process a 4 GB virtual address space. Without PAE, that same 4 GB also caps physical memory for the whole system; with PAE, the system can address more physical memory in total, but each individual process still sees at most 4 GB of virtual addresses.
In addition to this, the kernel and your application reserve parts of that address space for themselves (on 32-bit Linux the default split leaves 3 GB for user space), and a memory mapping needs a contiguous range of free virtual addresses, which further reduces what is available for the database.
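One way to see this for yourself is to probe how large a contiguous anonymous mapping mmap will currently grant the process. A rough sketch, assuming Linux; the halving loop and the 1 MB floor are arbitrary choices:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t size;

    /* try 2 GB, 1 GB, 512 MB, ... down to 1 MB and report the first
       size for which a single contiguous mapping succeeds */
    for (size = (size_t)1 << 31; size >= (size_t)1 << 20; size >>= 1) {
        void *p = mmap(NULL, size, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p != MAP_FAILED) {
            munmap(p, size);
            printf("got a contiguous mapping of %zu MB\n", size >> 20);
            return 0;
        }
    }
    printf("no mapping of 1 MB or more succeeded\n");
    return 1;
}
```

Run inside your actual program rather than as a standalone toy, this shows how much contiguous space is left after the kernel, libraries, heap, and stack have taken their share.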
So your users will need to either enable PAE on their system or upgrade to a 64-bit CPU and OS. If neither is an option for your application, then you cannot use a memory-mapped file larger than your available address space, so you'll have to do some segmentation: split your data into multiple files and map only small chunks at a time. As far as I know, LMDB requires that the entire database file be mapped into memory.
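If you do go the segmentation route with LMDB itself, one possibility is one LMDB environment per fixed-size segment of the log, opened on demand with a small mapsize and closed again before the next segment is needed. A sketch of the idea; the segment naming, the 256 MB figure, and the open_segment helper are all invented for illustration:

```c
#include <stdio.h>
#include <lmdb.h>

/* hypothetical helper: open the environment holding one fixed-size
   segment of the log; the ./segments directory must already exist */
static int open_segment(unsigned segno, MDB_env **env)
{
    char path[64];
    int rc;

    snprintf(path, sizeof path, "./segments/seg-%05u", segno);

    rc = mdb_env_create(env);
    if (rc)
        return rc;

    /* keep each segment's map small enough to fit comfortably
       in a 32-bit address space */
    rc = mdb_env_set_mapsize(*env, 256UL << 20);  /* 256 MB */
    if (rc == 0)
        rc = mdb_env_open(*env, path, MDB_NOSUBDIR, 0664);
    if (rc)
        mdb_env_close(*env);
    return rc;
}
```

With at most one or two segments open at a time, only a few hundred MB of address space is ever committed to the database.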
For a blockchain application, the data is mostly a linear sequence of log entries, so the application should only need to work with the most recent entries most of the time. You can separate the recent entries into their own working file, and keep the rest of the log either in a database that doesn't require mapping the entire file into memory, or in multiple fixed-size files that you map and unmap as needed.
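To make the second option concrete, the archive side could be plain fixed-size files where a read maps only the chunk containing the wanted entry. Everything below is a sketch under stated assumptions (64 MB chunks, entries that never straddle a chunk boundary, files padded to a whole number of chunks), not anything LMDB itself provides:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE (64UL << 20)   /* 64 MB windows; size is arbitrary */

/* copy len bytes at offset in path into buf, mapping only the
   surrounding chunk instead of the whole archive file */
static int read_entry(const char *path, off_t offset, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t base = offset - (offset % (off_t)CHUNK_SIZE);  /* chunk-aligned */
    void *map = mmap(NULL, CHUNK_SIZE, PROT_READ, MAP_PRIVATE, fd, base);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    memcpy(buf, (char *)map + (offset - base), len);
    munmap(map, CHUNK_SIZE);
    return 0;
}
```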