Alternative to writing many files. MongoDB?

I have a Perl script that produces ~10000 files in the 1 kB – 10 kB size range, which is not optimal for performance, so I thought about using MongoDB instead of writing the many files. I need to run the script on my laptops (Linux and OS X).

Question

Would MongoDB be overkill? Or is there something better suited for this sort of local file storage?


Answer

Some file systems deal with tens of thousands of small files better than others. Reputedly, BTRFS or ReiserFS used to cope better than Ext3 or Ext4. And you could set the block size (perhaps to 1 Kbyte) when making the file system.

You could stay with 10K files, but it might be easier to spread them over a hundred directories, i.e. have dir01/file000.txt … dir01/file099.txt, dir02/file000.txt … dir02/file099.txt, …, dir99/….

It is at least more human-friendly (ls gives a reasonable output), and may be more efficient on some old filesystems.
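A minimal Perl sketch of that layout, assuming 100 directories of 100 files each; the dirNN/fileNNN.txt naming and the placeholder content are made up, not your actual data:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(make_path);

# Spread 10_000 records over 100 subdirectories, 100 files per directory,
# instead of writing them all into one flat directory.
for my $i (0 .. 9_999) {
    my $dir  = sprintf "dir%02d",      int($i / 100);   # dir00 .. dir99
    my $file = sprintf "file%03d.txt", $i % 100;        # file000.txt .. file099.txt
    make_path($dir) unless -d $dir;

    open my $fh, '>', "$dir/$file" or die "Cannot write $dir/$file: $!";
    print {$fh} "content for record $i\n";              # placeholder payload
    close $fh;
}
```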

MongoDB, like MariaDB (or MySQL) or PostgreSQL, is a database server, so you need to have the server running (perhaps just on localhost), and the client-server connection alone has some cost.

You could also consider GDBM, which is a library providing indexed files.
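A minimal sketch of that approach, assuming Perl's GDBM_File module is available (it ties a hash to an on-disk GDBM database); the file and key names are invented for illustration:

```perl
use strict;
use warnings;
use GDBM_File;

# Tie a Perl hash to a GDBM database file; each entry replaces one small file.
tie my %db, 'GDBM_File', 'records.gdbm', GDBM_WRCREAT, 0640
    or die "Cannot open records.gdbm: $!";

$db{"record_0001"} = "content that would otherwise be a small file";
print $db{"record_0001"}, "\n";

untie %db;
```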

And you could also consider SQLite, a library providing an SQL database.
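A rough sketch with Perl's DBI and DBD::SQLite modules (assuming both are installed); the table and column names are made up, and batching the inserts in one transaction is what keeps many small writes cheap:

```perl
use strict;
use warnings;
use DBI;

# One SQLite file holds all the small records; no server process needed.
my $dbh = DBI->connect("dbi:SQLite:dbname=records.db", "", "",
                       { RaiseError => 1, AutoCommit => 0 });

$dbh->do(q{
    CREATE TABLE IF NOT EXISTS records (
        name    TEXT PRIMARY KEY,
        content TEXT
    )
});

# Insert (or update) one record per former file, then commit once.
my $sth = $dbh->prepare(
    "INSERT OR REPLACE INTO records (name, content) VALUES (?, ?)");
$sth->execute("record_0001", "content that would otherwise be a small file");
$dbh->commit;

# Read a record back.
my ($content) = $dbh->selectrow_array(
    "SELECT content FROM records WHERE name = ?", undef, "record_0001");
print "$content\n";

$dbh->disconnect;
```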

Finally, 10K files of 10 Kbytes each is only 100 Mbytes. That fits easily in memory, or in a single file…
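To illustrate the "fits in memory" point, a small sketch that slurps the whole data set into one Perl hash; the dir*/file*.txt glob pattern assumes the directory layout sketched above:

```perl
use strict;
use warnings;

# Load every small file into a single in-memory hash (~100 MB total).
my %content;
for my $path (glob "dir*/file*.txt") {
    open my $fh, '<', $path or die "Cannot read $path: $!";
    local $/;                       # slurp mode: read the whole file at once
    $content{$path} = <$fh>;
    close $fh;
}
printf "loaded %d files\n", scalar keys %content;
```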

And keeping 10K files of 1 to 10 Kbytes each can have advantages, e.g. if the content is textual: standard tools like grep or awk work well on them.

It really depends upon your application.
