I have implemented a file-backed HashTable using numpy.memmap. It appears to be functioning correctly; however, on Linux both KSysGuard and SMART report an enormous amount of I/O writes, roughly 50x the amount of data that should be written. I have not tested this on other operating systems.
This is the code that creates the internal memory map:
self.data = np.memmap(self.filename, shape=(self.nbuckets, self.bucket_size), dtype=[('key', 'u8'), ('time', 'u2'), ('flags', 'u2'), ('id', 'i4')], mode=mode)
And the following is the code which writes a single entry into the table after the hash function is applied:
def store(self, id_, time_, key, i):
    bucket = self.data[i]
    head = bucket[0]
    if bucket[1]['flags'] & HashTable.FLAG_INUSE == 0:
        free = bucket[1]
        head['id'] = 1      # we use the first entry in the bucket to count how full it is
        self.written += 4   # 4 bytes written to set that counter to 1
    elif head['id'] < self.bucket_size:
        free = bucket[head['id']]
    else:
        return False
    free['key'] = key
    free['time'] = time_
    free['flags'] |= HashTable.FLAG_INUSE
    free['id'] = id_
    head['id'] += 1
    self.dirty = True
    self.written += 20  # 16 bytes for the entry, +4 bytes for updating the bucket usage counter
    return True
I added the self.written variable to keep track of how many bytes were written and to check whether the function was being called too many times.
For about 3 million entries, self.written reports about 60 MiB at the end, which matches my calculations and means the store function isn't being called excessively. KSysGuard and SMART (data_units_written), however, report a total of 3 GiB written by the program. The HashTable file is set to 100 MiB and is not corrupted, so I suspect the same data is being written to the same locations over and over. But I can't figure out where in the code this could be happening.
I am not 100% sure the writes are happening to the HashTable file, but when I put the file on a ramdisk (tmpfs), KSysGuard and SMART report no disk writes at all.
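One way I could cross-check this is to compare the system-wide disk write counters before and after a run, e.g. with the optional psutil package (these counters cover all processes and all disks, so they are only a rough indication):

# Rough cross-check sketch: system-wide disk write counters before/after a run.
# Assumes the optional psutil package is installed; includes other processes' I/O.
import psutil

def measure_device_writes(fn):
    """Run fn() and report how many bytes all disks recorded as written meanwhile."""
    before = psutil.disk_io_counters().write_bytes
    fn()
    after = psutil.disk_io_counters().write_bytes
    print(f"device-level writes during call: {(after - before) / 2**20:.1f} MiB")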
I am using Python 3.9.2 with numpy 1.21.0 on Debian Bullseye.
If anyone can help me figure this out, I would very much appreciate it. Thank you.
Answer
memmap works by mapping pages of virtual memory (typically onto pages of physical memory or, as in your case, onto pages of a storage-backed file). On most platforms, a page is at least 4 KiB. As a result, any write within a page can cause the whole page to be written back.
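A back-of-the-envelope calculation illustrates the effect (assuming 4 KiB pages and the ~20 logical bytes per store() call from the question; the exact amplification depends on how often the kernel flushes dirty pages):

# Back-of-the-envelope estimate: assumes 4 KiB pages and ~20 logical bytes per store().
PAGE_SIZE = 4096                              # typical page size on x86-64
LOGICAL_BYTES_PER_STORE = 20
ENTRIES = 3_000_000

logical = ENTRIES * LOGICAL_BYTES_PER_STORE   # what self.written counts
worst_case = ENTRIES * PAGE_SIZE              # one full page written back per store

print(f"logical writes    : {logical / 2**20:.0f} MiB")     # ~57 MiB
print(f"worst-case flushed: {worst_case / 2**30:.1f} GiB")   # ~11.4 GiB
print(f"amplification     : {worst_case / logical:.0f}x")    # ~205x
# The observed ~3 GiB (~50x) lies between the two because the kernel batches
# write-back, so several stores to the same page can share one page flush.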
SSDs, and flash memory more generally, also work in blocks, and they often use even bigger chunks. Flash memory uses cells with a very limited number of write cycles (e.g. around 1000). When cells are overwritten too many times, they become unstable and may no longer be read or written reliably. As a result, flash storage devices avoid writing to cells in place and instead move written data blocks to a new location, both to spare the cells and to stay relatively fast. Once written, a block cannot be mutated: a new block must be allocated and written to replace the old one. Thus, randomly writing only a few bytes on a flash storage device causes it to allocate many new blocks and copy a lot of (unchanged) data. This also significantly shortens the life of the target storage device. This can explain why the SMART information reports such a high amount of I/O writes.
Note that HDDs do not have this issue, but their random writes are very slow compared to SSDs (due to the time needed to move the heads). Alternative non-volatile memories like ferroelectric RAM or magneto-resistive RAM could solve this problem properly; unfortunately, such memories are still relatively experimental.
A possible fix is to accumulate the modified data blocks in RAM, sort them by location, and write them all at once. If the dataset is huge and the writes are spread very uniformly, then there is no good solution on current mainstream hardware.
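As a rough illustration of that buffering idea, here is a minimal sketch (not your existing API; it reuses the dtype from the question, omits the per-bucket usage counter, and assumes a hypothetical FLAG_INUSE value of 0x1). Entries are accumulated in ordinary RAM and written to the memmap in one sorted pass, so each touched page is dirtied, and flushed, only once:

# Minimal sketch of the buffering idea (assumptions: same dtype as the question,
# FLAG_INUSE == 0x1, and the bucket/slot are already computed by the caller).
import numpy as np

ENTRY_DTYPE = [('key', 'u8'), ('time', 'u2'), ('flags', 'u2'), ('id', 'i4')]

class BufferedTable:
    def __init__(self, filename, nbuckets, bucket_size):
        self.data = np.memmap(filename, shape=(nbuckets, bucket_size),
                              dtype=ENTRY_DTYPE, mode='r+')
        self.pending = {}                    # (bucket, slot) -> entry, kept in RAM

    def store(self, id_, time_, key, bucket, slot):
        # Only remember the write; the mapped pages are not touched yet.
        self.pending[(bucket, slot)] = (key, time_, 0x1, id_)

    def flush(self):
        # Apply the buffered writes in file order so each page is visited once,
        # then write everything back in a single pass.
        for pos in sorted(self.pending):
            self.data[pos] = self.pending[pos]
        self.pending.clear()
        self.data.flush()

If the pending buffer grows too large, it can be flushed periodically; the gain comes from each flush touching every dirty page once instead of re-flushing the same pages over and over.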