Understanding Linux write performance

I’ve been doing some benchmarking to try to understand write performance on Linux, and I don’t understand the results I got. (I’m using ext4 on Ubuntu 17.04, though I’m more interested in understanding ext4 itself than in comparing filesystems.)

Specifically, I understand that some databases/filesystems work by keeping a stale copy of your data and writing updates to a modification log. Periodically, the log is replayed over the stale data to produce a fresh version, which is then persisted. This only makes sense to me if appending to a file is faster than overwriting the whole file (otherwise, why write updates to a log? Why not just overwrite the data on disk?). I was curious how much faster appending is than overwriting, so I wrote a small benchmark in Go (https://gist.github.com/msteffen/08267045be42eb40900758c419c3bd38) and got these results:

$ go test ./write_test.go  -bench='.*'
BenchmarkWrite/Write_10_Bytes_10_times-8                30    46189788 ns/op
BenchmarkWrite/Write_100_Bytes_10_times-8               30    46477540 ns/op
BenchmarkWrite/Write_1000_Bytes_10_times-8              30    46214996 ns/op
BenchmarkWrite/Write_10_Bytes_100_times-8                3   458081572 ns/op
BenchmarkWrite/Write_100_Bytes_100_times-8               3   678916489 ns/op
BenchmarkWrite/Write_1000_Bytes_100_times-8              3   448888734 ns/op
BenchmarkWrite/Write_10_Bytes_1000_times-8               1  4579554906 ns/op
BenchmarkWrite/Write_100_Bytes_1000_times-8              1  4436367852 ns/op
BenchmarkWrite/Write_1000_Bytes_1000_times-8             1  4515641735 ns/op
BenchmarkAppend/Append_10_Bytes_10_times-8              30    43790244 ns/op
BenchmarkAppend/Append_100_Bytes_10_times-8             30    44581063 ns/op
BenchmarkAppend/Append_1000_Bytes_10_times-8            30    46399849 ns/op
BenchmarkAppend/Append_10_Bytes_100_times-8              3   452417883 ns/op
BenchmarkAppend/Append_100_Bytes_100_times-8             3   458258083 ns/op
BenchmarkAppend/Append_1000_Bytes_100_times-8            3   452616573 ns/op
BenchmarkAppend/Append_10_Bytes_1000_times-8             1  4504030390 ns/op
BenchmarkAppend/Append_100_Bytes_1000_times-8            1  4591249445 ns/op
BenchmarkAppend/Append_1000_Bytes_1000_times-8           1  4522205630 ns/op
PASS
ok    command-line-arguments  52.681s

This left me with two questions that I couldn’t think of an answer to:

1) Why does time per operation go up so much when I go from 100 writes to 1,000? (I know Go repeats benchmarks for me, so doing multiple writes myself is probably silly, but since I got a weird answer I’d like to understand why.) Update: this was due to a bug in the Go test, which is now fixed.

2) Why isn’t appending to a file faster than overwriting it? I thought the whole point of the update log was to take advantage of the comparative speed of appends? (Note that the current benchmark calls Sync() after every write, but even if I don’t do that, appends are no faster than overwrites, though both are much faster overall.)

If any of the experts here could enlighten me, I would really appreciate it! Thanks!

Answer

About (1), I think the issue is that your benchmarks aren’t doing what the Go tooling expects them to do.

From the documentation (https://golang.org/pkg/testing/#hdr-Benchmarks):

The benchmark function must run the target code b.N times. During benchmark execution, b.N is adjusted until the benchmark function lasts long enough to be timed reliably.

I don’t see your code using b.N, so while the benchmark tool thinks you are running the code b.N times, you are managing the repeats yourself. Depending on the values the tool actually picks for b.N, the results will vary unpredictably.

You can still do things 10, 100, and 1,000 times, but in every case do them b.N times (i.e., b.N * 10, b.N * 100, etc.) so that the reported per-operation figure is computed correctly.
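As a concrete illustration, here is a minimal sketch of what a b.N-driven version of one of these benchmarks could look like (the file handling and payload size are placeholders, not taken from the original gist):

package write_test

import (
	"os"
	"testing"
)

// BenchmarkWrite lets the testing package control the repeat count: the
// loop body runs exactly b.N times, and the tool raises b.N until the
// timing is stable, so ns/op reports the cost of one write+sync.
func BenchmarkWrite(b *testing.B) {
	f, err := os.CreateTemp("", "bench") // Go 1.16+; ioutil.TempFile on older toolchains
	if err != nil {
		b.Fatal(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	data := make([]byte, 100) // arbitrary example payload
	b.ResetTimer()            // exclude setup from the measurement
	for i := 0; i < b.N; i++ {
		if _, err := f.WriteAt(data, 0); err != nil {
			b.Fatal(err)
		}
		if err := f.Sync(); err != nil {
			b.Fatal(err)
		}
	}
}

Run with go test -bench='.*' as before; ns/op is then the cost of a single write, whatever value of b.N the tool settles on.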

About (2), when systems use a sequential log to store operations and replay them later, it’s not because appending to a file is faster than overwriting a single file.

In a database system, if you need to update a specific record, you must first find the actual file (and the position within that file) that you need to update.

That might require several index lookups, and once you update the record, you might need to update those indexes to reflect the new values.

So the right comparison is appending to a single log vs. doing several reads followed by several writes.
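To make the two access patterns concrete, here is a rough sketch (illustrative names only, not the original benchmark) of what a storage engine actually chooses between:

package storage

import "os"

// appendToLog models the write-ahead path: one sequential write to the
// end of a single log file, with no lookups needed beforehand. The file
// is assumed to have been opened with os.O_APPEND.
func appendToLog(log *os.File, record []byte) error {
	if _, err := log.Write(record); err != nil {
		return err
	}
	return log.Sync()
}

// updateInPlace models the direct path: the record's offset must be
// found first (a stand-in here for one or more index lookups), the
// write lands at an arbitrary position in the data file, and in a real
// engine the indexes may then need rewriting to reflect the new value.
func updateInPlace(data *os.File, offset int64, record []byte) error {
	if _, err := data.WriteAt(record, offset); err != nil {
		return err
	}
	return data.Sync()
}

Per write, the two paths cost about the same, which is what your benchmark shows; the log wins because it replaces the lookup-plus-multiple-writes sequence with a single append, deferring the expensive random I/O to a periodic replay.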
