
Why can’t dd handle sparse files in shell scripts? [closed]

I have the following sparse file that I want to flash to an SD card:

647M -rw-------  1 root     root     4.2G Sep 21 16:53 make_sd_card.sh.xNws4e

As you can see, it takes ~647M on disk for an apparent size of 4.2G. If I flash it directly with dd, in my shell, it’s really fast, ~6s:

$ time (sudo /bin/dd if=make_sd_card.sh.xNws4e of=/dev/mmcblkp0 conv=sparse; sync)
8601600+0 records in
8601600+0 records out
4404019200 bytes (4.4 GB, 4.1 GiB) copied, 6.20815 s, 709 MB/s

real    0m6.284s
user    0m1.920s
sys     0m4.336s

But when I run the very same commands inside a shell script, it behaves as if it were copying all the zeroes and takes a long time (~2m10):

$ time sudo ./plop.sh ./make_sd_card.sh.xNws4e
+ dd if=./make_sd_card.sh.xNws4e of=/dev/mmcblk0 conv=sparse
8601600+0 records in
8601600+0 records out
4404019200 bytes (4.4 GB, 4.1 GiB) copied, 127.984 s, 34.4 MB/s
+ sync

real    2m9.885s
user    0m3.520s
sys     0m15.560s

If I watch the Dirty counter in /proc/meminfo, I can see that it is much higher when running dd from a shell script than directly from the shell.
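
For reference, a minimal sketch of how that counter can be read, assuming a Linux /proc filesystem. The Writeback field is included only as an extra illustration and is not something measured above:

```shell
# Assumes Linux: /proc/meminfo reports these counters in kB.
dirty_kb=$(awk '/^Dirty:/ {print $2}' /proc/meminfo)
writeback_kb=$(awk '/^Writeback:/ {print $2}' /proc/meminfo)
echo "Dirty=${dirty_kb} kB Writeback=${writeback_kb} kB"
```

Running these lines in a loop (or under watch) while dd is active shows how much data is still waiting in the page cache.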

My shell is bash and, for the record, the script is:

#!/bin/bash
set -xeu
dd if=$1 of=/dev/mmcblk0 conv=sparse bs=512
sync

[EDIT] I’m resurrecting this topic because a developer I work with has found two commands, bmap_create and bmap_copy, which seem to do exactly what I was clumsily trying to achieve with dd. In Debian, they are part of the bmap-tools package. With them, it takes 1m2s to flash a 4.1GB sparse SD image with a real size of 674MB, whereas it takes 6m26s with dd or cp.


Answer

This difference is caused by a typo in the non-scripted invocation, which did not actually write to your memory card. There is no difference in dd behavior between scripted and interactive invocation.


Keep in mind what a sparse file is: it’s a file on a filesystem that stores metadata tracking which blocks actually hold data. Blocks that were never written (the “holes”) read back as zeroes but have never been allocated any storage on disk.

This concept is specific to files on a filesystem. You can’t have a sparse block device.
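
A quick way to see the gap between apparent size and allocated size is the sketch below. It assumes GNU coreutils (truncate, stat -c) and a filesystem that supports holes; the file name is made up for illustration:

```shell
# Assumes GNU coreutils; sparse.img is a hypothetical file name.
tmpdir=$(mktemp -d)
img="$tmpdir/sparse.img"

# Extend an empty file to 100 MiB without writing any data: one big hole.
truncate -s 100M "$img"

# Apparent size in bytes vs. bytes actually allocated on disk.
apparent=$(stat -c %s "$img")
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))
echo "apparent=$apparent allocated=$allocated"

rm -rf "$tmpdir"
```

This is the same gap that ls -ls shows in the question: 647M allocated against an apparent 4.2G.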


The distinction between your two lines of code is that one of them (the fast one) has a typo (mmcblkp0 instead of mmcblk0), so it’s referring to a block device name that doesn’t exist. Thus, it creates a file. Files can be sparse. Thus, it creates a sparse file. Creating a sparse file is fast.
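
The regular-file case can be reproduced in isolation. This is only a sketch, assuming GNU dd (conv=sparse, status=none) and a filesystem that supports holes; the file names are hypothetical:

```shell
# Assumes GNU coreutils; src.img and dst.img are hypothetical names.
tmpdir=$(mktemp -d)
src="$tmpdir/src.img"
dst="$tmpdir/dst.img"

# 64 MiB source that consists of nothing but a hole.
truncate -s 64M "$src"

# With conv=sparse, dd seeks over all-zero blocks in the output
# instead of writing them, so the destination stays sparse.
dd if="$src" of="$dst" conv=sparse bs=1M status=none

dst_apparent=$(stat -c %s "$dst")
dst_allocated=$(( $(stat -c %b "$dst") * $(stat -c %B "$dst") ))
echo "apparent=$dst_apparent allocated=$dst_allocated"

rm -rf "$tmpdir"
```

The destination keeps the full apparent size while allocating almost nothing, which is why the mistyped command finished in seconds.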

The other one, without the typo, writes to the block device. Block devices can’t be sparse. Thus, the whole image has to be written to the card, and the copy always takes the full time.
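
One way to guard against this kind of typo is to check that the target really is a block device before calling dd. The helper name below is hypothetical and this is a sketch, not part of the original script:

```shell
# Hypothetical helper: fail unless the given path is a block device.
require_block_device() {
    if [ ! -b "$1" ]; then
        echo "error: $1 is not a block device" >&2
        return 1
    fi
}

# Intended use in a flashing script (not executed here):
#   require_block_device /dev/mmcblk0 && dd if="$1" of=/dev/mmcblk0 conv=sparse
```

With such a check in place, the mistyped /dev/mmcblkp0 would have produced an error instead of silently creating a regular file under /dev.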

User contributions licensed under: CC BY-SA