
Unzipping bz2 file

I have the following command to open a tbz file:

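# pricing20150304.tbz
# -f names the archive; -C takes the directory to extract into ({target_dir} is an assumed placeholder)
tar xpjf {tarball} -C {target_dir} {files_to_unarchive}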

The compressed file is about 15 GB and expands to roughly 500 GB. On an EC2 4xlarge instance, this operation takes roughly 1h40m.

Is there a way to optimize this operation? What would be the fastest way to do the above operation?


Answer

A couple of possibilities come to mind. First off, bzip2 is pretty slow, so if you can use a different algorithm you might want to consider doing so. Assuming you still want a fairly high ratio, LZHAM and Brotli might be good choices; they take longer to compress but are much faster to decompress, and IIRC both come with multi-threaded decompressors. There are lots of choices, and they all trade off compression speed, decompression speed, and ratio differently.

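If a different algorithm isn’t an option, you might want to consider using pbzip2 instead of bzip2. Something like pbzip2 -dc infile.tar.bz2 | tar x. Keep in mind that pbzip2 only decompresses in parallel when the archive was itself created with pbzip2; a file produced by plain bzip2 is a single stream, so it won't see a speedup.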
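As a rough sketch of that pipeline, the function below decompresses with pbzip2 when it is installed and falls back to plain bzip2 otherwise, then hands the stream to tar. The names extract_tbz, archive and dest are made up for illustration, and the fallback is my own addition rather than anything the question requires.

```shell
#!/bin/sh
# Sketch only: extract_tbz, archive and dest are hypothetical names.
# Usage: extract_tbz ARCHIVE DEST_DIR [MEMBER...]
extract_tbz() {
    archive=$1
    dest=$2
    shift 2

    # Prefer the parallel decompressor when it is on PATH (assumption:
    # falling back to bzip2 is acceptable when it is not).
    if command -v pbzip2 >/dev/null 2>&1; then
        decompress=pbzip2
    else
        decompress=bzip2
    fi

    mkdir -p "$dest" || return 1

    # -d decompresses, -c writes to stdout; tar reads the stream from
    # stdin (-f -), preserves permissions (-p) and extracts into $dest.
    # Any remaining arguments restrict extraction to those members.
    "$decompress" -dc "$archive" | tar -xpf - -C "$dest" "$@"
}
```

Called as extract_tbz pricing20150304.tbz /some/target/dir, it extracts everything; listing member names after the directory extracts only those. With an archive this size, disk write throughput may end up being the bottleneck once decompression is parallel, so it is worth timing both variants.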

User contributions licensed under: CC BY-SA