Using tar with Your Favorite Compression

Here's a fun one! You may already know that tar is a pure archive format: any compression is applied to the archive as a whole, not to the individual files inside it.

This is a trade-off the designers made to limit complexity, and as a side effect, it's the reason you can't randomly access parts of a compressed tarball.
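To make the two layers concrete, here's a minimal sketch that archives and compresses in separate steps. The scratch directory and the demo/ files are made up for illustration, and I'm using gzip only because it's installed almost everywhere:

```shell
set -eu
work=$(mktemp -d)            # scratch directory (hypothetical demo data)
cd "$work"
mkdir demo
echo "hello" > demo/hello.txt

# Step 1: archive only. The result is an uncompressed tar file.
tar -cf demo.tar demo/

# Step 2: compress the whole archive as a single unit.
gzip -c demo.tar > demo.tar.gz

# Decompressing gives back exactly the same tar bytes.
gzip -dc demo.tar.gz | cmp - demo.tar && echo "identical"
```

The compressor never sees file boundaries, only one long byte stream. That's why getting at a single file means decompressing the stream from the start.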

What you may not know is that the tar utility has built-in support for a few compression formats! gzip is probably the most commonly used for historical reasons, but zstd and lz4 are also built-in options on my Mac. This is probably system-dependent, so check your local manpages.

Here's an example of compressing and decompressing with zstd:

tar --zstd -cf directory.tar.zst directory/
tar --zstd -xf directory.tar.zst

You can also have tar run any (de)compression program that operates on stdin and stdout!

tar --use-compress-program zstd -cf directory.tar.zst directory/

Pretty cool, huh? It's no different than using pipes at the end of the day, but it does simplify the invocation a bit in my opinion.
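For comparison, the pipe version looks roughly like this. It's a sketch rather than a recommendation: the scratch directory and file names are made up, and gzip stands in for the compressor because it's installed almost everywhere (zstd -c and zstd -dc should slot in the same way if you have zstd):

```shell
set -eu
work=$(mktemp -d)            # scratch directory (hypothetical demo data)
cd "$work"
mkdir directory
echo "hello" > directory/hello.txt

# Create: tar writes the archive to stdout (-f -), gzip compresses the stream.
tar -cf - directory/ | gzip -c > directory.tar.gz

# Extract: gzip decompresses to stdout, tar reads the archive from stdin.
mkdir out
gzip -dc directory.tar.gz | tar -xf - -C out
```

The `-f -` tells tar to use stdout or stdin instead of a named file, which is the part the built-in flags let you skip.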

After I initially published this article, @cartocalypse@norden.social noted that some versions of tar include the -a/--auto-compress option which will automatically determine format and compression based on the suffix! Check your manpages for details; it appears to work on FreeBSD, macOS (which inherits the FreeBSD implementation), and GNU tar.
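Here's a small sketch of what that looks like, assuming a tar that supports -a (GNU tar or the FreeBSD/macOS one). The scratch directory and file names are made up, and I picked the .tar.gz suffix so it only needs gzip:

```shell
set -eu
work=$(mktemp -d)            # scratch directory (hypothetical demo data)
cd "$work"
mkdir directory
echo "hello" > directory/hello.txt

# -a picks the compressor from the suffix: .tar.gz selects gzip here.
tar -a -cf directory.tar.gz directory/

# gzip -t checks that the result really is a valid gzip stream.
gzip -t directory.tar.gz && echo "gzip-compressed"

# No compression flag on extraction: as far as I know, both GNU tar and
# the FreeBSD/macOS tar detect the compression when reading a regular file.
mkdir out
tar -xf directory.tar.gz -C out
```

Swapping the suffix for .tar.zst should select zstd in the same way, provided your tar and system support it.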
