I have a very large tar archive (about 5 GB).
I want to grep for a pattern in all the files in the archive, and also print the name of each file that contains the pattern, but I do not want to fill up my disk by extracting the archive.
Is there any way I can do that?
I tried these, but they only give me the matching lines, not the names of the files that contain the pattern:
tar -O -xf test.tar.gz | grep 'this'
tar -xf test.tar.gz --to-command='grep awesome'
Also, where is this feature of tar documented?
tar xf test.tar $FILE
Answer
Here’s my take on this:
while read filename; do tar -xOf file.tar "$filename" | grep 'pattern' | sed "s|^|$filename:|"; done < <(tar -tf file.tar | grep -v '/$')
Broken out for explanation:
while read filename; do
    It's a loop that reads one filename per pass.
tar -xOf file.tar "$filename"
    This extracts each file to standard output instead of to disk.
| grep 'pattern'
    Here's where you put your pattern.
| sed "s|^|$filename:|";
    Prepend the filename, so the output looks like grep's. Salt to taste.
done < <(tar -tf file.tar | grep -v '/$')
    End the loop, and get the list of files (directories filtered out) to feed to your while read.
One proviso: this breaks if you have OR bars (|) in your filenames, because the sed command uses | as its delimiter.
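If filenames containing | are a concern, one way around the sed delimiter is to let grep add the prefix itself. The sketch below assumes GNU grep, where --label names standard input and -H forces the name to be printed. The helper names grep_member and grep_archive are made up for this example and are not part of the answer above.

```shell
# Hypothetical helper: grep a single archive member and let grep print its name.
# Assumes GNU grep (--label) and a tar that supports -O (extract to stdout).
grep_member() {
    local pattern="$1" archive="$2" member="$3"
    tar -xOf "$archive" "$member" | grep -H --label="$member" -- "$pattern"
}

# Same idea as the loop above, without sed.
# IFS= and -r keep leading spaces and backslashes in names intact.
grep_archive() {
    local pattern="$1" archive="$2" member
    tar -tf "$archive" | grep -v '/$' | while IFS= read -r member; do
        grep_member "$pattern" "$archive" "$member"
    done
}
```

Called as grep_archive 'pattern' file.tar, this prints matches in the usual name:line form. Like the loop above, it still runs tar once per member.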
Hmm. In fact, this makes a nice little bash function, which you can append to your .bashrc file:
targrep() {
    local taropt=""
    local pattern="$1"    # saved here because shift below drops $1
    if [[ ! -f "$2" ]]; then
        echo "Usage: targrep pattern file ..." >&2
        return 1
    fi
    while [[ -n "$2" ]]; do
        if [[ ! -f "$2" ]]; then
            echo "targrep: $2: No such file" >&2
            shift
            continue
        fi
        case "$2" in
            *.tar.gz) taropt="-z" ;;
            *) taropt="" ;;
        esac
        while read filename; do
            # extract only the current member, not the whole archive
            tar $taropt -xOf "$2" "$filename" | grep "$pattern" | sed "s|^|$filename:|"
        done < <(tar $taropt -tf "$2" | grep -v '/$')
        shift
    done
}
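The loop above runs tar once per member, so a 5 GB archive gets read many times over. A single-pass variant is sketched below, assuming GNU tar: its --to-command option pipes each regular file to a shell command and sets TAR_FILENAME in that command's environment. The function name targrep_onepass and the TARGREP_PATTERN variable are invented for this example, and --label again assumes GNU grep.

```shell
# Hypothetical single-pass variant; assumes GNU tar and GNU grep.
targrep_onepass() {
    local pattern="$1" archive="$2"
    # The pattern travels through the environment, so it needs no extra
    # quoting inside the command string that tar hands to the shell.
    # '|| true' hides grep's exit status 1 (no match) from tar, which would
    # otherwise report it for every non-matching member; it also hides
    # genuine grep errors, so treat this as a sketch.
    TARGREP_PATTERN="$pattern" tar -xf "$archive" \
        --to-command='grep -H --label="$TAR_FILENAME" -- "$TARGREP_PATTERN" || true'
}
```

Usage would look like targrep_onepass 'awesome' test.tar.gz. GNU tar normally detects gzip compression on its own when reading from a file, so no -z switch should be needed here.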