
Move all rows in a TSV with a certain date to their own file

I have a TSV file with 4 columns in this format:

dog    phil    tall    2020-12-09 12:34:22
cat    jill    tall    2020-12-10 11:34:22

The 4th column is a date string, for example: 2020-12-09 12:34:22

I want all rows that share the same date to go into their own file, one file per date.

For example,

file 20201209 should have all rows that start with 2020-12-09 in the 4th column
file 20201210 should have all rows that start with 2020-12-10 in the 4th column

Is there any way to do this through the terminal?


Answer

With GNU awk, which handles potentially large numbers of concurrently open output files and provides gensub():

awk '{print > gensub(/-/,"","g",$(NF-1))}' file

With any awk:

awk '{out=$(NF-1); gsub(/-/,"",out); if (seen[out]++) print >> out; else print > out; close(out)}' file

There are ways to speed up either script by sorting the input first, if performance is an issue.
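
As a rough sketch of the sort-first idea (not necessarily what was meant above): once the rows are grouped by date, each output file only needs to be opened once and can be closed as soon as the date changes, so this works in any awk without a per-line close(). It assumes the columns are separated by real tab characters and that the timestamp is the 4th column. The function name split_by_date is made up for illustration, and LC_ALL=C is only there to make the sort order predictable.

```shell
# Hypothetical helper: split_by_date FILE
# Writes one file per date (named YYYYMMDD) into the current directory.
# Assumes FILE is tab-separated with the timestamp in the 4th column.
split_by_date() {
    # Sort on the 4th tab-separated field so rows for the same date are adjacent.
    LC_ALL=C sort -t "$(printf '\t')" -k4,4 "$1" |
    awk '{
        out = $(NF-1)            # date part of the timestamp, e.g. 2020-12-09
        gsub(/-/, "", out)       # strip the dashes: 20201209
        if (out != prev) {       # date changed, so the previous file is finished
            if (prev != "") close(prev)
            prev = out
        }
        print > out              # truncates on first write, then appends while the file stays open
    }'
}
```

Call it as split_by_date file. Note that the rows in each output file end up in timestamp order rather than in their original input order, and that the script relies on the input really being grouped by date: if a date were to reappear after its file had been closed, print > out would truncate that file.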

User contributions licensed under: CC BY-SA