I have a TSV file with 4 columns in this format:
dog phil tall 2020-12-09 12:34:22
cat jill tall 2020-12-10 11:34:22
The 4th column is a date string, for example: 2020-12-09 12:34:22
I want the rows for each date to go into their own file. For example:
file 20201209 should have all rows that start with 2020-12-09 in the 4th column
file 20201210 should have all rows that start with 2020-12-10 in the 4th column
Is there any way to do this through the terminal?
Answer
With GNU awk, which can handle a potentially large number of concurrently open output files and provides gensub():
awk '{print > gensub(/-/,"","g",$(NF-1))}' file
With any awk:
awk '{out=$(NF-1); gsub(/-/,"",out); if (seen[out]++) print >> out; else print > out; close(out)}' file
There are ways to speed up either script by sorting the input first, if performance is an issue.
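As a rough sketch of the sort-first idea (not necessarily what the answer had in mind): once rows with the same date are adjacent, each output file only needs to be opened once and can be closed as soon as the date changes. The sample rows written to file below are made up for illustration, and the -s (stable) flag is assumed to be available, as it is in GNU and BSD sort; it keeps rows for the same date in their original order.

```shell
# Hypothetical sample input: the two rows from the question plus one extra row.
printf 'dog\tphil\ttall\t2020-12-09 12:34:22\n' >  file
printf 'cat\tjill\ttall\t2020-12-10 11:34:22\n' >> file
printf 'emu\tbob\tshort\t2020-12-09 08:00:00\n' >> file

# Sort on the date so equal dates are contiguous, then split.
sort -s -k4,4 file |
awk '{
    out = $(NF-1)                     # date part, e.g. 2020-12-09
    gsub(/-/, "", out)                # becomes 20201209
    if (out != prev) {                # date changed: done with the previous file
        if (prev != "") close(prev)
        prev = out
    }
    print > out                       # opened (and truncated) once per date
}'
```

Because only one output file is open at a time, this should work with any awk, at the cost of the extra sort pass. Note that rows end up grouped by date, so the overall input order is not preserved across dates.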