Split CSV file in bash into multiple files based on condition

Question

My csv file has multiple rows of data and I want to split it into multiple files based on one attribute. SQL code with ORDER BY ID is triggered from beeline which creates single CSV. cat sql.csv "attr;attr;ID;attr" "data;data;XXXX;date" "data;data;XXXX;date" "data;data;YYYYY;date" "data;data;YYYYY;date" "data;data;BBBBB;date" "data;data;BBBBB;date" Desired result is to split once new ID is recognised and use that ID in filename.

Accepted Answer

If I understand your question, you can take the csv file produced by sql and then split that into the 3 files you show simply by using a few variables, string concatenation and then by redirecting to the output files, e.g.awk -v field=a -v n=1 -v dt=$(date '+%Y%m%d_%H%M%S') '    FNR == 1 {hdg=$0; next}    a != $3 {a = $3; name="file_"n"_ID_"a"_"dt; n++; print hdg > name}    {print $0 > name}' sqldataExample Input FileWhere your sqldata file contains:$ cat sqldataattr    attr    ID  attrdata    data    XXXX    datedata    data    XXXX    datedata    data    YYYYY   datedata    data    YYYYY   datedata    data    BBBBB   datedata    data    BBBBB   dateExample Use/Output FilesSimply copying and middle-mouse pasting awk script into the terminal with the correct filename to read would produce the following three output files:$ cat file_1_ID_XXXX_20190805_033514attr    attr    ID  attrdata    data    XXXX    datedata    data    XXXX    date$ cat file_2_ID_YYYYY_20190805_033514attr    attr    ID  attrdata    data    YYYYY   datedata    data    YYYYY   date$ cat file_3_ID_BBBBB_20190805_033514attr    attr    ID  attrdata    data    BBBBB   datedata    data    BBBBB   dateLook things over and let me know if this is what you intended. If not, let me know and I&#8217;m happy to help further.

Advertisement

Answer