
How to split a file based on the first character in the Linux shell

I have a fixed-width flat file containing header and detail records. The two can be told apart by the first character: 1 for header and 2 for detail.

I want to generate two files from it, each holding its own set of records, with the leading record-type character removed.

Header.txt should contain only the type 1 records, and Detail.txt only the type 2 records.

How can I achieve this?

Example flatfile:

120190301,025712,FRANK,DURAND,USA
20257120023.12
20257120000.21
20257120191.45
120190301,025737,ERICK,SMITH,USA
20257370000.29
20257370326.41
120190301,025632,JOSEPH,SILVA,USA
20256320019.57
20256320029.12
20256320129.04

Desired Outputs:

Header.txt

20190301,025712,FRANK,DURAND,USA
20190301,025737,ERICK,SMITH,USA
20190301,025632,JOSEPH,SILVA,USA

Detail.txt

0257120023.12
0257120000.21
0257120191.45
0257370000.29
0257370326.41
0256320019.57
0256320029.12
0256320129.04

Answer

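This first one is gawk-specific. It works because in gawk "If the value [of FS] is the null string (""), then each character in the record becomes a separate field." That makes $1 the record-type character, which is used as the index into the array of output file names before being stripped from the line.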

$ awk 'BEGIN {FS=""; f[1]="header.txt"; f[2]="detail.txt"}
       {i=$1; sub(/^./,""); print > f[i]}' file

$ cat header.txt
20190301,025712,FRANK,DURAND,USA
20190301,025737,ERICK,SMITH,USA
20190301,025632,JOSEPH,SILVA,USA

$ cat detail.txt
0257120023.12
0257120000.21
0257120191.45
0257370000.29
0257370326.41
0256320019.57
0256320029.12
0256320129.04
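
To see the empty-FS splitting on its own, here is a small sketch using one detail record from the question. The variables AWK and out are just names chosen for this illustration. It prefers gawk and falls back to plain awk, where this behaviour is not guaranteed.

```shell
# Prefer gawk, where the empty-FS behaviour is documented; fall back to
# plain awk, which may or may not split the record the same way.
AWK=gawk
command -v gawk >/dev/null 2>&1 || AWK=awk

# One detail record from the question; each character becomes a field,
# so $1 is the record type and NF is the line length.
out=$(printf '%s\n' '20257120023.12' |
      "$AWK" 'BEGIN { FS = "" } { print NF ": type=" $1 ", next=" $2 }')
printf '%s\n' "$out"
```

With gawk this prints 14: type=2, next=0, since the record is 14 characters long.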

This one matches on the first character instead and should work with any awk:

$ awk '/^1/ {f="header.txt"}
       /^2/ {f="detail.txt"}
       {sub(/^./,""); print > f}' file
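
Both commands assume every line starts with 1 or 2. If the input might contain other record types, a variant along these lines could skip them instead of sending them to the wrong file. This is only a sketch: sample.txt, the line starting with 9, and the skipped counter are made up for the illustration, and it uses substr rather than sub to take the line apart.

```shell
# Build a small sample input: records from the question plus one
# hypothetical line with an unknown record type "9".
printf '%s\n' \
  '120190301,025712,FRANK,DURAND,USA' \
  '20257120023.12' \
  '9unexpected' \
  '20257120000.21' > sample.txt

# Route each line by its first character and write the rest of the line.
# Lines with any other leading character are counted and not written.
awk '{ type = substr($0, 1, 1); rest = substr($0, 2) }
     type == "1" { print rest > "Header.txt"; next }
     type == "2" { print rest > "Detail.txt"; next }
     { skipped++ }
     END { printf "%d line(s) skipped\n", skipped }' sample.txt
```

Header.txt and Detail.txt are each truncated the first time awk writes to them in a run, so running the command again replaces the previous output rather than appending to it.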
User contributions licensed under: CC BY-SA