On Linux, I want to automatically extract data from .dat files and name the new files according to a string in the input files.
I have 300 .dat files with a data structure as follows:
.
.
.
DE name1, contig1 .
.
SQ
information1
//
.
.
DE name1, contig2 .
.
SQ
information2
//
.
where each “.” stands for a line that I don’t need. I want to extract all the “information” from each .dat file and write it to a new file named “name1”, taken from the DE line. This is my attempt so far:
for file in *.dat; do
    awk '/SQ/{flag=1;next}/"//"/{flag=0}flag' "$file" > ???
done
What command would you recommend to perform this task?
Answer
You can use this awk one-liner:
awk -F '[, ]' '/^DE/ {filename=$2} /SQ/,/\/\// {print > filename}' file.dat
And here is a sample run:
$ ls
file.dat
$ cat file.dat
.
.
.
DE name1, contig1 .
.
SQ
information1
//
.
.
DE name2, contig2 .
.
SQ
information2
//
.
$ awk -F '[, ]' '/^DE/ {filename=$2} /SQ/,/\/\// {print > filename}' file.dat
$ ls
file.dat name1 name2
$ cat name1
SQ
information1
//
$ cat name2
SQ
information2
//
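The one-liner keeps the SQ and // delimiter lines and processes a single file, while the question asks for only the lines in between, across many .dat files. One possible variant is sketched below; it is an illustration under stated assumptions, not a tested drop-in solution. It assumes every DE line comes before its SQ line, that the record terminator is a line starting with //, and that records sharing a name should end up in the same output file. The scratch directory and sample.dat are hypothetical and only there to make the example runnable.

```shell
# Hypothetical scratch directory and sample file modelled on the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '.' 'DE name1, contig1 .' '.' 'SQ' 'information1' '//' \
              '.' 'DE name1, contig2 .' '.' 'SQ' 'information2' '//' '.' > sample.dat

for file in *.dat; do
    awk -F '[, ]' '
        # DE line: close the previous output file and remember the new name.
        /^DE/   { if (out != "") close(out); out = $2; next }
        # "//" ends the current record, so stop printing.
        /^\/\// { flag = 0 }
        # Inside a record: append the line to the file named after the DE field.
        flag && out != "" { print >> out }
        # SQ starts a record; set the flag last so the SQ line itself is skipped.
        /^SQ/   { flag = 1 }
    ' "$file"
done
```

With the sample input above, the file name1 ends up containing information1 followed by information2. Because the output is opened in append mode (>>), running the loop a second time adds the same lines again, so remove old output files before a rerun.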