On Linux, I want to automatically extract data from .dat files and name the new files according to a string in the input files.
I have 300 .dat files with the following data structure:
.
.
.
DE name1, contig1 .
.
SQ
information1
//
.
.
DE name1, contig2 .
.
SQ
information2
//
.
where each “.” stands for a line that I don’t need. I want to extract all the “information” lines from each .dat file and write them to a new file named “name1”, the name taken from the DE line.
for file in *.dat;
do
awk '/SQ/{flag=1;next} /\/\//{flag=0} flag' "$file" > ???
done
What command would you recommend to perform this task?
Answer
You can use this awk one-liner:
awk -F '[, ]' '/^DE/ {filename=$2} /SQ/,/\/\// {print > filename}' file.dat
And here is a sample run:
$ ls
file.dat
$ cat file.dat
.
.
.
DE name1, contig1 .
.
SQ
information1
//
.
.
DE name2, contig2 .
.
SQ
information2
//
.
$ awk -F '[, ]' '/^DE/ {filename=$2} /SQ/,/\/\// {print > filename}' file.dat
$ ls
file.dat name1 name2
$ cat name1
SQ
information1
//
$ cat name2
SQ
information2
//
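The sample run above processes one file and keeps the SQ and // marker lines in the output. For the task as described in the question (all .dat files, only the lines between SQ and //), a minimal sketch might look like the following. The function name `extract_records` is hypothetical, and the sketch assumes that DE, SQ and // each start at the beginning of a line and that the name follows DE after a single space, as in the sample data.

```shell
# Hypothetical helper: for each input file, append the lines between
# "SQ" and "//" to a file named after the most recent DE line.
extract_records() {
    for file in "$@"; do
        awk -F '[, ]' '
            /^DE/   { name = $2 }                        # remember the name from the DE line
            /^\/\// { flag = 0                           # "//" ends the block ...
                      if (name != "") close(name) }      # ... so release the output file
            flag && name != "" { print >> name }         # copy lines inside the block
            /^SQ/   { flag = 1 }                         # start copying after the SQ line
        ' "$file"
    done
}
```

It would be called as `extract_records *.dat` in the directory holding the input files. The output is opened with `>>` so that records sharing a name, whether in the same file or in different .dat files, end up in one output file. The trade-off is that running it twice appends the same lines again, so old output files should be removed first.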