
Batch extracting data from files, naming new files according to a string in the input file

On Linux, I want to automatically extract data from .dat files and name the new files according to a string in each input file.

I have 300 .dat files with a data structure as follows:

.
.
.
DE name1, contig1 .
.
SQ
information1
//
.
.
DE name1, contig2 .
.
SQ
information2
//
.

where each “.” stands for a line that I don’t need. I now want to extract all the “information” blocks from each .dat file and write them to a new file named after the name on the DE line (here “name1”).

    for file in *.dat; do
        awk '/SQ/{flag=1;next}/"//"/{flag=0}flag' "$file" > ???
    done

What command would you recommend to perform this task?


Answer

You can use this awk one-liner:

awk -F '[, ]' '/^DE/ {filename=$2} /SQ/,/\/\// {print > filename}' file.dat

And here is a sample run:

$ ls
file.dat
$ cat file.dat 
.
.
.
DE name1, contig1 .
.
SQ
information1
//
.
.
DE name2, contig2 .
.
SQ
information2
//
.
$ awk -F '[, ]' '/^DE/ {filename=$2} /SQ/,/\/\// {print > filename}' file.dat
$ ls
file.dat  name1  name2
$ cat name1
SQ
information1
//
$ cat name2
SQ
information2
//
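
To cover all 300 files and keep only the lines between SQ and // (which is what the flag-based attempt in the question was aiming for), one possible sketch is the function below. The name extract_by_name is made up for illustration. It assumes the DE lines look exactly like the sample (a single space after DE, then the name followed by a comma) and that the output files do not exist yet, because it appends with >> so that blocks sharing a name across several input files end up in the same output file.

```shell
# Hypothetical helper: write the lines between "SQ" and "//" of every input
# file to an output file named after the most recent DE line.
extract_by_name() {
    awk -F '[, ]' '
        /^DE/   { filename = $2 }                         # name from "DE name1, contig1 ."
        /^\/\// { if (inblock) close(filename); inblock = 0 }   # "//" ends the block
        inblock && filename != "" { print >> filename }   # only lines strictly inside the block
        /^SQ/   { inblock = 1 }                           # start collecting after the SQ line
    ' "$@"
}
```

Call it as `extract_by_name *.dat` from the directory that holds the files. Closing each output file at the end of a block keeps the number of simultaneously open files small, which matters with some awk implementations when there are many distinct names.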
User contributions licensed under: CC BY-SA