Bash loop only reads the last line

Question

I have problems trying to extract the data after the colons on multiple lines using a while loop and awk. This is my data structure: What I want to get is the BioSample ID, which looks like SAMD00019077. Scripts I tried: while read line ; do echo $line | awk -F':' '{print $3}' > 1.tmp2 ; done &…

Accepted Answer

First of all, awk loops over the input lines by itself, and its field separator can be a regex. So your script can be reduced to this optimized form:

    awk -F'[;:]' '{print $3}' 1.tmp > 1.tmp2

Having said that, you might want to know what was wrong in your script.

    while read line ; do echo $line | awk -F':' '{print $3}' > 1.tmp2 ; done < 1.tmp
                                                             ^ here

The > marked above is the redirection operator. It writes the stdout of the command (awk in this case) to the file specified. It does not append; it overwrites. So in every iteration of the loop, the file is cleared and the output of the command is written to it. Hence only the last entry is left.

To fix that, you can use the append redirection, >>:

    while read line ; do echo $line | awk -F':' '{print $3}' >> 1.tmp2 ; done < 1.tmp

Now, there is a caveat. What if the file is not originally empty? This loop will append to the file without clearing it first. To fix that, you can clear the file before the loop:

    >1.tmp2; while read line ; do echo $line | awk -F':' '{print $3}' >> 1.tmp2 ; done < 1.tmp

However, if we are sure that all the stdout produced by the loop needs to go into the file, you can simply move the redirection out of the loop. That way, the shell does not have to open and close the file on every iteration.

    while read line ; do echo $line | awk -F':' '{print $3}'; done < 1.tmp > 1.tmp2

Note that these loop-based options are unoptimized, but they still work. The optimized option is to let awk itself do the line-by-line processing, as in the first snippet of this answer.
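To see the difference for yourself, here is a small sketch that runs the three variants side by side in a scratch directory. The input lines are invented for illustration (the real data layout is not shown in the question), and the output file names overwrite.out, loop.out and awk.out are placeholders. The loops also use IFS= read -r and printf instead of a bare read and echo $line, which is a defensive habit and not part of the original script.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so no existing file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: three colon-separated lines, ID in field 3.
printf '%s\n' 'x:y:SAMD00000001' 'x:y:SAMD00000002' 'x:y:SAMD00019077' > 1.tmp

# 1) '>' inside the loop: the file is truncated on every iteration,
#    so only the output of the last iteration survives.
while IFS= read -r line; do
    printf '%s\n' "$line" | awk -F':' '{print $3}' > overwrite.out
done < 1.tmp

# 2) '>' after 'done': the file is opened once for the whole loop,
#    so the output of every iteration is kept.
while IFS= read -r line; do
    printf '%s\n' "$line" | awk -F':' '{print $3}'
done < 1.tmp > loop.out

# 3) awk alone: it already reads the input line by line.
awk -F':' '{print $3}' 1.tmp > awk.out

echo "overwrite.out:"; cat overwrite.out
echo "loop.out:";      cat loop.out
echo "awk.out:";       cat awk.out
```

With this input, overwrite.out should hold only the last ID, while loop.out and awk.out should hold all three and be identical to each other.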
