I have a tab-delimited file that looks like this:
10	7.98862e-05
10	cellular organisms; Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales; Burkholderiaceae; Ralstonia; Ralstonia solanacearum	0.000159772
10	0.000207704
10	cellular organisms; Bacteria; Terrabacteria group; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium; Clostridium sp. C105KSO15	0.00092668
10	9.58635e-05
10	0.000191727
10	0.000127818
10	cellular organisms; Bacteria; Terrabacteria group; Firmicutes; Clostridia; Clostridiales; Peptostreptococcaceae; Clostridioides; Clostridioides difficile	0.00142198
10	cellular organisms; Bacteria; Terrabacteria group; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium; Clostridium sp. C105KSO15	0.00268418
It should have three columns, but some rows are missing the second column (the bacterial taxonomy). For these rows I want to move the number to the third column and insert some text (such as Unclassified) in the second.
I have been attempting to modify some commands I found online to insert Unclassified\t before any value in the second column that starts with a number, but have been unable to get it working.
Any help would be much appreciated.
Answer
Give this line a try:
awk -F'\t' -v OFS='\t' 'NF==2{$3=$2;$2="Unclassified"}7' file
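To see what the one-liner does, here is a minimal sketch run against two rows in the same shape as the question's data. The taxonomy string is shortened for illustration, and the variable names sample and result are made up for this example.

```shell
# Two illustrative rows: one missing the taxonomy column, one complete.
sample=$(printf '10\t7.98862e-05\n10\tcellular organisms; Bacteria\t0.000159772\n')

# -F'\t' splits each row on tabs; OFS='\t' is used when awk rebuilds a modified row.
# NF==2 selects rows with only two fields: the number is copied to field 3
# and field 2 is overwritten with the placeholder text.
# The trailing 1 is an always-true pattern, so every row is printed
# (the 7 in the answer works the same way, as any non-zero number is true).
result=$(printf '%s\n' "$sample" |
  awk -F'\t' -v OFS='\t' 'NF==2 { $3 = $2; $2 = "Unclassified" } 1')

printf '%s\n' "$result"
```

The first row should come out with Unclassified in the second column and the number in the third, while the complete row passes through unchanged. Testing the field count avoids having to match on whether the second column starts with a digit.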