Filtering on a condition using the column names and not numbers

Question

I am trying to filter a text file with columns based on two conditions. Due to the size of the file, I cannot use the column numbers (as there are thousands and are unnumbered) but need to use the column names. I have searched and tried to come up with multiple ways to do this but nothing is returned to

Accepted Answer

with your input file and the implied conditions this should work$ awk -v c1='ad' -v c2='bd' 'NR==1{n=split($0,h); for(i=1;i<=n;i++) col[h[i]]=i}                              $col[c1]=="c" && $col[c2]=="water"{count++} END{print count+0}' file2or you can replace c1 and c2 with the values in the script as well.to find the column indices you can run$ awk -v cols='ad bd' 'BEGIN{n=split(cols,c); for(i=1;i<=n;i++) colmap[c[i]]}                        NR==1{for(i=1;i<=NF;i++) if($i in colmap) print $i,i; exit}' filead 2bd 3or perhaps with this chain$ sed 1q file | tr -s ' ' 
 | nl | grep -E 'ad|bd'     2  ad     3  bdalthough may have false positives due to regex match&#8230;You can rewrite the awk to be more succinct$ awk -v cols='ad bd' '{while(++i<=NF) if(FS cols FS ~ FS $i FS) print $i,i;                         exit}' filead 2bd 3

Advertisement

Answer