Hi I have a file with 6 columns and I wish to know the average of three of these (columns 2,3,4) and the sum of the last two (columns 5 and 6) for each unique variable in column one.
A1234 0.526 0.123 0.456 0.986 1.123 A1234 0.423 0.256 0.397 0.876 0.999 A1234 0.645 0.321 0.402 0.903 1.101 A1234 0.555 0.155 0.406 0.888 1.009 B5678 0.111 0.345 0.285 0.888 0.789 B5678 0.221 0.215 0.305 0.768 0.987 B5678 0.336 0.289 0.320 0.789 0.921
I have come across code that will get the average for column 2 based on column one but is there anyway I can expand this across columns? Thanks
awk '{a[$1]+=$2; c[$1]++} END{for (i in a) printf "%d%s%.2fn", i, OFS, a[i]/c[i]}'
I would like the output to be in the following format ;each variable in column one will also have a different number of rows
A1234 0.53725 0.21375 0.41525 3.653 4.232 B5678 0.22233 0.283 0.30333 2.445 2.697
Advertisement
Answer
awk '{a[$1]+=$2;b[$1]+=$3;c[$1]+=$4;d[$1]+=$5;e[$1]+=$6;f[$1]++} END{for (i in a) print i,a[i]/f[i],b[i]/f[i],c[i]/f[i],d[i],e[i]}' file
O/p:
B5678 0.222667 0.283 0.303333 2.445 2.697 A1234 0.53725 0.21375 0.41525 3.653 4.232