I have a file that looks like this:
1 snp1 0.0 4
1 snp2 0.2 6
1 snp3 0.3 4
1 snp4 0.4 3
1 snp5 0.5 5
1 snp6 0.6 6
1 snp7 1.3 5
1 snp8 1.3 3
1 snp9 1.9 4
The file is sorted by column 3. I want the average of the 4th column, with rows grouped by column 3 in steps of 0.5 units. For example, the output should look like this:
1 snp1 0.0 4.4
1 snp6 0.6 6.0
1 snp7 1.3 4.0
1 snp9 1.9 4.0
I can print the positions, without the average, like this:
awk 'NR==1 {pos=$3; print $0} $3>=pos+0.5{pos=$3; print $0}' input
But I am not able to figure out how to print the average of the 4th column. It would be great if someone could help me find a solution to this problem. Thanks!
Answer
Something like this, maybe:
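awk '
  # first row starts the first group
  NR==1 {c1=$1; c2=$2; v=$3; n=1; s=$4; next}
  # more than 0.5 past the group start: print the group, start a new one
  $3>v+0.5 {print c1, c2, v, s/n; c1=$1; c2=$2; v=$3; n=1; s=$4; next}
  # otherwise add the row to the current group
  {n+=1; s+=$4}
  # print the last group
  END {print c1, c2, v, s/n}
' input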
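If the averages need to look exactly like the sample in the question (6.0 rather than 6), a variant along these lines might work. It is only a sketch: the `emit` helper, the `%.1f` format, the check for empty input in `END`, and the temporary file holding the sample data are my own assumptions, not something the question asked for.

```shell
# Sample data from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
1 snp1 0.0 4
1 snp2 0.2 6
1 snp3 0.3 4
1 snp4 0.4 3
1 snp5 0.5 5
1 snp6 0.6 6
1 snp7 1.3 5
1 snp8 1.3 3
1 snp9 1.9 4
EOF

out=$(awk '
  # emit(): print the current group, with the mean formatted to one decimal
  function emit() { printf "%s %s %s %.1f\n", c1, c2, v, s / n }

  # first row starts the first group
  NR == 1      { c1 = $1; c2 = $2; v = $3; n = 1; s = $4; next }
  # more than 0.5 past the group start: print the group, start a new one
  $3 > v + 0.5 { emit(); c1 = $1; c2 = $2; v = $3; n = 1; s = $4; next }
  # otherwise add the row to the current group
               { n++; s += $4 }
  # print the last group, unless the input was empty
  END          { if (n) emit() }
' "$input")

printf '%s\n' "$out"
rm -f "$input"
```

On the sample data this groups snp1 to snp5 together (22/5 = 4.4), because the comparison is strictly greater than, so a row exactly 0.5 away from the group start still counts as part of that group. Values that land right on a boundary are compared as floating-point numbers, so it may be worth checking such edge cases on real data.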