How to handle missing values while calculating average in shell script?

Question

I have a dataset with many missing values as double slash (//). Part of the data is I would like calculate the average in each 6 rows interval without considering the missing values. I was trying with this, but not getting what I need. It is giving But the answer should come Answer You need to modify the awk a

Accepted Answer

You need to modify the awk a bit to obtain the output as$ awk '!///{sum += $1; count++} NR%6==0{print sum/count;sum=count=0}'Test$ awk '!///{sum += $1; count++} NR%6==0{print sum/count;sum=count=0}' file29.442.5What it does?!///{sum += $1; count++}!/// pattern checks if the line doesn&#8217;t contain /{sum += $1; count++} when the line doesn&#8217;t contain / this action is performed, which sums the column 1, $1 to sum and increments a count which essentially tells us how many digits awk has seen till the next print.NR%6==0{print sum/count;sum=count=0} When the number of records, NR is multiple of 6, we print the average, sum/count and resets the count and sum variables.EditTo print the last set of lines which may be less than 6 in number, we can use an END block as$ awk '!///{sum += $1; count++} NR%6==0{print sum/count;sum=count=0} END{print sum/count}' file29.442.5END{print sum/count}' The END block will be updated when the file reaches its end.Edit 2Edge case when no numbers occur in the 6 lines, the above script can lead to divide by zero error. The print statement can be formatted to handle this case.print count ? (sum/count) : count;sum=count=0}This is a basic ternary operator which checks if count is non zero, if so prints the divided value , else print count, 0Test$ awk '!///{sum += $1; count++} NR%6==0{print count ? (sum/count) : count;sum=count=0}' file29.442.50

Advertisement

Answer