I have data in a text file arranged in two columns. I want to calculate the average of the second column for each group of repeated values in the first column. For example, the first three rows share the same first-column value, so they should give one average, and so on. I would be grateful for any help you can provide.
0.628319 0.123401
0.628319 0.23044
0.628319 4.57734
0.888577 0.390783
1.40496 0.110672
1.40496 0.239377
1.40496 0.248376
1.40496 0.751108
1.40496 0.971678
1.40496 1.36865
Answer
Put the data in an Excel file and read it into a pandas DataFrame, then compute the mean of the second column grouped by the first column.
import pandas as pd

# header=None because there are no column headers in my XLSX file
# Column names will be integers: 0 and 1
data = pd.read_excel("physics.xlsx", header=None, engine="openpyxl")

# What does "grouped means" mean?: Group the column-1 values by their column-0 value and take the mean of each group
grp_means = data.groupby(0).mean()
print(grp_means)
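Since the data in the question is already a whitespace-separated text file, the Excel step could probably be skipped. Below is a minimal sketch of that variant, assuming the columns are separated by spaces and there is no header row. The column names `x` and `y` are made up for illustration, and the sample rows from the question are embedded in a string so the snippet runs on its own.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice, pass the path of the
# text file to read_csv instead of this in-memory buffer.
raw = """\
0.628319 0.123401
0.628319 0.23044
0.628319 4.57734
0.888577 0.390783
1.40496 0.110672
1.40496 0.239377
1.40496 0.248376
1.40496 0.751108
1.40496 0.971678
1.40496 1.36865
"""

# sep=r"\s+" splits on any run of whitespace; header=None because the file
# has no header row. The names "x" and "y" are hypothetical labels.
data = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None, names=["x", "y"])

# One mean of y per distinct value of x (groups come back sorted by x).
group_means = data.groupby("x")["y"].mean()
print(group_means)
```

The result is a Series indexed by the distinct first-column values. Note that `groupby` collects all rows with the same first-column value, whether or not they are adjacent in the file.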