I have data in a text file arranged in two columns. I want to calculate the average of the second column for each group of repeated values in the first column. For example, the first three rows share the same first-column value, so their second-column values should be averaged into a single number, and so on for each group. I would be grateful for any help you can provide.

0.628319 0.123401

0.628319 0.23044

0.628319 4.57734

0.888577 0.390783

1.40496 0.110672

1.40496 0.239377

1.40496 0.248376

1.40496 0.751108

1.40496 0.971678

1.40496 1.36865

## Answer

Put the data in an Excel file and read it into a pandas DataFrame. Then compute the mean of the second column, grouped by the value in the first column.

    import pandas as pd

    # header=None because there are no column headers in my XLSX file
    # Column names will be integers: 0 and 1
    data = pd.read_excel("physics.xlsx", header=None, engine="openpyxl")

    # What does "grouped means" mean? Group the column-1 values by their
    # column-0 value and take the mean of each column-1 group
    grp_means = data.groupby(0).mean()
    print(grp_means)
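Since the question starts from a text file, the Excel step can probably be skipped. The sketch below is one way to do that, under a few assumptions: the columns are separated by whitespace, and the column names `x` and `y` are labels chosen here for illustration. The sample rows from the question are embedded in a string so the snippet runs on its own; for a real file, pass its path to `pd.read_csv` instead of the `io.StringIO` object.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real file path could be passed
# to pd.read_csv in place of the StringIO object.
raw = """\
0.628319 0.123401
0.628319 0.23044
0.628319 4.57734
0.888577 0.390783
1.40496 0.110672
1.40496 0.239377
1.40496 0.248376
1.40496 0.751108
1.40496 0.971678
1.40496 1.36865
"""

# sep=r"\s+" splits on any run of whitespace; header=None because the data
# has no header row. The names "x" and "y" are illustrative only.
data = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None, names=["x", "y"])

# Group rows that share the same x value and average y within each group.
grp_means = data.groupby("x")["y"].mean()
print(grp_means)
```

Note that grouping on floating-point keys relies on the repeated values being exactly equal. That holds here because identical text parses to identical floats, but values that differ in the last digit would land in separate groups.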