I want to compare two columns in a file, as shown below, using AWK. Can someone help, please?
e.g.
    Col1 Col2
    ---- ----
    2    A
    2    D
    3    D
    3    D
    3    A
    7    N
    7    M
    1    D
    1    R
Now I want to use AWK to implement the following algorithm to find matches between those columns:
    list1[] <=== Col1
    list2[] <=== Col2
    NewList[]
    for i in col2:
        d = 0
        for j in range(1,len(col2)):
            if i == list2[j]:
                d++
        NewList.append(list1[list2.index[i]])
Expected result:
    A ==> 2 // means A matches two times to Col1
    D ==> 4 // means D matches four times to Col1
    ....
So I want to write the above code as an AWK script, but I find it too complicated, as I haven't used AWK before.
Thank you very much for your help.
Answer
It's not all that complicated: keep the count in an array indexed by the character, and print the array out at the end:
    awk '{cnt[$2]++} END {for(c in cnt) print c, cnt[c]}' test.txt

    # A 2
    # D 4
    # M 1
    # N 1
    # R 1

    {cnt[$2]++}                         # For each row, get the second column and increase the
                                        # value of the array at that position (ie cnt['A']++)
    END {for(c in cnt) print c, cnt[c]} # When all rows done (END), loop through the keys of the
                                        # array and print key and array[key] (the value)
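If test.txt still contains the two header rows from the question, the one-liner would also count `Col2` and `----` as values. Also, awk does not define the order in which `for (c in cnt)` visits the keys. A minimal sketch that handles both, assuming the header occupies exactly the first two lines; the function name `count_col2` is only illustrative, and the file is recreated here so the example runs on its own:

```shell
# Recreate the sample input from the question (assumed layout:
# two header lines followed by whitespace-separated data rows).
cat > test.txt <<'EOF'
Col1 Col2
---- ----
2 A
2 D
3 D
3 D
3 A
7 N
7 M
1 D
1 R
EOF

# Hypothetical helper: count how often each Col2 value occurs.
# NR > 2 skips the two header lines (an assumption about the file);
# sort gives a stable order, since awk's for-in order is unspecified.
count_col2() {
    awk 'NR > 2 {cnt[$2]++} END {for (c in cnt) print c, cnt[c]}' "$1" | sort
}

count_col2 test.txt
```

If the file has no header rows, drop the `NR > 2` condition and the awk program is the same as the one above.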