Faster solution to compare files in bash

Question

file1: and file2: I need to find the entries of file2 that match in file1 and print the whole of file1 plus the second column of file2. So the output is: My solution is very slow in bash: I would prefer any FASTER bash or awk solution. The output can be modified, but it needs to keep all the information (the order of the columns can be different). EDIT: Right now it looks…

Accepted Answer

Another solution using join and sed, under the assumption that file1 and file2 are sorted:

    join <(sed -r 's/[^ _]+_[^_]+/& &/' file1) file2 -1 4 -2 1 -o "1.1 1.2 1.3 1.5 2.2" > output

If the output order doesn't matter, you can use awk:

    awk 'FNR==NR{d[$1]=$2; next} {split($4,v,"_"); key=v[1]"_"v[2]; if(key in d) print $0, d[key]}' file2 file1

You get:

    chr1 14361 14829 NR_024540_0_r_DDX11L1,WASH7P_468 11
    chr1 14969 15038 NR_024540_1_r_WASH7P_69 11
    chr1 15795 15947 NR_024540_2_r_WASH7P_152 11
    chr1 16606 16765 NR_024540_3_r_WASH7P_15 11
    chr1 16857 17055 NR_024540_4_r_WASH7P_198 11
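To try the awk approach end to end, here is a minimal sketch that runs it on a tiny data set. The original input files are not shown here, so the sample contents are assumptions: the file1 rows are reconstructed from the output listed in the answer, file2 is assumed to hold a key in column 1 and a value in column 2, and the `XX_000001` row is a made-up non-matching line. The `workdir` variable is just a scratch directory for the example.

```shell
# Scratch directory for the hypothetical sample files.
workdir=$(mktemp -d)

# file1: assumed layout "chrom start end name"; the first two
# underscore-separated parts of column 4 form the lookup key.
cat > "$workdir/file1" <<'EOF'
chr1 14361 14829 NR_024540_0_r_DDX11L1,WASH7P_468
chr1 14969 15038 NR_024540_1_r_WASH7P_69
chr1 20000 20100 XX_000001_0_f_EXAMPLE_100
EOF

# file2: assumed layout "key value".
cat > "$workdir/file2" <<'EOF'
NR_024540 11
EOF

awk '
    # First file on the command line (file2): remember value by key.
    FNR == NR { d[$1] = $2; next }
    # Second file (file1): rebuild the key from column 4 and look it up.
    {
        split($4, v, "_")
        key = v[1] "_" v[2]
        if (key in d) print $0, d[key]
    }
' "$workdir/file2" "$workdir/file1" > "$workdir/output"

cat "$workdir/output"
```

With this sample, only the two `NR_024540` rows are printed, each followed by `11`. Because file2 is loaded into an in-memory array and file1 is read once, neither file has to be sorted, unlike with join.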

Answer