
How to compare the columns of file1 to the columns of file2, select matching values, and output to new file using grep or unix commands

I have two files, file1 and file2, where target_id is the first column in both.

I want to compare file1 to file2 and keep only the rows of file1 whose target_id also appears in file2.

file1

file2:

target_id
ENSMUST00000128641.2
ENSMUST00000185334.7
ENSMUST00000170213.2
ENSMUST00000232944.2

Any help would be appreciated.

I tried % grep -x -f file1 file2, but it produced no output in my terminal.


Answer

Here is sample data in which the two files actually overlap.

  • file1.csv:

    target_id,KO_1D_7dpi,KO_2D_7dpi
    ENSMUST00000178537.2,0,0
    ENSMUST00000178862.2,0,0
    ENSMUST00000196221.2,0,0
    ENSMUST00000179664.2,0,0
    ENSMUST00000177564.2,0,0
    
  • file2.csv:

    target_id
    ENSMUST00000178537.2
    ENSMUST00000196221.2
    ENSMUST00000177564.2
    

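Your grep command, but with the two files swapped and -x dropped. The IDs in file2 are the patterns to look for in file1, and -x would require a whole line of file1 to equal an ID, which never happens because each row has extra columns: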

$ grep -F -f file2.csv file1.csv
target_id,KO_1D_7dpi,KO_2D_7dpi
ENSMUST00000178537.2,0,0
ENSMUST00000196221.2,0,0
ENSMUST00000177564.2,0,0
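
Since the question also asks for the result in a new file, here is a minimal sketch of that step. The file names file1_demo.csv, file2_demo.csv and matched_demo.csv are hypothetical, chosen so that nothing real gets overwritten; the contents are the sample data above.

```shell
# Recreate the sample data under hypothetical demo names.
printf '%s\n' \
  'target_id,KO_1D_7dpi,KO_2D_7dpi' \
  'ENSMUST00000178537.2,0,0' \
  'ENSMUST00000178862.2,0,0' \
  'ENSMUST00000196221.2,0,0' \
  'ENSMUST00000179664.2,0,0' \
  'ENSMUST00000177564.2,0,0' > file1_demo.csv

printf '%s\n' \
  'target_id' \
  'ENSMUST00000178537.2' \
  'ENSMUST00000196221.2' \
  'ENSMUST00000177564.2' > file2_demo.csv

# Same grep as above; the shell redirection (>) writes the matching rows
# to a new file instead of the terminal.
grep -F -f file2_demo.csv file1_demo.csv > matched_demo.csv

cat matched_demo.csv
```

The header row is kept only because file2 also contains the line target_id; if the ID list had no header, the header of file1 would not be matched.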

Edit: I added the -F option because this is a fixed-string search. It also stops the . in each ID from being treated as a regex metacharacter that matches any character. Thanks to @Sundeep for the recommendation.
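
One caveat: grep -F -f matches an ID anywhere in the line, including as a substring of a longer ID. If that could matter for your data, an alternative (not part of the grep approach above) is to compare only the first column with awk. This is a sketch under assumptions: the fields are comma-separated, the file names table_demo.csv, ids_demo.csv and exact_demo.csv are hypothetical, and the row ENSMUST00000178537.20 is an invented example added only to show the difference.

```shell
# Hypothetical demo data: the sample rows plus one invented row whose ID
# merely starts with a wanted ID (ENSMUST00000178537.20).
printf '%s\n' \
  'target_id,KO_1D_7dpi,KO_2D_7dpi' \
  'ENSMUST00000178537.2,0,0' \
  'ENSMUST00000178537.20,0,0' \
  'ENSMUST00000178862.2,0,0' \
  'ENSMUST00000196221.2,0,0' \
  'ENSMUST00000177564.2,0,0' > table_demo.csv

printf '%s\n' \
  'target_id' \
  'ENSMUST00000178537.2' \
  'ENSMUST00000196221.2' \
  'ENSMUST00000177564.2' > ids_demo.csv

# First file (NR == FNR): remember every ID in the array "ids".
# Second file: print a row only if its first comma-separated field
# is exactly one of the remembered IDs.
awk -F, 'NR == FNR { ids[$1]; next } $1 in ids' ids_demo.csv table_demo.csv > exact_demo.csv

cat exact_demo.csv
```

With this data, grep -F -f ids_demo.csv table_demo.csv would also keep the invented .20 row, while the awk version drops it because the first field has to match an ID exactly.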

User contributions licensed under: CC BY-SA