extracting unique values between 2 sets/files

Working in a Linux shell environment, how can I accomplish the following?

text file 1 contains:

JavaScript

text file 2 contains:

JavaScript

I need to extract the entries in file 2 which are not in file 1. So ‘6’ and ‘7’ in this example.

How do I do this from the command line?

many thanks!

Answer

JavaScript

Explanation of how the code works:

  • If we’re working on file1, track each line of text we see.
  • If we’re working on file2, and have not seen the line text, then print it.

Explanation of details:

  • FNR is the record (line) number within the current input file
  • NR is the overall record number across all input files
  • FNR==NR is true only while awk is reading the first file (file1), since that is the only time the two counters match (assuming file1 is not empty)
  • $0 is the current line of text
  • a is an associative array (hash); a[$0] is its entry keyed by the current line of text
  • a[$0]++ creates or increments that entry, which records that we’ve seen the current line of text
  • !($0 in a) is true only when the current line is not a key in a, i.e. we have not seen it in file1
  • When that pattern is true, awk prints the line. Printing is awk’s default behavior when a pattern has no explicit action.
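
The points above describe an awk command of roughly the following shape. This is a sketch reconstructed from the explanation, not necessarily the exact command from the answer. The sample file contents are made up for illustration (chosen so that ‘6’ and ‘7’ are the extra entries, as in the question), and the `next` statement is an assumption: it stops file1 lines from also being tested against the second pattern.

```shell
# Hypothetical sample data; the original file contents are not shown above.
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 > "$tmpdir/file1"
printf '%s\n' 1 2 3 4 5 6 7 > "$tmpdir/file2"

# First pattern: while reading file1 (FNR==NR), record each line as a key
# in the array a, then skip straight to the next input line.
# Second pattern: for file2, print any line that is not a key in a.
result=$(awk 'FNR==NR { a[$0]++; next } !($0 in a)' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With this sample data the command prints 6 and 7, each on its own line. The order of the file arguments matters: the file listed first is the one whose lines get excluded.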
User contributions licensed under: CC BY-SA