
How to get the lines in one file that contain a string (with repetition) from another file?

File 1:

a
a
b
c
d

File 2:

a a1
b b1
e e1
f f1

My desired output:

a a1
a a1
b b1

I am trying to implement this using bash or Python.

In Python I tried:

f1=open("file1")
f2=open("file2")
dpo1=f1.readlines()
dpo2=f2.readlines()

for i in dpo2:
    for j in dpo1:
        if j in i:
            print i

In bash I tried:

awk 'NR == FNR { ++h[tolower($1)]; next; } h[tolower($1)]' file1 file2

But this does not account for repetitions: each matching line of file2 is printed only once, however many times its key appears in file1. It gives the output:

a a1
b b1

Any ideas?


Answer

Here’s one way you could do it using awk:

$ awk 'NR==FNR{a[$1]=$2;next}$0 in a{print $0,a[$0]}' file2 file1
a a1
a a1
b b1

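Note the argument order: file2 is passed first. While awk reads it (NR==FNR), each key-value pair is stored in the array a. Then, for every line of file1 that is a key in a, the line is printed together with its stored value. Since the output is driven by the lines of file1, a key that is repeated there is printed once per occurrence.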
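Since the question also asks about Python, here is a rough equivalent of the same idea, as a sketch rather than a definitive solution. The function name matching_lines and its parameters are made up for illustration. It assumes the key is the first whitespace-separated field of each file2 line, and that if a key occurs more than once in file2 the last occurrence wins (as it would in the awk array).

```python
def matching_lines(keys, pairs):
    """Yield one line from `pairs` for every occurrence of its key in `keys`.

    `keys`  : iterable of lines, each holding a single key (like file1)
    `pairs` : iterable of "key value" lines (like file2)
    """
    # Build a lookup table: first field -> whole line (without the newline).
    lookup = {}
    for line in pairs:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            lookup[fields[0]] = line.rstrip("\n")

    # Walk the keys in order, so repeated keys give repeated output.
    for key in keys:
        key = key.strip()
        if key in lookup:
            yield lookup[key]


if __name__ == "__main__":
    # Sample data from the question, held in memory instead of files.
    file1 = ["a\n", "a\n", "b\n", "c\n", "d\n"]
    file2 = ["a a1\n", "b b1\n", "e e1\n", "f f1\n"]
    for line in matching_lines(file1, file2):
        print(line)
```

Open file objects are iterables of lines, so with real files you could call it as matching_lines(open("file1"), open("file2")). Looking keys up in a dict also sidesteps a problem with the nested-loop attempt in the question, where readlines() keeps the trailing newline on each key and the substring test "a\n" in "a a1\n" is therefore false.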

User contributions licensed under: CC BY-SA