Merge two files using awk in linux

I have a 1.txt file:

betomak@msn.com||o||0174686211||o||7880291304ca0404f4dac3dc205f1adf||o||Mario||o||Mario||o||Kawati
zizipi@libero.it||o||174732943.0174732943||o||e10adc3949ba59abbe56e057f20f883e||o||Tiziano||o||Tiziano||o||D'Intino
frankmel@hotmail.de||o||0174844404||o||8d496ce08a7ecef4721973cb9f777307||o||Melanie||o||Melanie||o||Kiesel
apoka-paris@hotmail.fr||o||0174847613||o||536c1287d2dc086030497d1b8ea7a175||o||Sihem||o||Sihem||o||Sousou
sofianomovic@msn.fr||o||174902297.0174902297||o||9893ac33a018e8d37e68c66cae23040e||o||Nabile||o||Nabile||o||Nassime
donaldduck@yahoo.com||o||174912161.0174912161||o||0c770713436695c18a7939ad82bc8351||o||Donald||o||Donald||o||Duck
cernakova@centrum.cz||o||0174991962||o||d161dc716be5daf1649472ddf9e343e6||o||Dagmar||o||Dagmar||o||Cernakova
trgsrl@tiscali.it||o||0175099675||o||d26005df3e5b416d6a39cc5bcfdef42b||o||Esmeralda||o||Esmeralda||o||Trogu
catherinesou@yahoo.fr||o||0175128896||o||2e9ce84389c3e2c003fd42bae3c49d12||o||Cat||o||Cat||o||Sou
ermimurati24@hotmail.com||o||0175228687||o||a7766a502e4f598c9ddb3a821bc02159||o||Anna||o||Anna||o||Beratsja
cece_89@live.fr||o||0175306898||o||297642a68e4e0b79fca312ac072a9d41||o||Celine||o||Celine||o||Jacinto
kendinegel39@hotmail.com||o||0175410459||o||a6565ca2bc8887cde5e0a9819d9a8ee9||o||Adem||o||Adem||o||Bulut

JavaScript
​x
 
betomak@msn.com||o||0174686211||o||7880291304ca0404f4dac3dc205f1adf||o||Mario||o||Mario||o||Kawatizizipi@libero.it||o||174732943.0174732943||o||e10adc3949ba59abbe56e057f20f883e||o||Tiziano||o||Tiziano||o||D'Intinofrankmel@hotmail.de||o||0174844404||o||8d496ce08a7ecef4721973cb9f777307||o||Melanie||o||Melanie||o||Kieselapoka-paris@hotmail.fr||o||0174847613||o||536c1287d2dc086030497d1b8ea7a175||o||Sihem||o||Sihem||o||Sousousofianomovic@msn.fr||o||174902297.0174902297||o||9893ac33a018e8d37e68c66cae23040e||o||Nabile||o||Nabile||o||Nassimedonaldduck@yahoo.com||o||174912161.0174912161||o||0c770713436695c18a7939ad82bc8351||o||Donald||o||Donald||o||Duckcernakova@centrum.cz||o||0174991962||o||d161dc716be5daf1649472ddf9e343e6||o||Dagmar||o||Dagmar||o||Cernakovatrgsrl@tiscali.it||o||0175099675||o||d26005df3e5b416d6a39cc5bcfdef42b||o||Esmeralda||o||Esmeralda||o||Trogucatherinesou@yahoo.fr||o||0175128896||o||2e9ce84389c3e2c003fd42bae3c49d12||o||Cat||o||Cat||o||Souermimurati24@hotmail.com||o||0175228687||o||a7766a502e4f598c9ddb3a821bc02159||o||Anna||o||Anna||o||Beratsjacece_89@live.fr||o||0175306898||o||297642a68e4e0b79fca312ac072a9d41||o||Celine||o||Celine||o||Jacintokendinegel39@hotmail.com||o||0175410459||o||a6565ca2bc8887cde5e0a9819d9a8ee9||o||Adem||o||Adem||o||Bulut​

A 2.txt file:

9893ac33a018e8d37e68c66cae23040e:134:@a1
536c1287d2dc086030497d1b8ea7a175:~~@!:/92
8d496ce08a7ecef4721973cb9f777307:demodemo

JavaScript
 
9893ac33a018e8d37e68c66cae23040e:134:@a1536c1287d2dc086030497d1b8ea7a175:~~@!:/928d496ce08a7ecef4721973cb9f777307:demodemo​

FS for 1.txt is “||o||” and for 2.txt is “:” I want to merge two files in a single file result.txt based on the condition that the 3rd column of 1.txt must match with 1st column of 2.txt file and should be replaced by the 2nd column of 2.txt file.

The expected output will contain all the matching lines: I am showing you one of them:

sofianomovic@msn.fr||o||174902297.0174902297||o||134:@a1||o||Nabile||o||Nabile||o||Nassime

JavaScript
 
sofianomovic@msn.fr||o||174902297.0174902297||o||134:@a1||o||Nabile||o||Nabile||o||Nassime​

I tried the script:

awk -F"||o||"  'NR==FNR{s=$0; sub(/:[^:]*$/, "", s); a[s]=$NF;next} {s = $5; for (i=6; i<=NF; ++i) s = s "," $i; if (s in a) { NF = 5; $5=a[s]; print } }' FS=: <(tr -d 'r' < 2.txt) FS="||o||" OFS="||o||" <(tr -d 'r' < 1.txt) > result.txt

JavaScript
 
awk -F"||o||"  'NR==FNR{s=$0; sub(/:[^:]*$/, "", s); a[s]=$NF;next} {s = $5; for (i=6; i<=NF; ++i) s = s "," $i; if (s in a) { NF = 5; $5=a[s]; print } }' FS=: <(tr -d 'r' < 2.txt) FS="||o||" OFS="||o||" <(tr -d 'r' < 1.txt) > result.txt​

But getting an empty file as the result. Any help would be highly appreciated.

Answer

If your actual Input_file(s) are same as shown sample then following awk may help you in same.

awk -v s1="||o||" '
FNR==NR{
  a[$9]=$1 s1 $5;
  b[$9]=$13 s1 $17 s1 $21;
  next
}
($1 in a){
  print a[$1] s1 $2 FS $3 s1 b[$1]
}
' FS="|" 1.txt FS=":" 2.txt

JavaScript
 
awk -v s1="||o||" 'FNR==NR{  a[$9]=$1 s1 $5;  b[$9]=$13 s1 $17 s1 $21;  next}($1 in a){  print a[$1] s1 $2 FS $3 s1 b[$1]}' FS="|" 1.txt FS=":" 2.txt​

EDIT: Since OP has changed requirement a bit so providing code as per new ask where it will create 2 files too 1 file which will have ids present in 1.txt and NOT in 2.txt and other will be vice versa of it.

awk -v s1="||o||" '
FNR==NR{
  a[$9]=$1 s1 $5;
  b[$9]=$13 s1 $17 s1 $21;
  c[$9]=$0;
  next
}
($1 in a){
  val=$1;
  $1="";
  sub(/:/,"");
  print a[val] s1 $0 s1 b[val];
  d[val]=$0;
  next
}
{
  print > "NOT_present_in_2.txt"
}
END{
for(i in d){
  delete c[i]
};
for(j in c){
  print j,c[j] > "NOT_present_in_1.txt"
}}
' FS="|" 1.txt FS=":" OFS=":" 2.txt

JavaScript
 
awk -v s1="||o||" 'FNR==NR{  a[$9]=$1 s1 $5;  b[$9]=$13 s1 $17 s1 $21;  c[$9]=$0;  next}($1 in a){  val=$1;  $1="";  sub(/:/,"");  print a[val] s1 $0 s1 b[val];  d[val]=$0;  next}{  print > "NOT_present_in_2.txt"}END{for(i in d){  delete c[i]};for(j in c){  print j,c[j] > "NOT_present_in_1.txt"}}' FS="|" 1.txt FS=":" OFS=":" 2.txt​

Advertisement

Answer