I am looking for a way to remove the content of one .txt file from another.
For example, I have a file, file.txt, with 2000 random, unsorted characters. I have another file, importantfile.txt, with 2016 characters: the same characters that file.txt has, plus 16 other characters placed randomly among them.
Is there any way to remove the characters in file.txt from importantfile.txt to find the 16-character string?
One problem I ran into with the diff command is that it prints the whole string, because the string is considered one word. Running diff file.txt importantfile.txt
would return w881lYoi8042aKGfwj7EjenViinsmbmnWIHJMZ2T9L40KiLr4x485TM3gKmc1Ig8n6VVW82iqjxypCp19sXIMisX4HIkp54lVohqKSuLjjuns91GiEwtTsvN0zhn6c9GZC2GqUKLsy9v1SvSKvdSPBmIJtNoSwr65BBGqLQ1LdHg93kfZoCq5NPxkaYjIyppzYaczGlwZBrsKyjbTEI5B1aWuw6g9xBZ1viussKRP5C5Pq5yO14P8xBDHGugo93mwf7rsjNehNuxDSAt
shortened for obvious reasons, but the start of both strings would be w881l.....
I also tried JavaScript, using importantfile.replace("file", ""); but it returns the whole string as well. Anything helps, thanks.
Answer
If I’m understanding correctly, how about:
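awk '
    NR==FNR {str1 = $0; next}    # first file: keep the string from file.txt
    {str2 = $0}                  # second file: keep the string from importantfile.txt
    END {
        # walk both strings; i indexes str1, j indexes str2
        for (i = j = 1; j <= length(str2); ) {
            if (substr(str1, i, 1) == substr(str2, j, 1)) {
                incr = 1         # characters match: advance in both strings
            } else {
                incr = 0         # extra character in str2: print it, stay put in str1
                printf "%s", substr(str2, j, 1)
            }
            i += incr; j++
        }
        print ""
    }
' file.txt importantfile.txt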
Output:
d5
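Since the question also mentions JavaScript: replace("file", "") only searches for the literal text "file" inside the string, which is why the whole string comes back unchanged. A rough sketch of the same walk as the awk script might look like the following. The function name extractInserted and the sample strings are made up for illustration, and it assumes the characters of file.txt appear in importantfile.txt in their original order.

```javascript
// Walk both strings in step; any character of `full` that does not match
// the current character of `base` is treated as an inserted character.
// (extractInserted is a hypothetical helper name, not an existing API.)
function extractInserted(base, full) {
  let i = 0;          // position in base (the contents of file.txt)
  let inserted = "";  // collects the extra characters found in full
  for (let j = 0; j < full.length; j++) {
    if (i < base.length && base[i] === full[j]) {
      i++;                  // characters agree: advance in base as well
    } else {
      inserted += full[j];  // character present only in full: keep it
    }
  }
  return inserted;
}

// Made-up sample strings, not the data from the question.
console.log(extractInserted("abcdef", "abXcdeYf")); // prints "XY"
```

In Node.js the two strings could be read with fs.readFileSync(path, "utf8") and trimmed of the trailing newline before being passed in. Like the awk version, this matches greedily, so if an inserted character happens to equal the next expected character from file.txt, the extra characters reported may not be the exact ones that were inserted.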