Read line by line and print matches line by line

I am new to shell scripting, and it would be great if I could get some help with the question below.

I want to read a text file line by line, and print all matched patterns in that line to a line in a new text file.

For example:

$ cat input.txt

SYSTEM ERROR: EU-1C0A  Report error -- SYSTEM ERROR: TM-0401 DEFAULT Test error
SYSTEM ERROR: MG-7688 DEFAULT error -- SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error -- ERROR: MG-3218 error occured in HSSL
SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error
SYSTEM ERROR: EU-1C0A  error Failed to fill in test report -- ERROR: MG-7688

The intended output is as follows:

$ cat output.txt

EU-1C0A TM-0401
MG-7688 DN-0A00 DN-0A52 MG-3218
DN-0A00 DN-0A52
EU-1C0A MG-7688

I tried the following code:

while read p; do
    grep -o '[A-Z]{2}-[A-Z0-9]{4}' | xargs
done < input.txt > output.txt

which produced this output:

EU-1C0A TM-0401 MG-7688 DN-0A00 DN-0A52 MG-3218 DN-0A00 DN-0A52 EU-1C0A MG-7688 .......

Then I also tried this:

while read p; do
    grep -o '[A-Z]{2}-[A-Z0-9]{4}' | xargs > output.txt
done < input.txt

But that did not help either 🙁

Maybe there is another way, I am open to awk/sed/cut or whatever… 🙂

Note: A single line can contain any number of error codes (i.e. XX-XXXX, the pattern of interest).

Answer

There’s always perl! And this will grab any number of matches per line.

perl -nle '@matches = /[A-Z]{2}-[A-Z0-9]{4}/g; print(join(" ", @matches)) if (scalar @matches);' < input.txt > output.txt

-e supplies the Perl code to run, -n runs that code once for each input line, and -l automatically chomps each line and appends a newline to every print.

The regex matches against $_ implicitly, so writing @matches = $_ =~ /.../g would be needlessly verbose.

If a line has no match, nothing is printed for it.
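If you would rather stay with the while/grep/xargs approach from the question, a minimal sketch follows. In the original loop, grep is never given "$p", so it reads from the loop's standard input rather than from the current line. The version below pipes each line to grep explicitly and uses -E so that {2} and {4} are treated as repetition counts. The variable names line and codes and the extra line without any code are only for illustration.

```shell
#!/usr/bin/env bash
# Sample input: two lines from the question plus one line without a code.
cat > input.txt <<'EOF'
SYSTEM ERROR: EU-1C0A  Report error -- SYSTEM ERROR: TM-0401 DEFAULT Test error
no error codes on this line
SYSTEM ERROR: DN-0A00 Error while getting object -- ERROR: DN-0A52 DEFAULT Error
EOF

while IFS= read -r line; do
    # Feed only the current line to grep; xargs joins the matches with spaces.
    codes=$(printf '%s\n' "$line" | grep -oE '[A-Z]{2}-[A-Z0-9]{4}' | xargs)
    # Skip lines that produced no match, like the perl version does.
    if [ -n "$codes" ]; then
        printf '%s\n' "$codes"
    fi
done < input.txt > output.txt

cat output.txt
```

This starts a grep and an xargs for every line, so it will likely be slower than the perl one-liner on large files, but it keeps the shape of the original attempt.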

Answer