
Trouble writing bash sed command – regex match

I have a file full of garbage collection information that is irregular: some lines have extra information that I want to remove first so I can then process the file as a whole.

Unfortunately the line has quite a few special characters and I am struggling with a sed command that manages to match the bit I want to remove…

The line includes something along the lines of this:

[ParOldGen: 0K->0K(0K)] 0K->0K(0K), [Metaspace: 0K->0K(0K)], 0 secs]

The line has other information around the above string which I do want to keep, that includes []() characters.

I want to match

[ParOldGen*secs]

and then remove it using sed

cat test.log | sed -e 's,<match>,,g' | ...

I went and checked on a regex checker, which came up with:

\[ParOldGen(?:(?!secs\])(?:.|\n))*secs\]

However, it doesn’t match with sed -e and it errors when using sed -E

I can’t use cut easily because there are too many other sections that have [ and ].

I was trying something like this:

cat test.log | while read line; do if [ "$line" == *"ParOldGen"* ];then cut -d ":" -f 1,9; else cut -d ":" -f 1,7; fi; done | tail

which would effectively work around it, but I have not been able to get a match on the ParOldGen, it always just executes the then portion.

My expected output is that I want to remove the ParOldGen line.

Is anyone able to help me with this one?

Thanks!


Answer

I am working on the assumption that you want to remove the entire string starting with [ParOldGen and finishing with secs] from each line in your file. In that case, you can use the following sed command:

sed -e 's/^\(.*\)\[ParOldGen.*secs\]\(.*\)$/\1\2/' test.log

The regexp grabs any characters before [ParOldGen into one capture group, and any characters after secs] into another. The entire line is then replaced by those two capture groups, effectively removing the characters from [ParOldGen to secs]. Because .* is greedy, the match runs to the last secs] on the line. For example, if test.log contains:

[Some other data (4) ][ParOldGen: 0K->0K(0K)] 0K->0K(0K), [Metaspace: 0K->0K(0K)], 0 secs] and then some more [possibly also with ()]

The output of cat test.log | sed -e 's/^\(.*\)\[ParOldGen.*secs\]\(.*\)$/\1\2/' is

[Some other data (4) ] and then some more [possibly also with ()]
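
As a side note, the capture groups are probably not required: substituting the matched text with nothing should give the same result. The sketch below assumes the same sample line as above; the variable line and the function strip_paroldgen are hypothetical names used only for illustration. It also hints at why the pattern from the regex checker fails: sed has no lookahead syntax such as (?!...), so a plain greedy .* is used instead.

```shell
#!/bin/sh
# Sample line copied from the example above (stands in for one line of test.log).
line='[Some other data (4) ][ParOldGen: 0K->0K(0K)] 0K->0K(0K), [Metaspace: 0K->0K(0K)], 0 secs] and then some more [possibly also with ()]'

# Hypothetical helper: delete everything from "[ParOldGen" to the last "secs]".
# \[ and \] match literal brackets; .* is greedy, so it stops at the LAST "secs]".
strip_paroldgen() {
  sed -e 's/\[ParOldGen.*secs\]//'
}

result=$(printf '%s\n' "$line" | strip_paroldgen)
printf '%s\n' "$result"
```

Lines that do not contain [ParOldGen pass through unchanged, so the command can be run over the whole file, e.g. sed -e 's/\[ParOldGen.*secs\]//' test.log. If a line could contain a second, unrelated secs] after the part to be removed, the greedy match would remove that text too, so check a few lines of the real log first.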
User contributions licensed under: CC BY-SA