I am new to regex and am having trouble understanding how to extract a group from a regular expression. I have a text file (example.txt):
libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 5.2.1)
I am trying to extract the “5.2.1” only from the libpackage line and put it into a variable for a bash script. I have tried doing
cat example.txt | grep -oP "libpackage-[a-z- (>=]+(.*)[)],"
But it gives me the entire line instead of that “5.2.1” section. How do I extract the first group from that line so I only get “5.2.1”?
Answer
You can use
val=$(grep -Po 'libpackage-.* \K[0-9.]+' example.txt)
Details:
-Po enables the PCRE regex engine (P) and outputs only the matched text (o).
libpackage-.* \K[0-9.]+ matches libpackage-, then any text, then a space; \K discards everything matched so far from the result; then the pattern matches and returns one or more digits or dots.
See an online demo:
s='libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 5.2.1)'
val=$(grep -Po 'libpackage-.* \K[0-9.]+' <<< "$s")
echo "$val"
# => 5.2.1
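One caveat worth checking: `.*` is greedy, so when all packages sit on one line it runs to the last space that is followed by a version, not the first. In the sample this goes unnoticed because libpackage and libtest share the same version. The sketch below uses a hypothetical sample in which libtest has version 4.0.0 (not from the question) and swaps `.*` for `[^)]*`, which cannot cross the closing parenthesis. It assumes GNU grep built with PCRE support.

```shell
# Hypothetical sample: libtest's version changed to 4.0.0 to expose the difference.
s='libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 4.0.0)'

# Greedy .* runs to the last " <version>" on the line, so this returns libtest's version.
greedy=$(grep -Po 'libpackage-.* \K[0-9.]+' <<< "$s")

# [^)]* stops at the first ")", so the match stays inside libpackage's own entry.
val=$(grep -Po 'libpackage-[^)]* \K[0-9.]+' <<< "$s")

echo "$greedy"   # 4.0.0
echo "$val"      # 5.2.1
```

If each package is on its own line in example.txt, both patterns give the same result.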
A GNU awk way:
val=$(awk -F'[ ()]+' '/^libpackage-/{print $3}' example.txt)
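Here, -F'[ ()]+' sets the field separator to one or more spaces or parentheses, /^libpackage-/ selects the line(s) starting with libpackage-, and {print $3} prints the value of field (column) 3. Because of the ^ anchor, this assumes each package entry sits on its own line in example.txt; it prints nothing for input where libpackage- only appears mid-line.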
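Since the question asks specifically about capture groups: grep -o always prints the whole match, which is why the grep solution relies on \K rather than a group. If a real capture group is preferred, bash's own `[[ ... =~ ... ]]` operator stores groups in the `BASH_REMATCH` array. This is a minimal sketch, not part of the original answer; the variable names `re` and `s` are illustrative, and it assumes the dependency is always written in the `(>= x.y.z)` form shown in the question.

```shell
#!/usr/bin/env bash
s='libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 5.2.1)'

# POSIX ERE kept in a variable so the spaces and parentheses need no quoting inside [[ ]].
# Group 1 captures the version number inside the parentheses.
re='libpackage-[^ ]+ \(>= ([0-9.]+)\)'

val=''
if [[ $s =~ $re ]]; then
  val=${BASH_REMATCH[1]}   # index 0 is the whole match, index 1 is the first group
fi

echo "$val"   # 5.2.1
```

To read from the file instead of a literal string, `s=$(<example.txt)` can replace the first assignment.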