I am new to using regex and having some issues understanding how to extract a group from a regular expression. I have a text file (example.txt):
libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 5.2.1)
I am trying to extract the “5.2.1” only from the libpackage line and put it into a variable for a bash script. I have tried doing
cat example.txt | grep -oP "libpackage-[a-z- (>=]+(.*)[)],"
But it gives me the entire line instead of that “5.2.1” section. How do I extract the first group from that line so I only get “5.2.1”?
Answer
You can use
val=$(grep -Po 'libpackage-.* \K[0-9.]+' example.txt)
Details:
-Po – enables the PCRE regex engine (P) and outputs matches only (o)
libpackage-.* \K[0-9.]+ – matches libpackage-, then any text and a space, then \K discards everything matched so far, and finally matches and returns one or more digits or dots.
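One caveat about the pattern: .* is greedy, so when several packages share a single line it runs ahead to the last space that precedes a version on that line. The sketch below makes this visible and shows a tighter variant. The sample string is hypothetical (the libtest version is changed to 9.9.9 so a wrong match stands out), and the [^,]* variant assumes entries are separated by commas as in the question.

```shell
# Hypothetical sample: libtest's version differs so a wrong match is visible.
s='libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 9.9.9)'

# Greedy .* runs to the last space that precedes a version on the line.
greedy=$(printf '%s\n' "$s" | grep -Po 'libpackage-.* \K[0-9.]+')

# [^,]* cannot cross the comma that ends the libpackage entry.
val=$(printf '%s\n' "$s" | grep -Po 'libpackage-[^,]* \K[0-9.]+')

echo "greedy: $greedy"   # greedy: 9.9.9
echo "tight:  $val"      # tight:  5.2.1
```

If every package sits on its own line, both patterns return the same value.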
See an online demo:
s='libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 5.2.1)'
val=$(grep -Po 'libpackage-.* \K[0-9.]+' <<< "$s")
echo "$val" # => 5.2.1
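grep -o prints the whole match rather than a capture group, which is why the pattern above discards the prefix instead of capturing the version. If a literal capture group is preferred, one alternative (not part of the answer above) is sed with a \1 back-reference. A minimal sketch, assuming the version always appears as "(>= VERSION)" right after the package name:

```shell
# Sample line taken from the question.
s='libstuff-example1 (>= 6.3.2), libpackage-example2 (>= 5.2.1), libtest-example3 (>= 5.2.1)'

# -n suppresses default output; the trailing p prints only lines where the
# substitution succeeded. With -E, \( and \) are literal parentheses, while
# ([0-9.]+) is the capture group that \1 writes back on its own.
val=$(printf '%s\n' "$s" | sed -nE 's/.*libpackage-[^ ]+ \(>= ([0-9.]+)\).*/\1/p')

echo "$val"   # prints 5.2.1
```

To read from the file instead, drop the printf pipe and pass example.txt as the last argument to sed.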
A GNU awk way:
val=$(awk -F'[ ()]+' '/^libpackage-/{print $3}' example.txt)
Here, -F'[ ()]+' sets the field separator to one or more spaces or parentheses, /^libpackage-/ finds the line(s) starting with libpackage-, and {print $3} prints (“returns”) the value of Field (Column) 3.
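To see how the separator splits a line, the sketch below prints every field of the matching line and then captures field 3. It assumes one package per line, since the ^libpackage- anchor only matches when the entry starts the line; the sample variable stands in for example.txt.

```shell
# Assumed layout: one package per line, as /^libpackage-/ requires.
sample='libstuff-example1 (>= 6.3.2),
libpackage-example2 (>= 5.2.1),
libtest-example3 (>= 5.2.1)'

# Show how -F'[ ()]+' splits the libpackage line into numbered fields.
printf '%s\n' "$sample" |
  awk -F'[ ()]+' '/^libpackage-/{for (i = 1; i <= NF; i++) print i ": " $i}'
# 1: libpackage-example2
# 2: >=
# 3: 5.2.1
# 4: ,

# Capture only field 3, the version.
val=$(printf '%s\n' "$sample" | awk -F'[ ()]+' '/^libpackage-/{print $3}')
echo "$val"   # prints 5.2.1
```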