Extracting group from regex

I am new to regex and am having trouble understanding how to extract a group from a regular expression. I have a text file (example.txt):

libstuff-example1 (>= 6.3.2),
libpackage-example2 (>= 5.2.1),
libtest-example3 (>= 5.2.1)

I am trying to extract the “5.2.1” only from the libpackage line and put it into a variable for a bash script. I have tried doing

cat example.txt | grep -oP "libpackage-[a-z- (>=]+(.*)[)],"

But it gives me the entire line instead of that “5.2.1” section. How do I extract the first group from that line so I only get “5.2.1”?

Answer

You can use

val=$(grep -Po 'libpackage-.* \K[0-9.]+' example.txt)

Details:

  • -Po – -P enables the PCRE regex engine (a GNU grep option) and -o outputs only the matched text rather than the whole line
  • libpackage-.* \K[0-9.]+ – matches libpackage-, then any text up to the last space on the line; \K then discards everything matched so far from the reported match, and [0-9.]+ matches and returns one or more digits or dots.
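
Since grep -o always prints the whole match and cannot print a single capture group, \K is a workaround rather than true group extraction. If an actual capture group is preferred, sed is one possible alternative (this is a swapped-in technique, not part of the command above). A minimal sketch, assuming a sed that supports -E (GNU and BSD sed do) and using the sample lines from the question in place of example.txt:

```shell
# Sample input copied from the question (stands in for example.txt).
s='libstuff-example1 (>= 6.3.2),
libpackage-example2 (>= 5.2.1),
libtest-example3 (>= 5.2.1)'

# -n suppresses sed's default printing; -E enables extended regexes.
# ([0-9.]+) is capture group 1. The whole line is replaced by \1, and the
# trailing p prints only the lines where the substitution succeeded.
val=$(printf '%s\n' "$s" | sed -nE 's/^libpackage-[^ ]+ \(>= ([0-9.]+)\).*/\1/p')

echo "$val"
# => 5.2.1
```

The pattern here is anchored to the literal " (>= " layout shown in the question, so a line using a different operator would not match and val would be empty.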

See an online demo:

s='libstuff-example1 (>= 6.3.2),
libpackage-example2 (>= 5.2.1),
libtest-example3 (>= 5.2.1)'

val=$(grep -Po 'libpackage-.* \K[0-9.]+' <<< "$s")
echo "$val"
# => 5.2.1

An awk way (works in GNU awk and other POSIX awk implementations):

val=$(awk -F'[ ()]+' '/^libpackage-/{print $3}' example.txt)

See this online demo.

Here, -F'[ ()]+' sets the field separator to one or more spaces or parentheses, /^libpackage-/ selects the line(s) starting with libpackage-, and {print $3} prints (“returns”) the value of Field (Column) 3. For the libpackage line, Field 1 is libpackage-example2, Field 2 is >= and Field 3 is 5.2.1.
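
The awk command can be tried on the sample data without creating example.txt. In this sketch the input is piped in from a variable, and an exit is added after the first match (an assumption on my part, so that val holds a single version even if several lines start with libpackage-):

```shell
# Sample input copied from the question (stands in for example.txt).
s='libstuff-example1 (>= 6.3.2),
libpackage-example2 (>= 5.2.1),
libtest-example3 (>= 5.2.1)'

# Split fields on runs of spaces and parentheses, print field 3 of the
# first line that starts with libpackage-, then stop reading input.
val=$(printf '%s\n' "$s" | awk -F'[ ()]+' '/^libpackage-/{print $3; exit}')

echo "$val"
# => 5.2.1
```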