How to grep a multi-line string containing newline characters, tab characters, or spaces

My test file has text like:

JavaScript

I am trying to match all statements that end with a semicolon (;) and contain the text "dummy(". I then need to extract the string inside the double quotes within dummy. I have come up with the following command, but it matches only the first and third statements.

JavaScript

With the -o flag I expected to extract the string between the double quotes inside dummy, but that is not working either. Can you please give me an idea of how to proceed?

Expected output is:

JavaScript

Some of the answers below work for basic file structures, but they break when a statement contains more than one newline character. For example, an input file with more newline characters:

JavaScript

I referred to the following SO links:

How to give a pattern for new line in grep?

how to grep multiple lines until ; (semicolon)

Answer

@TLP was pretty close:

JavaScript
JavaScript

Using

  • -0777 to slurp the file in as a single string
  • /\bdummy\(\s*"(.+?)"/gs finds all the quoted string content after "dummy(" (with optional whitespace before the opening quote)
    • the s flag allows . to match newlines.
    • any string containing escaped double quotes will break this regex
  • map {s/^\s+|\s+$//gr} trims leading/trailing whitespace from each string.
User contributions licensed under: CC BY-SA