How to find lines that contain the same letter twice using grep?

For example, "Conclusion" contains two c's, but at different positions. I am using

JavaScript

It only matches words like “Accept”, where the two identical letters are side by side, but I also want it to match “Conclusion”.

Answer

If you plan to match lines that contain any two identical letters that are not necessarily consecutive, you can use

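A command of the following shape fits the explanation that comes next. It is a sketch rather than the answer's exact text: the -E flag (which lets the group be written without backslashes), the sample words, and the temporary file are assumptions made for illustration.

```shell
# Sample input in a temporary file (the words are illustrative only).
sample=$(mktemp)
printf '%s\n' Accept Conclusion Word > "$sample"

# -i ignores case, so "C" and "c" count as the same letter.
# -E enables extended syntax, so the group is written ( ) without backslashes.
matches=$(grep -iE '([[:alpha:]]).*\1' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

With this input, Accept and Conclusion are printed, while Word is skipped because none of its letters repeats.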

Here, ([[:alpha:]]) is a capturing group with ID 1 that matches any letter, .* matches any text, and the \1 backreference then matches the same char as was captured in Group 1 (case insensitively, due to the -i option).

If you have a specific letter in mind, you can use a simpler pattern:

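For the letter c, a minimal sketch could look like this. The letter, the sample words, and the temporary file are placeholders chosen for illustration.

```shell
# Sample input in a temporary file (the words are illustrative only).
sample=$(mktemp)
printf '%s\n' Accept Conclusion Cat Letter > "$sample"

# A c, any text, then another c; -i lets an uppercase C count as well.
matches=$(grep -i 'c.*c' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Accept and Conclusion are printed. Cat contains only one c and Letter contains none, so both are skipped.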

So, just replace the POSIX character class and backreference with the letter.

If you want to make sure there are ONLY two identical letters and not more, you can use

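One possible form, pieced together from the pattern breakdown that follows, is sketched here. It assumes GNU grep built with PCRE support (the -P option); the sample words and the temporary file are illustrative.

```shell
# Sample input in a temporary file (the words are illustrative only).
sample=$(mktemp)
printf '%s\n' Accept Conclusion banana Word > "$sample"

# The negative lookahead rejects lines where some letter occurs three times;
# the rest of the pattern requires some letter to occur twice.
matches=$(grep -iP '^(?!.*(\p{L})(?:.*\1){2}).*(\p{L}).*\2' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Accept and Conclusion are printed. banana is rejected because a occurs three times, and Word is rejected because no letter repeats.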

Details of the PCRE pattern (note the -P option):

  • ^ – start of string
  • (?!.*(\p{L})(?:.*\1){2}) – a negative lookahead that fails the match if there are zero or more chars other than line break chars, as many as possible, then any Unicode letter (captured into Group 1), and then two occurrences of any zero or more chars other than line break chars, as many as possible, followed by the same letter as captured into Group 1; in other words, it rejects lines in which some letter occurs three or more times
  • .* – zero or more chars other than line break chars, as many as possible
  • (\p{L}) – Group 2: any Unicode letter
  • .* – zero or more chars other than line break chars, as many as possible
  • \2 – backreference to the Group 2 value.

And if the character is a specific one:

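A sketch for exactly two occurrences of c, built around the [^c]* piece described next. The anchors, the sample words, and the temporary file are assumptions made for illustration.

```shell
# Sample input in a temporary file (the words are illustrative only).
sample=$(mktemp)
printf '%s\n' Accept Conclusion cyclic Cat > "$sample"

# Between ^ and $ the line may contain exactly two c's,
# each surrounded by runs of characters other than c.
matches=$(grep -i '^[^c]*c[^c]*c[^c]*$' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Accept and Conclusion are printed. cyclic has three c's and Cat has only one, so neither line matches.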

where [^c]* matches zero or more chars other than c.

See the grep demo:

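As a stand-in for a demo, the following sketch runs the four variants discussed in this answer against the same input so their results can be compared. The sample words, the variable names, and the show helper are hypothetical, and the -P variant assumes GNU grep with PCRE support.

```shell
# Sample input in a temporary file (the words are illustrative only).
sample=$(mktemp)
printf '%s\n' Accept Conclusion cyclic Letter Word > "$sample"

# Any letter occurring at least twice.
any_twice=$(grep -iE '([[:alpha:]]).*\1' "$sample")
# The letter c occurring at least twice.
c_twice=$(grep -i 'c.*c' "$sample")
# Some letter occurring twice, and no letter occurring three times.
any_exact=$(grep -iP '^(?!.*(\p{L})(?:.*\1){2}).*(\p{L}).*\2' "$sample")
# The letter c occurring exactly twice.
c_exact=$(grep -i '^[^c]*c[^c]*c[^c]*$' "$sample")

# Print a label followed by the matching lines joined on one row.
show() { printf '%-20s %s\n' "$1" "$(printf '%s' "$2" | tr '\n' ' ')"; }
show 'any letter, 2+:' "$any_twice"
show 'letter c, 2+:' "$c_twice"
show 'any letter, max 2:' "$any_exact"
show 'letter c, exactly 2:' "$c_exact"

rm -f "$sample"
```

On this input the first variant matches Accept, Conclusion, cyclic and Letter; the second drops Letter; the third keeps Letter but drops cyclic; and the fourth matches only Accept and Conclusion.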