
Regex is not checking some part of text

I have an example file with data to analyze with the egrep command:

[IG#]
IG#
[RM#]
RM#
[IG#1234]
[IG# 1234]
[IG #1234] [RM# ]
[IG# 1234] [RM #1224]
[RM#1234]
[RM# 1234]
[RM #1234]
[RM# 1234] [IG#]
[RM# ] [IG#1234]
#1234
1234

My regexp looks like this:

(RM#.*[0-9]|IG#.*[0-9]|\b([A-Z][A-Z0-9]+-[0-9]+)\b)

I want to find only the rows that contain both [RM# {digits}] AND [IG# {digits}], but the regexp behaves like OR and the results look like this:

[IG#1234]
[IG# 1234]
[IG# 1234] [RM #1224]
[RM#1234]
[RM# 1234]
[RM# 1234] [IG#]
[RM# ] [IG#1234]

Expected output is

[IG# 1234]
[RM# 1234]
[IG# 1234] [RM1224]


Answer

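Looks like you want to search for lines that match two different patterns, in any order. Alternation (|) succeeds as soon as any one branch matches, which is why your regexp behaves like OR. One way to get AND is to filter the output of one grep with a second grep: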

$ grep -E 'RM\s*#\s*[0-9]' ip.txt | grep -E 'IG\s*#\s*[0-9]'
[IG# 1234] [RM #1224]
  • \s will match any whitespace character (it is a GNU extension, not POSIX), use literal space if that is sufficient
  • add additional constraints like checking for [] surrounding RM/IG if needed
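
The second bullet could be sketched roughly as follows. This is only one possible tightening, assuming the digits must sit inside the same pair of brackets as RM/IG. The throwaway sample file and the variable names rm_re, ig_re and result are made up for illustration, and the POSIX class [[:space:]] is used in place of \s.

```shell
# Sample data copied from the question, written to a throwaway file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[IG#]
IG#
[RM#]
RM#
[IG#1234]
[IG# 1234]
[IG #1234] [RM# ]
[IG# 1234] [RM #1224]
[RM#1234]
[RM# 1234]
[RM #1234]
[RM# 1234] [IG#]
[RM# ] [IG#1234]
#1234
1234
EOF

# \[ is a literal opening bracket; a bare ] outside a bracket
# expression is already literal in an extended regex.
rm_re='\[RM[[:space:]]*#[[:space:]]*[0-9]+]'
ig_re='\[IG[[:space:]]*#[[:space:]]*[0-9]+]'

# The first grep keeps lines with a bracketed RM number, the second
# keeps only those that also have a bracketed IG number.
result=$(grep -E "$rm_re" "$sample" | grep -E "$ig_re")
printf '%s\n' "$result"
rm -f "$sample"
```

With the sample data from the question this prints the single line [IG# 1234] [RM #1224].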


To check it in one shot, you need to list every ordering of the patterns:

$ grep -E 'RM\s*#\s*[0-9].*IG\s*#\s*[0-9]|IG\s*#\s*[0-9].*RM\s*#\s*[0-9]' ip.txt
[IG# 1234] [RM #1224]
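
Writing both orderings by hand gets repetitive, so here is a small sketch that builds the same kind of pattern from two shell variables. The names rm_re, ig_re, both_re and matches are illustrative, the input is a handful of lines from the question, and [[:space:]] stands in for \s. Note that the number of orderings grows factorially with the number of patterns, so this only stays readable for two or three of them.

```shell
# Illustrative variable names; [[:space:]] is the POSIX spelling of \s.
rm_re='RM[[:space:]]*#[[:space:]]*[0-9]'
ig_re='IG[[:space:]]*#[[:space:]]*[0-9]'

# Both orderings, joined with alternation.
both_re="${rm_re}.*${ig_re}|${ig_re}.*${rm_re}"

# Feed a few lines from the question through a single grep.
matches=$(printf '%s\n' \
  '[IG #1234] [RM# ]' \
  '[IG# 1234] [RM #1224]' \
  '[RM# 1234] [IG#]' \
  '[RM# ] [IG#1234]' | grep -E "$both_re")
printf '%s\n' "$matches"
```

Only [IG# 1234] [RM #1224] has digits after both RM# and IG#, so that is the only line printed.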

$ # awk is better suited
$ awk '/RM\s*#\s*[0-9]/ && /IG\s*#\s*[0-9]/' ip.txt
[IG# 1234] [RM #1224]
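
As far as I know, \s is a GNU extension that not every awk implementation understands. A more portable sketch of the awk approach uses a literal space, assuming spaces are the only whitespace that can appear around the #. The variable name matches and the inline sample lines are just for illustration.

```shell
# Each /regex/ is tested against the whole line; && requires both to match.
matches=$(printf '%s\n' \
  '[IG#1234]' \
  '[IG #1234] [RM# ]' \
  '[IG# 1234] [RM #1224]' \
  '[RM# 1234] [IG#]' |
  awk '/RM *# *[0-9]/ && /IG *# *[0-9]/')
printf '%s\n' "$matches"
```

The order of RM and IG on the line does not matter here, because the two conditions are checked independently.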
User contributions licensed under: CC BY-SA