
Grep Regex: How to find multiple area codes in phone number?

I have a file in which each line consists of a name, room number, house address, and phone number.

I want to find the lines whose area code is either 404 or 202. I tried “(404)|(202)”, but it also matches lines where those digits appear anywhere in the phone number rather than only in the area code. For example:

John Smith 300 123 N. Street 808-543-2029

I do not want the line above. I am targeting lines like these:

Danny Brown 173 555 W. Avenue 202-383-1540
Martha Keith 567 322 S. Example 404-653-1200

Answer

Let’s consider this test file:

$ cat addresses
John Smith 202 404 N. Street 808-543-2029
Danny Brown 173 555 W. Avenue 202-383-1540
Martha Keith 567 322 S. Example 404-653-1200

The distinguishing feature of area codes in this file, as opposed to other three-digit numbers, is that they are preceded by a space and followed by a hyphen (-). Thus, use:

$ grep -E ' (202|404)-' addresses
Danny Brown 173 555 W. Avenue 202-383-1540
Martha Keith 567 322 S. Example 404-653-1200
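
To see why the surrounding space and hyphen matter, here is a minimal sketch that counts matches for the unanchored pattern from the question and for the anchored one. The temporary file and the variable names are illustrative assumptions, not part of the original setup.

```shell
# Recreate the test file in a temporary location (the path is illustrative).
addresses=$(mktemp)
printf '%s\n' \
  'John Smith 202 404 N. Street 808-543-2029' \
  'Danny Brown 173 555 W. Avenue 202-383-1540' \
  'Martha Keith 567 322 S. Example 404-653-1200' > "$addresses"

# Unanchored: 202 or 404 may appear anywhere on the line.
unanchored=$(grep -cE '(202|404)' "$addresses")

# Anchored: 202 or 404 must follow a space and precede a hyphen.
anchored=$(grep -cE ' (202|404)-' "$addresses")

echo "unanchored: $unanchored, anchored: $anchored"
rm -f "$addresses"
```

With this data the unanchored pattern matches every line, because the John Smith line contains 202 and 404 in its address, while the anchored pattern matches only the two lines with the wanted area codes.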

More complex example

Suppose that phone numbers appear at the end of lines but can have any of the three forms 808-543-2029, 8085432029, or 808 543 2029 as in the following example:

$ cat addresses
John Smith 202 404 N. Street 808-543-2029
Danny Brown 173 555 W. Avenue 2023831540
Martha Keith 567 322 S. Example 404 653 1200

To select the lines with 202 or 404 area codes:

$ grep -E ' (202|404)([- ][[:digit:]]{3}[- ][[:digit:]]{4}|[[:digit:]]{7})$' addresses
Danny Brown 173 555 W. Avenue 2023831540
Martha Keith 567 322 S. Example 404 653 1200

If the phone numbers may be followed by stray whitespace, use:

$ grep -E ' (202|404)([- ][[:digit:]]{3}[- ][[:digit:]]{4}|[[:digit:]]{7})[[:blank:]]*$' addresses
Danny Brown 173 555 W. Avenue 2023831540
Martha Keith 567 322 S. Example 404 653 1200
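
The last pattern is dense, so it may help to assemble it from named pieces. The sketch below does that with shell variables whose names are made up for illustration. It feeds grep the same three lines, with two trailing spaces added to the second line to exercise the `[[:blank:]]*` part.

```shell
# Build the pattern from named pieces (variable names are illustrative).
area='(202|404)'
separated='[- ][[:digit:]]{3}[- ][[:digit:]]{4}'   # 202-383-1540 or 202 383 1540
compact='[[:digit:]]{7}'                            # 2023831540
pattern=" ${area}(${separated}|${compact})[[:blank:]]*\$"

# The second input line ends with two stray spaces on purpose.
matches=$(printf '%s\n' \
  'John Smith 202 404 N. Street 808-543-2029' \
  'Danny Brown 173 555 W. Avenue 2023831540  ' \
  'Martha Keith 567 322 S. Example 404 653 1200' | grep -E "$pattern")

printf '%s\n' "$matches"
```

Note that each `[- ]` is chosen independently, so a mixed form such as 202-383 1540 would also match. If that is not wanted, the separated alternative can be split into one hyphen-only and one space-only variant.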
User contributions licensed under: CC BY-SA