
Why does dollar not match literal dollar when extended regex (ERE) option is used with sed?

I want to replace $foo with bar. This works fine.

$ echo '$foo' | sed 's/$foo/bar/'
bar

But the same command does not work when I use the -r option.

$ echo '$foo' | sed -r 's/$foo/bar/'
$foo

Why doesn’t this work?

Here is an example of what works with -r option.

$ echo '$foo' | sed -r 's/\$foo/bar/'
bar

The real question is: why does $ need to be escaped only when using the -r option? What would $ mean otherwise with the -r option?

I am using a Debian Linux system.


Answer

In extended regular expressions (ERE), that is, under -r, $ means the end of the line:

$ echo '$foo' | sed -r 's/foo$/bar/'
$bar

If you want it to match a literal dollar sign instead, it has to be escaped, for example by putting it inside a bracket expression:

$ echo '$foo' | sed -r 's/[$]foo$/bar/'
bar
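
As a minimal sketch, assuming GNU sed (the -r option is a GNU extension), the two common ways of making $ literal under ERE can be compared side by side. The variable names escaped and bracket are only illustrative:

```shell
# Assumes GNU sed; single quotes keep the shell from touching $ and \.

# Backslash escape: \$ is a literal dollar sign in ERE.
escaped=$(printf '%s\n' '$foo' | sed -r 's/\$foo/bar/')

# Bracket expression: inside [...] the $ has no special meaning.
bracket=$(printf '%s\n' '$foo' | sed -r 's/[$]foo/bar/')

echo "$escaped"
echo "$bracket"
```

Both commands should print bar. The bracket form can be easier to read when the sed script is itself embedded in double quotes or another layer of quoting.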

Documentation

man 7 regex explains that, in Extended Regular Expressions (ERE), the $ matches at the end of the line:

‘$’ (matching the null string at the end of a line)

The same man page goes on to explain that in Basic Regular Expressions (BRE), which is what you get without -r, its meaning is more complicated:

Obsolete (“basic”) regular expressions differ in several respects. ‘|’, ‘+’, and ‘?’ are ordinary characters and there is no equivalent for their functionality. The delimiters for bounds are “\{” and “\}”, with ‘{’ and ‘}’ by themselves ordinary characters. The parentheses for nested subexpressions are “\(” and “\)”, with ‘(’ and ‘)’ by themselves ordinary characters. ‘^’ is an ordinary character except at the beginning of the RE or(!) the beginning of a parenthesized subexpression, ‘$’ is an ordinary character except at the end of the RE or(!) the end of a parenthesized subexpression, and ‘*’ is an ordinary character if it appears at the beginning of the RE or the beginning of a parenthesized subexpression (after a possible leading ‘^’).
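
The BRE rule quoted above explains why the unescaped command in the question works without -r: the $ in $foo is not at the end of the RE, so it is an ordinary character. A small sketch, again assuming GNU sed and using illustrative variable names, shows the three cases:

```shell
# BRE (no -r): $ is not at the end of the RE, so it matches a literal "$".
middle=$(printf '%s\n' '$foo' | sed 's/$foo/bar/')

# BRE: $ at the end of the RE is an anchor, so only the last "foo" matches.
anchor=$(printf '%s\n' 'foo$ foo' | sed 's/foo$/bar/')

# BRE: escaping the trailing $ makes it literal again.
literal=$(printf '%s\n' 'foo$ foo' | sed 's/foo\$/bar/')

echo "$middle"
echo "$anchor"
echo "$literal"
```

The expected output is bar, then foo$ bar, then bar foo. Escaping $ wherever a literal dollar sign is intended works in both BRE and ERE, so it is the safer habit.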

User contributions licensed under: CC BY-SA