BASH / sed – not giving same output for simple sed commands

Box 1: uname -srm

Darwin 16.1.0 x86_64

Box 2: uname -srm; cat /etc/debian_version

Linux 3.13.0-100-generic x86_64
jessie/sid

BASH on box1 is: GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin16)

BASH on box2 is: GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)

On both boxes, I have the following script:

#!/bin/bash

args="$@"
cust="$(echo ${args} | sed "s/^[, \t][, \t]*//;s/[, \t][, \t]*$//;s/[ ,][ ,]*/|/g;s/^/^/;s/$/$/;s/|/$|^/g;s/|/\n/g"|sort|uniq|tr '\012' '|'|sed "s/|$//")";
echo --1 ${cust}

cust="$(echo ${args} | sed "s/^[, \t][, \t]*//;s/[, \t][, \t]*$//;
        s/[ ,][ ,]*/|/g;
        s/^/^/;s/$/$/;
        s/|/$|^/g;
        s/|/\n/g" | sort | uniq | tr '\012' '|' \
        | sed "s/|$//")";
echo --2 ${cust}

cust="$(echo ${args}   | sed "s/^[, \t][, \t]*//;s/[, \t][, \t]*$//")"
cust="$(echo ${cust}   | sed "s/[ ,][ ,]*/|/g")"
cust="$(echo ${cust}   | sed "s/^/^/;s/$/$/")"
cust="$(echo ${cust}   | sed "s/|/$|^/g")"
cust="$(echo ${cust}   | sed "s/|/\n/g")"
cust="$(echo "${cust}" | sort | uniq | tr '\012' '|')"
cust="$(echo ${cust}   | sed "s/|$//")";
echo --3 ${cust}

All three variants run the same commands. This is what I'm trying to do:

## Remove leading/trailing spaces and commas.
## Replace in-between spaces/commas with '|'.
## Prefix '^' & suffix '$' in the regex value.
## Embed a strict regex pattern for each customer by
## - replacing: '|' with '$|^'
## Sort & Uniq - for alphabetical order & to remove duplicates.
## Set ${cust} regex variable with a valid regex value.
## Ex1: ".*" (if no argument is passed i.e. for all IDs).
## Ex2: "^custID1$|^custID2$|^custID3$"
##      (if 'custID1 custID3, custID2' were passed)

On box1, when I run the script with the arguments aa1, aa3,aa2 aa1 , a0, I get the following output, which is NOT what I expect: a literal character 'n' is embedded between the values, and they are neither sorted nor de-duplicated:

$ ./1.sh aa1, aa3,aa2   aa1 , a0

--1 ^aa1$n^aa3$n^aa2$n^aa1$n^a0$
--2 ^aa1$n^aa3$n^aa2$n^aa1$n^a0$
--3 ^aa1$n^aa3$n^aa2$n^aa1$n^a0$

On box2, the same command gives the EXPECTED output:

--1 ^a0$|^aa1$|^aa2$|^aa3$
--2 ^a0$|^aa1$|^aa2$|^aa3$
--3 ^a0$|^aa1$|^aa2$|^aa3$

What should I change so that the above expression gives the same output on both machines? Or do I have to update BASH on box1?

My understanding is that these are very simple sed operations, so both BASH versions should give the same output.

Answer

It seems a bit convoluted to add pipes, then remove them, then add them back again. This code works on macOS Sierra 10.12.1 (Darwin 16.1.0); I believe it would work on Linux too:

#!/bin/bash

args="$*"
IFS=$' \t,'
# echo $args

cust="$(printf '%s\n' ${args} |
        sort -u |
        tr '\012' '|' |
        sed -e 's/|$//' -e 's/^|//' -e 's/^/^/' -e 's/$/$/' -e 's/|/$|^/g')"
echo "--2 ${cust}"

Setting IFS so that it includes a comma, and passing ${args} to printf without any quotes around it, makes the shell split the arguments on spaces, tabs, and commas, which eliminates the oddities in the spacing and commas. Sorting and removing duplicates can be done in one operation with sort -u. tr then replaces all the newlines with | symbols, and sed removes any leading or trailing |, adds the leading ^ and trailing $, and inserts the intermediate $|^ sequences. Finally, echo prints the resulting string.
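
The splitting step can be looked at on its own. The following is a minimal sketch, not part of the original answer: the sample string stands in for the command-line arguments, and saving and restoring IFS through the hypothetical variable old_ifs is an addition made here so the change does not leak into later commands.

```shell
#!/bin/bash
# Sketch: isolate the word-splitting step used by the answer's script.
# The sample string stands in for the command-line arguments.
args="aa1, aa3,aa2   aa1 , a0"

# Split on spaces, tabs, and commas; old_ifs is a helper name introduced
# for this sketch so that IFS can be restored afterwards.
old_ifs="$IFS"
IFS=$' \t,'
# Unquoted on purpose: the shell splits ${args} into separate words here,
# and printf prints one word per line.
words="$(printf '%s\n' ${args})"
IFS="$old_ifs"

printf '%s\n' "$words"
```

With this sample input, each customer ID ends up on its own line with no commas or blanks left, which is the form that sort -u expects.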

When the answer's script is saved as x37.sh and made executable, it produces:

$ ./x37.sh aa1, aa3,aa2 aa1 , a0
--2 ^a0$|^aa1$|^aa2$|^aa3$
$
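
The literal n characters in the box1 output suggest that the sed on that machine leaves \n in the replacement text as a plain n, which would make sed, rather than the BASH version, the source of the difference. The following probe is a sketch under that assumption; the variable names probe and lines are made up for illustration.

```shell
#!/bin/bash
# Probe: what does the local sed do with \n in the replacement text?
probe="$(printf 'a|b\n' | sed 's/|/\n/')"
if [ "$probe" = "anb" ]; then
    printf '%s\n' 'this sed leaves \n as a literal n'
else
    printf '%s\n' 'this sed turns \n into a newline'
fi

# tr performs the same character-for-character substitution on both systems.
lines="$(printf 'a|b\n' | tr '|' '\n')"
printf '%s\n' "$lines"
```

Whichever branch the probe takes, the tr line prints a and b on separate lines, which is why the script above relies on tr (and on printf) for newlines instead of a sed replacement.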

The tr command can also be avoided by making sed do the line concatenation.
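
One possible way to let sed do the joining is sketched below. It is not taken from the original answer: it uses the common loop idiom of collecting every line into the pattern space with N and then replacing the embedded newlines. Here \n appears only on the pattern side of the s command, not in the replacement, and the sample IDs are passed directly to printf for brevity.

```shell
#!/bin/bash
# Sketch: join sorted, de-duplicated lines with '|' using sed alone.
#   :a      define a label
#   $!N     if this is not the last line, append the next line
#   $!ba    if this is still not the last line, loop back to the label
#   s/\n/|/g  replace the embedded newlines with '|'
joined="$(printf '%s\n' aa1 aa3 aa2 aa1 a0 |
          sort -u |
          sed -e ':a' -e '$!N' -e '$!ba' -e 's/\n/|/g')"

# Wrap each alternative in ^...$ with the same sed expressions as the answer.
cust="$(printf '%s\n' "$joined" |
        sed -e 's/^/^/' -e 's/$/$/' -e 's/|/$|^/g')"
printf '%s\n' "$cust"
# prints: ^a0$|^aa1$|^aa2$|^aa3$
```

Because the joined string has no trailing |, the cleanup expressions s/|$// and s/^|// from the answer are not needed in this variant, as long as the input contains no empty words.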
