
Group the consecutive numbers in shell

$ foo="1,2,3,6,7,8,11,13,14,15,16,17"

In shell, how can I group the consecutive numbers in $foo into ranges, to get 1-3,6-8,11,13-17?


Answer

As an alternative, you can use this awk command:

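$> cat series.awk
# Print the current range: the start s, then "-p" if the range spans
# more than one number, followed by the given delimiter.
function prnt(delim) {
   printf "%s%s", s, (p > s ? "-" p : "") delim
}
BEGIN {
   RS=","          # each comma-separated number is one record
}
NR==1 {
   s = $1          # the first number starts the first range
}
# A gap of more than 1 ends the current range. The NR>1 guard keeps a list
# that does not start at 1 (e.g. 3,4,5) from printing its first number twice.
NR>1 && p < $1-1 {
   prnt(RS)
   s = $1
}
{
   p = $1          # remember this number for the next comparison
}
END {
   prnt(ORS)       # flush the last range, ending with a newline
}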

Now run it as:

$> foo="1,2,3,6,7,8,11,13,14,15,16,17"
$> awk -f series.awk <<< "$foo"
1-3,6-8,11,13-17

$> foo="1,3,6,7,8,11,13,14,15,16,17"
$> awk -f series.awk <<< "$foo"
1,3,6-8,11,13-17

$> foo="1,3,6,7,8,11,13,14,15,16,17,20"
$> awk -f series.awk <<< "$foo"
1,3,6-8,11,13-17,20
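
The script compares each number only with the one before it, so it relies on the input being in ascending order, as in the runs above. If that is not guaranteed, the list can be normalized first. The following is a minimal sketch using standard tools; the variable name `sorted` and the sample list are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical unsorted input with a duplicate (3 appears twice).
foo="7,1,3,2,8,6,3"

# Split on commas, sort numerically, drop duplicates (-u),
# then join the lines back into one comma-separated string.
sorted=$(printf '%s\n' "$foo" | tr ',' '\n' | sort -n -u | paste -sd, -)

printf '%s\n' "$sorted"
# prints: 1,2,3,6,7,8
```

The value of `$sorted` can then be passed to the awk script in place of `$foo`.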

Here is a one-liner that does the same:

awk 'function prnt(delim){printf "%s%s", s, (p > s ? "-" p : "") delim}
BEGIN{RS=","} NR==1{s = $1} NR>1 && p < $1-1{prnt(RS); s = $1} {p = $1} END{prnt(ORS)}' <<< "$foo"

In this awk command we keep two variables:

  1. p stores the previous record's number
  2. s stores the start of the range that is to be printed
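
To see how the two variables evolve, here is a small tracing sketch. It keeps only the bookkeeping for s and p and prints both after every record instead of printing ranges. The sample list and the variable name `trace` are assumptions made for illustration.

```shell
# Trace s (range start) and p (previous number) for each record.
trace=$(printf '%s\n' "1,2,3,6,7" | awk '
    BEGIN { RS = "," }              # one number per record
    NR == 1 { s = $1 }              # first number starts the first range
    p < $1 - 1 { s = $1 }           # a gap starts a new range
    { p = $1; printf "record %d: s=%s p=%s\n", NR, s, p }
')

printf '%s\n' "$trace"
# record 1: s=1 p=1
# record 2: s=1 p=2
# record 3: s=1 p=3
# record 4: s=6 p=6
# record 5: s=6 p=7
```

s stays at 1 while the numbers are consecutive and jumps to 6 at the first gap, while p always follows the current number.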

How it works:

  1. When NR==1, we set s to the first number.
  2. When p is less than $1-1 (the current number minus 1), there is a break in the sequence and the current range must be printed.
  3. The function prnt does the printing. It takes one argument, the end delimiter: the block guarded by p < $1-1 passes RS (a comma), and the END block passes ORS (a newline).
  4. After printing, the p < $1-1 block resets s (the start of the range) to $1.
  5. After processing each record, we store $1 in p.
  6. prnt uses printf for formatted output. It always prints the start number s first; if p > s, it then prints a hyphen followed by p.
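
The steps above can be wrapped in a reusable shell function. This is a sketch under stated assumptions: the name `group_ranges` is hypothetical, and the `NR > 1` guard is an addition to the logic described above, so that a list whose first number is greater than 1 does not print that number twice. It pipes the list with printf rather than a here-string, so it should also run in a plain POSIX sh.

```shell
# group_ranges is a hypothetical helper name, not part of the original answer.
# Usage: group_ranges "1,2,3,6"   (expects an ascending, comma-separated list)
group_ranges() {
    printf '%s\n' "$1" | awk '
        # Print start s, then "-p" when the range has more than one number.
        function prnt(delim) {
            printf "%s%s", s, (p > s ? "-" p : "") delim
        }
        BEGIN   { RS = "," }
        NR == 1 { s = $1 }
        # Added guard: only records after the first can close a range.
        NR > 1 && p < $1 - 1 { prnt(RS); s = $1 }
        { p = $1 }
        END     { prnt(ORS) }
    '
}

group_ranges "1,2,3,6,7,8,11,13,14,15,16,17"   # prints 1-3,6-8,11,13-17
group_ranges "3,4,5"                           # prints 3-5
```

Without the guard, the second call would print 3,3-5, because the unset p compares as 0 and the gap test fires on the very first record.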
User contributions licensed under: CC BY-SA