Sort a find command to respect a custom order in Unix

I have a script that outputs file paths (via find), which I want to sort based on very specific custom logic:

  • 1st sort key: I want the 2nd and, if present, the 3rd hyphen-separated (-) field to be sorted using a custom ordering based on a list of keys I supply, ignoring the numerical suffix.
    With the sample input below, the list of keys is:
    rp,alpha,beta-ri,beta-rs,RC

  • 2nd sort key: numeric sorting by the trailing number on each line.

Given the following sample input (note that the /foo/bar/test/example/8.2.4.0 prefix of each line is incidental):

/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-rp2

I expect:

/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10

Answer

Using a variant of my answer to your original question:

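./your-script | awk -v keysInOrder='rp,alpha,beta-ri,beta-rs,RC' '
    BEGIN {
      FS=OFS="-"
      # Map each key to its 1-based position in the custom order.
      keyCount = split(keysInOrder, a, ",")
      for (i = 1; i <= keyCount; ++i) keysToOrdinal[a[i]] = i
    }
    {
      # Build the lookup key from field 2 (plus field 3, if present), minus the trailing number.
      sortKey = $2
      if (NF == 3) sortKey = sortKey FS $3
      sub(/[0-9]+$/, "", sortKey)
      # Insert "|-" (3 fields) or "|--" (2 fields) before the trailing number,
      # so that the number always lands in field 5 of the decorated line.
      auxFieldPrefix = "|" FS
      if (NF == 2) auxFieldPrefix = auxFieldPrefix FS
      sub(/[0-9]/, auxFieldPrefix "&", $NF)
      # Keys not in the list get an ordinal that sorts after all listed keys.
      sortOrdinal = (sortKey in keysToOrdinal) ? keysToOrdinal[sortKey] : keyCount + 1
      print sortOrdinal, $0
    }
'  | sort -t- -k1,1n -k3,3 -k5,5n | sed 's/^[^-]*-//; s/|-\{1,2\}//'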

./your-script represents whatever command produces the output you want to sort.
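To try the approach without the real script, the pipeline can be wrapped in a function and fed the sample input. This is only a sketch: the name sort_versions and the optional key-list argument are hypothetical, not part of the question. It also assumes, as in the sample, that the path prefix itself contains no hyphens. The sed step writes the interval in POSIX basic-regex form.

```shell
#!/bin/sh
# sort_versions: hypothetical wrapper; reads paths on stdin, writes them sorted.
# Optional argument 1: comma-separated key list (defaults to the list from the question).
sort_versions() {
  awk -v keysInOrder="${1:-rp,alpha,beta-ri,beta-rs,RC}" '
    BEGIN {
      FS = OFS = "-"
      keyCount = split(keysInOrder, a, ",")
      for (i = 1; i <= keyCount; ++i) keysToOrdinal[a[i]] = i
    }
    {
      sortKey = $2
      if (NF == 3) sortKey = sortKey FS $3
      sub(/[0-9]+$/, "", sortKey)
      auxFieldPrefix = "|" FS
      if (NF == 2) auxFieldPrefix = auxFieldPrefix FS
      sub(/[0-9]/, auxFieldPrefix "&", $NF)
      sortOrdinal = (sortKey in keysToOrdinal) ? keysToOrdinal[sortKey] : keyCount + 1
      print sortOrdinal, $0
    }
  ' | sort -t- -k1,1n -k3,3 -k5,5n | sed 's/^[^-]*-//; s/|-\{1,2\}//'
}
```

For example, printf '%s\n' /foo/bar/test/example/8.2.4.0-RC10 /foo/bar/test/example/8.2.4.0-rp2 | sort_versions prints the rp2 line before the RC10 line.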

Note that an auxiliary character, |, is used to facilitate sorting, and the assumption is that this character doesn't appear in the input. That should be reasonably safe, given that filesystem paths usually don't contain pipe characters.
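If that assumption is in doubt, it can be checked before sorting. A minimal sketch, where has_pipe_char is a hypothetical helper name:

```shell
#!/bin/sh
# has_pipe_char: hypothetical helper; exits 0 if any line on stdin contains a literal "|".
has_pipe_char() {
  grep -q -F '|'
}
```

One way to use it: if ./your-script | has_pipe_char; then echo 'input contains |, pick another auxiliary character' >&2; fi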

Any field 2/3 values (sans numeric suffix) that aren't in the list of sort keys sort after all values that are in the list, and are ordered alphabetically among themselves.
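The fallback can be seen in isolation by ranking bare keys. This is a simplified sketch, not the full pipeline: rank_keys is a hypothetical name, the keys gamma and delta are made-up examples of unlisted values, and LC_ALL=C is added here only to make the alphabetic comparison deterministic.

```shell
#!/bin/sh
# rank_keys: hypothetical helper; reads one key per line and prints "ordinal key",
# sorted by ordinal first and alphabetically second.
rank_keys() {
  awk -v keysInOrder='rp,alpha,beta-ri,beta-rs,RC' '
    BEGIN {
      keyCount = split(keysInOrder, a, ",")
      for (i = 1; i <= keyCount; ++i) keysToOrdinal[a[i]] = i
    }
    {
      # Unlisted keys all share the ordinal keyCount + 1.
      ord = ($0 in keysToOrdinal) ? keysToOrdinal[$0] : keyCount + 1
      print ord, $0
    }
  ' | LC_ALL=C sort -k1,1n -k2,2
}
```

Running printf '%s\n' gamma RC delta rp alpha | rank_keys prints "1 rp", "2 alpha", "5 RC", "6 delta", "6 gamma": both unlisted keys receive ordinal 6 and are then ordered alphabetically.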

User contributions licensed under: CC BY-SA