How to sort or rearrange numbers from multiple columns into multiple rows [fixed into 4 columns]?

I have one text file, text1.txt.

text1.txt contains the following:
Input:

##[A1] [B1] [T1]  [V1] [T2]  [V2] [T3]  [V3] [T4]  [V4]## --> headers
    1  1000    0   100   10   200   20   300   30   400
              40   500   50   600   60   700   70   800
       1010    0   101   10   201   20   301   30   401
              40   501   50   601  
    2  1000    0   110   15   210   25   310   35   410
              45   510   55   610   65   710
       1010    0   150   10   250   20   350   30   450
              40   550  

Condition:
A1 and B1 -> for each A1 + (B1 + [Tn + Vn])
A1 should be in 1 column.
B1 should be in 1 column.
T1,T2,T3 and T4 should be in 1 column.
V1,V2,V3 and V4 should be in 1 column.

How do I sort it so that it becomes like the following?
Desired output:

##   A1    B1   Tn    Vn ## --> headers

      1  1000    0   100
                10   200
                20   300
                30   400
                40   500
                50   600
                60   700
                70   800
         1010    0   101
                10   201
                20   301
                30   401
                40   501
                50   601
      2  1000    0   110
                15   210
                25   310
                35   410
                45   510
                55   610
                65   710
         1010    0   150
                10   250
                20   350
                30   450
                40   550

Here is my current code:
First Attempt:
Input

cat test1.txt | awk ' { a=$1 b=$2 } { for(i=1; i<=5; i=i+1) { t=substr($0,11+i*10,5) v=substr($0,16+i*10,5) if( t ~ /^ +[0-9]+$/ || t ~ /^[0-9]+$/ || t ~ /^ +[0-9]+ +$/ ){ printf "%7s %7d %8d %8d \n",a,b,t,v } }}' | less

Output:

      1    1000      400        0 
     40     500      800        0 
   1010       0      401        0 
      2    1000      410        0 
   1010       0      450        0

I’m trying to use a simple awk command, but I still can’t get the result.
Can anyone help me on this?

Thanks,
Am

Answer

This is a rather tricky problem that can be handled a number of ways. Whether you use bash, perl or awk, you will need to handle the number of fields in some semi-generic way instead of just hardcoding values for your example.

Using bash, so long as you can rely on an even number of fields in all lines (except for the lines that start with a lone B1 value, e.g. 1010), you can accommodate the number of fields in a reasonably generic way. For the lines beginning with 1, 2, etc., you know your initial output will contain 4 fields. For lines beginning with 1010, etc., you know the output will contain an initial 3 fields. For the remaining values you are simply outputting pairs.

The tricky part is handling the alignment. This is where printf helps: it lets you set the field width with a parameter using the form "%*s", where the conversion specifier expects the next parameter to be an integer giving the field width, followed by a parameter for the string itself. It takes a little gymnastics, but you could do something like the following in bash itself:

(note: edit to match your output header format)

#!/bin/bash

declare -i nfields wd=6     ## total no. fields, printf field-width modifier

while read -r line; do      ## read each line  (preserve for header line)
    arr=($line)             ## separate into array
    first=${arr[0]}         ## check for '#' in first line for header
    if [ "${first:0:1}" = '#' ]; then
        nfields=$((${#arr[@]} - 2))     ## no. fields in header
        printf "##   A1    B1   Tn    Vn ## --> headers\n"  ## new header
        continue
    fi
    fields=${#arr[@]}                   ## fields in line
    case "$fields" in
        $nfields )                      ## fields -eq nfields?
            cnt=4                       ## handle 1st 4 values in line
            printf " "
            for ((i=0; i < cnt; i++)); do
                if [ "$i" -eq '2' ]; then
                    printf "%*s" "5" "${arr[i]}"
                else
                    printf "%*s" "$wd" "${arr[i]}"
                fi
            done
            echo
            for ((i = cnt; i < $fields; i += 2)); do    ## handle rest
                printf "%*s%*s%*s\n" "$((2*wd))" " " "$wd" "${arr[i]}" "$wd" "${arr[$((i+1))]}"
            done
            ;;
        $((nfields - 1)) )              ## one less than nfields
            cnt=3                       ## handle 1st 3 values
            printf " %*s" "$wd" " "
            for ((i=0; i < cnt; i++)); do
                if [ "$i" -eq '1' ]; then
                    printf "%*s" "5" "${arr[i]}"
                else
                    printf "%*s" "$wd" "${arr[i]}"
                fi
            done
            echo
            for ((i = cnt; i < $fields; i += 2)); do    ## handle rest
                if [ "$i" -eq '0' ]; then
                    printf "%*s%*s%*s\n" "$((wd+1))" " " "$wd" "${arr[i]}" "$wd" "${arr[$((i+1))]}"
                else
                    printf "%*s%*s%*s\n" "$((2*wd))" " " "$wd" "${arr[i]}" "$wd" "${arr[$((i+1))]}"
                fi
            done
            ;;
        * )     ## all other lines format as pairs
            for ((i = 0; i < $fields; i += 2)); do
                printf "%*s%*s%*s\n" "$((2*wd))" " " "$wd" "${arr[i]}" "$wd" "${arr[$((i+1))]}"
            done
            ;;
    esac
done
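
To see what the "%*s" conversion does in isolation, here is a minimal sketch using bash's builtin printf. The helper name pad is made up for illustration and is not part of the script above; the widths are the same ones the script passes.

```shell
#!/bin/bash

# pad WIDTH TEXT: right-align TEXT in a field WIDTH characters wide.
# (hypothetical helper, not part of the original script)
pad() {
    printf "%*s" "$1" "$2"
}

wd=6
pad "$wd" 1000            # "  1000"
pad "$wd" 0               # "     0"
echo
pad "$((2*wd))" " "       # twelve spaces, used as blank padding for A1 and B1
pad "$wd" 10              # "    10"
echo
```

Because the width is an ordinary argument, it can be computed (as with $((2*wd)) here) rather than baked into the format string.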

Rather than having the script open a file itself, just redirect the input file to the script on stdin. (If you would rather pass a filename as an argument, redirect that file into the while read ... loop inside the script.)
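
Both ways of feeding the loop can be sketched with a small function. The name count_fields is hypothetical; it only prints the number of fields per line, which is the value the script's case statement branches on. It assumes /dev/stdin is available, as it is on Linux.

```shell
#!/bin/bash

# count_fields [FILE]: print the number of whitespace-separated fields on
# each line of FILE, or of stdin when no FILE is given.
# (hypothetical helper, shown only to illustrate feeding the read loop)
count_fields() {
    local infile=${1:-/dev/stdin}   # fall back to stdin if no filename
    local -a arr
    while read -r -a arr; do        # split each line into an array
        echo "${#arr[@]}"
    done < "$infile"                # redirect the file into the loop
}
```

With this, count_fields dat/text1.txt and count_fields < dat/text1.txt read the same data.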

Example Use/Output

$ bash text1format.sh <dat/text1.txt
##   A1    B1   Tn    Vn ## --> headers
      1  1000    0   100
                10   200
                20   300
                30   400
                40   500
                50   600
                60   700
                70   800
         1010    0   101
                10   201
                20   301
                30   401
                40   501
                50   601
      2  1000    0   110
                15   210
                25   310
                35   410
                45   510
                55   610
                65   710
         1010    0   150
                10   250
                20   350
                30   450
                40   550

Between awk and bash, awk will generally be faster, but with formatted output like this the difference may be smaller than usual. Look things over and let me know if you have questions.
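
For comparison, the same field-count logic can be sketched in awk. This is only a sketch under the same assumptions as the bash script: the header line starts with '#' and carries two extra trailing words ("-->" and "headers"), a row holding both A1 and B1 has as many fields as the header has columns, and a row holding only B1 has one fewer. The wrapper name reflow and the fixed widths in the format string are choices made here to match the desired output, not something taken from the question.

```shell
#!/bin/bash

# reflow [FILE]: hypothetical wrapper; reads FILE, or stdin if none is given.
reflow() {
    awk '
    /^#/ {                              # header: remember the full-row field count
        nf = NF - 2                     # drop the trailing "-->" and "headers"
        print "##   A1    B1   Tn    Vn ## --> headers"
        next
    }
    NF == 0 { next }                    # skip blank lines
    {
        a = ""; b = ""; start = 1       # default: continuation row of T/V pairs
        if (NF == nf)          { a = $1; b = $2; start = 3 }   # row with A1 and B1
        else if (NF == nf - 1) { b = $1; start = 2 }           # row with B1 only
        for (i = start; i < NF; i += 2) {
            printf "%7s%6s%5s%6s\n", a, b, $i, $(i + 1)
            a = ""; b = ""              # show A1/B1 only beside the first pair
        }
    }' "$@"
}
```

It would be called as reflow dat/text1.txt or reflow < dat/text1.txt. Like the bash version, it cannot tell a short A1 row from a continuation row, so it relies on rows that carry A1 or B1 always being full.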
