Bash: transform key-value lines to CSV format [closed]

Question

Closed. This question needs details or clarity. It is not currently accepting answers. Want to improve this question? Add details and clarify the problem by editing this post. Closed 6 years ago. Improve this question Editor's note: I've clarified the problem definition, because I think the problem is an interesting one, and this question deserves to be reopened. I've got

Accepted Answer

A simple solution with cut, paste, and head (assumes input file file, outputs to file out.csv):#!/usr/bin/env bash{ cut -d':' -f1 file | head -n 3 | paste -d, - - -; cut -d':' -f2- file | paste -d, - - -; } >out.csvcut -d':' -f1 file | head -n 3 creates the header line:cut -d':' -f1 file extracts the first :-based field from each input line, and head -n 3 stops after 3 lines, given that the headers repeat every 3 lines.paste -d, - - - takes 3 input lines from stdin (one for each -) and combines them to a single, comma-separated output line (-d,)cut -d':' -f2- file | paste -d, - - - creates the data lines:cut -d':' -f2- file extracts everything after the : from each input line.As above, paste then combines 3 values to a single, comma-separated output line.agc points out in a comment that the column count (3) and the paste operands (- - -) are hard-coded above.The following solution parameterizes the column count (set it via n=...):{ n=3; pasteOperands=$(printf '%.s- ' $(seq $n)) cut -d':' -f1 file | head -n $n | paste -d, $pasteOperands; cut -d':' -f2- file | paste -d, $pasteOperands; } >out.csvprintf '%.s- ' $(seq $n) is a trick that produces a list of as many space-separated - chars. as there are columns ($n).While the previous solution is now parameterized, it still assumes that the column count is known in advance; the following solution dynamically determines the column count (requires Bash 4+ due to use of readarray, but could be made to work with Bash 3.x):# Determine the unique list of column headers and# read them into a Bash array.readarray -t columnHeaders < <(awk -F: 'seen[$1]++ { exit } { print $1 }' file)# Output the header line.(IFS=','; echo "${columnHeaders[*]}") >out.csv# Append the data lines.cut -d':' -f2- file | paste -d, $(printf '%.s- ' $(seq ${#columnHeaders[@]})) >>out.csvawk -F: 'seen[$1]++ { exit } { print $1 } outputs each input line’s column name (the 1st :-separated field), remembers the column names in associative array seen, and stops at the first column name that is seen for the second time.readarray -t columnHeaders reads awk‘s output line by line into array columnHeaders(IFS=','; echo "${columnHeaders[*]}") >out.csv prints the array elements using a space as the separator (specified via $IFS); note the use of a subshell ((...)) so as to localize the effect of modifying $IFS, which would otherwise have global effects.The cut ... pipeline uses the same approach as before, with the operands for paste being created based on the count of the elements of array columnHeaders (${#columnHeaders[@]}).To wrap the above up in a function that outputs to stdout and also works with Bash 3.x:toCsv() { local file=$1 columnHeaders # Determine the unique list of column headers and # read them into a Bash array. IFS=$'n' read -d '' -ra columnHeaders < <(awk -F: 'seen[$1]++ { exit } { print $1 }' "$file") # Output the header line. (IFS=','; echo "${columnHeaders[*]}") # Append the data lines. cut -d':' -f2- "$file" | paste -d, $(printf '%.s- ' $(seq ${#columnHeaders[@]}))}# Sample invocationtoCsv file > out.csv

Advertisement

Answer