I would appreciate your advice on how to deal with the following problem using Linux bash scripts.
I have a file with the following structure:
abc	qwe, rty, 23er
abc	ex6, 56789 45768, ew, fgcv
abc	dfd, fghfghgf
d-e2	ghjkfg, 34
d-e2	opuo2rr
d-e2	56 rt-y, rggg, 1 9
It must be turned into:
abc, qwe, rty, 23er, ex6, 56789 45768, ew, fgcv, dfd, fghfghgf
d-e2, ghjkfg, 34, opuo2rr, 56 rt-y, rggg, 1 9
Simply put, all lines starting with the same leading word («abc» and «d-e2» in my example) must be combined into a single line. The leading word is separated from the rest of the line by a tab character, and the other records are separated by commas; each record may consist of several words separated by multiple spaces, which must be preserved during conversion.
I could try to put something together myself from sed or awk, grep and a while loop, but the file is fairly big (a bit under one million lines), so the efficiency of the solution matters, and I have only a rudimentary knowledge of bash scripting. A solution in Perl, Python or anything else would be great too, but I’d like to use this to learn more about bash.
Answer
I think this is covered here: “Join lines with the same value in the first column”. There is a neat AWK one-liner in that thread which should work for you as well.
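For illustration, here is a minimal sketch of that kind of AWK approach. It is not the one-liner from the linked thread, just one possible way to do it. It assumes the leading word is everything before the first tab, that the merged result fits in memory (which should be fine for about a million lines), and that groups should be printed in the order their leading word first appears. The function name join_by_key is made up for this example.

```shell
# Sketch: merge all lines that share the same tab-separated leading word.
# join_by_key is a hypothetical helper name, not a standard tool.
join_by_key() {
    awk -F'\t' '
        {
            key = $1
            # Text after the first tab; taken from $0 so inner spaces stay untouched.
            rest = substr($0, length(key) + 2)
            if (!(key in merged)) {
                order[++n] = key      # remember first-seen order of keys
                merged[key] = key
            }
            if (rest != "") merged[key] = merged[key] ", " rest
        }
        END {
            for (i = 1; i <= n; i++) print merged[order[i]]
        }
    ' "$@"
}
```

It could be called as `join_by_key input.txt > output.txt` (both file names are placeholders). The file is read once and no sorting is needed, so lines with the same leading word do not have to be adjacent.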
Also, I think you should give your question a more specific title: not just “sorting file with specific structure”, but something more concrete, like the title in the example above. Cheers.