Hi experts, I want to split a large single-column text file at a particular symbol (here >) and paste the resulting pieces side by side, as in the example below.
I tried split -l 4 inputfile > outputfile, but it does not help. I hope someone can help me.
For example, I have data as given below:
>
1
2
2
4
>
4
3
5
3
>
4
5
2
3
and I need output like this:
1 4 4
2 3 5
2 5 2
4 3 3
Answer
EDIT: As per the OP’s comment, the number of lines between > marks may not be regular. For that case I came up with the following, which prints NA wherever a block has no value for a given row. Written and tested with GNU awk, assuming there are no empty lines in your Input_file.
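awk -v RS=">" -v FS="\n" '
FNR==NR{                  ##First pass: record the largest field count of any block.
  max=(max>NF?max:NF)
  next
}
FNR>1{                    ##Second pass: skip the empty record before the first >.
  for(i=2;i<max;i++){
    val[i]=(val[i]?val[i] OFS:"")($i!=""?$i:"NA")   ##Empty field means a missing value; a literal 0 is kept.
  }
}
END{
  for(i=2;i<max;i++){
    print val[i]
  }
}' Input_file Input_file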
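The padding idea can also be sketched in a single pass using only POSIX awk features. This is an alternative illustration, not the script above: it stores each value under a (row, block) key and fills the gaps with NA when printing. The function name paste_blocks and the sample values are assumptions made up for this example.

```shell
#!/bin/sh
# Sketch: paste ">"-delimited blocks side by side, padding short blocks with NA.
# paste_blocks is a hypothetical helper name; it reads the named files or stdin.
paste_blocks() {
  awk '
  /^>/ { block++; row = 0; next }       # a ">" line starts a new block
  {
    row++
    cell[row, block] = $0               # remember the value by (row, block)
    if (row > maxrow) maxrow = row      # track the longest block seen so far
  }
  END {
    for (r = 1; r <= maxrow; r++) {
      line = ""
      for (b = 1; b <= block; b++) {
        v = ((r, b) in cell) ? cell[r, b] : "NA"
        line = line (b > 1 ? OFS : "") v
      }
      print line
    }
  }' "$@"
}

# Made-up irregular input: blocks of 4, 2 and 3 values.
printf '%s\n' '>' 1 2 2 4 '>' 4 3 '>' 4 5 2 | paste_blocks
```

With that made-up input the function prints:

1 4 4
2 3 5
2 NA 2
4 NA NA

Lines that appear before the first > are ignored by this sketch, and the whole file is held in memory, which may matter for very large inputs.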
Could you please try the following? Written and tested with the shown samples in GNU awk. It fits input where every block between > marks has the same number of lines.
awk '
/^>/{
  count=""
  next
}
{
  ++count
  val[count]=(val[count]?val[count] OFS:"")$0
}
END{
  for(i=1;i<=count;i++){
    print val[i]
  }
}' Input_file
Explanation: a detailed, line-by-line explanation of the above.
awk '                                          ##Start of the awk program.
/^>/{                                          ##If a line starts with >, do the following.
  count=""                                     ##Reset the count variable for the new block.
  next                                         ##Skip all further statements for this line.
}
{
  ++count                                      ##Increment count by 1 (row number within the block).
  val[count]=(val[count]?val[count] OFS:"")$0  ##Append the current line to val[count], separated by OFS (a space by default).
}
END{                                           ##Start of the END block of this program.
  for(i=1;i<=count;i++){                       ##Loop over the rows counted in the last block.
    print val[i]                               ##Print the array val with index i.
  }
}' Input_file                                  ##Input_file name goes here.
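To check the simpler script against the sample from the question, one could wrap it in a small shell function and feed it the data on standard input. This is only a test harness sketch: the wrapper name side_by_side is an assumption for illustration, and the awk body is the same logic as the answer's script.

```shell
#!/bin/sh
# Demo only: side_by_side is a hypothetical wrapper name, not part of the answer.
# It reads the named files, or standard input when no file is given.
side_by_side() {
  awk '
  /^>/ { count = ""; next }                              # reset the row counter at each ">"
  {
    ++count
    val[count] = (val[count] ? val[count] OFS : "") $0   # append the value to its row
  }
  END {
    for (i = 1; i <= count; i++) print val[i]            # count is the length of the last block
  }' "$@"
}

# Sample data from the question, one value per line.
printf '%s\n' '>' 1 2 2 4 '>' 4 3 5 3 '>' 4 5 2 3 | side_by_side
```

For the question's sample this prints the four rows 1 4 4, 2 3 5, 2 5 2 and 4 3 3, matching the requested output.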