How to delete prefix, suffix in a string matching a pattern and split on a character using sed?

I have the following string, which is the output of a Cassandra query run from bash:

JavaScript

I want to strip everything from the beginning of this string up to the last + symbol, and then remove the trailing part, which is (XYZ rows).

So, the string becomes A|1|a B|2|b C|3|c D|4|d. Now, I want to split this string into multiple arrays that look like this

JavaScript

so that I can iterate over each row using a for loop to do some processing. The number of rows can vary.

How can I do this using sed or grep?

I tried this for the first pass but it didn’t work:

JavaScript

NOTE: The column strings can contain multiple spaces. For example, the output of the CQL query when written to a file is:

JavaScript


Answer

JavaScript

The substitution does the following (hat tip to potong for pointing out how to get rid of one more substitution):

JavaScript

resulting in this intermediate stage:

JavaScript
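
As a rough sketch of this kind of prefix and suffix removal (not necessarily the exact command used above), assume a hypothetical one-line input in which a header and a dashed separator precede the rows and a `(N rows)` marker follows them. The header names and the variable names `raw` and `trimmed` are made up for illustration.

```shell
# Hypothetical flattened query output; the header names are invented.
raw=' k | n | v ---+---+--- A|1|a B|2|b C|3|c D|4|d (4 rows)'

# First s///: drop everything through the last '+', the dashes after it,
#             and the spaces that follow.
# Second s///: drop the trailing "(N rows)" marker and surrounding spaces.
# In a basic regular expression, '+', '(' and ')' are literal characters.
trimmed=$(printf '%s\n' "$raw" | sed 's/.*+-* *//; s/ *([0-9]* rows) *$//')

printf '%s\n' "$trimmed"
```

Because `.*` is greedy, `.*+` consumes up to the last `+` on the line, which is what removes the whole header and separator in one step.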

The transformation (y///) turns all spaces into newlines and pipes into spaces.
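
A minimal sketch of that transformation, applied to the intermediate string from the question. The loop and the names `lines`, `c1`, `c2`, `c3` are illustrative assumptions, not part of the original answer.

```shell
# Intermediate string from the question: rows separated by spaces,
# columns separated by pipes.
rows='A|1|a B|2|b C|3|c D|4|d'

# y/// maps characters one-to-one: space -> newline, pipe -> space.
lines=$(printf '%s\n' "$rows" | sed 'y/ |/\n /')

# Iterate over the rows; read splits each line on whitespace.
printf '%s\n' "$lines" | while read -r c1 c2 c3; do
    printf 'first=%s second=%s third=%s\n' "$c1" "$c2" "$c3"
done
```

In bash, `read -r -a fields` could be used in place of the three named variables to get each row as an array.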

Spaces other than the ones separating rows

If there are spaces within columns, and we assume that each entry has the format

JavaScript

i.e., exactly two sets of spaces per entry, we have to replace the transformation y/// with another substitution,

JavaScript

This looks for a run of spaces that is preceded by a character that is neither a space nor a pipe and followed by a character that is neither a space nor a pipe, and inserts a newline before those spaces. Result:

JavaScript
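
To illustrate a substitution of this kind, here is a minimal sketch. It assumes a hypothetical entry format such as `A |1 |a`, where the only spaces inside an entry sit directly before a pipe; the sample data and the variable names `rows` and `split` are assumptions for illustration.

```shell
# Hypothetical input: each entry has exactly two runs of spaces, both
# directly before a pipe; rows are separated by a single space.
rows='A |1 |a B |2 |b C |3 |c'

# Capture a non-space/non-pipe character, then a run of spaces followed by
# another non-space/non-pipe character, and put a newline between the two.
# The backslash-newline in the replacement is the portable way to insert one.
split=$(printf '%s\n' "$rows" | sed 's/\([^ |]\)\(  *[^ |]\)/\1\
\2/g')

printf '%s\n' "$split"
```

Each row after the first keeps its leading spaces, since the newline goes in before them. This only works while column values themselves contain no internal spaces.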
User contributions licensed under: CC BY-SA