Formatting the output with the sed command

I need some help. I have text like this:

2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046

The output I need is the first and second columns, followed by either the socialSecurityNumber or the tng-customer-id:

2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046

So the question is: can this be solved with a single sed command? I need an OR (alternation) here.

If I try to do it separately, first searching for the socialSecurityNumber, I get this:

wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\("socialSecurityNumber"[^,]*\).*/\1 \2/' testfile.txt
2016.04.10 19:24:00,044  "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126  "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046

Second, searching for the tng-customer-id, I get this:

wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\(tng-customer-id[^ ]*\).*/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005  tng-customer-id=9303801442
2016.04.05 20:44:51,003  tng-customer-id=2875223046

As you can see, in the first example the last two lines, where socialSecurityNumber is not found, are printed unchanged. The same happens in the second example for the lines without tng-customer-id.

When I try to combine the two sed commands with the OR operator, I get this output, which is wrong:

wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\(\("socialSecurityNumber"[^,]*\).*\|\(tng-customer-id=[^ ]*\).*\)/\1 \2/' testfile.txt
2016.04.10 19:24:00,044  "socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126  "socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005  tng-customer-id=9303801442
2016.04.05 20:44:51,003  tng-customer-id=2875223046

So what am I doing wrong?

Answer

Use this sed:

sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\|tng-customer-id=[^ ]*\).*$/\1 \2 \3/g' file

Test:

$ sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\|tng-customer-id=[^ ]*\).*$/\1 \2 \3/g' a
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
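
The \| alternation in a basic regular expression is a GNU sed extension. As a hedged alternative, here is a minimal sketch of the same substitution written with extended regular expressions (-E), where grouping and alternation need no backslashes. It also adds -n and the p flag so that lines containing neither field are dropped instead of being printed unchanged. The function name extract_ids and the sample lines are only illustrative assumptions, not part of the original answer.

```shell
# extract_ids is a hypothetical helper name used only for this sketch.
# -E : extended regular expressions, so ( ) and | are written unescaped
# -n : do not print lines automatically
# p  : print only the lines where the substitution succeeded
extract_ids() {
  sed -nE 's/^([^ ]*) ([^ ]*).*("socialSecurityNumber":"[^"]*"|tng-customer-id=[^ ]*).*$/\1 \2 \3/p' "$@"
}

# Sample input: one line per field type, plus one line with neither field.
printf '%s\n' \
  '2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]' \
  '2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442' \
  'a line with neither field' |
  extract_ids
```

With no file arguments the helper reads standard input; passing a file name, as in extract_ids testfile.txt, works the same way.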

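Starting from your command, the corrected version is:

sed 's/\([^+]*\).*\(\("socialSecurityNumber"[^,]*\)\|\(tng-customer-id=[^ ]*\)\).*/\1 \2/'

I removed the .* from inside each alternative of the outer group and put a single .* after the group instead. That way the rest of the line is still matched, but it is no longer captured, so it does not end up in \2.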
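
To illustrate the point about where the .* sits, here is a small sketch that runs both forms on one of the sample lines. It uses sed -E for readability, and the variable names greedy and tight are only illustrative; the shortened log line is an assumption based on the data in the question.

```shell
line='2016.04.10 19:24:00,044 +0300 x [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]'

# .* inside the group: the capture runs on to the end of the line
greedy=$(printf '%s\n' "$line" |
  sed -E 's/^([^+]*).*("socialSecurityNumber"[^,]*.*)/\1\2/')

# .* outside the group: the capture stops before the first comma
tight=$(printf '%s\n' "$line" |
  sed -E 's/^([^+]*).*("socialSecurityNumber"[^,]*).*/\1\2/')

printf '%s\n' "$greedy" "$tight"
```

The first result still carries the socialSecurityNumberCountryCode part, while the second ends right after the number, which matches the difference between the question's output and the expected output.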

User contributions licensed under: CC BY-SA