I need your help. I have text like this:
2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046
The output I need is the first and second columns plus either socialSecurityNumber or tng-customer-id:
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
So the question is: is it possible to do this with a single sed command? I need an OR (alternation) here.
If I handle the two cases separately and first look for socialSecurityNumber, I get this:
wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\("socialSecurityNumber"[^,]*\).*/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046
Second, when I look for tng-customer-id, I get this:
wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\(tng-customer-id[^ ]*\).*/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
As you can see, in the first example the last two lines contain no socialSecurityNumber, so sed prints them unchanged. The second example has the same problem with the first two lines.
When I try to combine the two patterns in one sed command with an OR operator, I get this output, which is completely wrong:
wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\(\("socialSecurityNumber"[^,]*\).*\|\(tng-customer-id=[^ ]*\).*\)/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
So what am I doing wrong?
Answer
Use this sed command:
sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\|tng-customer-id=[^ ]*\).*$/\1 \2 \3/g' file
Test:
$ sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\|tng-customer-id=[^ ]*\).*$/\1 \2 \3/g' a
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
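The \| alternation in a basic regular expression is a GNU extension, so the same idea may be easier to carry between systems when written with extended regular expressions (-E, accepted by both GNU and BSD sed). The following is only a sketch: the function names extract_ids and extract_ids_only are made up for illustration, and the second variant assumes you want to drop lines that contain neither field instead of passing them through unchanged.

```shell
# extract_ids (hypothetical name): keep date, time and whichever ID is present.
# Lines with neither field do not match and are printed unchanged.
extract_ids() {
  sed -E 's/^([^ ]+) ([^ ]+).*("socialSecurityNumber":"[^"]*"|tng-customer-id=[^ ]*).*$/\1 \2 \3/' "$@"
}

# extract_ids_only (hypothetical name): same substitution, but -n turns off
# automatic printing and the p flag prints only lines where it succeeded.
extract_ids_only() {
  sed -n -E 's/^([^ ]+) ([^ ]+).*("socialSecurityNumber":"[^"]*"|tng-customer-id=[^ ]*).*$/\1 \2 \3/p' "$@"
}
```

Both functions read from the files given as arguments, or from standard input when called without arguments, for example: extract_ids testfile.txt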
From your command:
sed 's/\([^+]*\).*\(\("socialSecurityNumber"[^,]*\)\|\(tng-customer-id=[^ ]*\)\).*/\1 \2/'
I removed the .* from inside each alternative of the outer group and left a single .* after the group. That way the rest of the line is no longer part of the captured text.
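To see the effect of where the .* sits, here is a small sketch that prints only the outer capture group for one of the sample lines. It uses -E so the parentheses and | need no backslashes; the function names capture_with_tail and capture_field_only are invented for this illustration.

```shell
line='2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]'

# .* inside the outer group: the capture also swallows the rest of the line.
capture_with_tail() {
  sed -E 's/.*(("socialSecurityNumber"[^,]*).*|(tng-customer-id=[^ ]*).*)/\1/'
}

# .* outside the group: the capture ends where the field ends.
capture_field_only() {
  sed -E 's/.*("socialSecurityNumber"[^,]*|tng-customer-id=[^ ]*).*/\1/'
}

printf '%s\n' "$line" | capture_with_tail
printf '%s\n' "$line" | capture_field_only
```

The first call prints the field followed by everything after it on the line, which is the unwanted output from the question; the second prints the field alone.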