
log text parsing in linux

I need to select text from a log and store the selected fields as columns in a new file.

For example, this is the log format:

[Mon Dec 07] [error] [client 10.0.0.65] [id "981004"] [file "sample"] [line "84"] [hostname "test"] [uri "/login"] [unique_id "VmVddAo"]
[Mon Dec 07] [error] [client 10.0.0.65] [file "sample"] [line "47"] [id "960015"] [rev "1"] [msg "Request Missing an Accept Header"] [severity "NOTICE"][ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "test"] [uri "/home"] [unique_id "VmVddQo"]

I want to print output like this:

[Mon Dec 07] [id "981004"] [uri "/login"]
[Mon Dec 07] [id "960015"] [uri "/home"]

I used sed and awk to split each line into columns and print one of them:

grep "Mon Dec 07" filename | sed 's/\[[a-zA-Z]/\t&/g' | awk -F'\t' '{print $5}'

But I got this output:

[id "981004"]
[file "sample"]

This happens because the same field appears at different positions on different lines, for example:

[id "981004"] is in the 4th column
[id "960015"] is in the 6th column

How can I select values by key, where id is the key and the text inside the double quotes is its value? After selecting all the values, each one should be stored as a column in a new CSV file.

Thanks vrs & Mirosław Zalewski

#!/bin/bash

search=$1
log=$2
regexp="s/(\[$search[^]]*\]).+(\[id[^]]*\]).+(\[uri[^]]*\]).+/\1 \2 \3/p"
sed -rn "$regexp" "$log"

and

perl -n -e '$,=" "; @groups = $_ =~ m/(\[.*?\]).*(\[(?:id|uri).*?\]).*((?-2)).*/ ; print @groups, "\n"' /path/to/log/file.log

Both worked.


Answer

You can do that with a script like this:

#!/bin/bash

search=$1
log=$2
regexp="s/(\[$search[^]]*\]).+(\[id[^]]*\]).+(\[uri[^]]*\]).+/\1 \2 \3/p"
sed -rn "$regexp" "$log"

You can save this program to a file (say, script.sh), make it executable (chmod +x script.sh), and run it with: ./script.sh "Mon Dec 07" log.txt
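
As a throwaway check, the same expression can be tried on the two sample lines from the question. This is only a sketch: the temporary file stands in for a real log, and -E is used as the portable spelling of GNU sed's -r.

```shell
#!/bin/bash
# Hypothetical sample file holding the two log lines from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[Mon Dec 07] [error] [client 10.0.0.65] [id "981004"] [file "sample"] [line "84"] [hostname "test"] [uri "/login"] [unique_id "VmVddAo"]
[Mon Dec 07] [error] [client 10.0.0.65] [file "sample"] [line "47"] [id "960015"] [rev "1"] [msg "Request Missing an Accept Header"] [severity "NOTICE"][ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "test"] [uri "/home"] [unique_id "VmVddQo"]
EOF

search="Mon Dec 07"
# Same substitution as in script.sh; the backslashes are kept literally
# inside double quotes because bash only strips them before $, `, " and \.
regexp="s/(\[${search}[^]]*\]).+(\[id[^]]*\]).+(\[uri[^]]*\]).+/\1 \2 \3/p"

result=$(sed -En "$regexp" "$log")
echo "$result"
rm -f "$log"
```

With these two lines, the output should be the date, id and uri fields of each line, in that order. Note that the expression assumes the id field comes before the uri field on every line.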

This is what the script does:

  1. Assigns the first argument to the variable $search (the text to match lines against) and the second argument to the variable $log (the name of the log file)
  2. Creates a substitution command (s/pattern/replacement/p) for sed
    • (...) means grouping
    • \[some text\] means some text inside square brackets (the brackets are escaped with backslashes)
    • [^...] means any character except ..., i.e. [^]] means any character except a closing square bracket (this stops each group at the end of its bracketed field)
    • .+ means one or more of any character
    • \1 means the text matched by the first group (see the first bullet); \2 and \3 refer to the second and third groups
    • the trailing p prints each line on which the substitution succeeded
    • sed's options -rn mean use extended regular expressions (-r) and suppress the default printing of each line (-n)
  3. Runs sed with that expression on the log file (log.txt in the example above)
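
The question also asks for the quoted values themselves, written as CSV columns. One possible sketch, built from the same regex pieces, looks up each key separately so the position of a field on the line does not matter. The function names field and log_to_csv, the column header and the sample file are all assumptions made for illustration.

```shell
#!/bin/bash
# field KEY LINE: print the quoted value of [KEY "..."] found in LINE.
# Prints nothing when the key is absent. KEY must not contain "/" or
# regex metacharacters, since it is pasted into the sed expression.
field() {
    printf '%s\n' "$2" | sed -En "s/.*\[$1 \"([^\"]*)\"\].*/\1/p"
}

# log_to_csv SEARCH FILE: write one CSV row per line of FILE containing SEARCH.
log_to_csv() {
    echo 'date,id,uri'
    grep -F -- "$1" "$2" | while IFS= read -r line; do
        # The first bracketed field is taken to be the date.
        stamp=$(printf '%s\n' "$line" | sed -En 's/^\[([^]]*)\].*/\1/p')
        printf '"%s","%s","%s"\n' "$stamp" "$(field id "$line")" "$(field uri "$line")"
    done
}

# Hypothetical sample file holding the two log lines from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[Mon Dec 07] [error] [client 10.0.0.65] [id "981004"] [file "sample"] [line "84"] [hostname "test"] [uri "/login"] [unique_id "VmVddAo"]
[Mon Dec 07] [error] [client 10.0.0.65] [file "sample"] [line "47"] [id "960015"] [rev "1"] [msg "Request Missing an Accept Header"] [severity "NOTICE"][ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "test"] [uri "/home"] [unique_id "VmVddQo"]
EOF

csv=$(log_to_csv "Mon Dec 07" "$log")
echo "$csv"
rm -f "$log"
```

To store the result in a file, redirect the output, for example log_to_csv "Mon Dec 07" log.txt > out.csv. This sketch starts a sed process for every field of every line, so it is likely to be slow on large logs; it also does not escape double quotes that appear inside a value.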

Hope that helps.

User contributions licensed under: CC BY-SA