How to extract part of a log file in bash [closed]

I have a log file:

10.1.1.10 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4081 "http://www. example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"
10.1.1.11 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4084 "http://www. example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"
10.1.1.13 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4082 "http://www. example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"

I want to get the 9th field, like this:

awk '{print $9}' file.txt
4081
4084
4082

The problem is that if the 3rd column, "[17/Dec/2018:08:05:32 +0000]", contains one more space, my value moves to the 10th column.

How can I treat each bracketed or quoted value as a single field, regardless of the spaces inside it?

I want to achieve this using awk.

Answer

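In GNU awk (gawk) you can set FPAT, which splits each record by describing what a field looks like instead of what separates the fields:

awk 'BEGIN { FPAT = "(\"[^\"]+\")|(\\[[^\\]]+\\])|([^ ]+)" } { print $6 }' file.txt

This prints: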

4081
4084
4082

For column 1:

awk 'BEGIN { FPAT = "(\"[^\"]+\")|(\\[[^\\]]+\\])|([^ ]+)" } { print $1 }' file.txt

This prints:

10.1.1.10
10.1.1.11
10.1.1.13

For column 3:

awk 'BEGIN { FPAT = "(\"[^\"]+\")|(\\[[^\\]]+\\])|([^ ]+)" } { print $3 }' file.txt

This prints:

[17/Dec/2018:08:05:32 +0000]
[17/Dec/2018:08:05:32 +0000]
[17/Dec/2018:08:05:32 +0000]

For column 4:

awk 'BEGIN { FPAT = "(\"[^\"]+\")|(\\[[^\\]]+\\])|([^ ]+)" } { print $4 }' file.txt

This prints:

"GET /api/v1/services HTTP/1.1"
"GET /api/v1/services HTTP/1.1"
"GET /api/v1/services HTTP/1.1"

REGEX Explanation

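  • 1st Alternative ("[^"]+")

Matches a field that starts with " and ends with ", e.g. "GET /api/v1/services HTTP/1.1"

  • 2nd Alternative (\[[^\]]+\])

Matches a field that starts with [ and ends with ], e.g. [17/Dec/2018:08:05:32 +0000]. The backslashes before [ and ] are required so that the brackets match literally. Inside the double-quoted awk string, each backslash is written twice (\\[ and \\]) and each double quote is written as \".

  • 3rd Alternative ([^ ]+)

Matches any run of characters that contains no space, e.g. 10.1.1.10 or arcesium.com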