
How to extract part of a line from a file after grepping?

How can I extract the hostname group-3-vm1 from the output of this grep?

$ cat output/tpcds_query_1a_71_mr.out | grep "Tracking URL" | tail -1

Here’s the result:

Starting Job = job_1442587212068_0126, Tracking URL = http://group-3-vm1:8088/proxy/application_1442587212068_0126/


Answer

This awk command replaces the grep and tail commands while also extracting the text of interest:

$ awk -F'[:/]' '/Tracking URL/{n=$4;} END{print n;}' output/tpcds_query_1a_71_mr.out 
group-3-vm1

How it works

  • -F'[:/]'

    This sets the field separator to either a colon or a slash.

  • /Tracking URL/{n=$4;}

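    This matches lines containing Tracking URL and saves the fourth field in variable n. With : and / as separators, the first field ends at http, the second and third fields are the empty strings between the colon and the two slashes, and the fourth field is the hostname.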

  • END{print n;}

    After we have reached the end of the file, this prints the last value stored in n, that is, the hostname from the last matching line.
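
To see why the hostname ends up in field 4, it can help to print every field that awk produces for one sample line. The following is a minimal sketch; the helper name show_fields is made up for this illustration and is not part of the original answer.

```shell
# Hypothetical helper: print each field awk sees when a line is split
# on ':' or '/', one "index=[value]" pair per line.
show_fields() {
  printf '%s\n' "$1" |
    awk -F'[:/]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# Sample line taken from the question.
line='Starting Job = job_1442587212068_0126, Tracking URL = http://group-3-vm1:8088/proxy/application_1442587212068_0126/'
show_fields "$line"
```

Fields 2 and 3 print as empty brackets, and field 4 holds the hostname.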

Example

Here is a sample test file and output:

$ cat output/tpcds_query_1a_71_mr.out 
Starting Job = job_1442587212068_0126, Tracking URL = http://group-1-vm1:8088/proxy/application_1442587212068_0126/
Starting Job = job_1442587212068_0126, Tracking URL = http://group-2-vm1:8088/proxy/application_1442587212068_0126/
Starting Job = job_1442587212068_0126, Tracking URL = http://group-3-vm1:8088/proxy/application_1442587212068_0126/
Starting Job = job_1442587212068_0126, No Track URL = http://group-4-vm1:8088/proxy/application_1442587212068_0126/
$ awk -F'[:/]' '/Tracking URL/{n=$4;} END{print n;}' output/tpcds_query_1a_71_mr.out 
group-3-vm1
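
The field-number approach assumes that no colon or slash appears before the URL. If the log lines might carry such characters earlier on (for example a timestamp prefix, which is an assumption and not something shown in the question), a possible alternative is to anchor on the Tracking URL text itself. This is only a sketch; the function name last_tracking_host is hypothetical.

```shell
# Hypothetical helper: print the hostname from the last "Tracking URL" line.
# sed captures everything between "://" and the next ':' or '/';
# tail keeps only the last match.
last_tracking_host() {
  sed -n 's|.*Tracking URL = [a-z]*://\([^:/]*\).*|\1|p' "$1" | tail -n 1
}
```

It would be called as last_tracking_host output/tpcds_query_1a_71_mr.out and should print group-3-vm1 for the sample file above.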
User contributions licensed under: CC BY-SA