Fetch latest matching string value

I have a file that contains two lines with the initial... keyword. I want to grab the latest date from the lines matching the initial... string. After getting the date, I also need to reformat it by replacing / with -.

---other data
    INFO   | abc 1    | 2018/01/04 20:04:35 | initial...

    INFO   | abc 1    | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...

---other data

In the above example, my output should be 2018-02-05. That is, I want to select the lines that contain initial..., keep only the one with the latest date, and then strip everything except the date value.

I am using the following grep, but it does not yet meet the requirement:

grep -q -iF "initial..." /tmp/file.log

Answer

Using the knowledge that later dates appear later in the file, it’s only necessary to print the date from the last line containing initial....

First step (drop the -q from grep, since you don’t want it to be quiet):

grep -iF 'initial...' /tmp/file.log |
tail -n 1 |
sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/' -e 's%/%-%g'

The (first) s/// command matches a series of non-pipes followed by a pipe, another series of non-pipes followed by a pipe, then any blanks, then captures a series of non-blanks, and finally matches a blank and anything after it; it replaces all that with just the captured string, which is the date field after the second pipe on the input line. The (second) s%%% command replaces slashes with dashes, using % as the delimiter to avoid the confusion that the equivalent s/\//-/g might engender, thereby reformatting the date in ISO 8601 style.
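To see what each substitution contributes, the two expressions can be run one at a time on the second sample line from the question. This is only an illustrative sketch; the variable names line, stage1, and stage2 are made up for the demonstration and are not part of the original answer.

```shell
# Second sample line from the question (variable names are hypothetical).
line='    INFO   | abc 1    | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...'

# Stage 1: keep only the date field that follows the second pipe.
stage1=$(printf '%s\n' "$line" | sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/')
echo "$stage1"    # 2018/02/05

# Stage 2: replace every slash with a dash.
stage2=$(printf '%s\n' "$stage1" | sed -e 's%/%-%g')
echo "$stage2"    # 2018-02-05
```

Splitting the stages this way makes it easy to spot which expression is at fault if the log layout differs from the sample (for instance, if the date is not in the third pipe-separated field).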

But we can lose the tail with:

grep -iF 'initial...' /tmp/file.log |
sed -n -e '$ { s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }'

The -n suppresses normal output; the $ matches only the last line; the p after the second s/// operation prints the result.

The case-insensitive fixed-pattern search is more conveniently written in grep than in sed. Although it could be done in a single sed command, you have to work fairly hard, saving matching rows in the hold space, then swapping the hold and pattern space at the end, and doing the substitution and printing:

sed -n \
    -e '/[Ii][Nn][Ii][Tt][Ii][Aa][Ll]\.\.\./h' \
    -e '$ { x; s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }' /tmp/file.log

Each of these produces the output 2018-02-05 on the sample data. If fed an input with no initial... in it, they output nothing.
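A quick way to check this claim is to copy the sample data into a scratch file and run all three variants against it. The sketch below assumes a POSIX-style shell with mktemp available; the scratch file and the variable names (log, out1, out2, out3, empty) are hypothetical stand-ins, with "$log" playing the role of /tmp/file.log.

```shell
# Scratch file holding the sample data from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
---other data
    INFO   | abc 1    | 2018/01/04 20:04:35 | initial...

    INFO   | abc 1    | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...

---other data
EOF

# Variant 1: grep, tail, sed.
out1=$(grep -iF 'initial...' "$log" | tail -n 1 |
       sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/' -e 's%/%-%g')

# Variant 2: grep, then sed acting only on the last line it receives.
out2=$(grep -iF 'initial...' "$log" |
       sed -n -e '$ { s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }')

# Variant 3: sed alone, saving each matching line in the hold space.
out3=$(sed -n \
    -e '/[Ii][Nn][Ii][Tt][Ii][Aa][Ll]\.\.\./h' \
    -e '$ { x; s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }' "$log")

# Input with no initial... line: the sed-only variant should print nothing.
empty=$(printf '%s\n' '---other data' | sed -n \
    -e '/[Ii][Nn][Ii][Tt][Ii][Aa][Ll]\.\.\./h' \
    -e '$ { x; s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }')

printf '%s\n' "$out1" "$out2" "$out3"
rm -f "$log"
```

On the sample data this prints 2018-02-05 three times, and empty stays an empty string. All three variants still rely on the file being in chronological order; none of them compares the dates themselves.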
