Skip to content
Advertisement

Grep across multiple lines but returning all matches

I’m trying to grep against a windows file under cygwin or linux, to find all MySQL statements between a table lock and table unlock in a mysql log.

I can use awk, but I need the line numbers from the original file

I don’t see why -Pzo isn’t working in cygwin

Effectively I’m looking for is a way to locate statements which are prevented from obtaining a lock (where the main query is interrupted)

If I can at least get sensible results from grep then I can then search the results to find entries where something has interrupted the query.

    2302221 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
    2302221 Query   SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182716 AND finishAtom > 1182716
    2302221 Query   INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182716,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274545,274588,315392,0,0,0,1)
    2302221 Query   UPDATE browse_being_allocated SET startAtom = 1182717 WHERE id = 26471948
    2302221 Query   UNLOCK TABLES
    2522564 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
    2522564 Query   SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182717 AND finishAtom > 1182717
    2522564 Query   INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182717,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274588,274633,331776,0,0,0,1)
    2522564 Query   UPDATE browse_being_allocated SET startAtom = 1182718 WHERE id = 26471948
    2522564 Query   UNLOCK TABLES

The example above would be sensible, and below would represent the first part of a fault condition.

            2334151 Change user user@dbsrv1 on db
    2334151 Query   SET NAMES utf8
    2334151 Query   SET character_set_results = NULL
    2334151 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
    2302201 Change user user@dbsrv1 on db
    2302201 Query   SET NAMES utf8
    2302201 Query   SET character_set_results = NULL
    2302201 Query   SELECT DISTINCT rushID FROM tags WHERE rushID NOT IN (SELECT DISTINCT rushID FROM essencefragments) GROUP BY rushID
 151216 19:00:39    2566722 Quit    

I’ve tried many different ways to get a multiline match, but I’ve simply not been able to get anything working. Ideally I’d get it going under cygwin, and I really need to keep the original line numbers so I can jump back to the original logs and tie it up.

Any thoughts? Cheers

Advertisement

Answer

awk should be able to do what you want. I think perl and sed probably would too for that matter.

Using the -E (extended regex) I believe is making your regex too greedy. As far as why your grep -P is not working, you’ll have to use grep –version and grep–help and do some research. Mine is working fine with GNU grep 2.22 on Ubuntu 16.04.

If I understand your question correctly (which I very well may not), you’re wanting everything between LOCK and UNLOCK strings. Would this be a viable solution?

awk 'BEGIN {ln=1; lck="n"; print "---"};
     lck=="y" {print ln")",$0};
     $3=="UNLOCK" {lck="n"; ln++; print "---"; next};
     $3=="LOCK" && lck=="n" {print ln")",$0; lck="y";ln++; next};
     {ln++};
' NEWSJBHQDB12A.log > NEWSJBHQDB12A_filtered.txt;

It should keep track and display the line numbers (var ln) for you to easily reference against your original log file. With the use of the “lck” variable as an on/off switch, it will begin printing when it finds “LOCK” and stop printing when it finds an “UNLOCK.”

$ cat file
2302221 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
2302221 Query   SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182716 AND finishAtom > 1182716
2302221 Query   INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182716,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274545,274588,315392,0,0,0,1)
2302221 Query   UPDATE browse_being_allocated SET startAtom = 1182717 WHERE id = 26471948
2302221 Query   UNLOCK TABLES
2334151 Change user user@dbsrv1 on db
2334151 Query   SET NAMES utf8
2334151 Query   SET character_set_results = NULL
2334151 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
2302201 Change user user@dbsrv1 on db
2302201 Query   SET NAMES utf8
2302201 Query   SET character_set_results = NULL
2302201 Query   SELECT DISTINCT rushID FROM tags WHERE rushID NOT IN (SELECT DISTINCT rushID FROM essencefragments) GROUP BY rushID 151216 19:00:39
2566722 Quit    
2522564 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
2522564 Query   SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182717 AND finishAtom > 1182717
2522564 Query   INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182717,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274588,274633,331776,0,0,0,1)
2522564 Query   UPDATE browse_being_allocated SET startAtom = 1182718 WHERE id = 26471948
2522564 Query   UNLOCK TABLES

Test:

$ awk 'BEGIN {ln=1; lck="n"; print "---"};
     lck=="y" {print ln")",$0};
     $3=="UNLOCK" {lck="n"; ln++; print "---"; next};
     $3=="LOCK" && lck=="n" {print ln")",$0; lck="y";ln++; next};
     {ln++};
' file
---
1)     2302221 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
2)     2302221 Query   SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182716 AND finishAtom > 1182716
3)     2302221 Query   INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182716,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274545,274588,315392,0,0,0,1)
4)     2302221 Query   UPDATE browse_being_allocated SET startAtom = 1182717 WHERE id = 26471948
5)     2302221 Query   UNLOCK TABLES
---
9)     2334151 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
10)     2302201 Change user user@dbsrv1 on db
11)     2302201 Query   SET NAMES utf8
12)     2302201 Query   SET character_set_results = NULL
13)     2302201 Query   SELECT DISTINCT rushID FROM tags WHERE rushID NOT IN (SELECT DISTINCT rushID FROM essencefragments) GROUP BY rushID 151216 19:00:39
14)     2566722 Quit    
15)     2522564 Query   LOCK TABLES browse WRITE, browse_being_allocated WRITE
16)     2522564 Query   SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182717 AND finishAtom > 1182717
17)     2522564 Query   INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182717,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274588,274633,331776,0,0,0,1)
18)     2522564 Query   UPDATE browse_being_allocated SET startAtom = 1182718 WHERE id = 26471948
19)     2522564 Query   UNLOCK TABLES
---

Simple example:

$ cat file 
NO PRINT
NO PRINT
1 Query LOCK
STUFF
STUFF
STUFF
1 Query UNLOCK
NO PRINT
2 Query LOCK
STUFF
2 Query UNLOCK
NO PRINT
NO PRINT
NO PRINT
NO PRINT

$ awk 'BEGIN {ln=1; lck="n"; print "---"};
     lck=="y" {print ln")",$0};
     $3=="UNLOCK" {lck="n"; ln++; print "---"; next};
     $3=="LOCK" && lck=="n" {print ln")",$0; lck="y";ln++; next};
     {ln++};
' file
---
3) 1 Query LOCK
4) STUFF
5) STUFF
6) STUFF
7) 1 Query UNLOCK
---
9) 2 Query LOCK
10) STUFF
11) 2 Query UNLOCK
---
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement