I am trying to use awk to extract a portion of each line in my file

Question

I have a large file of user agent strings, and I want to extract one particular section of each request. For input: I am trying to get output: from after /product/ in the sample above. I'm trying to use Awk, but I can't figure out the regular expression that's required for this. I'm sure it's simpler than

Accepted Answer

The square brackets you tried to put around the FS are incorrect here. Once you fix that, the remaining problem is that you simply have two fields, because you are overriding the splitting on whitespace which Awk normally does.

Because the (horrible) date format always has exactly two slashes, I think you can actually do

    awk -F / '/product/ { print $5 }' filename

Even though this divides the earlier part of the line into quite weird parts, the things after GET or PUT will always be $4, $5, etc.

If you wanted to keep your original idea, maybe try

    awk 'BEGIN { FS = "GET /product/" }
      NF == 2 {
        # second field is now everything after /product/ -- split on slash
        split($2, f, "/")
        print f[1] }' file

… or very simply, brutally remove everything except the text you want:

    awk '/\/product\// { sub(".*/product/", ""); sub("/.*", ""); print }' file

which might be better expressed as a simple sed script:

    sed -n 's%.*GET /product/\([^/]*\)/.*%\1%p' file
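To compare the four approaches side by side, here is a small sketch you can paste into a shell. The sample input did not survive in the question, so the log lines below are hypothetical: they assume an Apache-style access log with a bracketed date containing exactly two slashes, followed by the quoted request. The product names and the variable names (`log`, `by_slash`, `by_fs`, `by_sub`, `by_sed`) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data (an assumption, not the asker's real input):
# Apache-style lines whose date field contains exactly two slashes.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /product/widget-42/reviews HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.2 - - [10/Oct/2023:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/8.0"
192.0.2.3 - - [10/Oct/2023:13:56:02 +0000] "GET /product/gizmo-7/ HTTP/1.1" 200 1044 "-" "Mozilla/5.0 (Macintosh)"
EOF

# 1. Split every line on "/"; the two date slashes make the product name $5.
by_slash=$(awk -F / '/product/ { print $5 }' "$log")

# 2. Use the literal request prefix as the field separator, then split
#    the remainder on "/" and keep the first piece.
by_fs=$(awk 'BEGIN { FS = "GET /product/" }
  NF == 2 { split($2, f, "/"); print f[1] }' "$log")

# 3. Delete everything up to /product/, then everything from the next slash on.
by_sub=$(awk '/\/product\// { sub(".*/product/", ""); sub("/.*", ""); print }' "$log")

# 4. The same idea as a sed substitution with a capture group.
by_sed=$(sed -n 's%.*GET /product/\([^/]*\)/.*%\1%p' "$log")

printf '%s\n' "$by_slash"
rm -f "$log"
```

On this sample all four variables hold the same two lines, `widget-42` and `gizmo-7`, and the `/index.html` request is skipped. The `-F /` version is the shortest but depends on the slash count before the request staying fixed; the `FS="GET /product/"` and sed versions only match GET requests, so they would need adjusting if you also want PUT.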

Answer