The text file has many lines like the one below. I want to extract the text after /videos/ up to and including .mp4, plus the very last number (both shown in bold), and output each filtered line to a separate file.
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/**S4KWZTyt-32313922.mp4**.m3u8?hdnts=exp=1592315851~acl=*/S4KWZTyt-32313922.mp4.m3u8~hmac=83f4674e6bf2576b070c716a3196cb6a30f35737827ee69c8cf7e0c57a196e51 **1**
Let's say, for example, the text file content is:
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/JajSfbVN-32313922.mp4.m3u8?hdnts=exp=1592315891~acl=*/JajSfbVN-32313922.mp4.m3u8~hmac=d3ca7bd5b233a531cfe242d17d2ea0c0167b41b90fff6459e433700ffc969d69 19
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/Qs3xZqcv-32313922.mp4.m3u8?hdnts=exp=1592315940~acl=*/Qs3xZqcv-32313922.mp4.m3u8~hmac=c30e2082bf748a6b4d1621c1d33a95319baa61798775e9da8856041951cf5233 20
The output should be:
JajSfbVN-32313922.mp4 19
Qs3xZqcv-32313922.mp4 20
Answer
You may try the below regex:
.*/videos/(.*?mp4).*?(?<= )(\d+)
Explanation of the above regex:
.* – Matches everything before /videos/.
/videos/ – Matches /videos/ literally.
(.*?mp4) – First capturing group; lazily matches everything up to and including the first mp4.
.*? – Lazily matches everything up to the final number.
(?<= ) – Positive lookbehind; requires a space immediately before the digits.
(\d+) – Second capturing group; matches the number at the end of the line, as you required.
You can find a demo of the above regex here.
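To make the two capturing groups concrete, here is a minimal Python sketch that applies the same pattern to one of the sample lines from the question. The function name `extract` is only an illustration, and the sketch assumes each input line holds a single URL followed by a space and the number.

```python
import re

# Same pattern as in the answer, with the backslash in \d written out.
PATTERN = re.compile(r".*/videos/(.*?mp4).*?(?<= )(\d+)")


def extract(line):
    """Return (file name, trailing number) for one line, or None if it does not match."""
    match = PATTERN.search(line)
    return (match.group(1), match.group(2)) if match else None


# One of the sample lines from the question, split only for readability.
sample = (
    "https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/"
    "JajSfbVN-32313922.mp4.m3u8?hdnts=exp=1592315891~acl=*/JajSfbVN-32313922.mp4.m3u8"
    "~hmac=d3ca7bd5b233a531cfe242d17d2ea0c0167b41b90fff6459e433700ffc969d69 19"
)

print(extract(sample))  # ('JajSfbVN-32313922.mp4', '19')
```

The lookbehind is what keeps the digits inside the URL (for example in the hmac value) from being picked up: only digits directly preceded by a space qualify.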
Command-line implementation on Linux:
cat regea.txt | perl -ne 'print "$1 $2\n" while /.*\/videos\/(.*?mp4).*?(?<= )(\d+)/g;' > out.txt
You can find a sample implementation of the above command here.
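The command above collects every result in a single out.txt, whereas the question asks for each filtered line in a separate file. The following is a rough Python sketch of one way to do that. The function name `write_separate_files` and the naming scheme (the trailing number plus `.txt`) are assumptions made for illustration, not something the question specifies; note that two lines with the same trailing number would overwrite each other under this scheme.

```python
import re
from pathlib import Path

# Same pattern as in the answer.
PATTERN = re.compile(r".*/videos/(.*?mp4).*?(?<= )(\d+)")


def write_separate_files(lines, out_dir):
    """Write "<name> <number>" for each matching line into its own file.

    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not look like the sample URLs
        name, number = match.groups()
        # Assumed naming scheme: <trailing number>.txt, e.g. 19.txt
        path = out_dir / f"{number}.txt"
        path.write_text(f"{name} {number}\n")
        written.append(path)
    return written
```

Assuming the input is the regea.txt file used in the command, opening it with `open("regea.txt")` and passing the file object and a directory name such as `"out"` to `write_separate_files` would produce one small file per matching line.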