Copying multiple text lines into a file after specified pattern using shell [closed]

I want to copy every occurrence of a pattern found in file1 into file2 using the shell.
The pattern is a 10-digit number whose value is always different, for example: “2016854218”

file1 example (input):

[...]
    <a class="none" data-container="#fr_5854841" href="https://example.com/profiles/2016854218"></a>
    <div class="new_cl">
        <img src="2016854218_medium.jpg">
    </div>
    <div class="blocker">Novaa<br>
        <span class="friend_small_text">
[...]

file2 example (output):

2016854218
2016859711
2017076181

Answer

EDIT: Since the OP wants the complete value of the http link, up to and including all the digits, I am adding this solution too:

awk --re-interval 'match($0,/https:.*[0-9]{10}/){print substr($0,RSTART,RLENGTH)}' Input_file
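
As a rough illustration of the command above, the sketch below runs a similar match against the anchor line from the question. Two details are assumptions and not part of the original answer: the pattern is built as a string of ten `[0-9]` classes so that it also runs on awk implementations without interval expressions, and `[^"]*` stands in for `.*` on the assumption that the URL sits inside a double-quoted attribute, so the match cannot run past the closing quote.

```shell
# Sample line taken from the question's file1 snippet.
line='<a class="none" data-container="#fr_5854841" href="https://example.com/profiles/2016854218"></a>'

# Build the regex as a string: "https:", any run of non-quote characters, then ten digits.
url=$(printf '%s\n' "$line" | awk '
BEGIN { re = "https:[^\"]*"; for (i = 0; i < 10; i++) re = re "[0-9]" }
match($0, re) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$url"
```

On this sample line the printed value should be the full profile link ending in the ten digits.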

First, check whether your Input_file contains Control-M (carriage return) characters by running cat -v Input_file. If it does, run the dos2unix utility if you have it. If you don’t have it, use:

tr -d '\r' < Input_file > temp_file && mv temp_file Input_file

The above removes every Control-M character in the file, though. To remove only the ones at the end of each line, use:

awk '{sub(/\r$/,"")}1' Input_file > temp_file && mv temp_file Input_file
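
To make the clean-up step concrete, here is a minimal sketch that creates a small file with Windows line endings, strips the trailing carriage returns with the awk command above, and then counts how many carriage returns are left. The temporary directory and the two sample lines are made up for illustration.

```shell
# Hypothetical sample file with Windows (CRLF) line endings.
tmpdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$tmpdir/Input_file"

# Strip one carriage return at the end of each line, then replace the file.
awk '{sub(/\r$/,"")}1' "$tmpdir/Input_file" > "$tmpdir/temp_file" &&
  mv "$tmpdir/temp_file" "$tmpdir/Input_file"

# Count the carriage-return bytes that remain in the file.
cr_count=$(tr -cd '\r' < "$tmpdir/Input_file" | wc -c)
printf 'carriage returns left: %d\n' "$cr_count"
```

A count of zero means the file is ready for the extraction step.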

Once Input_file is free of Control-M characters, you can use the following:

awk --re-interval 'match($0,/[0-9]{10}/){print substr($0,RSTART,RLENGTH)}' Input_file > Output_file
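
One thing to watch for: in the question's sample, the same ID appears in both the href and the img src, so a command that prints one match per line would list it twice. The sketch below is a possible variant, not the original answer's command: it keeps the same match/substr approach, skips IDs that were already printed, and builds the ten-digit pattern as a string so that no interval-expression support is needed. The temporary directory and the `seen` array name are illustrative choices.

```shell
# Hypothetical sample input based on the snippet in the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/Input_file" <<'EOF'
    <a class="none" data-container="#fr_5854841" href="https://example.com/profiles/2016854218"></a>
    <div class="new_cl">
        <img src="2016854218_medium.jpg">
    </div>
EOF

# Print the first run of ten digits on each line, skipping IDs already printed.
awk '
BEGIN { for (i = 0; i < 10; i++) re = re "[0-9]" }
match($0, re) {
    id = substr($0, RSTART, RLENGTH)
    if (!seen[id]++) print id
}' "$tmpdir/Input_file" > "$tmpdir/Output_file"

cat "$tmpdir/Output_file"
```

Like the original command, this only looks at the first ten-digit run on each line; lines carrying several different IDs would need a loop over the remainder of the line.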

You can drop --re-interval if you have a newer version of GNU awk, since interval expressions are enabled by default from gawk 4.0 onward.
