
sed command in Linux

How do I extract URLs from a file? My file is named URL_name.txt and contains many URLs. It looks like this:

<pre>
<div><something>something here<href="http://www.google.com/">something here</font>
<font><href="http://www.stackoverflow.com/">something</td>

..
..
..
</pre>

Here is my idea: remove everything before each URL, then remove everything after it. How can I do this with the sed command? The output should be:

http://www.google.com/
http://www.stackoverflow.com/


Answer

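You can use grep instead of sed. The -o option prints only the part of each line that matches the pattern, and [^"]* stops the match at the closing quote: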

grep -o 'http://[^"]*' yourfile
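Since the question asks specifically about sed, here is a minimal sketch of a sed equivalent. It is not part of the original answer. It assumes each line holds at most one URL and that every URL sits inside href="..." as in the sample. The function name extract_urls and the temporary sample file are made up for illustration.

```shell
# Build a small sample file mirroring the two lines shown in the question.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<div><something>something here<href="http://www.google.com/">something here</font>
<font><href="http://www.stackoverflow.com/">something</td>
EOF

# Hypothetical helper: print the text between href=" and the next double quote.
# -n suppresses default output; the trailing p prints only lines where the
# substitution matched, so lines without an href are skipped.
extract_urls() {
    sed -n 's/.*href="\([^"]*\)".*/\1/p' "$1"
}

extract_urls "$sample"
```

On the sample above this should print the same two URLs as the grep command. Because the leading .* is greedy, a line containing several href attributes would yield only the last one, so grep -o is likely the better fit when a line can hold more than one URL.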
User contributions licensed under: CC BY-SA