Sorry, the title is not very clear. Let's say I'm grepping recursively for URLs like this:
grep -ERo '(http|https)://[^/"]+' /folder
and in that folder several files contain the same URL. My goal is to output this URL only once. I tried piping the grep output to | uniq or sort -u, but that doesn't help.
Example result:
/www/tmpl/button.tpl.php:http://www.w3.org
/www/tmpl/header.tpl.php:http://www.w3.org
/www/tmpl/main.tpl.php:http://www.w3.org
/www/tmpl/master.tpl.php:http://www.w3.org
/www/tmpl/progress.tpl.php:http://www.w3.org
Answer
If the structure of the output is always:
/some/path/to/file.php:http://www.someurl.org
you can use the command cut to strip the file-name prefix (that prefix is what makes otherwise identical matches unique, which is why uniq and sort -u had no effect on the raw output):
cut -d ':' -f 2-
should work. It cuts each line into fields separated by a delimiter (here ':') and selects the 2nd and all following fields (-f 2-). Selecting 2- rather than just 2 matters because the URL itself contains a colon: cut rejoins the selected fields with the delimiter, so http://... comes back out intact.
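For instance, applied to one of the lines from the question (a quick sketch, using echo just to feed in a sample line):
echo '/www/tmpl/button.tpl.php:http://www.w3.org' | cut -d ':' -f 2-
which prints http://www.w3.org.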
After that, you can filter out the duplicates with sort -u (or sort | uniq; note that uniq alone only removes adjacent duplicate lines, which is why it needs sorted input).
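Putting it all together, the whole pipeline would look something like this (same grep as in the question, with sort -u doing the deduplication):
grep -ERo '(http|https)://[^/"]+' /folder | cut -d ':' -f 2- | sort -u
As a side note, if your grep supports the -h option you can suppress the file-name prefix at the source instead, which makes the cut step unnecessary; the cut approach above works with the output format exactly as given.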