How can I use wget to extract the marked links from this page?
Can this be done with curl?
I want to download the URLs from this page and save them in a file.
This is what I tried:
wget -r -p -k https://polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2984/585ddf5a3dde69cb58c7f42ba52790a4
The Link Gopher extension did extract the addresses.
Edit:
How can I save the addresses to a file from the terminal?
Can it be done with wget or curl?
I want to extract the links from this page and save them to a file.
Answer
You will need to use something like the Firefox extension "Download Serialized DOM".
I added it to my Firefox browser and it works, although it is a bit slow. The only sign that it has finished is that the *.html.part file disappears, leaving the corresponding *.html file, which you save using the add-on's button.
Basically, that will save the complete web page (excluding binaries, i.e. images, videos, etc.) as a single text file.
Also, the developer notes a bug that occurs only while saving these files; to work around it you MUST allow “Use in private mode” for the extension.
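Before parsing the saved file, it can help to confirm that it really contains the rendered links. The following is a minimal sketch, not part of the extension: the function name `count_anchors` is hypothetical, and `grep -o` is assumed to be available (it is in GNU and BSD grep, but it is not required by POSIX).

```shell
#!/bin/sh
# count_anchors FILE: print how many "<a " opening tags FILE contains.
# grep -o prints each match on its own line, so wc -l counts matches
# even when the whole page is stored as one very long line.
count_anchors() {
    grep -o '<a ' "$1" | wc -l | tr -d ' '
}

# Example (assumes the serialized page was saved under this name):
# count_anchors test_77_Serialized.html
```

A count of zero, or a count much lower than the number of episodes visible in the browser, suggests the file holds the unrendered page rather than the populated DOM.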
Here is a fragment of the full season 44 index page displayed (note the address in the address bar):
I can’t reproduce your exact page because I don’t have login access: the service hides the page of the individual video (what you get when you click on a picture) and gives me the index instead, even though the address bar still shows the episode address (their security processes at work). The index page itself should probably show something different after the “…/sezon-44/5027472/” part.
Using that saved DOM file as input, the following will extract the necessary references:
    #!/bin/sh

    ###
    ### LOGIC FLOW => CONFIRMED VALID
    ###

    DBG=1

    #URL="https://polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2984/585ddf5a3dde69cb58c7f42ba52790a4"

    ###
    ### Completely expanded and populated DOM file,
    ### as captured by Firefox extension "Download Serialized DOM"
    ###
    ### Extension is slow but apparently very functional.
    ###
    INPUT="test_77_Serialized.html"

    BASE=$(basename "$0" ".sh")
    TMP="${BASE}.tmp"
    HARVESTED="${BASE}.harvest"
    DISTILLED="${BASE}.urls"

    #if [ ! -s "${TMP}" ]
    #then
    #    ### Non-serialized
    #    wget -O "${TMP}" "${URL}"
    #fi

    ### Each 'more' step is to allow review of outputs to identify patterns which are to be used for the next step.
    ### NOTE: '\n' in the sed replacement text relies on GNU sed.

    cp -p "${INPUT}" "${TMP}"
    test ${DBG} -eq 1 && more "${TMP}"

    ### Put every anchor on its own line, then derive the host name from the 'tiba=' value.
    sed 's+<a +\n<a +g' "${TMP}" >"${TMP}.2"
    URL_BASE=$( grep 'tiba=' "${TMP}.2" |
        sed 's+tiba=+\ntiba=+' |
        grep -v 'viewport' |
        cut -f1 -d';' |
        cut -f2 -d'=' |
        cut -f1 -d'%' )
    printf '\n=======================\n%s\n=======================\n\n' "${URL_BASE}"

    ### Keep only the lines that start an anchor.
    sed 's+<a +\n<a +g' "${TMP}" | grep '<a ' >"${TMP}.2"
    test ${DBG} -eq 1 && more "${TMP}.2"

    ### Keep only the anchors for episodes.
    grep 'title="Pierwsza Miłość - Odcinek' "${TMP}.2" >"${TMP}.3"
    test ${DBG} -eq 1 && more "${TMP}.3"

    ### FORMAT: Typical entry identified for video files
    #<a data-testing="list.item.0" title="Pierwsza Miłość - Odcinek 2984" href="/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2984/585ddf5a3dde69cb58c7f42ba52790a4" class="ifodj3-0 yIiYl"></a><div class="sc-1vdpbg2-2 hKhMfx"><img data-src="https://ipla.pluscdn.pl/p/vm2images/9x/9xzfengehrm1rm8ukf7cvzvypv175iin.jpg" alt="Pierwsza Miłość - Odcinek 2984" class="rebvib-0 hIBTLi" src="https://ipla.pluscdn.pl/p/vm2images/9x/9xzfengehrm1rm8ukf7cvzvypv175iin.jpg"></div><div class="sc-1i4o84g-2 iDuLtn"><div class="orrg5d-0 gBnmbk"><span class="orrg5d-1 AjaSg">Odcinek 2984</span></div></div></div></div><div class="sc-1vdpbg2-1 bBDzBS"><div class="sc-1vdpbg2-0 hWnUTt"><

    ### Isolate the href attribute of each episode anchor on its own line.
    sed 's+href=+\nhref=+' "${TMP}.3" |
        sed 's+class=+\nclass=+' |
        grep '^href=' >"${TMP}.4"
    test ${DBG} -eq 1 && more "${TMP}.4"

    ### Strip href=" and the closing quote, then prefix the host to build the full URL.
    awk -v base="${URL_BASE}" -v splitter='"' '{
        pos = index( $0, "href=" ) ;
        if( pos != 0 ){
            rem = substr( $0, pos+6 ) ;
            n = split( rem, var, splitter ) ;
            printf("https://%s%s\n", base, var[1] ) ;
        } ;
    }' "${TMP}.4" >"${TMP}.5"
    more "${TMP}.5"

    exit
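For comparison, the same extraction can be sketched as a single pipeline. This is an alternative under stated assumptions, not the script above: the function name `extract_episode_urls` is hypothetical, the host is passed in by hand instead of being derived from the `tiba=` value, episode links are assumed to be `href` values that start with `/wideo/` and contain `-odcinek-`, and `grep -o` is assumed to be available.

```shell
#!/bin/sh
# extract_episode_urls FILE HOST: print one absolute URL per episode link.
# Assumption: episode links look like href="/wideo/...-odcinek-NNNN/HASH".
extract_episode_urls() {
    file=$1
    host=$2
    # 1. grep -o emits each matching href="..." attribute on its own line.
    # 2. sed strips the href=" prefix and the closing quote, then prepends the host.
    # 3. sort -u removes duplicate links.
    grep -o 'href="/wideo/[^"]*-odcinek-[^"]*"' "$file" |
        sed -e 's/^href="//' -e 's/"$//' -e "s|^|https://${host}|" |
        sort -u
}

# Example (file name taken from the script above):
# extract_episode_urls test_77_Serialized.html polsatboxgo.pl > episodes.txt
```

Matching on the `href` pattern instead of the `title` attribute avoids depending on the series name, at the cost of also matching any other link on the page that happens to contain `-odcinek-`.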
That will produce a report in ${TMP}.5 like this:
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2984/585ddf5a3dde69cb58c7f42ba52790a4
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2985/e15e664718ef6c0dba471d59c4a1928a
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2986/58edb8e0f06dc3da40c255e50b3839cf
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2987/2ebc2e7b13268e74d90cc64c898530ee
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2988/2031529377d3be27402f61f07c1cd4f4
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2989/eaceb96a0368da10fb64e1383f93f513
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2990/4974094499083a8d67158d51c5df2fcb
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2991/4c79d87656dcafcccd4dfd9349ca7c23
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2992/26b4d8808ef4851640b9a2dfa8499a6d
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2993/930aaa5b2b3d52e2367dd4f533728020
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2994/fa78c186bc9414f844f197fd2d673da3
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2995/c059c7b2b54c3c25996c02992228e46b
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2996/4a016aeed0ee5b7ed5ae1c6117347e6a
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2997/1e3dca41d84471d5d95579afee66c6cf
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2998/440d069159114621939d1627eda37aec
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-2999/f54381d4b61f76bb83f072059c15ea84
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3000/b272901a616147cd9f570750aa450f99
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3001/3aca6bd8e81962dc4a45fcc586cdcc7f
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3002/c6500c6e261bd5d65d0bd3a57cd36288
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3003/35a13bc5e5570ed223c5a0221a8d13f3
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3004/a5cfb71ed30e704730b8891323ff7d92
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3005/d86c1308029d78a6b7090503f8bab88e
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3006/54bba327bc7a1ae7b9b609e7ee11c07c
    https://Polsatboxgo.pl/wideo/seriale/pierwsza-milosc/5027238/sezon-44/5027472/pierwsza-milosc-odcinek-3007/17d199a0523df8430bcb1f21d4a5b573
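Since the goal in the question is to end up with the links in a file, the report can be copied into a clean list. The following is a small sketch under assumptions: the function name `save_url_list` and the output name `episodes.txt` are hypothetical, and the report name depends on what the script file is called (saved as `harvest.sh`, a made-up name, it would be `harvest.tmp.5`).

```shell
#!/bin/sh
# save_url_list REPORT OUTFILE: write the unique https:// lines of REPORT
# to OUTFILE and print how many URLs were written.
save_url_list() {
    # Keep only lines that are URLs, drop duplicates, and store the result.
    grep '^https://' "$1" | sort -u > "$2"
    # Report the number of saved URLs (tr strips padding that some wc versions add).
    wc -l < "$2" | tr -d ' '
}

# Example (assumes the script above was saved as harvest.sh):
# save_url_list harvest.tmp.5 episodes.txt
```

For the report shown above, the printed count should match the number of episodes listed on the index page.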
NOTE: In the image below, the icon between the “folder” and the “star”, in the address bar of that image, is the button for the Download Serialized DOM extension to capture the currently displayed page as a fully-instantiated DOM file.