
Tool for Downloading Website Pages?


Thread replies: 36
Thread images: 6

File: baltic states.jpg (96KB, 471x912px)
Hi,

I was wondering if anybody knew of a tool for downloading website pages and even entire websites. I want to do this so I can save exercise instructions, diet tips, etc., and print them out in a neat format so I can take my final steps away from the Internet. I already know that you can save a website with Firefox or Chrome, but I wonder if there's something that can conveniently convert the website into a neat PDF or .docx without taking too much space.

I know that this could get complicated really quickly if you're dealing with a large website or a website with elaborate presentation, media, or videos, so I won't mind if there's something only for simple websites. Of course, if there's something that can "compromise" on more complex websites without losing too much functionality, then I'd like to see that too.

Thanks!
>>
It's called a web browser. You can save the complete page, or a complete screenshot, or "Print to PDF".
>>
wew lad this entire post
>>
wget
>>
>>61204864
Right click on webpage
click "Print"
>>
>>61204882
>It's called a web browser. You can save the complete page
I mentioned that in my post.
>I already know that you can save a website with Firefox or Chrome,
...

>or a complete screenshot, or "Print to PDF".
Doesn't it only save as an .html file with an additional folder for all of the page's assets? If it could easily be saved by .pdf, I'd like to know how.
>>
>>61204930
The pdfs created by "printing" are ugly as fuck and often waste tons of paper printing random bullshit and formatting problems. Surely somebody has found a better way?
>>
>>61204946
Just hit print and then print to pdf
>>
File: print.png (314KB, 1809x862px)
print preview looks good for me

although there is another way to get 100% similar pages for printing
>>
>>61205010
Looks kinda shitty to me.
>>
>>61205133
If you want the exact thing you see on your screen, then follow these steps.

1. Open page
2. Press PrtSc button
3. Open Paint
4. Press Ctrl + V
5. Save the image

Now you have a screenshot of your entire screen.

6. Scroll down to view the area that wasn't screenshotted
7. Go to step 2

After that, you print out the images in full color. Your pages will be sideways, but will look non-shitty.
>>
https://www.httrack.com/page/2/en/index.html

np
>>
>>61205304
Thanks. This would be the gung-ho solution that's better than the normal browser saving method. Would still like a better PDF saving method than the one provided by printing the page.
>>
>>61205273
What if I just want the text cleanly formatted without too much hassle?
>>
Does OP's pic represent the Baltic countries' flags? Pretty neat if you ask me.
>>
>>61207096
Yup. I love the Baltic States.
>>
>some bs excuse
>implying he isn't just a very lazy dude trying to take someone else's work and sell it
*giggles*
>>
File: a.png (37KB, 595x502px)
just print from your browser, to a pdf writer
your browser will sensibly format the page for printing, like remove backgrounds and adjust it to fit on the specified page size
some pages also have a button to view a version of the page specifically for printing, often seen on blogs or cooking sites and the like, so keep an eye out for those
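
if you'd rather not click through the print dialog every time, the same "print to a pdf writer" step can be scripted. here's a minimal sketch, assuming a Chromium-based browser is installed and on PATH as `chromium` (that binary name is an assumption, and this swaps in headless Chromium rather than the browser used for the screenshots in this thread). the helper names are made up for illustration

```python
import shutil
import subprocess


def build_print_command(url, out_pdf, browser="chromium"):
    """Build the argv for a headless 'print to PDF' run.

    Assumes a Chromium-based browser; the default binary name is a guess
    and may be "chromium-browser" or "google-chrome" on your system.
    """
    return [browser, "--headless", f"--print-to-pdf={out_pdf}", url]


def print_to_pdf(url, out_pdf, browser="chromium"):
    """Run the browser headlessly and write the page to out_pdf."""
    if shutil.which(browser) is None:
        raise FileNotFoundError(f"{browser!r} not found on PATH")
    subprocess.run(build_print_command(url, out_pdf, browser), check=True)


if __name__ == "__main__":
    # Hypothetical URL and output name, for illustration only.
    print(build_print_command("https://example.com/stretches", "stretches.pdf"))
```

the output should be roughly what the browser's own print preview gives you, so anything the page hides or rearranges for printing still applies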
>>
File: a.png (944KB, 2480x3507px)
>>61207515
example output from palemoon, of this page
pdf is 243K, and is pretty much what you'd want out of a printed version, background colours removed, formatted to suit the page, etc
>>
>>61207624
>for printing
You could easily print the text in one page, maybe 1.5 pages, if you removed most of the formatting except for the separation in text. Pale Moon is one of the nicer browsers that I've seen though.
>>
>>61204864
scrapbook
>>
>>61207673
What the hell is scrapbook? You mean like that shit people did in middle school?
>>
File: a.png (691KB, 2480x3507px)
>>61207657
yea, though that gets into per-page territory, there's plenty of things that can be done, if you're willing to handle things on more of a page-to-page basis

here's an example with 85% scaling, and all the stuff at the top removed using Inspect Element, palemoon/firefox prints how the page is currently being rendered
>>
>>61207799
Not bad. I wish I wouldn't have to inspect element every time, might get tedious. How quickly were you able to do it?
>>
>>61207799
The thing is, with a website like 4chan, I would probably download the whole thing using HTTrack because the images are really important sometimes. But for a website that gives advice on stretches, which is mostly text, it would be nice to have a tool that cuts the fat from the crucial information without having to do heavy user-side editing.
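
For the mostly-text case, something like the following might be close to "cutting the fat". It is only a rough sketch using Python's standard `html.parser`: it drops the contents of tags that usually hold page chrome and keeps the remaining text with one line per block element. The tag lists and the names `TextExtractor` and `html_to_text` are my own assumptions, not an existing tool, and a real site will probably need the lists adjusted.

```python
from html.parser import HTMLParser

# Tags whose contents are usually page chrome, not the article itself (assumed list).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}
# Tags that should start a new line in the plain-text output (assumed list).
BLOCK_TAGS = {"p", "div", "li", "br", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside any SKIP_TAGS element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<nav>Home</nav><h1>Stretches</h1><p>Hold for  30 seconds.</p>"
    print(html_to_text(sample))
```

The plain text can then be pasted into a word processor or printed directly, which avoids the per-page Inspect Element work at the cost of losing images and layout.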
>>
>>61207991
pretty quick
you can also either;
a. make ublock rules, which you can then turn on and off
b. instead of removing bits one at a time, identify things to keep and just delete the rest
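
option b can be sketched outside the browser too. this is not ublock, just a python stand-in for the "keep one thing, delete the rest" idea: it keeps the text inside the element with a given id and ignores everything else. the class and function names are made up, and it assumes reasonably well-nested markup (unclosed tags will throw the depth count off)

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class KeepOnly(HTMLParser):
    def __init__(self, keep_id):
        super().__init__()
        self.keep_id = keep_id
        self.depth = 0  # nesting depth inside the kept element, 0 = outside
        self.kept = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.keep_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.kept.append(" ".join(data.split()))


def keep_only(html, keep_id):
    """Return the text found inside the element whose id equals keep_id."""
    parser = KeepOnly(keep_id)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.kept)


if __name__ == "__main__":
    page = '<div id="ads">Buy now</div><div id="main"><p>Hold 30s.</p></div>'
    print(keep_only(page, "main"))
```

you still have to find the right id once per site with Inspect Element, but after that it's reusable for every page with the same layout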

>>61208009
yea, the best solution depends on a number of things, like whether you're (actually) printing or not, whether you need embedded or linked media or not, which parts of the page are actually important, and so on
something like httrack set to grab immediately linked information (1 depth) would probably be a minimum for a "true" archival of a single page, and that copy can be then printed or converted as if you had the real page open (important for if the real page goes away)
>>
>>61204864
One of those pictures was taken by me.
To Hell the faggot who posted it on 9gag.
>>
>>61208523
>>61208009
here's an example of httrack, so op has an idea of what it does;
https://ipfs.io/ipfs/QmYAzU4iTxksPDNnQaCUKB6kgKjZEB3yTMoKJSvEcKyMhk

i made this minimal mirror just now using
httrack --mirror --depth=2 --ext-depth=1 >>61204864

it took a copy of the page, everything on it, and just the first thing from whatever was linked from it (which includes full images, but also includes the index of every other board, due to the fact that there's links to them on this page)
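
to get a feel for why even a shallow mirror pulls in so much, here's a rough python sketch (not how httrack itself works internally, just the general idea) that lists everything one hop away from a page: links, images, stylesheets and scripts. the names `LinkCollector` and `links_one_hop` are made up for the example

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Which attribute carries the URL for each tag we care about.
URL_ATTRS = {"a": "href", "img": "src", "link": "href", "script": "src"}


class LinkCollector(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # absolute URLs, in document order, no duplicates

    def handle_starttag(self, tag, attrs):
        wanted = URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative URLs and drop any "#fragment" part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute != self.base_url and absolute not in self.links:
                    self.links.append(absolute)


def links_one_hop(html, base_url):
    """Return every URL referenced directly by the page at base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    page = '<a href="/g/">g</a><img src="pic.png"><a href="#top">top</a>'
    for url in links_one_hop(page, "https://example.org/thread/1"):
        print(url)
```

on a page like this one, the board index links alone add dozens of entries to that list, which is where the extra index pages in the mirror come from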
>>
>>61206258
Write a script
>>
File: a.webm (307KB, 1366x788px)
>>61206258
sometimes the best way is the simplest way
copy+paste the stuff you want into a word processor
>>
>>61208600
Proof? Post the original you faggot. And post the exif.
>>
>>61208741
And he was never heard from again.
>>
>>61207146
It's Baltic republics or nations; "states" is what they were called when they were Soviet-occupied USSR states.
>>
>>61209511
I hope you weren't triggered too much.
>>
>>61205304
This. OP, I used this for years when I had no internet and had to hemorrhage GBs from friends' wifi. Worked flawlessly.
>>
>>61209978
I'm just shitposting, I don't really give a fuck


This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.
If you need information for a Poster - contact them.