What's the best way to scrape part of a website for offline

Thread replies: 12
Thread images: 2

Anonymous 2017-06-09 05:23:33 Post No. 60818943

File: 1460609152734.jpg (177KB, 600x740px)

What's the best way to scrape part of a website for offline reading? I'm on GNU+Linux so preferably something that isn't too complicated for a relative noob. I want to have all of Warosu's /g/ saved on my computer.

Anonymous 2017-06-09 05:25:58 Post No.60818967

here comes the plane, open your mouth~
wget >>60818943
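
A rough sketch of what the wget suggestion could look like when spelled out, written as a small Python wrapper so the flags are easy to read and tweak. The flags themselves are standard wget options; the function name, the `warosu-g` output directory and the one-second wait are assumptions made for illustration, and nobody in this thread confirmed that a full mirror of the archive is feasible.

```python
import subprocess


def wget_mirror_command(url, dest="warosu-g", wait_seconds=1):
    """Build a wget argument list that mirrors everything below `url`.

    `dest` and `wait_seconds` are illustrative defaults, not values
    taken from the thread.
    """
    return [
        "wget",
        "--recursive",          # follow links found in downloaded pages
        "--level=inf",          # no depth limit
        "--no-parent",          # never climb above the starting directory
        "--page-requisites",    # also fetch images/CSS needed to render pages
        "--convert-links",      # rewrite links so they work offline
        "--adjust-extension",   # save HTML pages with an .html suffix
        f"--wait={wait_seconds}",        # pause between requests
        f"--directory-prefix={dest}",    # where to put the mirror
        url,
    ]


def run_mirror(url):
    """Run the mirror; raises CalledProcessError if wget exits non-zero."""
    subprocess.run(wget_mirror_command(url), check=True)
```

Calling `run_mirror("https://warosu.org/g/")` would start the download. By default wget stays on the starting host, so images served from a different host would probably need `--span-hosts` plus a `--domains` list.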

Anonymous 2017-06-09 05:27:58 Post No.60818986

Some kid actually turned this in for a grade. The fact that we have become this outrageously stupid as a species makes me furious.

Anonymous 2017-06-09 05:28:20 Post No.60818990

>>60818943

PHP: file_get_contents(), $dom = new DOMDocument(); $xpath = new DOMXPath($dom);

easy as shit

Anonymous 2017-06-09 05:29:15 Post No.60818997

>>60818986

Some kid actually thought these pictures are real. The fact that we have become this outrageously stupid as a species makes me furious.

Anonymous 2017-06-09 05:30:37 Post No.60819006

>>60818997
Even if you're only pretending to be retarded, you're still being retarded.

Anonymous 2017-06-09 05:36:41 Post No.60819072

>>60818990
How do I translate this into a command? I want this (https://warosu.org/g/) with every thumbnail, image, post, thread, etc. saved on my hard drive. I'm an Ubuntu user so I'm not too familiar with the CLI lingo.

Anonymous 2017-06-09 05:36:52 Post No.60819076

>>60819006
He's right though. These pictures are old as hell and fake as fuck. They're still funny though.

Anonymous 2017-06-09 05:54:45 Post No.60819221

>>60818943
>>60819072
>I want to have all of Warosu's /g/ saved on my computer.
Not happening. Don't bother.

Anonymous 2017-06-09 06:04:47 Post No.60819318

>>60819221
Too large? I have a lot of storage space. I wonder if I should just find some neet on /r9k/ to pay and have him manually save every thread one by one.

Anonymous 2017-06-09 06:28:59 Post No.60819524

>>60819072
build a crawler with Python
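
For anyone wondering what "build a crawler" might mean here, a minimal sketch using only the standard library: a breadth-first crawl that stays under one URL prefix. The names `LinkParser`, `fetch` and `crawl`, the User-Agent string, and the `limit` and `delay` defaults are all made up for illustration. It has no error handling or retries and does not check robots.txt, so treat it as a starting point rather than a finished tool.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class LinkParser(HTMLParser):
    """Collect the href values of every tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def fetch(url):
    """Download one URL and return its body as bytes (raises on HTTP errors)."""
    req = Request(url, headers={"User-Agent": "offline-reader-sketch"})
    with urlopen(req, timeout=30) as resp:
        return resp.read()


def crawl(start, prefix, fetch=fetch, limit=100, delay=1.0):
    """Breadth-first crawl of pages whose URL starts with `prefix`.

    Returns a dict mapping each visited URL to its raw bytes.
    `fetch` is injectable so the crawl can be tried without a network.
    """
    seen, queue, pages = {start}, deque([start]), {}
    while queue and len(pages) < limit:
        url = queue.popleft()
        body = fetch(url)
        pages[url] = body
        parser = LinkParser()
        parser.feed(body.decode("utf-8", errors="replace"))
        for link in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            absolute = urldefrag(urljoin(url, link)).url
            if absolute.startswith(prefix) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
        time.sleep(delay)  # be gentle with the server
    return pages
```

Something like `crawl("https://warosu.org/g/", "https://warosu.org/g/", limit=20)` would fetch at most 20 pages under /g/. Raising `limit` to cover a whole board is where the "not happening" concern above comes in.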

Anonymous 2017-06-09 09:15:28 Post No.60820922

File: 1496982985498.jpg (35KB, 640x480px)

>>60819072
can you write js? python? ruby?
1. grab the pages
2. parse it
3. ???
4. profit!
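
The "parse it" step, plus writing the result to disk, could be sketched in Python roughly as follows. It assumes pages are ordinary HTML with `<img src=...>` tags; `ImageParser`, `image_urls`, `local_path`, `save` and the `mirror` directory are hypothetical names chosen for this example, and the actual markup of the archive was not checked.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlsplit


class ImageParser(HTMLParser):
    """Collect absolute URLs of every <img> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(urljoin(self.base_url, src))


def image_urls(html, base_url):
    """Return the image URLs referenced by `html`, resolved against `base_url`."""
    parser = ImageParser(base_url)
    parser.feed(html)
    return parser.images


def local_path(url, root="mirror"):
    """Map a URL to a file path under `root` (query strings are ignored)."""
    parts = urlsplit(url)
    path = parts.path
    if not path or path.endswith("/"):
        path += "index.html"  # directory-style URLs get an index file
    return Path(root) / parts.netloc / path.lstrip("/")


def save(url, body, root="mirror"):
    """Write the downloaded bytes for `url` to its local path and return it."""
    target = local_path(url, root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(body)
    return target
```

Each URL returned by `image_urls` would then be downloaded and passed to `save`, the same way as the page itself.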

I'm aware that Imgur.com will stop allowing adult images from the 15th of May. I'm taking action to back up as much data as possible. Read more on this topic here - https://archived.moe/talk/thread/1694/