So i am writing a script in python to scrape all the images from

Thread replies: 11
Thread images: 2

File: 1477929306760.png (473KB, 764x800px)
So I am writing a script in Python to scrape all the images from an online directory. However, the directory is protected by Cloudflare. I used cfscrape to fetch the page source so I could grab all the image URLs, but when I try to use urlretrieve to actually save the files, it raises a 403 Forbidden URLError. I have also tried using the URLopener class to change the user agent, but that just spits out a different error which, once researched, points back to the same 403. Anyone have experience getting around Cloudflare to actually SAVE files?
>>
File: 1477928741437.png (518KB, 806x798px)
bump, any help appreciated
>>
>>57327047
Please don't start a sentence with the word "so".

Have you tried putting a sleep in between download access attempts?
>>
>>57327211
Duly noted lol. And yes I have, 2 seconds at a time. I'm about to try 6 seconds, as Cloudflare generally delays access by about 5.
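
The sleep-between-attempts idea can be sketched roughly as below. This is only an illustration in Python 3: `fetch_all` is a made-up helper name, `fetch` stands for whatever download function is in use, and the 6-second default simply mirrors the delay mentioned above. Nothing here is known to satisfy Cloudflare on its own.

```python
import time


def fetch_all(urls, fetch, delay_seconds=6.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing delay_seconds between calls.

    Returns a dict mapping each URL to its result, or to the exception
    raised for it, so one failed download does not stop the rest.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause between attempts, not before the first
        try:
            results[url] = fetch(url)
        except OSError as error:
            # In Python 3, urllib's URLError and HTTPError derive from OSError.
            results[url] = error
    return results
```

Passing `sleep` as a parameter keeps the loop easy to try out without actually waiting; in normal use the default `time.sleep` is left alone.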
>>
>>57327211
>Please don't start a sentence with the word "so".
Not him but what's wrong with that?

t. non native english speaker
>>
>>57327211
"So" is perfectly acceptable in general discourse. Nobody's writing a formal essay/letter/research paper here.

I bet you think it's still wrong to start a sentence with "because"?
>>
>>57327317
Except in certain cases, it doesn't mean anything. If you remove it from OP's first sentence it doesn't change the sentence at all. It's a terrible verbal tic that everyone has started using.

Note that there are circumstances in which you can use it, like when it means "Therefore" or "Very much".

>"I'm out tomorrow. So I won't be able to answer the door."
>"So happy to be engaged to my girlfriend!"

These are acceptable in informal speech.
>>
>>57327317
OP here. Usually the word "so" indicates an event or outcome based on some specified cause that preceded it in the sentence. Starting a sentence with "so" provides no information. It's pretty commonly used that way though, incorrect or not.
>>
>>57327292
Please don't finish a sentence with the word "lol".
>>
>>57327211
>>57327292
A six-second sleep still returns 403 Forbidden. I found a different host for what I'm trying to scrape, so my actual problem is solved, but I would like to figure this out in case it comes up again.
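
One possible cause of the 403, offered as a guess rather than a confirmed diagnosis: urlretrieve opens a fresh connection that carries neither the cookies nor the User-Agent that cfscrape used when it got through the check. A minimal sketch of sending both along with the download, written for Python 3's urllib.request (the original script may well be Python 2). `build_request` and `save_file` are made-up helper names, and the cookie dict and User-Agent string are assumed to come from the session that already passed.

```python
import shutil
import urllib.request


def build_request(url, cookies, user_agent):
    """Build a request that carries the given cookies and User-Agent."""
    cookie_header = "; ".join(
        f"{name}={value}" for name, value in cookies.items()
    )
    return urllib.request.Request(
        url,
        headers={"Cookie": cookie_header, "User-Agent": user_agent},
    )


def save_file(url, dest_path, cookies, user_agent, timeout=30):
    """Download url to dest_path, sending the same cookies and User-Agent."""
    request = build_request(url, cookies, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)  # stream to disk in chunks
```

The cfscrape README describes `cfscrape.get_tokens(url)` as returning a cookie dict plus the User-Agent string, which would supply the `cookies` and `user_agent` arguments here; check that against the installed version before relying on it. Since the object from `cfscrape.create_scraper()` is a requests session, fetching each image with `scraper.get(image_url)` and writing `response.content` to a file is probably the shorter route.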
>>
>>57327455
Cloudflare works by IP, doesn't it? Maybe spoof the IP between requests.
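
A source address can't really be spoofed on a TCP connection, since the replies would never come back, so "changing the IP" in practice means routing requests through different proxies. A rough sketch of rotating through a list of them with urllib.request follows. The helper names and any proxy URLs are placeholders, and there is no guarantee this helps: as far as I know, a clearance cookie is tied to the address that earned it, so each proxy would likely have to pass the check itself.

```python
import itertools
import urllib.request


def make_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS traffic through proxy_url."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


def rotating_openers(proxy_urls):
    """Yield (proxy_url, opener) pairs forever, cycling through proxy_urls."""
    for proxy_url in itertools.cycle(proxy_urls):
        yield proxy_url, make_opener(proxy_url)
```

Each request would then take the next pair from the generator and call `opener.open(url)` instead of `urllib.request.urlopen(url)`.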


This is a 4chan archive - all of the content originated from that site.