I know this ain't a Python board, but I figured you guys should be able to solve this as well.
I am trying to write a script that downloads all PDF files from a given website for me. Searching on Google, I found this:
(it's a pastebin url) /ABxnkqA7
It seemed to work for a lot of people, but the only thing I am getting are errors.
Can anyone help with this? I am kinda new to programming..
Just use wget, curl and a bit of regexp.
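Rough idea of the regexp part, sketched in Python instead of shell since that's what OP is already using. The helper name find_pdf_links, the pattern and the example.com URLs are all made up for illustration, and a regex like this will miss oddly written HTML, so treat it as a starting point only:

```python
import re

try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

# Matches href="..." or href='...' values that end in .pdf (case-insensitive).
PDF_HREF = re.compile(r'href=["\']([^"\']+\.pdf)["\']', re.IGNORECASE)


def find_pdf_links(html, base_url):
    """Return absolute URLs for every .pdf link found in the HTML text."""
    return [urljoin(base_url, href) for href in PDF_HREF.findall(html)]


if __name__ == "__main__":
    page = '<a href="docs/a.pdf">a</a> <a href="/b.PDF">b</a> <a href="c.html">c</a>'
    for link in find_pdf_links(page, "http://example.com/x/"):
        print(link)
```

Each URL it returns could then be handed to wget or curl for the actual download.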
>>57947081
Thanks a lot, will try.
But it would still be nice to know what's wrong with the code tho.
>>57946887
>but the only thing I am getting are errors.
You haven't shown any errors, so as far as I can tell they don't exist. Just try again.
>>57947134
I tried several times with different URLs and different directories. Every single time it tells me that basename() takes exactly 1 argument (2 given).
>>57947167
well... Put the whole pastebin link here and I will look at it. "(it's a pastebin url) /ABxnkqA7" is not a real, functioning address
>>57947167
Show the proper error you idiot, what line, what are you trying to run?
>>57947188
I can't cause 4chan won't let me post it. If you put the "/ABxnkqA7" part after the basic pastebin URL, it should work.
>>57947200
The whole script is in the pastebin url. My error is basically this:
[*] Downloading: fulltext.pdf
Traceback (most recent call last):
File "ALL_PDF_2.py", line 35, in <module>
f = open(download_path + "\\" + os.path.basename(tag['href'], "wb"))
TypeError: basename() takes exactly 1 argument (2 given)
I executed the script with: python ALL_PDF_2.py and put the URL and the directory in through the raw input contained in the script.
>python
lmfao
>>57947188
http://pastebin.com/ABxnkqA7
>I can't cause 4chan won't let me post it.
Is this something new? Have seen tons of pastebins here.
>>57947272
Looks like you have the wrong version of Python. Are you using Python 3? Because this is Python 2 code. (Easiest way to see this is by looking at the print statements, they do not have parentheses)
the os.path.basename() might take 2 arguments in python 2. In any case you have the wrong version.
Install python 2 and if you have both python 3 and python 2 installed then run python 2 with
$ py -2 -m ALL_PDF_2
(I don't think -m is strictly necessary because it does not have any relative imports, but often good practice)
>>57947378
I am using only Python 2.7 so this can't really be the problem. Maybe the script is from an older version of Python 2? Could that be it? If so, how would I get an older version of Python without fucking up the most recent one?
And somehow it only wouldn't let me post the pastebin link in the opening post, a reply containing the URL seems fine.
install gentoo
>>57946887
>Can anyone help with this? I am kinda new to programming..
no youre not
if you cant read line 35
> f = open(download_path + "\\" + os.path.basename(tag['href'], "wb"))
and the doc page for python 2 on open() and os.path.basename()
and realize its a parenthesis that is out of place..
whats that "wb" hummm?
you should have just asked this to whoever gave you the link, so they could fix the code
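For anyone finding this later, here is a minimal sketch of what the fix amounts to. The closing parenthesis on line 35 ends up after "wb", so "wb" gets passed to basename() instead of open(). The helper name save_pdf, the temp directory and the example.com URL are made up for the demo, and os.path.join is swapped in for the hard-coded "\\" so it isn't Windows-only:

```python
import os
import tempfile

# Broken line from the script ("wb" lands inside basename()):
#   f = open(download_path + "\\" + os.path.basename(tag['href'], "wb"))
# Moving one parenthesis makes "wb" the mode argument of open():
#   f = open(download_path + "\\" + os.path.basename(tag['href']), "wb")


def save_pdf(download_path, href, data):
    """Write data to download_path using the file name taken from href."""
    filename = os.path.basename(href)               # exactly one argument
    target = os.path.join(download_path, filename)  # portable path separator
    with open(target, "wb") as f:                   # "wb" belongs to open()
        f.write(data)
    return target


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    saved = save_pdf(demo_dir, "http://example.com/papers/fulltext.pdf", b"%PDF-1.4 demo")
    print(saved)
```

The with block also closes the file for you, which the one-liner in the script does not do on its own.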
>>57947523
So why the hell did it work for other people?
>>57947523
>gentoo
I am using windows 8 btw.
>>57947523
>>57947378
Anyways, thanks a lot guys I fixed it now.