How to write software that scrapes music off bandcamp and downloads it

Thread replies: 22
Thread images: 2

Anonymous 2017-07-28 04:40:51 Post No. 61608765

File: 1500240574593.jpg (1MB, 1840x3264px)

How to write software that scrapes music off bandcamp and downloads it?

I don't wanna scrape anything off bandcamp, I just want to learn how these things work.

Anonymous 2017-07-28 04:45:24 Post No.61608820

>>61608765
learn 2 regex

if you want example code look at flexget

Anonymous 2017-07-28 07:24:01 Post No.61611018

Look at the Bandcamp page source. Find where the link to the music file is and wget that file.
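A rough sketch of that idea in Python, standard library only. Nothing in this thread establishes how Bandcamp actually exposes its files, so this assumes a hypothetical page where the audio shows up as a plain `href` or `src` attribute. The names `AudioLinkFinder`, `find_audio_links`, `download` and the extension list are made up for illustration.

```python
import shutil
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: audio files are recognisable by their extension.
AUDIO_EXTENSIONS = (".mp3", ".ogg", ".flac")


class AudioLinkFinder(HTMLParser):
    """Collects href/src attribute values that look like audio files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Ignore any query string when checking the extension.
                path = value.lower().split("?")[0]
                if path.endswith(AUDIO_EXTENSIONS):
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def find_audio_links(html, base_url):
    finder = AudioLinkFinder(base_url)
    finder.feed(html)
    return finder.links


def download(url, path):
    """Rough equivalent of 'wget that file': stream the URL to disk."""
    with urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Usage would be something like fetching the page text, calling `find_audio_links(page_text, page_url)`, and passing each result to `download`. If the site builds its links in JavaScript, this finds nothing and you need one of the heavier approaches.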

Anonymous 2017-07-28 07:32:49 Post No.61611123

>>61608765
https://github.com/Otiel/BandcampDownloader

Anonymous 2017-07-28 08:01:01 Post No.61611545

>>61608765
such a sexy boy

Anonymous 2017-07-28 08:07:40 Post No.61611631

>>61608765
install gentoo

Anonymous 2017-07-28 08:18:18 Post No.61611776

Fiddle around with the page to see how it works. Sometimes it can be as easy as a recursive wget (ignoring robots.txt); other times you might have to code some logic around HTTP requests; and sometimes (depending on how much of a cunt the webdev is) you might have to emulate a web browser with something like PhantomJS.

>>61608820
>Parsing html with regex

Anonymous 2017-07-28 08:19:57 Post No.61611798

>>61611776
what would you use instead of regex?

Anonymous 2017-07-28 08:22:36 Post No.61611835

>>61611018
This, wget is very powerful if you know how to use it.

Anonymous 2017-07-28 08:36:08 Post No.61612037

>>61611798
Well, how about a proper parser?

You can probably parse HTML with regex, but chances are that you're doing it wrong and working at least twice as hard. I certainly wouldn't recommend it.
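To make that concrete, here is a small sketch comparing a naive regex with Python's built-in `html.parser` on a made-up snippet. The sample markup, the `NAIVE` pattern and the helper names are all invented for illustration; a more careful regex would do better, but that is exactly the extra work being talked about.

```python
import re
from html.parser import HTMLParser

# Made-up markup: a commented-out link, a single-quoted href with an
# extra attribute in front, and a tag split across two lines.
SAMPLE = """
<!-- <a href="old.mp3">removed</a> -->
<a class="track" href='one.mp3'>One</a>
<a
   href="two.mp3">Two</a>
"""

# The kind of pattern people write first.
NAIVE = re.compile(r'<a href="([^"]+)"')


class HrefCollector(HTMLParser):
    """Collects the href of every <a> start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def hrefs_with_regex(html):
    return NAIVE.findall(html)


def hrefs_with_parser(html):
    collector = HrefCollector()
    collector.feed(html)
    return collector.hrefs


print(hrefs_with_regex(SAMPLE))   # ['old.mp3']
print(hrefs_with_parser(SAMPLE))  # ['one.mp3', 'two.mp3']
```

The regex picks up only the link inside the comment and misses both real ones, while the parser skips the comment and does not care about quote style, attribute order or line breaks.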

Anonymous 2017-07-28 08:41:33 Post No.61612120

>>61612037
>I certainly wouldn't recommend it.
so, how would you do it?

Anonymous 2017-07-28 08:44:51 Post No.61612176

>>61612037
>Well, how about a proper parser?
way to sidestep the question. how would you do it?

Anonymous 2017-07-28 08:50:15 Post No.61612267

>>61612176
Fetch the HTML and use Vim to rip out all the relevant links, then forward them to a shell script that downloads them.

Anonymous 2017-07-28 08:58:00 Post No.61612372

Check out BAS (Browser Automation Studio). Dunno if you can make it download music, but it is the easiest way to go when you have no coding skill and need web automation. And it's completely FREE.

Anonymous 2017-07-28 09:05:04 Post No.61612499

>>61608765
>Google is your friend
If you don't know shit about technology, why do you come to /g/?

Anonymous 2017-07-28 09:07:21 Post No.61612545

>>61612267
>a proper parser
>'just do it all manually! that is what I would do'
the whole point is to automate the process.

Anonymous 2017-07-28 09:10:02 Post No.61612587

>>61612545
Vim is not an ordinary text editor. You can run a Vim macro from a bash shell, which will do the work for you.

Anonymous 2017-07-28 09:16:51 Post No.61612699

File: theponyhecomes.png (156KB, 740x695px)

>>61611798
>>61612037
>>61612176
With a proper XML parser and xpath expressions.
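A minimal sketch of what that looks like, using Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The markup, the `tracks` id and the `track` class are invented for illustration, and this only works because the sample is well-formed XML. Real-world HTML often is not, in which case a strict XML parser raises an error and you would want an HTML-aware library instead (lxml is one option, not shown here).

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup standing in for a track listing.
XHTML = """<html><body>
<div id="tracks">
  <a class="track" href="one.mp3">One</a>
  <a class="track" href="two.mp3">Two</a>
  <a class="buy" href="buy.html">Buy</a>
</div>
</body></html>"""


def track_hrefs(markup):
    """Return the href of every <a class="track"> directly inside the
    <div id="tracks"> element."""
    root = ET.fromstring(markup)
    # ElementTree understands attribute predicates like [@id='tracks'].
    anchors = root.findall(".//div[@id='tracks']/a[@class='track']")
    return [a.get("href") for a in anchors]


print(track_hrefs(XHTML))  # ['one.mp3', 'two.mp3']
```

The point is that the query describes where the links live in the document tree, so it keeps working when whitespace or attribute order changes.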

Anonymous 2017-07-28 09:36:42 Post No.61612972

Read the source code of soundscrape and you'll have a pretty good idea.

Anonymous 2017-07-28 09:43:19 Post No.61613056

>>61612699
My favorite answer on the entire site

Anonymous 2017-07-28 10:01:33 Post No.61613310

I don't know the bandcamp website, but I build web scrapers with python+beautifulsoup.
When I need javascript, I use python+selenium.

Anonymous 2017-07-28 10:08:22 Post No.61613426

>>61612176
Not him, but XPath is meant to do that. You shouldn't try to parse HTML with regex
