
I want to create a tool that web scrapes for data and does calculations


Thread replies: 7
Thread images: 1

File: Investments.jpg (34KB, 560x285px)
I want to create a tool that web scrapes for data and does calculations with them.

Basically an automated excel sheet.

I know a little Java, but it's hard to find any in-depth books on web scraping in Java.

What should I learn/prepare for?

To be more precise:

>be finance student
>learning about stocks and various models to evaluate companies
>have to manually search the web for data and enter in excel for the formulas
>want to create a tool where I just enter the stock symbol and it automatically does it
>>
Python. Use BeautifulSoup4 for scraping (or Scrapy, but I've never used that), and there's also a library for manipulating Excel spreadsheets.
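
A minimal sketch of the "parse a static page, then calculate" idea. To stay dependency-free it uses the standard library's html.parser as a stand-in for BeautifulSoup4, which would do the same lookup in fewer lines. The HTML snippet, the "price" and "eps" ids, the CellCollector class and the price_to_earnings helper are all made up for illustration; a real page will have its own markup that you need to inspect first.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> that carries an id, keyed by that id."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current_id = None

    def handle_starttag(self, tag, attrs):
        # Remember the id of the cell we just entered (None if it has no id).
        if tag == "td":
            self._current_id = dict(attrs).get("id")

    def handle_data(self, data):
        # Only keep text that sits inside a <td> with an id.
        if self._current_id is not None:
            previous = self.values.get(self._current_id, "")
            self.values[self._current_id] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._current_id = None


def price_to_earnings(html):
    """Pull the hypothetical 'price' and 'eps' cells out of html and divide them."""
    parser = CellCollector()
    parser.feed(html)
    price = float(parser.values["price"])
    eps = float(parser.values["eps"])
    return price / eps


# Hypothetical static page fragment; in practice this would be a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><td>Price</td><td id="price">120.50</td></tr>
  <tr><td>EPS</td><td id="eps">4.82</td></tr>
</table>
"""

pe_ratio = price_to_earnings(SAMPLE_HTML)
print(round(pe_ratio, 2))
```

The same shape applies with BeautifulSoup4: fetch the page, locate the elements that hold the numbers, convert them to floats, then apply the formula you would otherwise type into the spreadsheet.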
>>
>>58778515
I've used Python and openpyxl.
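
A rough sketch of the "write the results into a sheet" step. It swaps openpyxl for the standard library's csv module so it runs without installing anything; Excel opens CSV files directly, while openpyxl would write a real .xlsx workbook. The write_metrics helper, the column names and the XYZ sample row are assumptions for illustration only.

```python
import csv
import io


def write_metrics(rows, out):
    """Write one line per stock: the inputs plus the computed P/E ratio."""
    writer = csv.writer(out)
    writer.writerow(["symbol", "price", "eps", "pe_ratio"])
    for symbol, price, eps in rows:
        writer.writerow([symbol, price, eps, round(price / eps, 2)])


# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_metrics([("XYZ", 120.50, 4.82)], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```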
>>
>>58778315
I only know how to do it in Python.
If the pages you need are static, I'd use BeautifulSoup4, as >>58778515 said.
If the pages load their content via AJAX, you need Selenium.

BTW I'm 100% sure Selenium has a Java version. If the pages you need to scrape number in the hundreds or low thousands, I would go that way because it is a more general solution.
>>
>>58778315
curl. I'm sure Java bindings are available too.
>>
>>58780310
Fucking idiot, that's literally the worst way to do scraping. How are you going to parse the DOM once you have it, eh? Think before you type, you stupid cunt.
>>
>>58780291
Thanks. I will look into Selenium.


This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.