I want to create a tool that scrapes the web for data and does calculations with it.
Basically an automated excel sheet.
I know a little Java, but it's hard to find any in-depth books on web scraping in Java.
What should I learn/prepare for?
To be more precise:
>be finance student
>learning about stocks and various models to evaluate companies
>have to manually search the web for data and enter in excel for the formulas
>want to create a tool where I just enter the stock symbol and it automatically does it
Python. Use beautifulsoup4 for scraping (or scrapy but I've never used that), and there's also a library for manipulating excel spreadsheets.
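Rough sketch of the static-page case, to show the shape of it. Assumptions: the HTML fragment, the "Price" and "EPS" labels, the numbers, and the names CellCollector and parse_figures are all made up for illustration, and a real quote page will have different markup. It uses html.parser from the standard library instead of beautifulsoup4 so it runs with nothing installed; with bs4 the parser class would shrink to a find_all("td") call.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real quote page will look different.
SAMPLE_HTML = """
<table id="quote">
  <tr><td>Price</td><td>150.00</td></tr>
  <tr><td>EPS</td><td>6.00</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell in document order."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self.in_cell:
            self.cells[-1] += data


def parse_figures(html):
    """Turn alternating label/value cells into a dict of floats."""
    parser = CellCollector()
    parser.feed(html)
    cells = [c.strip() for c in parser.cells]
    return {label: float(value) for label, value in zip(cells[0::2], cells[1::2])}


figures = parse_figures(SAMPLE_HTML)
pe_ratio = figures["Price"] / figures["EPS"]  # the "Excel formula" step
print(f"P/E = {pe_ratio:.1f}")
```

In the real tool you would download the page for the symbol first and pass its HTML to parse_figures instead of SAMPLE_HTML.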
>>58778515
ive used python and openpyxl
>>58778315
I only know how to do it in Python.
If the pages you need are static, I'd use BeautifulSoup4 as >>58778515 said.
If the pages load their content via AJAX, you need Selenium.
BTW I'm 100% sure Selenium has a Java version. If the pages you need to scrape number in the hundreds or a few thousand, I would go that way because it's a more general solution.
>>58778315
curl. I'm sure Java bindings are available too.
>>58780310
Fucking idiot, that's literally the worst way to do scraping. How are you going to parse the dom once you have it eh? Think before you type you stupid cunt.
>>58780291
Thanks. I will look into Selenium.