I am working on a trading algo right now and went to get some tasty training data. Yahoo lets you download historical data in .csv format, so it seemed pretty good. Then I looked at the data in the file and the older entries do not match the site at all! WTF! Anyone have some insight into this? It looks like they had a bad float conversion in the volume field.
Pic. related.
https://finance.yahoo.com/quote/JDST/history?period1=1430107200&period2=1493265600&interval=1d&filter=history&frequency=1d
>>1998580
I don't know anything about database files, but my knee-jerk reaction would be to look up the same data on other sites to see whether the HTML table or the CSV file is correct... if either.
>>1998589
I looked up Google's and got a CSV from them. It does not even come close to matching. Time to find another data set to cross-reference.
>>1998608
Google at least matches the site table to the .csv file.
>>1998580
1798/8.99 = 1812/9.06 = 1624/8.12 = 200
>>1998625
Forgot pic
>>1998628
Nice. Did not even see that.
Try getting the data through YQL. One time I was getting different data for the same requests when I was just grabbing the raw CSV files.
>>1998628
So the .csv accounts for the stock splits. That would explain the Adj Close and the Close.
>>1998580
Backtesting on stock data is not trivial.
You'll need a series that accounts for splits and dividends (and a bunch of other corporate actions). You also need a survivorship bias free universe of stocks. Neither of these things is easy to get for free.
If you're at university they ought to have paid subscriptions to data services (Thomson Reuters, Bloomberg, FactSet or CRSP). Quandl might be an alternative; a lot of their data is free.
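To give an idea of what "accounts for splits" means, here is a rough Python sketch of back-adjusting a price series. Everything in it is an assumption for illustration: the function name `back_adjust`, the ratio convention (new shares per old share, so 0.1 for a 1-for-10 reverse split) and the two bars of data are all made up, and it ignores dividends and every other corporate action.

```python
from datetime import date


def back_adjust(bars, splits):
    """Rescale bars dated before each split so the series is continuous.

    bars:   list of (day, close, volume) tuples
    splits: list of (ex_date, ratio) tuples, where ratio is new shares
            per old share (2.0 for 2-for-1, 0.1 for a 1-for-10 reverse split)
    """
    adjusted = []
    for day, close, volume in bars:
        factor = 1.0
        # Every split that takes effect after this bar rescales it.
        for ex_date, ratio in splits:
            if day < ex_date:
                factor *= ratio
        # Price per share shrinks by the ratio, share count grows by it.
        adjusted.append((day, close / factor, volume * factor))
    return adjusted


# Made-up example: a hypothetical 1-for-10 reverse split on 2020-01-03.
bars = [
    (date(2020, 1, 2), 5.0, 1000),   # last bar before the split
    (date(2020, 1, 3), 50.0, 100),   # first bar after the split
]
splits = [(date(2020, 1, 3), 0.1)]

adjusted = back_adjust(bars, splits)
for day, close, volume in adjusted:
    print(day, round(close, 4), round(volume))
```

The pre-split bar gets scaled up to the post-split price level, so the jump on the split date disappears from the series. A real total-return series would also need dividend adjustments, which is where the paid data sets earn their money.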
>>1998656
After digging some more it looks like both of these support each other after some conversion fuckery.
>>1998628
Yahoo - JDST
Date,Open,High,Low,Close,Volume,Adj Close
2015-07-15,10.50,10.95,10.44,10.71,13700,2141.999952 Yahoo
Google - JDST
Date,Open,High,Low,Close,Volume
15-Jul-15,8400.00,8760.00,8352.00,8568.00,3440 Google
        (G)    (Y)
Open    8400 / 10.50 = 800
High    8760 / 10.95 = 800
Low     8352 / 10.44 = 800
Close   8568 / 10.71 = 800
Volume  3440 * 4 = 13760, roughly Yahoo's 13700
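If anyone wants to repeat the check without a calculator, here is a small Python sketch that does the same division on the two rows quoted above. The variable names and the `first_row` helper are my own, and the source labels tacked onto the end of the data rows are left out so the CSV parses cleanly. It only compares this one date, so it says nothing about whether the factors hold for the rest of the file.

```python
import csv
import io

# The two rows quoted above, one per source.
YAHOO_CSV = """Date,Open,High,Low,Close,Volume,Adj Close
2015-07-15,10.50,10.95,10.44,10.71,13700,2141.999952
"""

GOOGLE_CSV = """Date,Open,High,Low,Close,Volume
15-Jul-15,8400.00,8760.00,8352.00,8568.00,3440
"""


def first_row(text):
    """Parse CSV text and return its first data row as a dict."""
    return next(csv.DictReader(io.StringIO(text)))


yahoo = first_row(YAHOO_CSV)
google = first_row(GOOGLE_CSV)

# Google price divided by Yahoo price, per field.
price_ratios = {
    field: float(google[field]) / float(yahoo[field])
    for field in ("Open", "High", "Low", "Close")
}

# Yahoo's own adjustment factor between Adj Close and Close.
adj_factor = float(yahoo["Adj Close"]) / float(yahoo["Close"])

# Yahoo volume divided by Google volume.
volume_ratio = float(yahoo["Volume"]) / float(google["Volume"])

for field, ratio in price_ratios.items():
    print(f"{field:6s} {ratio:.4f}")
print(f"Adj Close / Close  {adj_factor:.4f}")
print(f"Volume (Y) / (G)   {volume_ratio:.4f}")
```

All four price fields come out at a factor of 800, Yahoo's Adj Close is about 200 times its Close, and the volume ratio is close to 4 but not exact.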
>>1998701
Unfortunately I am not enrolled anywhere atm. I believe the college I went to gives out day passes to the library though. I plan on going there for the reason you mentioned; I would also like some data that has intraday. For now I am going to keep working with the Yahoo datasets until I get the whole system set up.