
Make Money from Machine Learning



File: gradient_descent_1.png (88KB, 499x365px)
hey /sci/,

MLfag here. I've been thinking about different ways to make money off of machine learning (I'm still in school, and pet projects are pretty fun).

Two that come to mind are predicting sports/horse racing outcomes to beat the books, and online poker.

Anyone have experience with these kinds of things?

If I could get enough horse data, random forests look like a good option, since I don't think horses have long enough careers to make neural nets viable.

Poker seems like it would be saturated with people that got to it first.
>>
>>8551294
>Poker seems like it would be saturated with people that got to it first.
There are already bots playing online, but I don't know how sophisticated they are, probably not very. They play winning poker though, so that's one way to earn money. I know heads-up limit hold'em is a solved game, but state-of-the-art bots can't beat top-level no-limit hold'em players heads-up with deep stacks (see https://en.wikipedia.org/wiki/Claudico), so there's plenty of stuff still to accomplish.
>>
Poker is not a smart choice.

Your bot will basically just learn to play the odds of its hand, unless you do something really sophisticated.

A smart human player will be able to pick up on this after enough hands, and will bust your bot.

Sports is doable, but machine learning may be overkill. More traditional statistical methods would probably be more appropriate.
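
To give a concrete example of what "traditional statistics" could look like here (this is only a sketch; matches.csv and its columns are made up, not something anyone in this thread posted): an independent Poisson regression on goals scored, with dummies for attacking team, defending team, and home advantage, is a classic baseline for score prediction.

```python
# Sketch only: assumes a hypothetical matches.csv with columns
# home, away, home_goals, away_goals.
import pandas as pd
import statsmodels.formula.api as smf

matches = pd.read_csv("matches.csv")

# Reshape to one row per (attacking team, defending team, venue, goals scored).
rows = []
for _, m in matches.iterrows():
    rows.append({"attack": m["home"], "defend": m["away"], "at_home": 1, "goals": m["home_goals"]})
    rows.append({"attack": m["away"], "defend": m["home"], "at_home": 0, "goals": m["away_goals"]})
long_form = pd.DataFrame(rows)

# Poisson regression: expected goals depend on who attacks, who defends, and venue.
model = smf.poisson("goals ~ C(attack) + C(defend) + at_home", data=long_form).fit()

# Expected goals for a hypothetical fixture (team names are placeholders);
# Poisson probabilities over scorelines then give win/draw/loss odds.
fixture = pd.DataFrame({"attack": ["TeamA"], "defend": ["TeamB"], "at_home": [1]})
print(model.predict(fixture))
```

The point is that a model like this has a handful of interpretable parameters per team instead of a big network to fit.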
>>
>>8551313
well machine learning is how you implement the statistics

>>8551310
I've read that as well. Some people seem to disagree that it was "solved".

And I wouldn't need to beat top-level players, just the average online poker player.
>>
>>8551294
NumerAI is a good way to practice applying machine learning while also getting compensated with bitcoin.
>>
Reminds me of this article.
http://www.nytimes.com/2015/03/22/opinion/sunday/making-march-madness-easy.html?_r=0
tl;dr good features are more important than good models

>I don't think horses have long enough careers to make neural nets viable
Depends on the net. Yeah, they can require a lot of data, but only if you go crazy with the network architecture. Logistic regression is basically the simplest NN you can get.
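
To make the "simplest NN" claim concrete, here's a minimal numpy sketch (toy data, nothing to do with horses): logistic regression is literally one weight layer with a sigmoid on top, trained by gradient descent on the log loss.

```python
# Logistic regression written as a one-layer neural net: sigmoid(Xw + b).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lr=0.1, epochs=2000):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)            # "forward pass"
        grad_w = X.T @ (p - y) / len(y)   # gradient of the mean log loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data just to show the shapes: 200 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
w, b = fit_logreg(X, y)
print("training accuracy:", np.mean((sigmoid(X @ w + b) > 0.5) == y))
```

Add a hidden layer and you have the usual multilayer perceptron; the training loop barely changes.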
>>
>>8551319
>NumerAI
Very interesting. Aside from what's in the top ten Google results, anything you'd like to note about it?
>>
>>8551321
If I used a logistic (sigmoid) activation in the hidden layer and a softmax output activation, with only one hidden layer, I think I'd still need ~10k observations.

Horses don't race that many times.
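
For what it's worth, the architecture you're describing is a couple of lines in scikit-learn; the data below is random noise just to show the setup, and whether ~10k rows is really needed depends on how many features and hidden units you use.

```python
# One hidden layer with a logistic (sigmoid) activation; MLPClassifier applies
# a softmax over the output layer when there are more than two classes.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))      # placeholder: 1000 races, 20 features each
y = rng.integers(0, 8, size=1000)    # placeholder: which of 8 horses won

clf = MLPClassifier(hidden_layer_sizes=(32,), activation="logistic", max_iter=2000)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

With random labels this hovers around chance (1/8), which is exactly the kind of baseline check you'd want before trusting any accuracy number on real racing data.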
>>
>>8551323
It's a good excuse to learn more ML techniques. Ultimately, the stuff I have learned trying to be as competitive as possible has been more valuable than the money I have made (about $70 in the past few months, so that's not really saying much).
>>
>>8551336
Ever do Kaggle? Those competitions are pretty solid and have a good community as well.
>>
>>8551337
Yeah, in fact I think I found out about NumerAI through Kaggle. Though the question of the thread was how to make money from machine learning, and I think it is much easier to do that through NumerAI.
>>
>>8551294
I thought you were going to start a company using convolutional neural networks to solve some problem and then have it bought by a big company for hundreds of millions. Guess you just want to gamble.
>>
>>8551346
thanks dad.

>>8551342
Ah, right. Yeah, all the Kaggle competition winners are teams. Berkeley's was pretty wizard-level too.
>>
>>8551294
ML is great at interpolation, horrible at extrapolation.
>>
>>8551294
You'll find the problem is getting the data. People have known for ages how valuable data is, and they've figured out what kind of data is the most valuable, so they at least try to make it as difficult as possible to get. There won't be some database online with everything you need. Chances are those just don't exist. So you end up scraping for data with little chance of getting enough of it. And if you can find larger amounts of data, you'll quickly find that people have already tried what you are about to try.
>>
>>8551400
>There won't be some database online with everything you need. Chances are those just don't exist.
If OP had wanted to do this a few years sooner, he could easily have bought hundreds of millions of hand histories. The site that offered them has since gone down, but you can probably still get them somewhere or just mine your own.
>>
>>8551294
Sports like football probably wouldn't work, because the teams change nearly every year and machine learning needs a lot of data, so you'll probably never have enough before a team makes a major change.
>>
>>8551833
Modular machine learning with neural networks is a thing though.
>>
>>8551313
>unless you do something really sophisticated.
adding noise to its decisions is not sophisticated
>>
undergrad wannabe MLfag here. I haven't got much substance to contribute - mostly replying to bump this post

IIRC the most lucrative sports betting is about predicting the point spread - by how many points does the winning team win? Your objective is to predict the spread as a function of both teams' individual players' career stats, and to put money down when you think the house is wrong.
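
Roughly, something like this sketch (games.csv, its feature columns, and the vegas_line column are all made up for illustration): predict the spread from lineup-derived features, then only bet when your prediction disagrees with the posted line by some margin.

```python
# Hypothetical point-spread model: ridge regression on lineup-derived features.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

games = pd.read_csv("games.csv")                    # assumed file, placeholder columns
feature_cols = [c for c in games.columns if c.startswith("feat_")]
X = games[feature_cols]
y = games["point_spread"]                           # home score minus away score

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Bet only where the model and the bookmaker's line differ by a threshold.
edge = pred - games.loc[X_te.index, "vegas_line"]
bets = edge.abs() > 3.0
print(f"would bet on {bets.sum()} of {len(bets)} held-out games")
```

The threshold is where the real work is - it has to be big enough to cover the bookmaker's cut, or a small edge still loses money.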

Why does the horse's career length matter? You don't want to predict the winner as a function of the horses' names, you want to predict it as a function of their performance statistics.

sportsfags tell me that a football team's performance as a function of individual player stats is simpler to approximate than a basketball team's - the latter involves more teamwork, while the former is more additive.

You might want to just try and find some really fucking obscure sports. I remember hearing about a guy who would travel the world looking for obscure kinds of races on which to bet. He'd always use the same exact technique, and would get banned within a few months after cleaning them out, so he'd just find another sport.
>>
>>8552308
hmm. interesting thought about the horses.
So you'd use past speeds on that course for similarly weighted horse/jockey combinations, or what? Random forests and neural networks seem like the obvious options.

But if a horse's career is less than 5 years, then you won't have that many statistics on it.

it seems like you've researched this a bit more. I have quite a bit of technical knowledge, but not much in the domain.

If you have any questions regarding which algorithms to use etc., please ask.
>>
>>8552378
Sorry, that comment represents almost all of my domain knowledge - I'm just repeating what I overheard at a meeting of my school's sports analytics club. I was just surprised that so many people in this thread were talking about having little data on a candidate - it's not the name of the team that matters, it's the lineup, and there should be much more data on the lineup.

But yeah, idk if I can think of anything better than what you just described. I'm sure there's some correlation between courses though - maybe cluster horses together based on how their performance differs between, say, rough tracks and smooth tracks. If you were doing linear regression, you'd have a different regression function for each cluster.
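
Something like this sketch of that idea (horse_speeds.csv and every column name here are invented for illustration): cluster horses on their rough-vs-smooth performance profile, then fit a separate linear model per cluster.

```python
# Hypothetical: group horses by how their speed differs across track surfaces,
# then fit one regression per cluster, as suggested above.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

horses = pd.read_csv("horse_speeds.csv")            # assumed file and columns
profile = horses[["avg_speed_rough", "avg_speed_smooth"]]
horses["cluster"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(profile)

models = {}
for c, group in horses.groupby("cluster"):
    X = group[["jockey_weight", "post_position", "recent_form"]]   # made-up features
    y = group["finish_time"]
    models[c] = LinearRegression().fit(X, y)

# At prediction time you'd assign a new race entry to its horse's cluster
# and use that cluster's regression function.
```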

I've never actually encountered random forests in a course before. AFAIK from the wiki intro, it's an ensemble of decision trees, but why do I hear about it so much more than the usual information-gain strategy for single trees?
>>
I'm gonna get together with a friend of mine who does sports betting and try some stuff. Maybe check back in January; I browse often.
>>
>>8551294
If you really care about actually making money rather than just having a pet project, you should research the fees that the online services charge (it's not like they'll let you bet for free). Also, I'm guessing your winnings would be taxed as income. So consider whether the competitive advantage from your algorithm will be enough to make this a more lucrative strategy than just putting your money in a stock market index fund (which will most likely have lower service fees, and whose gains count as capital gains, so they're taxed at a lower rate than income).
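
As a back-of-the-envelope comparison (every number below is a made-up placeholder, not an actual fee or tax rate):

```python
# Toy comparison of a betting edge after the bookmaker's cut and income tax
# versus an index fund after fees and capital gains tax. All figures assumed.
bankroll = 10_000
edge = 0.05            # assumed gross return from the algorithm over a year
vig = 0.05             # assumed bookmaker cut on winnings
income_tax = 0.30      # assumed marginal income tax rate on gambling winnings
betting_net = bankroll * edge * (1 - vig) * (1 - income_tax)

index_return = 0.07    # assumed annual index fund return
fund_fee = 0.001       # assumed expense ratio
cap_gains_tax = 0.15   # assumed long-term capital gains rate
index_net = bankroll * (index_return - fund_fee) * (1 - cap_gains_tax)

print(f"betting nets {betting_net:.0f}, index fund nets {index_net:.0f}")
```

With placeholder numbers like these the algorithm has to clear a noticeably higher gross return than the index fund just to break even with it, which is the point.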
>>
>>8552481
In a decision tree, the deeper you make the tree, the more likely you are to overfit (good results on the training set, bad results on the validation set).

A random forest is an ensemble method that uses a bunch of decision trees, but each tree sees only a certain percentage of the sample: say, 10 trees, each with 15% of the data, overlapping a bit (which is okay).

Then you use the average of the posterior probabilities as your prediction (meaning you put in your test data point, each of the 10 trees outputs a class guess and a probability, you average the probabilities per class, and you predict the class with the highest average).

It reduces overfitting really substantially, so you can have deeper trees and still get good accuracy.

Each tree uses the same entropy calculation for maximizing information gain.

There are also some other tricks you can do, like forcing certain trees to specialize on edge cases by feeding them more outliers, etc.

It also has a certain readability quality, as you can look back through a tree and see which splits led to the decision.

pretty cool imho.
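
A quick sketch of exactly that scheme (iris is just a stand-in dataset; scikit-learn's RandomForestClassifier does the subsampling and probability averaging for you, but spelling it out with plain decision trees matches the description above):

```python
# 10 entropy-based decision trees, each trained on a random 15% subsample;
# the ensemble predicts the argmax of the averaged per-class probabilities.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)        # stand-in data, not horse/sports data
classes = np.unique(y)
rng = np.random.default_rng(0)

trees = []
for _ in range(10):
    idx = rng.choice(len(X), size=int(0.15 * len(X)), replace=False)
    trees.append(DecisionTreeClassifier(criterion="entropy").fit(X[idx], y[idx]))

def full_proba(tree, X):
    # Pad so every tree reports a probability for every class, even classes
    # missing from its particular subsample.
    p = np.zeros((len(X), len(classes)))
    p[:, np.searchsorted(classes, tree.classes_)] = tree.predict_proba(X)
    return p

avg = np.mean([full_proba(t, X) for t in trees], axis=0)   # average the posteriors
pred = avg.argmax(axis=1)
print("ensemble accuracy on the training pool:", np.mean(pred == y))
```

A "real" random forest also picks a random subset of features at every split, which is where a lot of the extra variance reduction comes from.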