
Make Money from Machine Learning



File: gradient_descent_1.png (88KB, 499x365px)
hey /sci/,

MLfag here. I've been thinking about different ways to make money off of machine learning (I'm still in school, and pet projects are pretty fun).

Two that come to mind are predicting sports/horse racing outcomes to beat the books, and online poker.

Anyone have experience with these kinds of things?

If I could get enough horse data, random forests look like a good option, since I don't think horses have long enough careers to make neural nets viable.

Poker seems like it would be saturated with people that got to it first.
>>
>>8551294
>Poker seems like it would be saturated with people that got to it first.
There are already bots playing online, but I don't know how sophisticated they are, probably not very. They play winning poker though, so that's one way to earn money. I know heads-up limit hold'em is a solved game, but state-of-the-art bots can't beat top-level no-limit hold'em players heads-up with deep stacks (see https://en.wikipedia.org/wiki/Claudico), so there's plenty of stuff still to accomplish.
>>
Poker is not a smart choice.

Your bot will basically just learn to play the odds of its hand, unless you do something really sophisticated.

A smart human player will be able to pick up on this after enough hands, and will bust your bot.

Sports is doable, but machine learning may be overkill. More traditional statistical methods would probably be more appropriate.
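
To give a concrete example of what "traditional statistics" could look like here (this is only a sketch; matches.csv and its columns are made up, not something anyone in this thread posted): an independent Poisson regression on goals scored, with dummies for attacking team, defending team, and home advantage, is a classic baseline for score prediction.

```python
# Sketch only: assumes a hypothetical matches.csv with columns
# home, away, home_goals, away_goals.
import pandas as pd
import statsmodels.formula.api as smf

matches = pd.read_csv("matches.csv")

# Reshape to one row per (attacking team, defending team, venue, goals scored).
rows = []
for _, m in matches.iterrows():
    rows.append({"attack": m["home"], "defend": m["away"], "at_home": 1, "goals": m["home_goals"]})
    rows.append({"attack": m["away"], "defend": m["home"], "at_home": 0, "goals": m["away_goals"]})
long_form = pd.DataFrame(rows)

# Poisson regression: expected goals depend on who attacks, who defends, and venue.
model = smf.poisson("goals ~ C(attack) + C(defend) + at_home", data=long_form).fit()

# Expected goals for a hypothetical fixture (team names are placeholders);
# Poisson probabilities over scorelines then give win/draw/loss odds.
fixture = pd.DataFrame({"attack": ["TeamA"], "defend": ["TeamB"], "at_home": [1]})
print(model.predict(fixture))
```

The point is that a model like this has a handful of interpretable parameters per team instead of a big network to fit.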
>>
>>8551313
well machine learning is how you implement the statistics

>>8551310
I've read that as well. Some people seem to disagree that it was "solved".

And I wouldn't need to beat top-level players, just the average online poker player.
>>
>>8551294
NumerAI is a good way to practice applying machine learning while also getting compensated with bitcoin.
>>
Reminds me of this article.
http://www.nytimes.com/2015/03/22/opinion/sunday/making-march-madness-easy.html?_r=0
tl;dr good features are more important than good models

>I don't think horses have long enough careers to make neural nets viable
Depends on the net. Yeah, they can require a lot of data, but only if you go crazy with the network architecture. Logistic regression is basically the simplest NN you can get.
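
To make the "simplest NN" claim concrete, here's a minimal numpy sketch (toy data, nothing to do with horses): logistic regression is literally one weight layer with a sigmoid on top, trained by gradient descent on the log loss.

```python
# Logistic regression written as a one-layer neural net: sigmoid(Xw + b).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lr=0.1, epochs=2000):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)            # "forward pass"
        grad_w = X.T @ (p - y) / len(y)   # gradient of the mean log loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data just to show the shapes: 200 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
w, b = fit_logreg(X, y)
print("training accuracy:", np.mean((sigmoid(X @ w + b) > 0.5) == y))
```

Add a hidden layer and you have the usual multilayer perceptron; the training loop barely changes.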
>>
>>8551319
>NumerAI
Very interesting. Aside from what's in the top ten Google results, anything you'd like to note about it?
>>
>>8551321
If I used a logistic (sigmoid) activation in the hidden layer and a softmax output activation, with only one hidden layer, I think I'd still need ~10k observations.

Horses don't race that many times.
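
For what it's worth, the architecture you're describing is a couple of lines in scikit-learn; the data below is random noise just to show the setup, and whether ~10k rows is really needed depends on how many features and hidden units you use.

```python
# One hidden layer with a logistic (sigmoid) activation; MLPClassifier applies
# a softmax over the output layer when there are more than two classes.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))      # placeholder: 1000 races, 20 features each
y = rng.integers(0, 8, size=1000)    # placeholder: which of 8 horses won

clf = MLPClassifier(hidden_layer_sizes=(32,), activation="logistic", max_iter=2000)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

With random labels this hovers around chance (1/8), which is exactly the kind of baseline check you'd want before trusting any accuracy number on real racing data.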
>>
>>8551323
It's a good excuse to learn more ML techniques. Ultimately, the stuff I have learned trying to be as competitive as possible has been more valuable than the money I have made (about $70 in the past few months, so that's not really saying much).
>>
>>8551336
Ever do Kaggle? Those competitions are pretty solid and have a good community as well.
>>
>>8551337
Yeah, in fact I think I found out about NumerAI through Kaggle. Though the question of the thread was how to make money from machine learning, and I think it is much easier to do that through NumerAI.
>>
>>8551294
I thought you were going to start a company using convolutional neural networks to solve some problem and then have it bought by a big company for hundreds of millions. Guess you just want to gamble.
>>
>>8551346
thanks dad.

>>8551342
Ah, right. Yeah, all the Kaggle competition winners are teams. Berkeley's was pretty wizard-level too.
>>
>>8551294
ML is great at interpolation, horrible at extrapolation.
>>
>>8551294
You'll find the problem is getting the data. People have known for ages how valuable data is, and they've figured out what kind of data is the most valuable, so they at least try to make it as difficult as possible to get. There won't be some database online with everything you need. Chances are those just don't exist. So you end up scraping for data with little chance of getting enough of it. And if you can find larger amounts of data, you'll quickly find that people have already tried what you are about to try.
>>
>>8551400
>There won't be some database online with everything you need. Chances are those just don't exist.
If OP had wanted to do this a few years sooner, he could easily have bought hundreds of millions of hand histories. The site that offered them has since gone down, but you can probably still get them somewhere or just mine your own.
>>
>>8551294
Sports like football probably wouldn't work, because the teams change nearly every year and machine learning needs a lot of data, so you'll probably never have enough before a team makes a major change.
>>
>>8551833
Modular machine learning with neural networks is a thing though.
>>
>>8551313
>unless you do something really sophisticated.
adding noise to its decisions is not sophisticated
>>
undergrad wannabe MLfag here. I haven't got much substance to contribute - mostly replying to bump this post

IIRC the most lucrative sports betting is about predicting the point spread - by how many points does the winning team win? Your objective is to predict the spread as a function of both teams' individual players' career stats, and to put money down when you think the house is wrong.
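
Roughly, something like this sketch (games.csv, its feature columns, and the vegas_line column are all made up for illustration): predict the spread from lineup-derived features, then only bet when your prediction disagrees with the posted line by some margin.

```python
# Hypothetical point-spread model: ridge regression on lineup-derived features.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

games = pd.read_csv("games.csv")                    # assumed file, placeholder columns
feature_cols = [c for c in games.columns if c.startswith("feat_")]
X = games[feature_cols]
y = games["point_spread"]                           # home score minus away score

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Bet only where the model and the bookmaker's line differ by a threshold.
edge = pred - games.loc[X_te.index, "vegas_line"]
bets = edge.abs() > 3.0
print(f"would bet on {bets.sum()} of {len(bets)} held-out games")
```

The threshold is where the real work is - it has to be big enough to cover the bookmaker's cut, or a small edge still loses money.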

Why does the horse's career length matter? You don't want to predict the winner as a function of the horses' names, you want to predict it as a function of their performance statistics.

sportsfags tell me that a football team's performance as a function of individual player stats is simpler to approximate than a basketball team's - the latter involves more teamwork, while the former is more additive.

You might want to just try and find some really fucking obscure sports. I remember hearing about a guy who would travel the world looking for obscure kinds of races on which to bet. He'd always use the same exact technique, and would get banned within a few months after cleaning them out, so he'd just find another sport.
>>
>>8552308
hmm. interesting thought about the horses.
So you'd use past speeds on that course for similarly weighted horse/jockey combinations, or what? Random forests and neural networks seem like the obvious options.

But if a horse's career is less than 5 years, then you won't have that many statistics on it.

it seems like you've researched this a bit more. I have quite a bit of technical knowledge, but not much in the domain.

If you have any questions regarding which algorithms to use etc., please ask.
>>
>>8552378
Sorry, that comment represents almost all of my domain knowledge - I'm just repeating what I overheard at a meeting of my school's sports analytics club. I was just surprised that so many people in this thread were talking about having little data on a candidate - it's not the name of the team that matters, it's the lineup, and there should be much more data on the lineup.

But yeah, idk if I can think of anything better than what you just described. I'm sure there's some correlation between courses though - maybe cluster horses together based on how their performance differs between, say, rough tracks and smooth tracks. If you were doing linear regression, you'd have a different regression function for each cluster.
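
Something like this sketch of that idea (horse_speeds.csv and every column name here are invented for illustration): cluster horses on their rough-vs-smooth performance profile, then fit a separate linear model per cluster.

```python
# Hypothetical: group horses by how their speed differs across track surfaces,
# then fit one regression per cluster, as suggested above.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

horses = pd.read_csv("horse_speeds.csv")            # assumed file and columns
profile = horses[["avg_speed_rough", "avg_speed_smooth"]]
horses["cluster"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(profile)

models = {}
for c, group in horses.groupby("cluster"):
    X = group[["jockey_weight", "post_position", "recent_form"]]   # made-up features
    y = group["finish_time"]
    models[c] = LinearRegression().fit(X, y)

# At prediction time you'd assign a new race entry to its horse's cluster
# and use that cluster's regression function.
```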

I've never actually encountered random forests in a course before. AFAIK from the wiki intro, it's an ensemble of decision trees, but why do I hear about it so much more than the usual information-gain strategy for single trees?
>>
I'm gonna get together with a friend of mine who does sports betting and try some stuff. Maybe check back in January; I browse often.
>>
>>8551294
If you really care about actually making money rather than just having a pet project, you should research the fees that the online services charge (it's not like they'll let you bet for free). Also, I'm guessing your winnings would be taxed as income. So consider whether the competitive advantage from your algorithm will be enough to make this a more lucrative strategy than just putting your money in a stock market index fund (which will most likely have lower service fees, and whose gains count as capital gains, so they're taxed at a lower rate than income).
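
As a back-of-the-envelope comparison (every number below is a made-up placeholder, not an actual fee or tax rate):

```python
# Toy comparison of a betting edge after the bookmaker's cut and income tax
# versus an index fund after fees and capital gains tax. All figures assumed.
bankroll = 10_000
edge = 0.05            # assumed gross return from the algorithm over a year
vig = 0.05             # assumed bookmaker cut on winnings
income_tax = 0.30      # assumed marginal income tax rate on gambling winnings
betting_net = bankroll * edge * (1 - vig) * (1 - income_tax)

index_return = 0.07    # assumed annual index fund return
fund_fee = 0.001       # assumed expense ratio
cap_gains_tax = 0.15   # assumed long-term capital gains rate
index_net = bankroll * (index_return - fund_fee) * (1 - cap_gains_tax)

print(f"betting nets {betting_net:.0f}, index fund nets {index_net:.0f}")
```

With placeholder numbers like these the algorithm has to clear a noticeably higher gross return than the index fund just to break even with it, which is the point.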
>>
>>8552481
In a decision tree, the deeper you make the tree, the more likely you are to overfit (good results on the training set, bad results on the validation set).

A random forest is an ensemble method that uses a bunch of decision trees, but each tree sees only a certain percentage of the sample: say, 10 trees, each with 15% of the data, overlapping a bit (which is okay).

Then you use the average of the posterior probabilities as your prediction (meaning you put in your test data point, each of the 10 trees outputs a class guess and a probability, you average the probabilities per class, and you predict the class with the highest average).

It reduces overfitting really substantially, so you can have deeper trees and still get good accuracy.

Each tree uses the same entropy calculation for maximizing information gain.

There are also some other tricks you can do, like forcing certain trees to specialize on edge cases by feeding them more outliers, etc.

It also has a certain readability quality, as you can look back through a tree and see which splits led to the decision.

pretty cool imho.
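
A quick sketch of exactly that scheme (iris is just a stand-in dataset; scikit-learn's RandomForestClassifier does the subsampling and probability averaging for you, but spelling it out with plain decision trees matches the description above):

```python
# 10 entropy-based decision trees, each trained on a random 15% subsample;
# the ensemble predicts the argmax of the averaged per-class probabilities.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)        # stand-in data, not horse/sports data
classes = np.unique(y)
rng = np.random.default_rng(0)

trees = []
for _ in range(10):
    idx = rng.choice(len(X), size=int(0.15 * len(X)), replace=False)
    trees.append(DecisionTreeClassifier(criterion="entropy").fit(X[idx], y[idx]))

def full_proba(tree, X):
    # Pad so every tree reports a probability for every class, even classes
    # missing from its particular subsample.
    p = np.zeros((len(X), len(classes)))
    p[:, np.searchsorted(classes, tree.classes_)] = tree.predict_proba(X)
    return p

avg = np.mean([full_proba(t, X) for t in trees], axis=0)   # average the posteriors
pred = avg.argmax(axis=1)
print("ensemble accuracy on the training pool:", np.mean(pred == y))
```

A "real" random forest also picks a random subset of features at every split, which is where a lot of the extra variance reduction comes from.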