Computer Science

You are currently reading a thread in /sci/ - Science & Math

Thread replies: 7
Thread images: 1
How does one get a good understanding of preprocessing data before starting to think about neural network architecture, etc?

Is there a checklist or something? I guess there's imputation if needed, then converting categorical variables to numerical ones. After that I look for correlations (correlation matrix) and maybe mutual information (to check for non-linear dependence), but what else? Is there a complete guide for this?
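For reference, the steps listed above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the toy columns (`age`, `color`, `x`, `y`) and all helper names are made up, mean imputation is just one of several possible strategies, and the mutual information here is the plug-in estimate for discrete values (continuous columns would need binning or a dedicated estimator first).

```python
import math
from collections import Counter

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def one_hot(values):
    """Map a categorical column to one 0/1 column per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}

def pearson(xs, ys):
    """Pearson correlation coefficient; it only measures linear dependence."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

# Toy data, invented purely for illustration.
age = [20, None, 40, 30]
color = ["red", "blue", "red", "green"]
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]  # y is fully determined by x, but not linearly

age_filled = impute_mean(age)        # missing entry becomes the mean, 30.0
color_encoded = one_hot(color)       # keys: "blue", "green", "red"
linear = pearson(x, y)               # zero: no linear relationship
nonlinear = mutual_information(x, y) # clearly positive: dependence exists
```

The `x`/`y` pair shows why checking both is worthwhile: the correlation coefficient is zero even though `y` is a deterministic function of `x`, while the mutual information picks the dependence up.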

Also, computer science general
>>
>http://blog.kaggle.com/2016/01/04/how-much-did-it-rain-ii-winners-interview-1st-place-pupa-aka-aaron-sim/

>mfw random physics guy jumps into ML and gets #1
>>
>>7779060
fuck NNs, bayesian program learning BTFO deep learning: http://science.sciencemag.org/content/350/6266/1332.full
>>
>>7779090
nice paywall
>>
>If I were to take one point away from this contest, it is that the days of manually constructing features from data are almost over. The machines will win. I experienced this in the Plankton classification contest where the monumental effort that my teammate and I put into extracting image features was eclipsed within minutes by even the shallowest of CNNs.
>>
>>7779060
That basically means you have to learn the field you are trying to do learning on.

>>7779172
People in general don't bother reading a paper if it's behind a paywall. Also, the machines won't win if you don't have a method for selecting relevant training data: any machine learning method can fail if you train it on the wrong data. Manually constructed features could be used to screen out the worst training samples so they don't ruin the network.
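A rough sketch of that last idea, assuming each training sample is a list of sensor readings with `None` marking a missing value. The feature (`missing_fraction`), the 0.5 threshold, and the toy samples are all hypothetical choices for illustration, not something taken from the linked contest.

```python
def missing_fraction(readings):
    """Hand-made feature: share of readings that are missing (None)."""
    return sum(r is None for r in readings) / len(readings)

def filter_training_data(samples, max_missing=0.5):
    """Keep only (readings, label) pairs whose missing fraction is acceptable."""
    return [
        (readings, label)
        for readings, label in samples
        if missing_fraction(readings) <= max_missing
    ]

# Toy (readings, label) pairs, invented for illustration.
samples = [
    ([0.1, 0.2, 0.3, 0.4], 1.0),     # complete
    ([None, None, None, 0.4], 0.0),  # mostly missing, gets dropped
    ([0.5, None, 0.7, 0.8], 2.0),    # one gap, still kept
]

clean = filter_training_data(samples)
kept_labels = [label for _, label in clean]
```

The network is then trained on `clean` only. The feature is not fed to the model here, it just decides which samples the model gets to see.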
>>
>>7780343
>not being part of a group that provides access to all papers you want
Thread DB ID: 432531




This is a 4chan archive - all of the shown content originated from that site. This means that 4Archive shows their content, archived. If you need information for a Poster - contact them.