File: Screenshot 2016-01-01 19.01.19.png (215KB, 653x308px)
Is it possible for decision trees to split a set of data into subsets where the sum of the entropies of the subsets (each weighted by its share of the original set) is greater than the entropy of the original set?

I know it's a bit contrived, but can there be a scenario where I would need to worry about negative information gain?

Or is this never a problem, so all I have to do is plug and chug and minimize the entropy after the split, without worrying about how it compares to the entropy before the split?

>inb4 goto /g/, /g/ cant even into entropy apparently
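
To make the quantity concrete, here is a minimal sketch in plain Python (the names entropy, split_entropy, and info_gain are just illustrative, not taken from any particular library): it computes the entropy of the parent set and the count-weighted sum of the entropies of the subsets, which is exactly the comparison the question is about.

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy (bits) of the empirical class distribution of `labels`.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def split_entropy(children):
    # Sum of the child-subset entropies, each weighted by its share of the elements.
    total = sum(len(c) for c in children)
    return sum((len(c) / total) * entropy(c) for c in children)

def info_gain(parent, children):
    # Information gain of partitioning `parent` into `children`.
    return entropy(parent) - split_entropy(children)

parent = [0, 0, 0, 1, 1, 1]
print(entropy(parent))                            # 1.0
print(split_entropy([[0, 0, 0], [1, 1, 1]]))      # 0.0 (a perfect split)
print(info_gain(parent, [[0, 0, 1], [0, 1, 1]]))  # ~0.082 (a poor split)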
>>
File: Screenshot 2015-12-31 06.51.17.png (705KB, 773x708px)
bump
>>
>>7757093
stop bumping your shitty thread
where did you learn about entropy that allows negative information gain?
where did you learn about entropy or information gain that causes you to ask
>Is it possible for decision trees to split a set of data into subsets where the sum of the entropies (weighted respectively) of each subset will be greater than the original entropy of the set?

are you in an online degree from some third-world porta-potty? or are you trying to convince some random company that you don't need a degree to be a data miner?
>>
just a student that's having a bit of trouble with yesterday's material, jesus christ

let x = {0,0,0,1,1,1}
is there a y that is a subset of x such that the entropy of y is greater than the entropy of x?
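
As a quick numerical check of this example (same empirical-distribution convention for entropy, in bits; the helper below is illustrative):

from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

x = [0, 0, 0, 1, 1, 1]
print(entropy(x))             # 1.0 (perfectly balanced two-class set)
print(entropy([0, 0, 0, 1]))  # ~0.811
print(entropy([0, 0, 1, 1]))  # 1.0
print(entropy([0, 0, 0]))     # -0.0, i.e. zero bits (a pure subset)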
>>
>>7757142
please just open a textbook wow
literally every subset of x has a higher entropy than x
>shitposting homework on sci that you haven't even looked at other than to copy into a post
shiggy diggy sage
>>
>>7757142
Anon, a set of determined digits has 0 entropy...
>>
>>7757093
That is not really how information entropy works. It's used more as a guide to building the tree.
>>
File: Screenshot 2015-11-04 03.41.22.png (472KB, 711x690px)
k let me rephrase my dumbass self.

decision trees select splitting attributes by maximizing information gain, i.e. by minimizing the weighted entropy that remains after splitting some set X on that attribute.

I am asking whether the algorithm would ever have to deal with attributes so retarded or contrived that the information gain came out negative (SOMEHOW), and whether we would then have to do something like max(gain, 0) to keep bullshit "malicious" attributes from being chosen.
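
For what it's worth, here is a brute-force sanity check (plain Python, illustrative only): it draws random label sets, splits them into two non-empty parts at random, and tracks the smallest gain seen. Since Shannon entropy is concave, the count-weighted entropy of the parts can never exceed the entropy of the whole (Jensen's inequality), so the minimum should come out non-negative up to floating-point rounding.

import random
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(parent, left, right):
    # Parent entropy minus the count-weighted entropies of the two parts.
    n = len(parent)
    return entropy(parent) - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)

random.seed(0)
worst = float("inf")
for _ in range(10000):
    parent = [random.randint(0, 2) for _ in range(20)]  # 3 classes, 20 examples
    order = random.sample(parent, len(parent))          # shuffled copy of the labels
    cut = random.randint(1, len(order) - 1)             # both parts non-empty
    worst = min(worst, gain(parent, order[:cut], order[cut:]))
print(worst)  # >= 0 up to tiny floating-point rounding, whatever the split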
>>
>>7757170
it's implied that for a set you calculate the entropy of its empirical distribution, i.e. as if you drew elements from the set at random, kid...
so {1, 1, 1, 1} has entropy of -(4/4)*log_2(4/4) and {1, 1, 0, 0} has entropy of -((2/4)*log_2(2/4) + (2/4)*log_2(2/4))
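
Evaluating those two expressions, just to pin down the numbers:

from math import log2

print(-(4/4) * log2(4/4))                        # -0.0, i.e. zero bits for {1, 1, 1, 1}
print(-((2/4) * log2(2/4) + (2/4) * log2(2/4)))  # 1.0 bit for {1, 1, 0, 0}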
>>
>>7757183
Textbook, seriously.

Hell, just look at the equation for entropy! If you can't deduce it from that then what are you doing in university?
>>
>>7757187
perhaps I am unsure of my mathematics and just need a little help, because I am human and still learning, and you are not contributing and come across to me as an academic elitist

>just maybe
>>
>>7757191
or perhaps you're lazy
or maybe retarded
neither of which should be shitposting on sci
>>
>>7757187
also, my question was about the sum over all of the subsets, each with its respective weight.

so it's not as simple as hurr durr 0 <= x <= 1
it's the sum of the subset entropies, each weighted by the number of elements in that subset relative to the original set
>>
>>7757205
actually, I just don't think you understand the gravity of my question