Shannon Entropy
- Measured in bits
- One bit of entropy is the amount of information it takes to encode a yes/no signal
- Intuitively, entropy is the "surprise" level of a distribution
Entropy Formula
Properties we want the entropy formula to have:
- Continuous: small changes in the probabilities should only change the entropy a little
- Maximal when all events are equally likely (1 bit for a fair coin flip)
- 0 if the outcome is certain
- Every time we double the number of equally likely outcomes (one coin flip has two outcomes, two flips have four, three have eight), entropy should increase by one bit
If all events have equal probability (n equally likely outcomes):
H = \log_2(n)
Otherwise:
H = -\sum_{i=1}^{n} p_i \log_2(p_i)
Where p_i is the probability of outcome i and n is the number of possible outcomes.
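A minimal Python sketch of these formulas (the function name shannon_entropy and the example probabilities are just illustrative), checking the properties listed above:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: sum of p * log2(1/p), equivalent to -sum(p * log2(p)).

    Zero-probability events are skipped, since they contribute no surprise.
    """
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# A fair coin flip (two equally likely outcomes) carries 1 bit of entropy.
print(shannon_entropy([0.5, 0.5]))    # 1.0

# Doubling the number of equally likely outcomes adds one bit: log2(4) = 2.
print(shannon_entropy([0.25] * 4))    # 2.0

# A certain outcome has zero entropy: no surprise at all.
print(shannon_entropy([1.0]))         # 0.0
```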
Decision Trees
Leaves should have NO entropy! If I sample my leaves, I should have no surprise! Entropy change from root to leaves: -1 bit (for a root with 1 bit of entropy and pure leaves, the split's information gain is the full 1 bit).
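A minimal sketch of that calculation in Python, assuming a toy balanced binary label set at the root that a split separates into two pure leaves (the labels and function names are hypothetical, not from the original notes):

```python
from collections import Counter
import math

def label_entropy(labels):
    """Entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def information_gain(parent_labels, child_label_groups):
    """Parent entropy minus the size-weighted average entropy of the children."""
    total = len(parent_labels)
    child_entropy = sum(len(g) / total * label_entropy(g) for g in child_label_groups)
    return label_entropy(parent_labels) - child_entropy

# Balanced yes/no labels at the root: 1 bit of entropy.
root = ["yes", "yes", "no", "no"]
# A split that produces pure leaves means no surprise when sampling either leaf.
leaves = [["yes", "yes"], ["no", "no"]]

print(label_entropy(root))              # 1.0
print(information_gain(root, leaves))   # 1.0, i.e. the root-to-leaf entropy change is -1 bit
```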