Information Entropy
That transfer of information, from what we don’t know about the system to what we know, represents a change in entropy. Insight decreases the entropy of the system: gain information, reduce entropy. This is information gain. And yes, this type of entropy is subjective, in that it depends on what we know about the system at hand. (For what it’s worth, information gain is often used as a synonym for Kullback-Leibler divergence, which we explored briefly in this tutorial on restricted Boltzmann machines.)
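To make "get information, reduce entropy" concrete, here is a minimal sketch in Python. The die example and the helper names `entropy` and `kl_divergence` are illustrative assumptions, not taken from the quoted tutorial. We start knowing nothing about a fair six-sided die, then learn that the roll was even, and measure how much the entropy drops.

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence(p, q):
    """D_KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

# Hypothetical example: a fair die before and after learning "the roll is even".
prior = [1 / 6] * 6
posterior = [0, 1 / 3, 0, 1 / 3, 0, 1 / 3]

# Information gain as the drop in entropy from prior to posterior.
gain = entropy(prior) - entropy(posterior)

# KL divergence of the updated belief from the original one.
kl = kl_divergence(posterior, prior)

print(f"entropy before: {entropy(prior):.3f} bits")
print(f"entropy after:  {entropy(posterior):.3f} bits")
print(f"information gain: {gain:.3f} bits, KL divergence: {kl:.3f} bits")
```

In this toy case the entropy drop and the KL divergence agree, which is one way to read the "synonymous" remark. That agreement relies on the prior being uniform; with a non-uniform prior the two quantities need not match.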
So each principal component cutting through the scatterplot represents a decrease in the system’s entropy, in its unpredictability.
It so happens that explaining the shape of the data one principal component at a time, beginning with the component that accounts for the most variance, is similar to walking data through a decision tree. The first component of PCA, like the first if-then-else split in a properly formed decision tree, will be along the dimension that reduces unpredictability the most.
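The decision-tree half of that comparison can be sketched as follows. This is a toy illustration, not a full tree learner: the four-row dataset, the feature names `a` and `b`, and the helper names are all hypothetical. The first split is chosen as the feature whose split removes the most entropy from the labels.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    n = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / n * label_entropy(g) for g in groups.values())
    return label_entropy(labels) - remainder

# Hypothetical toy data: feature "a" determines the label, feature "b" is noise.
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 1},
]
labels = ["no", "no", "yes", "yes"]

gains = {f: information_gain(rows, labels, f) for f in ("a", "b")}
best = max(gains, key=gains.get)
print(gains, "-> first split on", best)
```

Splitting on `a` removes all of the uncertainty about the label, while splitting on `b` removes none, so a greedy tree would split on `a` first. The analogy to PCA is loose: PCA ranks directions by variance explained rather than by label entropy, but both start with the single cut that explains the most.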
KL divergence. An article I plan to translate:
https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained