An empirical study of the naive Bayes classifier

I. Rish

The naive Bayes classifier greatly simplifies learning by assuming that features are independent given the class. Although independence is generally a poor assumption, in practice naive Bayes often competes well with more sophisticated classifiers. Our broad goal is to understand the data characteristics which affect the performance of naive Bayes. Our approach uses Monte Carlo simulations that allow a systematic study of classification accuracy for several classes of randomly generated problems. We analyze the impact of the distribution entropy on the classification error, showing that low-entropy feature distributions yield good performance of naive Bayes. We also demonstrate that naive Bayes works well for certain nearly-functional feature dependencies, thus reaching its best performance in two opposite cases: completely independent features (as expected) and functionally dependent features (which is surprising). Another surprising result is that the accuracy of naive Bayes is not directly correlated with the degree of feature dependencies measured as the class-conditional mutual information between the features. Instead, a better predictor of naive Bayes accuracy is the amount of information about the class that is lost because of the independence assumption.
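The independence assumption the abstract describes can be made concrete with a small sketch. Under that assumption, the classifier scores each class c as P(c) · Π_i P(x_i | c) and picks the highest-scoring class. The following minimal categorical naive Bayes (with Laplace smoothing) is an illustrative implementation, not code from the paper; all names and the toy data are invented for the example.

```python
from collections import Counter, defaultdict
import math

def train(X, y, alpha=1.0):
    """Estimate class priors and per-feature conditional counts.

    alpha is the Laplace smoothing constant. Illustrative sketch only.
    """
    class_counts = Counter(y)
    # counts[c][i][v] = number of training rows of class c whose feature i == v
    counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values observed for each feature
    for xs, c in zip(X, y):
        for i, v in enumerate(xs):
            counts[c][i][v] += 1
            values[i].add(v)
    return class_counts, counts, values, alpha

def predict(model, xs):
    """Return argmax_c log P(c) + sum_i log P(x_i | c)."""
    class_counts, counts, values, alpha = model
    n = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for c, nc in class_counts.items():
        lp = math.log(nc / n)  # log prior
        for i, v in enumerate(xs):
            # Independence assumption: per-feature log-likelihoods simply add.
            num = counts[c][i][v] + alpha
            den = nc + alpha * len(values[i])
            lp += math.log(num / den)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Toy data (invented): weather features -> whether it rained later that day.
X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
model = train(X, y)
print(predict(model, ("rain", "hot")))  # -> yes
```

The paper's point is visible even here: the product over features is exact only when the features really are class-conditionally independent, yet the argmax can remain correct far outside that case.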
Published in 2001.