DocumentCode :
3017514
Title :
Weighted naïve Bayes classifier on categorical features
Author :
Omura, K. ; Kudo, Motoi ; Endo, T. ; Murai, Takashi
Author_Institution :
Div. of Comput. Sci., Hokkaido Univ., Sapporo, Japan
fYear :
2012
fDate :
27-29 Nov. 2012
Firstpage :
865
Lastpage :
870
Abstract :
We increasingly face classification problems with many categorical features, as seen in genetic data and text data. In this paper, we discuss ways to assign weights to features in the framework of the naïve Bayes classifier, that is, under the assumption of feature independence. Because a categorical feature has no inherent order, we consider a histogram over its possible values (bins). Taking into account the difference in the number of samples falling into each bin, we propose two kinds of weights: 1) one derived from the probability that the majority class in the population remains the majority in the samples, and 2) one that reflects the expected conditional entropy. With the latter entropy weight, it is shown that more discriminative features gain higher weights and that the weight of a non-discriminative feature diminishes as the number of samples goes to infinity. We reveal the properties of these two kinds of weights through experiments on artificial data and some real-life data.
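As an illustration of the general idea (not the authors' exact formulation), the sketch below implements a naïve Bayes classifier on categorical features in which each feature's log-likelihood contribution is scaled by an entropy-based weight. The particular weight w_j = 1 - H(Y | X_j) / H(Y), the Laplace smoothing, and the handling of unseen values are assumptions made for this sketch only; the paper derives its own confidence and entropy weights.

```python
import numpy as np

def entropy(counts):
    """Shannon entropy (nats) of a count vector."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p)).sum()

class WeightedCategoricalNB:
    """Naive Bayes on categorical features with per-feature weights
    (illustrative entropy-based weighting, not the paper's exact scheme)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing parameter

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        n, d = X.shape
        self.priors_ = np.array([(y == c).mean() for c in self.classes_])
        h_y = entropy([(y == c).sum() for c in self.classes_])
        self.cond_, self.weights_ = [], np.zeros(d)
        for j in range(d):
            table = {}
            h_y_given_x = 0.0
            for v in np.unique(X[:, j]):
                mask = X[:, j] == v
                counts = np.array([(y[mask] == c).sum() for c in self.classes_])
                # P(x_j = v | y = c) with Laplace smoothing
                table[v] = (counts + self.alpha) / (mask.sum() + self.alpha * len(self.classes_))
                # accumulate P(x_j = v) * H(Y | x_j = v)
                h_y_given_x += mask.mean() * entropy(counts)
            self.cond_.append(table)
            # more discriminative features (low conditional entropy) get larger weights
            self.weights_[j] = 1.0 - h_y_given_x / h_y if h_y > 0 else 0.0
        return self

    def predict(self, X):
        X = np.asarray(X)
        preds = []
        for x in X:
            scores = np.log(self.priors_).copy()
            for j, v in enumerate(x):
                # unseen categories fall back to a uniform distribution
                probs = self.cond_[j].get(v, np.full(len(self.classes_), 1.0 / len(self.classes_)))
                scores += self.weights_[j] * np.log(probs)
            preds.append(self.classes_[np.argmax(scores)])
        return np.array(preds)
```

With weights fixed at 1 this reduces to the ordinary naïve Bayes classifier; scaling the per-feature log-likelihoods is one common way to realize the "feature shrinkage" effect mentioned in the keywords.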
Keywords :
belief networks; entropy; pattern classification; probability; artificial data; categorical features; discriminative features; entropy weight; expected conditional entropy; genetic data; histogram; nondiscriminative feature; probability; real-life data; text data; weighted naïve Bayes classifier; Accuracy; Entropy; Histograms; Intelligent systems; Reliability; Training; Training data; Categorical feature; Confidence weight; Entropy weight; Feature shrinkage; Naïve Bayes;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Intelligent Systems Design and Applications (ISDA), 2012 12th International Conference on
Conference_Location :
Kochi
ISSN :
2164-7143
Print_ISBN :
978-1-4673-5117-1
Type :
conf
DOI :
10.1109/ISDA.2012.6416651
Filename :
6416651