DocumentCode :
390401
Title :
Learning Bayesian network classifiers from data with missing values
Author :
Zhang, Hongwei ; Lu, Yuchang
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Volume :
1
fYear :
2002
fDate :
28-31 Oct. 2002
Firstpage :
35
Abstract :
Learning accurate Bayesian network (BN) classifiers from complete databases is a very active research topic in data mining and machine learning. In practice, however, databases are rarely complete, which limits the real-world data mining applications of these classifiers. This paper investigates methods for learning four well-known types of Bayesian network classifiers from incomplete databases: Naive-Bayes, tree augmented Naive-Bayes, BN augmented Naive-Bayes, and general BN. The latter two are learned using dependency-analysis-based algorithms that work only under the assumption that the database is complete. To enable this kind of algorithm to handle missing data, this paper introduces a novel deterministic method to estimate the (conditional) mutual information from incomplete databases, which can be used to perform conditional independence (CI) tests, a fundamental step in dependency-analysis-based algorithms. The experimental results show that our algorithm is efficient and reliable.
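For orientation, a CI test based on conditional mutual information can be sketched as follows. This is only an illustrative sketch and not the deterministic method the paper introduces, which the abstract does not describe: it assumes discrete variables, marks missing values with `None`, and simply skips records with a missing value in any variable involved in the test (available-case analysis). The function names and the `threshold` value are hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(records, x, y, z=()):
    """Estimate I(X; Y | Z) in nats from a list of dict records.

    Assumption: records missing (None) any of the involved variables are
    skipped. This is plain available-case analysis, NOT the paper's method.
    """
    z = tuple(z)
    cols = (x, y) + z
    rows = [r for r in records if all(r.get(c) is not None for c in cols)]
    n = len(rows)
    if n == 0:
        return 0.0

    # Joint counts over (x, y, z) and the marginal counts derived from them.
    n_xyz = Counter((r[x], r[y], tuple(r[c] for c in z)) for r in rows)
    n_xz, n_yz, n_z = Counter(), Counter(), Counter()
    for (xv, yv, zv), c in n_xyz.items():
        n_xz[(xv, zv)] += c
        n_yz[(yv, zv)] += c
        n_z[zv] += c

    # I(X;Y|Z) = sum p(x,y,z) * log( p(x,y,z) p(z) / (p(x,z) p(y,z)) )
    cmi = 0.0
    for (xv, yv, zv), c in n_xyz.items():
        ratio = c * n_z[zv] / (n_xz[(xv, zv)] * n_yz[(yv, zv)])
        cmi += (c / n) * math.log(ratio)
    return cmi


def is_conditionally_independent(records, x, y, z=(), threshold=0.01):
    """Hypothetical CI test: treat X and Y as independent given Z when the
    estimated conditional mutual information falls below `threshold`."""
    return conditional_mutual_information(records, x, y, z) < threshold


if __name__ == "__main__":
    data = [
        {"A": 0, "B": 0, "C": 1},
        {"A": 1, "B": 1, "C": 0},
        {"A": 0, "B": 0, "C": None},  # still usable when C is not involved
        {"A": None, "B": 1, "C": 1},  # skipped in any test involving A
    ]
    print(conditional_mutual_information(data, "A", "B"))
    print(is_conditionally_independent(data, "A", "B"))
```

Skipping incomplete records is the simplest baseline; it discards information and can bias the estimate when values are not missing at random, which is the kind of limitation a dedicated estimator for incomplete databases is meant to address.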
Keywords :
belief networks; data mining; database management systems; BN augmented Naive-Bayes method; Bayesian network classifiers learning; Naive-Bayes method; complete databases; data mining; machine learning; tree augmented Naive-Bayes method; Algorithm design and analysis; Bayesian methods; Classification tree analysis; Data mining; Intelligent systems; Iterative algorithms; Laboratories; Machine learning; Spatial databases; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN :
0-7803-7490-8
Type :
conf
DOI :
10.1109/TENCON.2002.1180966
Filename :
1180966