Title :
Cost-sensitive semi-supervised classification using CS-EM
Author :
Qin, Zhenxing ; Zhang, Shichao ; Liu, Li ; Wang, Tao
Author_Institution :
Fac. of Inf. Technol., Univ. of Technol. Sydney, Sydney, NSW
Abstract :
In many real-world data mining and classification tasks, we face the problem of the high cost of building training data sets. In addition, in many domains, different misclassification errors incur different costs. These two issues are usually addressed separately, by semi-supervised learning and cost-sensitive learning respectively, yet they can arise at the same time in real-world applications. However, existing semi-supervised learning algorithms do not consider misclassification costs. In this paper, we propose a simple and novel method, CS-EM, for learning a cost-sensitive classifier using both labeled and unlabeled training data. CS-EM modifies EM, a popular semi-supervised learning algorithm, by incorporating misclassification costs into the probability estimation process. Our experiments show that CS-EM outperforms two competing methods on three benchmark text data sets across different cost ratios.
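The abstract does not give the CS-EM update itself, so the following is not the authors' algorithm. It is only a minimal sketch of the general cost-sensitive idea the abstract relies on: given class probability estimates and a cost matrix, choose the prediction with the lowest expected misclassification cost. The function name, the cost values, and the probabilities are all hypothetical and chosen for illustration.

```python
def min_expected_cost_class(posteriors, cost):
    """Return the index of the prediction with the lowest expected cost.

    posteriors[j] is an estimated probability that the true class is j.
    cost[i][j] is the (assumed) cost of predicting class i when the
    true class is j; correct predictions usually cost 0.
    """
    n_classes = len(cost)
    # Expected cost of each candidate prediction i, averaged over true classes j.
    expected = [
        sum(posteriors[j] * cost[i][j] for j in range(len(posteriors)))
        for i in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


# Hypothetical two-class cost matrix: missing class 1 costs 10,
# while a false alarm for class 1 costs 1.
cost = [[0, 10],
        [1, 0]]

# Class 0 is more probable, but predicting class 1 has the lower expected cost
# (0.8 versus 2.0), so the cost-sensitive choice is class 1.
print(min_expected_cost_class([0.8, 0.2], cost))  # prints 1
```

With equal costs for both error types, the same function reduces to picking the most probable class, which is why unequal cost ratios can change the predicted label.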
Keywords :
data mining; learning (artificial intelligence); pattern classification; cost-sensitive classifier; cost-sensitive semisupervised classification; data classification tasks; data mining; misclassification costs; probability estimation process; semisupervised learning algorithms; training data sets; Australia; Computer science; Costs; Data mining; Information technology; Logic; Machine learning; Semisupervised learning; Testing; Training data;
Conference_Title :
Computer and Information Technology, 2008. CIT 2008. 8th IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-2357-6
Electronic_ISBN :
978-1-4244-2358-3
DOI :
10.1109/CIT.2008.4594662