Title :
A web text classification technique for unlabeled training samples
Author :
Francois Tchiegue;Rui Li;Shilong Ma
Author_Institution :
State Key Lab. Of Software Development Environment, School of Computer Science &
Abstract :
Text classification is commonly performed with supervised learning algorithms, which build classifiers by learning from labeled training samples. In practice, however, class-labeled samples are very costly to acquire, because manually labeling documents requires a great deal of time and effort from experts. This cost severely restricts text classification. To address the difficulty of obtaining labeled texts from the Internet, this paper proposes a text classification method that combines the Fuzzy Partition Clustering Method (FPCM) with Naive Bayesian Augment Learning, integrating the unsupervised nature of clustering with prior knowledge of the samples. The method overcomes the bottleneck of unlabeled training sets in text classification, further improves classification performance by estimating the classification error loss to balance sample selection, and yields an improved classification learning method.
Keywords :
"Training","Clustering methods","Bayes methods","Text categorization","Clustering algorithms","Classification algorithms","Data models"
Conference_Title :
Software Engineering and Service Science (ICSESS), 2015 6th IEEE International Conference on
Print_ISBN :
978-1-4799-8352-0
Electronic_ISBN :
2327-0594
DOI :
10.1109/ICSESS.2015.7339091