DocumentCode :
2474923
Title :
A novel features partition algorithm for semi-supervised categorization
Author :
Tang, HuanLing ; Lin, Zhengkui ; Lu, Mingyu ; Liu, Na
Author_Institution :
Coll. of Inf. & Sci. Tech., Dalian Maritime Univ., Dalian
fYear :
2008
fDate :
25-27 June 2008
Firstpage :
129
Lastpage :
134
Abstract :
This paper investigates the co-training algorithm and its assumption that the feature set can be split into two compatible and independent views. In real-world applications this assumption is usually violated to some degree, especially the independence requirement, and sometimes no natural feature split exists. A novel feature partition algorithm, named Partition-MID, is therefore proposed. We give formulas to estimate the mutual independence between two features, between a feature and a sub-view, and between two sub-views. Features with weaker mutual independence are then placed in the same sub-view, whereas those with stronger mutual independence are placed in separate sub-views. Theoretical proof and experimental results both show that the proposed partition method can effectively split the feature set into two sub-views with higher independence. Based on Partition-MID, a new semi-supervised categorization algorithm, named SC-PMID, is developed. By utilizing unlabeled data together with labeled data, the SC-PMID algorithm can significantly improve classification precision, especially when labeled data is sparse.
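The partitioning idea in the abstract can be illustrated with a minimal sketch. This record does not reproduce the paper's formulas, so the code below is not Partition-MID itself: it assumes empirical mutual information as a stand-in dependence measure and a simple greedy assignment, and the names `mutual_information` and `partition_features` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences.

    Assumption: used here as a stand-in dependence score, not the paper's formula.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def partition_features(columns):
    """Split feature indices into two views; columns[i] holds the values of feature i."""
    k = len(columns)
    dep = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        dep[i][j] = dep[j][i] = mutual_information(columns[i], columns[j])

    # Seed the two views with the least dependent (most independent) pair.
    a, b = min(combinations(range(k), 2), key=lambda p: dep[p[0]][p[1]])
    views = ([a], [b])

    def avg_dep(f, view):
        return sum(dep[f][g] for g in view) / len(view)

    for f in range(k):
        if f in (a, b):
            continue
        # A feature joins the view it depends on more, so that
        # strongly dependent features end up in the same view.
        target = 0 if avg_dep(f, views[0]) >= avg_dep(f, views[1]) else 1
        views[target].append(f)
    return sorted(views[0]), sorted(views[1])


# Toy data: features 0 and 1 are identical, features 2 and 3 are identical,
# and the two pairs are statistically independent of each other.
columns = [
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
]
print(partition_features(columns))  # ([0, 1], [2, 3])
```

On the toy data the duplicated features land in the same view and the independent pairs are separated, which mirrors the grouping rule stated in the abstract. A co-training step such as SC-PMID would then train one classifier per view; that part is not sketched here.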
Keywords :
document handling; learning (artificial intelligence); cotraining algorithm; features partition algorithm; mutual independence; partition-MID; semisupervised categorization; Automation; Classification algorithms; Constraint theory; Educational institutions; Information science; Intelligent control; Mutual information; Partitioning algorithms; Semisupervised learning; Statistics; Categorization; Mutual independence; Semi-supervised Learning; labeled data; unlabeled data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Control and Automation, 2008. WCICA 2008. 7th World Congress on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-2113-8
Electronic_ISBN :
978-1-4244-2114-5
Type :
conf
DOI :
10.1109/WCICA.2008.4592911
Filename :
4592911