DocumentCode :
589283
Title :
Graph Based Semi-supervised Non-negative Matrix Factorization for Document Clustering
Author :
Naiyang Guan ; Xuhui Huang ; Long Lan ; Zhigang Luo ; Xiang Zhang
Author_Institution :
Sch. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha, China
Volume :
1
fYear :
2012
fDate :
12-15 Dec. 2012
Firstpage :
404
Lastpage :
408
Abstract :
Non-negative matrix factorization (NMF) approximates a non-negative matrix by the product of two low-rank matrices and achieves good performance in clustering. Recently, semi-supervised NMF (SS-NMF) has further improved performance by incorporating the labels of a few samples into NMF. In this paper, we propose a novel graph-based SS-NMF (GSS-NMF). For each sample, GSS-NMF minimizes its distances to samples with the same label and maximizes its distances to samples with different labels, thereby incorporating the discriminative information. Since both labeled and unlabeled samples are embedded in the same reduced-dimensional space, the discriminative information from the labeled samples is transferred to the unlabeled samples, which greatly improves clustering performance. Since the traditional multiplicative update rule converges slowly, we apply the well-known projected gradient method to optimize GSS-NMF; the resulting algorithm can also be used to optimize other manifold-regularized NMF models efficiently. Experimental results on two popular document datasets, Reuters21578 and TDT-2, show that GSS-NMF outperforms representative SS-NMF algorithms.
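The abstract does not state the objective function, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes an objective of the form 0.5*||X - WH||_F^2 + 0.5*lam*tr(H L H^T), where L is a signed Laplacian built from the partial labels (weight +1 between samples with the same label, -1 between samples with different labels, 0 for unlabeled samples), and it alternates projected gradient steps with a simple backtracking rule. The names `label_graph_laplacian`, `projected_step`, `gss_nmf_sketch`, the parameter `lam`, and the convention that -1 marks an unlabeled sample are all hypothetical choices made for illustration.

```python
import numpy as np


def label_graph_laplacian(labels):
    """Signed Laplacian from partial labels (assumed form); -1 = unlabeled."""
    labels = np.asarray(labels)
    known = labels >= 0
    both = known[:, None] & known[None, :]
    same = both & (labels[:, None] == labels[None, :])
    diff = both & (labels[:, None] != labels[None, :])
    np.fill_diagonal(same, False)  # no self-edges
    # +1 pulls same-label pairs together, -1 pushes different-label pairs apart.
    A = same.astype(float) - diff.astype(float)
    return np.diag(A.sum(axis=1)) - A


def objective(X, W, H, L, lam):
    """Reconstruction error plus the assumed label-graph penalty."""
    R = X - W @ H
    return 0.5 * np.sum(R * R) + 0.5 * lam * np.trace(H @ L @ H.T)


def projected_step(f, Z, grad, step=1.0, shrink=0.5, max_tries=30):
    """One projected gradient step; the step is halved until f does not increase."""
    f0 = f(Z)
    for _ in range(max_tries):
        Z_new = np.maximum(Z - step * grad, 0.0)  # project onto Z >= 0
        if f(Z_new) <= f0:
            return Z_new
        step *= shrink
    return Z  # no acceptable step found; keep the current iterate


def gss_nmf_sketch(X, labels, rank, lam=0.1, n_iter=50, seed=0):
    """Alternate projected gradient updates of W (m x rank) and H (rank x n)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    L = label_graph_laplacian(labels)
    history = [objective(X, W, H, L, lam)]
    for _ in range(n_iter):
        grad_W = (W @ H - X) @ H.T
        W = projected_step(lambda Z: objective(X, Z, H, L, lam), W, grad_W)
        grad_H = W.T @ (W @ H - X) + lam * (H @ L)
        H = projected_step(lambda Z: objective(X, W, Z, L, lam), H, grad_H)
        history.append(objective(X, W, H, L, lam))
    return W, H, history
```

In this sketch the columns of X are samples (for example, term-frequency vectors of documents), and a cluster assignment for sample j could be read off as the index of the largest entry in column j of H. Because the assumed penalty is sign-indefinite, the objective is not guaranteed to be bounded below for large `lam`; a practical formulation would need to address that.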
Keywords :
approximation theory; document handling; gradient methods; graph theory; learning (artificial intelligence); matrix decomposition; minimisation; pattern clustering; GSS-NMF optimization; Reuters21578 document dataset; TDT-2 document dataset; dimensional space reduction; distance maximization; distance minimization; document clustering performance improvement; graph based SS-NMF; graph-based semisupervised nonnegative matrix factorization approximation; labeled samples; low-rank matrix product; manifold regularized NMF optimization; projected gradient method; unlabeled samples; Clustering algorithms; Gradient methods; Linear programming; Machine learning; Manifolds; Periodic structures; Semisupervised learning; manifold learning; non-negative matrix factorization; semi-supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications (ICMLA), 2012 11th International Conference on
Conference_Location :
Boca Raton, FL
Print_ISBN :
978-1-4673-4651-1
Type :
conf
DOI :
10.1109/ICMLA.2012.73
Filename :
6406696