DocumentCode :
3165335
Title :
Incorporating User Provided Constraints into Document Clustering
Author :
Chen, Yanhua ; Rege, Manjeet ; Dong, Ming ; Hua, Jing
Author_Institution :
Wayne State Univ., Detroit
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
103
Lastpage :
112
Abstract :
Document clustering without any prior knowledge or background information is a challenging problem. In this paper, we propose SS-NMF: a semi-supervised non-negative matrix factorization framework for document clustering. In SS-NMF, users are able to provide supervision for document clustering in terms of pairwise constraints on a few documents specifying whether they "must" or "cannot" be clustered together. Through an iterative algorithm, we perform symmetric tri-factorization of the document-document similarity matrix to infer the document clusters. Theoretically, we show that SS-NMF provides a general framework for semi-supervised clustering and that existing approaches can be considered as special cases of SS-NMF. Through extensive experiments conducted on publicly available data sets, we demonstrate the superior performance of SS-NMF for clustering documents.
Keywords :
document handling; matrix decomposition; pattern clustering; document clustering; pairwise constraints; semisupervised nonnegative matrix factorization; user provided constraints; Clustering algorithms; Clustering methods; Computer graphics; Couplings; Data mining; Databases; Iterative algorithms; Machine vision; Pattern recognition; Symmetric matrices;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.67
Filename :
4470234
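Illustrative sketch :
The abstract describes folding pairwise "must" and "cannot" constraints into a document-document similarity matrix and then factorizing it symmetrically into three factors. The abstract does not give the update rules or the constraint weighting, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm: the function names, the `weight` parameter, the clipping at zero, and the square-root-damped multiplicative updates are all hypothetical choices made for illustration.

```python
import numpy as np


def constrained_similarity(A, must_link, cannot_link, weight=1.0):
    """Return a copy of similarity matrix A adjusted by pairwise constraints.

    Assumption: must-link pairs are rewarded and cannot-link pairs are
    penalized by a fixed `weight`; the paper's actual weighting may differ.
    """
    A_tilde = np.array(A, dtype=float)  # copies, so A is left untouched
    for i, j in must_link:
        A_tilde[i, j] += weight
        A_tilde[j, i] += weight
    for i, j in cannot_link:
        A_tilde[i, j] -= weight
        A_tilde[j, i] -= weight
    # Assumption: clip at zero so the factors below can stay non-negative.
    return np.maximum(A_tilde, 0.0)


def symmetric_tri_nmf(A, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate A by G @ S @ G.T with non-negative G (n x k) and S (k x k).

    Uses generic square-root-damped multiplicative updates, a common
    heuristic for this objective; no convergence guarantee is claimed.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    G = rng.random((n, k)) + eps
    S = rng.random((k, k)) + eps
    S = (S + S.T) / 2.0  # start from a symmetric middle factor
    for _ in range(n_iter):
        # Update the cluster-indicator factor G.
        G *= np.sqrt((A @ G @ S) / (G @ S @ (G.T @ G) @ S + eps))
        # Update the cluster-association factor S.
        GtG = G.T @ G
        S *= np.sqrt((G.T @ A @ G) / (GtG @ S @ GtG + eps))
    return G, S


def cluster_labels(G):
    """Assign each document to the cluster with the largest entry in its row."""
    return G.argmax(axis=1)
```

A possible usage, assuming `A` is a symmetric non-negative similarity matrix: build `A_tilde = constrained_similarity(A, [(0, 1)], [(0, 3)])`, run `G, S = symmetric_tri_nmf(A_tilde, k=2)`, and read cluster assignments from `cluster_labels(G)`.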