Title :
Topic Discovery in Research Literature Based on Non-negative Matrix Factorization and Testor Theory
Author :
Li, Fang ; Zhu, Qunxiong ; Lin, Xiaoyong
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Univ. of Chem. Technol., Beijing, China
Abstract :
This paper proposes a new approach to topic discovery that combines non-negative matrix factorization (NMF) with Testor theory. NMF is well suited to clustering high-dimensional document collections, while Testor theory is used to identify the topic of each cluster. The whole process is described in detail using an example of ten abstracts of Chinese scientific literature taken from magazines related to environmental science. Finally, a case study on the automatic classification of a conference proceedings (in Chinese) is presented. The results indicate that the overall method is effective.
Keywords :
classification; data mining; data reduction; literature; matrix decomposition; pattern clustering; text analysis; Chinese science literature; Testor theory; automatic conference proceeding classification; environmental science; high dimensional document clustering; nonnegative matrix factorization; research literature; text data dimensionality reduction; text mining; topic discovery; Abstracts; Chemical technology; Clustering algorithms; Clustering methods; Computer science; Conference proceedings; Information processing; Paper technology; Partitioning algorithms; Testing; Document Clustering; NMF; Term-Document Matrix; Testor theory; Topic Discovery;
Conference_Title :
Information Processing, 2009. APCIP 2009. Asia-Pacific Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-0-7695-3699-6
DOI :
10.1109/APCIP.2009.202