Title :
Tree Cluster of Text Data by NMF Based Neural Network
Author :
Barman, Paresh Chandra ; Lee, Soo-Young
Author_Institution :
Dept. of BioSystems, Korea Adv. Inst. of Sci. & Technol., Daejeon
Abstract :
This paper proposes tree clustering of text documents using a non-negative matrix factorization (NMF) based neural network. The main problems in tree clustering are finding the number of branches of each parent node and finding the significant attributes that can cluster the documents at each node. If the exact number of branches is not known in advance, tree clustering cannot be applied. The paper proposes a min-max correlation coefficient method for finding the number of branches. To find the significant attributes for clustering the documents, the authors reduce the dimension of the NMF basis vectors according to their probability and remove the common terms from the NMF basis vectors. The approach is also helpful for prepruning the tree. To justify the approach, the authors use the CLASSIC3 text database.
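As a rough illustration of the NMF step the abstract builds on, the sketch below factorizes a small term-by-document matrix V into non-negative factors W (basis vectors over terms) and H (document coefficients), then assigns each document to the basis vector with the largest coefficient. It is a minimal sketch under stated assumptions, not the authors' method: it uses the standard multiplicative update rules rather than a neural network formulation, the number of branches k is passed in by hand instead of being chosen by the min-max correlation coefficient method, and no basis-vector reduction or common-term removal is performed. The function names `nmf` and `assign_clusters` and the toy matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(r) for r in zip(*a)]


def nmf(v, k, iters=500, seed=0, eps=1e-9):
    """Approximate v (terms x docs) as w @ h with non-negative factors.

    Uses the standard multiplicative updates for the squared error;
    k is an assumed, user-supplied number of branches.
    """
    rng = random.Random(seed)
    n_terms, n_docs = len(v), len(v[0])
    # Strictly positive random initialization.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_terms)]
    h = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_docs)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_terms)]
    return w, h


def assign_clusters(h):
    """Label each document with the index of its largest coefficient in h."""
    return [max(range(len(h)), key=lambda c: h[c][d]) for d in range(len(h[0]))]


# Toy term-by-document counts: documents 0-1 share terms 0-1,
# documents 2-3 share terms 2-3 (illustrative data only).
toy_v = [
    [2.0, 2.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 3.0, 3.0],
]
toy_w, toy_h = nmf(toy_v, k=2)
toy_labels = assign_clusters(toy_h)
```

In a tree setting, the same factorization would be applied again to the documents of each child node, with k chosen per node; how that choice is made is the subject of the paper and is not reproduced here.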
Keywords :
matrix decomposition; minimax techniques; neural nets; probability; trees (mathematics); CLASSIC3 text database; min-max correlation coefficient method; neural network; nonnegative matrix factorization; probability; significant attributes; text data; text documents; tree clustering; Biochemical analysis; Clustering algorithms; Computer networks; Data analysis; Data engineering; Data mining; Electronic mail; Image analysis; Neural networks; Tree data structures;
Conference_Title :
Electrical and Computer Engineering, 2006. ICECE '06. International Conference on
Conference_Location :
Dhaka
Print_ISBN :
98432-3814-1
DOI :
10.1109/ICECE.2006.355634