Title :
Concept analysis and web clustering using combinatorial topology
Author :
Lin, T.Y. ; Sutojo, A. ; Hsu, J.-D.
Author_Institution :
Dept. of Comput. Sci., San Jose State Univ., CA
Abstract :
The collection of concepts discussed in a document set can be represented by a geometric structure from combinatorial topology called a simplicial complex. A simplex is a set of high-frequency keywords that co-occur closely; we believe each such set carries a concept in the document set. The collection of all these simplexes forms the simplicial complex, which represents the structure of the concepts. The documents are clustered based on the topological structure of this complex. Several clustering schemes are presented. Our initial experiments support the theory, as expected.
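The idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's actual clustering schemes, so everything below is an assumption made for illustration: "co-occurs closely" is simplified to "appears in the same document", "high-frequency" is a hypothetical `min_support` threshold, and documents are grouped by the maximal simplexes they contain. The function names are hypothetical as well. Because every subset of a frequent keyword set is itself frequent, the resulting family of sets is closed under taking faces, which is what makes it a simplicial complex.

```python
from itertools import combinations


def frequent_keyword_sets(docs, min_support):
    """Return every keyword set contained in at least min_support documents.

    Assumption: co-occurrence means "in the same document". The returned
    family is closed under subsets, so it forms a simplicial complex.
    """
    doc_sets = [frozenset(d) for d in docs]

    def support(keyword_set):
        return sum(1 for d in doc_sets if keyword_set <= d)

    vocabulary = sorted(set().union(*doc_sets))
    # 0-simplexes: single keywords that are frequent enough.
    level = [frozenset([w]) for w in vocabulary
             if support(frozenset([w])) >= min_support]
    simplexes = set(level)
    while level:
        # Join two k-element sets that differ in exactly one keyword.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c) >= min_support]
        simplexes.update(level)
    return simplexes


def maximal_simplexes(simplexes):
    """Keep only simplexes that are not a proper face of another simplex."""
    return [s for s in simplexes if not any(s < t for t in simplexes)]


def cluster_by_maximal_simplex(docs, min_support):
    """Map each maximal simplex to the indices of documents containing it.

    Assumption: one simple scheme among many; a document may fall into
    several clusters, or none.
    """
    maximal = maximal_simplexes(frequent_keyword_sets(docs, min_support))
    clusters = {s: [] for s in maximal}
    for index, doc in enumerate(docs):
        words = frozenset(doc)
        for s in maximal:
            if s <= words:
                clusters[s].append(index)
    return clusters
```

Calling `cluster_by_maximal_simplex(docs, min_support)` on a list of keyword lists returns a dictionary keyed by `frozenset` keyword sets. Grouping by maximal simplexes is only one possible reading of "clustering based on the topological structure"; other schemes, such as grouping by connected components of the complex, would fit the abstract equally well.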
Keywords :
Internet; combinatorial mathematics; data mining; vocabulary; Web clustering; combinatorial topology; concept analysis; high-frequency keyword set; Association rules; Clustering methods; Computer science; Data mining; Mathematical analysis; Matrix decomposition; Singular value decomposition; Spine; Topology; Web sites
Conference_Title :
Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2702-7
DOI :
10.1109/ICDMW.2006.48