Title :
Concept Mining using Association Rules and Combinatorial Topology
Author_Institution :
San Jose State Univ., San Jose
Abstract :
The collection of concepts in a document set can be represented by a geometric structure from combinatorial topology called a simplicial complex, in which each keyword is a vertex and each relation between keywords is a simplex. A simplex consisting of more than one keyword is a high-frequency keywordset: its keywords occur close to each other within a document, and this co-occurrence recurs frequently across the document set. The frequent co-occurrence of these keywords indicates a relation between them, and these relations carry concepts. The relations can be captured by association rule mining and represented as simplices. The collection of all these simplices represents the structure of concepts within the document set. Based on this topology, documents are clustered, and the collection of simplices can serve as a document index.
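The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each document has already been reduced to a list of keywords, that support is counted per document (keyword proximity within a document is ignored), and that a keywordset is "high-frequency" when it appears in at least `min_support` documents. The function names, the `max_size` cap, and the sample documents are all hypothetical. Every subset of a frequent keywordset is itself frequent, so the frequent keywordsets are closed under taking subsets and therefore form a simplicial complex.

```python
from itertools import combinations


def frequent_keyword_sets(docs, min_support, max_size=3):
    """Count keywordsets of up to max_size keywords and keep the frequent ones.

    Returns a dict mapping frozenset(keywords) -> number of documents
    containing all of those keywords. Brute-force enumeration, used here
    only for illustration on small inputs.
    """
    counts = {}
    for doc in docs:
        keywords = sorted(set(doc))  # each keyword counts once per document
        for size in range(1, max_size + 1):
            for combo in combinations(keywords, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


def maximal_simplices(simplices):
    """Return the simplices that are not a proper subset of another simplex."""
    return [s for s in simplices if not any(s < t for t in simplices)]


# Hypothetical toy corpus: each document is a list of keywords.
docs = [
    ["wall", "street", "stock"],
    ["wall", "street", "bank"],
    ["wall", "street", "stock", "market"],
    ["white", "house"],
    ["white", "house", "press"],
]

# Keys are the simplices of the complex; values are their support counts.
simplicial_complex = frequent_keyword_sets(docs, min_support=2)
maximal = maximal_simplices(simplicial_complex)

for simplex in sorted(maximal, key=lambda s: sorted(s)):
    print(sorted(simplex), simplicial_complex[simplex])
```

In this sketch the maximal simplices play the role of candidate concepts, and a mapping from each simplex to the documents that contain it could serve as a simple index. How the paper actually clusters documents is not specified in the abstract, so that step is left out.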
Keywords :
data mining; topology; association rules; combinatorial topology; concept mining; document index; geometric structure; high-frequency keywordset; simplicial complex; Databases; Frequency measurement; USA Councils
Conference_Title :
Granular Computing, 2007. GRC 2007. IEEE International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-0-7695-3032-1
DOI :
10.1109/GrC.2007.154