Title :
Automatic keyword extraction with relational clustering and Levenshtein distances
Author :
Runkler, Thomas A. ; Bezdek, James C.
Author_Institution :
Corp. Technol., Siemens AG, Munich, Germany
Abstract :
Alternating cluster estimation (ACE) is a generalized clustering model. Relational ACE is a modification of ACE that can be used to cluster data which do not possess a clear numerical representation, but for which a meaningful relation matrix can be defined. For text data sets we define (pairwise) relation matrices based on the Levenshtein string distance (1966). Relational ACE with Levenshtein distances is applied to four different texts. The cluster centers represent typical words in the texts, so this algorithm can be used to automatically determine keywords.
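The pairwise relation matrix mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard Levenshtein edit distance with unit costs for insertion, deletion, and substitution. The function names `levenshtein` and `relation_matrix` and the sample word list are hypothetical and not taken from the paper. The relational ACE clustering step itself is not shown.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to limit row length.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the current prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def relation_matrix(words: List[str]) -> List[List[int]]:
    """Build the symmetric matrix of pairwise Levenshtein distances."""
    n = len(words)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = levenshtein(words[i], words[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Hypothetical sample words, chosen only for illustration.
    sample = ["cluster", "clusters", "fuzzy"]
    for row in relation_matrix(sample):
        print(row)
```

A matrix of this kind is the sort of input a relational clustering method works on, since it needs only pairwise dissimilarities and no numerical feature vectors for the words.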
Keywords :
matrix algebra; pattern clustering; text analysis; ACE; Levenshtein distances; Levenshtein string distance; alternating cluster estimation; automatic keyword extraction; generalized clustering model; numerical representation; pairwise relation matrices; relation matrix; relational clustering; Clustering algorithms; Communications technology; Computer science; Data analysis; Data mining; Image sensors; Image sequence analysis; Optimization methods; Pixel; Prototypes
Conference_Title :
Fuzzy Systems, 2000. FUZZ IEEE 2000. The Ninth IEEE International Conference on
Conference_Location :
San Antonio, TX
Print_ISBN :
0-7803-5877-5
DOI :
10.1109/FUZZY.2000.839067