Title :
Kernel space for text analysis based on fuzzy neighborhoods
Author :
Miyamoto, Sadaaki ; Kawasaki, Yuichi
Author_Institution :
Dept. of Risk Eng., Tsukuba Univ., Tsukuba
Abstract :
A natural Euclidean space is defined on a set of texts regarded as sequences or hierarchical structures. Unlike the traditional term-document model, the present model takes into account the local topological structure of texts. Kernel functions are defined that enable the use of Euclidean spaces, so kernel-based methods of data analysis are applicable to the present model. Applications include agglomerative and c-means clustering as well as principal component analysis. Numerical examples are shown.
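The abstract's central step, that a kernel function makes Euclidean methods available, can be illustrated with a minimal sketch. It relies only on the standard identity d(x, y)^2 = K(x, x) - 2 K(x, y) + K(y, y) for the distance induced by a kernel. The kernel used here, `bow_kernel`, is a plain bag-of-words inner product chosen as a stand-in; it is an assumption for illustration and is not the fuzzy-neighborhood kernel defined in the paper. The function names and sample texts are likewise hypothetical.

```python
from collections import Counter
from math import sqrt


def bow_kernel(a, b):
    """Stand-in kernel (assumption): inner product of term-count vectors.

    This is NOT the paper's fuzzy-neighborhood kernel; it only serves to
    produce a valid positive semidefinite Gram matrix for the demo.
    """
    ca, cb = Counter(a.split()), Counter(b.split())
    return float(sum(ca[t] * cb[t] for t in ca))


def gram_matrix(texts, kernel):
    """Evaluate the kernel on every pair of texts."""
    return [[kernel(x, y) for y in texts] for x in texts]


def kernel_distance(K, i, j):
    """Euclidean distance between items i and j in the kernel-induced space."""
    d2 = K[i][i] - 2.0 * K[i][j] + K[j][j]
    return sqrt(max(d2, 0.0))  # clamp tiny negative values from rounding


# Hypothetical sample texts.
texts = [
    "fuzzy sets for text",
    "fuzzy sets for data",
    "kernel principal component analysis",
]
K = gram_matrix(texts, bow_kernel)
```

Distances obtained this way are the only input that agglomerative clustering needs, and c-means clustering and principal component analysis can likewise be expressed through the entries of `K`, which is why defining a kernel on texts is enough to apply those methods.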
Keywords :
data analysis; data mining; fuzzy set theory; geometry; text analysis; c-means clustering; fuzzy neighborhoods; hierarchical structures; kernel functions; kernel space; local topological structure; natural Euclidean space; principal component analysis; Equations; Fuzzy sets; Kernel; Support vector machine classification; Support vector machines; Text mining; Web pages
Conference_Title :
IEEE International Conference on Fuzzy Systems, 2008 (FUZZ-IEEE 2008), IEEE World Congress on Computational Intelligence
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1818-3
ISSN :
1098-7584
DOI :
10.1109/FUZZY.2008.4630452