Title :
Unsupervised fuzzy-membership estimation of terms in semantic and syntactic lexical classes
Author :
Portnoy, David ; Bock, Peter
Author_Institution :
Dept. of Comput. Sci., George Washington Univ., DC, USA
Abstract :
The objective of this research is to discover fuzzy semantic and syntactic relationships among English words at various levels of abstraction without using any other sources of semantic or syntactic reference information (e.g., dictionaries, lexicons, or grammar rules). An agglomerative clustering algorithm is applied to the co-occurrence space formed by subsets of target words and training words; its output is a set of semantic or syntactic classes. The fuzzy relationships (membership coefficients) between test words and the semantic and syntactic classes are estimated by the non-negative least-squares solution to the resulting system of linear equations. Experiments using raw text from 218 unrelated novels have yielded promising results. Larger and/or more narrowly focused training sets are expected to yield even better and more diverse results.
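The two stages named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the linkage rule, or how the linear system is built, so everything here is an assumption made for illustration: cosine distance, average linkage, class centroids as the columns of the system, a simple coordinate-descent solver for the non-negative least-squares step, and made-up toy co-occurrence counts. All function and variable names are hypothetical.

```python
# Illustrative sketch only: toy data and hypothetical names, not the paper's implementation.
from math import sqrt

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return 1.0 if nu == 0 or nv == 0 else 1.0 - dot / (nu * nv)

def agglomerate(vectors, n_classes):
    """Average-linkage agglomerative clustering; returns lists of row indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_classes:
        # Merge the closest pair of clusters (assumed merge criterion).
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def centroid(vectors, members):
    dims = len(vectors[0])
    return [sum(vectors[i][k] for i in members) / len(members) for k in range(dims)]

def nnls(columns, b, sweeps=500):
    """Minimise ||sum_j x[j] * columns[j] - b||^2 subject to x >= 0
    by cyclic coordinate descent (an assumed solver choice)."""
    x = [0.0] * len(columns)
    residual = list(b)  # b - A x, with x starting at zero
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            norm = sum(c * c for c in col)
            if norm == 0:
                continue
            step = sum(c * r for c, r in zip(col, residual)) / norm
            new_xj = max(0.0, x[j] + step)
            delta = new_xj - x[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                x[j] = new_xj
    return x

# Toy co-occurrence counts: rows are target words, columns are training words.
words = ["cat", "dog", "run", "walk"]
counts = [[8, 7, 0, 1], [7, 8, 1, 0], [0, 1, 9, 8], [1, 0, 8, 9]]

classes = agglomerate(counts, n_classes=2)
class_vectors = [centroid(counts, members) for members in classes]

# Membership coefficients of a test word in each discovered class.
test_word_counts = [1, 1, 6, 6]
memberships = nnls(class_vectors, test_word_counts)
for members, m in zip(classes, memberships):
    print([words[i] for i in members], round(m, 3))
```

On this toy data the clustering groups the noun-like rows and the verb-like rows separately, and the test word receives a much larger coefficient for the class whose co-occurrence profile it resembles. The non-negativity constraint keeps the coefficients interpretable as degrees of membership.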
Keywords :
abstracting; least squares approximations; pattern clustering; thesauri; word processing; English words; agglomerative clustering algorithm; co-occurrence space; linear equations; nonnegative least-squares solution; semantic class; syntactic lexical class; unsupervised fuzzy-membership estimation; Assembly; Clustering algorithms; Clustering methods; Computer science; Dictionaries; Equations; Frequency; Fuzzy sets; System testing; Thesauri;
Conference_Title :
Information Theory, 2004. ISIT 2004. Proceedings. International Symposium on
Print_ISBN :
0-7695-2250-5
DOI :
10.1109/AIPR.2004.48