DocumentCode
3466682
Title
Clustering Using Feature Domain Similarity to Discover Word Senses for Adjectives
Author
Tomuro, Noriko ; Lytinen, Steven L. ; Kanzaki, Kyoko ; Isahara, Hitoshi
Author_Institution
DePaul Univ., Chicago
fYear
2007
fDate
17-19 Sept. 2007
Firstpage
370
Lastpage
377
Abstract
This paper presents a new clustering algorithm called DSCBC, which is designed to automatically discover word senses for polysemous words. DSCBC is an extension of CBC clustering (P. Pantel and D. Lin, 2002) that incorporates feature domain similarity: the similarity between the features themselves, obtained a priori from sources external to the dataset at hand. When polysemous words are clustered, words that have similar sense patterns are often grouped together, producing polysemous clusters: clusters in which features from several different domains are mixed. By incorporating feature domain similarity in clustering, DSCBC produces monosemous clusters, thereby discovering individual senses of polysemous words. In this work, we apply the algorithm to English adjectives and compare the discovered senses against WordNet. The results show significant improvements by our algorithm over other clustering algorithms, including CBC.
Keywords
natural language processing; pattern clustering; DSCBC; English adjectives; clustering algorithm; feature domain similarity; polysemous word clustering; word senses discovering; Algorithm design and analysis; Clustering algorithms; Communications technology; Computer science; Concrete; Frequency; Information systems; Natural language processing; Telecommunication computing; Thesauri
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantic Computing, 2007. ICSC 2007. International Conference on
Conference_Location
Irvine, CA
Print_ISBN
978-0-7695-2997-4
Type
conf
DOI
10.1109/ICSC.2007.72
Filename
4338371