DocumentCode
2319452
Title
Exploring multiple features for sense prediction of Chinese unknown words
Author
Wang, Chao-yue ; Zhao, Yan-qing ; Fu, Guo-hong
Author_Institution
Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
Volume
5
fYear
2012
fDate
15-17 July 2012
Firstpage
2031
Lastpage
2036
Abstract
Word sense disambiguation is a crucial problem in natural language processing. While sense disambiguation of in-vocabulary words has been well studied, few research findings are available on predicting the senses of unknown words. In this paper, we attempt to exploit multiple features for predicting the senses of Chinese out-of-vocabulary words in real text. To this end, we first take morphemes as the basic component units of Chinese words and investigate the relationship between the senses of Chinese unknown words and their internal morphological structures. Then, we explore both word-internal cues and word-external contextual features, and combine them for sense prediction of Chinese unknown words using maximum entropy modeling. Our experimental results show that incorporating multiple features, especially the word-internal morphological features, is of great value to Chinese unknown word sense prediction.
Keywords
natural language processing; text analysis; vocabulary; Chinese out-of-vocabulary words; Chinese unknown words; internal morphological structures; maximum entropy modeling; multiple features; sense prediction; word sense disambiguation; word-internal morphological features; Abstracts; Maximum entropy models; Morpheme features; Sense prediction; Word sense disambiguation
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics (ICMLC), 2012 International Conference on
Conference_Location
Xian
ISSN
2160-133X
Print_ISBN
978-1-4673-1484-8
Type
conf
DOI
10.1109/ICMLC.2012.6359688
Filename
6359688