Title :
An unsupervised & statistical word sense tagging using bilingual sources
Author :
Oliveira, Francisco ; Wong, Fai ; Li, Yi-ping
Author_Institution :
Fac. of Sci. & Technol., Univ. of Macau, Macau
Abstract :
This paper presents an approach for choosing the correct translation of an ambiguous word in a given sentence. Unsupervised learning is applied, and a non-aligned Portuguese-to-Chinese bilingual corpus is used to disambiguate word senses. Relationships between words are identified by considering each word's surrounding words and their relative distances, which helps capture syntactic relationships. All related words are then translated into the target language to find the correct senses of the ambiguous words. Selection is based on a statistical and mathematical model that assigns a score to each of the senses identified previously. After all the senses have been discovered, their semantic and syntactic information is converted into a set of rules and stored in a database for later use in the disambiguation process. Preliminary experimental results show that the proposed method achieves an improvement of 6% over the baseline method in assigning the correct translation.
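The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea it describes: collect the words around the ambiguous word, translate them into the target language, and score each candidate sense. It assumes that context words are weighted by the inverse of their distance and that target-language co-occurrence counts are available. The function names, the 1/distance weighting, the toy dictionary, and the counts are all hypothetical illustrations, not the authors' model or data.

```python
from collections import defaultdict


def context_window(tokens, index, size=3):
    """Return (word, distance) pairs for words within `size` positions of tokens[index]."""
    pairs = []
    start = max(0, index - size)
    end = min(len(tokens), index + size + 1)
    for j in range(start, end):
        if j != index:
            pairs.append((tokens[j], abs(j - index)))
    return pairs


def score_senses(tokens, index, candidates, dictionary, cooccurrence, size=3):
    """Score each candidate translation of tokens[index].

    Assumption: each translated context word adds its co-occurrence count
    with the candidate, divided by its distance from the ambiguous word.
    """
    scores = defaultdict(float)
    for word, distance in context_window(tokens, index, size):
        for translated in dictionary.get(word, []):
            for sense in candidates:
                count = cooccurrence.get((sense, translated), 0)
                scores[sense] += count / distance
    return dict(scores)


def pick_sense(tokens, index, candidates, dictionary, cooccurrence, size=3):
    """Return the candidate translation with the highest score."""
    scores = score_senses(tokens, index, candidates, dictionary, cooccurrence, size)
    return max(candidates, key=lambda s: scores.get(s, 0.0))


# Hypothetical toy data: Portuguese "banco" can mean a bank or a bench.
tokens = ["ele", "depositou", "dinheiro", "no", "banco"]
candidates = ["银行", "长凳"]  # "bank", "bench"
dictionary = {"depositou": ["存"], "dinheiro": ["钱"]}  # made-up bilingual entries
cooccurrence = {("银行", "钱"): 12, ("银行", "存"): 8, ("长凳", "钱"): 1}  # made-up counts

best = pick_sense(tokens, 4, candidates, dictionary, cooccurrence)
```

With this toy data the financial sense outscores the bench sense because "dinheiro" and "depositou" translate into words that co-occur with it more often. A real system would estimate the counts from the target-language side of the corpus and could use a different distance weighting.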
Keywords :
dictionaries; language translation; linguistics; statistical analysis; unsupervised learning; word processing; ambiguous word translation; bilingual dictionary; machine translation; nonaligned bilingual Portuguese-to-Chinese bilingual corpus; statistical word sense tagging; Costs; Databases; Flip-flops; Labeling; Mathematical model; Natural language processing; Natural languages; Tagging; Word Sense Tagging
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527592