DocumentCode
424245
Title
Interpolated probabilistic tagging model optimized with genetic algorithm
Author
Wong, Fa ; Chao, Sam ; Hu, Dong-Cheng ; Mao, W-Hang
Author_Institution
Fac. of Sci. & Technol., Macao Univ., China
Volume
4
fYear
2004
fDate
26-29 Aug. 2004
Firstpage
2569
Abstract
We present results of probabilistic tagging of Portuguese texts to show how these techniques work for a highly morphologically ambiguous inflective language when only a limited corpus is available as the basic training source. To cope with the ambiguity problem caused by insufficient training data, especially unknown words, we incorporate lexical features into the probabilistic model. Unlike in other proposed tagging models, these features are introduced into the word probabilities by means of interpolation. A technique based on a genetic algorithm for determining the optimal set of interpolation parameters is described. Our preliminary results show that the tagging model correctly tags 91.8% of the sentences.
Keywords
genetic algorithms; interpolation; probability; text analysis; Portuguese texts; genetic algorithm; interpolated probabilistic tagging model; interpolation; word probability; Chaos; Genetic algorithms; Interpolation; Natural language processing; Natural languages; Probability; Speech; Statistical analysis; Tagging; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN
0-7803-8403-2
Type
conf
DOI
10.1109/ICMLC.2004.1382237
Filename
1382237