DocumentCode
2348230
Title
Designing effective web mining-based techniques for OOV translation
Author
Yu, Haitao ; Ren, Fuji ; Huang, Degen ; Li, Lishuang
Author_Institution
Fac. of Eng., Univ. of Tokushima, Tokushima, Japan
fYear
2010
fDate
21-23 Aug. 2010
Firstpage
1
Lastpage
8
Abstract
Due to the limited coverage of existing bilingual dictionaries, it is often difficult to translate out-of-vocabulary (OOV) terms in many natural language processing tasks. In this paper, we propose a general three-step cascade mining technique that leverages the OOV category to optimize the effectiveness of each step. An OOV category-based expansion policy is suggested to retrieve more relevant mixed-language documents. An OOV category-based hybrid extraction approach is suggested to perform robust extraction. A more flexible model combination based on the OOV category is also suggested. Moreover, we conducted experiments to evaluate the effectiveness of each step and the overall performance of the mining technique. The experimental results show a significant performance improvement over existing methods.
Keywords
Internet; data mining; natural language processing; text analysis; OOV category based expansion policy; OOV translation; Web mining-based techniques; bilingual dictionary; out-of-vocabulary terms; Computational modeling; TV; CLIR; OOV category; Out-of-Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-6896-6
Type
conf
DOI
10.1109/NLPKE.2010.5587807
Filename
5587807