DocumentCode
3079840
Title
Preprocessors in NLP applications: In the context of English to Malayalam Machine Translation
Author
Sunil, R. ; Jayan, V. ; Bhadran, V.K.
Author_Institution
Language Technol. Centre, Centre for Dev. of Adv. Comput. (C-DAC), Trivandrum, India
fYear
2012
fDate
7-9 Dec. 2012
Firstpage
221
Lastpage
226
Abstract
Preprocessing the input text is an essential component of a Natural Language Processing (NLP) system. We discuss the relevance of preprocessors in the context of a Machine Translation system that we developed based on AnglaBharati technology. Text submitted for translation often contains special formats, and translating them appropriately is difficult. Such formats may lack a definite grammatical structure and therefore cannot always be handled by a language rule. This paper presents a strategy to identify special formats in English text, such as dates, currency, numbers, times, quotes, acronyms, and parentheses, for a rule-based English-Malayalam Machine Aided Translation system. AnglaBharati is a pattern-directed, rule-based system with a context-free-grammar-like structure for English, which generates a pseudo target for a group of Indian languages. The preprocessor is one of the main modules in this translation system: it manipulates the English input text to produce input better suited for the engine to generate an appropriate translation. Extensive research has been carried out in this area to disambiguate and process the input text so as to obtain more suitable output from the translation engine.
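The abstract does not give the preprocessor's actual rules, so the following is only a minimal sketch of how special formats such as dates, times, currency, acronyms, numbers, quotes, and parentheses might be identified in English text before translation. The function name `tag_special_formats`, the tag labels, and every regular expression are assumptions made for illustration; they are not taken from the AnglaBharati system.

```python
import re

# Hypothetical patterns (not from the paper). Order matters: more specific
# formats are listed first so they win over the plain NUMBER pattern.
PATTERNS = [
    ("DATE", r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"),
    ("TIME", r"\b\d{1,2}:\d{2}(?:\s?[AaPp][Mm]\b)?"),
    ("CURRENCY", r"(?:Rs\.?|\$)\s?\d+(?:,\d{3})*(?:\.\d+)?"),
    ("ACRONYM", r"\b(?:[A-Z]\.){2,}|\b[A-Z]{2,}\b"),
    ("NUMBER", r"\b\d+(?:,\d{3})*(?:\.\d+)?\b"),
    ("QUOTE", r"\"[^\"]*\""),
    ("PAREN", r"\([^()]*\)"),
]

# One combined regex with a named group per format.
SPECIAL = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PATTERNS))


def tag_special_formats(text):
    """Return (tag, matched_text) pairs in left-to-right order."""
    return [(m.lastgroup, m.group()) for m in SPECIAL.finditer(text)]


if __name__ == "__main__":
    sample = "The meeting on 12/05/2012 at 10:30 AM cost Rs. 500 (approx)."
    for tag, span in tag_special_formats(sample):
        print(tag, repr(span))
```

In a real pipeline the tagged spans would presumably be replaced with placeholders or normalized forms before being passed to the translation engine; that step is left out here because the abstract does not describe it.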
Keywords
language translation; AnglaBharati technology; English; Malayalam; NLP applications; Natural Language Processing system; machine translation; preprocessors; pseudo target; Context; Data preprocessing; Engines; Helium; Knowledge based systems; Natural language processing; Terminology; AnglaBharati; preprocessor
fLanguage
English
Publisher
IEEE
Conference_Titel
India Conference (INDICON), 2012 Annual IEEE
Conference_Location
Kochi
Print_ISBN
978-1-4673-2270-6
Type
conf
DOI
10.1109/INDCON.2012.6420619
Filename
6420619