• DocumentCode
    600975
  • Title

    Optimal translation of English to Bahasa Indonesia using statistical machine translation system

  • Author

    Mantoro, Teddy ; Asian, J. ; Octavian, R. ; Ayu, M.A.

  • Author_Institution
    Adv. Inf. Sch., Univ. of Technol. Malaysia (UTM), Kuala Lumpur, Malaysia
  • fYear
    2013
  • fDate
    26-27 March 2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Finding an optimal translation is not an easy task, as it requires in-depth knowledge of the language to re-encode the meaning into the target language. This paper explores the translation process of a statistical machine translation system from English to Bahasa Indonesia by considering four weight variables, i.e., translation model, language model, distortion (reordering), and word penalty. This translation approach does not require in-depth knowledge of the linguistic properties of the languages. A well-behaved aligned parallel corpus is used as the training data for the machine translation system to increase the BLEU and NIST scores and obtain better-quality translations. One way to enhance the corpus is to increase the number of words and/or sentences it contains. In this study, a better evaluation score is achieved when we alter the weights of some translation parameters. Our study shows that both the weights and a well-behaved aligned parallel corpus play significant roles in improving translation quality, as reflected in higher NIST and BLEU scores. Our results show that this approach performs better than a popular Rule Based Machine Translation (RBMT) system.
  • Keywords
    language translation; natural language processing; statistical analysis; BLEU score; English-to-Bahasa Indonesia translation; NIST score; RBMT system; bilingual evaluation understudy; distortion variable; language knowledge; language model; linguistic property; parallel corpus; rule based machine translation; statistical machine translation system; translation model; word penalty; Bismuth; Context; Educational institutions; Hidden Markov models; NIST; Smoothing methods; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technology for the Muslim World (ICT4M), 2013 5th International Conference on
  • Conference_Location
    Rabat
  • Print_ISBN
    978-1-4799-0134-0
  • Type

    conf

  • DOI
    10.1109/ICT4M.2013.6518918
  • Filename
    6518918