• DocumentCode
    644004
  • Title
    Long sentence partitioning using top-down analysis for machine translation
  • Author
    Baosheng Yin ; Junjun Zuo ; Na Ye
  • Author_Institution
    Knowledge Eng. Res. Center, Shenyang Aerosp. Univ., Shenyang, China
  • Volume
    03
  • fYear
    2012
  • fDate
    Oct. 30 2012-Nov. 1 2012
  • Firstpage
    1425
  • Lastpage
    1429
  • Abstract
    Long sentence processing is an important part of English-Chinese machine translation systems, and system performance is directly affected by how correctly long sentences are processed. A basic approach to processing a long sentence is to partition it into short sub-sentences and then merge the sub-translations into the translation of the whole sentence. This paper presents a rule-based top-down partitioning algorithm. The rules are induced from sentence patterns and are expressed mainly as regular expressions. First, the algorithm reduces some sentence components to shorten the sentence; then coordinate sub-sentences are recognized and partitioned; finally, clauses within the sub-sentences are processed. The experiment shows an accuracy of approximately 85% and a recall of over 90% for the algorithm.
  • Keywords
    language translation; natural language processing; English-Chinese machine translation systems; long sentence partitioning; regular expressions; rule-based top-down partitioning algorithm; sentence patterns; short sub-sentences; top-down analysis; Algorithm design and analysis; Google; Partitioning algorithms; Pattern matching; Pragmatics; Speech; Speech recognition; machine translation; sentence pattern
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4673-1855-6
  • Type
    conf
  • DOI
    10.1109/CCIS.2012.6664620
  • Filename
    6664620
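
The three stages named in the abstract (component reduction, coordinate sub-sentence partitioning, clause processing) can be illustrated with a minimal Python sketch. This is not the authors' rule set, which this record does not give: the regular expressions and the names `PARENTHETICAL`, `COORDINATION`, `CLAUSE`, `reduce_components`, `split_nonempty`, and `partition` are all hypothetical and chosen only to show the top-down order of processing.

```python
import re

# Hypothetical rules for illustration only; the paper's actual rules
# are not listed in this record.
PARENTHETICAL = re.compile(r"\s*\([^()]*\)")                 # stage 1
COORDINATION = re.compile(r"\s*;\s*|,\s+(?=(?:and|but|or)\b)")  # stage 2
CLAUSE = re.compile(r",?\s+(?=(?:which|because|although|while)\b)")  # stage 3


def reduce_components(sentence):
    """Stage 1: shorten the sentence by dropping parenthetical material."""
    return PARENTHETICAL.sub("", sentence)


def split_nonempty(pattern, text):
    """Split text on pattern and discard empty or whitespace-only pieces."""
    return [part.strip() for part in pattern.split(text) if part and part.strip()]


def partition(sentence):
    """Apply the three stages top-down and return the resulting segments."""
    reduced = reduce_components(sentence)
    segments = []
    # Stage 2: split into coordinate sub-sentences.
    for sub in split_nonempty(COORDINATION, reduced):
        # Stage 3: split each sub-sentence at clause boundaries.
        segments.extend(split_nonempty(CLAUSE, sub))
    return segments


if __name__ == "__main__":
    example = ("The system (built in 2012) parses text, and it splits clauses "
               "because long inputs are hard; short ones are easy.")
    for segment in partition(example):
        print(segment)
```

Each stage works on the output of the previous one, so coarse boundaries (coordination) are fixed before finer ones (clauses) are considered. A real rule set would need many more patterns and some way to merge the sub-translations afterwards, which this sketch does not attempt.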