• DocumentCode
    1984085
  • Title

    A novel technique for words reordering based on N-grams

  • Author

    Athanaselis, Theologos ; Bakamidis, Stelios ; Dologlou, Ioannis

  • Author_Institution
    Inst. for Language & Speech Process., Athens
  • fYear
    2007
  • fDate
    12-15 Feb. 2007
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper presents an approach for repairing word order errors in English text by reordering the words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method lies in its use of an efficient confusion matrix technique for reordering the words. To further reduce the number of permutations, the unigrams' probabilities are used. The comparative advantage of this method is that it works with a large set of words and avoids the laborious and costly process of collecting word order errors to create error patterns.
  • Keywords
    natural languages; probability; text analysis; English text; confusion matrix; language model; trigram hits; word order errors; words reordering; Computer errors; Internet; Machine learning algorithms; Natural languages; Probability; Search engines; Speech processing; Testing; Text recognition; Writing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on
  • Conference_Location
    Sharjah
  • Print_ISBN
    978-1-4244-0778-1
  • Electronic_ISBN
    978-1-4244-1779-8
  • Type

    conf

  • DOI
    10.1109/ISSPA.2007.4555284
  • Filename
    4555284
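
The core idea stated in the abstract (reorder the words and keep the version with the most trigram hits) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the confusion matrix technique or the unigram-based pruning, so the sketch below substitutes a plain exhaustive search over permutations. The names `trigram_hits`, `best_reordering`, and the toy trigram set `TOY_TRIGRAMS` are hypothetical and stand in for a real language model.

```python
from itertools import permutations


def trigram_hits(words, known_trigrams):
    """Count how many consecutive word triples appear in the known set."""
    return sum(
        tuple(words[i:i + 3]) in known_trigrams
        for i in range(len(words) - 2)
    )


def best_reordering(words, known_trigrams):
    """Return the permutation of `words` with the most trigram hits.

    Assumption: exhaustive search over all n! orderings, which is only
    practical for short sentences. The paper's confusion matrix and
    unigram pruning are not reproduced here.
    """
    return max(
        permutations(words),
        key=lambda candidate: trigram_hits(candidate, known_trigrams),
    )


# Hypothetical toy "language model": a set of trigrams assumed to be known.
TOY_TRIGRAMS = {
    ("the", "cat", "sat"),
    ("cat", "sat", "on"),
    ("sat", "on", "the"),
    ("on", "the", "mat"),
}

scrambled = ["cat", "the", "sat", "the", "on", "mat"]
repaired = best_reordering(scrambled, TOY_TRIGRAMS)
print(" ".join(repaired))
```

The factorial growth of the candidate set is what makes a pruning step necessary for longer sentences, which is the role the abstract assigns to the confusion matrix and unigram probabilities.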