• DocumentCode
    3622277
  • Title

    Turkish Word Error Detection Using Syllable Bigram Statistics

  • Author

    Gunel; Asliyan

  • Author_Institution
    Bilgisayar Mü
  • fYear
    2006
  • fDate
    2006
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    In this study, we design and implement a system that uses an n-gram statistical language model to support optical character recognition, speech synthesis, and speech recognition systems. First, syllable bigram frequencies are extracted from Turkish corpora. Then a test database containing both correctly and wrongly written words is created. The probability of each word appearing in the given text is calculated, and the wrongly and correctly written words are determined. With the proposed approach, the system detects about 86.13% of the wrongly written words and about 88.32% of the correctly written words.
  • Keywords
    "Error analysis","Character recognition","Optical design","Natural languages","Optical character recognition software","Speech synthesis","Speech recognition","Frequency","Testing","Databases"
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Communications Applications, 2006 IEEE 14th
  • ISSN
    2165-0608
  • Print_ISBN
    1-4244-0238-7
  • Type

    conf

  • DOI
    10.1109/SIU.2006.1659786
  • Filename
    1659786
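
  • Illustrative_Sketch
    The abstract outlines a pipeline: split words into syllables, collect syllable bigram frequencies from a corpus, then score each word by its bigram probability. The record does not give the authors' implementation, so the following is only a minimal sketch of that idea under stated assumptions. The rule-based syllabifier, the add-one smoothing, the word-boundary markers, the length normalization, and every function name and the threshold parameter are hypothetical choices made for illustration and are not taken from the paper. Input is assumed to be lowercase Turkish.

```python
import math
from collections import Counter

# Assumption: lowercase Turkish vowels, including circumflexed forms.
VOWELS = set("aeıioöuüâîû")
START, END = "<w>", "</w>"  # hypothetical word-boundary markers


def syllabify(word):
    """Rule-of-thumb Turkish syllabifier: every syllable has one vowel,
    and a single consonant directly before a vowel opens that syllable."""
    syllables = []
    end = len(word)
    i = len(word) - 1
    while i >= 0:
        if word[i] in VOWELS:
            # Pull in the preceding consonant, if there is one.
            start = i - 1 if i > 0 and word[i - 1] not in VOWELS else i
            syllables.append(word[start:end])
            end = start
            i = start - 1
        else:
            i -= 1
    if end > 0:
        # Leftover leading consonants (e.g. "tr" in "tren") join the first syllable.
        if syllables:
            syllables[-1] = word[:end] + syllables[-1]
        else:
            syllables.append(word[:end])
    return syllables[::-1]


def train(words):
    """Count syllable unigrams and bigrams over a list of corpus words."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for w in words:
        seq = [START] + syllabify(w) + [END]
        unigrams.update(seq[:-1])          # contexts only
        bigrams.update(zip(seq, seq[1:]))
        vocab.update(seq[1:])
    return {"unigrams": unigrams, "bigrams": bigrams, "vocab_size": len(vocab)}


def log_prob(word, model):
    """Average log-probability per syllable transition, add-one smoothed.
    Smoothing and normalization are assumptions, not the paper's method."""
    seq = [START] + syllabify(word) + [END]
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        num = model["bigrams"][(prev, cur)] + 1
        den = model["unigrams"][prev] + model["vocab_size"]
        total += math.log(num / den)
    return total / (len(seq) - 1)


def is_suspect(word, model, threshold):
    """Flag a word as possibly misspelled when its score falls below a
    threshold; the threshold would have to be tuned on held-out data."""
    return log_prob(word, model) < threshold


model = train(["kitap", "kitaplar", "okul", "okullar", "kalem"])
```

    With this toy corpus, a word whose syllable transitions were observed (such as "kitap") scores higher than a transposed form (such as "kitpa"), whose transitions were never seen. A realistic setup would need a large corpus and a tuned threshold.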