• DocumentCode
    698623
  • Title

    Turkish dictation system for Broadcast news applications

  • Author

    Arisoy, Ebru ; Arslan, Levent M.

  • Author_Institution
    Electr. & Electron. Eng. Dept., Bogazici Univ., İstanbul, Turkey
  • fYear
    2005
  • fDate
    4-8 Sept. 2005
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    We have designed a Turkish dictation system for broadcast news applications. Turkish is an agglutinative language with free word order. When words are used as recognition units, these characteristics lead to vocabulary explosion, a large number of out-of-vocabulary (OOV) words, and complex N-gram language models in speech recognition. We therefore proposed new recognition units: we parsed some of the words into smaller units such as stems, endings and morphemes, and introduced these smaller units, together with the unparsed words, to the speech recognizer as lexicon entries. In this way, we were able to overcome the problem of the large number of OOV words with a moderate vocabulary size and obtain better estimates for the N-gram language models. However, the best recognition result was obtained using the word-based language model.
  • Keywords
    dictation; speech recognition; vocabulary; N-gram language model; OOV word; Turkish dictation system; broadcast news application; lexicon entry; morphemes; out-of-vocabulary word; word-based language model; Analytical models; Biological system modeling; Data models; Hidden Markov models; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2005 13th European
  • Conference_Location
    Antalya
  • Print_ISBN
    978-160-4238-21-1
  • Type

    conf

  • Filename
    7078215