• DocumentCode
    675468
  • Title
    Speech resources for a Serbian LVCSR system
  • Author
    Ostrogonac, Stevan ; Suzic, Sinisa ; Bojanic, Milana ; Pakoci, Edvin
  • Author_Institution
    Fac. of Tech. Sci., Univ. of Novi Sad, Novi Sad, Serbia
  • fYear
    2013
  • fDate
    26-28 Nov. 2013
  • Firstpage
    478
  • Lastpage
    481
  • Abstract
    This paper describes the complete procedure for collecting and processing the speech database required to build a good large-vocabulary speech recognition system for the Serbian language. The speech database consists of recordings from audio books, radio programs and talk shows, as well as utterances read by a range of male and female speakers. To date, around 200 hours of read speech have been collected, along with about 10 hours of radio recordings.
  • Keywords
    natural language processing; speech recognition; Serbian LVCSR system; audio books; large vocabulary speech recognition system; radio programs; radio recording; speech database collection; speech recording; speech resources; talk show; Acoustics; Databases; Materials; Speech; Speech recognition; Training; Vocabulary; Serbian; large vocabulary continuous speech recognition; speech database;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Telecommunications Forum (TELFOR), 2013 21st
  • Conference_Location
    Belgrade
  • Print_ISBN
    978-1-4799-1419-7
  • Type
    conf
  • DOI
    10.1109/TELFOR.2013.6716271
  • Filename
    6716271