• DocumentCode
    3628643
  • Title
    Speech bandwidth classification for broadcast news domain using artificial neural network and Gaussian mixture models
  • Author
    Marko Kos;Matej Grasic;Bojan Kotnik;Zdravko Kacic
  • Author_Institution
    University of Maribor, Faculty of Electrical Engineering and Computer Science, Laboratory for Digital Signal Processing, Smetanova ul. 17, SI-2000, Slovenia
  • fYear
    2008
  • Firstpage
    283
  • Lastpage
    286
  • Abstract
    In this paper we present research on speech bandwidth classification carried out on the Slovenian BNSI Broadcast News database. Speech recorded in a studio environment has a frequency bandwidth of 8 kHz, while speech recorded over a telephone channel has a bandwidth of 3.1 kHz. Speech bandwidth classification enables the use of separate speech models for automatic speech recognition (ASR), which helps to improve the overall recognition result. For the task of speech bandwidth classification we used two different model-based approaches: one based on an artificial neural network and the other on Gaussian mixture models. Both approaches were tested and evaluated using the same front-end features, so that their results can be compared directly.
  • Keywords
    "Speech","Artificial neural networks","Bandwidth","Databases","Materials","Speech recognition","Acoustics"
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Signals and Image Processing, 2008. IWSSIP 2008. 15th International Conference on
  • ISSN
    2157-8672
  • Print_ISBN
    978-80-227-2856-0
  • Electronic_ISBN
    2157-8702
  • Type
    conf
  • DOI
    10.1109/IWSSIP.2008.4604422
  • Filename
    4604422