Title :
Speech bandwidth classification for broadcast news domain using artificial neural network and Gaussian mixture models
Author :
Marko Kos;Matej Grasic;Bojan Kotnik;Zdravko Kacic
Author_Institution :
University of Maribor, Faculty of Electrical Engineering and Computer Science, Laboratory for Digital Signal Processing, Smetanova ul. 17, SI-2000, Slovenia
Abstract :
In this paper we present research on speech bandwidth classification carried out on the Slovenian BNSI Broadcast News database. Speech recorded in a studio environment has a frequency bandwidth of 8 kHz, while speech recorded over a telephone channel has a bandwidth of 3.1 kHz. Speech bandwidth classification enables the use of separate speech models for automatic speech recognition (ASR), which helps to improve the overall recognition result. For the task of speech bandwidth classification we used two different model-based approaches: one based on an artificial neural network and the other based on Gaussian mixture models. Both approaches were tested and evaluated using the same front-end features, so that their results can be compared directly.
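As an illustration of the Gaussian mixture model approach mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a single hypothetical scalar feature per frame (a log ratio of spectral energy above the telephone band to total energy) and uses synthetic data; the abstract does not specify the actual front-end features, model sizes, or training procedure. One mixture model is fitted per bandwidth class, and a segment is assigned to the class whose model gives the higher average log-likelihood. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(vals):
    """Numerically stable log(sum(exp(v)))."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def fit_gmm(data, n_components=2, n_iter=50, var_floor=1e-3):
    """Fit a 1-D GMM with EM; returns a list of (weight, mean, variance)."""
    n = len(data)
    srt = sorted(data)
    # Initialise means at evenly spaced quantiles, variances at the global variance.
    means = [srt[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    g_mean = sum(data) / n
    g_var = max(sum((x - g_mean) ** 2 for x in data) / n, var_floor)
    variances = [g_var] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            logs = [math.log(w) + log_gauss(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and (floored) variances.
        for k in range(n_components):
            nk = max(sum(r[k] for r in resp), 1e-10)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk,
                var_floor,
            )
    return list(zip(weights, means, variances))


def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of a segment under a GMM."""
    total = sum(
        logsumexp([math.log(w) + log_gauss(x, m, v) for w, m, v in gmm])
        for x in frames
    )
    return total / len(frames)


def classify(frames, gmm_wide, gmm_narrow):
    """Pick the class whose model explains the segment better."""
    if avg_log_likelihood(frames, gmm_wide) >= avg_log_likelihood(frames, gmm_narrow):
        return "wideband"
    return "narrowband"


# Synthetic, hypothetical feature values (assumption, not data from the paper):
# studio speech keeps energy above the telephone band, telephone speech does not.
rng = random.Random(0)
train_wide = [rng.gauss(-2.0, 1.0) for _ in range(500)]
train_narrow = [rng.gauss(-8.0, 1.0) for _ in range(500)]

gmm_wide = fit_gmm(train_wide)
gmm_narrow = fit_gmm(train_narrow)

test_wide = [rng.gauss(-2.0, 1.0) for _ in range(50)]
test_narrow = [rng.gauss(-8.0, 1.0) for _ in range(50)]

label_wide = classify(test_wide, gmm_wide, gmm_narrow)
label_narrow = classify(test_narrow, gmm_wide, gmm_narrow)
print(label_wide, label_narrow)
```

In practice the frames would be multi-dimensional feature vectors and the mixtures would have more components; the decision rule (compare segment likelihoods under the two class models) stays the same.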
Keywords :
"Speech","Artificial neural networks","Bandwidth","Databases","Materials","Speech recognition","Acoustics"
Conference_Title :
Systems, Signals and Image Processing, 2008. IWSSIP 2008. 15th International Conference on
Print_ISBN :
978-80-227-2856-0
Electronic_ISBN :
2157-8702
DOI :
10.1109/IWSSIP.2008.4604422