DocumentCode :
3628643
Title :
Speech bandwidth classification for broadcast news domain using artificial neural network and Gaussian mixture models
Author :
Marko Kos;Matej Grasic;Bojan Kotnik;Zdravko Kacic
Author_Institution :
University of Maribor, Faculty of Electrical Engineering and Computer Science, Laboratory for Digital Signal Processing, Smetanova ul. 17, SI-2000, Slovenia
fYear :
2008
Firstpage :
283
Lastpage :
286
Abstract :
This paper presents research on speech bandwidth classification carried out on the Slovenian BNSI Broadcast News database. Speech recorded in a studio environment has a frequency bandwidth of 8 kHz, while speech recorded over a telephone channel has a bandwidth of 3.1 kHz. Speech bandwidth classification makes it possible to use separate speech models for automatic speech recognition (ASR), which helps improve the overall recognition result. Two model-based approaches were used for the classification task: one based on an artificial neural network and the other on Gaussian mixture models. Both approaches were tested and evaluated using the same front-end features so that their results can be compared directly.
Keywords :
"Speech","Artificial neural networks","Bandwidth","Databases","Materials","Speech recognition","Acoustics"
Publisher :
IEEE
Conference_Title :
Systems, Signals and Image Processing, 2008. IWSSIP 2008. 15th International Conference on
ISSN :
2157-8672
Print_ISBN :
978-80-227-2856-0
Electronic_ISBN :
2157-8702
Type :
conf
DOI :
10.1109/IWSSIP.2008.4604422
Filename :
4604422