DocumentCode
2880536
Title
Investigation of speech recognition over IP channels
Author
Van Sciver, Jim ; Ma, Jeff Z. ; Vanpoucke, Filiep ; Van hamme, Hugo
Author_Institution
BBN Technologies-Verizon, 70 Fawcett St., Cambridge, MA 02138 USA
Volume
4
fYear
2002
fDate
13-17 May 2002
Abstract
In this paper we investigate the effects of IP channels on speech recognition systems and methods to recover from the associated performance degradation. There are three major sources of distortion in VoIP (voice over IP): speech encoding and decoding (codecs), packet loss, and jitter (time delay). For speech recognition systems, the distortions come mainly from packet loss and the speech codecs. Their effects on the recognizer's performance are systematically investigated using four different ITU-T recommended speech codecs. The results show that the speech codecs introduce greater degradation than packet loss (random or burst). To recover from the codec degradation, we applied MLLR adaptation and a data-mixed retraining method. These techniques reduce the degradation by about 50%.
Keywords
Encoding; Hidden Markov models; Speech recognition; Wireless communication;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location
Orlando, FL, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.2002.5745487
Filename
5745487