Title :
Training of stream weights for the decoding of speech using parallel feature streams
Author :
Li, Xiang ; Stern, Richard M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
In speech recognition systems, information from multiple sources, such as different feature streams, can be combined in many different ways to yield better recognition accuracy. In general, information may be combined at the level of the incoming feature vectors, at the level of the decoding process, or after hypothesis generation. We focus on the specific case where parallel streams of features are used simultaneously during search to generate a hypothesis or a set of hypotheses. In this case, the contribution of each feature stream to the score associated with a frame of speech must be weighted appropriately during search. We present an offline, data-driven algorithm for determining the weight to be associated with each feature stream when combining acoustic likelihoods for each frame. Experimental results show that the word error rates (WERs) obtained using the proposed algorithm are lower than those obtained using conventional schemes for parallel feature combination.
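The per-frame weighting described in the abstract can be illustrated with a minimal sketch. It assumes the common log-linear convention for multi-stream decoding, in which the frame score for a state is a weighted sum of the per-stream log-likelihoods; the abstract does not specify the authors' exact formulation or how their algorithm trains the weights. The function names, state labels, and numeric values below are hypothetical and chosen only for illustration.

```python
def combine_stream_scores(stream_log_likelihoods, weights):
    """Weighted sum of per-stream log-likelihoods for one frame and one state.

    Assumption: log-linear combination, score = sum_k w_k * log p_k.
    """
    if len(stream_log_likelihoods) != len(weights):
        raise ValueError("need exactly one weight per feature stream")
    return sum(w * ll for w, ll in zip(weights, stream_log_likelihoods))


def best_state(frame_scores, weights):
    """Return the state whose combined score is highest for this frame.

    frame_scores maps a state label to its list of per-stream log-likelihoods.
    """
    return max(
        frame_scores,
        key=lambda state: combine_stream_scores(frame_scores[state], weights),
    )


# Hypothetical log-likelihoods for two states under two feature streams.
frame_scores = {
    "state_a": [-10.0, -20.0],
    "state_b": [-12.0, -14.0],
}

# The preferred state depends on how the two streams are weighted.
equal_weights_choice = best_state(frame_scores, [0.5, 0.5])
first_stream_only_choice = best_state(frame_scores, [1.0, 0.0])
```

In this toy example the two weight settings select different states for the same frame, which is why the choice of stream weights affects the hypothesis produced by the search.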
Keywords :
acoustic signal processing; data compression; decoding; error statistics; feature extraction; speech coding; speech recognition; acoustic likelihoods; data-driven algorithm; feature vectors; hypothesis generation; parallel feature combination; parallel feature streams; speech decoding; speech frame; speech recognition accuracy; speech recognition systems; stream weights training; word error rates; Aggregates; Computer science; Concurrent computing; Decoding; Error analysis; Hidden Markov models; Speech processing; Speech recognition; Training data
Conference_Title :
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1198910