Title :
Automatic speech recognition via pseudo-independent marginal mixtures
Author :
Nadas, Andras ; Nahamoo, David
Author_Institution :
IBM T. J. Watson Research Center, Yorktown Heights, NY
Abstract :
Statistical models (prototypes) for the multivariate probability distribution of vectors (frames) of speech parameters may be used in various ways. If the stream of vectors is passed directly to the decoder of a continuous-parameter speech recognizer, then the prototypes are used by the decoder; if the recognizer has a time-synchronous labeling acoustic processor, then they are used for vector quantization (labeling) and the resulting label stream is passed to the decoder; other uses are possible as well. We present a method for constructing such prototypes. This method was chosen as a compromise between describing a prototype in an assumption-free way as a nonparametric density and describing it in a convenient way as a simple multivariate Gaussian density. We describe speech recognition experiments in which our prototypes were trained by iteratively interleaving steps of a K-means-type clustering algorithm with steps of an EM algorithm for reestimation. We present results (using a labeling acoustic processor) with significantly fewer decoding errors than our previous methods.
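The abstract gives no equations, so the following is only an illustrative sketch of one way the described training loop could look, not the authors' implementation. It assumes that each prototype density is a product over dimensions of one-dimensional Gaussian mixtures (one reading of "pseudo-independent marginal mixtures"), that the K-means-type step assigns each frame to its highest-likelihood prototype, and that the EM step reestimates each prototype's marginal mixtures from its assigned frames. All function names, parameters, and the initialization scheme are hypothetical.

```python
import numpy as np

def log_density(x, w, mu, var):
    """Log density of frames x (N, D) under one prototype.

    Assumption: the prototype is a product over dimensions of 1-D Gaussian
    mixtures, with weights w, means mu, variances var of shape (D, M).
    """
    z = x[:, :, None] - mu[None]                        # (N, D, M)
    log_comp = np.log(w)[None] - 0.5 * (
        np.log(2.0 * np.pi * var)[None] + z**2 / var[None])
    m = log_comp.max(axis=2, keepdims=True)             # stable log-sum-exp
    log_mix = m[..., 0] + np.log(np.exp(log_comp - m).sum(axis=2))  # (N, D)
    return log_mix.sum(axis=1), log_comp, log_mix

def em_step(x, w, mu, var, var_floor=1e-3):
    """One EM reestimation of every 1-D marginal mixture of one prototype."""
    _, log_comp, log_mix = log_density(x, w, mu, var)
    resp = np.exp(log_comp - log_mix[:, :, None])       # responsibilities
    nk = resp.sum(axis=0) + 1e-12                       # (D, M) soft counts
    w_new = nk / nk.sum(axis=1, keepdims=True)
    mu_new = (resp * x[:, :, None]).sum(axis=0) / nk
    var_new = (resp * (x[:, :, None] - mu_new[None])**2).sum(axis=0) / nk
    return w_new, mu_new, np.maximum(var_new, var_floor)

def label_frames(x, params):
    """Vector quantization: index of the highest-likelihood prototype."""
    scores = np.stack([log_density(x, *p)[0] for p in params], axis=1)
    return scores.argmax(axis=1)

def train_prototypes(x, n_protos=2, n_comp=2, n_iter=10, seed=0):
    """Interleave hard clustering steps with per-prototype EM steps."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    params = []
    for _ in range(n_protos):
        # Hypothetical initialization: means taken from random frames,
        # variances set to the global per-dimension variance.
        w = np.full((d, n_comp), 1.0 / n_comp)
        mu = x[rng.choice(n, size=n_comp, replace=False)].T.copy()
        var = np.tile(x.var(axis=0)[:, None], (1, n_comp)) + 1e-3
        params.append((w, mu, var))
    for _ in range(n_iter):
        labels = label_frames(x, params)                # clustering step
        for k in range(n_protos):
            members = x[labels == k]
            if len(members) >= n_comp:                  # skip starved prototypes
                params[k] = em_step(members, *params[k])  # reestimation step
    return params, label_frames(x, params)
```

In a labeling acoustic processor of the kind the abstract mentions, `label_frames` would play the role of the vector quantizer whose output label stream is passed to the decoder; in a continuous-parameter recognizer, the decoder would instead consume the per-prototype log densities directly.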
Keywords :
Automatic speech recognition; Clustering algorithms; Decoding; Iterative algorithms; Labeling; Probability distribution; Prototypes; Speech processing; Speech recognition; Vector quantization
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.
DOI :
10.1109/ICASSP.1987.1169454