Title :
Pseudo 2-dimensional hidden Markov models in speech recognition
Author :
Werner, Stefen ; Rigoll, Gerhard
Author_Institution :
Dept. of Comput. Sci., Gerhard Mercator Univ., Duisburg, Germany
Abstract :
In this paper, the use of pseudo 2-dimensional hidden Markov models (HMMs) for speech recognition is discussed. This method, taken from image processing, is intended to better model the time-frequency structure of speech signals. It computes the emission probability of each state of a standard HMM by means of an embedded HMM. If a temporal sequence of spectral vectors is viewed as a spectrogram, this leads to a 2-dimensional warping of the spectrogram. The additional warping of the frequency axis could be useful for speaker-independent recognition and can be considered similar to vocal tract normalization. The effects of this paradigm are investigated using the TI-Digits database.
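The embedded-HMM idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one scalar Gaussian per embedded state, uses the forward algorithm in the log domain for both levels (the paper may use Viterbi scoring or mixture densities instead), and all names (`frame_loglik`, `p2dhmm_loglik`, the `embedded` dictionary layout) are hypothetical.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def gauss_logpdf(x, mean, var):
    """Log density of a scalar Gaussian (assumed emission model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def forward_loglik(log_emis, log_trans, log_init):
    """Forward algorithm; log_emis[t][s] is the log emission at step t, state s."""
    n = len(log_init)
    alpha = [log_init[s] + log_emis[0][s] for s in range(n)]
    for t in range(1, len(log_emis)):
        alpha = [
            logsumexp([alpha[r] + log_trans[r][s] for r in range(n)])
            + log_emis[t][s]
            for s in range(n)
        ]
    return logsumexp(alpha)


def frame_loglik(frame, embedded):
    """Score one spectral vector with an embedded HMM run along the frequency axis."""
    log_emis = [
        [gauss_logpdf(x, m, v) for m, v in zip(embedded["means"], embedded["vars"])]
        for x in frame
    ]
    return forward_loglik(log_emis, embedded["log_trans"], embedded["log_init"])


def p2dhmm_loglik(spectrogram, outer_log_init, outer_log_trans, embedded_models):
    """Outer HMM over time; each state's emission score comes from its embedded HMM."""
    log_emis = [
        [frame_loglik(frame, emb) for emb in embedded_models]
        for frame in spectrogram
    ]
    return forward_loglik(log_emis, outer_log_trans, outer_log_init)


# Toy example: 3 frames of 4 frequency bins, 2 outer states, 2 embedded states each.
left_to_right = [[safe_log(0.5), safe_log(0.5)], [safe_log(0.0), safe_log(1.0)]]
start_first = [safe_log(1.0), safe_log(0.0)]
low_high = {"means": [0.0, 1.0], "vars": [1.0, 1.0],
            "log_trans": left_to_right, "log_init": start_first}
high_low = {"means": [1.0, 0.0], "vars": [1.0, 1.0],
            "log_trans": left_to_right, "log_init": start_first}
toy_spectrogram = [[0.1, 0.2, 0.9, 1.1], [0.0, 0.8, 1.0, 0.9], [1.0, 0.9, 0.1, 0.0]]
toy_score = p2dhmm_loglik(toy_spectrogram, start_first, left_to_right,
                          [low_high, high_low])
```

Because each embedded HMM can decide where along the frequency axis to switch states, the alignment stretches or compresses each frame in frequency while the outer HMM does the same in time, which is the 2-dimensional warping described above.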
Keywords :
feature extraction; hidden Markov models; probability; spectral analysis; speech recognition; time-frequency analysis; 2-dimensional warping; TI-Digits database; embedded HMM; emission probability; image processing method; pseudo 2-dimensional HMM; speaker-independent recognition; spectral vectors; spectrogram; speech signals; temporal sequence; time-frequency structure; vocal tract normalization; Computer science; Databases; Frequency; Image processing; Signal processing; Speech processing
Conference_Title :
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN :
0-7803-7343-X
DOI :
10.1109/ASRU.2001.1034679