DocumentCode :
3422494
Title :
Time-inhomogeneous hidden Bernoulli model: An alternative to hidden Markov model for automatic speech recognition
Author :
Kabudian, Jahanshah ; Homayounpour, M. Mehdi ; Ahadi, S. Mohammad
Author_Institution :
Dept. of Comput. Eng., Amirkabir Univ. of Technol., Tehran
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4101
Lastpage :
4104
Abstract :
In this paper, a new acoustic model called the time-inhomogeneous hidden Bernoulli model (TI-HBM) is introduced as an alternative to the hidden Markov model (HMM) in automatic speech recognition. Unlike in the HMM, the state transition process in the TI-HBM is not a Markov process; rather, it is an independent (generalized Bernoulli) process. This difference eliminates state-level dynamic programming from the TI-HBM decoding process. Thus, the computational complexity of the TI-HBM for probability evaluation and state estimation is O(NL), instead of O(N^2 L) in the HMM case. As a new framework for phone duration modeling, the TI-HBM is able to model acoustic-unit duration (e.g., phone duration) by using a built-in parameter named survival probability. As in the HMM case, three essential problems for the TI-HBM have been solved. An EM-algorithm-based method is proposed for training the TI-HBM parameters. Experiments in phone recognition for the Persian (Farsi) spoken language show that the TI-HBM has some advantages over the HMM (e.g., greater simplicity and increased speed in the recognition phase), and also outperforms the HMM in terms of phone recognition accuracy.
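The complexity claim in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the TI-HBM equations, so the sketch only assumes that N is the number of states and L the number of frames, and it stands in for the independent state process with a hypothetical per-frame log-prior table `log_prior[t][s]`. The survival probability and duration modeling are left out, and all function and parameter names are illustrative.

```python
import math


def viterbi_hmm(log_init, log_trans, log_emit):
    """Standard HMM Viterbi decoding: O(N^2 * L).

    Every frame maximizes over all N predecessor states for each of the
    N current states, which is the state-level dynamic programming step.
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_score, ptr = [], []
        for s in range(n_states):
            prev = max(range(n_states), key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[prev] + log_trans[prev][s] + log_emit[t][s])
            ptr.append(prev)
        score = new_score
        backpointers.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    best = score[state]
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best


def decode_independent(log_prior, log_emit):
    """Decoding under an independent state process (assumed form): O(N * L).

    With no dependence on the previous state, the best state at each frame
    is a plain per-frame argmax, so no dynamic programming is needed.
    """
    path, total = [], 0.0
    for prior_t, emit_t in zip(log_prior, log_emit):
        scores = [p + e for p, e in zip(prior_t, emit_t)]
        s = max(range(len(scores)), key=lambda i: scores[i])
        path.append(s)
        total += scores[s]
    return path, total
```

When the HMM transition scores do not depend on the previous state, the two decoders return the same path, which is one way to see where the factor of N is saved.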
Keywords :
computational complexity; expectation-maximisation algorithm; hidden Markov models; natural language processing; speech coding; speech recognition; EM-algorithm based method; HMM; Persian spoken language; automatic speech recognition; computational complexity; decoding process; dynamic programming; generalized Bernoulli process; hidden Markov model; survival probability; time-inhomogeneous hidden Bernoulli model; Acoustical engineering; Automatic speech recognition; Decoding; Distributed computing; Distribution functions; Dynamic programming; Hidden Markov models; Markov processes; Natural languages; Speech recognition; Acoustic Modeling; Hidden Markov Model; Persian (Farsi) Spoken Language; Phone Duration Modeling; Phone Recognition; Speech Recognition; Time-Inhomogeneous Hidden Bernoulli Model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518556
Filename :
4518556