DocumentCode :
418166
Title :
Scalable architecture for word HMM-based speech recognition
Author :
Yoshizawa, Shingo ; Wada, Naoya ; Hayasaka, Noboru ; Miyanaga, Yoshikazu
Author_Institution :
Graduate Sch. of Eng., Hokkaido Univ., Sapporo, Japan
Volume :
3
fYear :
2004
fDate :
23-26 May 2004
Abstract :
This paper presents a scalable architecture for realizing real-time speech recognizers based on a word HMM (hidden Markov model). HMM-based recognition algorithms are classified by acoustic model into two types: phoneme-level and word-level. The phoneme-level HMM has been widely used in current speech recognition systems that permit large vocabularies, whereas the word-level HMM has been constrained to small vocabularies because of its extremely high computation cost, in spite of its excellent recognition performance. To overcome this limitation, we adopt a scalable architecture focused on the word HMM structure. The proposed architecture can flexibly improve recognition performance and extend the word vocabulary, with hardly any increase in computation time. To demonstrate a practical solution, we have designed and evaluated a complete recognizer system, including speech analysis and noise robustness, using a 0.18 μm CMOS standard cell library. The recognition time is 35.7 μs/word at an operating frequency of 128 MHz. The recognizer can achieve real-time response for vocabularies of medium size or larger.
Keywords :
CMOS integrated circuits; audio signal processing; hidden Markov models; real-time systems; speech recognition; vocabulary; 0.18 micron; 128 MHz; CMOS standard cell library; HMM-based recognition algorithm; acoustic model; hidden Markov model; noise robustness; operating frequency; phoneme-level HMM; real-time speech recognizer; recognition time; scalable speech recognition architecture; speech analysis; speech recognition system; word HMM-based speech recognition; word vocabulary extension; word-level HMM; Computational efficiency; Computer architecture; Frequency; Hidden Markov models; High performance computing; Libraries; Noise robustness; Speech analysis; Speech recognition; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium on
Print_ISBN :
0-7803-8251-X
Type :
conf
DOI :
10.1109/ISCAS.2004.1328772
Filename :
1328772