DocumentCode :
1370095
Title :
A review of large-vocabulary continuous-speech
Author :
Young, Steve
Volume :
13
Issue :
5
fYear :
1996
Firstpage :
45
Abstract :
Considerable progress has been made in speech-recognition technology over the last few years, and nowhere has this progress been more evident than in the area of large-vocabulary recognition (LVR). Current laboratory systems are capable of transcribing continuous speech from any speaker with average word-error rates between 5% and 10%. If speaker adaptation is allowed, then after 2 or 3 minutes of speech, the error rate will drop well below 5% for most speakers. LVR systems had been limited to dictation applications since the systems were speaker dependent and required words to be spoken with a short pause between them. However, the capability to recognize natural continuous-speech input from any speaker opens up many more applications. As a result, LVR technology appears to be on the brink of widespread deployment across a range of information technology (IT) systems. This article discusses the principles and architecture of current LVR systems and identifies the key issues affecting their future deployment. To illustrate the various points raised, the Cambridge University HTK system is described. This system is a modern design that gives state-of-the-art performance, and it is typical of the current generation of recognition systems.
Keywords :
Acoustic waves; Dictionaries; Discrete cosine transforms; Equations; Hidden Markov models; Modems; Probability; Signal processing; Speech processing; Speech recognition;
fLanguage :
English
Journal_Title :
Signal Processing Magazine, IEEE
Publisher :
IEEE
ISSN :
1053-5888
Type :
jour
DOI :
10.1109/79.536824
Filename :
536824