Title :
Efficient search using posterior phone probability estimates
Author :
Renals, Steve ; Hochberg, Mike
Author_Institution :
Dept. of Comput. Sci., Sheffield Univ., UK
Abstract :
We present a novel, efficient search strategy for large vocabulary continuous speech recognition (LVCSR). The search algorithm, based on stack decoding, uses posterior phone probability estimates to substantially increase its efficiency with minimal effect on accuracy. In particular, the search space is dramatically reduced by phone deactivation pruning, in which phones with a small local posterior probability are deactivated. This approach is particularly well suited to hybrid connectionist/hidden Markov model systems because posterior phone probabilities are directly computed by the acoustic model. On large vocabulary tasks, using a trigram language model, this pruning increased the search speed by an order of magnitude, with 2% or less relative search error. Results from a hybrid system are presented using the Wall Street Journal LVCSR database for a 20,000 word task using a backed-off trigram language model. For this task, our single-pass decoder took around 15x realtime on an HP735 workstation. At a cost of 7% relative search error, the decoding time can be reduced to approximately realtime.
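The phone deactivation pruning described in the abstract can be illustrated with a minimal sketch. It is not the authors' decoder: the function names, the input layout (one list of per-phone posteriors per frame), and the threshold value are all assumptions made for illustration, since the abstract gives no implementation details or threshold.

```python
# Illustrative sketch only; names and the threshold are hypothetical,
# not taken from the paper.

def deactivate_phones(frame_posteriors, threshold=0.001):
    """Return a per-frame mask: True where a phone stays active.

    frame_posteriors: list of frames, each a list of posterior phone
    probabilities as estimated by the acoustic model for that frame.
    A phone is deactivated at a frame when its local posterior falls
    below the threshold, so the search need not extend hypotheses
    through that phone at that frame.
    """
    return [[p >= threshold for p in frame] for frame in frame_posteriors]


def pruned_fraction(mask):
    """Fraction of (frame, phone) pairs removed from the search."""
    total = sum(len(frame) for frame in mask)
    active = sum(sum(frame) for frame in mask)
    return 1.0 - active / total if total else 0.0


if __name__ == "__main__":
    posteriors = [
        [0.90, 0.09, 0.01],  # frame 0
        [0.50, 0.40, 0.10],  # frame 1
    ]
    mask = deactivate_phones(posteriors, threshold=0.05)
    print(mask)
    print(pruned_fraction(mask))
```

A higher threshold deactivates more phones, trading additional search error for speed, which matches the speed/accuracy trade-off reported in the abstract.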
Keywords :
acoustic signal processing; decoding; estimation theory; grammars; hidden Markov models; natural languages; probability; search problems; speech processing; speech recognition; HP735 workstation; LVCSR database; Wall Street Journal; acoustic model; backed-off trigram language model; decoding time; efficiency; hybrid connectionist/HMM systems; large vocabulary continuous speech recognition; local posterior probability; phone deactivation pruning; posterior phone probability estimates; relative search error; search algorithm; search space reduction; search speed; single-pass decoder; stack decoding; trigram language model; Computer networks; Computer science; Context modeling; Databases; Decoding; Hidden Markov models; Speech recognition; State estimation; Topology; Upper bound; Viterbi algorithm; Vocabulary; Workstations;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479668