Regional Information Center for Science and Technology - Start-synchronous search for large vocabulary continuous speech recognition

DocumentCode :

1544797

Title :

Start-synchronous search for large vocabulary continuous speech recognition

Author :

Renals, Steve ; Hochberg, Michael M.

Author_Institution :

Dept. of Comput. Sci., Sheffield Univ., UK

Volume :

Issue :

fYear :

1999

fDate :

9/1/1999

Firstpage :

542

Lastpage :

553

Abstract :

In this paper, we present a novel, efficient search strategy for large vocabulary continuous speech recognition. The search algorithm, based on a stack decoder framework, utilizes phone-level posterior probability estimates (produced by a connectionist/hidden Markov model acoustic model) as a basis for phone deactivation pruning, a highly efficient method of reducing the required computation. The single-pass algorithm is naturally factored into the time-asynchronous processing of the word sequence and the time-synchronous processing of the hidden Markov model state sequence. This enables the search to be decoupled from the language model while still maintaining the computational benefits of time-synchronous processing. The incorporation of the language model in the search is discussed and computationally cheap approximations to the full language model are introduced. Experiments were performed on the North American Business News task using a 60,000-word vocabulary and a trigram language model. Results indicate that the computational cost of the search may be reduced by more than a factor of 40 with a relative search error of less than 2% using the techniques discussed in the paper.
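
As an illustration of the phone deactivation pruning idea mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes that pruning means skipping, at each frame, any phone whose posterior probability estimate falls below a fixed threshold. The function names, the threshold value of 0.01, and the toy posteriors are all hypothetical.

```python
from typing import Dict, List, Set


def active_phones(posteriors: Dict[str, float], threshold: float) -> Set[str]:
    """Return the phones kept for one frame.

    Assumption: a phone is "deactivated" (pruned) when its posterior
    probability estimate is below `threshold`.
    """
    return {phone for phone, p in posteriors.items() if p >= threshold}


def prune_frames(frames: List[Dict[str, float]],
                 threshold: float = 0.01) -> List[Set[str]]:
    """Apply the per-frame pruning rule to a sequence of frames."""
    return [active_phones(frame, threshold) for frame in frames]


# Toy posteriors for two frames over a three-phone inventory (made-up values).
frames = [
    {"s": 0.90, "t": 0.08, "ah": 0.02},
    {"s": 0.05, "t": 0.945, "ah": 0.005},
]
kept = prune_frames(frames, threshold=0.01)
```

In this toy case only "ah" in the second frame falls below the threshold, so a decoder using this rule would not need to evaluate any hypothesis passing through "ah" at that frame.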

Keywords :

computational complexity; computational linguistics; decoding; hidden Markov models; search problems; speech recognition; synchronisation; North American Business News task; computation; computational cost; connectionist/hidden Markov model acoustic model; hidden Markov model state sequence; language model; large vocabulary continuous speech recognition; phone deactivation pruning; phone-level posterior probability estimates; search algorithm; search error; single-pass algorithm; stack decoder framework; start-synchronous search; time-asynchronous processing; time-synchronous processing; trigram language model; word sequence; Computational efficiency; Decoding; Dictionaries; Helium; Hidden Markov models; Probability; Search problems; Speech recognition; Topology; Vocabulary;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.784107

Filename :

784107

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1544797