On using a priori segmentation of the speech signal in an N-best solutions post-processing

Author

Moudenc, T. ; Jouvet, D. ; Monné, J.

Author_Institution

CNET, Lannion, France

Volume

1

fYear

1995

fDate

9-12 May 1995

Firstpage

580

Abstract

This paper proposes a new approach to incorporating automatic a priori segmentation into an HMM-based speech recognizer. The approach, applied as a post-processing of N-best solutions, is based on stochastic modelling of the number of speech signal stationarity changes that occur within the phonetic segments of each solution. The objective of this post-processing is to validate the presence of stationarity zones in the speech signal, a validation that cannot be exploited using a centisecond approach. The signal stationarity changes are detected using an a priori segmentation algorithm. Two phonetic models are calculated for each phonetic segment: one corresponds to correct solutions and the other to incorrect solutions. The two models are used together to compute a post-processing score for each solution. In an initial set of experiments conducted on telephone databases, the method yielded a 9% error rate reduction on the "Number" database and a 15% error rate reduction on the "Digit" database.
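
The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the models or of the score, so everything concrete here is an assumption. Each phonetic model is assumed to be a smoothed discrete distribution over the number of stationarity changes in a segment, the post-processing score is assumed to be a sum of per-segment log-likelihood ratios, and the names `fit_count_model`, `count_changes`, `post_processing_score`, `rerank`, `MAX_CHANGES` and `weight` are hypothetical.

```python
import math
from collections import Counter

MAX_CHANGES = 5  # assumption: larger counts are pooled into the last bin


def fit_count_model(counts, max_changes=MAX_CHANGES, alpha=1.0):
    """Estimate an add-alpha smoothed distribution P(n) over change counts."""
    hist = Counter(min(n, max_changes) for n in counts)
    total = len(counts) + alpha * (max_changes + 1)
    return [(hist[n] + alpha) / total for n in range(max_changes + 1)]


def count_changes(boundaries, start, end):
    """Count a priori segmentation boundaries strictly inside (start, end)."""
    return sum(1 for b in boundaries if start < b < end)


def post_processing_score(hypothesis, boundaries, correct_models, incorrect_models):
    """Sum per-segment log-likelihood ratios (assumed scoring rule).

    hypothesis: list of (phone, start_time, end_time) segments.
    """
    score = 0.0
    for phone, start, end in hypothesis:
        n = min(count_changes(boundaries, start, end), MAX_CHANGES)
        score += math.log(correct_models[phone][n])
        score -= math.log(incorrect_models[phone][n])
    return score


def rerank(nbest, boundaries, correct_models, incorrect_models, weight=1.0):
    """Sort (hmm_log_score, hypothesis) pairs by combined score, best first."""
    def combined(entry):
        hmm_score, hyp = entry
        extra = post_processing_score(hyp, boundaries, correct_models, incorrect_models)
        return hmm_score + weight * extra
    return sorted(nbest, key=combined, reverse=True)


# Toy data, invented for illustration only.
correct_models = {"a": fit_count_model([0, 0, 0, 1])}
incorrect_models = {"a": fit_count_model([2, 3, 2, 3])}
boundaries = [0.05, 0.12, 0.15]       # change points from the a priori segmentation
hyp_short = [("a", 0.0, 0.10)]        # contains one change point
hyp_long = [("a", 0.0, 0.20)]         # contains three change points
ranked = rerank([(-10.0, hyp_long), (-10.0, hyp_short)],
                boundaries, correct_models, incorrect_models)
```

In this toy setup the two hypotheses have equal HMM scores, so the ranking is decided by how well the number of change points in each segment matches the "correct" model compared with the "incorrect" one. The combination weight and the pooling threshold would have to be tuned on held-out data.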

Keywords

acoustic signal detection; acoustic signal processing; hidden Markov models; speech processing; speech recognition; stochastic processes; Digit database; HMM; N-best solutions post-processing; Number database; a priori segmentation algorithm; correct solutions; error rate reduction; experiments; incorrect solutions; phonetic models; phonetic segment; phonetic segments; post-processing score; signal stationarity changes detection; speech recognizer; speech signal stationarity changes; stochastic modelling; telephone databases; Automatic speech recognition; Cepstral analysis; Context modeling; Databases; Error analysis; Hidden Markov models; Speech recognition; Stochastic processes; Telecommunications; Telephony;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on

Conference_Location

Detroit, MI, USA

ISSN

1520-6149

Print_ISBN

0-7803-2431-5

Type

conf

DOI

10.1109/ICASSP.1995.479664

Filename

479664