Pseudo-articulatory speech synthesis for recognition using automatic feature extraction from X-ray data

Author

Blackburn, C.S.; Young, S.J.

Author_Institution

Dept. of Eng., Cambridge Univ., UK

Volume

2

fYear

1996

fDate

3-6 Oct 1996

Firstpage

969

Abstract

We describe a self-organising pseudo-articulatory speech production model (SPM) trained on an X-ray microbeam database, and present results from using the SPM within a speech recognition framework. Given a time-aligned phonemic string, the system uses an explicit statistical model of co-articulation to generate pseudo-articulator trajectories. From these trajectories, parametrised speech vectors are synthesised using a set of artificial neural networks (ANNs). We present an analysis of the articulatory information in the database, and demonstrate the improvements in articulatory modelling accuracy obtained using our co-articulation system. Finally, we give results from using the SPM to re-score N-best utterance transcription lists produced by the Cambridge University Engineering Department (CUED) HTK hidden Markov model (HMM) speech recognition system. Relative reductions of 18% in the phoneme error rate and 15% in the word error rate are achieved.

Keywords

X-rays; feature extraction; hidden Markov models; neural nets; speech recognition; speech synthesis; HTK hidden Markov model speech recognition system; X-ray microbeam database; articulatory modelling accuracy; artificial neural networks; automatic feature extraction; coarticulation; explicit statistical model; parametrised speech vectors; phoneme error rate; pseudo-articulator trajectory generation; pseudo-articulatory speech synthesis; self-organising pseudo-articulatory speech production model; time-aligned phonemic string; utterance transcription lists; word error rate; Automatic speech recognition; Context modeling; Databases; Hidden Markov models; Loudspeakers; Mel frequency cepstral coefficient; Scanning probe microscopy; Speech enhancement; Speech recognition; Speech synthesis

fLanguage

English

Publisher

IEEE

Conference_Titel

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location

Philadelphia, PA

Print_ISBN

0-7803-3555-4

Type

conf

DOI

10.1109/ICSLP.1996.607764

Filename

607764