• DocumentCode
    312145
  • Title

    Pseudo-articulatory speech synthesis for recognition using automatic feature extraction from X-ray data

  • Author

    Blackburn, C.S. ; Young, S.J.

  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    969
  • Abstract
    We describe a self-organising pseudo-articulatory speech production model (SPM) trained on an X-ray microbeam database, and present results from using the SPM within a speech recognition framework. Given a time-aligned phonemic string, the system uses an explicit statistical model of co-articulation to generate pseudo-articulator trajectories. From these, parametrised speech vectors are synthesised using a set of artificial neural networks (ANNs). We present an analysis of the articulatory information in the database used, and demonstrate the improvements in articulatory modelling accuracy obtained using our co-articulation system. Finally, we give results from using the SPM to re-score N-best utterance transcription lists produced by the Cambridge University Engineering Department (CUED) HTK hidden Markov model (HMM) speech recognition system. Relative reductions of 18% in the phoneme error rate and 15% in the word error rate are achieved.
  • Keywords
    X-rays; feature extraction; hidden Markov models; neural nets; speech recognition; speech synthesis; HTK hidden Markov model speech recognition system; X-ray microbeam database; articulatory modelling accuracy; artificial neural networks; automatic feature extraction; coarticulation; explicit statistical model; parametrised speech vectors; phoneme error rate; pseudo-articulator trajectory generation; pseudo-articulatory speech synthesis; self-organising pseudo-articulatory speech production model; time-aligned phonemic string; utterance transcription lists; word error rate; Automatic speech recognition; Context modeling; Databases; Hidden Markov models; Loudspeakers; Mel frequency cepstral coefficient; Scanning probe microscopy; Speech enhancement; Speech recognition; Speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607764
  • Filename
    607764