• DocumentCode
    590624
  • Title

    Acoustic model training using feature vectors generated by manipulating speech parameters of real speakers

  • Author

    Kawai, Takaaki ; Kitaoka, Norihide ; Takeda, Kenji

  • Author_Institution
    Nagoya Univ., Nagoya, Japan
  • fYear
    2012
  • fDate
    3-6 Dec. 2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In this paper, we propose a robust speaker-independent acoustic model training method that uses generative training to create many pseudo-speakers from a small number of real speakers. We focus on the differences in vocal tract length between speakers, and manipulate this parameter to create many different pseudo-speakers with a range of vocal tract lengths. This method employs frequency warping based on the inverted use of Vocal Tract Length Normalization (VTLN). Another method for creating pseudo-speakers is to vary the speaking rate of the speakers, which can be achieved with a method called PICOLA (Pointer Interval Controlled OverLap and Add). In experiments, we train acoustic models using these generated pseudo-speakers in addition to the original speakers. Evaluation results show that generating pseudo-speakers by manipulating speaking rates did not yield a sufficient increase in performance; however, vocal tract length warping was effective.
  • Keywords
    learning (artificial intelligence); speech processing; PICOLA; Pointer Interval Controlled OverLap and Add; VTLN; feature vectors; generative training; pseudo-speaker generation; pseudo-speakers; speaker-independent acoustic model training method; speech parameter manipulation; vocal tract length normalization; Accuracy; Decoding; Filter banks; Robustness; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
  • Conference_Location
    Hollywood, CA
  • Print_ISBN
    978-1-4673-4863-8
  • Type

    conf

  • Filename
    6411771