DocumentCode :
590624
Title :
Acoustic model training using feature vectors generated by manipulating speech parameters of real speakers
Author :
Kawai, Takaaki ; Kitaoka, Norihide ; Takeda, Kenji
Author_Institution :
Nagoya Univ., Nagoya, Japan
fYear :
2012
fDate :
3-6 Dec. 2012
Firstpage :
1
Lastpage :
5
Abstract :
In this paper, we propose a robust speaker-independent acoustic model training method that uses generative training to create many pseudo-speakers from a small number of real speakers. We focus on the differences in vocal tract length between speakers, and manipulate this parameter to create many different pseudo-speakers with a range of vocal tract lengths. This method employs frequency warping based on the inverted use of Vocal Tract Length Normalization (VTLN). Another method for creating pseudo-speakers is to vary the speaking rate of the speakers, which can be achieved with a method called PICOLA (Pointer Interval Controlled OverLap and Add). In experiments, we train acoustic models using these generated pseudo-speakers in addition to the original speakers. Evaluation results show that generating pseudo-speakers by manipulating speaking rates did not yield a sufficient increase in performance; vocal tract length warping, however, was effective.
Keywords :
learning (artificial intelligence); speech processing; PICOLA; Pointer Interval Controlled OverLap and Add; VTLN; feature vectors; generative training; pseudo-speaker generation; pseudo-speakers; speaker-independent acoustic model training method; speech parameter manipulation; vocal tract length normalization; Accuracy; Decoding; Filter banks; Robustness; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location :
Hollywood, CA
Print_ISBN :
978-1-4673-4863-8
Type :
conf
Filename :
6411771