DocumentCode :
1455875
Title :
Developing Objective Measures of Foreign-Accent Conversion
Author :
Felps, Daniel ; Gutierrez-Osuna, Ricardo
Author_Institution :
Dept. of Comput. Sci. & Eng., Texas A&M Univ., College Station, TX, USA
Volume :
18
Issue :
5
fYear :
2010
fDate :
7/1/2010
Firstpage :
1030
Lastpage :
1040
Abstract :
Various methods have recently appeared to transform foreign-accented speech into its native-accented counterpart. Evaluation of these accent conversion methods requires extensive listening tests across a number of perceptual dimensions. This article presents three objective measures that may be used to assess the acoustic quality, degree of foreign accent, and speaker identity of accent-converted utterances. Accent conversion generates novel utterances: those of a foreign speaker with a native accent. Therefore, the acoustic quality in accent conversion cannot be evaluated with conventional measures of spectral distortion, which assume that a clean recording of the speech signal is available for comparison. Here we evaluate a single-ended measure of speech quality, ITU-T Recommendation P.563 for narrow-band telephony. We also propose a measure of foreign accent that exploits a weakness of automatic speech recognizers: their sensitivity to foreign accents. Namely, we use phoneme-level match scores given by the HTK recognizer trained on a large number of American English speakers to obtain a measure of native accent. Finally, we propose a measure of speaker identity that projects acoustic vectors (e.g., Mel cepstral, F0) onto the linear discriminant that maximizes separability for a given pair of source and target speakers. The three measures are evaluated on a corpus of accent-converted utterances that had been previously rated through perceptual tests. Our results show that the three measures have a high degree of correlation with their corresponding subjective ratings, suggesting that they may be used to accelerate the development of foreign-accent conversion tools. Applications of these measures in the context of computer-assisted pronunciation training and voice conversion are also discussed.
Keywords :
natural language processing; speaker recognition; ITU-T recommendation P.563; acoustic quality; automatic speech recognizers; computer assisted pronunciation training; foreign speaker recognition; foreign-accent conversion tool; foreign-accented speech transform; linear discriminant analysis; narrow-band telephony; phoneme-level match scores; spectral distortion; speech signal; accent conversion; foreign accent recognition; voice conversion
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2038818
Filename :
5439721