Regional Information Center for Science and Technology - Developing Objective Measures of Foreign-Accent Conversion

DocumentCode :

1455875

Title :

Developing Objective Measures of Foreign-Accent Conversion

Author :

Felps, Daniel ; Gutierrez-Osuna, Ricardo

Author_Institution :

Dept. of Comput. Sci. & Eng., Texas A&M Univ., College Station, TX, USA

Volume :

Issue :

Year :

2010

Date :

7/1/2010

Firstpage :

1030

Lastpage :

1040

Abstract :

Various methods have recently appeared to transform foreign-accented speech into its native-accented counterpart. Evaluation of these accent conversion methods requires extensive listening tests across a number of perceptual dimensions. This article presents three objective measures that may be used to assess the acoustic quality, degree of foreign accent, and speaker identity of accent-converted utterances. Accent conversion generates novel utterances: those of a foreign speaker with a native accent. Therefore, the acoustic quality in accent conversion cannot be evaluated with conventional measures of spectral distortion, which assume that a clean recording of the speech signal is available for comparison. Here we evaluate a single-ended measure of speech quality, ITU-T recommendation P.563 for narrow-band telephony. We also propose a measure of foreign accent that exploits a weakness of automatic speech recognizers: their sensitivity to foreign accents. Namely, we use phoneme-level match scores given by the HTK recognizer trained on a large number of American English speakers to obtain a measure of native accent. Finally, we propose a measure of speaker identity that projects acoustic vectors (e.g., Mel cepstral, F0) onto the linear discriminant that maximizes separability for a given pair of source and target speakers. The three measures are evaluated on a corpus of accent-converted utterances that had been previously rated through perceptual tests. Our results show that the three measures have a high degree of correlation with their corresponding subjective ratings, suggesting that they may be used to accelerate the development of foreign-accent conversion tools. Applications of these measures in the context of computer-assisted pronunciation training and voice conversion are also discussed.
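The speaker-identity measure described in the abstract (projecting acoustic vectors onto the linear discriminant that best separates a source/target speaker pair) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a standard two-class Fisher discriminant, uses synthetic 3-dimensional vectors as stand-ins for real Mel-cepstral or F0 features, and adds a hypothetical `identity_score` that rescales the mean projection so the source speaker maps to 0 and the target speaker to 1. The function names and the `ridge` term are illustrative assumptions.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mu):
    """Sum of outer products of (row - mu) with itself."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fisher_direction(source, target, ridge=1e-6):
    """Two-class Fisher discriminant: w = S_w^{-1} (mu_target - mu_source).

    `ridge` is an assumed small diagonal term that keeps S_w invertible.
    """
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    s_s, s_t = scatter_matrix(source, mu_s), scatter_matrix(target, mu_t)
    d = len(mu_s)
    s_w = [[s_s[i][j] + s_t[i][j] + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    w = solve(s_w, [t - s for s, t in zip(mu_s, mu_t)])
    return w, mu_s, mu_t

def identity_score(frames, w, mu_s, mu_t):
    """Hypothetical score: mean projection, rescaled so the source
    speaker's mean maps to 0 and the target speaker's mean maps to 1."""
    def proj(v):
        return sum(a * b for a, b in zip(w, v))
    p_s, p_t = proj(mu_s), proj(mu_t)
    mean_p = sum(proj(f) for f in frames) / len(frames)
    return (mean_p - p_s) / (p_t - p_s)

# Synthetic stand-ins for per-frame acoustic vectors of two speakers.
random.seed(0)
source = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
target = [[random.gauss(m, 1.0) for m in (2.0, -1.0, 0.5)] for _ in range(200)]

w, mu_s, mu_t = fisher_direction(source, target)

# A made-up "converted" utterance lying halfway between the two speakers.
halfway = [[(a + b) / 2 for a, b in zip(s, t)] for s, t in zip(source, target)]
print(identity_score(halfway, w, mu_s, mu_t))
```

Under this rescaling, an utterance scoring near 0 resembles the source speaker along the discriminant and one scoring near 1 resembles the target. How the article actually maps projections to an identity rating is not stated in this record.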

Keywords :

natural language processing; speaker recognition; ITU-T recommendation P.563; acoustic quality; automatic speech recognizers; computer assisted pronunciation training; foreign speaker recognition; foreign-accent conversion tool; foreign-accented speech transform; linear discriminant analysis; narrow-band telephony; phoneme-level match scores; spectral distortion; speech signal; accent conversion; foreign accent recognition; voice conversion

Language :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2009.2038818

Filename :

5439721

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1455875