• DocumentCode
    1455875
  • Title
    Developing Objective Measures of Foreign-Accent Conversion
  • Author
    Felps, Daniel ; Gutierrez-Osuna, Ricardo
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Texas A&M Univ., College Station, TX, USA
  • Volume
    18
  • Issue
    5
  • fYear
    2010
  • fDate
    7/1/2010 12:00:00 AM
  • Firstpage
    1030
  • Lastpage
    1040
  • Abstract
    Various methods have recently appeared to transform foreign-accented speech into its native-accented counterpart. Evaluation of these accent conversion methods requires extensive listening tests across a number of perceptual dimensions. This article presents three objective measures that may be used to assess the acoustic quality, degree of foreign accent, and speaker identity of accent-converted utterances. Accent conversion generates novel utterances: those of a foreign speaker with a native accent. Therefore, the acoustic quality in accent conversion cannot be evaluated with conventional measures of spectral distortion, which assume that a clean recording of the speech signal is available for comparison. Here we evaluate a single-ended measure of speech quality, ITU-T recommendation P.563 for narrow-band telephony. We also propose a measure of foreign accent that exploits a weakness of automatic speech recognizers: their sensitivity to foreign accents. Namely, we use phoneme-level match scores given by the HTK recognizer trained on a large number of American English speakers to obtain a measure of native accent. Finally, we propose a measure of speaker identity that projects acoustic vectors (e.g., Mel cepstral, F0) onto the linear discriminant that maximizes separability for a given pair of source and target speakers. The three measures are evaluated on a corpus of accent-converted utterances that had been previously rated through perceptual tests. Our results show that the three measures have a high degree of correlation with their corresponding subjective ratings, suggesting that they may be used to accelerate the development of foreign-accent conversion tools. Applications of these measures in the context of computer-assisted pronunciation training and voice conversion are also discussed.
  • Keywords
    natural language processing; speaker recognition; ITU-T recommendation P.563; acoustic quality; automatic speech recognizers; computer assisted pronunciation training; foreign speaker recognition; foreign-accent conversion tool; foreign-accented speech transform; linear discriminant analysis; narrow-band telephony; phoneme-level match scores; spectral distortion; speech signal; accent conversion; foreign accent recognition; voice conversion
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2009.2038818
  • Filename
    5439721
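
The speaker-identity measure summarized in the abstract projects acoustic vectors onto the linear discriminant that best separates a source speaker from a target speaker. The following is a minimal sketch of that idea using a two-class Fisher discriminant, not the authors' implementation. The function names, the use of NumPy arrays with one acoustic frame per row, the small ridge term, and the rescaling of the score so that 0 corresponds to the source mean and 1 to the target mean are all assumptions made for illustration.

```python
import numpy as np


def fisher_direction(source, target):
    """Unit vector along the two-class Fisher discriminant.

    source, target: arrays of shape (n_frames, n_features).
    """
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    # Pooled within-class scatter matrix of the two speakers.
    s_w = (source - mu_s).T @ (source - mu_s) + (target - mu_t).T @ (target - mu_t)
    # Small ridge term (an assumption) to keep the solve well conditioned.
    s_w = s_w + 1e-6 * np.eye(s_w.shape[0])
    # Fisher solution: w is proportional to S_w^{-1} (mu_t - mu_s).
    w = np.linalg.solve(s_w, mu_t - mu_s)
    return w / np.linalg.norm(w)


def identity_score(frames, source, target):
    """Mean projection of `frames` onto the discriminant.

    Rescaled (illustrative convention) so that the source speaker's
    mean maps to 0 and the target speaker's mean maps to 1.
    """
    w = fisher_direction(source, target)
    p_s = source.mean(axis=0) @ w
    p_t = target.mean(axis=0) @ w
    return float(((frames @ w).mean() - p_s) / (p_t - p_s))
```

Under this convention, a converted utterance whose score falls near 0 sits acoustically closer to the source speaker, and one near 1 sits closer to the target speaker. How such a score relates to perceived identity is what the article evaluates against listening tests.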