• DocumentCode
    3308428
  • Title
    Improvement of Acoustic Model in Text-independent Pronunciation Quality Assessment
  • Author
    Qi, Yaohui ; Shi, Changhai ; Ge, Fengpei ; Yan, Yonghong
  • Author_Institution
    Coll. of Inf. & Electron., Beijing Inst. of Technol., Beijing, China
  • fYear
    2012
  • fDate
    12-14 Jan. 2012
  • Firstpage
    193
  • Lastpage
    196
  • Abstract
    In a text-independent pronunciation quality assessment system, the test speech must first be recognized before it can be assessed accurately. Field test data contains noise, accented speech, and a spontaneous speaking style, all of which degrade recognition performance. In this paper, we address these factors by improving the acoustic model (AM) of the speech recognition system. Background noise is added to the training data to improve noise robustness. Speaker-based Cepstral Mean and Variance Normalization (SCMVN) is adopted to alleviate channel distortion and the impact of inter-speaker pronunciation variability. Maximum a Posteriori (MAP) adaptation is applied twice to tune the acoustic model to the pronunciation characteristics of the accent and of the spontaneous style in spoken language. According to the experimental results, these measures yield a relative increase of 44.1% in the word correct rate and a relative increase of 6.3% in the correlation coefficient between machine scores and expert scores.
  • Keywords
    acoustic noise; cepstral analysis; maximum likelihood estimation; speech recognition; accent speaking style; accurate assessment; acoustic model; anti-noise; background noise; inter-speaker pronunciation variability; maximum a posteriori adaptation; speaker-based cepstral mean; speech recognition; spoken language; spontaneous speaking style; text-independent pronunciation quality assessment; training data; variance normalization; Acoustics; Adaptation models; Hidden Markov models; Noise; Speech; Speech recognition; Training data; MAP; Speaker-based Cepstral Mean and Variance Normalization; acoustic model; text-independent pronunciation quality assessment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Computation Technology and Automation (ICICTA), 2012 Fifth International Conference on
  • Conference_Location
    Zhangjiajie, Hunan
  • Print_ISBN
    978-1-4673-0470-2
  • Type
    conf
  • DOI
    10.1109/ICICTA.2012.55
  • Filename
    6150220
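
The abstract names Speaker-based Cepstral Mean and Variance Normalization (SCMVN) but this record gives no implementation details. As a rough illustration only, the sketch below shows the general idea of normalizing cepstral features per speaker: all frames from one speaker are pooled, and each feature dimension is shifted to zero mean and scaled to unit variance. The function name `speaker_cmvn`, the `(speaker_id, frames)` input layout, and the `eps` guard are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def speaker_cmvn(utterances, eps=1e-8):
    """Normalize features per speaker to zero mean and unit variance.

    utterances: list of (speaker_id, frames), where frames is a list of
    equal-length feature vectors (lists of floats).
    Returns a list of (speaker_id, normalized_frames) in the same order.
    NOTE: hypothetical helper for illustration, not the authors' code.
    """
    # Pool all frames belonging to each speaker across utterances.
    pooled = defaultdict(list)
    for speaker, frames in utterances:
        pooled[speaker].extend(frames)

    # Per-speaker, per-dimension mean and (population) standard deviation.
    stats = {}
    for speaker, frames in pooled.items():
        if not frames:
            stats[speaker] = ([], [])
            continue
        n = len(frames)
        dims = len(frames[0])
        mean = [sum(f[d] for f in frames) / n for d in range(dims)]
        std = [
            sqrt(sum((f[d] - mean[d]) ** 2 for f in frames) / n)
            for d in range(dims)
        ]
        stats[speaker] = (mean, std)

    # Apply the speaker's statistics to every frame of that speaker.
    normalized = []
    for speaker, frames in utterances:
        mean, std = stats[speaker]
        normalized.append((
            speaker,
            # eps avoids division by zero for constant dimensions.
            [[(x - m) / (s + eps) for x, m, s in zip(f, mean, std)]
             for f in frames],
        ))
    return normalized


if __name__ == "__main__":
    utts = [
        ("spk_a", [[1.0, 10.0], [3.0, 30.0]]),
        ("spk_b", [[5.0, 0.0]]),
        ("spk_a", [[2.0, 20.0]]),
    ]
    for speaker, frames in speaker_cmvn(utts):
        print(speaker, frames)
```

Computing the statistics over everything a speaker said, rather than per utterance, is what distinguishes the speaker-based variant in this sketch; short utterances then share more reliable estimates of mean and variance.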