• DocumentCode
    2831697
  • Title
    The study of q-logarithmic modulation spectral normalization for robust speech recognition
  • Author
    Fan, Hao-Teng; Hsu, Che-hsien; Hung, Jeih-weih
  • Author_Institution
    Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
  • fYear
    2012
  • fDate
    June 30 2012-July 2 2012
  • Firstpage
    183
  • Lastpage
    186
  • Abstract
    This paper presents a novel use of the generalized logarithm operation (q-logarithm) to refine the modulation spectrum of speech features for noise-robust speech recognition. The resulting method, generalized logarithmic modulation spectral mean normalization (GLMSMN), equalizes the average of the magnitude modulation spectrum in the q-logarithmic domain across utterances in order to alleviate the effect of noise. On the Aurora-2 connected-digit database and evaluation task, GLMSMN applied to MVN features yields a significant improvement in recognition accuracy over both the MFCC baseline and MVN. The overall average recognition accuracy achieved with GLMSMN can reach nearly 90%.
  • Keywords
    speech recognition; visual databases; GLMSMN; digit database; generalized logarithm operation; generalized logarithmic modulation spectral mean normalization; magnitude modulation spectrum; modulation spectrum; noise robust speech recognition; q-logarithmic domain; q-logarithmic modulation spectral normalization; robust speech recognition; speech features; Accuracy; Mel frequency cepstral coefficient; Modulation; Noise; Robustness; Speech; Speech recognition; q-logarithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Science and Engineering (ICSSE), 2012 International Conference on
  • Conference_Location
    Dalian, Liaoning
  • Print_ISBN
    978-1-4673-0944-8
  • Electronic_ISBN
    978-1-4673-0943-1
  • Type
    conf
  • DOI
    10.1109/ICSSE.2012.6257173
  • Filename
    6257173