• DocumentCode
    3564154
  • Title

    Feature-based noise robust speech recognition on an Indonesian language automatic speech recognition system

  • Author

    Satriawan, Cil Hardianto ; Lestari, Dessi Puji

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Inst. Teknol. Bandung, Bandung, Indonesia
  • fYear
    2014
  • Firstpage
    42
  • Lastpage
    46
  • Abstract
    Mel-frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) coefficients are two popular representations of continuous speech in existing Hidden Markov Model (HMM)-based Automatic Speech Recognition (ASR) systems. Cepstral Mean Normalization (CMN) is often used as a post-processing step in the extraction of MFCC and PLP features to further enhance noise robustness at almost negligible computational cost. In this paper we build a closed-dictionary, large-vocabulary, HMM-based Indonesian language ASR system using the CMU Sphinx speech recognition toolkit, implementing MFCC and PLP feature extraction as well as CMN. We test the effect of various types and levels of noise on the word error rate (WER) of speech recognition. With CMN, recognition improves by an average of 2% over standard MFCC and PLP extraction methods at signal-to-noise ratios (SNR) below 24 decibels. A significant drop in recognition is observed between 12 and 6 dB SNR.
  • Keywords
    cepstral analysis; feature extraction; hidden Markov models; natural language processing; speech recognition; CMN; CMU Sphinx; HMM-based Indonesian language ASR system; Indonesian language automatic speech recognition system; MFCC feature extraction; Mel-frequency cepstral coefficients; PLP coefficients; PLP feature extraction; SNR; WER; cepstral mean normalization; continuous speech representations; feature-based noise robust speech recognition; hidden Markov model-based automatic speech recognition systems; perceptual linear prediction coefficients; signal-to-noise ratios; speech recognition toolkit; word error rate; Feature extraction; Mel frequency cepstral coefficient; Noise robustness; Signal to noise ratio; Speech; Speech recognition; ASR; CMN; Indonesian language; MFCC; PLP;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical Engineering and Computer Science (ICEECS), 2014 International Conference on
  • Print_ISBN
    978-1-4799-8477-0
  • Type

    conf

  • DOI
    10.1109/ICEECS.2014.7045217
  • Filename
    7045217