• DocumentCode
    835902
  • Title

    Robust speech recognition in noisy environments based on subband spectral centroid histograms

  • Author

    Gajić, Bojana ; Paliwal, Kuldip K.

  • Author_Institution
    Sch. of Microelectron. Eng., Griffith Univ., Brisbane, Qld., Australia
  • Volume
    14
  • Issue
    2
  • fYear
    2006
  • fDate
    3/1/2006 12:00:00 AM
  • Firstpage
    600
  • Lastpage
    608
  • Abstract
    We investigate how dominant-frequency information can be used in speech feature extraction to increase the robustness of automatic speech recognition against additive background noise. First, we review several earlier proposed auditory-based feature extraction methods and argue that the use of dominant-frequency information might be one of the major reasons for their improved noise robustness. Furthermore, we propose a new feature extraction method, which combines subband power information with dominant subband frequency information in a simple and computationally efficient way. The proposed features are shown to be considerably more robust against additive background noise than standard mel-frequency cepstrum coefficients on two different recognition tasks. The performance improvement increased as we moved from a small-vocabulary isolated-word task to a medium-vocabulary continuous-speech task, where the proposed features also outperformed a computationally expensive auditory-based method. The greatest improvement was obtained for noise types characterized by a relatively flat spectral density.
  • Keywords
    feature extraction; speech processing; speech recognition; auditory-based feature extraction methods; dominant-frequency information; medium-vocabulary continuous-speech task; robust speech recognition; small-vocabulary isolated-word task; speech feature extraction; subband spectral centroid histograms; Additive noise; Automatic speech recognition; Background noise; Feature extraction; Frequency; Histograms; Noise robustness; Speech enhancement; Speech recognition; Working environment noise; Auditory models; dominant frequencies; feature extraction; noise robustness; speech recognition; subband spectral centroids (SSCs)
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TSA.2005.855834
  • Filename
    1597263