  • DocumentCode
    1749644
  • Title
    Robust, real-time endpoint detector with energy normalization for ASR in adverse environments
  • Author
    Li, Qi ; Zheng, Jinsong ; Zhou, Qiru ; Lee, Chin-Hui
  • Author_Institution
    Multimedia Commun. Res. Lab., Lucent Technol. Bell Labs., Murray Hill, NJ, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    233
  • Abstract
    When automatic speech recognition (ASR) is applied to hands-free or other adverse acoustic environments, endpoint detection and energy normalization can be crucial to the entire system. In low signal-to-noise ratio (SNR) situations, conventional approaches to endpointing and energy normalization often fail, and ASR performance usually degrades dramatically. The goal of this paper is to find a fast, accurate, and robust endpointing algorithm for real-time ASR. We propose a novel approach that uses a special filter plus a 3-state decision logic for endpoint detection. The filter has been designed under several criteria to ensure the accuracy and robustness of detection. The detected endpoints are then applied to energy normalization simultaneously. Evaluation results show that the proposed algorithm significantly reduces the string error rates on 7 out of 12 tested databases. The reduction rates even exceeded 50% on two of them. The algorithm uses only one-dimensional energy with a 24-frame lookahead; therefore, it has low complexity and is suitable for real-time ASR.
  • Keywords
    acoustic noise; filtering theory; signal detection; speech recognition; 1D short-term energy; 24-frame lookahead; SNR; adverse acoustic environments; automatic speech recognition; decision logic; energy normalization; hands-free environment; low signal-to-noise; one-dimensional short-term energy; optimal filter; real-time ASR; robust endpointing algorithm; robust real-time endpoint detector; string error rate reduction; Acoustic signal detection; Automatic speech recognition; Databases; Degradation; Detectors; Error analysis; Filters; Logic; Robustness; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.940810
  • Filename
    940810
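
The abstract names a filter followed by a 3-state decision logic operating on one-dimensional energy, but gives no further detail. The following is a minimal sketch of that general idea only, not the authors' algorithm: the moving-difference `edge_filter` is a stand-in for the paper's specially designed filter, and the state names, the thresholds `rise` and `fall`, the `max_gap` hangover, and the `half_width` window are all hypothetical values chosen for illustration. Energy normalization is not covered.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names; the abstract only says "3-state".
    SILENCE = 0
    IN_SPEECH = 1
    LEAVING_SPEECH = 2


def edge_filter(energy, half_width=3):
    """Mean of the next frames minus mean of the previous frames.

    A plain moving-difference edge detector (assumption), with indices
    clamped at the signal boundaries. Positive output marks a rising
    energy edge, negative output a falling one.
    """
    n = len(energy)
    out = []
    for t in range(n):
        ahead = [energy[min(t + k, n - 1)] for k in range(1, half_width + 1)]
        behind = [energy[max(t - k, 0)] for k in range(1, half_width + 1)]
        out.append(sum(ahead) / half_width - sum(behind) / half_width)
    return out


def detect_endpoints(energy, rise=2.0, fall=-2.0, max_gap=5, half_width=3):
    """Return (begin_frame, end_frame) pairs found by a 3-state machine."""
    state = State.SILENCE
    segments = []
    begin = end = gap = 0
    for t, f in enumerate(edge_filter(energy, half_width)):
        if state is State.SILENCE:
            if f >= rise:            # rising edge: speech begins
                state, begin = State.IN_SPEECH, t
        elif state is State.IN_SPEECH:
            if f <= fall:            # falling edge: speech may be ending
                state, end, gap = State.LEAVING_SPEECH, t, 0
        else:                        # State.LEAVING_SPEECH
            if f >= rise:            # energy rose again: still speech
                state = State.IN_SPEECH
            elif f <= fall:          # still falling: push the endpoint out
                end, gap = t, 0
            else:
                gap += 1
                if gap > max_gap:    # quiet long enough: confirm endpoint
                    segments.append((begin, end))
                    state = State.SILENCE
    if state is not State.SILENCE:   # signal ended while still in speech
        segments.append((begin, len(energy) - 1))
    return segments


# Synthetic log-energy contour: 10 quiet frames, 10 loud frames, 15 quiet frames.
frames = [0.0] * 10 + [10.0] * 10 + [0.0] * 15
print(detect_endpoints(frames))
```

With these illustrative settings the detected segment is padded by roughly `half_width` frames on each side of the loud region, because the filter responds as soon as its window touches an edge. The filter needs `half_width` future frames and the endpoint confirmation waits a further `max_gap` frames, which is the kind of bounded lookahead that keeps such a detector usable in real time.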