• DocumentCode
    2704820
  • Title

    Unsupervised Speech/Non-Speech Detection for Automatic Speech Recognition in Meeting Rooms

  • Author

    Maganti, H.K. ; Motlicek, Petr ; Gatica-Perez, Daniel

  • Author_Institution
    IDIAP Res. Inst., Switzerland
  • Volume
    4
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Abstract
    The goal of this work is to provide robust and accurate speech detection for automatic speech recognition (ASR) in meeting room settings. The solution computes the long-term modulation spectrum of a given audio signal and examines a specific frequency range for dominant speech components in order to classify the signal as speech or non-speech. For comparison, four alternatives are tested: manually segmented speech, segmentation based on short-term energy, segmentation based on short-term energy and zero crossings, and a recently proposed multilayer perceptron (MLP) classifier system. The segmentation methods are evaluated by speech recognition performance on a standard database, under conditions where the signal-to-noise ratio (SNR) varies considerably: close-talking headset, lapel microphone, distant microphone array output, and a single distant microphone. The results reveal that the proposed method is more reliable and less sensitive to the mode of signal acquisition and to unforeseen conditions.
  • Keywords
    acoustic signal detection; architectural acoustics; multilayer perceptrons; speech processing; speech recognition; SNR; automatic recognition; automatic speech recognition; meeting rooms; multi layer perceptron; nonspeech signals; segmentation methods; signal acquisition; signal-to-noise ratio; unsupervised nonspeech detection; zero-crossing based segmentation; Acoustic noise; Databases; Hidden Markov models; Microphone arrays; Signal processing; Speech enhancement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
  • Conference_Location
    Honolulu, HI
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0727-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.2007.367250
  • Filename
    4218281
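
The abstract names short-term energy and zero-crossing based segmentation as one of the comparison baselines. The record gives no details of that baseline, so the following is only a minimal sketch of the general technique, not the authors' implementation. The function names, the frame length and hop (25 ms and 10 ms at an assumed 16 kHz sampling rate), both thresholds, and the decision rule (high energy and low zero-crossing rate means speech) are all illustrative assumptions.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_speech(samples, frame_len=400, hop=160,
                  energy_thresh=1e-3, zcr_thresh=0.25):
    """Return one boolean per frame: True where the frame is labelled speech.

    Assumed rule (hypothetical, for illustration): a frame is speech when its
    energy exceeds energy_thresh and its zero-crossing rate is below zcr_thresh.
    """
    return [short_term_energy(f) > energy_thresh
            and zero_crossing_rate(f) < zcr_thresh
            for f in frame_signal(samples, frame_len, hop)]


def make_demo_signal(sample_rate=16000):
    """Synthetic test input: 0.1 s of silence followed by 0.1 s of a 200 Hz tone."""
    n = sample_rate // 10
    silence = [0.0] * n
    tone = [0.5 * math.sin(2 * math.pi * 200 * k / sample_rate) for k in range(n)]
    return silence + tone


if __name__ == "__main__":
    print(detect_speech(make_demo_signal()))
```

Fixed thresholds of this kind depend on recording level and noise floor, which fits the abstract's concern with SNR varying across headset, lapel, and distant microphones. The rule above would also tend to reject unvoiced speech, whose zero-crossing rate is high.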