DocumentCode :
164830
Title :
Spectrogram patch based acoustic event detection and classification in speech overlapping conditions
Author :
Espi, Miquel ; Fujimoto, Masakiyo ; Kubo, Yotaro ; Nakatani, Tomohiro
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
fYear :
2014
fDate :
12-14 May 2014
Firstpage :
117
Lastpage :
121
Abstract :
Speech does not always contain all the information needed to understand a conversation scene. Non-speech events can reveal aspects of the scene that speakers miss or neglect to mention, and could further support speech enhancement and recognition systems with information about the surrounding noise. This paper focuses on detecting and classifying acoustic events in a conversation scene, where such events often overlap with speech. State-of-the-art techniques are based on derived features (e.g., MFCCs or Mel filter banks), which have successfully parameterized speech spectrograms but reduce both resolution and detail when other kinds of events are targeted. We propose a method that learns hidden features directly from spectrogram patches and integrates them within the deep neural network framework to detect and classify acoustic events. The result is a model that performs feature extraction and classification simultaneously. Experiments confirm that the proposed method outperforms deep neural networks with derived features, as well as related work, on the CHIL2007-AED task, showing that there is room for further improvement.
Keywords :
feature extraction; neural nets; speech enhancement; speech recognition; CHIL2007-AED task; deep neural network framework; feature classification; feature extraction; nonspeech events; parameterized speech spectrograms; spectrogram patch based acoustic event detection; speech enhancement; speech overlapping conditions; speech recognition system; Acoustics; Conferences; Feature extraction; Hidden Markov models; Spectrogram; Speech; Training; acoustic event detection; communication scene understanding; spectrogram patch;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Hands-free Speech Communication and Microphone Arrays (HSCMA), 2014 4th Joint Workshop on
Conference_Location :
Villers-lès-Nancy
Type :
conf
DOI :
10.1109/HSCMA.2014.6843263
Filename :
6843263