DocumentCode :
2799535
Title :
Evaluating the robustness of privacy-sensitive audio features for speech detection in personal audio log scenarios
Author :
Parthasarathi, Sree Hari Krishnan ; Magimai-Doss, Mathew ; Bourlard, Herve ; Gatica-Perez, Daniel
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4474
Lastpage :
4477
Abstract :
Personal audio logs are often recorded in multiple environments. This poses challenges for robust front-end processing, including speech/nonspeech detection (SND). Motivated by this, we investigate the robustness of four different privacy-sensitive features for SND, namely energy, zero-crossing rate, spectral flatness, and kurtosis. We study early and late fusion of these features in conjunction with modeling temporal context. These combinations are evaluated in mismatched conditions on a dataset of nearly 450 hours. While both combinations yield improvements over individual features, feature-level combinations generally perform better. Comparisons with a state-of-the-art spectral-based feature set and a privacy-sensitive feature set are also provided.
Keywords :
acoustic signal processing; audio acoustics; speech processing; feature combinations; kurtosis; personal audio log scenarios; privacy-sensitive audio features; robust front-end processing; spectral flatness; speech detection; speech-nonspeech detection; zero crossing rate; Audio recording; Computer vision; Context modeling; Face detection; Microphones; Pattern analysis; Privacy; Robustness; Speech analysis; Speech processing; Privacy Sensitive Features; Speech/nonspeech detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495596
Filename :
5495596