DocumentCode
730074
Title
Robust unsupervised detection of human screams in noisy acoustic environments
Author
Nandwana, Mahesh Kumar ; Ziaei, Ali ; Hansen, John H. L.
Author_Institution
Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
161
Lastpage
165
Abstract
This study focuses on an unsupervised approach for detecting human scream vocalizations in continuous recordings made in noisy acoustic environments. The proposed detection solution is based on compound segmentation, which employs weighted mean distance, T2-statistics, and the Bayesian Information Criterion for detection of screams. The solution also employs an unsupervised, threshold-optimized Combo-SAD to remove non-vocal noisy segments in a preliminary stage. Five different noisy environments were simulated at SNR levels ranging from -20 dB to +20 dB. Performance of the proposed system was compared using two alternative acoustic front-end features: (i) Mel-frequency cepstral coefficients (MFCC) and (ii) perceptual minimum variance distortionless response (PMVDR). Evaluation results show that the new scream detection solution works well for clean, +20 dB, and +10 dB SNR conditions, with performance declining as SNR decreases to -20 dB across a number of the noise sources considered.
Keywords
Bayes methods; acoustic noise; acoustic signal detection; cepstral analysis; signal denoising; speech recognition; Bayesian information criteria; MFCC; Mel-frequency cepstral coefficients; PMVDR; T2-statistics; acoustic front end features; compound segmentation; continuous recordings; human scream unsupervised detection; human scream vocalizations; noisy acoustic environments; nonvocal noisy segment removal; perceptual minimum variance distortionless response; unsupervised threshold optimized Combo-SAD; weighted mean distance; Mel frequency cepstral coefficient; Principal component analysis; Robustness; Single photon emission computed tomography; CompSeg; T2 distance; T2-BIC SAD; scream detection
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7177952
Filename
7177952