Robust unsupervised detection of human screams in noisy acoustic environments

Author

Nandwana, Mahesh Kumar ; Ziaei, Ali ; Hansen, John H. L.

Author_Institution

Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA

fYear

2015

fDate

19-24 April 2015

Firstpage

161

Lastpage

165

Abstract

This study focuses on an unsupervised approach for detecting human scream vocalizations in continuous recordings made in noisy acoustic environments. The proposed detection solution is based on compound segmentation, which employs weighted mean distance, T²-statistics and the Bayesian Information Criterion (BIC) for detection of screams. This solution also employs an unsupervised, threshold-optimized Combo-SAD to remove non-vocal noisy segments in a preliminary stage. Five different noisy environments were simulated, with noise levels ranging from -20 dB to +20 dB. Performance of the proposed system was compared using two alternative acoustic front-end features: (i) Mel-frequency cepstral coefficients (MFCC) and (ii) perceptual minimum variance distortionless response (PMVDR). Evaluation results show that the new scream detection solution works well for clean conditions and at +20 dB and +10 dB SNR, with performance declining as SNR decreases to -20 dB across a number of the noise sources considered.
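
The abstract names T²-statistics and the Bayesian Information Criterion as ingredients of the segmentation stage but does not give formulas. As a rough illustration only, the sketch below computes the textbook two-sample Hotelling T² statistic and the commonly used ΔBIC change-point score between two adjacent blocks of feature vectors (for example MFCC frames). The function names `hotelling_t2` and `delta_bic`, the pooled-covariance choice, and the `penalty_weight` parameter are assumptions made for this example; they are not taken from the paper, whose compound segmentation may combine these quantities differently.

```python
import numpy as np


def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 between feature blocks x and y (frames x dims).

    Assumption: a pooled covariance of the two blocks is used.
    """
    n1, n2 = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    # Solve pooled @ v = diff instead of forming an explicit inverse.
    return (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(pooled, diff)


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC for 'one Gaussian' versus 'two Gaussians' over x followed by y.

    Positive values favour a change point at the boundary between x and y.
    """
    z = np.vstack([x, y])
    n, d = z.shape

    def logdet(block):
        # Log-determinant of the maximum-likelihood covariance estimate.
        return np.linalg.slogdet(np.cov(block, rowvar=False, bias=True))[1]

    gain = 0.5 * (n * logdet(z) - len(x) * logdet(x) - len(y) * logdet(y))
    # Extra parameters of the second Gaussian: mean plus full covariance.
    n_params = d + d * (d + 1) / 2
    return gain - 0.5 * penalty_weight * n_params * np.log(n)
```

In a sliding-window segmenter, a cheap statistic such as T² is often used to shortlist candidate boundaries, with the costlier ΔBIC test confirming them; whether the paper follows that ordering is not stated in the abstract.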

Keywords

Bayes methods; acoustic noise; acoustic signal detection; cepstral analysis; signal denoising; speech recognition; Bayesian information criteria; MFCC; Mel-frequency cepstral coefficients; PMVDR; T2-statistics; acoustic front end features; compound segmentation; continuous recordings; human scream unsupervised detection; human scream vocalizations; noisy acoustic environments; nonvocal noisy segment removal; perceptual minimum variance distortionless response; unsupervised threshold optimized Combo-SAD; weighted mean distance; Mel frequency cepstral coefficient; Principal component analysis; Robustness; Single photon emission computed tomography; CompSeg; T² distance; T²-BIC SAD; scream detection

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7177952

Filename

7177952