DocumentCode
3166333
Title
ProfLifeLog: Environmental analysis and keyword recognition for naturalistic daily audio streams
Author
Sangwan, Abhijeet ; Ziaei, Ali ; Hansen, John H L
Author_Institution
Dept. of Electr. Eng., Univ. of Texas at Dallas, Richardson, TX, USA
fYear
2012
fDate
25-30 March 2012
Firstpage
4941
Lastpage
4944
Abstract
This study presents a keyword recognition evaluation on a new corpus named ProfLifeLog. ProfLifeLog is a collection of data captured on a portable audio recording device called the LENA unit. Each session in ProfLifeLog consists of 10+ hours of continuous audio recording that captures the work day of the speaker (the person wearing the LENA unit). Keyword spotting is evaluated on the ProfLifeLog corpus using the PCN-KWS (phone confusion network-keyword spotting) algorithm [2]. The ProfLifeLog corpus contains speech data in a variety of noise backgrounds, which makes keyword recognition challenging. To improve keyword recognition, this study also develops a front-end environment estimation strategy that uses knowledge of speech-pause decisions and SNR (signal-to-noise ratio) to provide noise robustness. The combination of PCN-KWS and the proposed front-end technique is evaluated on 1 hour of the ProfLifeLog corpus. The evaluation experiments demonstrate the effectiveness of the proposed technique: the number of false alarms in keyword recognition is reduced considerably.
Keywords
audio signal processing; audio streaming; speech recognition; LENA unit; PCN-KWS algorithm; ProfLifeLog corpus; SNR; environmental analysis; front-end environment estimation strategy; keyword recognition; naturalistic daily audio streams; phone confusion network-keyword spotting algorithm; signal-to-noise ratio; speech data; Estimation; Hidden Markov models; Lattices; Signal to noise ratio; Speech; Speech recognition; Environment Estimation; False Alarms; Keyword Spotting; Noise Robustness; Phone Confusion Networks;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location
Kyoto
ISSN
1520-6149
Print_ISBN
978-1-4673-0045-2
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2012.6289028
Filename
6289028