DocumentCode :
3244865
Title :
Pitch-based emphasis detection for characterization of meeting recordings
Author :
Kennedy, Lyndon S.; Ellis, Daniel P. W.
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
fYear :
2003
fDate :
30 Nov.-3 Dec. 2003
Firstpage :
243
Lastpage :
248
Abstract :
The automatic extraction of key utterances in spoken data has emerged as an interesting and difficult topic in automatic speech recognition. "Emphasis" or "excitement" may be a useful identifier for these utterances of interest. We undertake the task of reliably and automatically identifying emphasized or excited utterances in natural speech in a meeting setting. We start by endeavoring to establish reliable ground truth emphasis labels by using several hand-labelers. The results show that human listeners can reliably identify emphasized utterances in meeting recordings. We then build an automatic emphasis detection system, which uses normalized pitch as its only acoustic predictor. The results show that this pitch-based emphasis detection scheme can distinguish between non-emphasized and emphasized utterances with an accuracy of 92% when ambiguous cases are excluded, a rate comparable to human interlabeler agreement.
Keywords :
feature extraction; natural languages; speech recognition; acoustic predictor; automatic speech recognition; excited utterances; key utterance extraction; meeting recording characterization; natural speech; normalized pitch; pitch-based emphasis detection; Acoustic signal detection; Automatic speech recognition; Data mining; Humans; Intelligent systems; Labeling; Loudspeakers; Microphones; Natural languages; Speech recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
Type :
conf
DOI :
10.1109/ASRU.2003.1318448
Filename :
1318448