Title :
Extractive speech summarization by active learning
Author :
Zhang, Justin Jian ; Chan, Ricky Ho Yin ; Fung, Pascale
Author_Institution :
Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol. (HKUST), Hong Kong, China
Date :
Nov. 13 2009-Dec. 17 2009
Abstract :
In this paper, we propose an active learning approach for feature-based extractive summarization of lecture speech. Most state-of-the-art speech summarization systems are trained on a large number of human reference summaries. Active learning aims to minimize human annotation effort by automatically selecting a small number of unlabeled examples for labeling. Our method chooses the unlabeled examples according to a combination of an informativeness criterion and a robustness criterion. Our summarization results show an increasing learning curve of ROUGE-L F-measure, from 0.44 to 0.54, consistently higher than that obtained with randomly chosen training samples. We also show that, by following the rhetorical structure in presentation slides, it is possible for humans to produce "gold standard" reference summaries with very high inter-labeler agreement.
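The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the two criteria, so everything here is an assumption made for illustration: informativeness is taken to be the entropy of an averaged model prediction, robustness is taken to be the agreement among a small committee of models, and the names `binary_entropy`, `select_for_labeling`, `weight`, and the example ids are hypothetical. It is not the authors' implementation.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction; largest at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_for_labeling(candidates, k, weight=0.5):
    """Return the ids of the k highest-scoring unlabeled examples.

    candidates maps an example id to a list of probabilities, one per
    committee model, that the example belongs in the summary.
    """
    scored = []
    for ex_id, probs in candidates.items():
        mean_p = sum(probs) / len(probs)
        # ASSUMPTION: informativeness = uncertainty of the averaged prediction.
        informativeness = binary_entropy(mean_p)
        # ASSUMPTION: robustness = 1 - spread of the committee predictions,
        # so examples the models agree on score higher.
        robustness = 1.0 - (max(probs) - min(probs))
        score = weight * informativeness + (1 - weight) * robustness
        scored.append((score, ex_id))
    # Highest score first; ties broken by id for a deterministic order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [ex_id for _, ex_id in scored[:k]]


# Hypothetical pool of unlabeled examples with made-up committee outputs.
pool = {
    "example_a": [0.50, 0.52, 0.48],  # uncertain, committee agrees
    "example_b": [0.95, 0.97, 0.96],  # confident, committee agrees
    "example_c": [0.10, 0.90, 0.50],  # uncertain, committee disagrees
}
chosen = select_for_labeling(pool, k=1)
```

Under these assumed definitions, `example_a` is selected because it is both uncertain and consistently scored. In an active learning loop, the selected examples would be sent for human labeling, added to the training set, and the model retrained before the next selection round.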
Keywords :
feature extraction; learning (artificial intelligence); speech processing; ROUGE-L F-measure; active learning; extractive speech summarization; feature-based extractive summarization; human annotation efforts; human reference summary; learning curve; lecture speech; robustness criterion; state-of-the-art speech summarization systems; Data mining; Guidelines; Humans; Labeling; Natural languages; Reproducibility of results; Speech analysis; Stability; Supervised learning; Training data; speech summarization
Conference_Title :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373269