DocumentCode :
2769858
Title :
A Mandarin lecture speech transcription system for speech summarization
Author :
Chan, Ho Yin ; Zhang, Justin Jian ; Fung, Pascale ; Cao, Lu
Author_Institution :
Univ. of Sci. & Technol., Hong Kong
fYear :
2007
fDate :
9-13 Dec. 2007
Firstpage :
467
Lastpage :
471
Abstract :
This paper introduces our work on Mandarin lecture speech transcription. In particular, we present our work on a small database, which contains only 16 hours of audio data and 0.16 M words of text data. A range of experiments have been done to improve the performance of the acoustic model and the language model. These include adapting the lecture speech data to the read speech data for acoustic modeling, and using lecture conference papers, PowerPoint slides and similar-domain web data for language modeling. We also study the effects of automatic segmentation, unsupervised acoustic model adaptation and language model adaptation in our recognition system. By using a 3×RT multiple-pass decoding strategy, we obtain 70.3% accuracy in our final system. Finally, we feed the output of our speech transcription system into an SVM summarizer and obtain a ROUGE-L F-measure of 66.5%.
Keywords :
natural language processing; speech recognition; Mandarin lecture speech transcription system; automatic segmentation; language modeling; multiple passes decoding strategy; recognition system; speech summarization; Adaptation model; Audio databases; Decoding; Error analysis; Humans; Natural languages; Power system modeling; Speech recognition; Testing; Training data; lecture speech transcription; model adaptation; multi-pass decoding
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
Type :
conf
DOI :
10.1109/ASRU.2007.4430157
Filename :
4430157