Automatic transcription of spontaneous lecture speech

Author

Kawahara, Tatsuya ; Nanjo, Hiroaki ; Furui, Sadaoki

Author_Institution

Sch. of Informatics, Kyoto Univ., Japan

fYear

2001

fDate

2001

Firstpage

186

Lastpage

189

Abstract

We introduce our extensive projects on spontaneous speech processing and our current trials of lecture speech recognition. A large corpus of lecture presentations and talks is being collected in the project. We have trained initial baseline models and confirmed a significant difference between real lectures and written notes. In spontaneous lecture speech, the speaking rate is generally faster and varies considerably, which makes it harder to apply fixed segmentation and decoding settings. We therefore propose sequential decoding and speaking-rate dependent decoding strategies. The sequential decoder performs automatic segmentation and decoding of input utterances simultaneously. The most suitable acoustic analysis, phone models, and decoding parameters are then applied according to the current speaking rate. These strategies improve the automatic transcription of real lecture speech.
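
The speaking-rate dependent strategy described in the abstract can be pictured with a minimal sketch. The abstract does not say how the speaking rate is estimated or which settings are switched, so everything here is an assumption for illustration: the rate is taken as phones per second from a first-pass result, and the `DecodingConfig` fields, the model labels, the thresholds, and the parameter values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodingConfig:
    """Hypothetical bundle of rate-dependent settings (illustrative names)."""
    frame_shift_ms: float      # acoustic analysis frame shift
    phone_model: str           # label of the phone model set to use
    insertion_penalty: float   # one example of a decoding parameter


# Assumed values for illustration only; none are taken from the paper.
CONFIGS = {
    "slow": DecodingConfig(10.0, "phone_models_normal", -2.0),
    "normal": DecodingConfig(10.0, "phone_models_normal", 0.0),
    "fast": DecodingConfig(8.0, "phone_models_fast", 2.0),
}


def speaking_rate(num_phones: int, duration_s: float) -> float:
    """Return phones per second for one segment."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return num_phones / duration_s


def select_config(rate: float,
                  slow_below: float = 10.0,
                  fast_above: float = 15.0) -> DecodingConfig:
    """Pick a configuration from the estimated rate (thresholds are assumed)."""
    if rate < slow_below:
        return CONFIGS["slow"]
    if rate > fast_above:
        return CONFIGS["fast"]
    return CONFIGS["normal"]


# Usage: a 3-second segment in which a first pass found 48 phones.
rate = speaking_rate(48, 3.0)
config = select_config(rate)
```

In a sequential decoder, a selection step of this kind would be repeated for each automatically detected segment, so the settings follow the speaker as the rate changes.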

Keywords

acoustic signal processing; learning (artificial intelligence); natural language interfaces; speech processing; speech recognition; text analysis; acoustic analysis; automatic transcription; lecture speech recognition; segmentation; sequential decoding; speaking rate; spontaneous speech processing; Automatic speech recognition; Broadcasting; Buildings; Decoding; Informatics; Loudspeakers; Speech analysis; Speech processing; Telephony; Testing

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on

Print_ISBN

0-7803-7343-X

Type

conf

DOI

10.1109/ASRU.2001.1034618

Filename

1034618