DocumentCode
3244458
Title
Automatic indexing of key sentences for lecture archives
Author
Kawahara, Tatsuya ; Shitaoka, K. ; Kitade, Tasuku ; Nanjo, Hiroaki
Author_Institution
Sch. of Informatics, Kyoto Univ., Japan
fYear
2003
fDate
30 Nov.-3 Dec. 2003
Firstpage
141
Lastpage
144
Abstract
Automatic extraction of key sentences from lecture audio archives is addressed. The method makes use of the characteristic expressions used in the initial utterances of sections, which are defined as discourse markers and derived in an unsupervised manner based on word statistics. The statistics of the discourse markers are then used to define the importance of sentences. This measure is also combined with the conventional tf-idf measure for content words. Experimental results confirm the effectiveness of the method using the discourse markers and of its combination with the keyword-based method. We also present a statistical method for inserting periods into raw speech transcriptions to improve readability.
Keywords
indexing; speech processing; speech recognition; statistical analysis; vocabulary; automatic indexing; discourse markers; initial utterances; key sentences; keyword-based method; lecture audio archives; period insertion; raw speech transcriptions; readability; tf-idf measure; unsupervised manner; word statistics; Acoustic testing; Informatics; Loudspeakers; Machine assisted indexing; Natural languages; Speech recognition; Statistical analysis; Statistics; Vocabulary; Voice mail
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN
0-7803-7980-2
Type
conf
DOI
10.1109/ASRU.2003.1318418
Filename
1318418