• DocumentCode
    3244458
  • Title

    Automatic indexing of key sentences for lecture archives

  • Author

    Kawahara, Tatsuya ; Shitaoka, K. ; Kitade, Tasuku ; Nanjo, Hiroaki

  • Author_Institution
    Sch. of Informatics, Kyoto Univ., Japan
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    141
  • Lastpage
    144
  • Abstract
    Automatic extraction of key sentences from lecture audio archives is addressed. The method makes use of the characteristic expressions used in the initial utterances of sections, which are defined as discourse markers and derived in an unsupervised manner based on word statistics. The statistics of the discourse markers are then used to define the importance of the sentences. This measure is also combined with the conventional tf-idf measure for content words. Experimental results confirm the effectiveness of the method using the discourse markers and of its combination with the keyword-based method. We also present a statistical method for inserting periods into raw speech transcriptions to improve readability.
  • Keywords
    indexing; speech processing; speech recognition; statistical analysis; vocabulary; automatic indexing; discourse markers; initial utterances; key sentences; keyword-based method; lecture audio archives; period insertion; raw speech transcriptions; readability; tf-idf measure; unsupervised manner; word statistics; Acoustic testing; Informatics; Loudspeakers; Machine assisted indexing; Natural languages; Speech recognition; Statistical analysis; Statistics; Vocabulary; Voice mail
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2003.1318418
  • Filename
    1318418