• DocumentCode
    3426746
  • Title
    Rhetorical-State Hidden Markov Models for extractive speech summarization
  • Author
    Fung, Pascale; Chan, Ricky Ho Yin; Zhang, Justin Jian
  • Author_Institution
    Dept. of Electron. & Comput. Eng., Human Language Technol. Center, Hong Kong Univ. of Sci. & Technol., Hong Kong
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    4957
  • Lastpage
    4960
  • Abstract
    We propose an extractive summarization system with a novel non-generative probabilistic framework for speech summarization. One of the most underutilized features in extractive summarization is rhetorical information: semantically cohesive units that are hidden in spoken documents. We propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode this underlying structure in speech. We show that RSHMMs give a 71.69% ROUGE-L F-measure, a 5.69% absolute increase in lecture speech summarization performance over the baseline system without RSHMMs. The RSHMM-based system also outperforms the baseline system with additional discourse features, showing that our RSHMM is a refinement of the conventional discourse feature.
  • Keywords
    hidden Markov models; speech processing; baseline system; extractive speech summarization; nongenerative probabilistic framework; rhetorical information; rhetorical-state hidden Markov models; Automatic speech recognition; Data mining; Decoding; Feature extraction; Humans; Natural languages; Support vector machine classification; Support vector machines; Text recognition; speech features; spoken document summarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518770
  • Filename
    4518770