• DocumentCode
    835944
  • Title
    A Probabilistic Model of Meetings That Combines Words and Discourse Features
  • Author
    Dowman, Mike ; Savova, Virginia ; Griffiths, Thomas L. ; Kording, Konrad P. ; Tenenbaum, Joshua B. ; Purver, Matthew
  • Author_Institution
    Dept. of Gen. Syst. Studies, Univ. of Tokyo, Tokyo
  • Volume
    16
  • Issue
    7
  • fYear
    2008
  • Firstpage
    1238
  • Lastpage
    1248
  • Abstract
    In order to determine the points at which meeting discourse changes from one topic to another, probabilistic models were used to approximate the process through which meeting transcripts were produced. Gibbs sampling was used to estimate the values of random variables in the models, including the locations of topic boundaries. This paper shows how discourse features were integrated into the Bayesian model and reports empirical evaluations of the benefit obtained through the inclusion of each feature and of the suitability of alternative models of the placement of topic boundaries. It demonstrates how multiple cues to segmentation can be combined in a principled way, and empirical tests show a clear improvement over previous work.
  • Keywords
    Bayes methods; natural languages; probability; text analysis; Bayesian model; Gibbs sampling; meeting discourse features; multiple cues; natural language texts; probabilistic model; random variables; topic boundaries; word features; word segmentation; Bayesian methods; Computational linguistics; Monte Carlo methods; Natural languages; Probability distribution; Psychology; Random variables; Sampling methods; Testing; Training data; Gibbs Sampling; Markov chain Monte Carlo; hierarchical Bayesian models; latent Dirichlet allocation; topical segmentation;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2008.925867
  • Filename
    4599394