• DocumentCode
    3141523
  • Title

    Multi-document summarization based on hierarchical topic model

  • Author

    Liu, Hongyan ; Li, Lei

  • Author_Institution
    Center for Intell. Sci. & Technol., Beijing Univ. of Posts & Telecommun., Beijing, China
  • fYear
    2011
  • fDate
    27-29 Nov. 2011
  • Firstpage
    88
  • Lastpage
    91
  • Abstract
    In this paper, we introduce an extractive multi-document summarization method based on a hierarchical topic model, hierarchical Latent Dirichlet Allocation (hLDA), and sentence compression. hLDA is a representative generative probabilistic model that not only mines latent topics from a large amount of discrete data but also organizes these topics into a hierarchy, enabling deeper semantic analysis. We also use sentence compression to refine the summaries and make them more concise. We use the TAC 2010 data sets as the test corpus and the ROUGE method to evaluate our summaries. The evaluations confirm that our method performs better than some traditional methods.
  • Keywords
    data compression; document handling; statistical analysis; ROUGE method; hierarchical latent Dirichlet allocation model; hierarchical topic model; multidocument summarization; semantic analysis; sentence compression technology; hierarchical Latent Dirichlet Allocation topic model; multi-document summarization; sentence compression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
  • Conference_Location
    Tokushima
  • Print_ISBN
    978-1-61284-729-0
  • Type

    conf

  • DOI
    10.1109/NLPKE.2011.6138174
  • Filename
    6138174