• DocumentCode
    2208019
  • Title

    Multi-document Summarization Using Minimum Distortion

  • Author

    Ma, Tengfei ; Wan, Xiaojun

  • Author_Institution
    MOE Key Lab. of Comput. Linguistics, Peking Univ., Beijing, China
  • fYear
    2010
  • fDate
    13-17 Dec. 2010
  • Firstpage
    354
  • Lastpage
    363
  • Abstract
    Document summarization plays an important role in natural language processing and text mining. This paper proposes several novel information-theoretic models for multi-document summarization. The models treat document summarization as a transmission system and assume that the best summary is the one with minimum distortion. With a proper distortion measure and a new representation method, the combination of the last two models (the linear representation model and the facility location model) achieves good experimental results on the DUC2002 and DUC2004 datasets. We also show that the model has high interpretability and extensibility.
  • Keywords
    data mining; knowledge representation; natural language processing; text analysis; document summarization; information-theoretic model; text mining; J-S Divergence; information-theoretic summarization; linear representation; minimum distortion; multi-document summarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2010 IEEE 10th International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-9131-5
  • Electronic_ISBN
    1550-4786
  • Type

    conf

  • DOI
    10.1109/ICDM.2010.106
  • Filename
    5693989