• DocumentCode
    2824862
  • Title

    Query-Based Multi-Document Summarization Using Non-Negative Semantic Feature and NMF Clustering

  • Author

    Park, Sun ; Cha, ByungRae

  • Author_Institution
    Dept. of Comput. Eng., Univ. of Honam, Gwangju
  • Volume
    2
  • fYear
    2008
  • fDate
    2-4 Sept. 2008
  • Firstpage
    609
  • Lastpage
    614
  • Abstract
    This paper introduces a novel summarization method that uses non-negative matrix factorization (NMF) and NMF clustering to extract meaningful sentences from multiple documents with respect to a query. The proposed method decomposes a sentence into a linear combination of sparse non-negative semantic features, so a sentence can be represented as the sum of a few intuitively comprehensible semantic features. It can improve the quality of document summaries because, by using the similarity between the query and the semantic features, it avoids extracting sentences that are highly similar to the query but meaningless. It also uses NMF clustering to remove noise, which keeps the biased inherent semantics of the documents from being reflected in the summaries. In addition, it can ensure the coherence of summaries by using the rank score of sentences with respect to the semantic features. The experimental results show that the proposed method performs better than other methods based on a thesaurus, LSA, K-means, and NMF.
  • Keywords
    matrix decomposition; pattern clustering; query processing; text analysis; NMF clustering; nonnegative matrix factorization; query-based multidocument summarization; sparse nonnegative semantic feature; Cognition; Computer networks; Data mining; Humans; Information management; Matrix decomposition; Sorting; Sparse matrices; Sun; Thesauri
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networked Computing and Advanced Information Management, 2008. NCM '08. Fourth International Conference on
  • Conference_Location
    Gyeongju
  • Print_ISBN
    978-0-7695-3322-3
  • Type

    conf

  • DOI
    10.1109/NCM.2008.246
  • Filename
    4624213