  • DocumentCode
    3106228
  • Title
    Diverse Topic Phrase Extraction through Latent Semantic Analysis
  • Author
    Chen, Jilin ; Yan, Jun ; Zhang, Benyu ; Yang, Qiang ; Chen, Zheng
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Minnesota, Minneapolis, MN
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    834
  • Lastpage
    838
  • Abstract
    We propose a novel algorithm for extracting diverse topic phrases in order to summarize large corpora. Previous work often ignores the importance of diversity and thus extracts phrases crowded around a few hot topics while failing to cover other, less obvious but important topics. We solve this problem through document re-weighting and phrase diversification using latent semantic analysis (LSA). Experiments on various datasets show that our new algorithm can improve both relevance and diversity across topics in topic phrase extraction.
  • Keywords
    text analysis; diverse topic phrase extraction; document re-weighting; latent semantic analysis; phrase diversification; Asia; Computer science; Data mining; Frequency; Supervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2006. ICDM '06. Sixth International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2701-7
  • Type
    conf
  • DOI
    10.1109/ICDM.2006.61
  • Filename
    4053112