DocumentCode
3106228
Title
Diverse Topic Phrase Extraction through Latent Semantic Analysis
Author
Chen, Jilin ; Yan, Jun ; Zhang, Benyu ; Yang, Qiang ; Chen, Zheng
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of Minnesota, Minneapolis, MN
fYear
2006
fDate
18-22 Dec. 2006
Firstpage
834
Lastpage
838
Abstract
We propose a novel algorithm for extracting diverse topic phrases in order to provide a summary of large corpora. Previous work often ignores the importance of diversity and thus extracts phrases crowded around a few hot topics while failing to cover other, less obvious but important topics. We solve this problem through document re-weighting and phrase diversification using latent semantic analysis (LSA). Experiments on various datasets show that our new algorithm can improve relevance as well as diversity across topics in topic phrase extraction problems.
Keywords
text analysis; diverse topic phrase extraction; document re-weighting; latent semantic analysis; phrase diversification; Asia; Computer science; Data mining; Frequency; Supervised learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location
Hong Kong
ISSN
1550-4786
Print_ISBN
0-7695-2701-7
Type
conf
DOI
10.1109/ICDM.2006.61
Filename
4053112