DocumentCode :
3106228
Title :
Diverse Topic Phrase Extraction through Latent Semantic Analysis
Author :
Chen, Jilin ; Yan, Jun ; Zhang, Benyu ; Yang, Qiang ; Chen, Zheng
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Minnesota, Minneapolis, MN
fYear :
2006
fDate :
18-22 Dec. 2006
Firstpage :
834
Lastpage :
838
Abstract :
We propose a novel algorithm for extracting diverse topic phrases to summarize large corpora. Previous work often ignores the importance of diversity and thus extracts phrases crowded around a few hot topics while failing to cover other less obvious but important topics. We solve this problem through document re-weighting and phrase diversification using latent semantic analysis (LSA). Experiments on various datasets show that our new algorithm can improve both relevance and diversity across topics in topic phrase extraction.
Keywords :
text analysis; diverse topic phrase extraction; document re-weighting; latent semantic analysis; phrase diversification; Asia; Computer science; Data mining; Frequency; Supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
ISSN :
1550-4786
Print_ISBN :
0-7695-2701-7
Type :
conf
DOI :
10.1109/ICDM.2006.61
Filename :
4053112