DocumentCode :
1040381
Title :
Hot Topic Extraction Based on Timeline Analysis and Multidimensional Sentence Modeling
Author :
Chen, Kuan-Yu ; Luesukprasert, Luesak ; Chou, Seng-Cho T.
Author_Institution :
Nat. Taiwan Univ., Taipei
Volume :
19
Issue :
8
Year :
2007
Firstpage :
1016
Lastpage :
1025
Abstract :
With the vast amount of digitized textual materials now available on the Internet, it is almost impossible for people to absorb all pertinent information in a timely manner. To alleviate the problem, we present a novel approach for extracting hot topics from disparate sets of textual documents published in a given time period. Our technique consists of two steps. First, hot terms are extracted by mapping their distribution over time. Second, based on the extracted hot terms, key sentences are identified and then grouped into clusters that represent hot topics by using multidimensional sentence vectors. The results of our empirical tests show that this approach is more effective in identifying hot topics than existing methods.
Keywords :
knowledge acquisition; text analysis; Internet; digitized textual materials; extracted hot terms; multidimensional sentence modeling; multidimensional sentence vectors; textual documents; timeline analysis; Aggregates; Data mining; Event detection; Explosions; Humans; Information analysis; Internet; Multidimensional systems; Organizing; Testing; Aging theory; clustering; hot topic detection; term weighting; topic detection and tracking
Language :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2007.1040
Filename :
4262533