Title :
A Two-layer Text Clustering Approach for Retrospective News Event Detection
Author :
Dai, Xiangying ; He, Yancheng ; Sun, Yunlian
Author_Institution :
Shenzhen Grad. Sch., Harbin Inst. of Technol., Shenzhen, China
Abstract :
For retrospective news event detection (RED), the widely used agglomerative hierarchical clustering (AHC) has a shortcoming: news stories belonging to different news events are likely to be clustered together if they share enough common words. The reason is that AHC treats each single news story as an initial event at the beginning of the iteration, yet a single news story carries few event-specific features. To address this problem, we propose a two-layer text clustering approach and apply it to RED. In this approach, we first employ a recently developed algorithm, affinity propagation (AP) clustering, as the first layer. We then perform a second round of feature selection on the clusters generated by AP. Finally, we adopt standard agglomerative hierarchical clustering to generate the final news events. We ran a series of experiments on two datasets to evaluate the proposed method, using traditional AHC and classic K-Means as baselines. The experimental results on both datasets show that our approach outperforms AHC and K-Means.
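The three steps described in the abstract (AP as the first layer, a second feature selection on the AP clusters, then AHC) can be illustrated with a minimal sketch in pure Python. It is not the authors' implementation. The abstract does not specify the term weighting, the feature selection criterion, the linkage, or the stopping rule, so the following are all assumptions made for illustration: TF-IDF weights with cosine similarity, a median-similarity AP preference, keeping the `top_k` highest-weighted terms per cluster, average linkage, and a similarity `threshold` for stopping. The function names and the toy `docs` are hypothetical.

```python
import math
from collections import Counter

def tfidf(bags):
    """Weight term counts by a smoothed IDF (an assumed weighting scheme)."""
    n, df = len(bags), Counter(t for b in bags for t in b)
    return [{t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in b.items()}
            for b in bags]

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def affinity_propagation(S, damping=0.5, iters=200):
    """Plain AP message passing; returns the exemplar index of each point."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        for i in range(n):  # responsibility update
            vals = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=vals.__getitem__)
            m1 = vals[first]
            m2 = max((vals[k] for k in range(n) if k != first), default=-math.inf)
            for k in range(n):
                new = S[i][k] - (m2 if k == first else m1)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):  # availability update
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback: every point is its own cluster
        return list(range(n))
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]

def ahc(vecs, threshold):
    """Average-link AHC that stops once the best link falls below threshold."""
    groups = [[i] for i in range(len(vecs))]
    def link(g, h):
        return sum(cosine(vecs[i], vecs[j]) for i in g for j in h) / (len(g) * len(h))
    while len(groups) > 1:
        best, a, b = max((link(groups[a], groups[b]), a, b)
                         for a in range(len(groups))
                         for b in range(a + 1, len(groups)))
        if best < threshold:
            break
        groups[a] += groups.pop(b)
    return groups

def two_layer_cluster(docs, top_k=20, threshold=0.2):
    """Needs at least two documents. Returns (AP labels, event labels)."""
    n = len(docs)
    bags = [Counter(d.lower().split()) for d in docs]
    vecs = tfidf(bags)
    # Layer 1: AP over story-level similarities, median as preference.
    S = [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]
    off = sorted(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = off[len(off) // 2]
    ap = affinity_propagation(S)
    # Second feature selection (assumed): re-weight terms at cluster level
    # and keep only the top_k terms of each AP cluster.
    ids = sorted(set(ap))
    merged = [sum((bags[i] for i in range(n) if ap[i] == c), Counter()) for c in ids]
    cvecs = [dict(sorted(v.items(), key=lambda kv: -kv[1])[:top_k])
             for v in tfidf(merged)]
    # Layer 2: AHC starts from AP clusters instead of single stories.
    groups = ahc(cvecs, threshold)
    event_of = {ids[c]: e for e, g in enumerate(groups) for c in g}
    return ap, [event_of[c] for c in ap]

docs = [
    "earthquake hits coastal city rescue teams deployed",
    "rescue teams search rubble after earthquake",
    "earthquake death toll rises in coastal city",
    "central bank raises interest rates again",
    "interest rates decision surprises markets",
    "markets fall after central bank announcement",
]
ap_labels, event_labels = two_layer_cluster(docs)
print(ap_labels, event_labels)
```

Because the second layer starts from AP clusters rather than single stories, each initial unit carries the pooled vocabulary of several stories, which is the motivation the abstract gives for the two-layer design.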
Keywords :
pattern clustering; text analysis; affinity propagation clustering; agglomerative hierarchical clustering; k-means; retrospective news event detection; two layer text clustering approach; Availability; Clustering algorithms; Clustering methods; Computational modeling; Event detection; Measurement; Text categorization; Affinity Propagation Clustering; Agglomerative Hierarchical Clustering; Retrospective News Event Detection; Vector Space Model;
Conference_Title :
Artificial Intelligence and Computational Intelligence (AICI), 2010 International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-1-4244-8432-4
DOI :
10.1109/AICI.2010.83