DocumentCode
21527
Title
How Hierarchical Topics Evolve in Large Text Corpora
Author
Weiwei Cui ; Shixia Liu ; Zhuofeng Wu ; Hao Wei
Author_Institution
Microsoft Res., Redmond, WA, USA
Volume
20
Issue
12
fYear
2014
fDate
Dec. 31 2014
Firstpage
2281
Lastpage
2290
Abstract
Using a sequence of topic trees to organize documents is a popular way to represent hierarchical and evolving topics in text corpora. However, following evolving topics in the context of topic trees remains difficult for users. To address this issue, we present an interactive visual text analysis approach to allow users to progressively explore and analyze the complex evolutionary patterns of hierarchical topics. The key idea behind our approach is to exploit a tree cut to approximate each tree and allow users to interactively modify the tree cuts based on their interests. In particular, we propose an incremental evolutionary tree cut algorithm with the goal of balancing 1) the fitness of each tree cut and the smoothness between adjacent tree cuts; 2) the historical and new information related to user interests. A time-based visualization is designed to illustrate the evolving topics over time. To preserve the mental map, we develop a stable layout algorithm. As a result, our approach can quickly guide users to progressively gain profound insights into evolving hierarchical topics. We evaluate the effectiveness of the proposed method on Amazon's Mechanical Turk and real-world news data. The results show that users are able to successfully analyze evolving topics in text data.
Keywords
text analysis; complex evolutionary patterns; document organisation; evolutionary tree cut algorithm; hierarchical topics; interactive visual text analysis; large text corpora; stable layout algorithm; time based visualization; topic trees sequence; tree cut; Algorithm design and analysis; Context awareness; Data visualization; Document handling; Text analysis; Text mining; Hierarchical topic visualization; data transformation; evolutionary tree clustering
fLanguage
English
Journal_Title
Visualization and Computer Graphics, IEEE Transactions on
Publisher
IEEE
ISSN
1077-2626
Type
jour
DOI
10.1109/TVCG.2014.2346433
Filename
6875938