• DocumentCode
    54040
  • Title

    HierarchicalTopics: Visually Exploring Large Text Collections Using Topic Hierarchies

  • Author

    Wenwen Dou; Li Yu; Xiaoyu Wang; Zhiqiang Ma; William Ribarsky

  • Author_Institution
    Univ. of North Carolina at Charlotte, Charlotte, NC, USA
  • Volume
    19
  • Issue
    12
  • fYear
    2013
  • fDate
    Dec. 2013
  • Firstpage
    2002
  • Lastpage
    2011
  • Abstract
    Analyzing large textual collections has become increasingly challenging given the size of the data available and the rate at which more data is being generated. Topic-based text summarization methods coupled with interactive visualizations have presented promising approaches to address the challenge of analyzing large text corpora. As the text corpora and vocabulary grow larger, more topics need to be generated in order to capture the meaningful latent themes and nuances in the corpora. However, it is difficult for most current topic-based visualizations to represent a large number of topics without becoming cluttered or illegible. To facilitate the representation and navigation of a large number of topics, we propose a visual analytics system, HierarchicalTopics (HT). HT integrates a computational algorithm, Topic Rose Tree, with an interactive visual interface. The Topic Rose Tree constructs a topic hierarchy based on a list of topics. The interactive visual interface is designed to present the topic content as well as the temporal evolution of topics in a hierarchical fashion. User interactions are provided so that users can change the topic hierarchy based on their mental model of the topic space. To qualitatively evaluate HT, we present a case study that showcases how HierarchicalTopics aids expert users in making sense of a large number of topics and discovering interesting patterns of topic groups. We have also conducted a user study to quantitatively evaluate the effect of hierarchical topic structure. The study results reveal that HT leads to faster identification of a large number of relevant topics. We have also solicited user feedback during the experiments and incorporated some suggestions into the current version of HierarchicalTopics.
  • Keywords
    computational complexity; data visualisation; text analysis; trees (mathematics); user interfaces; vocabulary; HierarchicalTopic; computational algorithm; hierarchical topic structure; interactive visual interface; interactive visualizations; text collections; text corpora; textual collections; topic groups; topic hierarchy; topic rose tree; topic-based text summarization methods; topic-based visualizations; user feedback; user interactions; visual analytics system; Algorithm design and analysis; Analytical models; Computational modeling; Text mining; Visual analytics; Hierarchical topic representation; rose tree; topic modeling; Algorithms; Artificial Intelligence; Computer Graphics; Documentation; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Natural Language Processing; Pattern Recognition, Automated; Software; User-Computer Interface
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type

    jour

  • DOI
    10.1109/TVCG.2013.162
  • Filename
    6634160