• DocumentCode
    1480429
  • Title
    Analyzing and Visualizing Web Opinion Development and Social Interactions With Density-Based Clustering
  • Author
    Yang, Christopher C.; Ng, Tobun Dorbin
  • Author_Institution
    Chinese Univ. of Hong Kong, Hong Kong, China
  • Volume
    41
  • Issue
    6
  • fYear
    2011
  • Firstpage
    1144
  • Lastpage
    1155
  • Abstract
    Due to the advancement of Web 2.0 technologies, a large volume of Web opinions is available on social media sites such as Web forums and Weblogs. These technologies provide a platform for Internet users around the world to communicate with each other and express their opinions. Analysis of developing Web opinions is potentially valuable for discovering ongoing topics of interest to the public, as in terrorism and crime detection, understanding how topics evolve together with the underlying social interaction between participants, and identifying important participants who have great influence in various topics of discussion. Nonetheless, the work of analyzing and clustering Web opinions is extremely challenging. Unlike regular documents, Web opinions are short and sparse text messages with noisy content. Typical document clustering techniques, which aim to cluster all documents, perform unsatisfactorily when applied to Web opinions. In this paper, we investigated the density-based clustering algorithm and proposed the scalable distance-based clustering technique for Web opinion clustering. We conducted experiments and benchmarked against the density-based algorithm to show that the new algorithm obtains higher microaccuracy and macroaccuracy. This Web opinion clustering technique enables the identification of themes within discussions in Web social networks and their development, as well as the interactions of active participants. We also developed interactive visualization tools, which make use of the identified topic clusters to display social network development, the network topology similarity between topics, and the similarity values between participants.
  • Keywords
    data analysis; data visualisation; pattern clustering; social networking (online); Web 2.0 technology; Web forum; Web opinion analysis; Web opinion clustering; Web opinion development; Web opinion visualization; Weblogs; density-based clustering algorithm; distance-based clustering technique; interactive visualization tool; social interaction; social network development; Algorithm design and analysis; Clustering algorithms; Discussion forums; Social network services; Visualization; Web and internet services; Density-based clustering; information visualization; social media analytics; social network analysis; web opinion mining
  • fLanguage
    English
  • Journal_Title
    Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4427
  • Type
    jour
  • DOI
    10.1109/TSMCA.2011.2113334
  • Filename
    5738691