Title :
Analyzing and Visualizing Web Opinion Development and Social Interactions With Density-Based Clustering
Author :
Yang, Christopher C. ; Ng, Tobun Dorbin
Author_Institution :
Chinese Univ. of Hong Kong, Hong Kong, China
Abstract :
Due to the advancement of Web 2.0 technologies, a large volume of Web opinions is available on social media sites such as Web forums and Weblogs. These technologies provide a platform for Internet users around the world to communicate with each other and express their opinions. Analysis of developing Web opinions is potentially valuable for discovering ongoing topics of public interest (for applications such as terrorism and crime detection), understanding how topics evolve together with the underlying social interaction between participants, and identifying important participants who have great influence in various discussion topics. Nonetheless, analyzing and clustering Web opinions is extremely challenging. Unlike regular documents, Web opinions are short and sparse text messages with noisy content. Typical document clustering techniques, which aim to assign every document to a cluster, perform unsatisfactorily when applied to Web opinions. In this paper, we investigated a density-based clustering algorithm and proposed a scalable distance-based clustering technique for Web opinion clustering. We conducted experiments benchmarking the new technique against the density-based algorithm, showing that the new technique obtains higher microaccuracy and macroaccuracy. This Web opinion clustering technique enables the identification of themes within discussions in Web social networks and their development, as well as the interactions of active participants. We also developed interactive visualization tools, which use the identified topic clusters to display social network development, the network topology similarity between topics, and the similarity values between participants.
Keywords :
data analysis; data visualisation; pattern clustering; social networking (online); Web 2.0 technology; Web forum; Web opinion analysis; Web opinion clustering; Web opinion development; Web opinion visualization; Weblogs; density-based clustering algorithm; distance-based clustering technique; interactive visualization tool; social interaction; social network development; Algorithm design and analysis; Clustering algorithms; Discussion forums; Social network services; Visualization; Web and internet services; Density-based clustering; information visualization; social media analytics; social network analysis; web opinion mining
Journal_Title :
Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on
DOI :
10.1109/TSMCA.2011.2113334