DocumentCode :
3157999
Title :
The Impact of Measurement Time on Subgroup Detection in Online Communities
Author :
Zeini, Sam ; Gohnert, Tilman ; Hoppe, Ulrich ; Krempel, L.
Author_Institution :
Univ. Duisburg-Essen, Duisburg, Germany
fYear :
2012
fDate :
26-29 Aug. 2012
Firstpage :
389
Lastpage :
394
Abstract :
More and more communities use Internet-based services and infrastructure for communication and collaboration. These activities leave digital traces that are of interest to researchers as real-world data sources that can be processed automatically or semi-automatically. Since productive online communities (such as open source developer teams) tend to support the establishment of ties between actors who work on or communicate about the same or similar objects, social network analysis (SNA) is a frequently used research methodology in this field. A typical application of SNA techniques is the detection of cohesive subgroups of actors (also called "community detection"). A relatively new method for this task is the Clique Percolation Method (CPM), which can detect overlapping subgroups. We have used CPM to analyze data from several open source developer communities (mailing lists and log files) and have compared the results across different measurement time windows. The influence of the time span of data capture and aggregation can be compared to photography: a certain minimal window size is needed to get a clear image with enough "light" (i.e., sufficiently dense interaction data), whereas for very long time spans the image is blurred because subgroup membership changes during the time span (corresponding to a moving target). In this sense, our target parameter is the "resolution" of subgroup structures. We have identified several indicators of good resolution. Applying these indicators to the different CPM results shows that the best resolution is obtained with a time span of around 2-3 months. In general, this value will vary between types of communities with different communication frequency and behavior. Following our findings, an explicit analysis and comparison of the influence of the time window for different communities may be used to better adjust analysis techniques to the communities at hand.
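The idea of running CPM on differently sized time windows can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the function names `k_clique_communities` and `communities_per_window`, the toy `events` list, the choice k = 3, and the brute-force clique enumeration, which is only practical for very small graphs.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations


def k_clique_communities(edges, k=3):
    """Clique percolation: merge k-cliques that share k-1 nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    # Brute-force enumeration of all k-cliques (illustrative, small graphs only).
    cliques = [frozenset(c) for c in combinations(sorted(adj), k)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    # Union-find over cliques; two cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)
    groups = defaultdict(set)
    for i, clique in enumerate(cliques):
        groups[find(i)] |= clique
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def communities_per_window(events, start, window_days, k=3):
    """Bucket (day, actor, actor) events into consecutive windows, run CPM on each."""
    buckets = defaultdict(set)
    for day, u, v in events:
        idx = (day - start).days // window_days
        buckets[idx].add(tuple(sorted((u, v))))
    return {idx: k_clique_communities(pairs, k)
            for idx, pairs in sorted(buckets.items())}


# Hypothetical interaction log: (date, actor, actor), e.g. mailing-list replies.
events = [
    (date(2012, 1, 5), "a", "b"), (date(2012, 1, 9), "b", "c"),
    (date(2012, 1, 20), "a", "c"), (date(2012, 2, 3), "c", "d"),
    (date(2012, 2, 10), "d", "e"), (date(2012, 2, 25), "c", "e"),
    (date(2012, 3, 10), "b", "d"),
]
short_windows = communities_per_window(events, date(2012, 1, 1), 30)
long_window = communities_per_window(events, date(2012, 1, 1), 90)
```

On this toy data the 30-day windows separate the subgroups {a, b, c} and {c, d, e} and find no subgroup in the third window, whereas the single 90-day window merges all five actors into one subgroup, a small-scale analogue of the "blurring" described above.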
Keywords :
Internet; data mining; public domain software; social networking (online); software engineering; CPM; Internet based infrastructure; Internet based service; actor ties; clique percolation method; cohesive subgroup detection; collaboration; communication behavior; communication frequency; community detection; data aggregation; data analysis; data capturing; data source; digital trace; interaction data; log files; mailing lists; measurement time window; open source developer team; overlapping subgroup detection; productive online community; social network analysis; subgroup membership; subgroup structure resolution; time span; Communities; Internet; Licenses; Size measurement; Social network services; Software; Time measurement; Clique Percolation Method; Clustering; Community Detection; Time Dimension; k-cliques;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advances in Social Networks Analysis and Mining (ASONAM), 2012 IEEE/ACM International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4673-2497-7
Type :
conf
DOI :
10.1109/ASONAM.2012.70
Filename :
6425734