DocumentCode
3042736
Title
Collecting Random Samples from Facebook: An Efficient Heuristic for Sampling Large and Undirected Graphs via a Metropolis-Hastings Random Walk
Author
Pina-Garcia, C.A. ; Dongbing Gu
Author_Institution
Sch. of Comput. Sci. & Electron. Eng., Univ. of Essex, Colchester, UK
fYear
2013
fDate
13-16 Oct. 2013
Firstpage
2244
Lastpage
2249
Abstract
The lack of a sampling frame (i.e., a complete list of users) for most Online Social Networks (OSNs) makes sampling especially difficult. Reliable and efficient sampling methods are therefore essential for practical estimation of OSN properties. Recent work in this area has focused on sampling methods that allow precise inference from relatively large-scale social networks such as Facebook. We propose a sampling method for OSNs based on a Metropolis-Hastings Random Walk (MHRW) algorithm, and we have developed a social explorer to collect random samples from Facebook. In addition, we address the question of whether different probability distributions can alter the behavior of the MHRW and make it more effective at yielding a representative sample; that is, we seek to understand whether the MHRW algorithm can be improved by switching the random generator. We evaluate the performance of our MHRW algorithm by providing descriptive statistics of the collected data. Moreover, we sketch the collection procedure carried out on Facebook in real time. Finally, we provide a formal convergence analysis to evaluate whether the sample of draws has attained an equilibrium state, which gives a rough estimate of the sample quality.
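For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of a textbook Metropolis-Hastings random walk on an undirected graph, not the authors' implementation. It assumes the target distribution is uniform over nodes, so a proposed move from node v to a neighbor w is accepted with probability min(1, deg(v)/deg(w)). The function name `mhrw_sample`, the toy adjacency dictionary `graph`, and the injectable `rng` argument (a stand-in for the idea of switching the random generator) are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def mhrw_sample(neighbors, start, steps, rng=None):
    """Walk `steps` steps of a Metropolis-Hastings random walk.

    neighbors: dict mapping each node to a list of adjacent nodes
               (undirected graph, so adjacency must be symmetric).
    rng:       any object with .choice() and .random(); hypothetical hook
               for swapping in a different random generator.
    Returns the list of visited nodes, one entry per step.
    """
    rng = rng or random.Random()
    current = start
    visited = []
    for _ in range(steps):
        # Propose a neighbor uniformly at random.
        candidate = rng.choice(neighbors[current])
        # Accept with probability min(1, deg(current) / deg(candidate));
        # this corrects the plain random walk's bias toward high-degree nodes.
        ratio = len(neighbors[current]) / len(neighbors[candidate])
        if rng.random() < ratio:
            current = candidate
        # On rejection the walk stays put, and the node is recorded again.
        visited.append(current)
    return visited


# Toy graph (illustrative only): node "a" has degree 3, "b" has degree 1.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["a", "c"],
}

samples = mhrw_sample(graph, "a", 20000, random.Random(42))
counts = Counter(samples)
frequencies = {node: counts[node] / len(samples) for node in graph}
```

On a small connected graph like this one, the visit frequencies should approach 1/4 per node regardless of degree, whereas a plain random walk would visit "a" three times as often as "b". A real crawler would replace the dictionary lookup with calls to the network being sampled and would need a convergence check before trusting the sample.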
Keywords
graph theory; optimisation; social networking (online); statistical distributions; Facebook; MHRW algorithm; OSNs; descriptive statistics; formal convergence analysis; heuristic; metropolis-hastings random walk algorithm; online social networks; performance evaluation; probability distributions; random generator; random sample collection; social explorer; undirected graph sampling; Algorithm design and analysis; Convergence; Facebook; Gaussian distribution; Probability distribution; Standards; Facebook; Markov Chain Monte Carlo Methods; Probability Distributions; Random Walks; Social Networks;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on
Conference_Location
Manchester
Type
conf
DOI
10.1109/SMC.2013.384
Filename
6722137