DocumentCode :
2420051
Title :
Towards Unbiased Sampling of Online Social Networks
Author :
Wang, Dong ; Li, Zhenyu ; Xie, Gaogang
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci. (CAS), Beijing, China
fYear :
2011
fDate :
5-9 June 2011
Firstpage :
1
Lastpage :
5
Abstract :
Unbiased sampling of online social networks (OSNs) makes it possible to obtain accurate statistical properties of large-scale OSNs. However, the most widely used sampling methods, Breadth-First-Search (BFS) and Greedy, are known to be biased towards high-degree nodes, yielding inaccurate statistical results. To give a general requirement for unbiased sampling, we model the crawling process as a Markov chain and deduce a necessary and sufficient condition, which enables us to design various efficient unbiased sampling methods. To the best of our knowledge, we are among the first to give such a condition. Metropolis-Hastings Random Walk (MHRW) is one example that satisfies the condition. However, walkers in MHRW may stay at some low-degree nodes for a long time, resulting in considerable self-loops on these nodes, which adversely affect crawling efficiency. Based on the condition, a new unbiased sampling method, called USRS, is proposed to reduce the probability of self-loops. We use the dataset of Renren, the largest OSN in China, to evaluate the performance of USRS. The results demonstrate that USRS generates unbiased samples with low self-loop probabilities and achieves higher crawling efficiency.
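The self-loop behaviour of MHRW mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard MHRW rule for a uniform target distribution: pick a neighbour uniformly at random and accept the move with probability min(1, deg(current)/deg(candidate)). The function name `mhrw_sample`, the toy star graph `adj`, and the self-loop counter are illustrative assumptions, not taken from the paper, and USRS itself is not shown because the record does not describe its transition rule.

```python
import random

# Hypothetical toy graph (not from the paper): a star with hub 0 and leaves 1..5,
# stored as an adjacency list.
adj = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}


def mhrw_sample(adj, start, steps, rng):
    """Metropolis-Hastings random walk targeting a uniform distribution over nodes.

    Returns the sequence of visited nodes and the number of rejected moves
    (self-loops), i.e. steps at which the walker stayed where it was.
    """
    current = start
    visits = [current]
    self_loops = 0
    for _ in range(steps):
        # Propose a neighbour uniformly at random.
        candidate = rng.choice(adj[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        # A low-degree node proposing a high-degree neighbour is often rejected.
        if rng.random() < len(adj[current]) / len(adj[candidate]):
            current = candidate
        else:
            self_loops += 1
        visits.append(current)
    return visits, self_loops


if __name__ == "__main__":
    visits, self_loops = mhrw_sample(adj, start=0, steps=20000, rng=random.Random(0))
    hub_share = visits.count(0) / len(visits)
    print(f"share of samples at the hub: {hub_share:.3f}")
    print(f"self-loop steps: {self_loops} of {len(visits) - 1}")
```

In this toy graph a leaf has degree 1 and the hub has degree 5, so a walker at a leaf accepts the move to the hub only one time in five on average. That is the kind of repeated self-loop at low-degree nodes that the abstract says hurts crawling efficiency.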
Keywords :
Markov processes; data analysis; probability; sampling methods; social networking (online); tree searching; Markov chain; Renren dataset; breadth-first-search method; crawling efficiency; crawling process; greedy method; Metropolis-Hastings random walk; online social networks; sampling methods; self-loop probability; statistical properties; unbiased sampling; Algorithm design and analysis; IEEE Communications Society; Markov processes; Peer to peer computing; Sampling methods; Social network services; Sufficient conditions
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communications (ICC), 2011 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1550-3607
Print_ISBN :
978-1-61284-232-5
Electronic_ISBN :
1550-3607
Type :
conf
DOI :
10.1109/icc.2011.5963203
Filename :
5963203