• DocumentCode
    3042736
  • Title

    Collecting Random Samples from Facebook: An Efficient Heuristic for Sampling Large and Undirected Graphs via a Metropolis-Hastings Random Walk

  • Author

    Pina-Garcia, C.A. ; Dongbing Gu

  • Author_Institution
    Sch. of Comput. Sci. & Electron. Eng., Univ. of Essex, Colchester, UK
  • fYear
    2013
  • fDate
    13-16 Oct. 2013
  • Firstpage
    2244
  • Lastpage
    2249
  • Abstract
    The lack of a sampling frame (i.e., a complete list of users) for most Online Social Networks (OSNs) makes sampling especially difficult. Thus, reliable and efficient sampling methods are essential for practical estimation of OSN properties. Recent work in this area has therefore focused on sampling methods that allow precise inference from relatively large-scale social networks such as Facebook. We propose a sampling method for OSNs based on a Metropolis-Hastings Random Walk (MHRW) algorithm. To this end, we have developed a social explorer that collects random samples from Facebook. In addition, we address the question of whether different probability distributions can alter the behavior of the MHRW and make it more effective at yielding a representative sample. Thus, in this paper, we seek to understand whether the MHRW algorithm can provide better results when its random generator is switched. We evaluate the performance of our MHRW algorithm by providing descriptive statistics of the collected data. Moreover, we sketch the collection procedure carried out on Facebook in real time. Finally, we provide a formal convergence analysis to evaluate whether the sample of draws has attained an equilibrium state, which gives a rough estimate of the sample quality.
  • Keywords
    graph theory; optimisation; social networking (online); statistical distributions; Facebook; MHRW algorithm; OSNs; descriptive statistics; formal convergence analysis; heuristic; Metropolis-Hastings random walk algorithm; online social networks; performance evaluation; probability distributions; random generator; random sample collection; social explorer; undirected graph sampling; Algorithm design and analysis; Convergence; Facebook; Gaussian distribution; Probability distribution; Standards; Facebook; Markov Chain Monte Carlo Methods; Probability Distributions; Random Walks; Social Networks;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on
  • Conference_Location
    Manchester
  • Type

    conf

  • DOI
    10.1109/SMC.2013.384
  • Filename
    6722137
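
The abstract names a Metropolis-Hastings Random Walk over an undirected graph with a switchable random generator. The record does not include the paper's implementation, so the following is only a minimal sketch of the textbook MHRW that targets a uniform distribution over nodes. The function name `mhrw`, the toy adjacency dictionary `graph`, and the `uniform` and `choose` parameters (standing in for the swappable random generator) are all assumptions made for illustration, not the authors' code.

```python
import random

# Hypothetical toy undirected graph as an adjacency dictionary.
# Every node has at least one neighbor and every edge is listed both ways.
graph = {
    0: [1, 2, 3, 4],
    1: [0, 2],
    2: [0, 1],
    3: [0],
    4: [0],
}


def mhrw(neighbors, start, steps, uniform=random.random, choose=random.choice):
    """Textbook Metropolis-Hastings random walk with a uniform target over nodes.

    `uniform` returns a draw used for the accept/reject test and `choose`
    picks the candidate neighbor; both are illustrative hooks for swapping
    the random generator.
    """
    current = start
    sample = []
    for _ in range(steps):
        # Propose a neighbor of the current node.
        candidate = choose(neighbors[current])
        # Accept with probability min(1, deg(current) / deg(candidate)),
        # which cancels the degree bias of a simple random walk.
        ratio = len(neighbors[current]) / len(neighbors[candidate])
        if uniform() <= ratio:
            current = candidate
        # A rejected proposal repeats the current node in the sample.
        sample.append(current)
    return sample


if __name__ == "__main__":
    rng = random.Random(2013)  # arbitrary seed, chosen only for repeatability
    walk = mhrw(graph, start=0, steps=10_000, uniform=rng.random, choose=rng.choice)
    for node in sorted(graph):
        print(node, walk.count(node) / len(walk))
```

Passing a different callable for `uniform` is one way to experiment with the effect of the random generator on the walk. Note that only a draw that is uniform on [0, 1) keeps the standard acceptance rule exact; any other distribution changes the acceptance probabilities and therefore the distribution the walk converges to.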