• DocumentCode
    2660851
  • Title
    On the bias of BFS (Breadth First Search)
  • Author
    Kurant, Maciej ; Markopoulou, Athina ; Thiran, Patrick
  • Author_Institution
    Sch. of Comput. & Comm. Sci., EPFL, Lausanne, Switzerland
  • fYear
    2010
  • fDate
    7-9 Sept. 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Breadth First Search (BFS) and other graph traversal techniques are widely used for measuring large unknown graphs, such as online social networks. It has been empirically observed that incomplete BFS is biased toward high-degree nodes. In contrast to better-studied sampling techniques, such as random walks, the bias of BFS has not been characterized to date. In this paper, we quantify the degree bias of BFS sampling. In particular, we calculate the node degree distribution expected to be observed by BFS as a function of the fraction of covered nodes, in a random graph RG(p_k) with a given (and arbitrary) degree distribution p_k. Furthermore, we show that, for RG(p_k), all commonly used graph traversal techniques (BFS, DFS, Forest Fire, and Snowball Sampling) lead to the same bias, and we show how to correct for this bias. To give a broader perspective, we compare this class of exploration techniques to random walks, which are well studied and easier to analyze. Next, we study by simulation the effect of graph properties not captured directly by our model. We find that the bias is amplified in graphs with strong positive assortativity. Finally, we demonstrate the above results by sampling the Facebook social network, and we provide some practical guidelines for graph sampling.
  • Keywords
    graph theory; signal sampling; social networking (online); tree searching; BFS sampling; Facebook social network; breadth first search; forest fire; graph traversal technique; node degree distribution; online social networks; random graph; random walks; snowball sampling; Facebook; Fires; Indexes; Mathematical model; Peer to peer computing; World Wide Web; BFS; Breadth First Search; OSN; Online Social Networks; bias; graph sampling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Teletraffic Congress (ITC), 2010 22nd International
  • Conference_Location
    Amsterdam
  • Print_ISBN
    978-1-4244-8837-7
  • Electronic_ISBN
    978-1-4244-8835-3
  • Type
    conf
  • DOI
    10.1109/ITC.2010.5608727
  • Filename
    5608727