• DocumentCode
    1925483
  • Title

    Differentiating Your Friends for Scaling Online Social Networks

  • Author

    Huang, Yewei ; Deng, Qianni ; Zhu, Yanmin

  • Author_Institution
    Shanghai Jiao Tong Univ., Shanghai, China
  • fYear
    2012
  • fDate
    24-28 Sept. 2012
  • Firstpage
    411
  • Lastpage
    419
  • Abstract
    Online social networks (OSNs) have become increasingly popular, attracting hundreds of millions of users, and they are usually deployed on cluster systems. A crucial problem for scaling online social networks is the allocation and replication of user data records across cluster nodes, with the objective of reducing access time and minimizing the costs of storage and intra-cluster communication. In these applications, users access not only their own data but also the data of their friends. It is preferable that all data required by a user be placed on the same node so that access time is reduced and communication overhead is small. The inherently complex social interactions between users, however, pose great challenges to the mechanism of data allocation and replication. Analyzing a large real dataset from OSNs, we observe that for over 90% of users, all their interactions are contributed by only 22.03% of their friends, a Pareto distribution property. Thus, the majority of the interactions of a user are attributed to a small subset of the user's friends. Inspired by this observation, we first build a dynamic weighted social graph which differentiates the importance of social interactions between a user and the user's friends. Using this graph, we design WEPAR, an online partitioning and replication algorithm that takes into account both read and write operations. WEPAR tries to place master data copies of users with frequent interactions on the same cluster node, and generates slave data copies for the users who tend to receive relatively more read requests from a cluster node. Extensive evaluations based on real datasets show that our approach significantly reduces storage cost and improves write response time, with read response time comparable to that of existing algorithms.
  • Keywords
    graph theory; pattern clustering; social networking (online); OSN; Pareto distribution property; WEPAR; cluster nodes; cluster systems; complex social interactions; data allocation; data replication; dynamic weighted social graph; intracluster communication; master data copies; online partitioning algorithm; online social network scaling; replication algorithm; user data records; Clustering algorithms; Heuristic algorithms; Partitioning algorithms; Redundancy; Resource management; Social network services; Time factors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2012 IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4673-2422-9
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2012.55
  • Filename
    6337804
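
The abstract describes WEPAR only at a high level, so the following is a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes a simple greedy placement rule, a fixed per-node capacity, and a fixed read threshold for creating slave copies; all function names, parameters, and the tie-breaking rule are hypothetical, and write costs are not modeled.

```python
from collections import defaultdict


def build_weighted_graph(interactions):
    """Sum interaction counts per undirected user pair.

    `interactions` is an iterable of (user, friend, count) tuples.
    (Hypothetical input format, not taken from the paper.)
    """
    weights = defaultdict(lambda: defaultdict(int))
    for src, dst, count in interactions:
        weights[src][dst] += count
        weights[dst][src] += count
    return weights


def place_masters(users, weights, num_nodes, capacity):
    """Greedy online placement (an assumed heuristic).

    Each arriving user is assigned to the non-full node that already holds
    the largest total interaction weight with that user's friends.
    Assumes num_nodes * capacity >= number of users.
    """
    master = {}
    load = [0] * num_nodes
    for user in users:
        affinity = [0] * num_nodes
        for friend, w in weights[user].items():
            if friend in master:
                affinity[master[friend]] += w
        candidates = [n for n in range(num_nodes) if load[n] < capacity]
        # Highest affinity first; ties go to the least-loaded, lowest-id node.
        best = max(candidates, key=lambda n: (affinity[n], -load[n], -n))
        master[user] = best
        load[best] += 1
    return master


def place_slaves(reads, master, threshold):
    """Create a slave copy of a user's data on a remote node when readers
    hosted on that node issue at least `threshold` reads of it.

    `reads` is an iterable of (reader, owner, count) tuples.
    """
    reads_from_node = defaultdict(int)
    for reader, owner, count in reads:
        node = master[reader]
        if node != master[owner]:
            reads_from_node[(owner, node)] += count
    slaves = defaultdict(set)
    for (owner, node), total in reads_from_node.items():
        if total >= threshold:
            slaves[owner].add(node)
    return slaves
```

In this sketch, heavily interacting users tend to share a node because affinity is driven by edge weight rather than by the mere existence of a friendship, which mirrors the abstract's point that a small subset of friends accounts for most interactions. Replicating only above a read threshold is one simple way to trade storage cost against remote reads.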