DocumentCode :
244962
Title :
Locally Estimating Core Numbers
Author :
O'Brien, Michael P. ; Sullivan, Blair D.
Author_Institution :
Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
fYear :
2014
fDate :
14-17 Dec. 2014
Firstpage :
460
Lastpage :
469
Abstract :
Graphs are a powerful way to model interactions and relationships in data from a wide variety of application domains. In this setting, entities represented by vertices at the 'center' of the graph are often more important than those associated with vertices on the 'fringes'. For example, central nodes tend to be more critical in the spread of information or disease and play an important role in clustering/community formation. Identifying such 'core' vertices has recently received additional attention in the context of network experiments, which analyze the response when a random subset of vertices is exposed to a treatment (e.g., inoculation, free product samples, etc.). Specifically, the likelihood of having many central vertices in any exposure subset can have a significant impact on the experiment. We focus on using k-cores and core numbers to measure the extent to which a vertex is central in a graph. Existing algorithms for computing the core number of a vertex require the entire graph as input, an unrealistic scenario in many real-world applications. Moreover, in the context of network experiments, the subgraph induced by the treated vertices is only known in a probabilistic sense. We introduce a new method for estimating the core number based only on the properties of the graph within a region of radius δ around the vertex, and prove an asymptotic error bound for our estimator on random graphs. Further, we empirically validate the accuracy of our estimator for small values of δ on a representative corpus of real data sets. Finally, we evaluate the impact of improved local estimation on an open problem in network experimentation posed by Ugander et al.
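Illustrative note (not part of the original record): the abstract does not spell out the estimator, so the following is only a minimal sketch of the underlying notions, not the authors' method. It computes exact core numbers by minimum-degree peeling, and then a simple local baseline: the core number of a vertex inside the subgraph induced by its radius-δ ball. That baseline is a lower bound on the true core number, because any k-core of a subgraph is also contained in the k-core of the full graph. The function names, the adjacency-set representation, and the toy graph are all assumptions made for this example.

```python
from collections import deque


def core_numbers(adj):
    """Exact core number of every vertex, by repeatedly peeling a minimum-degree vertex.

    adj maps each vertex to the set of its neighbours (undirected, no self-loops).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])          # the peeling level never decreases
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def ball(adj, source, delta):
    """Set of vertices within distance delta of source (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == delta:
            continue                   # do not expand beyond the radius
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)


def local_core_lower_bound(adj, v, delta):
    """Hypothetical baseline: core number of v in the subgraph induced by its delta-ball.

    This is NOT the estimator from the paper; it only uses the same kind of
    local information (the region of radius delta around v).
    """
    region = ball(adj, v, delta)
    induced = {u: adj[u] & region for u in region}
    return core_numbers(induced)[v]


# Toy graph (an assumption for illustration): a 4-clique on {0, 1, 2, 3}
# with a path 3 - 4 - 5 hanging off it.
adj = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}

exact = core_numbers(adj)
local = {v: local_core_lower_bound(adj, v, 1) for v in adj}
```

On this toy graph the radius-1 baseline already matches the exact core number for the clique vertices, while in general it can only underestimate; the quadratic-time `min` scan is chosen for brevity rather than efficiency.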
Keywords :
computational complexity; graph theory; network theory (graphs); set theory; application domains; asymptotic error bound; central nodes; central vertices; clustering formation; community formation; core vertices; data interactions; data relationships; empirical analysis; exposure subset; graph fringes; graph properties; graph vertices; k-cores; local core number estimation; network experiments; probability; radius region; random graphs; random vertex subset; real data sets; real world applications; response analysis; subgraphs; time complexity; unrealistic scenario; Accuracy; Communities; Computational complexity; Data models; Equations; Estimation; Upper bound; core numbers; graph algorithms; k-cores; network experiments;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
ISSN :
1550-4786
Print_ISBN :
978-1-4799-4303-6
Type :
conf
DOI :
10.1109/ICDM.2014.136
Filename :
7023363