Title :
Sharding for literature search via cutting citation graphs
Author_Institution :
Coll. of Comput. & Inf., Drexel Univ., Philadelphia, PA, USA
Abstract :
Distributed information retrieval will become a general practice for searching the exponentially growing scientific literature. Efficient and effective distributed literature search depends at its core on an adequate sharding policy. This paper proposes a novel sharding policy for literature search that is based on cutting the document citation and co-citation graphs. Experiments on the iSearch test collection reveal that the relevant documents for a given query are distributed over the shards generated through citation graph cutting in such a pattern that a few shards become optimal shards, which can then be leveraged in a selective search strategy, potentially leading to efficient and effective literature search solutions.
Keywords :
citation analysis; graph theory; query processing; scientific information systems; distributed information retrieval; distributed literature search; document co-citation graph cutting; iSearch test collection; optimal shards; scientific literature search; selective search strategy; sharding policy; Clustering algorithms; Conferences; Partitioning algorithms; Resource management; Search problems; Vectors; citation; graph partition; literature search; sharding
Conference_Title :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/BigData.2014.7004500