DocumentCode :
660751
Title :
Sociometric Methods for Relevancy Analysis of Long Tail Science Data
Author :
Rajasekar, Arjun ; Sankaran, S. ; Lander, Howard ; Carsey, Tom ; Crabtree, Jonathan ; Crosas, Merce ; King, Grant W. ; Kum, Hye-Chung ; Zhan, Junpeng
Author_Institution :
Univ. of North Carolina Chapel Hill, Chapel Hill, NC, USA
fYear :
2013
fDate :
8-14 Sept. 2013
Firstpage :
1
Lastpage :
6
Abstract :
As the push towards electronic storage, publication, curation, and discoverability of research data collected in multiple research domains has grown, so too have the massive numbers of small to medium datasets that are highly distributed and not easily discoverable - a region of data that is sometimes referred to as the long tail of science. The rapidly increasing, sheer volume of these long tail data presents one aspect of the Big Data problem: how does one more easily access, discover, use, and reuse long tail data to lead to new multidisciplinary collaborative research and scientific advancement? In this paper, we describe Data Bridge, a new e-science collaboration environment that will realize the potential of long tail data by implementing algorithms and tools to more easily enable data discoverability and reuse. Data Bridge will define different types of semantic bridges that link diverse datasets by applying a set of sociometric network analysis (SNA) and relevance algorithms. We will measure relevancy by examining different ways datasets can be related to each other: data to data, user to data, and method to data connections. Through analysis of metadata and ontology, by pattern analysis and feature extraction, through usage tools and models, and via human connections, Data Bridge will create an environment for long tail data that is greater than the sum of its parts. In the project's initial phase, we will test and validate the new tools with real-world data contained in the Dataverse Network, the largest social science data repository. In this short paper, we discuss the background and vision for the Data Bridge project, and present an introduction to the proposed SNA algorithms and analytical tools that are relevant for discoverability of long tail science data.
Keywords :
feature extraction; groupware; indexing; information retrieval; meta data; ontologies (artificial intelligence); research and development; storage management; Data Bridge project; SNA; big data problem; data curation; Dataverse network; discoverability; e-science collaboration environment; electronic storage; feature extraction; long tail science data; meta data; multidisciplinary collaborative research; ontology; pattern analysis; publication; relevancy analysis; research data; scientific advancement; sociometric network analysis; Algorithm design and analysis; Collaboration; Communities; Data models; Distributed databases; Educational institutions; Electrical resistance measurement; Long tail data; data discoverability; sociometric network analysis
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Social Computing (SocialCom), 2013 International Conference on
Conference_Location :
Alexandria, VA
Type :
conf
DOI :
10.1109/SocialCom.2013.6
Filename :
6693303