DocumentCode :
1791669
Title :
Managing the academic data lifecycle: A case study of HPCC
Author :
Payne, Michael E. ; Ngo, Linh B. ; Villanustre, Flavio ; Apon, Amy W.
Author_Institution :
Sch. of Comput., Clemson Univ., Clemson, SC, USA
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
22
Lastpage :
30
Abstract :
Academic data can be classified into multiple categories and come from a large number of sources. Many research areas require combining data from different sources into a unified set on which analytical techniques can be applied. In this research paper the authors introduce the High Performance Computing Cluster (HPCC) as a platform to streamline the process of ingesting, curating, integrating and transforming scholarly data from multiple sources and in varying formats, particularly when several of these datasets lack common attributes to support the integration process.
Keywords :
Big Data; data integration; parallel processing; HPCC; academic data lifecycle; high performance computing cluster; scholarly data integration; Awards activities; Educational institutions; Layout; Manuals; Science - general; US Government; XML; academic research; scalable platform
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004348
Filename :
7004348