Title :
Managing the academic data lifecycle: A case study of HPCC
Author :
Payne, Michael E. ; Ngo, Linh B. ; Villanustre, Flavio ; Apon, Amy W.
Author_Institution :
Sch. of Comput., Clemson Univ., Clemson, SC, USA
Abstract :
Academic data can be classified into multiple categories and comes from a large number of sources. Many research areas require combining data from different sources into a unified set to which analytical techniques can be applied. In this paper, the authors introduce the High Performance Computing Cluster (HPCC) as a platform to streamline the process of ingesting, curating, integrating, and transforming scholarly data from multiple sources and in varying formats, particularly when several of these datasets lack common attributes to support the integration process.
Keywords :
Big Data; data integration; parallel processing; HPCC; academic data lifecycle; high performance computing cluster; scholarly data integration; Awards activities; Educational institutions; Layout; Manuals; Science - general; US Government; XML; academic research; scalable platform
Conference_Title :
2014 IEEE International Conference on Big Data (Big Data)
Conference_Location :
Washington, DC
DOI :
10.1109/BigData.2014.7004348