DocumentCode :
168276
Title :
Towards building a scholarly big data platform: Challenges, lessons and opportunities
Author :
Zhaohui Wu ; Jian Wu ; Khabsa, Madian ; Williams, Kresimir ; Hung-Hsuan Chen ; Wenyi Huang ; Tuarob, Suppawong ; Choudhury, Sagnik Ray ; Ororbia, Alexander ; Mitra, Pinaki ; Giles, C. Lee
Author_Institution :
Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
fYear :
2014
fDate :
8-12 Sept. 2014
Firstpage :
117
Lastpage :
126
Abstract :
We introduce a Big Data platform that provides various services for harvesting scholarly information and enabling efficient scholarly applications. The platform is built on a secured private cloud. It crawls data with a scholarly focused crawler that leverages a dynamic scheduler, processes the data through a MapReduce-based crawl-extraction-ingestion (CEI) workflow, and stores the results in distributed repositories and databases. Services such as scholarly data harvesting, information extraction, and user information and log data analytics are integrated into the platform and exposed through OAI and RESTful APIs. We also introduce a set of scholarly applications built on top of this platform, including citation recommendation and collaborator discovery.
Keywords :
Big Data; application program interfaces; citation analysis; cloud computing; data privacy; distributed databases; parallel programming; recommender systems; Big Data platform; CEI workflow; MapReduce-based crawl-extraction-ingestion workflow; OAI; RESTful API; citation recommendation; collaborator discovery; data crawling; data storage; distributed repositories; dynamic scheduler; information extraction; log data analytics; scholarly applications; scholarly focused crawler; scholarly information harvesting; secured private cloud; user information; Books; Crawlers; Data mining; Databases; Servers; Scholarly Big Data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on
Conference_Location :
London
Type :
conf
DOI :
10.1109/JCDL.2014.6970157
Filename :
6970157