Title :
Cloud Computing: A Digital Libraries Perspective
Author :
Teregowda, Pradeep ; Urgaonkar, Bhuvan ; Giles, C. Lee
Author_Institution :
Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
Abstract :
Provisioning and maintenance of infrastructure for Web-based digital library search engines such as CiteSeerx present several challenges. CiteSeerx provides autonomous citation indexing, full-text indexing, and extensive document metadata from documents crawled from the web across computer and information sciences and related fields. Infrastructure virtualization and cloud computing are particularly attractive choices for CiteSeerx, which is challenged by growth in the size of the indexed document collection, new features, and, most prominently, usage. In this paper, we discuss constraints and choices faced by information retrieval systems like CiteSeerx by exploring in detail aspects of placing CiteSeerx into current cloud infrastructure offerings. We also implement an ad-hoc virtualized storage system for experimenting with adoption of cloud infrastructure services. Our results show that a cloud implementation of CiteSeerx may be a feasible alternative for its continued operation and growth.
Keywords :
Internet; citation analysis; digital libraries; indexing; information retrieval; meta data; search engines; virtual storage; CiteSeer; Cloud computing; Web based digital library search engines; ad-hoc virtualized storage system; autonomous citation indexing; cloud infrastructure services; extensive document metadata; full text indexing; information retrieval systems; infrastructure virtualization; Clouds; Crawlers; Engines; Indexes; Maintenance engineering; Economics; SeerSuite; Virtualization
Conference_Title :
Cloud Computing (CLOUD), 2010 IEEE 3rd International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-8207-8
Electronic_ISBN :
978-0-7695-4130-3
DOI :
10.1109/CLOUD.2010.49