Title :
Advanced Information Systems for Archival Appraisals of Contemporary Documents
Author :
McFadden, William ; McHenry, Kenton ; Kooper, Rob ; Ondrejcek, Michal ; Yahja, Alex ; Bajcsy, Peter
Author_Institution :
Nat. Center for Super Comput. Applic., Univ. of Illinois at Urbana-Champaign, IL, USA
Abstract :
This work addresses the problem of designing a scalable framework for archival appraisals of contemporary PDF documents. The motivation for our work is to provide an e-science solution that (a) fuses the independent research methodologies focusing on specific information types into one comprehensive analytical framework, (b) optimizes tradeoffs between computational requirements and preservation costs, and (c) bridges small-scale and large-scale computational studies. The e-science solution presented here consists of (1) a methodology for comprehensive comparisons of contemporary documents containing text, images and vector graphics, (2) a framework for including 3D and 3D+time data sets in the appraisal analyses, (3) interfaces supporting exploratory archival appraisal analyses with small-scale data sets, and (4) infrastructure supporting the transition from small-scale to large-scale computations using commodity and high-performance computing resources. The novelty of our work is in designing methodologies, mathematical frameworks and prototypes for comprehensive and scalable document appraisals that include text, images, vector graphics, and high-dimensional data.
Keywords :
document handling; information retrieval; scientific information systems; archival appraisal; computational requirement; contemporary PDF document; document appraisal; e-science; high dimensional data; image data; information system; preservation cost; research methodology; small scale data set; text data; vector graphics; Appraisal; Fuses; Graphics; High performance computing; Image analysis; Information analysis; Information systems; Large-scale systems; Optimization methods; Performance analysis
Conference_Title :
eScience, 2008. eScience '08. IEEE Fourth International Conference on
Conference_Location :
Indianapolis, IN
Print_ISBN :
978-1-4244-3380-3
Electronic_ISBN :
978-0-7695-3535-7
DOI :
10.1109/eScience.2008.140