DocumentCode
2008386
Title
Text, Image and Vector Graphics Based Appraisal of Contemporary Documents
Author
Lee, Sang-Chul ; McFadden, William ; Bajcsy, Peter
Author_Institution
Dept. of Comput. & Inf. Eng., Inha Univ., Incheon, South Korea
fYear
2008
fDate
11-13 Dec. 2008
Firstpage
729
Lastpage
734
Abstract
We have designed a framework for content-based appraisal of documents. Our motivation is to provide computer-assisted support for answering several appraisal criteria according to the general appraisal guidelines in the National Archives and Records Administration (NARA) 1441 directive. The appraisal criteria led us to investigate (a) finding groups of PDF documents with similar content, (b) ranking documents according to their creation/modification time and digital volume, and (c) detecting inconsistency between ranking and content within a group of related documents. The novelty of our work lies in designing a methodology and a mathematical framework for document appraisals, and in prototyping the framework on the text, image and vector graphics components of PDF documents. We present example results of grouping, ranking and integrity verification for groups of scientific documents about medical topics.
Keywords
document handling; formal verification; National Archives and Records Administration; PDF documents; contemporary documents; content based appraisal; document appraisals; image graphics; integrity verification; text graphics; vector graphics; Algorithm design and analysis; Appraisal; Biomedical imaging; Design methodology; Focusing; Frequency; Graphics; Image analysis; Image color analysis; Prototypes; PDF; appraisal; document analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location
San Diego, CA
Print_ISBN
978-0-7695-3495-4
Type
conf
DOI
10.1109/ICMLA.2008.39
Filename
4725056