DocumentCode :
1632177
Title :
Image Classification to Improve Printing Quality of Mixed-Type Documents
Author :
Lins, Rafael Dueire ; Silva, Gabriel Pereira e ; Simske, Steven J. ; Fan, Jian ; Shaw, Mark ; Sa, Pankaj ; Thielo, Marcelo
Author_Institution :
UFPE, Recife, Brazil
fYear :
2009
Firstpage :
1106
Lastpage :
1110
Abstract :
Functional image classification is the assignment of different image types to separate classes to optimize their rendering for reading or another specific end task, and is an important area of research in the publishing and multi-average industries. This paper presents recent research on optimizing the simultaneous classification of documents, photos and logos. Each of these is handled during printing with a class-specific pipeline of image transformation algorithms, and misclassification results in pejorative imaging effects. This paper reports on replacing an existing classifier with a Weka-based classifier that simultaneously improves accuracy (from 85.3% to 90.8%) and performance (from 1458 msec/image to 418 msec/image). Generic subsampling of the images further improved the performance (to 199 msec/image) with only a modest impact on accuracy (to 90.4%). Finally, a staggered subsampling approach improved both accuracy (to 96.4%) and performance (to 147 msec/image) for the Weka-based classifier. This approach did not appreciably benefit the HP classifier (85.4% accuracy, 497 msec/image). These data indicate that staggered subsampling using the optimized Weka classifier substantially improves classification accuracy and performance without resulting in additional "egregious" misclassifications (assigning photos or logos to the "document" class).
Keywords :
document image processing; image classification; image sampling; rendering (computer graphics); Weka-based classifier; class-specific pipeline; image classification; image sampling; image transformation algorithm; mixed-type document; pejorative imaging effect; printing quality; task rendering; Image analysis; Image classification; Image databases; Image retrieval; Information retrieval; Pipelines; Printing; Rendering (computer graphics); Spatial databases; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
ISSN :
1520-5363
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2009.167
Filename :
5277475