DocumentCode :
3487748
Title :
The Significance of Reading Order in Document Recognition and Its Evaluation
Author :
Clausner, C. ; Pletschacher, S. ; Antonacopoulos, A.
Author_Institution :
Pattern Recognition & Image Anal. (PRImA) Res. Lab., Univ. of Salford, Salford, UK
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
688
Lastpage :
692
Abstract :
Reading order detection and representation is an important task in many digitisation scenarios involving the preservation of the logical structure of a document. The corresponding need for the evaluation of reading order results generated by layout analysis methods poses a particular challenge due to potential deviations between the ground truth and the actually detected segmentation of the page. To this end, a novel evaluation approach is proposed that responds to this problem by incorporating region correspondence analysis. Furthermore, a sophisticated reading order representation scheme is presented and used by the system, allowing the grouping of objects with ordered and/or unordered relations. This is a typical requirement for documents with complex layouts such as magazines and newspapers. The evaluation method has been validated using the results of two state-of-the-art OCR / layout analysis systems and a basic top-to-bottom reading order detection algorithm applied to representative samples from the PRImA contemporary and the IMPACT historical document datasets.
Keywords :
document image processing; image representation; image segmentation; optical character recognition; IMPACT historical document datasets; OCR; PRImA contemporary; basic top-to-bottom reading order detection algorithm; document recognition; document segmentation; layout analysis methods; logical structure; novel evaluation approach; reading order representation scheme; region correspondence analysis; Engines; Layout; Optical character recognition software; Performance evaluation; Text analysis; Text recognition; document layout analysis; document structure; performance evaluation; reading order detection; reading order evaluation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
ISSN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2013.141
Filename :
6628706