DocumentCode :
2143462
Title :
A Mixed Approach for Handwritten Documents Structural Analysis
Author :
Malleron, Vincent ; Eglin, Véronique
Author_Institution :
LIRIS, Univ. de Lyon, Lyon, France
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
269
Lastpage :
273
Abstract :
In this paper we propose a new method for document page segmentation. Initially dedicated to handwritten documents, our method is designed to extract the different text zones, paragraphs and fragments in unconstrained documents. The proposed approach is a mixed one, combining the advantages of top-down and bottom-up approaches. We present an evaluation of our method on a database of 183 documents taken from a 19th-century handwritten corpus: the "dossiers de Bouvard et Pécuchet" by Flaubert. With this evaluation we demonstrate that combining the top-down and bottom-up approaches improves the results obtained.
Keywords :
document image processing; handwritten character recognition; image segmentation; text analysis; visual databases; bottom-up approach; document database; document page segmentation; handwritten corpus; handwritten document structural analysis; text zones; top-down approach; unconstrained documents; Algorithm design and analysis; Image segmentation; Layout; Text analysis; Transforms; White spaces; handwritten; layout segmentation; logical structure; physical structure
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.62
Filename :
6065317