DocumentCode
3488915
Title
Text Line Extraction Using DMLP Classifiers for Historical Manuscripts
Author
Baechler, Micheal ; Liwicki, Marcus ; Ingold, Rolf
Author_Institution
Dept. of Inf., Univ. of Fribourg, Fribourg, Switzerland
fYear
2013
fDate
25-28 Aug. 2013
Firstpage
1029
Lastpage
1033
Abstract
This paper proposes a novel text line extraction method for historical documents. The method works in two steps. In the first step, layout analysis is performed to recognize the physical structure of a given document using a classification technique: the pixels of a coloured document image are classified into five classes, namely text-block, core-text-line, decoration, background, and periphery. This layout recognition is achieved by a cascade of two Dynamic Multilayer Perceptron (DMLP) classifiers and works without binarisation. In the second step, an algorithm takes the layout recognition results as input, extracts the text lines, and groups them into blocks using the connected components approach. Finally, the algorithm refines the boundaries of the text lines using the binary image and the layout recognition results. The system is evaluated on three historical manuscripts with a test set of 49 pages. The best hit rate obtained for text lines is 96.3%.
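The second step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the layout recognition output is available as a 2D list of integer class ids, it uses a hypothetical encoding of the five classes, and it groups core-text-line pixels with plain 4-connected component labelling. The block grouping and boundary refinement mentioned in the abstract are not covered.

```python
from collections import deque

# Hypothetical class ids: the abstract names the five classes but not their encoding.
TEXT_BLOCK, CORE_TEXT_LINE, DECORATION, BACKGROUND, PERIPHERY = range(5)


def extract_line_boxes(label_map, target=CORE_TEXT_LINE):
    """Return bounding boxes (top, left, bottom, right) of the 4-connected
    components of `target` pixels, sorted from top to bottom."""
    rows = len(label_map)
    cols = len(label_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if label_map[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and label_map[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes)
```

Each returned box corresponds to one candidate text line. On real classifier output, fragmented or touching components would likely need merging or splitting before the boxes are usable as lines.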
Keywords
document image processing; feature extraction; history; image classification; image colour analysis; multilayer perceptrons; DMLP classifiers; background class; binary image; coloured document image classification; connected components approach; core-text-line class; decoration class; dynamic multilayer perceptron classifiers; historical documents; historical manuscripts; layout analysis; layout recognition; periphery class; text line extraction method; text-block class; Feature extraction; Image resolution; Image segmentation; Layout; Neurons; Text analysis; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location
Washington, DC
ISSN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2013.206
Filename
6628771
Link To Document