DocumentCode :
591983
Title :
Evolution Maps for Connected Components in Text Documents
Author :
Biller, Ofer ; Kedem, Klara ; Dinstein, Itshak ; El-Sana, Jihad
Author_Institution :
Ben-Gurion Univ., Beer-Sheva, Israel
fYear :
2012
fDate :
18-20 Sept. 2012
Firstpage :
405
Lastpage :
410
Abstract :
For highly degraded text documents, common tasks such as binarization and line extraction remain difficult. Equipped with reliable information about the distribution of character dimensions in the document, one can improve the results of these algorithms significantly. We introduce a novel perspective on the image data that maps the evolution of connected components as the gray-scale threshold changes. We use these maps to provide a robust algorithm for extracting information about character dimensions in degraded documents, and demonstrate improved binarization results using this information. We statistically analyze the characteristics of the evolution maps for text documents and compare our results with ground truth data.
Keywords :
character recognition; document image processing; text analysis; binarization; character dimensions; connected components; evolution maps; gray scale threshold; ground truth data; image data; information extraction; line extraction; reliable information; robust algorithm; text document; Degradation; Educational institutions; Estimation; Histograms; Noise; Robustness; connected components analysis; degraded documents
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
Type :
conf
DOI :
10.1109/ICFHR.2012.201
Filename :
6424427