DocumentCode :
3488200
Title :
An Efficient Ground Truthing Tool for Binarization of Historical Manuscripts
Author :
Nafchi, H.Z. ; Ayatollahi, S.M. ; Moghaddam, Reza Farrahi ; Cheriet, Mohamed
Author_Institution :
Synchromedia Lab. for Multimedia Commun. in Telepresence, Ecole de Technol. Super., Montreal, QC, Canada
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
807
Lastpage :
811
Abstract :
To facilitate benchmark contributions for binarization methods, a new fast ground truthing approach, called PhaseGT, is proposed. This approach is used to build the first ground-truthed Persian Heritage Image Binarization Dataset (PHIBD 2012). PhaseGT is a semiautomatic approach to ground truthing images in any language, and it is designed especially for historical document images. Its main goal is to accelerate the ground truthing process and reduce the manual effort involved. It uses phase congruency features to preprocess the input image and to provide a more accurate initial binarization to the human expert who performs the manual part. This preprocessing is in turn based on a priori knowledge provided by the human user. The PHIBD 2012 dataset contains 15 historical document images with their corresponding ground truth binary images. The historical images in the dataset suffer from various types of degradation. The dataset has also been divided into training and testing subsets for binarization methods that use learning approaches.
Keywords :
document image processing; feature extraction; history; learning (artificial intelligence); PHIBD 2012 dataset; Persian heritage image binarization dataset; PhaseGT fast ground truthing approach; binarization method; document image degradation type; historical document image; historical manuscript binarization; image ground truthing tool; learning approach; phase congruency features; Data preprocessing; Degradation; Image edge detection; Ink; Manuals; Noise; Training; Binarization; Groundtruthing; Persian heritage dataset;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
ISSN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2013.165
Filename :
6628730