Author :
Nafchi, H.Z. ; Moghaddam, Reza Farrahi ; Cheriet, Mohamed
Author_Institution :
Synchromedia Laboratory for Multimedia Communication in Telepresence, École de technologie supérieure, Montreal, QC, Canada
Abstract :
In this paper, a phase-based binarization model for ancient document images is proposed, as well as a postprocessing method that can improve any binarization method and a ground truth generation tool. Three feature maps derived from the phase information of an input document image constitute the core of this binarization model. These features are the maximum moment of phase congruency covariance, a locally weighted mean phase angle, and a phase-preserved denoised image. The proposed model consists of three standard steps: 1) preprocessing; 2) main binarization; and 3) postprocessing. In the preprocessing and main binarization steps, the features used are mainly phase derived, while in the postprocessing step, specialized adaptive Gaussian and median filters are considered. One of the outputs of the binarization step, which shows high recall performance, is used in a proposed postprocessing method to improve the performance of other binarization methodologies. Finally, we develop a ground truth generation tool, called PhaseGT, to simplify and speed up the ground truth generation process for ancient document images. The comprehensive experimental results on the DIBCO'09, H-DIBCO'10, DIBCO'11, H-DIBCO'12, DIBCO'13, PHIBD'12, and BICKLEY DIARY data sets show the robustness of the proposed binarization method on various types of degradation and document images.
Keywords :
Gaussian processes; adaptive filters; document image processing; history; median filters; adaptive Gaussian median filters; adaptive median filters; ancient document images; ground truth generation tool; input document image; mean phase angle; phase based binarization model; phase congruency covariance; phase information; phase preserved image denoising; postprocessing method; Degradation; Feature extraction; Image edge detection; Mathematical model; Noise; Robustness; Standards; Historical document binarization; document enhancement; ground truthing; phase-derived features