Title of article :
A learning framework for the optimization and automation of document binarization methods
Author/Authors :
Cheriet, Mohamed and Farrahi Moghaddam, Reza and Hedjam, Rachid
Issue Information :
Journal issue with serial number, year 2013
Abstract :
Almost all binarization methods have a few parameters that require setting. However, they do not usually achieve their upper-bound performance unless the parameters are individually set and optimized for each input document image. In this work, a learning framework for the optimization of binarization methods is introduced, which is designed to determine the optimal parameter values for a given document image. The framework, which works with any binarization method, has a standard structure and performs three main steps: (i) feature extraction, (ii) estimation of the optimal parameters, and (iii) learning of the relationship between the features and the optimal parameters. First, an approach is proposed to generate numerical feature vectors from 2D data. The statistics of various maps are extracted and then combined, in a nonlinear way, into a final feature vector. The optimal behavior is learned using support vector regression (SVR). Although the framework works with any binarization method, two methods are considered as typical examples in this work: the grid-based Sauvola method, and Lu’s method, which placed first in the DIBCO’09 contest. The experiments are performed on the DIBCO’09 and H-DIBCO’10 datasets, and on combinations of these datasets, with promising results.
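As an illustration of step (ii), the following is a minimal sketch of how an optimal parameter value might be estimated for one image. It is not the authors' implementation. It assumes the classic Sauvola threshold T = m(1 + k(s/R - 1)) rather than the grid-based variant, assumes a ground-truth image is available, and assumes the F-measure as the quality criterion. The function names, the window size, and the candidate list for k are hypothetical.

```python
from statistics import mean, pstdev


def sauvola_threshold(window, k, R=128.0):
    """Classic Sauvola threshold for one window of gray values (0-255)."""
    m = mean(window)
    s = pstdev(window)
    return m * (1.0 + k * (s / R - 1.0))


def binarize(image, k, half=1, R=128.0):
    """Mark a pixel as text (1) when it is at or below its local threshold."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Square window around (x, y), clipped at the image border.
            window = [image[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            if image[y][x] <= sauvola_threshold(window, k, R):
                out[y][x] = 1
    return out


def f_measure(pred, truth):
    """Harmonic mean of precision and recall over text pixels."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def optimal_parameter(image, truth, candidates):
    """Exhaustive search (an assumption): keep the k with the best F-measure."""
    best_k, best_score = None, -1.0
    for k in candidates:
        score = f_measure(binarize(image, k), truth)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


if __name__ == "__main__":
    # Toy 4x4 image: a dark 2x2 block of text on a bright background.
    image = [[200, 200, 200, 200],
             [200, 40, 40, 200],
             [200, 40, 40, 200],
             [200, 200, 200, 200]]
    truth = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
    print(optimal_parameter(image, truth, [0.1, 0.2, 0.3, 0.4, 0.5]))
```

In the framework described above, per-image optimal values obtained in some such way would serve as regression targets, and step (iii) would fit an SVR that maps the feature vector of a new image to its parameter value without needing ground truth.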
Keywords :
Document image processing , Binarization , Parametric methods , Learning machines , Multi-level maps
Journal title :
Computer Vision and Image Understanding