Title :
A precise skew estimation algorithm for document images using KNN clustering and Fourier transform
Author :
Fabrizio, Jonathan
Author_Institution :
LRDE-EPITA, Le Kremlin-Bicêtre, France
Abstract :
In this article, we propose a simple and precise skew estimation algorithm for binarized document images. The estimation is performed in the frequency domain. To obtain a precise result, the Fourier transform is not applied to the document itself but to a preprocessed version of it: all regions of the document are clustered using KNN, and the contours of the grouped regions are smoothed using their convex hulls to form more regular shapes with a better-defined orientation. No assumption is made about the nature or content of the document. The method has been shown to be very accurate and was ranked first in the DISEC'13 contest, held during the ICDAR competitions.
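The frequency-domain idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the KNN clustering and convex-hull preprocessing are not reproduced, and the function names (`make_striped_page`, `dft2_magnitude`, `estimate_skew_degrees`) are hypothetical. It assumes a toy binary image of parallel stripes standing in for text lines, with the stripe frequency chosen to fall exactly on a DFT bin so that the spectral peak is unambiguous. The skew is read from the direction of the strongest non-DC peak, which lies perpendicular to the stripes.

```python
import cmath
import math

def make_striped_page(n, u0, v0):
    """Toy n x n binary image: stripes whose normal points along (u0, v0)."""
    return [[1 if (u0 * x + v0 * y) % n < n // 2 else 0 for x in range(n)]
            for y in range(n)]

def dft2_magnitude(img):
    """Naive separable 2-D DFT magnitude of a square image; returns mag[v][u]."""
    n = len(img)
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]  # twiddle factors

    def dft1(vec):
        return [sum(vec[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]

    rows = [dft1(row) for row in img]                                # along x
    cols = [dft1([rows[y][u] for y in range(n)]) for u in range(n)]  # along y
    return [[abs(cols[u][v]) for u in range(n)] for v in range(n)]

def estimate_skew_degrees(img):
    """Angle of the stripes in image coordinates (x right, y down), in degrees."""
    n = len(img)
    mag = dft2_magnitude(img)
    mag[0][0] = 0.0  # ignore the DC term, which only encodes mean brightness
    v_pk, u_pk = max(((v, u) for v in range(n) for u in range(n)),
                     key=lambda p: mag[p[0]][p[1]])
    # Map bin indices to signed frequencies.
    u = u_pk - n if u_pk >= n // 2 else u_pk
    v = v_pk - n if v_pk >= n // 2 else v_pk
    # The two mirrored peaks describe the same orientation; keep one of them.
    if v < 0 or (v == 0 and u > 0):
        u, v = -u, -v
    # Stripes run perpendicular to the peak direction (u, v).
    return math.degrees(math.atan2(-u, v))

if __name__ == "__main__":
    page = make_striped_page(64, -2, 15)
    print(round(estimate_skew_degrees(page), 3))
```

On a real page the peak rarely falls on an exact bin and the spectrum is cluttered by character shapes, which is presumably why the abstract applies the transform to regularized shapes rather than to the raw document. A practical version would also use an FFT library instead of this naive transform.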
Keywords :
Fourier transforms; document image processing; estimation theory; frequency-domain analysis; pattern clustering; DISEC'13 contest; Fourier transform; ICDAR competition; KNN clustering; binarized document imaging; convex hull; frequency domain estimation algorithm; skew estimation algorithm; Clustering algorithms; Estimation; Robustness; Text analysis; KNN; Skew estimation;
Conference_Title :
Image Processing (ICIP), 2014 IEEE International Conference on
Conference_Location :
Paris
DOI :
10.1109/ICIP.2014.7025523