Title :
Fast text line extraction in document images
Author :
Seong Jong Ha ; Bora Jin ; Nam Ik Cho
Author_Institution :
Dept. of EECS, Seoul Nat. Univ., Seoul, South Korea
Date :
Sept. 30 2012-Oct. 3 2012
Abstract :
This paper proposes an algorithm for fast text line extraction in document images. Instead of binarizing the image or applying multi-oriented Gaussian blurring, as conventional methods do, we compute an integral image and design filters suited to detecting text regions on it. After filtering, the center points of the text regions are found by cascaded text region verification followed by non-maximum suppression. Finally, text lines are extracted by grouping the points that lie on the same line. The proposed method is tested on document images taken in various environments and is shown to be faster than conventional methods while achieving comparable performance.
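The abstract names two building blocks, the integral image and non-maximum suppression, without giving the filter designs. The following is a minimal Python sketch of those two generic techniques only. It is not the authors' filters or their verification cascade. The function names (integral_image, box_sum, non_max_suppression_1d) and the radius parameter are hypothetical and chosen for illustration.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column.

    ii[y][x] holds the sum of img[0:y][0:x], so any rectangle sum
    can later be read with four lookups.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def non_max_suppression_1d(scores, radius):
    """Keep indices whose positive score is the maximum within +/- radius.

    Assumption: a simple 1-D variant for illustration; tied maxima are
    all kept.
    """
    kept = []
    for i, s in enumerate(scores):
        window = scores[max(0, i - radius): i + radius + 1]
        if s > 0 and s == max(window):
            kept.append(i)
    return kept


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 3)          # whole image
lower_right = box_sum(ii, 1, 1, 3, 3)    # 5 + 6 + 8 + 9
peaks = non_max_suppression_1d([0, 3, 1, 0, 5, 4, 0], radius=1)
```

Because every rectangle sum costs four table lookups regardless of its size, box-shaped filters of any scale run in constant time per pixel, which is the usual reason an integral image is preferred over repeated Gaussian blurring when speed matters.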
Keywords :
Gaussian processes; document image processing; feature extraction; filtering theory; image segmentation; text detection; center points; design filters; document images; fast text line extraction; image binarization; integral image; multioriented Gaussian blurring; nonmaximum suppression; points grouping; text region detection; text region verification; Algorithm design and analysis; Complexity theory; Computer vision; Feature extraction; Histograms; Joining processes; Optical character recognition software; text line extraction;
Conference_Title :
Image Processing (ICIP), 2012 19th IEEE International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-4673-2534-9
Electronic_ISBN :
1522-4880
DOI :
10.1109/ICIP.2012.6466980