DocumentCode
2264923
Title
A text detection, localization and segmentation system for OCR in images
Author
Gllavata, Julinda ; Ewerth, Ralph ; Freisleben, Bernd
Author_Institution
Siegen Univ., Germany
fYear
2004
fDate
13-15 Dec. 2004
Firstpage
310
Lastpage
317
Abstract
One way to incorporate semantic knowledge into the process of indexing databases of digital images is to use caption text, since it provides important information about the image content and is well suited to keyword-based queries. In this paper, we propose an approach to automatically localize, segment and binarize text appearing in complex images. First, an unsupervised method based on a wavelet transform is used to efficiently detect text regions. Second, connected components are generated, and the exact text positions are found via a refinement algorithm. Third, an unsupervised learning method for text segmentation and binarization is applied using a color quantizer and a wavelet transform. Comparative experimental results demonstrate the performance of our approach for the main processing steps: text localization and segmentation, and in particular their combination.
Keywords
content-based retrieval; database indexing; image segmentation; image texture; multimedia databases; optical character recognition; text analysis; unsupervised learning; visual databases; wavelet transforms; OCR image; binarization; color quantizer; digital image; image content; indexing database; refinement algorithm; semantic knowledge; text detection; text localization; text segmentation; unsupervised method; wavelet transform; Image recognition; Image retrieval; Image segmentation; Image texture analysis; Indexing; Layout; Optical character recognition software; Text recognition; Unsupervised learning; Wavelet transforms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Software Engineering, 2004. Proceedings. IEEE Sixth International Symposium on
Print_ISBN
0-7695-2217-3
Type
conf
DOI
10.1109/MMSE.2004.18
Filename
1376677