Title :
Text and non-text region identification using texture and connected components
Author :
Vidyarthi, Ankit ; Mittal, Natasha ; Kansal, Apoorv
Author_Institution :
Malaviya Nat. Inst. of Technol., Jaipur, India
Abstract :
Finding the text area in a document image, i.e. an image in which text is embedded with graphics, is a challenging task. In the past few years, researchers have worked on extracting text from document images with complex colored backgrounds, but existing approaches extract the text at the cost of losing the graphics present in the original image. Detecting text and non-text regions is difficult because text regions in a document image have lower pixel intensity than the graphics. In this paper, a new texture-based method is proposed for extracting text and non-text areas from the document image without losing the graphics, using binarization and nearly connected components.
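The abstract names binarization, morphological closing, and nearly connected components but gives no procedure, so the following is only a minimal sketch of how such steps are commonly chained, not the authors' method. It assumes a grayscale image stored as a list of rows, a fixed global threshold, a 3x3 structuring element, and 8-connectivity; "nearly connected" is approximated here by closing small gaps before ordinary component labeling. All function names are hypothetical.

```python
from collections import deque


def binarize(gray, threshold):
    """Mark pixels darker than `threshold` as object (1), the rest as background (0)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def _morph(img, dilate):
    """3x3 dilation (dilate=True) or erosion (dilate=False).

    Assumption: windows are clipped at the image border rather than padded.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(any(window)) if dilate else int(all(window))
    return out


def closing(img):
    """Morphological closing: dilation followed by erosion; bridges one-pixel gaps."""
    return _morph(_morph(img, True), False)


def connected_components(img):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected object regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                # Visit the 8 neighbours (and the pixel itself, already seen).
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

In this sketch, two dark strokes separated by a single background pixel are labeled as two components on the raw binary image and as one component after closing. Separating text components from graphics would need an additional texture measure such as local variance, which is not shown here.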
Keywords :
document image processing; feature extraction; image colour analysis; image texture; binarization; colored background images; document image; graphics pixel intensity; nearly connected component; nontext region identification; text extraction; text region identification; texture components; Abstracts; Biomedical imaging; Histograms; Image color analysis; Binarization; Image Variance; Morphological closing; Nearly Connected Component; Object Pixel;
Conference_Title :
Signal Propagation and Computer Technology (ICSPCT), 2014 International Conference on
Conference_Location :
Ajmer
Print_ISBN :
978-1-4799-3139-2
DOI :
10.1109/ICSPCT.2014.6884904