DocumentCode
679793
Title
Binarization and its evaluation for Urdu Nastalique document images
Author
Naz, Mamoona ; ul Ain Akram, Qurat ; Hussain, Sarmad
Author_Institution
Center for Language Eng., Al-Khawarizmi Inst. of Comput. Sci., Univ. of Eng. & Technol., Lahore, Pakistan
fYear
2013
fDate
19-20 Dec. 2013
Firstpage
213
Lastpage
218
Abstract
Binarization converts a color or grayscale image into a black-and-white image and is normally a preliminary step in optical character recognition. Binarizing images of Urdu documents written in the Nastalique writing style requires particular attention because Nastalique is not written with a uniform stroke but as a sequence of thin and thick strokes with a variety of marks. In the current work, three binarization methods are compared to determine an accurate and efficient technique for Urdu. This technique is then tuned for Urdu document images written in the Nastalique writing style, so that thin character connections are not broken and, at the same time, thickened strokes do not join diacritics to main bodies.
Keywords
document image processing; optical character recognition; Nastalique writing style; Urdu Nastalique document images; Urdu language documents; binarization methods; colored image; gray scale image; Accuracy; Character recognition; Lighting; Optical character recognition software; Optical imaging; Standards; Writing; Urdu Optical Character Recognition; Urdu image corpus; binarization
fLanguage
English
Publisher
IEEE
Conference_Titel
Multi Topic Conference (INMIC), 2013 16th International
Conference_Location
Lahore
Type
conf
DOI
10.1109/INMIC.2013.6731352
Filename
6731352