Title :
OCR for Malayalam script using neural networks
Author :
Rahiman, M.A. ; Rajasree, M.S.
Author_Institution :
Dept. of Comput. Sci. & Engg, LBS Inst. of Technol. for Women, Trivandrum, India
Abstract :
This paper presents an OCR system for printed Malayalam characters. Malayalam is the principal language of the South Indian state of Kerala. The input to the system is a scanned image of a page of text, and the output is a machine-editable file. Malayalam character recognition is a complex task because of the presence of two scripts, the old script and the new script, and a large number of combinational characters. First, the image is preprocessed to remove noise. Skew correction methods are then applied to the document. Lines, words and characters are segmented from the processed document image. The proposed method uses wavelet analysis to extract features from the image, and a backpropagation neural network performs the recognition.
Keywords :
backpropagation; feature extraction; image segmentation; neural nets; optical character recognition; wavelet transforms; Malayalam character recognition; backpropagation neural network; document image segmentation; machine editable file output; new script; old script; skew correction methods; wavelet analysis; Character recognition; Computer science; Image analysis; Natural languages; Neural networks; Optical character recognition software; Optical noise; Writing; Malayalam Character; Segmentation; Wavelet
Conference_Title :
Ultra Modern Telecommunications & Workshops, 2009. ICUMT '09. International Conference on
Conference_Location :
St. Petersburg
Print_ISBN :
978-1-4244-3942-3
Electronic_ISBN :
978-1-4244-3941-6
DOI :
10.1109/ICUMT.2009.5345474
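
Illustrative sketch :
The abstract names line segmentation and wavelet feature extraction but does not say which methods the authors used. The following is a minimal sketch under stated assumptions, not the paper's implementation: it assumes a binarized image (1 = ink), splits lines with a horizontal projection profile, and takes the approximation band of a single-level, average-based Haar decomposition as the feature vector. The function names `segment_lines`, `haar_approximation` and `feature_vector` are hypothetical.

```python
def segment_lines(image):
    """Split a binary page image (rows of 0/1, 1 = ink) into text lines.

    Assumption: a horizontal projection profile, where any row without ink
    separates two lines. Returns (start_row, end_row) pairs, end exclusive.
    """
    lines = []
    start = None
    for row_index, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = row_index                 # first inked row of a new line
        elif not has_ink and start is not None:
            lines.append((start, row_index))  # blank row closes the line
            start = None
    if start is not None:                     # line running to the bottom edge
        lines.append((start, len(image)))
    return lines


def haar_approximation(glyph):
    """Approximation band of a one-level, average-based 2D Haar decomposition.

    Assumption: the glyph has even height and width. Each 2x2 block is
    replaced by its mean, halving both dimensions.
    """
    height, width = len(glyph), len(glyph[0])
    return [
        [
            (glyph[r][c] + glyph[r][c + 1]
             + glyph[r + 1][c] + glyph[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def feature_vector(glyph):
    """Flatten the approximation band into a list usable as classifier input."""
    return [value for row in haar_approximation(glyph) for value in row]
```

In a pipeline like the one the abstract outlines, such a feature vector would be the input to the backpropagation network; the network itself, noise removal and skew correction are not shown here.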