OCR for Malayalam script using neural networks

Author

Rahiman, M.A. ; Rajasree, M.S.

Author_Institution

Dept. of Comput. Sci. & Engg, LBS Inst. of Technol. for Women, Trivandrum, India

fYear

2009

fDate

12-14 Oct. 2009

Firstpage

1

Lastpage

6

Abstract

This paper describes an OCR system for printed Malayalam characters. Malayalam is the principal language of the South Indian state of Kerala. The input to the system is a scanned image of a page of text, and the output is a machine-editable file. Malayalam character recognition is a complex task because two scripts are in use, the old script and the new script, and because of the large number of combinational characters. The image is first preprocessed to remove noise, and skew correction is then applied to the document. Lines, words and characters are segmented from the processed document image. The proposed method uses wavelet analysis to extract features from the image, and a backpropagation neural network performs the recognition.
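
The segmentation and feature-extraction stages named in the abstract can be illustrated with a minimal sketch. The abstract does not say which segmentation technique or which wavelet the authors use, so everything here is an assumption: line segmentation by horizontal projection profile, an unnormalised (averaging) Haar wavelet, and the hypothetical function names `segment_lines` and `haar_features`. Noise removal, skew correction and the neural network classifier are not shown.

```python
import numpy as np


def segment_lines(binary):
    """Find text lines in a binarised page (ink = 1, background = 0).

    Assumed technique: horizontal projection profile. A row belongs to
    a line if it contains any ink. Returns (start, end) row ranges,
    with `end` exclusive.
    """
    has_ink = binary.sum(axis=1) > 0
    lines, start = [], None
    for row, ink in enumerate(has_ink):
        if ink and start is None:
            start = row                    # first inked row of a line
        elif not ink and start is not None:
            lines.append((start, row))     # blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(has_ink)))
    return lines


def haar_features(glyph, levels=2):
    """Feature vector from the Haar approximation sub-band.

    Each level averages 2x2 blocks (unnormalised Haar low-pass), which
    halves both dimensions. The glyph's height and width must be
    divisible by 2**levels.
    """
    feat = np.asarray(glyph, dtype=float)
    for _ in range(levels):
        rows = (feat[0::2, :] + feat[1::2, :]) / 2.0   # average row pairs
        feat = (rows[:, 0::2] + rows[:, 1::2]) / 2.0   # average column pairs
    return feat.ravel()


# Toy page with two "text lines" separated by blank rows.
page = np.zeros((10, 8), dtype=int)
page[1:3, 2:6] = 1
page[5:8, 1:7] = 1
line_ranges = segment_lines(page)

# Toy 8x8 glyph reduced to a 4-element feature vector.
features = haar_features(np.ones((8, 8)), levels=2)
```

Applying the same projection idea along columns within each line would separate words and characters. In a full system the feature vectors would be the inputs to the backpropagation network.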

Keywords

backpropagation; feature extraction; image segmentation; neural nets; optical character recognition; wavelet transforms; Malayalam character recognition; backpropagation neural network; document image segmentation; machine editable file output; new script; old script; optical character recognition; skew correction methods; wavelet analysis; Character recognition; Computer science; Feature extraction; Image analysis; Image segmentation; Natural languages; Neural networks; Optical character recognition software; Optical noise; Writing; Feature extraction; Malayalam Character; Optical Character recognition; Segmentation; Wavelet;

fLanguage

English

Publisher

IEEE

Conference_Titel

Ultra Modern Telecommunications & Workshops, 2009. ICUMT '09. International Conference on

Conference_Location

St. Petersburg

Print_ISBN

978-1-4244-3942-3

Electronic_ISBN

978-1-4244-3941-6

Type

conf

DOI

10.1109/ICUMT.2009.5345474

Filename

5345474