DocumentCode :
3087439
Title :
On Devanagari document processing
Author :
Sinha, R.M.K. ; Bansal, Veena
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kanpur, India
Volume :
2
fYear :
1995
fDate :
22-25 Oct 1995
Firstpage :
1621
Abstract :
The Devanagari document processing system discussed here makes use of various knowledge sources at all levels. Extraction of the text zone from a document is a preprocessing stage that uses document layout knowledge represented syntactically. The text zone is then segmented into lines, lines into words, and words into characters. Since a Devanagari character is a complex composition of symbols, various algorithms are used to further segment each character into its constituent symbols instead of treating the character as a unit. The symbols are then recognized using various features that are extracted and saved during the training phase. The recognized symbols are recomposed and sent for validation through a partitioned dictionary.
Keywords :
character recognition; document handling; feature extraction; image segmentation; learning systems; Devanagari document processing; Devanagari characters; dictionary; document layout; learning system; segmentation; symbol recognition; text zone extraction; Natural languages; Partitioning algorithms; Research and development; Shape; Synthesizers; Text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics, 1995. Intelligent Systems for the 21st Century., IEEE International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
0-7803-2559-1
Type :
conf
DOI :
10.1109/ICSMC.1995.538004
Filename :
538004