DocumentCode
2016437
Title
On Using Classical Poetry Structure for Indian Language Post-Processing
Author
Namboodiri, Anoop M. ; Narayanan, P.J. ; Jawahar, C.V.
Author_Institution
Int. Inst. of Inf. Technol., Hyderabad
Volume
2
fYear
2007
fDate
23-26 Sept. 2007
Firstpage
1238
Lastpage
1242
Abstract
Post-processors are critical to the performance of language recognizers such as OCR engines and speech recognizers. Dictionary-based post-processing commonly employs either an algorithmic approach or a statistical approach. Other linguistic features are not exploited for this purpose. The language analysis is also largely limited to the prose form. This paper proposes a framework to use the rich metric and formal structure of classical poetic forms in Indian languages for post-processing the output of a recognizer such as an OCR engine. We show that the structure present in the form of the vrtta and prasa can be efficiently used to disambiguate some cases that may be difficult for an OCR. The approach is efficient and complementary to other post-processing approaches, and can be used in conjunction with them.
Keywords
natural language processing; optical character recognition; Indian language postprocessing; classical poetry structure; dictionary-based postprocessing; language recognizer; Dictionaries; Engines; Error correction; Information technology; Natural languages; Optical character recognition software; Robustness; Speech enhancement; Speech recognition; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location
Parana
ISSN
1520-5363
Print_ISBN
978-0-7695-2822-9
Type
conf
DOI
10.1109/ICDAR.2007.4377113
Filename
4377113