DocumentCode
2216361
Title
Recognizing broken characters in Thai Historical documents
Author
Sumetphong, Chaivatna ; Tangwongsan, Supachai
Author_Institution
Fac. of Inf. & Commun. Technol., Mahidol Univ., Bangkok, Thailand
Volume
1
fYear
2010
fDate
20-22 Aug. 2010
Abstract
One of the biggest challenges in restoring historical documents is achieving a high level of OCR accuracy. A defining characteristic of these valuable but degraded documents is the abundance of broken characters. This paper formulates that problem as a mathematical model and proposes a novel solution based on set partitions to recognize broken characters in Thai historical documents. Experiments based on this solution have been performed, and the results are very promising.
Keywords
character recognition; document image processing; image restoration; mathematical analysis; natural language processing; OCR accuracy; Thai historical document; broken character recognition; degraded document; historical document restoration; mathematical model; set-partition; Character recognition; Broken Characters; Error Correction; Optical Character Recognition; Set-Partitions; Thai Historical Documents
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on
Conference_Location
Chengdu
ISSN
2154-7491
Print_ISBN
978-1-4244-6539-2
Type
conf
DOI
10.1109/ICACTE.2010.5579053
Filename
5579053