DocumentCode :
3776032
Title :
Character recognition of medieval English manuscripts supported by a word frequency table
Author :
Kei Tanaka;Kengo Terasawa
Author_Institution :
Department of Media Architecture, Future University Hakodate, 116-2 Kameda-Nakanocho, Hakodate, Hokkaido, 041-8655 Japan
fYear :
2015
Firstpage :
700
Lastpage :
704
Abstract :
This paper proposes a method for reducing the effort involved in transcribing historical documents. The method consists of preprocessing, line and word segmentation, and word clustering stages. In the line segmentation stage, the borders around lines are determined using dynamic programming so that they avoid the influence of letter ascenders and descenders. In the word clustering stage, we propose a novel method, essentially a hierarchical cluster analysis, that uses a word frequency table as supplementary information. The effectiveness of the proposed method is evaluated experimentally by comparing it with a baseline method that does not use a word frequency table. The experiments confirmed that the proposed method outperforms the baseline method.
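The abstract does not state the cost function or path constraints used for the line borders, so the following is only a minimal sketch of one common dynamic-programming formulation, not the authors' implementation. It assumes a binarized page given as a list of rows (1 = ink, 0 = background) and searches for a left-to-right path that crosses as little ink as possible while moving at most one row up or down per column. The names `find_line_border`, `ink`, `row_lo`, and `row_hi` are hypothetical.

```python
def find_line_border(ink, row_lo, row_hi):
    """Return one row index per column for a minimum-ink left-to-right path.

    Assumption (not from the paper): the path may shift by at most one row
    between neighbouring columns and must stay within rows row_lo..row_hi.
    """
    n_cols = len(ink[0])
    rows = range(row_lo, row_hi + 1)
    inf = float("inf")
    # cost[r][c]: least ink crossed by any allowed path ending at (r, c)
    cost = {r: [inf] * n_cols for r in rows}
    # back[r][c]: row of the predecessor cell in column c - 1
    back = {r: [r] * n_cols for r in rows}
    for r in rows:
        cost[r][0] = ink[r][0]
    for c in range(1, n_cols):
        for r in rows:
            # Try the same row first so that ties keep the border straight.
            for p in (r, r - 1, r + 1):
                if row_lo <= p <= row_hi:
                    candidate = cost[p][c - 1] + ink[r][c]
                    if candidate < cost[r][c]:
                        cost[r][c] = candidate
                        back[r][c] = p
    # Backtrack from the cheapest cell in the last column.
    end = min(rows, key=lambda r: cost[r][n_cols - 1])
    path = [end]
    for c in range(n_cols - 1, 0, -1):
        path.append(back[path[-1]][c])
    path.reverse()
    return path


# Toy page: text lines on rows 0 and 4, with a descender from the upper
# line reaching down into rows 1 and 2 at column 3.
ink = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
path = find_line_border(ink, 1, 3)
crossed = sum(ink[r][c] for c, r in enumerate(path))
```

On this toy page a straight border along row 2 would cut through the descender, whereas the path found here passes beneath it at column 3 and crosses no ink at all.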
Keywords :
"Conferences","Pattern recognition"
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ACPR), 2015 3rd IAPR Asian Conference on
Electronic_ISBN :
2327-0985
Type :
conf
DOI :
10.1109/ACPR.2015.7486593
Filename :
7486593