Title :
On the segmentation of multi-font printed Uygur scripts
Author :
Ymin, Anniwear ; Aoki, Yoshinao
Author_Institution :
Dept. of Inf. Eng., Hokkaido Univ., Sapporo, Japan
Abstract :
In many OCR systems, character segmentation is a necessary preprocessing step for character recognition. It is an important step because incorrectly segmented characters are unlikely to be recognized correctly. The most difficult case for character segmentation is cursive script, and Uygur is a cursive script. This paper addresses the problem of segmenting Uygur characters of various fonts and sizes in printed scripts. The segmentation technique proceeds as follows: line separation, word separation, and segmentation of each word into isolated characters. The last stage is a two-step algorithm consisting of topological segmentation and quasi-topological segmentation. Topological segmentation is based on tracing the outer contour of a given word. Quasi-topological segmentation decides where to section a character based on a combination of feature-extraction and character-width measurements. Our approach relies on the features of characters and fonts and on profile models.
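The first two stages named in the abstract, line separation and word separation, can be illustrated with projection profiles. The abstract mentions profile models but does not say how they are used, so what follows is a minimal sketch under that assumption and not the authors' method. The names `ink_runs`, `separate_lines`, `separate_words` and the `min_gap` parameter are hypothetical, and the image is assumed to be a binary grid with 1 for ink and 0 for background.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # assumed binary image: 1 = ink, 0 = background


def ink_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) runs of non-zero profile values.

    A blank gap shorter than min_gap does not split a run.
    """
    runs: List[Tuple[int, int]] = []
    start: Optional[int] = None
    last_ink = -1
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            elif i - last_ink - 1 >= min_gap:
                # The blank gap is wide enough: close the current run.
                runs.append((start, last_ink + 1))
                start = i
            last_ink = i
    if start is not None:
        runs.append((start, last_ink + 1))
    return runs


def separate_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal (row) profile."""
    row_profile = [sum(row) for row in image]
    return ink_runs(row_profile)


def separate_words(image: Image, top: int, bottom: int,
                   min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split one text line into words using the vertical (column) profile.

    min_gap is a hypothetical threshold: narrower blank gaps are treated
    as gaps inside a word, not as gaps between words.
    """
    width = len(image[0]) if image else 0
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    return ink_runs(col_profile, min_gap)
```

In a cursive script the characters inside a word are connected, so column profiles alone cannot isolate them. That is the stage for which the abstract describes contour tracing and character-width measurements, and it is not covered by this sketch.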
Keywords :
image segmentation; optical character recognition; OCR systems; character segmentation; character-width measurements; cursive scripts; feature-extraction measurements; incorrectly segmented characters; line separation; multifont printed Uygur script segmentation; quasi-topological segmentation; word separation; Asia; Character recognition; Cities and towns; Feature extraction; Image segmentation; Information science; Natural languages; Optical character recognition software; Text recognition;
Conference_Title :
Proceedings of the 13th International Conference on Pattern Recognition, 1996
Conference_Location :
Vienna
Print_ISBN :
0-8186-7282-X
DOI :
10.1109/ICPR.1996.546941