DocumentCode :
596344
Title :
A clustering strategy for touching characters in Korean and English printed text segmentation
Author :
Wahyono ; Kang-Hyun Jo
Author_Institution :
Dept. of Electr. Eng., Univ. of Ulsan, Ulsan, South Korea
fYear :
2012
fDate :
26-28 Nov. 2012
Firstpage :
23
Lastpage :
25
Abstract :
This paper proposes a segmentation method for mixed Korean and English printed text containing touching characters, based on a clustering strategy. First, the vertical projection of the text image is computed and a clustering process is performed on it. The cluster with the smallest mean value then supplies the candidate segmentation points, which produce candidate bounding boxes. Each box is verified against Korean or English character characteristics; boxes that do not conform are split or merged with their neighbors. Merging is based on Korean vowel characteristics, since Korean characters consist of several symbols, while splitting is done by clustering a local vertical projection. The proposed method achieves a correct segmentation rate of 99.36% for non-touching characters and 99.25% for touching characters. These results show that the clustering strategy is very effective for the touching-character problem in mixed Korean and English printed text. It also speeds up the segmentation process, because the method does not need a character recognizer to verify bounding boxes.
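The first stage described in the abstract (vertical projection, clustering, and taking the lowest-mean cluster as cut points) can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or how many clusters the authors use, so the two-cluster 1-D k-means below is an assumption, and all function names are hypothetical. The later verification, merge, and split steps are not shown.

```python
def vertical_projection(image):
    """Count foreground (1) pixels in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def two_means_1d(values, iterations=20):
    """Assumed clustering step: 1-D k-means with k=2. Returns (labels, means)."""
    means = [float(min(values)), float(max(values))]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two cluster means.
        labels = [0 if abs(v - means[0]) <= abs(v - means[1]) else 1
                  for v in values]
        # Recompute each mean from its current members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                means[k] = sum(members) / len(members)
    return labels, means


def candidate_boxes(image):
    """Return (first_col, last_col) spans separated by low-projection columns."""
    profile = vertical_projection(image)
    labels, means = two_means_1d(profile)
    low = 0 if means[0] <= means[1] else 1  # cluster with the smallest mean
    boxes, start = [], None
    for x, label in enumerate(labels):
        if label != low and start is None:
            start = x                      # a box opens here
        elif label == low and start is not None:
            boxes.append((start, x - 1))   # a candidate cut closes the box
            start = None
    if start is not None:
        boxes.append((start, len(labels) - 1))
    return boxes


# Toy image: two glyphs joined by a one-pixel bridge (column 3),
# then a blank column (7) before a third glyph.
image = [
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
]
print(candidate_boxes(image))  # [(0, 2), (4, 6), (8, 9)]
```

In this toy case the bridge column has a projection of 1 and falls into the low-mean cluster together with the blank column, so the touching pair is cut without any recognizer. A plain "projection equals zero" rule would miss that cut.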
Keywords :
character recognition; image segmentation; natural language processing; pattern clustering; robot vision; text analysis; English printed text segmentation method; Korean alphabet; Korean printed text segmentation method; Korean vowel characteristics; candidate bounding boxes; candidate segmentation point; local vertical projection clustering strategy; segmentation process; segmentation rate; touching characters; vertical image text projection; Ambient intelligence; Character recognition; Clustering algorithms; Image segmentation; Robots; Text recognition; Writing; Character Recognition; Clustering; Segmentation; Touching Character;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Ubiquitous Robots and Ambient Intelligence (URAI), 2012 9th International Conference on
Conference_Location :
Daejeon
Print_ISBN :
978-1-4673-3111-1
Electronic_ISBN :
978-1-4673-3110-4
Type :
conf
DOI :
10.1109/URAI.2012.6462921
Filename :
6462921