DocumentCode :
778891
Title :
Segmentation methods for character recognition: from segmentation to document structure analysis
Author :
Fujisawa, Hiromichi ; Nakano, Yasuaki ; Kurino, Kiyomichi
Author_Institution :
Hitachi Ltd., Tokyo, Japan
Volume :
80
Issue :
7
fYear :
1992
fDate :
7/1/1992
Firstpage :
1079
Lastpage :
1092
Abstract :
A pattern-oriented segmentation method for optical character recognition that leads to document structure analysis is presented. As a first example, segmentation of handwritten numerals that touch is treated. Connected pattern components are extracted, spatial interrelations between components are measured, and the components are grouped into meaningful character patterns. Stroke shapes are analyzed, and a method of finding the touching positions that separates about 95% of connected numerals correctly is described. Ambiguities are handled by multiple hypotheses and verification by recognition. An extended form of pattern-oriented segmentation, tabular form recognition, is considered. Images of tabular forms are analyzed, and frames in the tabular structure are extracted. By identifying semantic relationships between label frames and data frames, information on the form can be properly recognized.
Keywords :
document image processing; optical character recognition; OCR; character patterns; character recognition; connected pattern components; data frames; document structure analysis; handwritten numerals; label frames; multiple hypotheses; segmentation; semantic relationships; spatial interrelations; tabular form recognition; touching positions; Character recognition; Image segmentation; Optical character recognition software; Paper technology; Pattern classification; Pattern recognition; Pixel; Text analysis; Usability; Writing;
fLanguage :
English
Journal_Title :
Proceedings of the IEEE
Publisher :
IEEE
ISSN :
0018-9219
Type :
jour
DOI :
10.1109/5.156471
Filename :
156471