Title :
A method for connecting disappeared junction patterns on frame lines in form documents
Author :
Shinjo, Hiroshi ; Nakashima, Kazuki ; Koga, Masashi ; Marukawa, Katsumi ; Shima, Yoshihiro ; Hadano, Eiichi
Author_Institution :
Hitachi Ltd, Japan
Abstract :
Form document structure analysis is an essential technique for recognizing the positions of characters in general forms. However, it has a fundamental problem: interruptions of lines, as well as noise, lead to incorrect analysis. The paper focuses on a method for connecting junction patterns in which portions of the horizontal and vertical lines are not visible, referred to as “disappeared junction patterns”. Our method has two key stages for making correct connections. The first is noise elimination, in which lines that are shorter than the minimum line length parameter and whose two end points meet no other lines are eliminated. The second is object line selection, in which only the frame lines of tables are selected as object lines for connection. Experiments with 39 form images demonstrated the feasibility of this method.
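The noise elimination rule stated in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Segment` type, the `point_touches` helper, the pixel tolerance `tol`, and the restriction to axis-aligned segments are all assumptions made for illustration. Only the rule itself (drop a line when both end points meet no other line and it is shorter than the minimum line length) comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical axis-aligned line segment in pixel coordinates."""
    x1: int
    y1: int
    x2: int
    y2: int

    def length(self) -> int:
        # For an axis-aligned segment one of the two terms is zero.
        return abs(self.x2 - self.x1) + abs(self.y2 - self.y1)


def point_touches(px: int, py: int, seg: Segment, tol: int) -> bool:
    """True if (px, py) lies within tol pixels of the axis-aligned segment."""
    lo_x, hi_x = sorted((seg.x1, seg.x2))
    lo_y, hi_y = sorted((seg.y1, seg.y2))
    return lo_x - tol <= px <= hi_x + tol and lo_y - tol <= py <= hi_y + tol


def eliminate_noise(segments, min_length, tol=1):
    """Drop segments that are short AND have both end points isolated.

    Isolation is tested against the original list, so the result does not
    depend on the order in which segments are visited (an assumption; the
    abstract does not specify this).
    """
    kept = []
    for i, seg in enumerate(segments):
        others = [s for j, s in enumerate(segments) if j != i]
        end1_free = not any(point_touches(seg.x1, seg.y1, o, tol) for o in others)
        end2_free = not any(point_touches(seg.x2, seg.y2, o, tol) for o in others)
        if end1_free and end2_free and seg.length() < min_length:
            continue  # treated as noise
        kept.append(seg)
    return kept


segments = [
    Segment(0, 0, 100, 0),    # top frame line
    Segment(0, 0, 0, 50),     # left frame line
    Segment(40, 20, 43, 20),  # short isolated speck
    Segment(100, 0, 100, 5),  # short, but one end meets the top frame line
]
cleaned = eliminate_noise(segments, min_length=10)
```

In this example only the isolated speck is removed; the short stub at the right end of the top frame line survives because one of its end points meets another line, which is the kind of fragment a later junction-connection stage would need to keep.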
Keywords :
business data processing; business forms; document image processing; interference suppression; optical character recognition; character recognition; disappeared junction pattern connection; form document structure analysis; form images; frame lines; minimum line length parameter; noise elimination; object line selection; vertical lines; Joining processes; Text analysis
Conference_Title :
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
DOI :
10.1109/ICDAR.1997.620590