DocumentCode :
2630147
Title :
Structural recognition of tabulated data
Author :
Chandran, Surekha ; Kasturi, Rangachar
Author_Institution :
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
fYear :
1993
fDate :
20-22 Oct 1993
Firstpage :
516
Lastpage :
519
Abstract :
A system for extracting the structural information of a table from its image is discussed. Following initial binarization and deskewing operations, the image is scanned to extract all horizontal and vertical lines that are present. The table's dimensions are estimated based on these lines. Unlike other systems, the procedure described does not rely solely on the existence of lines to mark the item blocks. White streams are recognized in both the horizontal and vertical directions as substitutes for any missing demarcation lines. A structure interpretation procedure uses the extracted demarcation information to identify each of the item blocks in the table. Subsequently, the interrelations of these item blocks are used to recognize the structure of the tabulated data.
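The abstract does not say how white streams are detected. As a minimal sketch, assuming a simple projection-profile approach (an illustrative choice, not necessarily the authors' method), blank bands can be found by summing ink pixels along each row or column of the binarized image. The function name `white_streams`, the `min_width` parameter, and the toy image are hypothetical.

```python
from typing import List, Tuple


def white_streams(image: List[List[int]], axis: str,
                  min_width: int = 2) -> List[Tuple[int, int]]:
    """Find runs of completely blank rows or columns in a binarized image.

    image: rows of pixels, 1 = ink, 0 = background (assumed already deskewed).
    axis: "horizontal" looks for runs of blank rows,
          "vertical" looks for runs of blank columns.
    Returns inclusive (start, end) index ranges at least min_width wide.
    """
    if axis == "horizontal":
        profile = [sum(row) for row in image]
    elif axis == "vertical":
        profile = [sum(col) for col in zip(*image)]
    else:
        raise ValueError("axis must be 'horizontal' or 'vertical'")

    streams = []
    start = None
    # A trailing sentinel with ink closes a run that reaches the image edge.
    for i, ink in enumerate(profile + [1]):
        if ink == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_width:
                streams.append((start, i - 1))
            start = None
    return streams


# Toy 5x9 image: two blocks of ink separated by blank columns and blank rows.
image = [
    [1, 1, 1, 0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1, 1],
]

column_gaps = white_streams(image, "vertical")
row_gaps = white_streams(image, "horizontal")
print(column_gaps, row_gaps)
```

In this sketch the returned ranges would stand in for missing ruling lines: a vertical stream suggests a column boundary and a horizontal stream suggests a row boundary. A real system would likely need a noise tolerance rather than requiring strictly zero ink.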
Keywords :
feature extraction; image scanners; optical character recognition; deskewing operations; extracted demarcation information; information extraction; initial binarization; item blocks; missing demarcation lines; structural information; structural recognition; structure interpretation procedure; tabulated data; vertical direction; vertical lines; Computer science; Data mining; Data preprocessing; Detection algorithms; Image analysis; Image converters; Image resolution; Streaming media; Text analysis; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
Type :
conf
DOI :
10.1109/ICDAR.1993.395683
Filename :
395683