DocumentCode :
1582099
Title :
Three approaches to "industrial" table spotting
Author :
Klein, Bertin ; Gokkus, Serdar ; Kieninger, Thomas ; Dengel, Andreas
Author_Institution :
Insiders Inf. Manage. GmbH, Kaiserslautern, Germany
fYear :
2001
fDate :
2001
Firstpage :
513
Lastpage :
517
Abstract :
This paper introduces three approaches that enable an industrial, comprehensive document analysis system to spot tables in documents. Searching for a set of known table headers (approach 1) works rather well in a significant number of documents. However, although this approach is implemented to tolerate OCR errors, it is not tolerant enough of some kinds of even minor aberrations. This not only degrades the recognition results but, even worse, makes users feel uncomfortable. Pragmatically trying to mimic what the human eye might key on leads to our two further, complementary approaches: searching for layout structures that resemble parts of columns (approach 2), and searching for groupings of similar lines (approach 3). To be suitable for our system, the approaches must be very simple to implement, simple to explain to users, computationally cheap, and combinable. In the domain of health insurers, which receive huge amounts of so-called medical liquidations on a daily basis, we obtain very good results. On document samples representative of the everyday practice of five customers (health insurance companies), tables were spotted as well and as fast as the customers expected of the system. We thus consider our current approaches a step towards cognitive adequacy.
Keywords :
document image processing; medical administrative data processing; document analysis; document analysis and understanding; health insurances; layout structures; table headers; table recognition; table spotting; Databases; Eyes; Humans; Information analysis; Information management; Insurance; Layout; Medical treatment; Optical character recognition software; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
Type :
conf
DOI :
10.1109/ICDAR.2001.953842
Filename :
953842