Title :
Table Content Understanding in smartFIX
Author :
Deckert, S. ; Seidler, Benjamin ; Ebbecke, Markus ; Gillmann, Michael
Author_Institution :
Insiders Technol. GmbH, Kaiserslautern, Germany
Abstract :
The analysis of table structures and the retrieval of table contents are widely agreed to be difficult challenges in the area of document analysis systems. Instead of extracting the layout of tables, we are interested in understanding their content. In this paper, we present and discuss the smartFIX approach to table recognition and content extraction. Rather than relying on layout features only, we recognize tables by taking into account the presence and semantics of data entities that we expect to find contained in a table. The relationship of a document, including a table, to a specific business process helps shape useful knowledge and expectations about the table's content. smartFIX is a commercial document analysis system that meets the full range of industrial requirements. Therefore, smartFIX must locate the tables and extract their business-process-relevant information with high reliability.
Keywords :
business data processing; content management; document handling; information retrieval; pattern recognition; business process aids; content extraction; data entities; document analysis systems; document capturing systems; layout features; semantics; smartFIX; table content understanding; table contents retrieval; table recognition; table structures analysis; Business; Databases; Layout; Measurement; Semantics; Text analysis; document analysis; table analysis; table content extraction; table understanding
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1350-7
ISSN :
1520-5363
DOI :
10.1109/ICDAR.2011.104