DocumentCode
2014783
Title
Searching for Tables in Digital Documents
Author
Liu, Ying ; Bai, Kun ; Mitra, Prasenjit ; Giles, C. Lee
Author_Institution
Pennsylvania State Univ., University Park
Volume
2
fYear
2007
fDate
23-26 Sept. 2007
Firstpage
934
Lastpage
938
Abstract
Tables are ubiquitous. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. Current search engines do not allow the end-user to search for relevant tables. In this paper, we describe TableSeer, an automatic table extraction and search engine system. TableSeer crawls scientific documents, identifies documents with tables, extracts tables from documents, indexes them, and enables end-users to search for tables. We also propose an extensive set of medium-independent metadata for table representation. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank. Our results show that TableSeer outperforms popular search engines, such as Google Scholar, when the end-user searches for tables.
Keywords
search engines; ubiquitous computing; TableSeer; automatic table extraction; digital documents; innovative ranking algorithm; medium-independent metadata; search engine system; statistical data; Data mining; Displays; Economic indicators; Floods; Image retrieval; Indexing; Information retrieval; Internet; Search engines; Software libraries
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location
Parana
ISSN
1520-5363
Print_ISBN
978-0-7695-2822-9
Type
conf
DOI
10.1109/ICDAR.2007.4377052
Filename
4377052