  • DocumentCode
    2014783
  • Title
    Searching for Tables in Digital Documents
  • Author
    Liu, Ying; Bai, Kun; Mitra, Prasenjit; Giles, C. Lee
  • Author_Institution
    Pennsylvania State Univ., University Park
  • Volume
    2
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    934
  • Lastpage
    938
  • Abstract
    Tables are ubiquitous. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. Current search engines do not allow the end-user to search for relevant tables. In this paper, we describe TableSeer, an automatic table extraction and search engine system. TableSeer crawls scientific documents, identifies documents with tables, extracts tables from documents, indexes them, and enables end-users to search for tables. We also propose an extensive set of medium-independent metadata for table representation. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank. Our results show that TableSeer outperforms popular search engines, such as Google Scholar, when the end-user searches for tables.
  • Keywords
    search engines; ubiquitous computing; TableSeer; automatic table extraction; digital documents; innovative ranking algorithm; medium-independent metadata; search engine system; statistical data; Data mining; Displays; Economic indicators; Floods; Image retrieval; Indexing; Information retrieval; Internet; Search engines; Software libraries;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Parana
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.2007.4377052
  • Filename
    4377052