• DocumentCode
    3695201
  • Title
    Automated analysis of line plots in documents
  • Author
    Rathin Radhakrishnan Nair;Nishant Sankaran;Ifeoma Nwogu;Venu Govindaraju
  • Author_Institution
    Department of Computer Science and Engineering, University at Buffalo, NY 14260-1660, USA
  • fYear
    2015
  • Firstpage
    796
  • Lastpage
    800
  • Abstract
    Information graphics, such as graphs and plots, are used in technical documents to convey information to humans and to facilitate greater understanding. Graphics are usually a key component of a technical document, as they enable the author to convey complex ideas in a simplified visual format. However, automatic text recognition systems, which are typically used to digitize documents, lose the ideas conveyed in graphical form. We contend that the information extracted from these graphics can be used to better understand the ideas conveyed in the document. In scientific papers, line plots are the most commonly used graphic for representing experimental results, showing the correlation between the values represented on the axes. The contribution of our work is a series of image processing algorithms that automatically extract relevant information, including text and plots, from graphics found in technical documents. We validate the approach by running experiments on a dataset of line plots obtained from computer science conference papers, and we evaluate the variation of a reconstructed curve from the original curve. Our algorithm achieves a classification accuracy of 91% across the dataset and successfully extracts the axes from 92% of line plots. Axes label extraction and line curve tracing succeed in about half of the line plots.
  • Keywords
    "Three-dimensional displays","Accuracy","Image color analysis"
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICDAR.2015.7333871
  • Filename
    7333871