DocumentCode :
1640143
Title :
Extraction and interpretation of charts in technical documents
Author :
Kallimani, Jagadish S. ; Srinivasa, K.G. ; Eswara, Reddy B.
Author_Institution :
Dept. of CSE, JNTUK, Kakinada, India
fYear :
2013
Firstpage :
382
Lastpage :
387
Abstract :
Information extraction is a method for filtering information from large volumes of text. It includes the extraction of documents from collections and the tagging of particular terms in text. However, non-text information such as graphs, images, and figures is common in technical documents. Scientific charts are commonly used for the graphical representation of statistical, experimental, and technical data. They are major visual aids for data analysis because they are simple, clear, and widely used. Image understanding is the research area concerned with the design of, and experimentation with, computer systems that integrate models of a visual image problem domain. A system for the recognition and interpretation of simple bar graphs is proposed. The system generates a natural language description based on semantic understanding. An approach for chart interpretation and conversion to natural language text is discussed. Chart interpretation processes the graphical and textual components separately and then associates the graphical information with its corresponding textual data. The semantic meaning of the chart is presented in a natural language format.
Keywords :
data analysis; document image processing; information filtering; natural language processing; text analysis; bar graph; chart extraction; chart interpretation process; chart semantic meaning; computer system; data analysis; document extraction; experimental data; graphical components; graphical information; graphical representation; image understanding; information extraction; information filtering; natural language description; natural language format; natural language text conversion; nontext information; semantic understanding; statistical data; technical data; technical documents; text volume; textual components; textual data; visual aids; visual image problem domain; Bars; Data mining; Image edge detection; Image segmentation; Natural languages; Optical character recognition software; Transforms; Content Tagging Module; Image Understanding; Modified Probabilistic Hough Transform; NL Generator; Visual Extraction Module;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Computing, Communications and Informatics (ICACCI), 2013 International Conference on
Conference_Location :
Mysore
Print_ISBN :
978-1-4799-2432-5
Type :
conf
DOI :
10.1109/ICACCI.2013.6637202
Filename :
6637202