DocumentCode
3341235
Title
A Large-Scale Analysis of Mathematical Expressions for an Accurate Understanding of Their Structure
Author
Aly, Walaa ; Uchida, Seiichi ; Suzuki, Masakazu
Author_Institution
Kyushu Univ., Fukuoka
fYear
2008
fDate
16-19 Sept. 2008
Firstpage
549
Lastpage
556
Abstract
A wide variety of mathematical expressions printed in scientific and technical reports can be recognized by analyzing their two-dimensional layout structure. In this paper, the positional relation between adjacent characters is analyzed in order to automatically discriminate between baseline, subscript, and superscript characters. This analysis is one of the most important parts of structure analysis. The proposed method is very promising, with results reaching up to 99.76% on a very large database by using a distribution map. This distribution map is defined by two important features: relative size and relative position.
Keywords
document image processing; mathematics computing; optical character recognition; very large databases; automatic discrimination; distribution map; large-scale analysis; layout structure; math OCR; mathematical expressions; position relation; scientific reports; structure analysis; technical reports; very large database; Character recognition; Information analysis; Large-scale systems; Optical character recognition software; Pattern recognition; Performance analysis; Spatial databases; Text analysis; Text recognition; Writing; Baseline characters; Mathematical documents; Subscript characters; Superscript characters
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location
Nara
Print_ISBN
978-0-7695-3337-7
Type
conf
DOI
10.1109/DAS.2008.53
Filename
4670005