Title :
Typeset mathematical expression analysis
Author :
Jin, Jian-Ming ; Han, Zhi ; Wang, Qing-Ren
Author_Institution :
Inst. of Machine Intelligence, Nankai Univ., Tianjin, China
Abstract :
Mathematical expressions are common in scientific papers, but no OCR system can recognize a scanned expression. This paper presents a method for analyzing typeset mathematical expressions. The main idea is to decompose an expression into a series of sub-expressions and to determine the relations among them. The decomposition process, which is hierarchical and recursive, is illustrated by a decomposition tree. Eleven relations are defined, and the actual relationship among the sub-expressions is determined in five steps. To reduce the complexity of the original expression, mathematical glyphs are divided into three levels. Experimental results show that the method works well for analyzing a variety of complex expressions.
Keywords :
character recognition; document image processing; edge detection; trees (mathematics); complexity; decomposition tree; mathematical glyphs; multiple line expression detection; typeset mathematical expression; Equations; Image analysis; Image recognition; Image segmentation; MATLAB; Machine intelligence; Optical character recognition software; Text recognition; Typesetting
Conference_Title :
Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7803-7508-4
DOI :
10.1109/ICMLC.2002.1174541