Title :
N-gram Index Structure Study for Semantic Based Mathematical Formula
Author :
Yuexia Xu ; Wei Su ; Ming Cheng ; Zhiyi Qu ; Hui Li
Author_Institution :
Coll. of Inf. Sci. & Eng., Lanzhou Univ., Lanzhou, China
Abstract :
Mathematical formula retrieval has recently become an active and difficult problem in information science research. This paper presents an N-gram division method for mathematical formulas, determines the division granularity experimentally, and proposes a method for calculating a sub-formula's weight from the complexity of the formula, the length of the N-gram, and depth. It also considers the effect of operators on sub-formula weights and gives a method for calculating the share of the whole formula's weight that a sub-formula carries. Experiments show that the N-gram division and index construction methods help sub-formula matching and weight computation, and that they improve recall and precision for sub-formulas. The method is fast and feasible, and it can enhance the semantic search capabilities of mathematical search.
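Illustrative sketch (not taken from the paper): the abstract does not give the division rules or the weight formula, so the Python below only shows the general idea of splitting a formula into N-grams and building an inverted index for sub-formula lookup. It assumes a formula has already been flattened into a token list, whereas the paper most likely works on MathML structure. The names ngrams, build_index, n_min and n_max are hypothetical, and no weighting is attempted.

```python
from collections import defaultdict

def ngrams(tokens, n_min=1, n_max=3):
    """Yield every contiguous run of n_min..n_max tokens as a tuple."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])

def build_index(formulas, n_min=1, n_max=3):
    """Map each N-gram to the set of formula ids that contain it."""
    index = defaultdict(set)
    for fid, tokens in formulas.items():
        for gram in ngrams(tokens, n_min, n_max):
            index[gram].add(fid)
    return index

# Assumed toy corpus: formulas flattened into token lists.
formulas = {
    "f1": ["a", "+", "b"],
    "f2": ["a", "+", "b", "*", "c"],
}
index = build_index(formulas)

# Looking up a sub-formula is then a single dictionary access.
matches = index[("a", "+", "b")]
```

In this sketch the granularity is the pair (n_min, n_max); the abstract states that the authors chose their granularities experimentally, so the values 1 and 3 here are placeholders only.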
Keywords :
information retrieval; mathematics computing; pattern matching; search engines; N-gram index structure study; N-grams division method; formula complexity; index construction; information science research; semantic based mathematical formula retrieval; semantic search capabilities; subformula matching; subformula weight; weighting computation; Complexity theory; Educational institutions; Indexing; Mathematical model; Search engines; Semantics; Formula search; Math Search; MathML; N-grams division; Search engine; Sub-formula weight;
Conference_Title :
Computational Intelligence and Security (CIS), 2014 Tenth International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4799-7433-7
DOI :
10.1109/CIS.2014.174