Title :
Symbol Knowledge Extraction from a Simple Graphical Language
Author :
Li, Jinpeng ; Mouchere, Harold ; Viard-Gaudin, Christian
Author_Institution :
IRCCyN, Univ. de Nantes, Nantes, France
Abstract :
In this paper, we study the problem of symbol knowledge extraction. We assume that a set of unknown symbols is used to compose handwritten messages, and we aim to recover, from a dataset of handwritten samples, the symbol set used in the corresponding language. We apply our approach to online handwriting and select the domain of numerical expressions, which mix digits and operators, to test its ability to retrieve the corresponding symbol classes. The proposed method consists of three steps: a quantization of the stroke space, a description of the layout of strokes with a relational graph, and the extraction of an optimal lexicon using a minimum description length algorithm. At the symbol level, a recall rate of 74% is obtained on the test dataset produced by 100 writers.
Keywords :
graph theory; handwriting recognition; knowledge acquisition; symbol manipulation; visual languages; graphical language; handwritten message; minimum description length algorithm; numerical expressions; online handwriting; optimal lexicon extraction; relational graph; stroke layout description; stroke space quantization; symbol classes; symbol knowledge extraction; symbol set; Clustering algorithms; Databases; Grammar; Huffman coding; Prototypes; Training; Viterbi algorithm; knowledge extraction; minimum description length; spatial relation
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1350-7
ISSN :
1520-5363
DOI :
10.1109/ICDAR.2011.128