DocumentCode
3340194
Title
Categorization of On-Line Handwritten Documents
Author
Saldarriaga, Sebastián Peña ; Morin, Emmanuel ; Viard-Gaudin, Christian
Author_Institution
Univ. de Nantes, Nantes
fYear
2008
fDate
16-19 Sept. 2008
Firstpage
95
Lastpage
102
Abstract
With the growth of on-line handwriting technologies, facilities for managing handwritten documents, such as retrieval of documents by topic, are required. These documents can contain graphics, equations, or text, for instance. This work reports experiments on the categorization of on-line handwritten documents based on their textual content. We assume that handwritten text blocks have been extracted from the documents and, as a first step of the proposed system, we process them with an existing handwriting recognition engine. We analyse the effect of the word recognition rate on categorization performance, and we compare the results with those obtained from the same texts available as ground truth. Two categorization algorithms (kNN and SVM) are compared. The handwritten texts, collected from more than 1500 writers, are a subset of the Reuters-21578 corpus. Results show that there is no significant loss in categorization performance when the word error rate stays below 22%.
Keywords
handwritten character recognition; text analysis; Reuters-21578 corpus; handwritten recognition engine; online handwriting technologies; online handwritten document categorization; Engines; Graphics; Handwriting recognition; Optical character recognition software; Personal digital assistants; Technology management; Text categorization; Text recognition; Writing; Noisy Text; On-line Documents
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location
Nara
Print_ISBN
978-0-7695-3337-7
Type
conf
DOI
10.1109/DAS.2008.45
Filename
4669950