DocumentCode :
1584656
Title :
A priori algorithm for sub-category classification analysis of handwriting
Author :
Cha, Sung-Hyuk ; Srihari, Sargur N.
Author_Institution :
Sch. of Comput. Sci. & Inf. Syst., Pace Univ., Pleasantville, NY, USA
fYear :
2001
fDate :
2001
Firstpage :
1022
Lastpage :
1025
Abstract :
The sub-category classification problem is that of discriminating a pattern among all sub-categories. Sub-category classification performance estimates are useful information to mine, as many researchers are interested in trends of patterns within specific sub-categories. This paper presents a data mining technique for mining a database consisting of experimental and observational unit variables. Experimental unit variables are attributes that define the sub-categories of an entity, e.g., demographic data; observational unit variables are features observed in order to classify the entity, e.g., test results or handwriting styles. Since the experimental unit variables yield an enormous number of sub-categories, we apply the a priori algorithm to select, from all possible sub-categories in a given database, only those that have enough support. The selected sub-categories are then discriminated using the observational unit variables as input features to an Artificial Neural Network (ANN) classifier. The importance of this paper is twofold. First, we propose an algorithm that quickly selects all sub-categories that have both sufficient support and a sufficient classification rate. Second, we successfully applied the proposed algorithm to the field of handwriting analysis. The task is to determine the similarity of handwriting style within a specific group of people. Document examiners are interested in trends in the handwriting of specific groups, e.g., (i) does a male write differently from a female? (ii) can we distinguish the handwriting of the 25-45 age group from that of others? Subgroups of white males in the age group 15-24 and white females in the age group 45-64 show 87% correct classification performance.
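The support-based selection step described in the abstract can be illustrated with a minimal level-wise sketch. This is not the authors' implementation: the function name `frequent_subcategories`, the record layout (one dict of experimental unit variables per writer), the attribute names, and the `min_support` threshold are all assumptions made for illustration. The second stage, classification with an ANN, is not covered here.

```python
from itertools import combinations


def frequent_subcategories(records, min_support):
    """Return {sub-category: support count} for every sub-category whose
    support is at least min_support. A sub-category is a frozenset of
    (attribute, value) pairs with at most one value per attribute."""

    def support(subcat):
        # Number of records that match every (attribute, value) pair.
        return sum(all(r.get(a) == v for a, v in subcat) for r in records)

    # Level 1: every single (attribute, value) pair seen in the data.
    items = {(a, v) for r in records for a, v in r.items()}
    level = {}
    for item in items:
        count = support(frozenset([item]))
        if count >= min_support:
            level[frozenset([item])] = count
    frequent = dict(level)

    k = 1
    while level:
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Skip contradictory combinations such as two genders at once.
            if len({attr for attr, _ in union}) != k + 1:
                continue
            # Apriori pruning: every k-subset must itself be frequent.
            if all(frozenset(s) in level for s in combinations(union, k)):
                candidates.add(union)
        level = {}
        for cand in candidates:
            count = support(cand)
            if count >= min_support:
                level[cand] = count
        frequent.update(level)
        k += 1
    return frequent


# Hypothetical toy data; the attribute names and values are illustrative only.
records = [
    {"gender": "male", "age": "15-24"},
    {"gender": "male", "age": "15-24"},
    {"gender": "male", "age": "45-64"},
    {"gender": "female", "age": "45-64"},
    {"gender": "female", "age": "45-64"},
    {"gender": "female", "age": "15-24"},
]
result = frequent_subcategories(records, min_support=2)
```

Because a sub-category can never have more support than any of its generalizations, candidates at each level are built only from sub-categories that already passed the threshold, which avoids counting most of the possible combinations.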
Keywords :
data mining; handwriting recognition; neural nets; pattern classification; Artificial Neural Network; classification; datamining technique; handwriting analysis; performance estimates; sub-categories; sub-category classification; Algorithm design and analysis; Artificial neural networks; Classification algorithms; Computer science; Data mining; Demography; Information analysis; Information systems; Spatial databases; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
Type :
conf
DOI :
10.1109/ICDAR.2001.953940
Filename :
953940