Title :
Testing linear separability in classification of inflection rules
Author :
Toth, Zoltan ; Kovacs, Levente
Author_Institution :
Dept. of Inf. Technol., Univ. of Miskolc, Miskolc, Hungary
Abstract :
Agglutinative languages, such as Hungarian, use inflection to modify the meaning of words. Inflection is a string transformation that describes how a word is converted into its inflected form. The transformation can be described by a transformational string. Words can be classified by their transformational string, so inflection can be treated as a classification problem. Linear separability of the clusters is important for creating an efficient and accurate classification method. This paper reviews a linear-programming-based method for testing linear separability. The method was analyzed on generated data sets; these measurements showed that the time cost of the algorithm grows polynomially with the number of points. The accusative case of Hungarian was used to create a data set of 56,000 samples. The words were represented in vector space using alphabetical and phonetic encoding, each with left and right alignment, so four different representations of the words were used during the tests. Our test results showed that there are cluster pairs that are not linearly separable in each of the representations tested.
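The abstract does not give the exact linear program, so the following is only a minimal sketch of one common way to test linear separability with linear programming. It assumes SciPy is available and uses a pure feasibility problem: find a weight vector w and offset b such that w·x + b >= 1 for every point of one cluster and w·x + b <= -1 for every point of the other. The function name `is_linearly_separable` and the margin value of 1 are illustrative choices, not taken from the paper.

```python
from scipy.optimize import linprog


def is_linearly_separable(class_a, class_b):
    """Return True if some hyperplane strictly separates the two point sets.

    Hypothetical helper: the formulation is a standard feasibility LP,
    not necessarily the one used by the authors.
    """
    dim = len(class_a[0])
    # Unknowns are (w_1, ..., w_dim, b); linprog expects A_ub @ x <= b_ub.
    # class_a: w.x + b >= 1   rewritten as  -(w.x) - b <= -1
    # class_b: w.x + b <= -1  already in the required form
    a_ub = [[-v for v in p] + [-1.0] for p in class_a]
    a_ub += [list(p) + [1.0] for p in class_b]
    b_ub = [-1.0] * len(a_ub)
    result = linprog(
        c=[0.0] * (dim + 1),               # zero objective: feasibility only
        A_ub=a_ub,
        b_ub=b_ub,
        bounds=[(None, None)] * (dim + 1),  # w and b are unrestricted in sign
        method="highs",
    )
    # Status 0 means a feasible (optimal) point was found; 2 means infeasible.
    return result.status == 0


# Two clusters split by a vertical line, and the XOR layout as a counterexample.
print(is_linearly_separable([[0, 0], [0, 1]], [[1, 0], [1, 1]]))
print(is_linearly_separable([[0, 0], [1, 1]], [[0, 1], [1, 0]]))
```

For finite point sets, requiring a margin of 1 loses no generality, because any strictly separating hyperplane can be rescaled to satisfy it. The constraint count equals the total number of points, which is consistent with a running time that grows with the size of the clusters.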
Keywords :
linear programming; natural language processing; pattern classification; speech coding; Hungarian languages; agglutinative languages; alphabetical encoding; generated data sets; linear cluster separability; linear programming based testing method; phonetic encoding; string transformation; transformational string; Clustering algorithms; Encoding; Linear programming; Testing; Time measurement; Training; Vectors;
Conference_Title :
Intelligent Systems and Informatics (SISY), 2014 IEEE 12th International Symposium on
Conference_Location :
Subotica
DOI :
10.1109/SISY.2014.6923610