Title :
Efficiency analysis of inflection rule induction
Author :
Szabo, Gabor ; Kovacs, Laszlo
Author_Institution :
Inst. of Inf. Technol., Univ. of Miskolc, Miskolc, Hungary
Abstract :
Word inflection is an important area of computerized linguistics for agglutinative languages. This paper provides an overview of the two main algorithms for learning inflection rules. The TASR and OSTIA methods are implemented and analyzed with real-life data from the Hungarian language. The main novelty of the research is the development of a robust method for generating training and test data from documents available on the Internet. The implementation language is Java, since Java 8 offers features for parallel and functional programming that can be leveraged in this big data analysis task. The tests performed show that current methods cannot provide both high accuracy and high cost efficiency at the same time.
Keywords :
Big Data; Internet; Java; computational linguistics; data analysis; document handling; learning (artificial intelligence); Hungarian language; Java 8; OSTIA methods; TASR methods; agglutinative languages; big data analysis task; computerized linguistics; inflection rule induction; inflection rule learning; Grammar; Pragmatics; grammar induction; word inflection
Conference_Title :
Carpathian Control Conference (ICCC), 2015 16th International
Conference_Location :
Szilvasvarad
Print_ISBN :
978-1-4799-7369-9
DOI :
10.1109/CarpathianCC.2015.7145135