Title :
Learning syntactic tree patterns from a balanced Hungarian natural language database, the Szeged Treebank
Author :
Barta, Csongor ; Csendes, Dóra ; Csirik, János ; Hócza, András ; Kocsor, András ; Kovács, Kornél
Author_Institution :
Dept. of Appl. Informatics, Univ. of Szeged, Hungary
fDate :
30 Oct.-1 Nov. 2005
Abstract :
This paper has a twofold objective. On the one hand, it describes the creation and the features of the Szeged Treebank, which is currently the largest manually processed Hungarian textual database and serves as reference material for research in natural language processing. On the other hand, it gives detailed information about experiments aimed at the automatic recognition of syntactic structures using machine learning algorithms. In order to provide comparable results, we applied methods of different categories, namely a rule-based, a logic, and a numeric learner, to pre-defined parsing problems. The Szeged Treebank was used for training and testing the algorithms.
Keywords :
computational linguistics; grammars; learning (artificial intelligence); natural languages; Hungarian textual database; Szeged Treebank; automatic syntactic structure recognition; balanced Hungarian natural language database; logic learner; machine learning algorithm; natural language processing; numeric learner; pre-defined parsing problem; rule-based learner; syntactic tree pattern learning; Data mining; Informatics; Learning systems; Logic; Machine learning algorithms; Natural language processing; Natural languages; Spatial databases; Testing; Text recognition;
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
DOI :
10.1109/NLPKE.2005.1598739