• DocumentCode
    3317276
  • Title

    Learning syntactic tree patterns from a balanced Hungarian natural language database, the Szeged Treebank

  • Author

    Barta, Csongor ; Csendes, Dóra ; Csirik, János ; Hócza, András ; Kocsor, András ; Kovács, Kornél

  • Author_Institution
    Dept. of Appl. Informatics, Univ. of Szeged, Hungary
  • fYear
    2005
  • fDate
    30 Oct.-1 Nov. 2005
  • Firstpage
    225
  • Lastpage
    231
  • Abstract
    The current paper has a twofold objective. On the one hand, it describes the creation and the features of the Szeged Treebank, which is currently the largest manually processed Hungarian textual database serving as a reference material for research in natural language processing. On the other hand, detailed information is given about different experiments that aimed at the automatic recognition of syntactic structures with the use of machine learning algorithms. In order to provide comparable results, we applied methods of different categories, namely a rule-based, a logic and a numeric learner to pre-defined parsing problems. The aforementioned Szeged Treebank was used for the training and the testing of the algorithms.
  • Keywords
    computational linguistics; grammars; learning (artificial intelligence); natural languages; Hungarian textual database; Szeged Treebank; automatic syntactic structure recognition; balanced Hungarian natural language database; logic learner; machine learning algorithm; natural language processing; numeric learner; pre-defined parsing problem; rule-based learner; syntactic tree pattern learning; Data mining; Informatics; Learning systems; Logic; Machine learning algorithms; Natural language processing; Natural languages; Spatial databases; Testing; Text recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
  • Print_ISBN
    0-7803-9361-9
  • Type

    conf

  • DOI
    10.1109/NLPKE.2005.1598739
  • Filename
    1598739