DocumentCode
3142360
Title
Unsupervised induction of dependency structures using Probabilistic Bilexical Grammars
Author
Dominguez, Martín Ariel ; Infante-Lopez, Gabriel
Author_Institution
Grupo de Procesamiento de Lenguaje Natural, Univ. Nac. de Cordoba, Córdoba, Argentina
fYear
2011
fDate
27-29 Nov. 2011
Firstpage
314
Lastpage
318
Abstract
Unsupervised parsing induction has attracted significant attention over the last few years. However, current systems exhibit a degree of complexity that can scare away newcomers to the field. We challenge the need for such complexity and present a straightforward weak-EM-based system. The results we obtain are close to the state of the art, while still making it extremely simple to experiment with different sub-components. We use a k-best parser, an inductor for Probabilistic Bilexical Grammars (PBGs) [1], and a simple treebank builder. Since our algorithm is independent of the PBG inductor, it overlaps with other models from the literature, such as the Dependency Model with Valence [2]. Our algorithms are fully fleshed out and easily reproducible. We experiment with 8 languages, and the results inform intuitions about training-size-dependent parameterization.
Keywords
grammars; learning (artificial intelligence); natural language processing; dependency model; probabilistic bilexical grammars; treebank builder; unsupervised dependency structures induction; unsupervised parsing induction; valence; weak EM based system; Automata; Grammar; Inductors; Learning automata; Materials; Probabilistic logic; Training; Unsupervised dependency parsing; bilexical grammar; grammar induction; weak-EM;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
Conference_Location
Tokushima
Print_ISBN
978-1-61284-729-0
Type
conf
DOI
10.1109/NLPKE.2011.6138216
Filename
6138216