DocumentCode :
3628499
Title :
Investigating language independence in HMM PoS/MSD-tagging
Author :
Zeljko Agic;Marko Tadic;Zdravko Dovedan
Author_Institution :
Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, Ivana Lučića 3, HR-10000 Croatia
fYear :
2008
fDate :
6/1/2008 12:00:00 AM
Firstpage :
657
Lastpage :
662
Abstract :
The paper presents an investigation of functional dependencies in morphosyntactic tagging using hidden Markov models. Starting from the well-known fact that the HMM tagging paradigm relies on lexical knowledge acquired from training corpora and stored in the form of transition and emission matrices, also called a language model, we apply the TnT trigram tagger to create language models for seven languages from the MULTEXT-East version 3 project translations of George Orwell's novel 1984: Czech, Estonian, Hungarian, Romanian, Serbian, Slovene and the original English version. We then use these language models in the tagging procedure and obtain details on various relations between training corpora statistics, training outputs and outputs of the tagging procedure.
Keywords :
"Accuracy","Training","Hidden Markov models","Tagging","Testing","Read only memory","Runtime"
Publisher :
IEEE
Conference_Titel :
Information Technology Interfaces, 2008. ITI 2008. 30th International Conference on
ISSN :
1330-1012
Print_ISBN :
978-953-7138-12-7
Type :
conf
DOI :
10.1109/ITI.2008.4588489
Filename :
4588489