DocumentCode :
2665794
Title :
Synther - a new m-gram POS tagger
Author :
Sündermann, David ; Ney, Hermann
Author_Institution :
Comput. Sci. Dept., Univ. of Technol., Aachen, Germany
fYear :
2003
fDate :
26-29 Oct. 2003
Firstpage :
622
Lastpage :
627
Abstract :
The part-of-speech (POS) tagger synther, which is based on m-gram statistics, is described. After its basic architecture is explained, three smoothing approaches and the strategy for handling unknown words are presented. Subsequently, synther's performance is evaluated in comparison with four state-of-the-art POS taggers. All of them are trained and tested on three corpora of different languages and domains. In this evaluation, synther achieved either the lowest error rate or at least a below-average error rate. Finally, it is shown that the linear interpolation smoothing strategy with coverage-dependent weights has better properties than the two other approaches.
Keywords :
interpolation; natural languages; speech synthesis; statistical analysis; coverage-dependent weights; linear interpolation smoothing strategy; m-gram statistics; synther m-gram part-of-speech tagger; Computer science; Error analysis; Frequency estimation; History; Interpolation; Smoothing methods; Statistics; Tagging; Testing; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7803-7902-0
Type :
conf
DOI :
10.1109/NLPKE.2003.1275981
Filename :
1275981