DocumentCode :
263929
Title :
On certain aspects of Kazakh part-of-speech tagging
Author :
Makazhanov, Aibek ; Yessenbayev, Zhandos ; Sabyrgaliyev, Islam ; Sharafudinov, Anuar ; Makhambetov, Olzhas
Author_Institution :
Nazarbayev Univ. Res. & Innovation Syst., Astana, Kazakhstan
fYear :
2014
fDate :
15-17 Oct. 2014
Firstpage :
1
Lastpage :
4
Abstract :
We compare and discuss various approaches to the problem of part-of-speech (POS) tagging of texts written in Kazakh, an agglutinative and highly inflectional Turkic language. In Kazakh, a single root may produce hundreds of word forms, and it is difficult, if at all possible, to label enough training data to account for the vast set of all possible word forms in the language. Thus, current state-of-the-art statistical POS taggers may not be as effective for Kazakh as for morphologically less complex languages, e.g. English. Also, the choice of a POS tag set may influence the informativeness and the accuracy of tagging.
Keywords :
natural language processing; text analysis; word processing; Kazakh part-of-speech tagging; Turkic language; text POS tagging; word form; Accuracy; Computational linguistics; Natural language processing; Speech; Tagging; Training; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Application of Information and Communication Technologies (AICT), 2014 IEEE 8th International Conference on
Conference_Location :
Astana
Print_ISBN :
978-1-4799-4120-9
Type :
conf
DOI :
10.1109/ICAICT.2014.7035953
Filename :
7035953