DocumentCode :
2665777
Title :
The methods of lemmatization of bound case markers in modern Tibetan
Author :
Di, Jiang ; Caijun, Kang
Author_Institution :
Dept. of Comput. Linguistics, Chinese Acad. of Sci., Beijing, China
fYear :
2003
fDate :
26-29 Oct. 2003
Firstpage :
616
Lastpage :
621
Abstract :
This paper discusses approaches to identifying bound case markers in modern Tibetan. The aim is to differentiate bound case markers attached to the preceding syllable from homographic endings that are part of the word itself. Two methods are used: (1) build a table of words with (-r/-s) endings and match words from texts against it; (2) judge the property of an ending form using information extracted from predicate verbs and their attributive table. The experimental results nevertheless show that we still need (3) to analyze further the word-formation rules of nouns and adjectives, and to pay more attention to lexicalized examples and specific words. All of these processing techniques are called lemmatization in our project.
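Step (1) of the abstract, matching text words against a table of words whose -r/-s ending is lexical, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_bound_marker`, the toy lexicon, and the use of Wylie-style romanized tokens are all assumptions made for illustration, and step (2), which uses predicate-verb information, is not covered.

```python
# Toy table of words whose final -r/-s is part of the stem.
# The entries are illustrative assumptions, not the paper's actual table.
LEXICAL_ENDINGS = {"chos", "lus", "gser", "dar"}


def split_bound_marker(token, lexicon=LEXICAL_ENDINGS):
    """Return (stem, marker); marker is None when nothing is split off."""
    # A token listed in the table keeps its ending: it is a homographic
    # ending that belongs to the word, not a case marker.
    if token in lexicon:
        return token, None
    # Otherwise treat a final -r or -s as a candidate bound case marker.
    if len(token) > 1 and token[-1] in ("r", "s"):
        return token[:-1], token[-1]
    return token, None


for tok in ["chos", "ngas", "ngar", "lus"]:
    print(tok, split_bound_marker(tok))
```

A table lookup of this kind misclassifies any word with a lexical -r/-s ending that is missing from the table, which is consistent with the abstract's remark that word-formation rules and lexicalized examples need further analysis.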
Keywords :
grammars; natural languages; text analysis; bound case markers; homographic endings; lemmatization method; lexicalized examples; modern Tibetan language; predicate verbs; word-formation; Automatic testing; Automation; Computational linguistics; Computer aided software engineering; Dictionaries; Gold; Instruments; Mood; Natural languages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7803-7902-0
Type :
conf
DOI :
10.1109/NLPKE.2003.1275980
Filename :
1275980