Title :
The methods of lemmatization of bound case markers in modern Tibetan
Author :
Di, Jiang ; Caijun, Kang
Author_Institution :
Dept. of Comput. Linguistics, Chinese Acad. of Sci., Beijing, China
Abstract :
This paper discusses approaches to identifying bound case markers in modern Tibetan. The aim is to differentiate bound case markers attached to preceding syllables from homographic endings that are part of the word itself. The approaches are: (1) build a table of words with (-r/-s) endings and match words from texts against it; (2) judge the status of an ending form using information extracted from predicate verbs and their attributive table. The results of our experiment show that we still need (3) to further analyze the word-formation rules of nouns and adjectives and to pay closer attention to lexicalized examples and specific words. In our project, these processing techniques are collectively called lemmatization.
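Step (1) of the abstract, the table lookup, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the table name `LEXICAL_ENDINGS`, the function `lemmatize`, the use of Wylie transliteration, and the four table entries are all hypothetical choices made for the example.

```python
from typing import Optional, Tuple

# Hypothetical toy table of words whose final -r/-s belongs to the word itself.
# Entries are in Wylie transliteration and are illustrative only; the paper's
# actual table is not reproduced here.
LEXICAL_ENDINGS = {"gser", "dkar", "chos", "nags"}


def lemmatize(token: str) -> Tuple[str, Optional[str]]:
    """Return (lemma, marker) for one token.

    If the token is listed in the table, its ending is treated as part of
    the word and no marker is split off. Otherwise a final -r or -s is
    treated as a bound case marker and separated from the stem.
    """
    if token in LEXICAL_ENDINGS:
        return token, None
    if len(token) > 1 and token[-1] in ("r", "s"):
        return token[:-1], token[-1]
    return token, None


if __name__ == "__main__":
    for tok in ["ngar", "khos", "gser", "nga"]:
        print(tok, lemmatize(tok))
```

A lookup like this splits any unlisted token that happens to end in -r or -s, so its accuracy depends entirely on table coverage. That limitation is consistent with the abstract's point that verb information (step 2) and word-formation rules (step 3) are still needed.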
Keywords :
grammars; natural languages; text analysis; bound case markers; homographic endings; lemmatization method; lexicalized examples; modern Tibetan language; predicate verbs; word-formation; Automatic testing; Automation; Computational linguistics; Computer aided software engineering; Dictionaries; Gold; Instruments; Mood; Natural languages
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7803-7902-0
DOI :
10.1109/NLPKE.2003.1275980