DocumentCode
2665777
Title
The methods of lemmatization of bound case markers in modern Tibetan
Author
Di, Jiang ; Caijun, Kang
Author_Institution
Dept. of Comput. Linguistics, Chinese Acad. of Sci., Beijing, China
fYear
2003
fDate
26-29 Oct. 2003
Firstpage
616
Lastpage
621
Abstract
This paper discusses approaches to identifying bound case markers in modern Tibetan. The aim is to distinguish bound case markers attached to the preceding syllable from homographic endings that are part of the word itself. The approaches are (1) to build a table of words with (-r/-s) endings and match words from texts against it, and (2) to judge the property of the ending forms using information extracted from predicate verbs and their attributive table. The results of our experiment show, however, that we still need (3) to analyze further the word-formation rules of nouns and adjectives and to pay more attention to lexicalized examples and specific words. In our project, all of these processing techniques are called lemmatization.
Keywords
grammars; natural languages; text analysis; bound case markers; homographic endings; lemmatization method; lexicalized examples; modern Tibetan language; predicate verbs; word-formation; Automatic testing; Automation; Computational linguistics; Computer aided software engineering; Dictionaries; Gold; Instruments; Mood; Natural languages
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location
Beijing, China
Print_ISBN
0-7803-7902-0
Type
conf
DOI
10.1109/NLPKE.2003.1275980
Filename
1275980