• DocumentCode
    2665777
  • Title
    The methods of lemmatization of bound case markers in modern Tibetan
  • Author
    Di, Jiang; Caijun, Kang
  • Author_Institution
    Dept. of Comput. Linguistics, Chinese Acad. of Sci., Beijing, China
  • fYear
    2003
  • fDate
    26-29 Oct. 2003
  • Firstpage
    616
  • Lastpage
    621
  • Abstract
    This paper discusses approaches to identifying bound case markers in modern Tibetan. The aim is to differentiate bound case markers attached to preceding syllables from homographic endings that are part of the words themselves. The approaches are: (1) to build a table of words with (-r/-s) endings and match words from texts against it; (2) to judge the property of an ending form using information extracted from predicate verbs and their attributive table. The results of our experiment show that we still need (3) to further analyze the word-formation rules of nouns and adjectives, and to pay more attention to lexicalized examples and specific words. All of these processing techniques are called lemmatization in our project.
  • Keywords
    grammars; natural languages; text analysis; bound case markers; homographic endings; lemmatization method; lexicalized examples; modern Tibetan language; predicate verbs; word-formation; Automatic testing; Automation; Computational linguistics; Computer aided software engineering; Dictionaries; Gold; Instruments; Mood; Natural languages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
  • Conference_Location
    Beijing, China
  • Print_ISBN
    0-7803-7902-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2003.1275980
  • Filename
    1275980