• DocumentCode
    1754315
  • Title

    Experiences with developing language processing tools and corpora for Amharic

  • Author

    Gambäck, Björn ; Asker, Lars

  • Author_Institution
    SICS, Swedish Inst. of Comput. Sci. AB, Kista, Sweden
  • fYear
    2010
  • fDate
    19-21 May 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    A major bottleneck for promoting use of computers and the Internet is that many languages lack access to basic tools that would make it possible for people to access ICT in their own language. The paper describes the development of a set of such resources for the processing of Amharic, the working language of the Ethiopian government. The primary goal was to investigate techniques and methods that can be used to efficiently create computational linguistic resources for new languages based on existing tools and resources. The resources created consist of linguistically annotated text collections and tools for word-level analysis of Amharic.
  • Keywords
    computational linguistics; natural language processing; text analysis; Amharic; Ethiopian government; Internet; computational linguistic; language processing tools development; linguistically annotated text collections; word-level analysis; Accuracy; Computational linguistics; Dictionaries; Internet; Tagging; Text categorization; Training; Amharic; Corpora; Part-of-Speech Tagging; Text Categorization
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    IST-Africa, 2010
  • Conference_Location
    Durban
  • Print_ISBN
    978-1-905824-15-1
  • Type

    conf

  • Filename
    5753057