• DocumentCode
    2290402
  • Title

    A comparison of morpheme and word based document retrieval for Asian languages

  • Author

    Nguyen, Van Be Hai ; Vines, Phil ; Wilkinson, Ross

  • Author_Institution
    Dept. of Comput. Sci., R. Melbourne Inst. of Technol., Vic., Australia
  • fYear
    1996
  • fDate
    9-10 Sep 1996
  • Firstpage
    291
  • Lastpage
    296
  • Abstract
    Most document retrieval systems are word based. Words are very convenient retrieval units in English but not so in some Asian languages. The task of determining which morphemes constitute words in Vietnamese and Chinese is problematic, and has been assumed to be the reason that word based retrieval does not work so well. The paper examines a number of segmentation algorithms, and then reports on some experiments comparing morpheme and word based retrieval. It shows that morpheme based retrieval is hard to improve on.
  • Keywords
    information retrieval system evaluation; natural languages; query processing; Asian languages; Chinese language; Vietnamese language; morpheme based document retrieval; segmentation algorithms; word based document retrieval; Computer science; Data mining; Frequency; Indexing; Information retrieval; Natural languages; Robustness
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database and Expert Systems Applications, 1996. Proceedings., Seventh International Workshop on
  • Conference_Location
    Zurich
  • Print_ISBN
    0-8186-7662-0
  • Type

    conf

  • DOI
    10.1109/DEXA.1996.558329
  • Filename
    558329