• DocumentCode
    2843229
  • Title

    Multi-word term indexing for Arabic document retrieval

  • Author

    Boulaknadel, Siham ; Daille, Beatrice ; Aboutajdine, Driss

  • Author_Institution
    LINA FRE CNRS 2729, Univ. de Nantes, Nantes
  • fYear
    2008
  • fDate
    6-9 July 2008
  • Firstpage
    869
  • Lastpage
    873
  • Abstract
    To improve the performance of information retrieval systems, it seems important to identify key phrases, which represent the semantic content of a text better than single-word terms. In this paper, we adapt the standard method for multi-word term extraction to the Arabic language. We define the linguistic specifications and develop a term extraction tool. We apply the term extraction program to document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions, based on either the corpus or the document, and show that multi-word term indexing is effective with both weightings, improving average precision by up to 5.8%.
  • Keywords
    indexing; information retrieval systems; natural language processing; Arabic document retrieval; Arabic language; information retrieval system; linguistic specifications; multiword term extraction; multiword term indexing; multiword term weighting functions; text semantic content; Content based retrieval; Data mining; Databases; Indexing; Information management; Information retrieval; Internet; Natural languages; Protocols; Text recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computers and Communications, 2008. ISCC 2008. IEEE Symposium on
  • Conference_Location
    Marrakech
  • ISSN
    1530-1346
  • Print_ISBN
    978-1-4244-2702-4
  • Electronic_ISBN
    1530-1346
  • Type

    conf

  • DOI
    10.1109/ISCC.2008.4625661
  • Filename
    4625661