Title :
Multi-word term indexing for Arabic document retrieval
Author :
Boulaknadel, Siham ; Daille, Beatrice ; Driss, Aboutajdine
Author_Institution :
LINA FRE CNRS 2729, Univ. de Nantes, Nantes
Abstract :
To improve the performance of information retrieval systems, it seems important to identify key phrases, which represent the semantic content of a text better than single-word terms do. In this paper, we adapt the standard method for multi-word term extraction to the Arabic language. We define the linguistic specifications and develop a term extraction tool. We apply the term extraction program to document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions, one computed over the corpus and one over the document, and show that multi-word term indexing is effective with both weightings, with gains of up to 5.8% in average precision.
Keywords :
indexing; information retrieval systems; natural language processing; Arabic document retrieval; Arabic language; information retrieval system; linguistic specifications; multiword term extraction; multiword term indexing; multiword term weighting functions; text semantic content; Content based retrieval; Data mining; Databases; Indexing; Information management; Information retrieval; Internet; Natural languages; Protocols; Text recognition;
Conference_Title :
Computers and Communications, 2008. ISCC 2008. IEEE Symposium on
Conference_Location :
Marrakech
Print_ISBN :
978-1-4244-2702-4
Electronic_ISBN :
1530-1346
DOI :
10.1109/ISCC.2008.4625661