DocumentCode :
2843229
Title :
Multi-word term indexing for Arabic document retrieval
Author :
Boulaknadel, Siham ; Daille, Beatrice ; Aboutajdine, Driss
Author_Institution :
LINA FRE CNRS 2729, Univ. de Nantes, Nantes
fYear :
2008
fDate :
6-9 July 2008
Firstpage :
869
Lastpage :
873
Abstract :
To improve the performance of information retrieval systems, it seems important to identify key phrases, which represent the semantic content of a text better than single-word terms do. In this paper, we adapt the standard method for multi-word term extraction to the Arabic language. We define the linguistic specifications and develop a term extraction tool. We apply the term extraction program to document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions that consider either the corpus or the document, and demonstrate the effectiveness of multi-word term indexing for both weighting functions, with improvements of up to 5.8% in average precision.
Keywords :
indexing; information retrieval systems; natural language processing; Arabic document retrieval; Arabic language; information retrieval system; linguistic specifications; multiword term extraction; multiword term indexing; multiword term weighting functions; text semantic content; Content based retrieval; Data mining; Databases; Indexing; Information management; Information retrieval; Internet; Natural languages; Protocols; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computers and Communications, 2008. ISCC 2008. IEEE Symposium on
Conference_Location :
Marrakech
ISSN :
1530-1346
Print_ISBN :
978-1-4244-2702-4
Electronic_ISBN :
1530-1346
Type :
conf
DOI :
10.1109/ISCC.2008.4625661
Filename :
4625661