Title :
Study and Comparison of Different Vocabulary Selection Methods: Application to Topic Detection of Arabic Documents
Author :
Mellouli, Amal ; Jamoussi, Salma
Author_Institution :
Miracle Lab., Inst. of Comput. Sci. & Multimedia, Sfax, Tunisia
Abstract :
In this paper we present a study and comparison of different vocabulary selection methods applied to topic detection in Arabic documents. Topic detection is performed with two methods: a neural network and support vector machines (SVM). We tested and compared the following vocabulary selection methods: word frequency per topic, entropy, Gini, and the SVM-based “fselect”.
Keywords :
document handling; natural language processing; neural nets; support vector machines; Arabic documents; neuronal network; topic detection; vocabulary selection methods; Artificial neural networks; Frequency measurement; Indexes; Mutual information; Training; Vocabulary; Arabic language; Multi layer Perceptron; SVM
Conference_Title :
Computational and Information Sciences (ICCIS), 2010 International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-8814-8
Electronic_ISBN :
978-0-7695-4270-6
DOI :
10.1109/ICCIS.2010.314