DocumentCode
3289316
Title
A hybrid BSO-Chi2-SVM approach to Arabic text categorization
Author
Belkebir, Riadh ; Guessoum, Abderrezak
Author_Institution
Comput. Sci. Dept., USTHB, El-Alia Bab-Ezzouar, Algeria
fYear
2013
fDate
27-30 May 2013
Firstpage
1
Lastpage
7
Abstract
Automatic document categorization has become an important task, especially given the rapid growth in the number of documents available online. It consists of assigning a category to a text based on the information the text contains. It can help solve several problems, such as identifying the language of a document, filtering and detecting spam (junk mail), and routing and forwarding emails to their recipients. In this paper, we present the results of Arabic text categorization based on three different approaches: artificial neural networks, support vector machines (SVMs), and a hybrid approach, BSO-CHI-SVM. We explain the approach and present the results of its implementation and evaluation using two types of representation: root-based stemming and light stemming. In each case, the evaluation was carried out on the Open Source Arabic Corpora (OSAC) using different performance measures.
Keywords
neural nets; support vector machines; text analysis; Arabic text categorization; OSAC; artificial neural networks; automatic document categorization; hybrid BSO-Chi2-SVM approach; light stemming; open source Arabic corpora; root-based stemming; spam detection; spam filtering; Accuracy; Particle swarm optimization; Text categorization; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Systems and Applications (AICCSA), 2013 ACS International Conference on
Conference_Location
Ifrane
ISSN
2161-5322
Type
conf
DOI
10.1109/AICCSA.2013.6616437
Filename
6616437