Title of article :
A comparison of text-classification techniques applied to Arabic text
Author/Authors :
Ghassan Kanaan, author
Riyad Al-Shalabi, author
Sameh Ghwanmeh, author
Hamda Al-Maʹadeed, author
Issue Information :
Monthly journal with serial issue number, year 2009
Abstract :
Many algorithms have been implemented for text classification, but most of this work has targeted English text, and very little research has addressed Arabic text. Arabic text differs in nature from English text, and preprocessing it is more challenging. This paper presents an implementation of three automatic text-classification techniques for Arabic text. A corpus of 1445 Arabic text documents belonging to nine categories was automatically classified using the kNN, Rocchio, and naïve Bayes algorithms. The results show that naïve Bayes was the best performer, followed by kNN and Rocchio.
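For readers unfamiliar with the three techniques named in the abstract, the following is a minimal illustrative sketch in Python, not the authors' implementation. Everything in it is an assumption made for illustration: the toy English training set, the category names, the function names, the use of raw term counts with cosine similarity, and the simplified centroid-only form of Rocchio. Arabic preprocessing, which the abstract singles out as the harder part, is not shown.

```python
import math
from collections import Counter, defaultdict

# Toy, made-up training set (NOT the paper's corpus): (tokens, category).
TRAIN = [
    ("match goal team player".split(), "sports"),
    ("team win league goal".split(), "sports"),
    ("bank market price trade".split(), "economy"),
    ("market stock bank profit".split(), "economy"),
]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn(tokens, k=3):
    """kNN: majority vote among the k most similar training documents."""
    q = Counter(tokens)
    ranked = sorted(TRAIN, key=lambda d: cosine(q, Counter(d[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def rocchio(tokens):
    """Simplified Rocchio: pick the category whose centroid is most similar.

    The centroid here is the summed term-count vector of the category's
    documents; negative examples are ignored (an assumption of this sketch).
    """
    centroids = defaultdict(Counter)
    for doc, label in TRAIN:
        centroids[label].update(doc)
    q = Counter(tokens)
    return max(centroids, key=lambda c: cosine(q, centroids[c]))


def naive_bayes(tokens):
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    vocab = {t for doc, _ in TRAIN for t in doc}
    term_counts = defaultdict(Counter)
    doc_counts = Counter()
    for doc, label in TRAIN:
        term_counts[label].update(doc)
        doc_counts[label] += 1

    def log_posterior(c):
        total = sum(term_counts[c].values())
        score = math.log(doc_counts[c] / len(TRAIN))  # log prior
        for t in tokens:
            score += math.log((term_counts[c][t] + 1) / (total + len(vocab)))
        return score

    return max(doc_counts, key=log_posterior)


if __name__ == "__main__":
    query = "goal team".split()
    for classify in (knn, rocchio, naive_bayes):
        print(classify.__name__, classify(query))
```

All three functions take a list of tokens and return a category label, so a comparison such as the one the paper reports would amount to running each over a held-out test set and scoring the predictions.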
Journal title :
Journal of the American Society for Information Science and Technology