Title : 
Weirdness Coefficient as a Feature Selection Method for Arabic Special Domain Text Classification
         
        
            Author : 
Al-Thubaity, Abdulmohsen ; Alanazi, Ayidh ; Hazzaa, I. ; Al-Tuwaijri, Haya
         
        
            Author_Institution : 
Comput. Res. Inst., King Abdulaziz City for Sci. & Technol., Riyadh, Saudi Arabia
         
        
            Abstract : 
Given the importance of organizing and managing the rapid growth of Arabic electronic content, this study introduces the Weirdness Coefficient (W) as a new feature selection method for Arabic special domain text classification. The proposed method was used to classify a dataset comprising five Islamic topics using Naïve Bayes (NB) and K-nearest neighbor (K-NN) classifiers and three representation schemes. The results were compared with those of a well-known feature selection method, Chi-squared. In addition to being simple to compute, the Weirdness Coefficient showed promising classification accuracy.
         
        
            Keywords : 
pattern classification; text analysis; Arabic electronic content; Arabic special domain text classification; Islamic topics; K-NN; K-nearest neighbor classifiers; NB; Naïve Bayes classifiers; feature selection method; weirdness coefficient; Accuracy; Classification algorithms; Computers; Educational institutions; Electronic mail; Text categorization; Arabic text classification; feature selection
         
        
            Conference_Title : 
Asian Language Processing (IALP), 2012 International Conference on
         
        
            Conference_Location : 
Hanoi
         
        
            Print_ISBN : 
978-1-4673-6113-2
         
        
            Electronic_ISBN : 
978-0-7695-4886-9
         
        
        
            DOI : 
10.1109/IALP.2012.64