• DocumentCode
    708162
  • Title

    Scalable multi-label Arabic text classification

  • Author

    Ahmed, Nizar A. ; Shehab, Mohammed A. ; Al-Ayyoub, Mahmoud ; Hmeidi, Ismail

  • Author_Institution
    Jordan Univ. of Sci. & Technol., Irbid, Jordan
  • fYear
    2015
  • fDate
    7-9 April 2015
  • Firstpage
    212
  • Lastpage
    217
  • Abstract
    Multi-label text classification (MTC) is a natural extension of traditional text classification (TC) in which a possibly large set of labels can be assigned to each document. The dimensionality of the label space makes MTC challenging. Several approaches have been proposed to ease the classification process; one of them is the problem transformation (PT) method, which transforms multi-label data into single-label data suitable for conventional classification. Our paper presents a detailed study of using the supervised approach to address the MTC problem for Arabic text. The scalability of this approach is also considered in our experiments. The MEKA system is used to convert the multi-label data into single-label data using different PT methods: LC, BR, and RT. Then, different classifiers commonly used for TC, such as SVM, NB, KNN, and Decision Tree, are applied to each dataset. The results show that using SVM on the LC dataset gives the best results, with 71% ML-accuracy.
  • Keywords
    decision trees; pattern classification; support vector machines; text analysis; KNN classification; LC dataset; MEKA system; MTC; NB classification; PT method; SVM classification; data classification; decision tree classification; k-nearest neighbor classification; label dimensionality; multilabel Arabic text classification; naive Bayes classification; problem transformation method; Accuracy; Loss measurement; Niobium; Scalability; Training; Exact match; Hamming loss; MEKA; Multi-label classification; Problem transformation methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Systems (ICICS), 2015 6th International Conference on
  • Conference_Location
    Amman
  • Type

    conf

  • DOI
    10.1109/IACS.2015.7103229
  • Filename
    7103229
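
  • Illustration
    The abstract describes problem transformation only in words, so a minimal sketch of two of the named PT methods may help. It assumes that LC denotes label combination (each distinct label set becomes one class) and BR denotes binary relevance (one yes/no dataset per label), which is how MEKA appears to use these names. The toy documents, label names, and function names are hypothetical, and nothing here is the authors' code or the MEKA API. The two metrics match the "Hamming loss" and "Exact match" keywords.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy corpus: (text, set of labels). Not data from the paper.
Doc = Tuple[str, Set[str]]
docs: List[Doc] = [
    ("football and the economy", {"sports", "economy"}),
    ("football results", {"sports"}),
    ("elections and the economy", {"politics", "economy"}),
]


def label_combination(data: List[Doc]) -> List[Tuple[str, str]]:
    """LC (assumed meaning): treat each distinct label set as one atomic class."""
    return [(text, "+".join(sorted(labels))) for text, labels in data]


def binary_relevance(data: List[Doc]) -> Dict[str, List[Tuple[str, bool]]]:
    """BR (assumed meaning): build one binary single-label dataset per label."""
    all_labels = sorted(set().union(*(labels for _, labels in data)))
    return {
        label: [(text, label in labels) for text, labels in data]
        for label in all_labels
    }


def hamming_loss(true: List[Set[str]], pred: List[Set[str]], n_labels: int) -> float:
    """Fraction of (document, label) pairs that are predicted incorrectly."""
    wrong = sum(len(t ^ p) for t, p in zip(true, pred))  # symmetric difference
    return wrong / (len(true) * n_labels)


def exact_match(true: List[Set[str]], pred: List[Set[str]]) -> float:
    """Fraction of documents whose predicted label set is entirely correct."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)


if __name__ == "__main__":
    print(label_combination(docs))
    print(binary_relevance(docs)["economy"])
```

    After either transformation, any ordinary single-label classifier (such as the SVM, NB, KNN, or Decision Tree mentioned in the abstract) can be trained on the resulting dataset; LC yields one multi-class problem, while BR yields one binary problem per label.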