  • DocumentCode
    1695673
  • Title

    Document categorizer agent based on ACM hierarchy

  • Author

    Chekima, K. ; Chin Kim On ; Alfred, Rayner ; Gan Kim Soon ; Anthony, Philip

  • Author_Institution
    Sch. of Eng. & Inf. Technol., Centre of Excellence in Semantic Agents, Univ. Malaysia Sabah, Kota Kinabalu, Malaysia
  • fYear
    2012
  • Firstpage
    386
  • Lastpage
    391
  • Abstract
    As the number of research papers increases, the need for an academic categorizer system becomes crucial. Such a system helps academicians organize their research papers into pre-defined categories based on the documents' content similarity. This paper presents the Document Categorizer Agent based on the ACM CCS (Association for Computing Machinery Computing Classification System). First, we studied the ACM category hierarchy. Next, based on these categories, we retrieved our corpus from the ACM DL (ACM Digital Library) to train our Categorizer Agent using a popular machine learning technique, the Naïve Bayes classifier. We used two types of training data for the corpus, namely negative training data and positive training data. Papers are then categorized according to their content based on the same training data. We tested our Document Categorizer Agent on a number of academic papers to measure its accuracy, and the results were promising.
  • Keywords
    Bayes methods; computational linguistics; content management; digital libraries; document handling; information retrieval; learning (artificial intelligence); ACM CCS; ACM DL; ACM digital library; ACM hierarchy; Naïve Bayes classifier; academic categorizer system; academic paper; association for computing machinery computing classification system; corpus retrieval; document categorizer agent; document content similarity; machine learning; Agent Technology; Document Categorizer Agent; Information Retrieval; Naïve Bayes Classifier;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control System, Computing and Engineering (ICCSCE), 2012 IEEE International Conference on
  • Conference_Location
    Penang
  • Print_ISBN
    978-1-4673-3142-5
  • Type

    conf

  • DOI
    10.1109/ICCSCE.2012.6487176
  • Filename
    6487176