• DocumentCode
    3580865
  • Title

    Experiments on keyword list generation by term distribution clustering for text classification

  • Author

    Fonda, Wilson ; Purwarianti, Ayu

  • Author_Institution
    Sch. of Electr. Eng. & Inf., Inst. Teknol. Bandung, Bandung, Indonesia
  • fYear
    2014
  • Firstpage
    297
  • Lastpage
    301
  • Abstract
    Text classification is a useful task in text mining. Most researchers employ a single type of word weight in text classification. Here, we propose building a keyword list by combining several word weights for rule-based multi-label text classification. We conducted experiments on term distribution clustering to produce the best automatically generated keyword list, comparing several term weights: TFxIDF, mutual information (MI), information gain (IG), and document frequency (DF). As a case study, we implemented authority classification in a complaint management system using the generated keyword list. Experiments on 245 Twitter data items, using a keyword list generated from 2325 Twitter data items, showed that the best accuracy was achieved by using all term weights in the term distribution clustering rather than only one.
  • Keywords
    pattern classification; pattern clustering; social networking (online); text analysis; DF term weight; IG term weight; MI term weight; TFxIDF term weight; Twitter data; authority classification; automatic keyword list generation; complaint management system; rule based multilabel text classification; term distribution clustering; text mining; word weights; Conferences; Decision support systems; Electrical engineering; Frequency measurement; Informatics; Training data; Transportation; DF; IG; MI; TFxIDF; clustering; keyword list; term distribution; text classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Computer Science and Information Systems (ICACSIS), 2014 International Conference on
  • Type

    conf

  • DOI
    10.1109/ICACSIS.2014.7065879
  • Filename
    7065879