• DocumentCode
    124209
  • Title
    DiscWord: Learning Discriminative Topics
  • Author
    Yu Jiang ; Xian Li ; Weiyi Meng
  • Author_Institution
    Dept. of Comput. Sci., Binghamton Univ., Binghamton, NY, USA
  • Volume
    2
  • fYear
    2014
  • fDate
    11-14 Aug. 2014
  • Firstpage
    63
  • Lastpage
    70
  • Abstract
    Topic modeling is a popular research topic and is widely used in text-mining applications. Many researchers have observed that the topics learned by the LDA model, each a multinomial distribution over the word vocabulary, are often not intuitive in terms of human recognition and communication. We observe that, for a given topic, its most frequent words are usually less important than words that are dedicated to it. In this paper, aiming at learning discriminative topics, we introduce a measure named word discriminability to capture a word's ability to identify different topics, and propose an iterative algorithm that trains and uses word discriminability information during the topic learning process. Experimental results show that applying our method to the LDA topic model significantly improves its document classification accuracy, that the learned topics are more discriminative, and that the top words of a topic are usually more representative.
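    The abstract does not give the paper's definition of word discriminability, so the sketch below is only an illustration of the general idea, not the authors' measure. It assumes a stand-in score: one minus the normalized entropy of the posterior p(topic | word), computed from per-topic word distributions. The function name `word_discriminability`, its parameters, and the toy topics are all hypothetical.

```python
import math


def word_discriminability(topic_word_probs, topic_priors=None):
    """Illustrative stand-in score, NOT the measure defined in the paper.

    topic_word_probs: list of dicts, one per topic, mapping word -> p(word | topic).
    topic_priors: optional list of p(topic); uniform if omitted (an assumption).
    Returns a dict word -> score in [0, 1]. A score of 1 means the word appears
    in a single topic only; 0 means it is spread evenly across all topics.
    """
    num_topics = len(topic_word_probs)
    if num_topics == 0:
        return {}
    if topic_priors is None:
        topic_priors = [1.0 / num_topics] * num_topics

    vocabulary = set().union(*topic_word_probs)
    scores = {}
    for word in vocabulary:
        # Unnormalized posterior: p(topic) * p(word | topic) for each topic.
        joint = [prior * dist.get(word, 0.0)
                 for prior, dist in zip(topic_priors, topic_word_probs)]
        total = sum(joint)
        if total == 0.0 or num_topics < 2:
            scores[word] = 0.0
            continue
        posterior = [value / total for value in joint]
        # Shannon entropy of p(topic | word), normalized by its maximum log(K).
        entropy = -sum(p * math.log(p) for p in posterior if p > 0.0)
        scores[word] = 1.0 - entropy / math.log(num_topics)
    return scores


# Toy example: "the" is frequent in both topics, "goal" and "vote" are dedicated.
toy_topics = [
    {"the": 0.5, "goal": 0.5},
    {"the": 0.5, "vote": 0.5},
]
toy_scores = word_discriminability(toy_topics)
```

    In this toy setting the shared word "the" scores near 0 while the dedicated words "goal" and "vote" score near 1, which mirrors the abstract's observation that frequent words can matter less than words dedicated to a topic. How the paper trains and applies its own measure inside the iterative algorithm is not specified in this record.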
  • Keywords
    iterative methods; learning (artificial intelligence); pattern classification; text analysis; word processing; DiscWord; LDA topic model; discriminative topic learning; document classification accuracy; iterative algorithm; latent Dirichlet allocation; topic learning process; word discriminability information; Accuracy; Computational modeling; Equations; Mathematical model; Measurement; Vectors; Vocabulary; discriminative topic; feature selection; topic model;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Warsaw
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2014.81
  • Filename
    6927608