• DocumentCode
    3756537
  • Title
    BL-LDA: Bringing Bigram to Supervised Topic Model
  • Author
    Youngsun Park; Md. Hijbul Alam; Woo-Jong Ryu; Sangkeun Lee
  • Author_Institution
    Dept. of Comput. Sci. &
  • fYear
    2015
  • Firstpage
    83
  • Lastpage
    88
  • Abstract
    With the growing amount of data published on the Web, it is difficult to analyze its content in a short time. Topic modeling techniques can summarize textual data that covers several topics. Both labels (such as categories or tags) and word co-occurrence play a significant role in understanding textual data. However, many conventional topic modeling techniques are limited by the bag-of-words assumption, which ignores word order. In this paper, we develop a probabilistic model called Bigram Labeled Latent Dirichlet Allocation (BL-LDA) to address this limitation. The proposed BL-LDA incorporates bigrams into the Labeled LDA (L-LDA) technique. Extensive experiments on Yelp data show that the proposed scheme is more accurate than L-LDA.
  • Keywords
    "Data models","Mathematical model","Training data","Computational modeling","Analytical models","Probabilistic logic","Data mining"
  • Publisher
    IEEE
  • Conference_Title
    2015 International Conference on Computational Science and Computational Intelligence (CSCI)
  • Type
    conf
  • DOI
    10.1109/CSCI.2015.146
  • Filename
    7424068