• DocumentCode
    1791683
  • Title
    Pairwise Topic Model via relation extraction
  • Author
    Xiaoli Song ; Yue Shang ; Yuan Ling ; Mengwen Liu ; Xiaohua Hu
  • Author_Institution
    Coll. of Comput. & Inf., Drexel Univ., Philadelphia, PA, USA
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    96
  • Lastpage
    103
  • Abstract
    Topic modeling is a powerful tool for discovering the underlying topics of documents. However, the unstructured nature of raw text makes it hard to model the semantic relationships between text units, which may be words, phrases, or sentences, and thus even harder to model their corresponding underlying topics. In this work, we examine the pairwise relationships of the underlying topics through relation extraction. We first extract the entity pairs within each relation tuple from the raw text. Then, we model the relationship between the entity pairs by adding dependencies between the entities and their corresponding topics. We propose six different versions of the Pairwise Topic Model (PTM) to simultaneously discover the latent topics and their pairwise relationships. Experiments on four data sets (AP news articles, DUC 2004 Task 2, Clinical Notes, and Neuroscience Papers) show that the PTMs are better-structured language models than the traditional topic model, Latent Dirichlet Allocation (LDA). Empirical results also show that the proposed PTMs can explicitly explain how two topics are related.
  • Keywords
    text analysis; LDA; PTM; documents modeling; entity pairs extraction; latent Dirichlet allocation; latent topics; pairwise relationship; pairwise topic model; phrases; raw text relation tuple; relation extraction; semantic relationship; sentences; structured language model; text units; words; Data mining; Data models; Data structures; Educational institutions; Hidden Markov models; Joints; Syntactics; Pairwise Topic Modeling; Relation Extraction; Structured Data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2014 IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Type
    conf
  • DOI
    10.1109/BigData.2014.7004362
  • Filename
    7004362