DocumentCode :
1791683
Title :
Pairwise Topic Model via relation extraction
Author :
Xiaoli Song ; Yue Shang ; Yuan Ling ; Mengwen Liu ; Xiaohua Hu
Author_Institution :
Coll. of Comput. & Inf., Drexel Univ., Philadelphia, PA, USA
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
96
Lastpage :
103
Abstract :
Topic modeling is a powerful tool for uncovering the underlying topics of documents. However, the unstructured nature of raw text makes it hard to model the semantic relationships between text units (words, phrases, or sentences), and thus even harder to model their corresponding underlying topics. In this work, we examine the pairwise relationships between underlying topics through relation extraction. We first extract the entity pairs within each relation tuple from the raw text. We then model the relationship between the entities in each pair by adding dependencies between the entities and their corresponding topics. We propose six versions of the Pairwise Topic Model (PTM) that simultaneously discover the latent topics and their pairwise relationships. Experiments on four data sets (AP news articles, DUC 2004 Task 2, clinical notes, and neuroscience papers) show that the PTMs are better-structured language models than the traditional topic model, Latent Dirichlet Allocation (LDA). Empirical results also show that the proposed PTMs can explicitly explain how two topics are related.
Keywords :
text analysis; LDA; PTM; documents modeling; entity pairs extraction; latent Dirichlet allocation; latent topics; pairwise relationship; pairwise topic model; phrases; raw text relation tuple; relation extraction; semantic relationship; sentences; structured language model; text units; words; Data mining; Data models; Data structures; Educational institutions; Hidden Markov models; Joints; Syntactics; Pairwise Topic Modeling; Relation Extraction; Structured Data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004362
Filename :
7004362