DocumentCode
189199
Title
Generating Cohesive Semantic Topics from Latent Factors
Author
Viana Bicalho, Paulo ; De Oliveira Cunha, Tiago ; Jesus Mourao, Fernando Henrique ; Lobo Pappa, Gisele ; Meira, Wagner
Author_Institution
Comput. Sci., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil
fYear
2014
fDate
18-22 Oct. 2014
Firstpage
271
Lastpage
276
Abstract
Extracting topics from posts in social networks is a challenging and relevant computational task. Traditionally, topics are extracted by analyzing syntactic properties of the messages, assuming a high correlation between syntax and semantics. This work proposes SToC, a new method for generating more cohesive and meaningful semantic topics within a context. SToC post-processes the output of a Non-Negative Matrix Factorization (NMF) method to determine which latent factors should be merged to improve cohesion. Based on NMF's output, SToC defines a topics transition graph and uses Markovian theory to merge pairs of topics that are mutually reachable in this graph. Experiments on two real data samples from Twitter demonstrate that SToC is statistically better than fair baselines in supervised scenarios and is able to determine cohesive and semantically valid topics in unsupervised scenarios.
Keywords
Markov processes; data mining; graph theory; matrix decomposition; social networking (online); text analysis; Markovian theory; SToC; Twitter; latent factor; nonnegative matrix factorization; semantic topic; social network; syntactic property; topics transition graph; Context; Entropy; Measurement; Merging; Nominations and elections; Observatories; Semantics; Latent Factors; Merging Topics; Semantic Topics; Social Networks;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems (BRACIS), 2014 Brazilian Conference on
Conference_Location
Sao Paulo
Type
conf
DOI
10.1109/BRACIS.2014.56
Filename
6984842