• DocumentCode
    189199
  • Title
    Generating Cohesive Semantic Topics from Latent Factors
  • Author
    Viana Bicalho, Paulo ; De Oliveira Cunha, Tiago ; Jesus Mourao, Fernando Henrique ; Lobo Pappa, Gisele ; Meira, Wagner
  • Author_Institution
    Comput. Sci., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil
  • fYear
    2014
  • fDate
    18-22 Oct. 2014
  • Firstpage
    271
  • Lastpage
    276
  • Abstract
    Extracting topics from posts in social networks is a challenging and relevant computational task. Traditionally, topics are extracted by analyzing syntactic properties in the messages, assuming a high correlation between syntax and semantics. This work proposes SToC, a new method for generating more cohesive and meaningful semantic topics within a context. SToC post-processes the output of a Non-Negative Matrix Factorization (NMF) method in order to determine which latent factors should be further merged to improve cohesion. Based on NMF's output, SToC defines a topics transition graph and uses Markovian theory to merge pairs of topics mutually reachable in this graph. Experiments on two real data samples from Twitter demonstrate that SToC is statistically better than fair baselines in supervised scenarios and able to determine cohesive and semantically valid topics in unsupervised scenarios.
  • Keywords
    Markov processes; data mining; graph theory; matrix decomposition; social networking (online); text analysis; Markovian theory; SToC; Twitter; latent factor; nonnegative matrix factorization; semantic topic; social network; syntactic property; topics transition graph; Context; Entropy; Measurement; Merging; Nominations and elections; Observatories; Semantics; Latent Factors; Merging Topics; Semantic Topics; Social Networks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems (BRACIS), 2014 Brazilian Conference on
  • Conference_Location
    Sao Paulo
  • Type
    conf
  • DOI
    10.1109/BRACIS.2014.56
  • Filename
    6984842