• DocumentCode
    2009086
  • Title

    Richness evaluation of blogs on its topics using a generative model and probabilistic analysis

  • Author

    Jinhee Park ; Jaedong Lee ; Hye-Wuk Jung ; Jee-Hyong Lee

  • Author_Institution
    Dept. of Electr. Comput. Eng., Sungkyunkwan Univ., Suwon, South Korea
  • fYear
    2012
  • fDate
    20-24 Nov. 2012
  • Firstpage
    381
  • Lastpage
    385
  • Abstract
    Blogs are now one of the important web services for publishing and sharing information. Accordingly, evaluating keywords in blogs is an important research topic for effective and efficient classification and retrieval of blogs in the blogosphere. In this paper, we propose a method to identify the important keywords in a blog. To identify such keywords, we consider web context, assuming that blog documents are generated from web contexts by the proposed generative model. If the contexts of a keyword on the web are well reflected in the blog, we regard the keyword as essential, because the blog is rich on that keyword. We clustered the blog articles on a given keyword into several subtopics using LDA (Latent Dirichlet Allocation) and compared the clusters with web context documents obtained by web search. Finally, we evaluated the richness of the blog on each keyword.
  • Keywords
    Web services; Web sites; information retrieval; pattern classification; pattern clustering; probability; LDA; Web context; Web service; blog article clustering; blog classification; blog retrieval; blogosphere; generative model; latent Dirichlet analysis; probabilistic analysis; richness evaluation; Data Mining; Information Retrieval; Semantic Web; Text Mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing and Intelligent Systems (SCIS) and 13th International Symposium on Advanced Intelligent Systems (ISIS), 2012 Joint 6th International Conference on
  • Conference_Location
    Kobe
  • Print_ISBN
    978-1-4673-2742-8
  • Type

    conf

  • DOI
    10.1109/SCIS-ISIS.2012.6505393
  • Filename
    6505393