• DocumentCode
    2351704
  • Title
    Describing Web Topics Meticulously through Word Graph Analysis
  • Author
    Sun, Bai; Shi, Lei; Kong, Liang; Zhang, Yan
  • Author_Institution
    Dept. of Machine Intell., Peking Univ., Beijing, China
  • Volume
    2
  • fYear
    2009
  • fDate
    11-14 Oct. 2009
  • Firstpage
    142
  • Lastpage
    147
  • Abstract
    Topic description is as important as topic detection. In this paper, we propose a novel method for describing Web topics with topic words. Assuming that representative words appear in important sentences and have a high probability of co-occurring with other representative words, we build two graphs: one representing the relationships among sentences, the other the relationships among words. Because a topic cluster contains a set of different Web pages, sentence clusters are also introduced. Experimental results on a real data set show that our method achieves excellent performance in terms of both precision and efficiency, especially when the real Web data contain a large amount of noise.
  • Keywords
    Internet; content management; data mining; graph theory; information retrieval; Web pages; Web topics; noise; sentence clusters; topic cluster; topic description; topic detection; topic words; word graph analysis; Broadcasting; Frequency; Information analysis; Information technology; Machine intelligence; Noise reduction; Sun
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology, 2009. CIT '09. Ninth IEEE International Conference on
  • Conference_Location
    Xiamen
  • Print_ISBN
    978-0-7695-3836-5
  • Type
    conf
  • DOI
    10.1109/CIT.2009.55
  • Filename
    5329146