• DocumentCode
    1805882
  • Title
    Discover web forums via user browsing behavior detection
  • Author
    Jiang, Jingtian ; Yu, Nenghai
  • Author_Institution
    Dept. of Electron. Eng. & Inf. Sci., Univ. of Sci. & Technol. of China, Hefei, China
  • Volume
    4
  • fYear
    2011
  • fDate
    24-26 Dec. 2011
  • Firstpage
    2390
  • Lastpage
    2395
  • Abstract
    Web forums are important services where users can request and exchange information with others. Recently, a growing body of research has focused on mining knowledge from web forums because of the richness of their information. In contrast, there has been little work on discovering web forums. However, automatic web forum discovery is crucial for large-scale applications, e.g. a forum search engine. In this paper, we study how to discover web forums automatically from browse logs. Although web forums differ in layout and style, they always have similar implicit navigation paths leading users from their entry pages to thread pages. These implicit navigation paths make user browsing behavior in web forums different from that in general sites. We therefore propose an efficient approach that discovers web forums from browse logs by detecting this specific user browsing behavior. We first build a browse map by clustering the browsed URLs, and then detect the browsing behavior from the browse map. Next, we adopt a few features to determine whether a site is a web forum. Experimental results on a large data set show that our approach is both effective and efficient.
  • Keywords
    Internet; Web sites; data mining; online front-ends; search engines; URL; Web forums; automatic Web forum discovery; browse behavior; browse log; browse map; knowledge mining; large-scale applications; navigation paths; search engine; thread pages; user browsing behavior detection; Clustering algorithms; Image edge detection; Indexes; Knowledge engineering; Layout; Navigation; Vectors; URL clustering; user browsing behavior; web forum; web forum discovery
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Network Technology (ICCSNT), 2011 International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-1-4577-1586-0
  • Type
    conf
  • DOI
    10.1109/ICCSNT.2011.6182453
  • Filename
    6182453