• DocumentCode
    2292602
  • Title
    Intelligent Web topics search using early detection and data analysis
  • Author
    Lee, Ching-Cheng ; Yang, Yixin
  • Author_Institution
    California State Univ., Hayward, CA, USA
  • fYear
    2003
  • fDate
    3-6 Nov. 2003
  • Firstpage
    584
  • Lastpage
    589
  • Abstract
    Topic-specific search engines that offer users relevant topics as search results have recently been developed. However, these topic-specific search engines require intensive human efforts to build and maintain. In addition, they visit many irrelevant pages. In our project, we propose a new approach for Web topics search. First, we do early detection for "candidate topics" while extracting words from the HTML text. Secondly, we perform data analysis on the appearance information such as appearance times and places for candidate topics. By these two techniques, we can reduce candidate topics' crawling times and computing cost. Analysis of the results and the comparisons with related research will be made to demonstrate the effectiveness of our approach.
  • Keywords
    Web sites; data analysis; hypermedia markup languages; information retrieval systems; search engines; HTML; Web crawling; Web searching; World Wide Web; appearance information; appearance places; appearance times; data analysis; early detection; information access; intelligent systems; topic-specific search engines; word extraction; Costs; Crawlers; Data analysis; Databases; HTML; Humans; Information filtering; Information filters; Search engines; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Software and Applications Conference, 2003. COMPSAC 2003. Proceedings. 27th Annual International
  • ISSN
    0730-3157
  • Print_ISBN
    0-7695-2020-0
  • Type
    conf
  • DOI
    10.1109/CMPSAC.2003.1245399
  • Filename
    1245399