Title :
Intelligent Web topics search using early detection and data analysis
Author :
Lee, Ching-Cheng ; Yang, Yixin
Author_Institution :
California State Univ., Hayward, CA, USA
Abstract :
Topic-specific search engines that offer users relevant topics as search results have recently been developed. However, these engines require intensive human effort to build and maintain, and they visit many irrelevant pages. In our project, we propose a new approach to Web topic search. First, we perform early detection of "candidate topics" while extracting words from the HTML text. Second, we perform data analysis on the appearance information of candidate topics, such as appearance times and places. With these two techniques, we can reduce candidate topics' crawling times and computing cost. An analysis of the results and comparisons with related research will be presented to demonstrate the effectiveness of our approach.
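The abstract gives no implementation details, so the following is only a minimal sketch of how the two techniques might fit together, not the authors' method. It assumes that "appearance times" means how often a word appears and that "appearance places" means the HTML element it appears in. The class name `CandidateTopicParser`, the `PLACE_WEIGHTS` table, the stopword list, and the `early_threshold` parameter are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Hypothetical weights per appearance place; these values are assumptions,
# not figures from the paper.
PLACE_WEIGHTS = {"title": 5.0, "h1": 3.0, "h2": 2.0, "a": 1.5, "body": 1.0}
STOPWORDS = {"the", "and", "for", "are", "with", "that", "this"}
WORD_RE = re.compile(r"[a-z]+")


class CandidateTopicParser(HTMLParser):
    """Records where and how often each word appears while parsing HTML."""

    def __init__(self, early_threshold=3.0):
        super().__init__()
        self.early_threshold = early_threshold
        self.stack = []      # open tags that count as appearance places
        self.skip_depth = 0  # > 0 while inside <script> or <style>
        self.appearances = defaultdict(lambda: defaultdict(int))
        self.candidates = []  # words flagged during extraction, in order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag in PLACE_WEIGHTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.stack:
            # Pop back to the matching open tag to tolerate unclosed tags.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.skip_depth:
            return
        place = self.stack[-1] if self.stack else "body"
        for word in WORD_RE.findall(data.lower()):
            if len(word) < 3 or word in STOPWORDS:
                continue
            self.appearances[word][place] += 1
            # Early detection: flag the word as soon as its running score
            # crosses the threshold, without waiting for the full page.
            if (word not in self.candidates
                    and self.score(word) >= self.early_threshold):
                self.candidates.append(word)

    def score(self, word):
        """Weighted sum of appearance counts over appearance places."""
        counts = self.appearances[word]
        return sum(PLACE_WEIGHTS[place] * n for place, n in counts.items())


parser = CandidateTopicParser(early_threshold=3.0)
parser.feed(
    "<html><head><title>Web crawling</title></head>"
    "<body><h1>Crawling topics</h1>"
    "<p>Crawling is fun. The topics vary.</p></body></html>"
)
print(parser.candidates)
```

In this sketch a word in a title or top-level heading is flagged on its first appearance, while a word seen only in body text must recur before it is flagged. A crawler built along these lines could stop scoring a page early once enough candidates are found, which is one plausible way to reduce crawling and computing cost.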
Keywords :
Web sites; data analysis; hypermedia markup languages; information retrieval systems; search engines; HTML; Web crawling; Web searching; World Wide Web; appearance information; appearance places; appearance times; data analysis; early detection; information access; intelligent systems; topic-specific search engines; word extraction; Costs; Crawlers; Data analysis; Databases; HTML; Humans; Information filtering; Information filters; Search engines; Web pages;
Conference_Title :
Computer Software and Applications Conference, 2003. COMPSAC 2003. Proceedings. 27th Annual International
Print_ISBN :
0-7695-2020-0
DOI :
10.1109/CMPSAC.2003.1245399