Title :
Study on Web Content Acquirement Technology Based on Conception Topic
Author :
Geng, Zengmin ; Li, Xuefei ; Sun, Xiaodong
Author_Institution :
Comput. Inf. Center, Beijing Inst. of Fashion Technol., Beijing, China
Abstract :
To improve the recall and precision of keyword-based search engines, this paper studies web page acquisition technology based on conception topic. It first analyses the shortcomings of the HITS algorithm and then develops a new algorithm based on conception topic. The new algorithm modifies HITS by removing link farms when extending the root set, overcomes the weaknesses of methods based on text analysis or link analysis, improves the search engine's ability to judge users' interests, and makes the crawled web pages better match users' requirements. Experiments in the fashion field show that the new algorithm performs better than Google.
Keywords :
Web sites; search engines; HITS algorithm; Web content acquirement technology; Web page crawling; Web pages; conception topic; link farms; search engine; Algorithm design and analysis; Google; Presses; Textiles; Web mining
Conference_Title :
Internet Technology and Applications (iTAP), 2011 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-7253-6
DOI :
10.1109/ITAP.2011.6006352
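Illustrative sketch :
The abstract describes modifying HITS by removing link farms when the root set is extended, but it does not say how link farms are detected. The sketch below is therefore not the authors' algorithm. It shows standard HITS hub and authority iteration, preceded by a stand-in filter that drops links between pages on the same host, a common heuristic used here purely as an assumption. The function names (`host`, `drop_same_host_links`, `hits`) and the example URLs are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from urllib.parse import urlparse


def host(url):
    """Return the lower-cased host part of a URL."""
    return urlparse(url).netloc.lower()


def drop_same_host_links(edges):
    """Assumed link-farm filter: keep only links that cross hosts.

    This is a placeholder heuristic, not the detection rule from the paper.
    """
    return [(src, dst) for src, dst in edges if host(src) != host(dst)]


def hits(edges, iterations=50):
    """Standard HITS: return (hub, authority) score dicts for a link graph."""
    nodes = sorted({page for edge in edges for page in edge})
    incoming = defaultdict(list)
    outgoing = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(src)
        outgoing[src].append(dst)

    hub = {page: 1.0 for page in nodes}
    auth = {page: 1.0 for page in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        auth = {page: sum(hub[s] for s in incoming[page]) for page in nodes}
        norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {page: v / norm for page, v in auth.items()}
        # Hub score of a page: sum of authority scores of the pages it links to.
        hub = {page: sum(auth[d] for d in outgoing[page]) for page in nodes}
        norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {page: v / norm for page, v in hub.items()}
    return hub, auth


# Hypothetical extended root set: a densely interlinked single-host cluster
# plus two independent pages that link to one fashion page.
EXAMPLE_EDGES = [
    ("http://farm.example/a", "http://farm.example/b"),
    ("http://farm.example/a", "http://farm.example/c"),
    ("http://farm.example/b", "http://farm.example/a"),
    ("http://farm.example/b", "http://farm.example/c"),
    ("http://farm.example/c", "http://farm.example/a"),
    ("http://blog.example/post", "http://shop.example/coats"),
    ("http://news.example/story", "http://shop.example/coats"),
]

if __name__ == "__main__":
    _, raw_auth = hits(EXAMPLE_EDGES)
    _, clean_auth = hits(drop_same_host_links(EXAMPLE_EDGES))
    print("top authority, unfiltered:", max(raw_auth, key=raw_auth.get))
    print("top authority, filtered:  ", max(clean_auth, key=clean_auth.get))
```

On this toy graph the interlinked cluster dominates the authority scores when HITS runs on the raw links, and the page endorsed by two independent hosts ranks first once same-host links are dropped. A real system would need a stronger farm detector, since farms commonly span many hosts.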