Title :
Relevant document crawling with usage pattern and domain profile based page ranking
Author :
Gupta, Arpan ; Dixit, Abhishek ; Sharma, Arvind Kumar
Author_Institution :
Comput. Eng. Dept., YMCA Univ. of Sci. & Technol., Faridabad, India
Abstract :
The WWW is a distributed, heterogeneous information resource. With its exponential growth, it has become difficult to access desired information that matches user needs and interests. In spite of strong crawling, indexing, and page ranking techniques, the result sets returned by search engines lack accuracy and precision. A large number of irrelevant links, topic drift, and load on servers are some of the other issues that need to be addressed in developing an efficient search engine. This paper proposes a crawling technique that attempts to reduce server load by using migrants to download only the relevant pages pertaining to a specific topic. The downloaded documents are then ranked by considering user preferences and past usage patterns of the web page, thereby improving the quality of the returned result sets.
Keywords :
Internet; Web sites; information retrieval; search engines; WWW; Web page; distributed heterogeneous information resource; document crawling; domain profile; indexing techniques; page ranking; returned result-set quality; search engine; usage pattern; user preferences; Computers; Crawlers; Search engines; Servers; Uniform resource locators; Web pages; World Wide Web; Crawler; Domain Profile; Indexer; Page Ranking; Quality; Usage Pattern;
Conference_Title :
Information Systems and Computer Networks (ISCON), 2013 International Conference on
Conference_Location :
Mathura
Print_ISBN :
978-1-4673-5987-0
DOI :
10.1109/ICISCON.2013.6524186