DocumentCode :
264879
Title :
Design of improved focused web crawler by analyzing semantic nature of URL and anchor text
Author :
Dahiwale, Prashant ; Raghuwanshi, M.M. ; Malik, Latesh
Author_Institution :
Dept. of CSE, GHRCE, Nagpur, India
fYear :
2014
fDate :
15-17 Dec. 2014
Firstpage :
1
Lastpage :
6
Abstract :
The world now runs largely on digital data, and the Web is the largest collection of that data. The Web grows around the clock, so the principal problem is searching this huge database for specific information. Deciding whether a web page is relevant to a search topic is a dilemma [1]. Many techniques exist for judging relevance, but when the users' perspective is the key issue guiding the search, semantic-based web crawlers are unsurpassed. A semantic-based web crawler maps relevance with the help of a lexical database: it uses the senses the lexical database provides to discover the relatedness between the search query and the web page being searched. A focused web crawler estimates the similarity of a web page to the search query without downloading that page, which saves the bandwidth the download would require. This paper proposes and discusses one such approach to implementing a semantic-based focused web crawler.
Keywords :
Web sites; database management systems; query processing; text analysis; URL; Web page; anchor text; database searching; focused Web crawler design; lexical database; search query; semantic based Web crawler; semantic nature analysis; Books; Crawlers; Databases; Engines; Semantics; Uniform resource locators; Web pages; Lexical Database; Metadata; Relevance; Searching; Semantic; Web Crawling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Industrial and Information Systems (ICIIS), 2014 9th International Conference on
Conference_Location :
Gwalior
Print_ISBN :
978-1-4799-6499-4
Type :
conf
DOI :
10.1109/ICIINFS.2014.7036556
Filename :
7036556