DocumentCode
264879
Title
Design of improved focused web crawler by analyzing semantic nature of URL and anchor text
Author
Dahiwale, Prashant ; Raghuwanshi, M.M. ; Malik, Latesh
Author_Institution
Dept. of CSE, GHRCE, Nagpur, India
fYear
2014
fDate
15-17 Dec. 2014
Firstpage
1
Lastpage
6
Abstract
The world now runs largely on digital data, and the Web is the largest collection of that data. The Web grows around the clock, so the principal problem is searching this huge database for specific information. Deciding whether a web page is relevant to a search topic is a dilemma [1]. Many techniques exist for judging relevance, but when the users' perspective is taken as the key issue guiding the search, semantic-based web crawlers are unsurpassed. A semantic-based web crawler maps relevance with the help of a lexical database: the crawler uses the senses provided by the lexical database to discover the relatedness between the search query and the web page being searched. A focused web crawler helps to find the similarity of a web page to the search query without downloading that page, and thus saves the bandwidth required to download it. This paper proposes and discusses one such approach to implementing a semantic-based focused web crawler.
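The idea in the abstract, scoring a link from its URL and anchor text before the page is downloaded, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the small `TOY_LEXICON` table is a hypothetical stand-in for a real lexical database, and the names `tokenize`, `expand_query`, and `link_relevance`, as well as the scoring rule (fraction of query terms with a related term in the link), are assumptions made for illustration only.

```python
import re

# Hypothetical stand-in for a lexical database: each topic term maps to
# the set of terms treated as sharing a sense with it (assumed data).
TOY_LEXICON = {
    "car": {"car", "auto", "automobile", "vehicle"},
    "rental": {"rental", "rent", "hire", "lease"},
}


def tokenize(text):
    """Lower-case the text and split it on anything that is not a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def expand_query(query, lexicon=TOY_LEXICON):
    """Map each query term to the set of terms counted as a match for it."""
    return {term: lexicon.get(term, {term}) for term in tokenize(query)}


def link_relevance(query, url, anchor_text, lexicon=TOY_LEXICON):
    """Fraction of query terms with a related term in the URL or anchor text.

    Only the link itself is inspected; the target page is never fetched.
    """
    expanded = expand_query(query, lexicon)
    if not expanded:
        return 0.0
    link_tokens = set(tokenize(url)) | set(tokenize(anchor_text))
    matched = sum(1 for related in expanded.values() if related & link_tokens)
    return matched / len(expanded)
```

A crawler built along these lines could compare the score against a threshold and fetch only the links that pass, which is where the bandwidth saving would come from.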
Keywords
Web sites; database management systems; query processing; text analysis; URL; Web page; anchor text; database searching; focused Web crawler design; lexical database; search query; semantic based Web crawler; semantic nature analysis; Books; Crawlers; Databases; Engines; Semantics; Uniform resource locators; Web pages; Lexical Database; Metadata; Relevance; Searching; Semantic; Web Crawling;
fLanguage
English
Publisher
ieee
Conference_Titel
Industrial and Information Systems (ICIIS), 2014 9th International Conference on
Conference_Location
Gwalior
Print_ISBN
978-1-4799-6499-4
Type
conf
DOI
10.1109/ICIINFS.2014.7036556
Filename
7036556