DocumentCode :
3105228
Title :
Web indexing using HTML priority system
Author :
Sagar, Yashwant
Author_Institution :
Dept. of Inf. Technol., SRM Univ., Kattankulathur, India
fYear :
2015
fDate :
25-27 Feb. 2015
Firstpage :
581
Lastpage :
584
Abstract :
The unstructured nature and sheer size of the World Wide Web make it challenging to index. This paper discusses how the web can be indexed incrementally: inverted indices and a distributed hash table organize the data efficiently while the index is built incrementally through the search mechanism itself, and an HTML Priority System ranks pages to improve precision and recall. It also discusses challenges that a content-based ranking system must address to counter spam.
Keywords :
Internet; hypermedia markup languages; indexing; HTML priority system; Web indexing; World Wide Web; content-based ranking system; distributed hash table; inverted indices; spam; Crawlers; HTML; Indexing; Search engines; Uniform resource locators; Unsolicited electronic mail; Distributed Hash Tables; HTML Priority System; Inverted Index; Search Engine; Web Indexing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8432-9
Type :
conf
DOI :
10.1109/ABLAZE.2015.7154929
Filename :
7154929