Author :
Swaminathan, Ashwin ; Mathew, Cherian V. ; Kirovski, Darko
Abstract :
Results to Web search queries are ranked using heuristics that typically analyze the global link topology, user behavior, and content relevance. We point to a particular inefficiency of such methods: information redundancy. In queries where learning about a subject is an objective, modern search engines return relatively unsatisfactory results as they consider the query coverage by each page individually, not a set of pages as a whole. We address this problem using essential pages. If we denote as $mathbb{S}_Q$ the total knowledge that exists on the Web about a given query $Q$, we want to build a search engine that returns a set of essential pages $E_Q$ that maximizes the information covered over $mathbb{S}_Q$. We present a preliminary prototype that optimizes the selection of essential pages; we draw some informal comparisons with respect to existing search engines; and finally, we evaluate our prototype using a blind-test user study.
Keywords :
Web page ranking; Web search; coverage; learning queries; redundancy elimination;
Conference_Titel :
Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Milan, Italy
Print_ISBN :
978-0-7695-3801-3
Electronic_ISBN :
978-1-4244-5331-3
DOI :
10.1109/WI-IAT.2009.33