DocumentCode :
3039629
Title :
Clustering of web search results using Suffix tree algorithm and avoidance of repetition of same images in search results using L-Point Comparison algorithm
Author :
Suneetha, Manne ; Fatima, S. Sameen ; Pervez, Shaik Mohd Zaheer
Author_Institution :
Dept. of Inf. Technol., Velagapudi Ramakrishna Siddhartha Eng. Coll., Vijayawada, India
fYear :
2011
fDate :
23-24 March 2011
Firstpage :
1041
Lastpage :
1046
Abstract :
Users of existing search engines such as Google, Yahoo, MSN, and Ask commonly find that a query returns a long ranked list of results (snippets). It becomes cumbersome for the user to go through each title, snippet, and sometimes even the link of each search result until results relevant to the query are found. Clustering of search results is a data mining technique that organizes the retrieved results into meaningful groups, lightening the user's work. This paper deals with the generalized suffix tree based clustering approach. The most repeated phrase in the document tags is taken as the cluster name. In short, web search results fetched from the prevailing web search engines are grouped under phrases that contain one or more search keywords. The paper aims to organize web search results into clusters that let the user browse quickly and reach the relevant results precisely. Suffix tree clustering produces comparatively more accurate and informative grouped results. A basic problem in image search on any search engine is image repetition. This can be avoided by using the L-Point Comparison algorithm, a technique developed specifically for information retrieval systems, which is also discussed with a practical example.
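The shared-phrase grouping summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain dictionary of word n-grams stands in for a real generalized suffix tree, and the function name `shared_phrase_clusters`, the parameters `max_len` and `min_docs`, and the sample snippets are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def shared_phrase_clusters(snippets, max_len=3, min_docs=2):
    """Group snippet indices under word phrases shared by several snippets.

    Assumption: an n-gram index replaces the generalized suffix tree; it
    finds the same shared phrases only up to max_len words.
    """
    phrase_docs = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        words = text.lower().split()
        # Record every phrase of 1..max_len consecutive words in this snippet.
        for start in range(len(words)):
            for length in range(1, max_len + 1):
                if start + length > len(words):
                    break
                phrase = " ".join(words[start:start + length])
                phrase_docs[phrase].add(doc_id)
    # Keep only phrases that occur in at least min_docs different snippets;
    # each such phrase acts as a candidate cluster name.
    return {
        phrase: sorted(docs)
        for phrase, docs in phrase_docs.items()
        if len(docs) >= min_docs
    }


# Hypothetical snippets, for illustration only.
snippets = [
    "suffix tree clustering of search results",
    "clustering web search results",
    "image retrieval with pixel comparison",
]
clusters = shared_phrase_clusters(snippets)
```

In this sketch each shared phrase names one candidate cluster, and a snippet may fall under several phrases. Merging overlapping clusters and ranking the phrases are left out.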
Keywords :
Internet; content-based retrieval; data mining; image retrieval; pattern clustering; search engines; tree data structures; trees (mathematics); Ask; Google; L-point comparison algorithm; MSN; Web search result clustering; Yahoo; cluster name; data mining; document tags; generalized suffix tree based clustering approach; image repetition avoidance; image searching; information retrieval system; query return; quick browsing option; search engines; suffix tree algorithm; Clustering algorithms; Data mining; Engines; Pixel; Search engines; Shape; Web search; Cleaning of Document; Coherent clustering; L-point image Comparison (LPC); Shared phrase; Suffix Tree Based Clustering (STBC);
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Emerging Trends in Electrical and Computer Technology (ICETECT), 2011 International Conference on
Conference_Location :
Tamil Nadu
Print_ISBN :
978-1-4244-7923-8
Type :
conf
DOI :
10.1109/ICETECT.2011.5760272
Filename :
5760272