DocumentCode
2735428
Title
Search engines evaluation using precision and document-overlap measurements at 10-50 cutoff points
Author
Ismail, Amirah ; Sembok, Tengku Mohd T. ; Zaman, Halimah Badioze
Author_Institution
Fac. of Technol. & Inf. Sci., Univ. Kebangsaan Malaysia, Malaysia
Volume
3
fYear
2000
fDate
2000
Firstpage
90
Abstract
The Internet has become a huge store of distributed documents. A user of the Internet at times seeks information he does not yet have in order to solve a problem. He therefore has to express his information need as a request, in one form or another, using a search engine. The search engine then tries to infer and retrieve relevant documents and presents the results in a hit list. However, which documents in the hit list are actually relevant can only be determined by the user. The quality of hit lists varies depending on the effectiveness of the indexing process, which generates the surrogates from the original documents. Usually, the quality of a hit list can be measured by precision, i.e. the ratio of the number of retrieved documents that are relevant to the total number of retrieved documents. This measure has been used to evaluate ten major search engines using ten queries at cutoff points of 10, 20, 30, 40 and 50. We have also introduced an overlap measure to determine the commonality of documents between the hit lists of the various search engines. With these two measures we can evaluate the performance of the search engines. The search engines chosen for study are Altavista, Hotbot, Excite, Lycos, Webcrawler, Infoseek, Magellan, Northernlight, SavvySearch and Metacrawler.
Keywords
information retrieval system evaluation; search engines; software performance evaluation; Altavista; Excite; Hotbot; Infoseek; Internet; Lycos; Magellan; Metacrawler; Northernlight; SavvySearch; Webcrawler; cutoff points; distributed document; document-overlap; indexing process; overlap measure; precision; precision measure; retrieved documents; search engine; search engine evaluation; Indexing; Information retrieval; Information science; Search engines
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON 2000. Proceedings
Conference_Location
Kuala Lumpur
Print_ISBN
0-7803-6355-8
Type
conf
DOI
10.1109/TENCON.2000.892230
Filename
892230