DocumentCode
869827
Title
Search engine coverage of the OAI-PMH corpus
Author
McCown, Frank ; Liu, Xindong ; Nelson, Michael L. ; Zubair, Mohammad ; Liu, Xiaoming
Author_Institution
Dept. of Comput. Sci., Old Dominion Univ., Norfolk, VA, USA
Volume
10
Issue
2
fYear
2006
Firstpage
66
Lastpage
73
Abstract
Having indexed much of the "surface" Web, search engines are now using various approaches to index the "deep" Web. At the same time, institutional repositories and digital libraries are adopting the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose their holdings. The authors harvested nearly 10 million records from OAI-PMH repositories. From these records, they extracted 3.3 million unique resource URLs and then conducted searches on samples from this collection to determine how much of the OAI-PMH corpus the three major search engines have indexed.
Keywords
digital libraries; meta data; search engines; OAI-PMH corpus; digital library; institutional repository; open archives initiative protocol for metadata harvesting; search engine; Crawlers; Data models; Investments; Protection; Protocols; Robots; Software libraries; Uniform resource locators; Writing; OAI PMH; deep web; indexing
fLanguage
English
Journal_Title
Internet Computing, IEEE
Publisher
IEEE
ISSN
1089-7801
Type
jour
DOI
10.1109/MIC.2006.41
Filename
1607990