Title :
Empirical study on crawler visibility of PDF documents in digital libraries
Author :
Weideman, Melius
Author_Institution :
Dept. of Res. Dev., Cape Peninsula Univ. of Technol., Cape Town, South Africa
Abstract :
Digital library users might not enter a digital library through homepage menus. As a result, digital library owners should consider the visibility to search engines of stored PDF documents. The aim of this research project was to determine to what extent the visibility of these PDF documents can be improved. In a series of empirical experiments, 100 PDF documents stored in digital libraries were identified and inspected. Searches were done for them and their rankings on search engine result pages were recorded. The current visibility of these documents was then calculated. After submission to Google, a waiting period was allowed for crawler visitation and the searches were repeated. The results of these experiments showed that the visibility of these documents could be improved only marginally. It is therefore concluded that the designers of university digital libraries should consider alternatives, such as providing text extracts of PDF documents, to enhance the overall visibility of content.
Keywords :
digital libraries; search engines; PDF document; crawler visibility; digital library; university library; Crawlers; Internet; Libraries; Noise measurement; Robustness; Runtime; search engine crawler
Conference_Title :
Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-5537-9
DOI :
10.1109/ICCSIT.2010.5563944