Title :
A highly efficient distributed indexing system based on large cluster of commodity machines
Author :
Pole, Govind S. ; Potey, Madhuri A.
Author_Institution :
Dept. of Comput. Eng., D.Y. Patil Coll. of Eng., Pune, India
Abstract :
An information retrieval system that uses a centralized approach takes a long time to update its web index. A distributed indexing system operates on large and diverse datasets and takes less time to update the web index than the centralized approach. This paper presents a prototype of a highly efficient distributed indexing system, deployed on a cluster of commodity machines, that creates a large index using the functionality of Apache Lucene. Experimental results demonstrate the efficiency of the distributed indexing process. The distributed approach reduces the time needed to create and update the index, which in turn keeps the index content fresher.
Keywords :
Internet; indexing; information retrieval systems; Apache Lucene functionality; Web index update; centralized approach; commodity machine cluster; highly efficient distributed indexing system; index content; information retrieval system; large index creation; Computers; Educational institutions; Indexing; Search engines; Standards; Web pages; commodity computing; dataset; distributed indexers; lucene; parser; retrieval;
Conference_Title :
Wireless and Optical Communications Networks (WOCN), 2012 Ninth International Conference on
Conference_Location :
Indore
Print_ISBN :
978-1-4673-1988-1
DOI :
10.1109/WOCN.2012.6335562