Title :
Research Distributed Search Engine Based on Hadoop
Author_Institution :
Suzhou Ind. Park Inst. of Services Outsourcing, Suzhou, China
Abstract :
In the internet age, Lucene-based processing of massive data runs into bottlenecks. To improve the timeliness of massive data retrieval, this paper studies a distributed search engine based on Hadoop. First, it introduces the Hadoop Distributed File System (HDFS), the MapReduce parallel programming model, and the HBase database. Then the index file is built with the MapReduce computation model and stored in HBase on the cluster. Finally, an experiment shows the advantages of the Hadoop-based distributed search engine.
Keywords :
"Distributed databases","Search engines","Indexes","File systems","Computational modeling","Data models","Google"
Conference_Title :
Network and Information Systems for Computers (ICNISC), 2015 International Conference on
DOI :
10.1109/ICNISC.2015.149