• DocumentCode
    2397558
  • Title

    Research and application of distributed parallel search hadoop algorithm

  • Author

    AiLing Duan

  • Author_Institution
    Sch. of Inf. Sci. & Eng., Henan Univ. of Technol., Zhengzhou, China
  • fYear
    2012
  • fDate
    19-20 May 2012
  • Firstpage
    2462
  • Lastpage
    2465
  • Abstract
    Hadoop is an open-source distributed parallel computing platform composed mainly of the MapReduce algorithm and a distributed file system. This paper introduces Hadoop and its related technologies, and discusses in detail the idea and basic framework of the MapReduce algorithm, together with a parallelization method for the massive data involved in Internet search and the feasibility of that method. The paper also puts forward an idea and strategy for using MapReduce to process web page inverted indexes in parallel.
  • Keywords
    Web services; file organisation; information retrieval; parallel algorithms; public domain software; search problems; Hadoop; Internet search; MapReduce algorithm; Web page inverted index; distributed file system; distributed parallel algorithm; open source computing; parallel processing; Distributed databases; Educational institutions; File systems; Indexes; Internet; Parallel processing; Servers; Hadoop; MapReduce algorithm; inverted index; parallel computing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems and Informatics (ICSAI), 2012 International Conference on
  • Conference_Location
    Yantai
  • Print_ISBN
    978-1-4673-0198-5
  • Type

    conf

  • DOI
    10.1109/ICSAI.2012.6223552
  • Filename
    6223552
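
The abstract describes using MapReduce to build a web page inverted index in parallel. The record does not include the paper's actual implementation, so the following is only a minimal single-process sketch of the general map / shuffle / reduce pattern for an inverted index. The function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`), the tokenization rule, and the sample pages are all illustrative assumptions, not the author's code and not the Hadoop API.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct term in a page.

    Tokenization (lowercase alphanumeric runs) is an assumption made
    for this sketch.
    """
    for term in set(re.findall(r"[a-z0-9]+", text.lower())):
        yield term, doc_id


def shuffle(pairs):
    """Shuffle step: group the emitted doc_ids by term.

    In Hadoop the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Reduce step: turn the grouped doc_ids into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(pages):
    """Run map, shuffle, and reduce sequentially over a dict of pages."""
    pairs = (
        pair
        for doc_id, text in pages.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(
        reduce_phase(term, doc_ids)
        for term, doc_ids in shuffle(pairs).items()
    )


if __name__ == "__main__":
    # Hypothetical sample pages, for illustration only.
    pages = {
        "p1": "Hadoop runs MapReduce",
        "p2": "MapReduce builds an index",
    }
    for term, postings in sorted(build_inverted_index(pages).items()):
        print(term, postings)
```

On a real cluster the map calls would run on separate nodes over file splits and the grouping would be done by the framework; the sketch keeps everything in one process to show only the data flow.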