• DocumentCode
    226870
  • Title

    Distributed index mechanism based on Hadoop

  • Author

    Qin Liu ; Ni Zhang ; Xiaowen Yang ; Hongming Zhu

  • Author_Institution
    Sch. of Software Eng., Tongji Univ., Shanghai, China
  • fYear
    2014
  • fDate
    24-26 Sept. 2014
  • Firstpage
    274
  • Lastpage
    278
  • Abstract
    In recent years, MapReduce has attracted much attention. However, MapReduce has a weakness: it requires an entire block scan because it cannot precisely locate the query result. Some existing research has built indexes on Hadoop, but some of these approaches handle only full-text search and cannot support datasets with a defined schema. There is not yet a general distributed index system for unstructured data, optimized from MapReduce, that can handle multi-schema datasets and support queries well both with and without an index. In this paper, we propose a distributed index mechanism and set it up on MapReduce, where it can reduce query time and the number of map tasks in some contexts. Moreover, this distributed index mechanism supports multi-schema datasets, scales well, and is customizable. In our experiments, the distributed index mechanism saves up to 30% of query time and 90% of map tasks in some contexts compared with the original MapReduce framework, and the advantage grows as the dataset expands.
  • Keywords
    data handling; database indexing; distributed databases; parallel programming; query processing; text analysis; Hadoop; MapReduce framework; block scan; customizability; distributed index mechanism; full-text search; general distributed unstructured data index system; map task number reduction; multischema dataset handling; query performance; query time reduction; scalability; Bandwidth; Context; Indexing; Particle separators; Query processing; Servers; MapReduce; hadoop; index; schema;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications and Information Technologies (ISCIT), 2014 14th International Symposium on
  • Conference_Location
    Incheon
  • Type

    conf

  • DOI
    10.1109/ISCIT.2014.7011915
  • Filename
    7011915