• DocumentCode
    707330
  • Title

    Efficient data layouts for cost-optimized Map-Reduce operations

  • Author

    Kaur, Narinder ; Taruna, S.

  • Author_Institution
    BVICAM, New Delhi, India
  • fYear
    2015
  • fDate
    11-13 March 2015
  • Firstpage
    600
  • Lastpage
    604
  • Abstract
    The MapReduce programming model, adopted by Hadoop and other Big Data technologies, is a powerful tool for addressing Big Data analysis problems. It is becoming ubiquitous, but concerns remain about its performance and efficiency. It offers high scalability and fault tolerance in large-scale data processing, but at low efficiency. Hence, enhancing efficiency while retaining a high level of scalability and fault tolerance is a major challenge. The efficiency problem, especially I/O cost, can be addressed in two ways: by optimizing I/O operations in MapReduce, and by exploiting features of modern hardware such as SSD (Solid State Disk), which can considerably reduce computation in MapReduce. This paper explores various existing data layout structures that can improve the efficiency of MapReduce operations and help overcome its pitfalls.
  • Keywords
    Big Data; data handling; parallel programming; Big Data technologies; Hadoop; I/O costs; I/O operation optimization; MapReduce programming model; SSD; computation minimization; cost-optimized MapReduce operations; data layout structures; efficiency enhancement; high-level fault tolerance; high-level scalability; large-scale data processing; solid state disk; Data models; Data processing; Fault tolerance; Fault tolerant systems; Indexes; Layout; Trojan horses; Column Oriented Storage; Cost-Optimization; Data Layout; Index; Map-Reduce;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing for Sustainable Global Development (INDIACom), 2015 2nd International Conference on
  • Conference_Location
    New Delhi
  • Print_ISBN
    978-9-3805-4415-1
  • Type

    conf

  • Filename
    7100320