• DocumentCode
    3184698
  • Title

    Enhancing performance of Hadoop and MapReduce for scientific data using NoSQL database

  • Author

    Alshammari, Hamoud ; Bajwa, Hassan ; Jeongkyu Lee

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Bridgeport, Bridgeport, CT, USA
  • fYear
    2015
  • fDate
    1-1 May 2015
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Scientific data sets usually have similar jobs that are frequently applied to them by different users. In addition, many of these data sets are unstructured and complex, and require fast and simple processing. To increase the performance of the existing Hadoop and MapReduce algorithm, it is necessary to develop an algorithm based on the type of data sets and the requirements of the jobs. Genomic and biological data are an example of unstructured data because they consist only of a huge sequence of unreadable, non-relational letters. In this paper, we present an overview of a developed MapReduce algorithm and its simulation using HBase as a NoSQL database.
  • Keywords
    parallel algorithms; relational databases; HBase; Hadoop algorithm; MapReduce algorithm; NoSQL database; biological data; genomic data; Bioinformatics; Biological cells; Computer architecture; Computer science; DNA; Databases; Genomics; BigData; Hadoop; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Applications and Technology Conference (LISAT), 2015 IEEE Long Island
  • Conference_Location
    Farmingdale, NY
  • Type

    conf

  • DOI
    10.1109/LISAT.2015.7160180
  • Filename
    7160180