Title :
An enhanced framework of genomics using big data computing
Author :
Antim Jaiswal;Arvind Upadhyay
Author_Institution :
Department of Computer Science & Engineering, IES, IPS Academy, Indore, India
Abstract :
Genomics is the study of the complete genetic material (genome) of organisms. Since the advent of complete genome sequencing, vast amounts of nucleotide and amino acid sequence data have been produced. These data need to be analyzed and verified effectively because they may be used for biological discovery. The future of the life sciences will depend on our ability to properly interpret large-scale, high-dimensional data sets. The data deluge problem (the sudden exponential growth of genomic data) has made many current sequence analysis tools obsolete because they do not scale with the volume of data. Genomic sequence analysis is designed here using Pig (a Hadoop component), which builds on the MapReduce and cloud computing approach and provides proper data analysis and management for genomes. Pig is a flexible data scripting language that uses Hadoop's data structures and the MapReduce framework to process very large data files in a parallel and distributed manner. Genomics in Pig gives efficient sequence alignment and mapping. By implementing BLAST (Basic Local Alignment Search Tool) on Pig, the system presents a novel framework for sequence alignment and mapping that takes less time and memory in execution.
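As a rough illustration of how sequence matching can be phrased in MapReduce terms, the sketch below finds exact k-mer "seed" matches between a query and a reference, which resembles the first stage of a BLAST-style search. It is a single-process Python sketch under stated assumptions, not the authors' Pig implementation: the function names (`map_kmers`, `shuffle`, `reduce_seeds`, `find_seeds`), the seed length, and the sample sequences are all hypothetical, and the extension and scoring stages of BLAST are omitted.

```python
from collections import defaultdict

K = 4  # seed length; an illustrative value, not taken from the paper


def map_kmers(role, seq_id, sequence, k=K):
    """Map step: emit (k-mer, (role, seq_id, offset)) for every window."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k], (role, seq_id, i)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_seeds(kmer, values):
    """Reduce step: pair each query hit with each reference hit for one k-mer."""
    queries = [(sid, off) for role, sid, off in values if role == "query"]
    refs = [(sid, off) for role, sid, off in values if role == "ref"]
    for q in queries:
        for r in refs:
            yield kmer, q, r


def find_seeds(records, k=K):
    """Run map, shuffle, and reduce in sequence over (role, id, sequence) records."""
    pairs = [p for role, sid, seq in records for p in map_kmers(role, sid, seq, k)]
    hits = []
    for kmer, values in sorted(shuffle(pairs).items()):
        hits.extend(reduce_seeds(kmer, values))
    return hits


# Hypothetical toy input: one reference and one query sequence.
seeds = find_seeds([("ref", "ref1", "ACGTACGGA"), ("query", "query1", "TACGG")])
for seed in seeds:
    print(seed)
```

In a Hadoop or Pig setting, the map and reduce steps would run on separate nodes over partitions of the data, and the grouping done here by `shuffle` would be handled by the framework.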
Keywords :
"Bioinformatics","Genomics","Big data","Sequential analysis","Cloud computing","Algorithm design and analysis"
Conference_Title :
Computer, Communication and Control (IC4), 2015 International Conference on
DOI :
10.1109/IC4.2015.7375662