Title :
An array-CGH based analyzing tool for detecting unknown copy number variation
Author :
Jung, Kwang Su ; Choi, JongPill ; Park, Kiejung
Author_Institution :
Div. of Bio-Med. Inf., Korea Nat. Inst. of Health, Cheongwon, South Korea
Abstract :
It has long been assumed that most genes exist in two copies per genome. However, recent investigations have reported that large segments of DNA, ranging from thousands to millions of base pairs, can vary in copy number. Genes once thought to always be present in two copies per genome have now been found in one copy or in more than two copies, and sometimes genes are missing altogether. Copy number variations (CNVs) play important roles in both human disease and drug response because they often encompass genes. Understanding the whole process of CNV formation could help clarify human genome evolution. To address this, we have implemented a Java-based program named Conovar that discovers CNVs from array CGH data and analyzes them through a user-friendly interface. The Smith-Waterman Array algorithm is embedded in our system to identify copy number variants. The system summarizes statistics for a user-selected CGH region across samples. Conovar displays the CGH values of user-chosen samples so that differences in log ratio can be compared per sample. Conovar also provides a map view showing how CNV regions differ between samples. Because users want to verify whether the CNV regions they find have already been reported, the system automatically reports the known CNV regions recorded in the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation). Conovar must connect to a MySQL database to use DGV data, so users need to manage a MySQL database themselves. DGV provides its genomic variant contents as text files.
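The abstract does not describe how Conovar implements the Smith-Waterman Array step, so the following is only a minimal sketch of the general idea behind that family of methods: subtract a threshold from each log ratio, then use a Smith-Waterman-style running score (reset to zero whenever it would go negative) to find the contiguous run of probes with the highest total score. The class name `SwArraySketch`, the method `bestSegment`, and the threshold value are hypothetical and are not taken from Conovar. Significance testing and the search for further segments are omitted.

```java
// Hypothetical sketch, not Conovar's code: finds the single highest-scoring
// contiguous segment of probes after subtracting a threshold from each log ratio.
final class SwArraySketch {

    /**
     * Returns {startIndex, endIndexInclusive} of the best-scoring segment,
     * or null when no probe exceeds the threshold.
     * To look for copy number losses, pass the negated log ratios.
     */
    static int[] bestSegment(double[] logRatios, double threshold) {
        double running = 0.0;   // Smith-Waterman-style score, never below zero
        double best = 0.0;      // best score seen so far
        int runStart = 0;       // start of the current candidate segment
        int bestStart = -1;
        int bestEnd = -1;

        for (int i = 0; i < logRatios.length; i++) {
            double adjusted = logRatios[i] - threshold;
            running = Math.max(0.0, running + adjusted);
            if (running == 0.0) {
                // Score dropped to zero: the next candidate starts after probe i.
                runStart = i + 1;
            } else if (running > best) {
                best = running;
                bestStart = runStart;
                bestEnd = i;
            }
        }
        return bestStart < 0 ? null : new int[] {bestStart, bestEnd};
    }

    public static void main(String[] args) {
        // Illustrative log2 ratios only; these are not data from the paper.
        double[] ratios = {0.0, 0.1, 0.8, 0.9, 0.7, 0.0, -0.1};
        int[] seg = bestSegment(ratios, 0.3);
        System.out.println(seg == null ? "no segment" : seg[0] + ".." + seg[1]);
    }
}
```

In this sketch the threshold controls how strongly a probe must deviate before it contributes a positive score; a real analysis would choose it from the noise level of the array and would assess each segment statistically.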
Keywords :
DNA; Java; bioinformatics; biological techniques; dynamic programming; genetics; molecular biophysics; CGH values; CNV discovery; CNV formation process; Conovar; Database of Genomic Variants; MySQL database; Smith-Waterman array algorithm; array CGH based analyzing tool; copy number variant identification; copy number variation; gene copies; human genome evolution; java based program; large DNA segments; unknown CNV detection; user friendly interface; user selected CGH region statistics; Arrays; Bioinformatics; Conferences; Databases; Genomics; Humans; Informatics;
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
DOI :
10.1109/BIBMW.2011.6112560