Title :
Detection of gene copy number change in array CGH data
Author :
Hu, Jing ; Gao, Jianbo ; Cao, Yinhe ; Zhang, Weijia
Author_Institution :
Dept. of Electr. & Comput. Eng., Florida Univ., Gainesville, FL
Abstract :
Developing effective methods for analyzing array-CGH data to detect chromosomal aberrations is very important for diagnosing cancer and other diseases and for understanding their pathogenesis. Current analysis methods, being largely based on smoothing and/or segmentation, are not capable of detecting both the aberration regions and the boundary break points very accurately. This is undesirable, since each point in the array represents a gene. Furthermore, when evaluating the accuracy of an algorithm for analyzing array-CGH data, it is commonly assumed that noise in the data follows a normal distribution. A fundamental question is whether noise in array-CGH data is indeed Gaussian and, if not, whether one can exploit the characteristics of the noise to develop novel analysis methods that accurately detect the aberration regions and the boundary break points simultaneously. By analyzing bacterial artificial chromosome (BAC) arrays, oligonucleotide arrays, and high-density NimbleGen data, we show that when there are aberrations, noise in all three types of arrays is highly non-Gaussian and possesses long-range spatial correlations, and that such noise makes existing methods for detecting aberrations in array-CGH data perform worse than they do under Gaussian noise. We further develop a novel method that optimally exploits the characteristics of the noise and is capable of identifying both the aberration regions and the boundary break points very accurately.
Keywords :
bio-optics; cancer; cellular biophysics; genetics; medical signal processing; microorganisms; molecular biophysics; noise; patient diagnosis; array CGH data; bacterial artificial chromosomes arrays; boundary break points; cancer diagnosis; chromosomal aberrations; gene copy number change; high density NimbleGen data; oligonucleotide arrays; segmentation method; smoothing method; Algorithm design and analysis; Biological cells; Cancer detection; Data analysis; Diseases; Gaussian distribution; Gaussian noise; Pathogens; Performance analysis; Smoothing methods;
Conference_Title :
Life Science Systems and Applications Workshop, 2006. IEEE/NLM
Conference_Location :
Bethesda, MD
Print_ISBN :
1-4244-0277-8
Electronic_ISBN :
1-4244-0278-6
DOI :
10.1109/LSSA.2006.250402