Title :
Selecting Oligonucleotide Probes for Whole-Genome Tiling Arrays with a Cross-Hybridization Potential
Author :
Hafemeister, Christoph ; Krause, Roland ; Schliep, Alexander
Author_Institution :
Dept. of Biol., New York Univ., New York, NY, USA
Abstract :
Popular current methods for designing oligonucleotide tiling arrays still rely on simple criteria such as Hamming distance or longest common factors, neglecting base stacking effects, which contribute strongly to binding energies. Consequently, probes are often prone to cross-hybridization, which reduces the signal-to-noise ratio and complicates downstream analysis. We propose the first computationally efficient method that uses hybridization energy to identify specific oligonucleotide probes. Our Cross-Hybridization Potential (CHP) is computed with a Nearest Neighbor Alignment, which efficiently estimates a lower bound for the Gibbs free energy of the duplex formed by two DNA sequences of bounded length. It is derived from our simplified reformulation of t-gap insertion-deletion-like metrics. The computations are accelerated by a filter that uses weighted ungapped q-grams to arrive at seeds. The computation of the CHP is implemented in our software OSProbes, available under the GPL, which computes sets of viable probe candidates. The user can choose a trade-off between running time and the quality of the probes selected. Compared with prior approaches, we obtain very favorable results with respect to specificity and sensitivity for cross-hybridization and to genome coverage with high-specificity probes. The combination of OSProbes and our Tileomatic method, which computes optimal tiling paths from candidate sets, yields globally optimal tiling arrays, balancing probe distance, hybridization conditions, and uniqueness of hybridization.
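The q-gram filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the OSProbes implementation: the function names, the per-base weights, and the `min_weight` threshold are all hypothetical, chosen only to show how shared ungapped q-grams between a probe and a target sequence can serve as seeds for a more expensive alignment. Reverse-complement handling is omitted for brevity.

```python
from collections import defaultdict


def qgram_index(target, q):
    """Map each length-q substring of target to its start positions."""
    index = defaultdict(list)
    for i in range(len(target) - q + 1):
        index[target[i:i + q]].append(i)
    return index


def find_seeds(probe, target, q, weights=None, min_weight=0.0):
    """Return (probe_pos, target_pos, weight) for every shared ungapped q-gram.

    The weights are an assumption for illustration only: G and C score
    higher than A and T as a crude stand-in for stronger binding.
    """
    if weights is None:
        weights = {"A": 1.0, "T": 1.0, "C": 1.5, "G": 1.5}  # hypothetical values
    index = qgram_index(target, q)
    seeds = []
    for i in range(len(probe) - q + 1):
        gram = probe[i:i + q]
        weight = sum(weights[base] for base in gram)
        if weight < min_weight:
            continue  # skip q-grams too weak to be worth extending
        for j in index.get(gram, ()):
            seeds.append((i, j, weight))
    return seeds


seeds = find_seeds("ACGTAC", "TTACGTTT", q=4)
print(seeds)
```

Only probe/target pairs that share at least one sufficiently heavy seed would then be passed to the energy-based alignment, which is how a filter of this kind trades running time against sensitivity.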
Keywords :
DNA; Hamming codes; bioinformatics; biological techniques; free energy; genomics; molecular biophysics; molecular configurations; DNA sequence; Gibbs free energy; Hamming distance; OSProbes software; base stacking effect; binding energy; cross hybridization potential; downstream analysis; hybridization condition; hybridization energy; hybridization uniqueness; nearest neighbor alignment; oligonucleotide probe; optimal tiling path; probe distance; t-gap insertion-deletion-like metrics; tileomatic method; weighted ungapped q-gram; whole genome tiling array; Artificial neural networks; DNA; Databases; Genomics; Nearest neighbor searches; Probes; Biology and genetics; DNA microarrays; cross hybridization; oligonucleotide probes; tiling arrays; Base Sequence; DNA; Gene Expression Profiling; Genome; Nucleic Acid Hybridization; Oligonucleotide Probes; Thermodynamics
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2011.39