DocumentCode :
3627986
Title :
Universal models with memory for genomic sequence analysis
Author :
Ioan Tabus; Yinghua Yang; Jaakko Astola
Author_Institution :
Department of Signal Processing, Tampere University of Technology, Finland
fYear :
2008
Firstpage :
1211
Lastpage :
1217
Abstract :
In this paper we discuss the use of universal models for solving several genomic sequence analysis problems. A number of typical genomic problems, e.g., approximate matching, segmentation, and clustering, can be phrased as specific modeling problems involving discrete variables, for which discrete regression models need to be estimated based on rather short data segments. Universal models are known to possess appealing optimality properties, not only asymptotically but also for short samples. We briefly review universal models with memory, which have recently been shown to perform well for the compression of full genomes. Two new applications of universal models with memory for genomic sequence analysis are shown here: the first is the segmentation of DNA sequences for uncovering gene duplications, and the second is haplotype segmentation.
Keywords :
"Genomics","Bioinformatics","Sequences","Statistics","Signal analysis","DNA","Maximum likelihood estimation","Signal processing","Encoding","Minimax techniques"
Publisher :
IEEE
Conference_Titel :
Communications, Control and Signal Processing, 2008. ISCCSP 2008. 3rd International Symposium on
Print_ISBN :
978-1-4244-1687-5
Type :
conf
DOI :
10.1109/ISCCSP.2008.4537410
Filename :
4537410