DocumentCode :
2777232
Title :
Towards a DNA sequencing theory (learning a string)
Author :
Li, Ming
Author_Institution :
Waterloo Univ., Ont., Canada
fYear :
1990
fDate :
22-24 Oct 1990
Firstpage :
125
Abstract :
Mathematical frameworks suitable for massive automated DNA sequencing and for analyzing DNA sequencing algorithms are studied under plausible assumptions. The DNA sequencing problem is modeled as learning a superstring from its randomly drawn substrings. Under certain restrictions, this may be viewed as learning a superstring in L.G. Valiant's (1984) learning model, and in this case the author gives an efficient algorithm for learning a superstring and a quantitative bound on how many samples suffice. A major obstacle to the approach turns out to be a well-known open question on how to approximate the shortest common superstring of a set of strings. The author presents the first provably good algorithm that approximates the shortest superstring of length n by a superstring of length O(n log n).
Keywords :
DNA; biology computing; learning systems; merging; search problems; DNA sequencing; efficient algorithm; randomly drawn substrings; samples; shortest common superstring; superstring learning; Approximation algorithms; Bioinformatics; Genomics; Humans; Laboratories; Machine learning; Machine learning algorithms; Postal services; Sequences
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Foundations of Computer Science, 1990. Proceedings., 31st Annual Symposium on
Conference_Location :
St. Louis, MO
Print_ISBN :
0-8186-2082-X
Type :
conf
DOI :
10.1109/FSCS.1990.89531
Filename :
89531