DocumentCode
2777232
Title
Towards a DNA sequencing theory (learning a string)
Author
Li, Ming
Author_Institution
Waterloo Univ., Ont., Canada
fYear
1990
fDate
22-24 Oct 1990
Firstpage
125
Abstract
Mathematical frameworks suitable for massive automated DNA sequencing and for analyzing DNA sequencing algorithms are studied under plausible assumptions. The DNA sequencing problem is modeled as learning a superstring from its randomly drawn substrings. Under certain restrictions, this may be viewed as learning a superstring in L.G. Valiant's (1984) learning model, and in this case the author gives an efficient algorithm for learning a superstring and a quantitative bound on how many samples suffice. A major obstacle to the approach turns out to be a quite well-known open question on how to approximate the shortest common superstring of a set of strings. The author presents the first provably good algorithm that approximates the shortest superstring of length n by a superstring of length O(n log n).
Keywords
DNA; biology computing; learning systems; merging; search problems; DNA sequencing; efficient algorithm; randomly drawn substrings; samples; shortest common superstring; superstring learning; Approximation algorithms; Bioinformatics; Genomics; Humans; Laboratories; Machine learning; Machine learning algorithms; Postal services; Sequences
fLanguage
English
Publisher
IEEE
Conference_Titel
Foundations of Computer Science, 1990. Proceedings., 31st Annual Symposium on
Conference_Location
St. Louis, MO
Print_ISBN
0-8186-2082-X
Type
conf
DOI
10.1109/FSCS.1990.89531
Filename
89531