DocumentCode
2819889
Title
Improved Pattern Matching to Find DNA Patterns
Author
Dudás, L.
Author_Institution
Dept. of Inf. Eng., Miskolc Univ.
Volume
2
fYear
2006
fDate
25-28 May 2006
Firstpage
345
Lastpage
349
Abstract
The process of finding given patterns in DNA sequences is widely used in modern biological sciences. This paper shows an algorithmic improvement for exact pattern matching introducing a new heuristic. For implementation the author created an application that uses three well-known heuristics to ensure the O(n) worst case time, the O(n log sigma (m)/m) average case time and the O(n/m) best case time of searching for an m length pattern in an n length text that use a sigma letter alphabet. This application served as a testbed for the new H4 heuristic. The novelty is in optimization of the direction of text window movement in the preprocessing phase. The idea takes advantages of RAM based searching: usually all the text resides in today´s gigabyte memory so the opposite direction of searching window moving requires the same time as the usual. A new function predicts the better moving direction in preprocessing time, based on the unsymmetrical property of pattern. Tests proved that this heuristic may result in fewer jumps and tested characters in the search phase of pattern matching
Keywords
DNA; biology computing; computational complexity; genetics; pattern matching; random-access storage; DNA patterns; RAM based searching; computational complexity; pattern matching; Biological information theory; Biology; DNA; Hydrogen; Pattern matching; Random access memory; Read-write memory; Sequences; Software algorithms; Testing;
fLanguage
English
Publisher
ieee
Conference_Titel
Automation, Quality and Testing, Robotics, 2006 IEEE International Conference on
Conference_Location
Cluj-Napoca
Print_ISBN
1-4244-0360-X
Electronic_ISBN
1-4244-0361-8
Type
conf
DOI
10.1109/AQTR.2006.254657
Filename
4022980
Link To Document