DocumentCode
3134471
Title
String join using precedence count matrix
Author
Cao, Xia ; Tung, Anthony K H ; Ooi, Beng Chin ; Tan, Kian-Lee ; Li, Shuai Cheng
Author_Institution
Dept. of Comput. Sci., National Univ. of Singapore, Singapore
fYear
2004
fDate
21-23 June 2004
Firstpage
345
Lastpage
348
Abstract
In this paper, we propose a filter-and-refine string join algorithm. The filtering phase rapidly prunes strings that are not joinable, and the refinement phase then applies a comprehensive algorithm to remove the remaining false alarms. The efficiency of the proposed scheme lies in the use of the precedence count matrix (PCM) for computing the edit distance between two sequences. With PCM, each sequence comparison takes constant time. We also evaluated the proposed sequence join algorithm, and our study shows that it outperforms the known techniques.
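The filter-and-refine pattern described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the PCM, so this sketch does not implement it; as a stand-in filter it assumes a simple character-frequency lower bound on edit distance, and it refines with the standard dynamic-programming edit distance. The function names `frequency_lower_bound`, `edit_distance`, and `string_join`, and the threshold parameter `k`, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequency_lower_bound(s, t):
    """Cheap lower bound on edit distance from character counts.

    Assumption: this is a stand-in filter, NOT the paper's PCM.
    """
    cs, ct = Counter(s), Counter(t)
    surplus = sum((cs - ct).values())  # characters s has in excess of t
    deficit = sum((ct - cs).values())  # characters t has in excess of s
    # One edit can lower the surplus and the deficit by at most 1 each.
    return max(surplus, deficit)


def edit_distance(s, t):
    """Standard dynamic-programming (Levenshtein) edit distance."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]
        for j, b in enumerate(t, 1):
            curr.append(min(prev[j] + 1,              # delete a
                            curr[j - 1] + 1,          # insert b
                            prev[j - 1] + (a != b)))  # substitute or match
        prev = curr
    return prev[-1]


def string_join(left, right, k):
    """Return pairs (s, t) whose edit distance is at most k."""
    result = []
    for s in left:
        for t in right:
            # Filter: pairs whose lower bound exceeds k cannot join.
            if frequency_lower_bound(s, t) > k:
                continue
            # Refine: verify survivors exactly, dropping false alarms.
            if edit_distance(s, t) <= k:
                result.append((s, t))
    return result
```

For example, `string_join(["ACGT"], ["TGCA"], 1)` passes the filter (both strings have identical character counts) but is rejected in refinement, which is the kind of false alarm the second phase exists to remove.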
Keywords
DNA; distributed databases; genetics; query languages; relational databases; scientific information systems; string matching; DNA sequences; constant time complexity; false alarm removal; filter-and-refine string join algorithm; genomic applications; precedence count matrix; sequence comparison; sequence edit distance computing; sequence join algorithm; string data manipulation; string pruning; string refinement; string similarity; Assembly; Bioinformatics; Computer science; Dynamic programming; Filtering algorithms; Filters; Finance; Genomics; Phase change materials;
fLanguage
English
Publisher
IEEE
Conference_Titel
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
ISSN
1099-3371
Print_ISBN
0-7695-2146-0
Type
conf
DOI
10.1109/SSDM.2004.1311228
Filename
1311228