DocumentCode
2775078
Title
Sequence learning using the adaptive suffix trie algorithm
Author
Gunasinghe, Upuli ; Alahakoon, Damminda
Author_Institution
Cognitive & Connectionist Syst. Lab., Monash Univ., Clayton, VIC, Australia
fYear
2012
fDate
10-15 June 2012
Firstpage
1
Lastpage
8
Abstract
Sequences occur naturally in many domains such as biology, engineering, finance and scientific research. Since humans have the inherent ability to comprehend and utilize sequences in day-to-day cognitive tasks such as speech, vision and motor control, biologically inspired sequence learning techniques are used for explanatory data analysis in these domains. Identifying the common substrings in sequences helps in determining their underlying structure and calculating the similarity between sequences. The suffix trie, suffix tree and suffix array are data structures used in many solutions to sequence-based problems. However, these are static data structures rather than flexible tools that can be used for sequence learning. In this paper we present the Adaptive Suffix Trie algorithm, a sequence learning algorithm that identifies substrings of different lengths and frequencies in a given set of sequences. In contrast to suffix data structures, which store all suffixes, the adaptive suffix trie captures only the frequent substrings that occur in the given dataset, resulting in a less complex structure that holds only the relevant or useful information. We show how the algorithm's learning parameters can be adapted to extract substrings with the required characteristics, and then demonstrate its application in the classification of biological sequences.
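The contrast the abstract draws, between storing every suffix and keeping only frequent substrings, can be illustrated with a minimal sketch. This is not the paper's Adaptive Suffix Trie algorithm, whose learning rule and parameters are not given in this record; it is a plain counting trie followed by a frequency filter. The names `Node`, `build_counted_suffix_trie`, `frequent_substrings`, `max_depth` and `min_count` are hypothetical and chosen only for illustration.

```python
class Node:
    """Trie node holding an occurrence count and child links (illustrative only)."""

    def __init__(self):
        self.count = 0
        self.children = {}


def build_counted_suffix_trie(sequences, max_depth):
    """Insert every suffix of every sequence, truncated to max_depth symbols.

    After construction, the count on a node equals the number of times the
    substring spelled by the path to that node occurs across all sequences.
    """
    root = Node()
    for seq in sequences:
        for start in range(len(seq)):
            node = root
            for symbol in seq[start:start + max_depth]:
                node = node.children.setdefault(symbol, Node())
                node.count += 1
    return root


def frequent_substrings(root, min_count):
    """Return {substring: count} for substrings occurring at least min_count times.

    Counts never increase with depth, so a branch can be abandoned as soon as
    one node falls below the threshold.
    """
    result = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for symbol, child in node.children.items():
            if child.count >= min_count:
                substring = prefix + symbol
                result[substring] = child.count
                stack.append((child, substring))
    return result


# Hypothetical toy input: two short DNA-like strings.
trie = build_counted_suffix_trie(["ACGACG", "ACGT"], max_depth=3)
frequent = frequent_substrings(trie, min_count=3)
```

In this sketch the full trie is built first and filtered afterwards; an adaptive method as described in the abstract would instead avoid retaining infrequent branches in the first place, which is where the reduction in structure size would come from.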
Keywords
DNA; RNA; biology computing; genetics; learning (artificial intelligence); molecular biophysics; proteins; tree data structures; adaptive suffix trie algorithm; biological sequence; cognitive task; complex structure; explanatory data analysis; sequence learning algorithm; similarity measure; static data structures; substring extraction; suffix array; suffix data structure; suffix tree; Arrays; Hebbian theory; Heuristic algorithms; Humans; Training; Vegetation; Frequent substring extraction; Sequence learning; Suffix trie;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks (IJCNN), The 2012 International Joint Conference on
Conference_Location
Brisbane, QLD
ISSN
2161-4393
Print_ISBN
978-1-4673-1488-6
Electronic_ISBN
2161-4393
Type
conf
DOI
10.1109/IJCNN.2012.6252671
Filename
6252671