• DocumentCode
    2775078
  • Title

    Sequence learning using the adaptive suffix trie algorithm

  • Author

    Gunasinghe, Upuli ; Alahakoon, Damminda

  • Author_Institution
    Cognitive & Connectionist Syst. Lab., Monash Univ., Clayton, VIC, Australia
  • fYear
    2012
  • fDate
    10-15 June 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Sequences occur naturally in many domains such as biology, engineering, finance and scientific research. Since humans have the inherent ability to comprehend and utilize sequences in day-to-day cognitive tasks such as speech, vision and motor control, biologically inspired sequence learning techniques are used for explanatory data analysis in these domains. Identifying the common substrings that exist in sequences helps in determining the underlying structure and calculating the similarity between sequences. The suffix trie, suffix tree and suffix array are data structures used in many solutions to sequence-based problems. However, these are static data structures, not flexible tools that can be used for sequence learning. In this paper we present the Adaptive Suffix Trie algorithm, a sequence learning algorithm that can be used for identifying substrings of different lengths and frequencies from a given set of sequences. In contrast to suffix data structures, which store all suffixes, the adaptive suffix trie captures only the frequent substrings that occur in the given dataset, resulting in a less complex structure containing only the relevant or useful information. We show how the algorithm's learning parameters can be adapted for extracting substrings with the required characteristics and then demonstrate its application in the classification of biological sequences.
  • Keywords
    DNA; RNA; biology computing; genetics; learning (artificial intelligence); molecular biophysics; proteins; tree data structures; adaptive suffix trie algorithm; biological sequence; cognitive task; complex structure; explanatory data analysis; sequence learning algorithm; similarity measure; static data structures; substring extraction; suffix array; suffix data structure; suffix tree; Arrays; Hebbian theory; Heuristic algorithms; Humans; Training; Vegetation; Frequent substring extraction; Sequence learning; Suffix trie;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks (IJCNN), The 2012 International Joint Conference on
  • Conference_Location
    Brisbane, QLD
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4673-1488-6
  • Electronic_ISBN
    2161-4393
  • Type

    conf

  • DOI
    10.1109/IJCNN.2012.6252671
  • Filename
    6252671