• DocumentCode
    3460105
  • Title

    Using Mutual Information and Models of Evolution for Improved Pattern Detection

  • Author

    Kitchovitch, Stephan ; Leung, Ian ; Song, Yuedong ; Liò, Pietro

  • Author_Institution
    Comput. Lab., Univ. of Cambridge, Cambridge, UK
  • fYear
    2009
  • fDate
    3-5 Aug. 2009
  • Firstpage
    215
  • Lastpage
    221
  • Abstract
    Many uses of information theory have recently been discovered in the field of bioinformatics: clustering and classification of data, sequence alignment scoring, discovering dependencies between sites in amino acid alignments, and others. Mutual Information has proven to be a very convenient metric for determining the dependency between two sets of data, and has advantages over other common statistical methods such as correlation. Models of evolution, or substitution matrices, have always been at the very heart of bioinformatics, with a large variety of applications based on PAM, BLOSUM, JTT or other matrices. In this paper we describe a novel algorithm that incorporates substitution rates from a given matrix when calculating the mutual information between sites in an amino acid alignment. We describe this algorithm formally and in detail, and present some experimental results. We demonstrate that incorporating substitution matrices in the calculation leads to improved detection of patterns of similarity between sites within a multiple sequence alignment.
  • Keywords
    bioinformatics; data handling; information theory; pattern classification; pattern clustering; Shannon entropy; amino acid alignment; bioinformatics; data classification; data clustering; data dependency; information theory; mutual Information; pattern detection; sequence alignment; substitution matrix; substitution rate; Amino acids; Bioinformatics; Biology computing; DNA; Evolution (biology); Information analysis; Matrices; Mutual information; Sequences; Statistical analysis; Mutual Information; Shannon Entropy; sequence alignment; substitution matrices;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics, Systems Biology and Intelligent Computing, 2009. IJCBS '09. International Joint Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-0-7695-3739-9
  • Type

    conf

  • DOI
    10.1109/IJCBS.2009.77
  • Filename
    5260687