• DocumentCode
    3321391
  • Title

    Performance comparison of generalized PSSM in signal peptide cleavage site and disulfide bond recognition

  • Author

    Clote, P.

  • Author_Institution
    Dept. of Biol., Boston Coll., Chestnut Hill, MA, USA
  • fYear
    2003
  • fDate
    10-12 March 2003
  • Firstpage
    37
  • Lastpage
    44
  • Abstract
    We generalize the familiar position-specific score matrix (PSSM), aka weight matrix, by considering a log-odds score for (nonadjacent) k-tuple frequencies, each k-tuple score weighted by the product of its mutual information and its statistical significance, as measured by a point estimator for the p-value of the mutual information. Performance of this new approach, along with other variants of generalized PSSM and profile methods, is measured by receiver-operating characteristic (ROC) curves for the specific problem of signal peptide cleavage site recognition. We additionally compare Vert's recent support vector machine string kernel, Brown's joint probability approximation algorithm and the method WAM. Similar algorithm comparisons are made, though not as extensively, in the case of disulfide bond recognition. While in the case of signal peptide cleavage site recognition, the monoresidue PSSM is essentially competitive, within the limits of statistical significance, even against Vert's support vector machine kernel, diresidue and triresidue PSSM methods display improved performance over monoresidue PSSM for disulfide bond recognition.
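    The monoresidue PSSM that the abstract uses as its baseline can be illustrated with a minimal sketch. This is not the paper's implementation: it covers only the ordinary single-residue log-odds case, not the generalized k-tuple scores weighted by mutual information and p-value. The function names `build_pssm` and `score`, the uniform background, the pseudocount of 1, the natural logarithm, and the toy training windows are all assumptions made for illustration.

```python
import math
from collections import Counter

# The 20 standard amino acids (one-letter codes).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(windows, background, pseudocount=1.0):
    """Build a monoresidue PSSM from equal-length aligned windows.

    Returns one dict per position mapping residue -> log-odds score,
    log(observed frequency / background frequency). The pseudocount
    (an assumption of this sketch) avoids log(0) for unseen residues.
    """
    length = len(windows[0])
    total = len(windows) + pseudocount * len(ALPHABET)
    pssm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        column = {}
        for aa in ALPHABET:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log(freq / background[aa])
        pssm.append(column)
    return pssm


def score(pssm, window):
    """Sum the per-position log-odds scores for a candidate window."""
    return sum(column[aa] for column, aa in zip(pssm, window))


# Hypothetical toy data: three aligned windows and a uniform background.
uniform_background = {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}
training_windows = ["ALA", "ALA", "AVA"]
pssm = build_pssm(training_windows, uniform_background)
```

    A window resembling the training examples, such as `score(pssm, "ALA")`, scores higher than a dissimilar one, such as `score(pssm, "WWW")`. A diresidue or triresidue variant would replace the per-position residue counts with counts of residue pairs or triples at chosen position tuples.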
  • Keywords
    molecular biophysics; organic compounds; probability; statistics; Brown's joint probability approximation algorithm; diresidue methods; k-tuple score; log-odds score; p-value; position-specific score matrix; profile methods; receiver-operating characteristic curves; signal peptide cleavage site recognition; statistical significance limits; triresidue methods; weight matrix; Bonding; Character recognition; Frequency estimation; Frequency measurement; Kernel; Mutual information; Peptides; Position measurement; Probability; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering, 2003. Proceedings. Third IEEE Symposium on
  • Print_ISBN
    0-7695-1907-5
  • Type

    conf

  • DOI
    10.1109/BIBE.2003.1188927
  • Filename
    1188927