• DocumentCode
    3431003
  • Title
    A nearest neighbor method for predicting solenoid proteins
  • Author
    Cheng, Wen ; Sanjaka, Malinda ; Yan, Changhui
  • Author_Institution
    Department of Computer Science, North Dakota State University, Fargo, USA
  • fYear
    2012
  • fDate
    11-13 Aug. 2012
  • Firstpage
    68
  • Lastpage
    71
  • Abstract
    Solenoid proteins contain repeats of 5 to 40 residues in length. Identifying solenoid proteins is challenging because the repeat sequences are highly degenerate. Here, we present a nearest neighbor (NN) method for predicting solenoid proteins based on residue composition. The distance between two proteins is calculated as a weighted Euclidean distance between their residue composition vectors. The NN method predicts solenoid proteins with an overall accuracy of 95.5%, with 94.3% sensitivity and 96% specificity, outperforming other methods in direct comparisons. We also demonstrate that combining the NN method with HHrepID and Trust, two previously published methods for the same problem, can dramatically reduce the false positive rate in predicting repeats.
  • Keywords
    Accuracy; Databases; Proteins; Solenoids; nearest neighbor; prediction; solenoids; weighted Euclidean distance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Granular Computing (GrC), 2012 IEEE International Conference on
  • Conference_Location
    Hangzhou, China
  • Print_ISBN
    978-1-4673-2310-9
  • Type
    conf
  • DOI
    10.1109/GrC.2012.6468600
  • Filename
    6468600
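
The abstract describes nearest neighbor prediction using a weighted Euclidean distance between residue composition vectors. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the uniform default weights, and the toy training sequences are all assumptions made for illustration; the abstract does not state how the weights are chosen or which training set is used.

```python
import math

# The 20 standard amino acids, one-letter codes (fixed order for the vector).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional residue composition (fractions) of a sequence.

    Characters outside the 20 standard residues are ignored.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [c / total for c in counts]


def weighted_distance(u, v, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (u_i - v_i)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))


def predict(query, training, weights=None):
    """Label a query with the label of its nearest training sequence.

    `training` is a list of (sequence, label) pairs. `weights` defaults to
    uniform weights, which is an assumption of this sketch.
    """
    if weights is None:
        weights = [1.0] * len(AMINO_ACIDS)
    q = composition(query)
    nearest = min(
        training,
        key=lambda item: weighted_distance(q, composition(item[0]), weights),
    )
    return nearest[1]


# Toy data, invented for illustration only (True = solenoid, False = not).
training = [("LLLLAAAA", True), ("GGGGKKKK", False)]
label = predict("LLLAAA", training)
```

Recomputing the composition of every training sequence per query keeps the sketch short; a practical version would precompute those vectors once.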