Abstract :
Enzymes are biological catalysts that mediate almost all chemical reactions and are found in all tissues and fluids of the body. They play a central role in metabolic pathways, and therefore in the prediction of those pathways. Our goals in the present study were to identify new features for reliable enzyme functional classification and prediction that do not rely on sequence alignment, and to improve accuracy, or lower the error rate, through attribute selection. We designed novel features, including PPR, NNR, PNPR, PPRDist(x, y), NNRDist(x, y), and PNPRDist(x, y), each extracted from the protein sequence. Using only protein sequences, we compiled a set of 84 attributes that characterize proteins, and obtained an accuracy of 72.13% by identifying the optimal attributes for the given dataset. Our experimental results demonstrate that these novel features are useful for enzyme functional classification. In addition, we identify and analyze the biologically meaningful features of the dataset.
Keywords :
biochemistry; catalysts; enzymes; feature extraction; molecular biophysics; pattern classification; NNRDist(x, y); PNPRDist(x, y); PPRDist(x, y); biological catalysts; enzyme functional classification; metabolic pathways; protein sequences; sequence alignment; Amino acids; Biomedical engineering; Discrete cosine transforms; Protein engineering; Solvents; Support vector machine classification; Support vector machines