DocumentCode
2089075
Title
Prediction of Protein Functional Sites Using Novel String Kernels
Author
Das, Chandra ; Maji, Pradipta
Author_Institution
Dept. of Comput. Sci. & Eng., Netaji Subhash Eng. Coll., Kolkata, India
fYear
2008
fDate
17-20 Dec. 2008
Firstpage
127
Lastpage
130
Abstract
Most pattern recognition algorithms cannot take amino acids directly as inputs because they are nonnumerical variables; they must therefore be encoded first. To this end, a novel string kernel is introduced that maps a nonnumerical sequence space to a numerical feature space. The proposed string kernel is derived from the conventional bio-basis function and is termed the novel bio-basis function. It is designed on the principle that biological distance, calculated using an amino acid mutation matrix, is asymmetric. The concept of a zone of influence of a bio-basis is introduced in the proposed string kernel to normalize this asymmetric distance. An efficient method for selecting bio-bases for the novel string kernel is described, which integrates the concepts of the Fisher ratio and the degree of resemblance. The effectiveness of the proposed string kernel and bio-bases selection method is demonstrated on different protein data sets and compared with that of the existing kernel and related selection methods.
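As a rough illustration of how a bio-basis function turns a nonnumerical sequence into numbers, the sketch below implements the conventional bio-basis function in the form commonly given in the literature, exp(gamma * (h(x, b) - h(b, b)) / h(b, b)), where h sums mutation-matrix scores position by position. It does not reproduce the novel kernel of this paper, whose exact form the abstract does not state. The scoring function `toy_score`, the parameter `gamma`, and the example sequences are hypothetical; a real application would substitute an actual amino acid mutation matrix.

```python
import math


def toy_score(a, b):
    """Hypothetical stand-in for a mutation-matrix entry (not real matrix values)."""
    return 2.0 if a == b else 0.0


def similarity(x, b, score=toy_score):
    """Sum position-wise scores between two equal-length sequences."""
    if len(x) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(score(p, q) for p, q in zip(x, b))


def bio_basis(x, b, gamma=1.0, score=toy_score):
    """Conventional bio-basis function: similarity of x to the bio-basis b,
    normalized by the self-similarity of b and passed through an exponential."""
    h_bb = similarity(b, b, score)
    h_xb = similarity(x, b, score)
    return math.exp(gamma * (h_xb - h_bb) / h_bb)


def encode(x, bases, gamma=1.0, score=toy_score):
    """Map a sequence to a numerical feature vector, one value per bio-basis."""
    return [bio_basis(x, b, gamma, score) for b in bases]


if __name__ == "__main__":
    bases = ["ACDE", "WYFH"]  # hypothetical bio-bases
    print(encode("ACDF", bases))
```

Each input sequence becomes a vector with one entry per bio-basis, which can then be fed to a numerical classifier such as a support vector machine.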
Keywords
biology computing; matrix algebra; pattern recognition; proteins; Fisher ratio; amino acid mutation matrix; amino acids; bio-bases selection method; bio-basis function; biological distance; nonnumerical sequence space; novel string kernels; numerical feature space; pattern recognition algorithms; protein data sets; protein functional sites; Biological information theory; Biological system modeling; Computer science; Educational institutions; Encoding; Genetic mutations; Information technology; Kernel; Protein engineering; Bioinformatics; Sequence analysis; Support vector machine
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology, 2008. ICIT '08. International Conference on
Conference_Location
Bhubaneswar
Print_ISBN
978-1-4244-3745-0
Type
conf
DOI
10.1109/ICIT.2008.11
Filename
4731312