Title of article :
Granular support vector machines with association rules mining for protein homology prediction
Author/Authors :
Tang, Yuchun and Jin, Bo and Zhang, Yan-Qing
Issue Information :
Journal with serial number, year 2005
Pages :
14
From page :
121
To page :
134
Abstract :
Objective: Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such complex classification problems are common in medical and biological information processing applications. Building a model with superior generalization capability from training samples is essential for mining knowledge that accurately predicts or classifies unseen samples and effectively supports human experts in making correct decisions.
Methodology: A learning model called granular support vector machines (GSVM) is proposed based on our previous work. GSVM systematically and formally combines principles from statistical learning theory and granular computing theory, and thus provides an interesting new mechanism for addressing complex classification problems. It works by building a sequence of information granules and then building support vector machines (SVM) in some of these information granules on demand. A good granulation method that finds suitable granules is crucial for modeling a GSVM with good performance. In this paper, we also propose an association rules-based granulation method. Granules induced by association rules with high enough confidence and significant support are left as they are, because of their high "purity" and their significant effect on simplifying the classification task. For every other granule, an SVM is modeled to discriminate the corresponding data. In this way, a complex classification problem is divided into multiple smaller problems, so the learning task is simplified.
Results and conclusions: The proposed algorithm, here named GSVM-AR, is compared with SVM on the KDDCUP04 protein homology prediction data. The experimental results show that finding the splitting hyperplane is not a trivial task (the association rules must be selected carefully to avoid overfitting) and that GSVM-AR does show significant improvement compared to building one single SVM in the whole feature space. Another advantage is that GSVM-AR is practical to use because it is easy to implement. More importantly and more interestingly, GSVM provides a new mechanism for addressing complex classification problems.
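The granulation idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the rule form (a single-feature threshold), the names `Rule`, `LinearSVM`, and `GSVMAR`, the thresholds `min_support` and `min_confidence`, and the toy data are all hypothetical. A small linear SVM trained by subgradient descent stands in for whatever SVM the paper uses, and all samples not covered by an accepted rule are treated as one residual granule.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule form: IF x[feature] > threshold THEN label."""
    feature: int
    threshold: float
    label: int

    def covers(self, x: Sequence[float]) -> bool:
        return x[self.feature] > self.threshold


def support_confidence(rule: Rule, data: List[Sample]) -> Tuple[float, float]:
    """Support: fraction of samples covered. Confidence: label agreement among covered."""
    covered = [y for x, y in data if rule.covers(x)]
    if not covered:
        return 0.0, 0.0
    hits = sum(1 for y in covered if y == rule.label)
    return len(covered) / len(data), hits / len(covered)


class LinearSVM:
    """Stand-in linear SVM: full-batch subgradient descent on the regularized hinge loss."""

    def __init__(self, lam: float = 0.01, lr: float = 0.1, epochs: int = 200):
        self.lam, self.lr, self.epochs = lam, lr, epochs
        self.w: List[float] = []
        self.b = 0.0

    def _score(self, x: Sequence[float]) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def fit(self, data: List[Sample]) -> "LinearSVM":
        n = len(data)
        self.w, self.b = [0.0] * len(data[0][0]), 0.0
        for _ in range(self.epochs):
            grad_w = [self.lam * wi for wi in self.w]
            grad_b = 0.0
            for x, y in data:
                if y * self._score(x) < 1:  # sample violates the margin
                    for j, xj in enumerate(x):
                        grad_w[j] -= y * xj / n
                    grad_b -= y / n
            self.w = [wi - self.lr * g for wi, g in zip(self.w, grad_w)]
            self.b -= self.lr * grad_b
        return self

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self._score(x) >= 0 else -1


class GSVMAR:
    """Accepted rules label their own granules; an SVM handles the residual granule."""

    def __init__(self, candidates: List[Rule], min_support: float = 0.2,
                 min_confidence: float = 0.95):
        self.candidates = candidates
        self.min_support, self.min_confidence = min_support, min_confidence
        self.rules: List[Rule] = []
        self.svm: Optional[LinearSVM] = None
        self.default = 1

    def fit(self, data: List[Sample]) -> "GSVMAR":
        self.rules = []
        for rule in self.candidates:
            sup, conf = support_confidence(rule, data)
            if sup >= self.min_support and conf >= self.min_confidence:
                self.rules.append(rule)
        rest = [(x, y) for x, y in data if not any(r.covers(x) for r in self.rules)]
        labels = {y for _, y in rest}
        if len(labels) == 2:      # both classes present: train the SVM
            self.svm = LinearSVM().fit(rest)
        elif labels:              # residual granule is already pure
            self.default = labels.pop()
        return self

    def predict(self, x: Sequence[float]) -> int:
        for rule in self.rules:
            if rule.covers(x):
                return rule.label
        return self.svm.predict(x) if self.svm else self.default


# Toy data (made up): feature 0 > 5 is a pure positive granule.
data: List[Sample] = [
    ([6, 0], 1), ([7, 1], 1), ([8, 0], 1),
    ([0, 2], 1), ([1, 3], 1), ([0, -2], -1), ([1, -3], -1), ([2, -2], -1),
]
pure_rule = Rule(feature=0, threshold=5, label=1)
noisy_rule = Rule(feature=1, threshold=-2.5, label=1)  # confidence 5/7, rejected
model = GSVMAR([pure_rule, noisy_rule]).fit(data)
```

In this sketch, a query such as `model.predict([9, -5])` is answered by the accepted rule alone, while queries outside every accepted rule fall through to the SVM trained on the residual samples. How the paper actually mines rules, and how it guards against overfitting when selecting them, is not specified in the abstract and is not modeled here.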
Keywords :
Binary classification , Protein homology prediction , Granular computing , Association rules , Granular support vector machines
Journal title :
Artificial Intelligence In Medicine
Serial Year :
2005
Record number :
1835077