• DocumentCode
    1992693
  • Title

    G-protein Coupled Receptor Subfamilies Prediction Based on Nearest Neighbor Approach

  • Author

    Fayyaz, Mudassir ; Mujahid, Adnan ; Khan, Asifullah ; Choi, Tae-Sun ; Iqbal, Nadeem

  • Author_Institution
    Ghulam Ishaq Khan Inst. of Eng. Sci. & Technol., Swabi
  • fYear
    2007
  • fDate
    14-17 Oct. 2007
  • Firstpage
    1348
  • Lastpage
    1354
  • Abstract
    Hydrophobicity has been considered a potential measure for predicting G-protein coupled receptor subfamilies. In the present work, we apply the fast Fourier transform to a hydrophobicity encoding of the sequences to better analyze the sequence information. In our experiments, we observed that sequence-pattern information can be exploited in the frequency domain more readily through proximity than by increasing the margin of separation between the classes. Based on this observation, a simple nearest neighbor (NN) method is used to classify the 17 subfamilies. The proposed proximity-based approach outperforms the one-against-all implementation of the support vector machine (SVM) [Y. Z. Guo, et al., Acta Biochimica et Biophysica Sinica, 37 (2005) 759]. It shows superior performance on all three measures for both the jackknife test and the independent data set. For the B, C, D and F subfamilies, the Matthews correlation coefficient and overall accuracy are 0.96 and 96.03% using the jackknife test, and 0.91 and 91.6% using the independent data set. The results validate the idea of exploiting sequence-pattern information in the frequency domain using proximity measured by Euclidean distance. A side advantage is that, instead of training and saving 17 SVM models, only a single NN classifier is needed.
  • Keywords
    Fourier transform spectra; biochemistry; biological techniques; biology computing; cellular biophysics; correlation methods; molecular biophysics; proteins; Euclidean distance; G-protein coupled receptor subfamily prediction; Matthews correlation coefficient; SVM models; fast Fourier transform; frequency domain; hydrophobicity; jackknife test; nearest neighbor approach; proximity based approach; sequence pattern based information; single NN classifier; support vector machine; Amino acids; Databases; Fast Fourier transforms; Hidden Markov models; Nearest neighbor searches; Neural networks; Proteins; Sequences; Support vector machine classification; Support vector machines; Fast Fourier Transform; G-Proteins Coupled Receptors; Multilevel classification; Nearest Neighbor Classifier
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
  • Conference_Location
    Boston, MA
  • Print_ISBN
    978-1-4244-1509-0
  • Type

    conf

  • DOI
    10.1109/BIBE.2007.4375745
  • Filename
    4375745
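
The pipeline summarized in the Abstract (hydrophobicity encoding, fast Fourier transform, nearest neighbor by Euclidean distance) can be illustrated with a minimal sketch. This is not the authors' implementation. The Abstract does not name the hydrophobicity scale, the transform length, or the spectral feature used, so the Kyte-Doolittle scale, the 512-point transform, the use of the magnitude spectrum, and the function names `spectrum` and `nearest_neighbor` are all assumptions made for illustration.

```python
import numpy as np

# Kyte-Doolittle hydropathy values (assumed scale; the record does not say
# which hydrophobicity measure the paper uses).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def spectrum(seq, n_points=512):
    """Map a protein sequence to a fixed-length magnitude spectrum.

    Residues outside the 20 standard amino acids are skipped. The numeric
    signal is zero-padded or truncated to n_points (an assumed length)
    so that every sequence yields a vector of the same size.
    """
    signal = np.array([KD[aa] for aa in seq.upper() if aa in KD], dtype=float)
    return np.abs(np.fft.rfft(signal, n=n_points))


def nearest_neighbor(query, train_feats, train_labels):
    """Return the label of the training vector closest in Euclidean distance."""
    dists = np.linalg.norm(train_feats - query, axis=1)
    return train_labels[int(np.argmin(dists))]
```

With a matrix of training spectra and their subfamily labels, a new sequence would be classified by calling `nearest_neighbor(spectrum(new_seq), train_feats, train_labels)`. A single distance lookup replaces the 17 one-against-all SVM models mentioned in the Abstract, which is the side advantage the authors point out.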