• DocumentCode
    1989348
  • Title
    The RNA String Kernel for siRNA Efficacy Prediction
  • Author
    Qiu, Shibin ; Lane, Terran
  • Author_Institution
    Pathwork Diagnostics, Inc., Sunnyvale
  • fYear
    2007
  • fDate
    14-17 Oct. 2007
  • Firstpage
    307
  • Lastpage
    314
  • Abstract
    String kernels directly model sequence similarities without the necessity of extracting numerical features in a vector space. Since they better capture complex traits in the sequences, string kernels often achieve better prediction performance. RNA interference is a cell defense mechanism with many biological and therapeutic applications, where strings can be used to represent target messenger RNAs and initiating short RNAs, and string kernels can be applied for training and prediction. While most existing string kernels are developed for general-purpose sequences and have been applied to text and protein classification, the RNA string kernel is specifically designed to model the mismatches, GU wobbles, and bulges of RNA biology and has been applied to RNAi off-target evaluation. We adapt the RNA string kernel to compute the similarity of siRNA sequences and use it in support vector regression to predict siRNA silencing efficacy. We evaluate the performance of the RNA kernel against the spectrum kernel, the string subsequence kernel of arbitrary mismatch, the randomized string kernel, and numerical kernels computed from numerical features extracted according to siRNA design rules. We also give insights into computational performance and into the common properties and differences of the RNA kernel as compared to other kernels. Empirical results on biological data sets demonstrate that the RNA string kernel compared favorably with most existing string kernels and achieved significant improvements over kernels computed from numerical descriptors extracted according to structural and thermodynamic rules. Meanwhile, the string kernels achieved favorable results relative to other methods in related work. Furthermore, the RNA string kernel is simple to implement and fast to compute.
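As an illustration of what a string kernel computes, the following is a minimal sketch of the spectrum kernel, one of the baselines named in the abstract. It is not the RNA string kernel itself, which additionally models mismatches, GU wobbles, and bulges and is not specified in this record. The function names, the default k-mer length, and the cosine normalization are assumptions made for this example only.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Inner product of the k-mer count vectors of x and y."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Iterate over the smaller table; a Counter returns 0 for missing keys.
    if len(cx) > len(cy):
        cx, cy = cy, cx
    return sum(n * cy[u] for u, n in cx.items())


def normalized_spectrum_kernel(x, y, k=3):
    """Cosine-normalized spectrum kernel (normalization is an assumption)."""
    kxx = spectrum_kernel(x, x, k)
    kyy = spectrum_kernel(y, y, k)
    if kxx == 0 or kyy == 0:
        # A sequence shorter than k has no k-mers at all.
        return 0.0
    return spectrum_kernel(x, y, k) / (kxx * kyy) ** 0.5
```

A Gram matrix built from such a function over a set of siRNA sequences could, for instance, be passed to a support vector regressor that accepts precomputed kernels, such as scikit-learn's `SVR(kernel="precomputed")`.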
  • Keywords
    biology computing; cellular biophysics; genetics; molecular biophysics; molecular configurations; regression analysis; support vector machines; RNA interference; RNA string kernel; cell defense mechanism; randomized string kernel; sequence similarities; siRNA efficacy prediction; siRNA sequences; siRNA silencing efficacy; spectrum kernel; string subsequence kernel; support vector regression; Biological system modeling; Biology computing; Cells (biology); Data mining; Feature extraction; Interference; Kernel; Protein engineering; RNA; Sequences;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
  • Conference_Location
    Boston, MA
  • Print_ISBN
    978-1-4244-1509-0
  • Type
    conf
  • DOI
    10.1109/BIBE.2007.4375581
  • Filename
    4375581