Title of article :
Computational Detection of piRNA in Human Using Support Vector Machine
Author/Authors :
Seyeddokht, Atefeh Department of Animal Science - Faculty of Agriculture - Ferdowsi University of Mashhad, Mashhad , Masoudinejad, Ali Tehran Laboratory of System Biology and Bioinformatics (LBB) - Institute of Biochemistry and Biophysics - University of Tehran , Aslaminejad, Ali asghar Department of Animal Science - Faculty of Agriculture - Ferdowsi University of Mashhad, Mashhad , Nasseri, Mohammadreza Department of Animal Science - Research Institute of Biotechnology - Ferdowsi University of Mashhad, Mashhad , Zahiri, Javad Bioinformatics and Computational Omics - LAB (BioCOOL) - Faculty of Biological Sciences - Tarbiat Modares University (TMU), Tehran , Sadeghi, Balal Department of Food Hygiene and Public Health - Faculty of Veterinary Medicine - Shahid Bahonar University of Kerman, Kerman
Abstract :
Background: Piwi-interacting RNAs (piRNAs) are small non-coding RNAs (ncRNAs),
with a length of about 24-32 nucleotides, which have been discovered recently. These
ncRNAs play an important role in germline development, transposon silencing, epigenetic
regulation, protecting the genome from invasive transposable elements, and the
pathophysiology of diseases such as cancer. piRNA identification is challenging due to
the lack of conserved piRNA sequences and structural elements.
Methods: To detect piRNAs, a feature set comprising 8 diverse feature groups
was used to encode each RNA. A Support Vector Machine (SVM) classifier with
optimized parameters was then used for RNA classification. With the optimized
feature subsets, classification performance was much higher than that reported
in previously published studies.
Results: Our results revealed 98% accuracy, a Matthews correlation coefficient of 98%,
and 99% specificity in discriminating piRNAs from other RNAs. The results also
show that the proposed method outperforms its competitors.
Conclusion: In this paper, a prediction method was proposed to identify piRNAs in
humans. A total of 48 heterogeneous features (sequence and structural features) were
used to encode RNAs. To assess the performance of the method, a benchmark dataset
containing 515 piRNAs and 1206 types of other RNAs was constructed. Our method
reached an accuracy of 99% on the benchmark dataset. Our analysis also revealed
that the structural features contribute most to piRNA prediction.
Keywords :
Piwi-interacting RNAs (piRNAs) , RNA , Support Vector Machines (SVM)
Journal title :
Astroparticle Physics