DocumentCode :
2469734
Title :
Parallel and distributed kmeans to identify the translation initiation site of proteins
Author :
Rodrigues, Laerte M. ; Zárate, Luis E. ; Nobre, Cristiane N. ; Freitas, Henrique C.
Author_Institution :
Dept. of Comput. Sci., Pontificia Univ. Catolica de Minas Gerais, Belo Horizonte, Brazil
fYear :
2012
fDate :
14-17 Oct. 2012
Firstpage :
1639
Lastpage :
1645
Abstract :
Predicting the translation initiation site is of vital importance in bioinformatics, since it makes it possible to understand the organic formation and metabolic behavior of living organisms. Sequential algorithms are not always viable because mRNA databases are normally very large, which results in long processing times. Applying parallel and distributed computing resources to such databases can help reduce this time. This article presents a class balancing solution for translation initiation site prediction that uses parallel and distributed computing resources in a hybrid model. For the Homo sapiens database, the results show a speedup of up to 23 times over sequential methods, with accuracy, precision, sensitivity, specificity and adjusted accuracy of 91.15%, 39.83%, 89.11%, 88.93% and 89.02%, respectively. For the Drosophila melanogaster database, the speedup was 18.33 times, and accuracy, precision, sensitivity, specificity and adjusted accuracy were 95.22%, 43.01%, 90.83%, 90.47% and 90.64%, respectively. Both sets of results are considered significant, and the solution presented in this article proved viable for the problem in question.
Keywords :
bioinformatics; message passing; molecular biophysics; parallel algorithms; parallel programming; pattern clustering; proteins; very large databases; Drosophila melanogaster database; Homo sapiens database; accuracy performance; adjusted accuracy performance; bioinformatics; class balancing solution; distributed computing; distributed k-means clustering; living organism; mRNA database; parallel computing; parallel k-means clustering; precision performance; protein translation initiation site; sensitivity performance; specificity performance; translation initiation site processing; Accuracy; Clustering algorithms; Databases; Equations; Libraries; Organisms; Sensitivity; Bioinformatics; Clustering; Parallel and distributed Systems;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-1713-9
Electronic_ISBN :
978-1-4673-1712-2
Type :
conf
DOI :
10.1109/ICSMC.2012.6377972
Filename :
6377972