Title of article :
SubRF_Seq: Identification of Sub-Golgi Protein Types with Random Forest with Partial Sequence Information
Author/Authors :
Cui, Qingyu School of Information - University of Jinan, China , Cao, Yi School of Information - University of Jinan, China , Bao, Wenzheng School of Information Engineering - Xuzhou University of Technology, China , Yang, Bin School of Information Science and Engineering, Zaozhuang University, China , Chen, Yuehui School of Information - University of Jinan, China
Pages :
7
From page :
1
To page :
7
Abstract :
In recent years, the classification of Golgi proteins has been studied intensively. It has been established that the Golgi apparatus can synthesize many substances, such as polysaccharides, and that it can also combine proteins with sugars or lipids to form glycoproteins and lipoproteins. In some cells (such as liver cells), the Golgi apparatus is also involved in the synthesis and secretion of lipoproteins. Therefore, the loss of Golgi protein function may have severe effects on the human body. For example, Alzheimer’s disease and diabetes are related to the loss of Golgi protein function. Because the classification of Golgi proteins is relevant to the treatment of these diseases, many scholars have studied it, but the data sets they used consisted of complete Golgi protein sequences. This article focuses on whether the sequence information used for Golgi protein classification is redundant or, in other words, whether only a part of a Golgi protein sequence is sufficient to classify it. In addition, we adopt a new method to deal with the problem of sample imbalance. Experiments suggest that the model's performance is promising.
Keywords :
SubRF_Seq , Identification , Sub-Golgi Protein , Random Forest , Partial Sequence Information
Journal title :
Scientific Programming
Serial Year :
2020
Full Text URL :
Record number :
2610830