Title :
Predicting the Risk Type of Human Papillomaviruses Based on Sequence-Derived Features
Author :
Pu Wang ; Xuan Xiao
Author_Institution :
Dept of Machine & Electron, Jing-De-Zhen Ceramic Inst., Jing-De-Zhen, China
Abstract :
Human papillomaviruses (HPVs) are a group of viruses now recognized as one of the major causes of cervical cancer. Because there are over 100 different types of HPV, scientists have divided them into types that are more likely to lead to cancer and types that are less likely. The so-called "high-risk" HPV types are more likely to lead to the development of cancer, while "low-risk" types rarely do. Identifying the risk type of a given HPV is therefore useful and necessary for the diagnosis and treatment of cervical cancer. To predict and classify HPV risk types by bioinformatics analysis, we constructed an HPV dataset from available databases. Classification was performed with a fuzzy K-nearest-neighbor (FKNN) classifier on a large set of physicochemical and statistical features derived from protein sequences. An overall predictive accuracy of about 96% was achieved under rigorous leave-one-out cross-validation on the dataset. This indicates that our method can be a useful auxiliary tool for predicting the risk type of human papillomaviruses.
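The abstract does not give the classifier's details, so the following is only a minimal sketch of how an FKNN classifier with leave-one-out cross-validation might look. It assumes the common inverse-distance formulation of fuzzy K-nearest-neighbor with crisp training labels, Euclidean distance, and a fuzzifier m = 2. The function names, the parameters k and m, and the two-dimensional toy feature vectors are all hypothetical; they are not the paper's sequence-derived features, dataset, or settings.

```python
import math


def fknn_memberships(train_x, train_y, query, k=3, m=2.0, eps=1e-12):
    """Return {label: fuzzy membership} of `query` from its k nearest neighbors.

    Assumption: each training sample belongs fully to its own class, and a
    neighbor at distance d is weighted by 1 / d**(2 / (m - 1)).
    """
    # k nearest training samples by Euclidean distance
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # eps guards against division by zero when the query equals a sample
    weights = [(1.0 / max(d, eps) ** exponent, y) for d, y in neighbors]
    total = sum(w for w, _ in weights)
    memberships = {label: 0.0 for label in set(train_y)}
    for w, y in weights:
        memberships[y] += w / total
    return memberships


def fknn_predict(train_x, train_y, query, k=3, m=2.0):
    """Assign the class with the largest fuzzy membership."""
    memberships = fknn_memberships(train_x, train_y, query, k, m)
    return max(memberships, key=memberships.get)


def leave_one_out_accuracy(xs, ys, k=3, m=2.0):
    """Hold out each sample in turn, train on the rest, and score the hits."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if fknn_predict(train_x, train_y, xs[i], k, m) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up toy vectors standing in for sequence-derived features.
toy_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
toy_y = ["low-risk"] * 3 + ["high-risk"] * 3
toy_accuracy = leave_one_out_accuracy(toy_x, toy_y)
```

In this formulation the memberships of a query sum to 1 across classes, so they can be read as a graded confidence in each risk type rather than a hard vote.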
Keywords :
bioinformatics; cancer; fuzzy set theory; medical information systems; microorganisms; pattern classification; statistical analysis; FKNN; Fuzzy K nearest neighbor; HPV; bioinformatics analysis; cervical cancer; human papillomaviruses; physicochemical features; sequence derived features; statistical features; Amino acids; Cervical cancer; Complexity theory; Humans; Protein sequence;
Conference_Title :
Bioinformatics and Biomedical Engineering, (iCBBE) 2011 5th International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5088-6
DOI :
10.1109/icbbe.2011.5779985