Title :
A study of comparing the ambiguity of existing virus taxonomy structures using protein's region names in the vector space model
Author_Institution :
Department of Computer Science and Information Engineering, Asia University, No. 500, Lioufeng Rd., Wufeng, Taichung 41354, Taiwan
Abstract :
Evaluating whether an existing taxonomy structure is adequate is an interesting and challenging research problem, especially when the taxonomy is domain-specific and its diversity increases over time. This paper evaluates the ambiguity of two existing virus taxonomy structures, the Baltimore and the International Committee on Taxonomy of Viruses (ICTV) classification systems, using protein region names in the vector space model. The comparison first transforms every virus instance into a representative vector (point) according to the protein region names that the instance contains, and then computes the Class Structure Ambiguity (CSA) of each taxonomy structure. Four taxonomy structures are selected for the experiments: 7 groups from the Baltimore classification system, and 6 orders, 42 families, and 36 genera from the ICTV classification system. Experimental results show that the virus taxonomy structure derived from the Baltimore classification system is more ambiguous than those derived from the ICTV classification system. For virologists and biologists, the ambiguities identified within these taxonomy structures can provide hints for verifying the classification of viruses that fall in the ambiguous regions, or for reorganizing (adjusting) the taxonomy structures in the future.
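The vectorization step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define CSA, so the score below is not the paper's measure: it is a stand-in proxy (the fraction of instances whose nearest class centroid, by cosine similarity, belongs to a different class), assumed here only to show how a labeling of the same vectors can be more or less ambiguous. The region names, class labels, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(region_names, vocabulary):
    """Map one virus instance (a list of region names) to a count vector."""
    counts = Counter(region_names)
    return [float(counts[term]) for term in vocabulary]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def ambiguity_proxy(instances, labels):
    """Fraction of instances closer to another class's centroid than their own.

    ASSUMPTION: an illustrative proxy, not the CSA measure used in the paper.
    """
    vocabulary = sorted({name for inst in instances for name in inst})
    vectors = [vectorize(inst, vocabulary) for inst in instances]

    by_class = defaultdict(list)
    for vec, label in zip(vectors, labels):
        by_class[label].append(vec)
    centroids = {label: centroid(vs) for label, vs in by_class.items()}

    misplaced = 0
    for vec, label in zip(vectors, labels):
        nearest = max(centroids, key=lambda c: cosine(vec, centroids[c]))
        if nearest != label:
            misplaced += 1
    return misplaced / len(vectors)


# Hypothetical toy data: four virus instances described by region names.
instances = [
    ["capsid", "polymerase"],
    ["capsid", "polymerase", "helicase"],
    ["envelope", "protease"],
    ["envelope", "protease", "integrase"],
]
labels_clean = ["A", "A", "B", "B"]   # classes match the region-name overlap
labels_mixed = ["A", "A", "A", "B"]   # third instance grouped against its overlap

score_clean = ambiguity_proxy(instances, labels_clean)
score_mixed = ambiguity_proxy(instances, labels_mixed)
print(score_clean, score_mixed)
```

Running the same proxy over two labelings of one set of vectors mirrors the comparison in the abstract, where the same virus instances are scored under the Baltimore groups and under the ICTV orders, families, and genera.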
Keywords :
"Taxonomy","Proteins","Viruses (medical)","Genomics","Bioinformatics","Accuracy","Transforms"
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2015 IEEE Conference on
DOI :
10.1109/CIBCB.2015.7300272