Title :
Identification of Viral Protein Genotypic Determinants Using Combinatorial Filtering and Active Learning
Author :
Wu, Chuang ; Walsh, Andrew S. ; Rosenfeld, Roni
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Date :
May 31, 2010 - June 3, 2010
Abstract :
RNA viruses such as HIV and influenza impose a very significant disease burden throughout the world. Identifying the key protein residue determinants that affect a given viral phenotype is an important step in learning the genotype-phenotype mapping and in making clinical decisions. This identification is currently done through a laborious experimental process that is arguably inefficient, incomplete, and unreliable. We describe a supervised combinatorial filtering algorithm that systematically and efficiently infers the correct set of key residue positions from all available labeled data. We demonstrate its consistency, validate it on a variety of datasets, show that it is more powerful than conventional identification methods, and describe its use under incremental relaxation of constraints. For cases where more data are needed to fully converge to an answer, we introduce an active learning algorithm that helps choose the most informative experiment from a set of unlabeled candidate strains or mutagenesis experiments, so as to minimize the expected total laboratory time or financial cost. As an example, we demonstrate the savings afforded by this algorithm in identifying the molecular determinants of fusogenicity from a previously published dataset of Feline Immunodeficiency Virus Envelope proteins.
Keywords :
bioinformatics; genetics; learning (artificial intelligence); macromolecules; microorganisms; proteins; Feline Immunodeficiency Virus Envelope proteins; RNA viruses; active learning; combinatorial filtering; fusogenicity; genotype-phenotype mapping; viral phenotype; viral protein genotypic determinants; Active filters; Capacitive sensors; Diseases; Filtering algorithms; Human immunodeficiency virus; Influenza; Laboratories; Proteins; RNA; Viruses (medical); Active Learning; Combinatorial Filtering; Key residues identification
Conference_Title :
BioInformatics and BioEngineering (BIBE), 2010 IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-4244-7494-3
DOI :
10.1109/BIBE.2010.25