Title :
CLASSEQ: Classification of Sequences via Comparative Analysis of Multiple Genomes
Author :
Choi, Kwangmin ; Yang, Youngik ; Kim, Sun
Author_Institution :
Indiana Univ., Bloomington
Abstract :
CLASSEQ is a Web-based system for the analysis and comparison of uncharacterized protein sequences against multiple genomes. The user sequences are combined with protein sequences from the user-specified genomes and then clustered using our in-house fast clustering algorithm, BAG. The pre-computed genome-to-genome pairwise comparison database, PCDB, makes our service fast enough to be provided on the Web even though the analysis typically involves tens of thousands of sequences. Clusters containing the user input sequences can be further characterized by domain search, multiple sequence alignment, phylogenetic tree analysis, and gene neighborhood analysis. This Web service is a useful resource for characterizing proteins of unknown function via a comparative genomics approach. CLASSEQ is available at http://platcom.org/CLASSEQ.
Keywords :
Web services; biology computing; genetics; molecular biophysics; pattern classification; pattern clustering; proteins; CLASSEQ; Web service; Web-based system; gene neighborhood analysis; multiple genomes comparative analysis; phylogenetic tree analysis; sequences classification; uncharacterized protein sequences; user input sequences; user-specified genomes; Bioinformatics; Clustering algorithms; Databases; Genomics; Informatics; Machine learning; Performance analysis; Phylogeny; Proteins; Sun;
Conference_Title :
Sixth International Conference on Machine Learning and Applications (ICMLA 2007)
Conference_Location :
Cincinnati, OH
Print_ISBN :
978-0-7695-3069-7
DOI :
10.1109/ICMLA.2007.94