DocumentCode
3060784
Title
CLASSEQ: Classification of Sequences via Comparative Analysis of Multiple Genomes
Author
Choi, Kwangmin ; Yang, Youngik ; Kim, Sun
Author_Institution
Indiana Univ., Bloomington
fYear
2007
fDate
13-15 Dec. 2007
Firstpage
554
Lastpage
559
Abstract
CLASSEQ is a Web-based system for the analysis and comparison of uncharacterized protein sequences against multiple genomes. The user sequences are combined with protein sequences from the user-specified genomes and then clustered using our in-house fast clustering algorithm, BAG. The pre-computed genome-to-genome pairwise comparison database, PCDB, makes our service fast enough to be provided on the Web even though the analysis typically involves tens of thousands of sequences. Clusters containing the user input sequences can be further characterized by domain search, multiple sequence alignment, phylogenetic tree analysis, and gene neighborhood analysis. This Web service is a useful resource for characterizing proteins of unknown function via a comparative genomics approach. CLASSEQ is available at http://platcom.org/CLASSEQ.
Keywords
Web services; biology computing; genetics; molecular biophysics; pattern classification; pattern clustering; proteins; CLASSEQ; Web service; Web-based system; gene neighborhood analysis; multiple genomes comparative analysis; phylogenetic tree analysis; sequences classification; uncharacterized protein sequences; user input sequences; user-specified genomes; Bioinformatics; Clustering algorithms; Databases; Genomics; Informatics; Machine learning; Performance analysis; Phylogeny; Proteins; Sun
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location
Cincinnati, OH
Print_ISBN
978-0-7695-3069-7
Type
conf
DOI
10.1109/ICMLA.2007.94
Filename
4457288