DocumentCode
1047379
Title
Exploring Microbial Genome Sequences to Identify Protein Families on the Grid
Author
Sun, Yudong ; Wipat, Anil ; Pocock, Matthew ; Lee, Peter A. ; Flanagan, Keith ; Worthington, James T.
Author_Institution
Oxford Univ., Oxford
Volume
11
Issue
4
fYear
2007
fDate
7/1/2007
Firstpage
435
Lastpage
442
Abstract
The analysis of microbial genome sequences can identify protein families that provide potential drug targets for new antibiotics. With the rapid accumulation of newly sequenced genomes, this analysis has become a computationally intensive and data-intensive problem. This paper describes the development of Microbase, a Web-service-enabled, component-based architecture that supports the large-scale comparative analysis of complete microbial genome sequences and the subsequent identification of orthologues and protein families. The system is coordinated through Web-service-based notifications. It integrates distributed computing resources with genomic databases to perform all-against-all comparisons across a large volume of genome sequences and presents the results in a computationally amenable format through a Web service interface. We demonstrate the use of the system in searching for orthologues and candidate protein families, which could ultimately lead to the identification of potential therapeutic targets.
Keywords
Web services; biology computing; genetics; microorganisms; object-oriented programming; proteins; Microbase; Web service based notifications; Web service enabled architecture; Web service interface; all against all comparisons; antibiotics; complete microbial genome sequences; component based architecture; distributed computing resources; genome sequencing; genomic databases; large scale comparative analysis; orthology identification; orthology searching; potential drug targets; potential therapeutic targets; protein family identification; Antibiotics; Bioinformatics; Computer architecture; Computer interfaces; Distributed computing; Distributed databases; Drugs; Genomics; Large-scale systems; Protein engineering; Genome analysis; grid; microbial genomes; protein families; Bacterial Proteins; Chromosome Mapping; Databases, Protein; Genome, Bacterial; Information Storage and Retrieval; Internet; Multigene Family
fLanguage
English
Journal_Title
Information Technology in Biomedicine, IEEE Transactions on
Publisher
IEEE
ISSN
1089-7771
Type
jour
DOI
10.1109/TITB.2007.892913
Filename
4267692