Regional Information Center for Science and Technology - Identifying protein binding functionality of protein family sequences by Aligned Pattern clusters

DocumentCode :

2691242

Title :

Identifying protein binding functionality of protein family sequences by Aligned Pattern clusters

Author :

Lee, En-Shiun Annie ; Wong, Andrew K C

Author_Institution :

Syst. Design Eng., Univ. of Waterloo, Waterloo, ON, Canada

fYear :

2012

fDate :

4-7 Oct. 2012

Firstpage :

Lastpage :

Abstract :

A basic task in protein analysis is to discover a set of sequence patterns that reflect the function of a protein family. These sequence patterns contain significant, non-exact residue associations. Existing combinatorial methods are computationally expensive, and probabilistic methods require a richer representation of the amino acid associations. To address this task, we create a synthesized pattern representation called an Aligned Pattern (AP) Cluster, which identifies the residue associations in the binding segment and the site variations in the aligned residues. In this paper, our algorithm identifies the binding segments for two protein families: the Cytochrome Complex and the Ubiquitin protein families. In each experiment, the AP Clusters obtained correspond to protein binding segments, including a few beyond those identified by the protein databases PROSITE and pFam. Furthermore, the columns of aligned sites that take only a single value in the AP Clusters also correspond to the binding residues. The additional information retained by the AP Clusters can reveal the amino acid residues of interest, thus averting time-consuming simulations and experimentation.
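
The abstract's observation about columns of aligned sites that hold only a single value can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `conserved_columns` and `column_counts`, the gap symbol `-`, and the toy cluster strings are all assumptions made for illustration, and the sketch presumes the patterns in a cluster have already been aligned to equal length.

```python
from collections import Counter


def conserved_columns(aligned_patterns, gap="-"):
    """Return {column index: residue} for columns that hold one residue value.

    Hypothetical helper: assumes every pattern in the cluster has already
    been aligned to the same length, with `gap` marking missing sites.
    """
    if not aligned_patterns:
        return {}
    width = len(aligned_patterns[0])
    if any(len(p) != width for p in aligned_patterns):
        raise ValueError("all aligned patterns must have the same length")
    conserved = {}
    for i, column in enumerate(zip(*aligned_patterns)):
        values = set(column)
        # A column counts as conserved only if it has one value and no gap.
        if len(values) == 1 and gap not in values:
            conserved[i] = column[0]
    return conserved


def column_counts(aligned_patterns):
    """Per-column residue counts, i.e. the site variation at each position."""
    return [Counter(column) for column in zip(*aligned_patterns)]


# Toy cluster invented for illustration; not data from the paper.
cluster = ["CAACH", "CSGCH", "CA-CH", "CKTCH"]
print(conserved_columns(cluster))
print(column_counts(cluster)[1])
```

In this toy cluster the first, fourth, and fifth columns each hold a single residue, while the second and third columns vary; under the abstract's claim, single-valued columns of this kind would be the candidates for binding residues.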

Keywords :

association; biochemistry; bioinformatics; molecular biophysics; molecular configurations; pattern clustering; probability; proteins; AP clusters; PROSITE; aligned pattern clusters; amino acid associations; amino acid residues; binding segments; combinatorial methods; computationally expensive methods; computationally probabilistic methods; cytochrome complex; pFam; protein analysis; protein binding functionality; protein binding segments; protein databases; protein family sequences; sequence patterns; synthesized pattern representation; time-consuming simulation; ubiquitin protein families; Amino acids; Clustering algorithms; Hamming distance; Heuristic algorithms; Probabilistic logic; Protein engineering; Proteins; Aligned Pattern Cluster; Hierarchical Clustering; Pattern; Protein Function Prediction;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on

Conference_Location :

Philadelphia, PA

Print_ISBN :

978-1-4673-2559-2

Electronic_ISBN :

978-1-4673-2558-5

Type :

conf

DOI :

10.1109/BIBM.2012.6392682

Filename :

6392682

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2691242