Title :
A Three-Step Validation Following Genome-Wide Data Mining for Myosin Family Members Improves Search Efficiency
Author :
Syamaladevi, Divya P. ; Kalaimathy, S. ; Pasha, Naseer ; Subramonian, S. ; Sowdhamini, R.
Author_Institution :
Nat. Centre for Biol. Sci. (NCBS-TIFR), Bangalore, India
Abstract :
Profile-based sequence search methods, applied with stringent statistical measures, are widely used to identify homologous proteins. Despite their high sensitivity, these simple searches cannot be relied upon alone when searching for protein families that have high functional diversity and share structural similarity with other families. Myosins are motor proteins that drive cellular mobility and associated functions in eukaryotes. These motors use the chemical energy released by ATP hydrolysis to bring about conformational changes that lead to motor function. The major feature of the protein is a highly conserved head (motor) domain, which is an ATPase, followed by a variable tail that binds to different cargoes. The motor domain evolved from an ancestral P-loop-containing NTPase. A number of other protein families are believed to be related through divergent evolution of an ancestral P-loop NTP-binding motif; hence, sequence searches for the members of one superfamily result in cross-talk with another. We developed a strategic protocol for effective sequence-based searches of such families. This protocol employs Position-Specific Iterated BLAST (PSI-BLAST) followed by a three-way validation: (1) text search scripts, (2) clustering using the neighbour-joining method, and (3) domain architecture definitions. It was applied to myosins from five model genomes as a standard. This protocol can be followed for genome scans of similar protein families whose members are diverse in sequence and share common ancestral structural motifs with other families.
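The three-way validation described in the abstract can be pictured as three independent checks applied to each candidate hit returned by the profile search. The following is a minimal sketch only, not the authors' implementation: the `Hit` record, the keyword `"myosin"`, the domain name `"Myosin_head"`, the flag `clusters_with_known` (standing in for the outcome of an externally built neighbour-joining tree), and the rule that a hit must pass all three checks are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """Hypothetical record for one candidate hit from a profile search."""
    seq_id: str
    description: str            # annotation text of the database entry
    clusters_with_known: bool   # assumed: taken from an external neighbour-joining tree
    domains: list = field(default_factory=list)  # assumed: from a domain annotation tool


def text_check(hit, keywords=("myosin",)):
    """Step 1 (assumed form): keyword search in the annotation text."""
    desc = hit.description.lower()
    return any(k in desc for k in keywords)


def cluster_check(hit):
    """Step 2 (assumed form): hit groups with known family members in the tree."""
    return hit.clusters_with_known


def domain_check(hit, required=("Myosin_head",)):
    """Step 3 (assumed form): required domains are present in the architecture."""
    return all(d in hit.domains for d in required)


def validate(hits):
    """Return the three check results per hit, in the order text, cluster, domain."""
    return {h.seq_id: (text_check(h), cluster_check(h), domain_check(h)) for h in hits}


def accepted(hits):
    """Keep only hits passing all three checks (an assumed combination rule)."""
    report = validate(hits)
    return [h.seq_id for h in hits if all(report[h.seq_id])]


if __name__ == "__main__":
    # Illustrative, made-up records; not data from the paper.
    hits = [
        Hit("seq1", "Myosin heavy chain", True, ["Myosin_head", "IQ"]),
        Hit("seq2", "hypothetical P-loop NTPase", False, ["AAA"]),
    ]
    print(validate(hits))
    print(accepted(hits))
```

Reporting the three results separately, rather than only the final accept/reject decision, makes it easier to see which check flags a cross-family hit.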
Keywords :
bioinformatics; data mining; genomics; proteins; statistical analysis; ATP hydrolysis; NTPase ancestral protein; cellular mobility; domain architecture definitions; homologous proteins; myosin family members; myosins; neighbour joining method; profile based sequence search methods; protein families; stringent statistical measures; text search scripts; three-step validation following genome-wide data mining; Coils; Data mining; Genomics; Humans; Phylogeny; Proteins; Protocols; genome scan; genome wide survey; myosin genome wide data mining; sequence data mining;
Conference_Title :
Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4673-0005-6
DOI :
10.1109/ICDMW.2011.18