DocumentCode :
3603876
Title :
Advancing the Accuracy of Protein Fold Recognition by Utilizing Profiles From Hidden Markov Models
Author :
Lyons, James ; Dehzangi, Abdollah ; Heffernan, Rhys ; Yang, Yuedong ; Zhou, Yaoqi ; Sharma, Alok ; Paliwal, Kuldip
Author_Institution :
Sch. of Eng., Griffith Univ., Brisbane, QLD, Australia
Volume :
14
Issue :
7
fYear :
2015
Firstpage :
761
Lastpage :
772
Abstract :
Protein fold recognition is an important step towards solving protein function and tertiary structure prediction problems. Among the wide range of approaches proposed to solve this problem, pattern recognition-based techniques have achieved the best results. The most effective of these techniques have been based on extracting evolutionary-based features. Most studies have relied on the Position Specific Scoring Matrix (PSSM) to extract these features. However, it is known that profile-profile sequence alignment techniques can identify more remote homologs than sequence-profile approaches like PSIBLAST. In this study, we use a profile-profile sequence alignment technique, namely HHblits, to extract HMM profiles. We show that, unlike previous studies, using the HMM profile to extract evolutionary information can significantly enhance protein fold prediction accuracy. We develop a new pattern recognition-based system called HMMFold, which extracts HMM-based evolutionary information and captures remote homology information better than the approaches used in previous studies. Using HMMFold, we achieve up to 93.8% and 86.0% prediction accuracies when the sequential similarity rates are less than 40% and 25%, respectively. These results are up to 10% better than previously reported results for this task. Our results show significant enhancement, especially for benchmarks with sequential similarity as low as 25%, which highlights the effectiveness of HMMFold in addressing this problem and its superiority over previously proposed approaches found in the literature. HMMFold is available online at: http://sparks-lab.org/pmwiki/download/index.php?Download=HMMFold.tar.bz2.
Keywords :
Markov processes; biological techniques; molecular biophysics; proteins; HHblits; HMM profile; HMM-based evolutionary information; PSIBLAST; evolutionary-based feature; hidden Markov model; pattern recognition-based system; pattern recognition-based technique; position specific scoring matrix; profile-profile sequence alignment technique; protein fold prediction accuracy; protein fold recognition; protein function; remote homology information; sequence-profile approach; sequential similarity rate; tertiary structure prediction; Amino acids; Benchmark testing; Data mining; Feature extraction; Hidden Markov models; Protein sequence; Evolutionary-based features; HMM profile; HMMFold; PSSM profile; protein fold recognition; support vector machine (SVM);
fLanguage :
English
Journal_Title :
NanoBioscience, IEEE Transactions on
Publisher :
IEEE
ISSN :
1536-1241
Type :
jour
DOI :
10.1109/TNB.2015.2457906
Filename :
7163361