DocumentCode
3603876
Title
Advancing the Accuracy of Protein Fold Recognition by Utilizing Profiles From Hidden Markov Models
Author
Lyons, James ; Dehzangi, Abdollah ; Heffernan, Rhys ; Yang, Yuedong ; Zhou, Yaoqi ; Sharma, Alok ; Paliwal, Kuldip
Author_Institution
Sch. of Eng., Griffith Univ., Brisbane, QLD, Australia
Volume
14
Issue
7
fYear
2015
Firstpage
761
Lastpage
772
Abstract
Protein fold recognition is an important step towards solving protein function and tertiary structure prediction problems. Among the wide range of approaches proposed to solve this problem, pattern recognition-based techniques have achieved the best results. The most effective of these techniques have been based on extracting evolutionary-based features, and most studies have relied on the Position Specific Scoring Matrix (PSSM) to extract them. However, it is known that profile-profile sequence alignment techniques can identify more remote homologs than sequence-profile approaches like PSIBLAST. In this study, we use a profile-profile sequence alignment technique, namely HHblits, to extract HMM profiles. We show that, unlike the approaches of previous studies, using the HMM profile to extract evolutionary information can significantly enhance protein fold prediction accuracy. We develop a new pattern recognition-based system called HMMFold, which extracts HMM-based evolutionary information and captures remote homology information better than previous studies. Using HMMFold, we achieve up to 93.8% and 86.0% prediction accuracies when the sequential similarity rates are less than 40% and 25%, respectively. These results are up to 10% better than previously reported results for this task. The enhancement is especially significant for benchmarks with sequential similarity as low as 25%, which highlights the effectiveness of HMMFold in addressing this problem and its superiority over previously proposed approaches found in the literature. HMMFold is available online at: http://sparks-lab.org/pmwiki/download/index.php?Download=HMMFold.tar.bz2.
Keywords
Markov processes; biological techniques; molecular biophysics; proteins; HHblits; HMM profile; HMM-based evolutionary information; PSIBLAST; evolutionary-based feature; hidden Markov model; pattern recognition-based system; pattern recognition-based technique; position specific scoring matrix; profile-profile sequence alignment technique; protein fold prediction accuracy; protein fold recognition; protein function; remote homology information; sequence-profile approach; sequential similarity rate; tertiary structure prediction; Amino acids; Benchmark testing; Data mining; Feature extraction; Hidden Markov models; Protein sequence; Evolutionary-based features; HMM profile; HMMFold; PSSM profile; protein fold recognition; support vector machine (SVM);
fLanguage
English
Journal_Title
NanoBioscience, IEEE Transactions on
Publisher
IEEE
ISSN
1536-1241
Type
jour
DOI
10.1109/TNB.2015.2457906
Filename
7163361