Title :
Text-Based Protein Structure Modeling for Structure Comparison
Author :
Razmara, Jafar ; Deris, Safaai B. ; Parvizpour, Sepideh
Author_Institution :
Fac. of Comput. Sci. & Inf. Syst., Univ. Teknol. Malaysia, Johor Bahru, Malaysia
Abstract :
Comparing protein structures in three dimensions is a vital step in structural biology for predicting and analyzing the function of a new, uncharacterized protein. Several methods have been explored over the past decade, but none of them completely solves the problem. We introduce a novel method that models protein structure as textual sequences, so that proteins can be compared structurally using language modeling techniques. The comparison method is rooted in computational linguistics and its techniques for quantifying and comparing strings of characters. The protein structure is represented as three sequences of characters, and an n-gram modeling technique is then applied to capture regularities in their content. These regularities are contrasted using cross-entropy, which yields a measure of structural similarity between two proteins. To find an overlap between two protein structures in 3D space, a superposition step is also applied. To validate the method, experiments were performed on a collection of protein data sets. The results indicate the usefulness and applicability of the new approach and motivate further work on tools based on computational linguistics methods.
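The n-gram and cross-entropy step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the structure-to-text encoding, the n-gram order, or the smoothing scheme, so everything here is an assumption for illustration: the example strings use a hypothetical H/E/C alphabet, the order is n = 2, the smoothing is additive (alpha = 1), and the function names `ngram_counts` and `cross_entropy` are invented. It is not the authors' implementation.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count all overlapping character n-grams in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def cross_entropy(p_seq, q_seq, n=2, alpha=1.0):
    """Cross-entropy, in bits per n-gram, of p_seq under a model built from q_seq.

    Assumption: additive smoothing with constant alpha over the union of
    n-grams seen in either sequence, so unseen n-grams get nonzero mass.
    """
    p = ngram_counts(p_seq, n)
    q = ngram_counts(q_seq, n)
    vocab = set(p) | set(q)
    p_total = sum(p.values())
    q_total = sum(q.values()) + alpha * len(vocab)
    h = 0.0
    for gram, count in p.items():
        p_prob = count / p_total               # empirical probability in p_seq
        q_prob = (q[gram] + alpha) / q_total   # smoothed probability from q_seq
        h -= p_prob * math.log2(q_prob)
    return h


# Hypothetical encoded structures (illustrative strings, not real data).
a = "HHHHEEEE"
b = "HHHEEEEE"
c = "CCCCCCCC"

print(cross_entropy(a, b))  # lower value: a and b share most bigrams
print(cross_entropy(a, c))  # higher value: a and c share no bigrams
```

A lower cross-entropy means the two sequences have more similar n-gram regularities. Since the method uses three sequences per protein, one plausible extension (again an assumption) would compute this value per sequence and combine the three scores.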
Keywords :
biology computing; computational linguistics; cross-entropy concept; language modeling techniques; n-gram modeling technique; protein structure comparison; structural biology; text-based protein structure modeling; textual sequence; Algorithm design and analysis; Amino acids; Computational linguistics; Computer science; Information systems; Natural languages; Pattern recognition; Predictive models; Proteins; Sequences; cross-entropy; n-gram modeling; protein structure comparison;
Conference_Title :
Soft Computing and Pattern Recognition, 2009. SOCPAR '09. International Conference of
Conference_Location :
Malacca
Print_ISBN :
978-1-4244-5330-6
Electronic_ISBN :
978-0-7695-3879-2
DOI :
10.1109/SoCPaR.2009.100