Title :
Molecular information theory: from clinical applications to molecular machine efficiency
Author :
Schneider, Thomas D.
Abstract :
Information theory was introduced by Claude Shannon in 1948 to precisely characterize data flows in communications systems. The same mathematics can also be fruitfully applied to problems in molecular biology. We start with the problem of understanding how proteins interact with DNA at specific sequences called binding sites. Information theory allows us to build an average picture of a set of binding sites, which can be shown with a computer graphic called a sequence logo (http://www.lecb.nciferf.gov/~toms/glossary.html#sequence_logo). Sequence logos show how strongly each part of a binding site is conserved, on a scale in bits of information. They have been used to study a variety of genetic control systems. More recently, the same mathematics has been used to look at individual binding sites using another computer graphic called a sequence walker (http://www.lecb.nciferf.gov/~toms/glossary.html#sequence_walker). Sequence walkers are being used to predict whether changes in human genes cause mutations or are neutral polymorphisms. It may soon be possible to predict the degree of colon cancer by this method. Information theory can also be used to understand the relationship between the binding energy dissipated when two molecules stick together and the amount of sequence conservation of the molecules, measured in bits. Using the Second Law of Thermodynamics, this relationship can be expressed as the efficiency of the molecular interaction. Surprisingly, many molecular systems, including genetic systems, visual pigments and motility proteins, have efficiencies near 70%. A purely geometrical explanation of this result shows that although biological systems are selected to have the highest efficiency, that efficiency is restricted to 70% because having precisely distinguishable molecular states is more important.
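The two quantities described in the abstract, conservation in bits and efficiency relative to the Second Law bound, can be illustrated with a minimal Python sketch. This is not the author's software. The function names, the toy aligned sites, the default temperature of 310 K, and the omission of any small-sample correction are all assumptions made for illustration. The efficiency is taken here as the measured information divided by the maximum number of bits the dissipated energy could pay for at k_B T ln 2 joules per bit, which is one common reading of the relationship the abstract describes.

```python
import math
from collections import Counter

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def position_information(sites, alphabet="ACGT"):
    """Information in bits at each position of equal-length aligned sites.

    Computes log2(len(alphabet)) minus the Shannon entropy of the observed
    symbol frequencies at each position. No small-sample correction is
    applied (an illustrative simplification).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    info = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        total = sum(counts.values())
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
        info.append(max_bits - entropy)
    return info


def efficiency(info_bits, energy_dissipated_joules, temperature_kelvin=310.0):
    """Ratio of information gained to the Second Law maximum.

    The Second Law requires at least k_B * T * ln(2) joules to be dissipated
    per bit gained, so the dissipated energy buys at most
    energy / (k_B * T * ln 2) bits. The 310 K default is an assumption.
    """
    min_energy_per_bit = BOLTZMANN * temperature_kelvin * math.log(2)
    max_bits = energy_dissipated_joules / min_energy_per_bit
    return info_bits / max_bits


# Hypothetical aligned binding sites, invented for this example.
toy_sites = ["ACGT", "ACGA", "ACGT", "ACGC"]
bits_per_position = position_information(toy_sites)
total_bits = sum(bits_per_position)
```

Summing the per-position values gives the total conservation of the toy site set in bits. Passing that total and a measured dissipated energy per binding event to `efficiency` yields a value between 0 and 1 that can be compared against the roughly 70% figure reported in the abstract.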
Keywords :
DNA; binding energy; biocomputing; biological techniques; cancer; computer graphics; genetics; information theory; macromolecules; molecular biophysics; proteins; thermodynamics; tumours; DNA sequences; binding energy dissipation; binding sites; biological systems; clinical applications; colon cancer; communications systems; computer graphic; data flows characterization; genetic control systems; human genes; molecular information theory; molecular interaction; molecular machine efficiency; motility proteins; mutations; neutral polymorphisms; proteins; second law of thermodynamics; sequence logos; sequence walker; visual pigments; Communication systems; Computer graphics; Control systems; DNA; Genetic communication; Information theory; Mathematics; Proteins; Sequences; Terminology;
Conference_Title :
Engineering in Medicine and Biology Society, 2003. Proceedings of the 25th Annual International Conference of the IEEE
Print_ISBN :
0-7803-7789-3
DOI :
10.1109/IEMBS.2003.1281006