• DocumentCode
    3408707
  • Title
    Reasoning about molecular similarity and properties
  • Author
    Singh, Rahul
  • Author_Institution
    San Francisco State Univ., CA, USA
  • fYear
    2004
  • fDate
    2004
  • Firstpage
    266
  • Lastpage
    277
  • Abstract
    Ascertaining the similarity amongst molecules is a fundamental problem in biology and drug discovery. Since similar molecules tend to have similar biological properties, the notion of molecular similarity plays an important role in exploration of molecular structural space, query-retrieval in molecular databases, and in structure-activity modeling. This problem is related to the issue of molecular representation. Currently, approaches with high descriptive power like 3D surface-based representations are available. However, most techniques tend to focus on 2D graph-based molecular similarity due to the complexity that accompanies reasoning with more elaborate representations. This paper addresses the problem of determining similarity when molecules are described using complex surface-based representations. It proposes an intrinsic, spherical representation that systematically maps points on a molecular surface to points on a standard coordinate system (a sphere). Molecular geometry, molecular fields, and effects due to field super-positioning can then be captured as distributions on the surface of the sphere. Molecular similarity is obtained by computing the similarity of the corresponding property distributions using a novel formulation of histogram-intersection. This method is robust to noise, obviates molecular pose-optimization, can incorporate conformational variations, and facilitates highly efficient determination of similarity. Retrieval performance, applications in structure-activity modeling of complex biological properties, and comparisons with existing research and commercial methods demonstrate the validity and effectiveness of the approach.
  • Keywords
    biology computing; molecular biophysics; physiological models; query processing; complex surface-based representations; conformational variations; field superpositioning; histogram-intersection; molecular databases; molecular fields; molecular geometry; molecular pose-optimization; molecular properties; molecular representation; molecular similarity; molecular structural space; query-retrieval; structure-activity modeling; Biological system modeling; Biology computing; Chemistry; Computer science; Databases; Drugs; Geometry; Pharmaceuticals; Proteins; Research and development
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type
    conf
  • DOI
    10.1109/CSB.2004.1332440
  • Filename
    1332440