• DocumentCode
    327529
  • Title

    Mining the organic compound jungle - a functional programming approach

  • Author

    Burn-Thornton, K.E.

  • Author_Institution
    Network Res. Group, Plymouth Univ., UK
  • fYear
    1998
  • fDate
    35923
  • Firstpage
    42583
  • Lastpage
    42586
  • Abstract
    Pharmaceutical companies are continually striving to determine the common key characteristics of compounds that determine their functionality (e.g. relief of asthma) so that they may continue to provide safe and effective medicines. Historically this has been carried out by visually comparing graphical representations of the structures of compounds which possess the same functionality, so that key substructures (pharmacophores) may be determined. However, with the advent of high-throughput screening techniques providing data on enormous numbers of compounds, this has become impractical. (A human can only compare a certain number of patterns accurately in a day.) A potential solution to this problem may appear to come from a knowledge-based systems approach based upon pattern matching. However, we suggest that the solution does lie with a knowledge-based systems approach, but one which relies on data mining as the underpinning technology. We discuss our initial work, which shows that organic compounds possessing the same functionality may be mined for common substructures using data mining techniques. We also discuss how our prototype tool, in which a selection of data mining algorithms may be chosen, has been developed in a functional programming language. The functional language Gofer, which was used to rapidly prototype the tool, readily lends itself to the task due to its polymorphism and lazy evaluation. The lazy evaluation of Gofer is a particularly useful feature of the language which readily enables the common characteristics to be determined no matter how large the compounds
  • Keywords
    pharmaceutical industry; Gofer; common substructures; data mining algorithms; data mining techniques; functional programming approach; functional programming language; graphical representations; high throughput screening techniques; key substructures; knowledge based systems approach; lazy evaluation; organic compounds; pattern matching; pharmaceutical companies; pharmacophores; polymorphism; prototype tool; rapid prototyping
  • fLanguage
    English
  • Publisher
    IET
  • Conference_Titel
    Knowledge Discovery and Data Mining (1998/434), IEE Colloquium on
  • Conference_Location
    London
  • Type

    conf

  • DOI
    10.1049/ic:19980648
  • Filename
    710063