• DocumentCode
    3036374
  • Title

    Knowledge discovery in databases: applications in the electrical power engineering domain

  • Author

    Steele, J.A. ; McDonald, J.R. ; Arcy, C.D.

  • Author_Institution
    Strathclyde Univ., Glasgow, UK
  • fYear
    1997
  • fDate
    3 Dec 1997
  • Firstpage
    8/1
  • Lastpage
    8/4
  • Abstract
    Knowledge discovery in databases (KDD) is defined as the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (W.J. Frawley et al., 1991). KDD is an iterative process of five steps that lead to the final goal of useful information. The five steps are: selection of data (determining which fields and records are to be analysed); preprocessing (cleaning the data by removing noise and outliers, if appropriate, and deciding on strategies for missing attribute values); transformation (representing the data by new features and reducing its dimensionality); data mining (deciding which algorithms to apply to the data, e.g., classification, regression, rule induction, neural networks); and interpretation/evaluation (feasibility analysis of the results from the data mining step). There are two general 'goals' in KDD: verification of a hypothesis; and discovery, where the 'system' autonomously discovers patterns. Within the KDD process a data warehouse is typically employed as the 'source' of the KDD exercise. The power industry has become dependent upon computerised environments, with more online data being stored for later extraction and investigation. Two key areas where KDD has been shown to be applicable are the analysis of energy pooling and settlement data, and condition monitoring of power system plant.
  • Keywords
    knowledge acquisition; KDD; computerised environments; condition monitoring; data mining; data warehouse; electrical power engineering domain; energy pooling; feasibility analysis; hypothesis verification; iterative process; knowledge discovery in databases; missing attribute values; online data storage; power industry; power system plant; preprocessing; rule induction; settlement data; understandable patterns
  • fLanguage
    English
  • Publisher
    IET
  • Conference_Titel
    IT Strategies for Information Overload (Digest No: 1997/340), IEE Colloquium on
  • Conference_Location
    London
  • Type

    conf

  • DOI
    10.1049/ic:19971153
  • Filename
    659910
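
The five KDD steps named in the abstract (selection, preprocessing, transformation, data mining, interpretation/evaluation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the condition-monitoring records, the field names (`temp_c`, `load_mw`, `fault`), the plausibility limits, and the single-threshold rule, which is a deliberately simple stand-in for the rule induction the abstract mentions.

```python
from statistics import median

# Hypothetical condition-monitoring records, invented for illustration only.
RECORDS = [
    {"unit": "T1", "temp_c": 61.0, "load_mw": 40.0, "fault": 0},
    {"unit": "T2", "temp_c": 95.0, "load_mw": 42.0, "fault": 1},
    {"unit": "T3", "temp_c": None, "load_mw": 38.0, "fault": 0},   # missing value
    {"unit": "T4", "temp_c": 88.0, "load_mw": 35.0, "fault": 1},
    {"unit": "T5", "temp_c": 58.0, "load_mw": 41.0, "fault": 0},
    {"unit": "T6", "temp_c": 900.0, "load_mw": 39.0, "fault": 0},  # implausible reading
]


def select(records, fields):
    """Step 1, selection: keep only the fields to be analysed."""
    return [{f: r[f] for f in fields} for r in records]


def preprocess(rows, lo=0.0, hi=200.0):
    """Step 2, preprocessing: drop outliers, then fill missing temperatures.

    The [lo, hi] plausibility range and median imputation are assumptions.
    """
    kept = [r for r in rows if r["temp_c"] is None or lo <= r["temp_c"] <= hi]
    fill = median(r["temp_c"] for r in kept if r["temp_c"] is not None)
    return [dict(r, temp_c=fill if r["temp_c"] is None else r["temp_c"]) for r in kept]


def transform(rows):
    """Step 3, transformation: replace two raw fields with one derived feature."""
    return [{"ratio": r["temp_c"] / r["load_mw"], "fault": r["fault"]} for r in rows]


def mine(rows):
    """Step 4, data mining: pick the threshold on `ratio` that best separates faults."""
    values = sorted(r["ratio"] for r in rows)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def accuracy(threshold):
        hits = sum((r["ratio"] > threshold) == bool(r["fault"]) for r in rows)
        return hits / len(rows)

    best = max(candidates, key=accuracy)
    return best, accuracy(best)


def interpret(threshold, acc):
    """Step 5, interpretation/evaluation: state the rule in readable form."""
    return f"IF temp_c / load_mw > {threshold:.2f} THEN fault (training accuracy {acc:.0%})"


def run_kdd(records):
    """Run the five steps once; in practice the process is iterative."""
    rows = select(records, ["temp_c", "load_mw", "fault"])
    rows = preprocess(rows)
    rows = transform(rows)
    return mine(rows)


if __name__ == "__main__":
    print(interpret(*run_kdd(RECORDS)))
```

The sketch runs the steps once in sequence, whereas the abstract describes KDD as iterative: a poor result at the interpretation step would normally send the analyst back to an earlier step, for example to select different fields or choose another algorithm.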