• DocumentCode
    740513
  • Title

    Using Copulas in Data Mining Based on the Observational Calculus

  • Author

    Holena, Martin ; Bajer, Lukas ; Scavnicky, Martin

  • Author_Institution
    Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, Prague, Czech Republic
  • Volume
    27
  • Issue
    10
  • fYear
    2015
  • Firstpage
    2851
  • Lastpage
    2864
  • Abstract
    The objective of the paper is a contribution to data mining within the framework of the observational calculus, through introducing generalized quantifiers related to copulas. Fitting copulas to multidimensional data is an increasingly important method for analyzing dependencies, and the proposed quantifiers of observational calculus assess the results of estimating the structure of joint distributions of continuous variables by means of hierarchical Archimedean copulas. To this end, the existing theory of hierarchical Archimedean copulas has been slightly extended in the paper: It has been proven that sufficient conditions for the function defining a hierarchical Archimedean copula to be indeed a copula, which have so far been rigorously established only for the special case of fully nested Archimedean copulas, hold in general. These conditions allow us to define three new generalized quantifiers, which are then thoroughly validated on four benchmark data sets and one data set from a real-world application. The paper concludes by comparing the proposed quantifiers to a more traditional approach: maximum weight spanning trees.
  • Keywords
    Calculus; Data mining; Estimation; Generators; Joints; Labeling; Random variables; copulas; generalized quantifiers; hierarchical; hierarchical Archimedean copulas; joint probability distribution; observational calculus
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2015.2426705
  • Filename
    7095574