DocumentCode :
740513
Title :
Using Copulas in Data Mining Based on the Observational Calculus
Author :
Holena, Martin ; Bajer, Lukas ; Scavnicky, Martin
Author_Institution :
Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, Prague, Czech Republic
Volume :
27
Issue :
10
fYear :
2015
Firstpage :
2851
Lastpage :
2864
Abstract :
The objective of the paper is a contribution to data mining within the framework of the observational calculus, through introducing generalized quantifiers related to copulas. Fitting copulas to multidimensional data is an increasingly important method for analyzing dependencies, and the proposed quantifiers of observational calculus assess the results of estimating the structure of joint distributions of continuous variables by means of hierarchical Archimedean copulas. To this end, the existing theory of hierarchical Archimedean copulas has been slightly extended in the paper: it has been proven that sufficient conditions for the function defining a hierarchical Archimedean copula to be indeed a copula, which have so far been rigorously established only for the special case of fully nested Archimedean copulas, hold in general. These conditions allow us to define three new generalized quantifiers, which are then thoroughly validated on four benchmark data sets and one data set from a real-world application. The paper concludes by comparing the proposed quantifiers to a more traditional approach, maximum weight spanning trees.
Keywords :
Calculus; Data mining; Estimation; Generators; Joints; Labeling; Random variables; copulas; generalized quantifiers; hierarchical Archimedean copulas; joint probability distribution; observational calculus
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2015.2426705
Filename :
7095574