Title :
Statistical Inference of Rough Set Dependence and Importance Analysis
Author :
Dan Hu ; Xianchuan Yu
Author_Institution :
Coll. of Inf. Sci. & Technol., Beijing Normal Univ., Beijing, China
Abstract :
Statistical inference about the dependence degree (DD) and importance degree (ID) of variables in an information system is crucial for appraising variables and reconstructing models. However, in rough set data analysis (RSDA), the literature is limited to validating independence or testing whether the degree is significantly large, while fixed value tests and interval estimation for the related measurements have been ignored. Because these issues have not been addressed, we can neither determine whether the data support expert opinions nor compare features in depth. To make statistical inference for DD and ID in RSDA more complete, this paper presents fixed value tests and interval estimations of DD and ID. With the multinomial distribution as the carrier for statistical information in the databases, the fixed value test of DD is transformed into a restricted estimation of a multinomial distribution and a goodness-of-fit test for distributions. The fixed value test and interval estimation algorithms for DD and ID are then presented in detail and illustrated with examples. Explicit expressions for the DD and ID interval estimation, DD confidence curves, and the limit theory for DD and ID are shown. Furthermore, the effectiveness and discrimination of the proposed algorithms are validated using the Car evaluation, Tic-Tac-Toe endgame, and Fisher's Iris databases.
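For orientation, the two quantities the abstract reasons about can be sketched as follows. This is a minimal illustration that assumes the classical Pawlak definitions: DD is the fraction of objects in the positive region, and ID of an attribute is the drop in DD when that attribute is removed. The paper's exact definitions may differ, and its fixed value tests and interval estimations are not reproduced here. The function names and the toy table are hypothetical.

```python
from collections import defaultdict


def dependence_degree(rows, cond, dec):
    """Assumed Pawlak dependence degree: |POS_cond(dec)| / |U|."""
    # Group objects into equivalence classes by their condition-attribute
    # values, recording the decision values and object count of each class.
    classes = defaultdict(lambda: [set(), 0])
    for row in rows:
        key = tuple(row[a] for a in cond)
        classes[key][0].add(tuple(row[d] for d in dec))
        classes[key][1] += 1
    # A class lies in the positive region when all its objects share
    # a single decision value.
    positive = sum(n for decisions, n in classes.values() if len(decisions) == 1)
    return positive / len(rows)


def importance_degree(rows, cond, dec, attr):
    """Assumed importance degree: drop in DD when attr is removed from cond."""
    reduced = [a for a in cond if a != attr]
    return dependence_degree(rows, cond, dec) - dependence_degree(rows, reduced, dec)


# Hypothetical toy decision table: condition attributes a, b; decision d.
TABLE = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
]

if __name__ == "__main__":
    print(dependence_degree(TABLE, ["a", "b"], ["d"]))
    print(importance_degree(TABLE, ["a", "b"], ["d"], "a"))
    print(importance_degree(TABLE, ["a", "b"], ["d"], "b"))
```

These are point values computed from one sample. The inference problem described in the abstract is to test such a value against a fixed hypothesized degree, or to place a confidence interval around it.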
Keywords :
data analysis; estimation theory; inference mechanisms; rough set theory; statistical distributions; statistical testing; visual databases; Car evaluation; DD; Fisher's Iris databases; ID; RSDA; Tic-Tac-Toe endgame; dependence degree; fixed value tests; importance analysis; importance degree; information system; interval estimation algorithms; limit theory; model reconstruction; restricted multinomial distribution estimation; rough set data analysis; rough set dependence; statistical inference; statistical information; variable appraisement; Databases; Equations; Maximum likelihood estimation; Sociology; Testing; Dependence degree; fixed value test; importance degree; interval estimation; statistical hypothesis testing
Journal_Title :
Fuzzy Systems, IEEE Transactions on
DOI :
10.1109/TFUZZ.2013.2242474