Title :
Mutual information of contingency tables and related inequalities
Author :
Harremoës, Peter
Author_Institution :
Copenhagen Business College, Copenhagen, Denmark
Date :
June 29 - July 4, 2014
Abstract :
For testing independence it is popular to use either the χ²-statistic or the G²-statistic (mutual information). Asymptotically both are χ²-distributed, so an obvious question is which of the two statistics has a distribution closer to the χ²-distribution. Surprisingly, the distribution of mutual information is much better approximated by a χ²-distribution than the χ²-statistic is. For technical reasons we focus on the simplest case, with one degree of freedom. We introduce the signed log-likelihood and demonstrate that its distribution function can be related to the distribution function of a standard Gaussian by inequalities. For the hypergeometric distribution we formulate a general conjecture about how close the signed log-likelihood is to a standard Gaussian; this conjecture gives much more accurate estimates of the tail probabilities of this type of distribution than previously published results. The conjecture has been verified numerically in all cases relevant for testing independence, and further evidence of its validity is given.
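As a rough illustration of the two test statistics compared in the abstract, the sketch below computes the Pearson χ²-statistic, the G²-statistic (twice the sum of observed counts times the log of observed over expected, i.e. 2n times the empirical mutual information in nats), and the signed log-likelihood for a 2×2 contingency table. This is a minimal sketch, not the paper's own code: the function name independence_stats and the example counts are hypothetical, and the sign convention for the one-degree-of-freedom case is taken from the deviation of the first cell from its expected count under independence.

import math

def independence_stats(table):
    # table is a 2x2 list of observed counts, e.g. [[a, b], [c, d]]
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0   # Pearson chi-square statistic
    g2 = 0.0     # G^2 = 2 * n * (empirical mutual information in nats)
    for i in range(2):
        for j in range(2):
            o = table[i][j]
            e = row_totals[i] * col_totals[j] / n  # expected count under independence
            chi2 += (o - e) ** 2 / e
            if o > 0:
                g2 += 2.0 * o * math.log(o / e)
    # Signed log-likelihood (one degree of freedom): square root of G^2 carrying
    # the sign of the deviation in cell (1,1); under independence it should be
    # approximately standard Gaussian.
    sign = 1.0 if a * d - b * c >= 0 else -1.0
    return chi2, g2, sign * math.sqrt(max(g2, 0.0))

chi2, g2, signed_ll = independence_stats([[12, 5], [7, 11]])  # hypothetical counts
print(chi2, g2, signed_ll)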
Keywords :
Gaussian distribution; Poisson distribution; binomial distribution; gamma distribution; χ²-distribution; χ²-statistic; G²-statistic; contingency tables; distribution function; hypergeometric distribution; mutual information; signed log-likelihood; standard Gaussian distribution; Approximation methods; Gaussian processes; Random variables; Standards; Testing
Conference_Titel :
2014 IEEE International Symposium on Information Theory (ISIT)
Conference_Location :
Honolulu, HI
DOI :
10.1109/ISIT.2014.6875279