Title :
Specificity: A Graph-Based Estimator of Divergence
Author :
Twining, Carole J. ; Taylor, Christopher J.
Author_Institution :
Imaging Sci. Res. Group, Univ. of Manchester, Manchester, UK
Abstract :
In statistical modeling, various techniques are used to build models from training data. Comparing modeling techniques quantitatively requires a method for evaluating the quality of the fit between the model probability density function (pdf) and the training data. One graph-based measure that has been used for this purpose is the specificity. We consider the large-numbers limit of the specificity and derive expressions showing that it can be considered an estimator of the divergence between the unknown pdf from which the training data were drawn and the model pdf built from the training data. Experiments on artificial data show that these large-number limiting relations give good quantitative and qualitative predictions of the behavior of the measured specificity, even for small numbers of training examples and in some extreme cases. We demonstrate that specificity can provide a more sensitive measure of the difference between modeling methods than some previous graph-based techniques. Key points are illustrated using real data sets. We thus establish a proper theoretical basis for the previously ad hoc concept of specificity and obtain useful insights into its application in the analysis of real data.
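As a rough illustration of the kind of measure the abstract describes, the sketch below computes a specificity-style score as the average Euclidean distance from each sample generated by a model to its nearest training example. This is an assumption about the usual nearest-neighbor formulation, not the paper's exact definition; the function name `specificity`, the exponent parameter `gamma`, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def specificity(model_samples, training_data, gamma=1.0):
    """Mean nearest-neighbour distance (raised to `gamma`) from each
    model-generated sample to the training set.

    `gamma` is an illustrative parameter, not taken from the abstract.
    """
    if not model_samples or not training_data:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for sample in model_samples:
        # Euclidean distance from this sample to its closest training point.
        nearest = min(math.dist(sample, t) for t in training_data)
        total += nearest ** gamma
    return total / len(model_samples)


# Toy data (made up): samples near the training points should score
# lower than samples far from them.
training = [(0.0, 0.0), (1.0, 0.0)]
close_samples = [(0.1, 0.0), (0.9, 0.0)]
far_samples = [(5.0, 0.0), (6.0, 0.0)]

print(specificity(close_samples, training) < specificity(far_samples, training))
```

A lower score means the model's samples lie closer to the training data. The brute-force search costs one distance evaluation per sample and training point pair; for large data sets a spatial index would be the more practical choice.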
Keywords :
data analysis; graph theory; modelling; probability; statistical analysis; artificial data; graph-based divergence estimator; graph-based measures; model probability density function; modeling technique; real data analysis; specificity limit; statistical modeling; training data; Data models; Entropy; Euclidean distance; Nearest neighbor searches; Statistical analysis; Training data; Kullback-Leibler divergence; Specificity; assessment of modeling; cross entropy; entropy estimation; estimation of divergence; estimation of statistical distance; generalization; graph-based estimators; nearest-neighbor estimators
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
DOI :
10.1109/TPAMI.2011.90