Title :
Jeffreys Centroids: A Closed-Form Expression for Positive Histograms and a Guaranteed Tight Approximation for Frequency Histograms
Author_Institution :
Sony Comput. Sci. Labs., Inc., Tokyo, Japan
Abstract :
Due to the success of the bag-of-words modeling paradigm, clustering histograms has become an important ingredient of modern information processing. Histograms can be clustered using the celebrated centroid-based k-means algorithm. Applications usually call for symmetric distances. In this letter, we consider the Jeffreys divergence, which symmetrizes the Kullback-Leibler divergence, and investigate the computation of Jeffreys centroids. We first prove that, for positive histograms, the Jeffreys centroid can be expressed analytically using the Lambert W function. We then show how to obtain a fast guaranteed approximation when dealing with frequency histograms. Finally, we conclude with some remarks on k-means histogram clustering.
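The closed form mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the formula, so what follows is an assumption: it is obtained by setting the per-bin derivative of the weighted Jeffreys objective, with J(p, q) = sum_i (p_i - q_i) log(p_i / q_i), to zero. This gives c_i = a_i / W(e * a_i / g_i), where a_i and g_i are the weighted arithmetic and geometric means of bin i. The function names are hypothetical, the weights are assumed to sum to one, and the Lambert W function is evaluated by a plain Newton iteration standing in for a library routine. Only positive (unnormalized) histograms are covered; the result is generally not normalized, so the frequency-histogram case is not addressed here.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch W0(z) for z >= 0: solves w * exp(w) = z by Newton's method."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting guess at or above the root, so Newton descends to it
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def jeffreys_divergence(p, q):
    """Jeffreys divergence between two positive histograms of equal length."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def jeffreys_centroid_positive(histograms, weights=None):
    """Per-bin closed form c_i = a_i / W0(e * a_i / g_i) (assumed derivation, see lead-in)."""
    n = len(histograms)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights, summing to one
    dim = len(histograms[0])
    centroid = []
    for i in range(dim):
        # Weighted arithmetic mean of bin i.
        a = sum(w * h[i] for w, h in zip(weights, histograms))
        # Weighted geometric mean of bin i, computed in the log domain.
        g = math.exp(sum(w * math.log(h[i]) for w, h in zip(weights, histograms)))
        centroid.append(a / lambert_w0(math.e * a / g))
    return centroid


if __name__ == "__main__":
    hists = [[1.0, 4.0, 2.0], [3.0, 1.0, 2.0]]
    print(jeffreys_centroid_positive(hists))
```

In this sketch each centroid bin lies between the geometric and arithmetic means of that bin, and a bin whose value is identical across all histograms is returned unchanged because W(e) = 1.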
Keywords :
approximation theory; document handling; pattern classification; pattern clustering; Jeffreys centroids; Kullback-Leibler divergence; Lambert W function; bag-of-word modeling paradigm; closed-form expression; document classification; frequency histograms; guaranteed tight approximation; histograms clustering; k-means centroid-based algorithm; k-means histogram clustering; positive histograms; Approximation algorithms; Approximation methods; Clustering algorithms; Databases; Histograms; Signal processing algorithms; Visualization; Centroid; Jeffreys divergence; Kullback–Leibler divergence; Lambert W function; clustering; histogram
Journal_Title :
Signal Processing Letters, IEEE
DOI :
10.1109/LSP.2013.2260538