Computational Information Geometry Wonderland : Tags : Chernoff, Bregman, Jensen, Exponential families

## May 18, 2011

### Renyi and Tsallis entropies and divergences for exponential families

It is well known that the Kullback-Leibler divergence between two densities belonging to the same exponential family can be computed equivalently as a Bregman divergence on the natural parameters [for the log-normalizer].
So what happens if we consider generalizations of the Kullback-Leibler divergence? The KL divergence is based on Shannon entropy, so let us look at two generalizations of Shannon entropy: Renyi [which preserves additivity] and Tsallis [which is non-extensive]. It turns out that the relative Renyi/Tsallis entropies of densities belonging to the same exponential family admit simple closed-form expressions. Moreover, when we consider only the Tsallis/Renyi entropies themselves, we end up with a simple formula that is in closed form for families with a standard carrier measure. We illustrate these results by computing the Tsallis/Renyi entropies and divergences for multivariate normals.
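To make the statements above concrete, here is a minimal Python sketch. It uses the Poisson family rather than multivariate normals, purely to keep the code short; that choice, and all function names, are my own assumptions for illustration. With natural parameter $\theta = \log\lambda$ and log-normalizer $F(\theta) = e^{\theta}$, the sketch checks that the KL divergence equals a Bregman divergence with swapped arguments, and evaluates the Renyi and Tsallis divergences through the skew Jensen divergence $J_{F,\alpha}(\theta_1:\theta_2) = \alpha F(\theta_1) + (1-\alpha) F(\theta_2) - F(\alpha\theta_1 + (1-\alpha)\theta_2)$, assuming the usual definitions $R_\alpha = \frac{1}{\alpha-1}\log\int p^\alpha q^{1-\alpha}$ and $T_\alpha = \frac{1}{\alpha-1}\left(\int p^\alpha q^{1-\alpha} - 1\right)$.

```python
import math

# Poisson family in natural coordinates (illustrative choice):
# theta = log(lambda), log-normalizer F(theta) = exp(theta).
def F(theta):
    return math.exp(theta)

def grad_F(theta):
    return math.exp(theta)

def bregman(theta_a, theta_b):
    """B_F(theta_a : theta_b) = F(a) - F(b) - (a - b) * F'(b)."""
    return F(theta_a) - F(theta_b) - (theta_a - theta_b) * grad_F(theta_b)

def kl_poisson(lam1, lam2):
    """Textbook KL(Poisson(lam1) : Poisson(lam2)), used as a reference."""
    return lam1 * math.log(lam1 / lam2) - lam1 + lam2

def skew_jensen(theta1, theta2, alpha):
    """alpha*F(t1) + (1-alpha)*F(t2) - F(alpha*t1 + (1-alpha)*t2)."""
    mix = alpha * theta1 + (1.0 - alpha) * theta2
    return alpha * F(theta1) + (1.0 - alpha) * F(theta2) - F(mix)

def renyi_divergence(theta1, theta2, alpha):
    """Closed form: the integral of p^alpha q^(1-alpha) equals exp(-J)."""
    return skew_jensen(theta1, theta2, alpha) / (1.0 - alpha)

def tsallis_divergence(theta1, theta2, alpha):
    return (1.0 - math.exp(-skew_jensen(theta1, theta2, alpha))) / (1.0 - alpha)

def affinity_bruteforce(lam1, lam2, alpha, terms=200):
    """Direct summation of sum_x p(x)^alpha q(x)^(1-alpha), for checking."""
    total = 0.0
    for x in range(terms):
        log_p = -lam1 + x * math.log(lam1) - math.lgamma(x + 1)
        log_q = -lam2 + x * math.log(lam2) - math.lgamma(x + 1)
        total += math.exp(alpha * log_p + (1.0 - alpha) * log_q)
    return total

if __name__ == "__main__":
    lam1, lam2, alpha = 3.0, 5.0, 0.3
    t1, t2 = math.log(lam1), math.log(lam2)
    # Note the swapped argument order: KL(p1 : p2) = B_F(theta2 : theta1).
    print("KL direct :", kl_poisson(lam1, lam2))
    print("KL Bregman:", bregman(t2, t1))
    print("Renyi     :", renyi_divergence(t1, t2, alpha))
    print("Tsallis   :", tsallis_divergence(t1, t2, alpha))
```

The brute-force summation is only there as a sanity check against the closed forms. For another exponential family, only `F` and `grad_F` would need to change (with vector-valued parameters and inner products in the multivariate case).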

The technical details can be found here.

Frank.

## Feb 15, 2011

### Chernoff Information, Jensen divergence, Bregman divergence, Exponential families

A few years ago, I was intrigued by a very nice ISIT 1995 paper by Prof. M. Basseville and J.-F. Cardoso:

M. Basseville and J.-F. Cardoso, On entropies, divergences and mean values. IEEE International Symposium on Information Theory (ISIT'95), p. 330, 1995.

At that time, ISIT papers were only one page long.

The paper makes this interesting statement: "Still in this geometrical vein, the interplay between CH; JH and DH can be understood via Thales theorem."

So I asked Prof. Basseville for explanations. Prof. Basseville kindly replied with the ISIT slides. This was a great source of inspiration for the following work on Chernoff information for exponential families.

Chernoff information provides an upper bound on the probability of error of the optimal Bayesian decision rule for two-class classification problems. However, it turns out that in practice the Chernoff bound is hard to calculate or even approximate. In statistics, many common distributions, such as Gaussians, Poissons or frequency histograms (multinomials), can be handled in the unified framework of exponential families. In this note, we prove that the Chernoff information for members of the same exponential family can be either derived analytically in closed form, or efficiently approximated using a simple geodesic bisection optimization technique based on an exact geometric characterization of the "Chernoff point" on the underlying statistical manifold.
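As an illustration of the bisection idea, here is a minimal Python sketch, not the implementation from the note. It assumes that the Chernoff point $\theta^*$ lies on the segment joining the two natural parameters and is characterized by $B_F(\theta_1:\theta^*) = B_F(\theta_2:\theta^*)$, with that common value being the Chernoff information. The Poisson family (log-normalizer $F(\theta) = e^{\theta}$, $\theta = \log\lambda$) and all function names are my own choices for the example. Along the segment, the gap $B_F(\theta_1:\theta) - B_F(\theta_2:\theta)$ increases monotonically because $F$ is strictly convex, so plain bisection on the interpolation weight is enough.

```python
import math

# Poisson family in natural coordinates (illustrative choice):
# theta = log(lambda), log-normalizer F(theta) = exp(theta).
def F(theta):
    return math.exp(theta)

def grad_F(theta):
    return math.exp(theta)

def bregman(theta_a, theta_b):
    """B_F(theta_a : theta_b) = F(a) - F(b) - (a - b) * F'(b)."""
    return F(theta_a) - F(theta_b) - (theta_a - theta_b) * grad_F(theta_b)

def chernoff_information(theta1, theta2, tol=1e-12, max_iter=200):
    """Bisection on w in [0, 1] along theta(w) = (1 - w)*theta1 + w*theta2.

    Returns (chernoff_information, theta_star).
    """
    lo, hi = 0.0, 1.0
    for _ in range(max_iter):
        w = 0.5 * (lo + hi)
        theta = (1.0 - w) * theta1 + w * theta2
        gap = bregman(theta1, theta) - bregman(theta2, theta)
        if gap < 0.0:
            lo = w  # theta is still "closer" to theta1: move toward theta2
        else:
            hi = w  # theta is "closer" to theta2: move back toward theta1
        if hi - lo < tol:
            break
    w = 0.5 * (lo + hi)
    theta_star = (1.0 - w) * theta1 + w * theta2
    return bregman(theta1, theta_star), theta_star

def chernoff_poisson_closed_form(lam1, lam2):
    """Analytic value for Poisson, obtained by solving the balance equation."""
    lam_star = (lam1 - lam2) / (math.log(lam1) - math.log(lam2))
    return lam1 - lam_star - lam_star * math.log(lam1 / lam_star)

if __name__ == "__main__":
    lam1, lam2 = 3.0, 5.0
    c, theta_star = chernoff_information(math.log(lam1), math.log(lam2))
    print("bisection  :", c, "lambda* =", math.exp(theta_star))
    print("closed form:", chernoff_poisson_closed_form(lam1, lam2))
```

For the Poisson family the balance equation can be solved by hand, which is what `chernoff_poisson_closed_form` does; the bisection routine is the part that would carry over to families where no such analytic solution is available, given suitable `F` and `grad_F`.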

Frank.