<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
    xmlns:admin="http://webns.net/mvcb/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title>Computational Information Geometry Wonderland</title>
<link>http://blog.informationgeometry.org/index.php</link>
<pubDate>Thu, 04 Mar 2010 13:13:24 </pubDate>
<description>
Computational Information Geometry Wonderland - RSS 2.0 (Really Simple Syndication).
</description>
<item>
<title>Convex Hull Peeling</title>
<link>http://blog.informationgeometry.org/article.php?id=96</link>
<pubDate>Thu, 04 Mar 2010 13:13:24 +0900</pubDate>
<description>A long time ago, well in 1996, I investigated output-sensitive algorithms. 
I then designed an algorithm for peeling ite...</description>
<content:encoded>
<![CDATA[<p>A long time ago, well in 1996, I investigated output-sensitive algorithms. 
I then designed an algorithm for peeling iteratively the convex hulls of a 2D point set.
This yields the notion of depth of a point set, and is a useful index in statistics as attested by recent papers published in this area.<BR>
</p>
<p>Here is the reference work:<BR>
<A HREF="http://www.citeulike.org/user/ono/article/1460739">Output-sensitive peeling of convex and maximal layers</A>
</p>
]]>
</content:encoded>
</item><item>
<title>The geometric median</title>
<link>http://blog.informationgeometry.org/article.php?id=95</link>
<pubDate>Thu, 04 Mar 2010 12:24:40 +0900</pubDate>
<description>The center of mass (=centroid) is defined as the center point minimizing the squared of the Euclidean distances (=varian...</description>
<content:encoded>
<![CDATA[<p>The center of mass (=centroid) is defined as the center point minimizing the squared of the Euclidean distances (=variance).
If one of the source point is an outlier corrupting your dataset, and if that outlier goes to infinity, then your centroid follows it can is clearly not a robust centerpoint. The median on the contrary is defined as the center point minimising the sum of Euclidean distances. It is robust as it breaks only if n/2 outlier points go to infinity. However, there is no closed form solutions.
<BR>
</p>
<p>Doing some survey, I found that since Fermat (who allegedly first ask the question for 3 points), it has been studied and rediscovered in many communities. The current labeling of this point should be Fermat-Fagnano-Weber-Torricelli-Steiner point, and I forgot many names...
<BR>
</p>
<p>History is full of insights!<BR>
Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Natural Exponential Families QVF</title>
<link>http://blog.informationgeometry.org/article.php?id=94</link>
<pubDate>Wed, 03 Mar 2010 13:44:05 +0900</pubDate>
<description>They are only 6 exponential family distributions that admit variance as a quadratic function (QVF=quadratic variance fun...</description>
<content:encoded>
<![CDATA[<p>They are only 6 exponential family distributions that admit variance as a quadratic function (QVF=quadratic variance function) of the parameter.
For the multivariate case, it is a bit more complex but well defined and studied:<BR>
</p>
<p>
<A HREF="http://projecteuclid.org/DPubS?service=UI&amp;version=1.0&amp;verb=Display&amp;handle=euclid.aos/1032298298">
The $2d+4$ simple quadratic natural exponential families on $R^d$
</A>
<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Bhattacharyya matrices and Cramer-Rao lower bounds</title>
<link>http://blog.informationgeometry.org/article.php?id=93</link>
<pubDate>Tue, 02 Mar 2010 11:20:14 +0900</pubDate>
<description>Yesterday, I looked at historical papers of A. Bhattacharyya and found that he wrote three papers (one per year) as foll...</description>
<content:encoded>
<![CDATA[<p>Yesterday, I looked at historical papers of A. Bhattacharyya and found that he wrote three papers (one per year) as follows:</p>
<pre>
1946: On some analogues of the amount of information and their use in statistical estimation: I
1947: On some analogues of the amount of information and their use in statistical estimation: II
1948: On some analogues of the amount of information and their use in statistical estimation: III
</pre>
<p>The celebrated Cramer-Rao inequality can be seen as a particular case of Bhattacharyya inequality.
(provide a lower bound on the variance of ANY unbiased estimator of a parametric function).</p>
<p>The Bhattacharyya bounds obtained from Bhattacharyya matrices become sharper viz. the order of the matrix.
<BR>
</p>
<p>And information geometry cherishes lower bounds too. A great result not so popular somehow. 
It deserves prime time nowadays.<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Attosecond and  Computational photography</title>
<link>http://blog.informationgeometry.org/article.php?id=92</link>
<pubDate>Mon, 01 Mar 2010 13:10:52 +0900</pubDate>
<description>I am always astonished when I read that some folks managed to take pictures at the attosecond time. You capture electron...</description>
<content:encoded>
<![CDATA[<p>I am always astonished when I read that some folks managed to take pictures at the attosecond time. You capture electron density. And we like densities a lot in information geometry. -:) <BR>
</p>
<p>See <A HREF="http://www.nature.com/nphys/journal/vaop/ncurrent/extref/nphys1511-s1.pdf">Attosecond imaging of molecular electronic wavepackets</A>
<BR>
</p>
<p>I noticed that using twitter slow down blogging. What is the trade-off? Ponder!
<BR>
</p>
<p>Twitter: FrnkNlsn (yes, the one with vowels was already taken...)</p>
]]>
</content:encoded>
</item><item>
<title>Twitter</title>
<link>http://blog.informationgeometry.org/article.php?id=91</link>
<pubDate>Thu, 25 Feb 2010 10:54:15 +0900</pubDate>
<description>Ok, I'll try to twit and see how it impact my daily communication.
Here is my account:

FrnkNlsn

Everyone is welcome to...</description>
<content:encoded>
<![CDATA[<p>Ok, I'll try to twit and see how it impact my daily communication.
Here is my account:</p>
<pre>
FrnkNlsn
</pre>
<p>Everyone is welcome to subscribe -:)</p>
<p>
<BR>
</p>
]]>
</content:encoded>
</item><item>
<title>Hierarchical Gaussian Mixture Models</title>
<link>http://blog.informationgeometry.org/article.php?id=90</link>
<pubDate>Wed, 24 Feb 2010 11:46:15 +0900</pubDate>
<description>We have finalized our poster for the forthcoming ICASSP conference.








Here are some accompanying materials:

  pap...</description>
<content:encoded>
<![CDATA[<p>We have finalized our poster for the forthcoming ICASSP conference.<BR>
</p>
<p>
<center>
<A HREF="http://blog.informationgeometry.org/resources/Poster-ICASSP-HGMM-2010.pdf">
<img src="http://blog.informationgeometry.org/resources/poster-ICASSP2010.png"  alt="poster-ICASSP2010.png" />
</A>
</center>
</p>
<p>Here are some accompanying materials:
<UL>
<LI>  <a href="http://blog.informationgeometry.org/resources/2010-HierarchicalGMM-ICASSP.pdf">paper pdf</a>
</LI>
<li> <a href="http://blog.informationgeometry.org/resources/Poster-ICASSP-HGMM-2010.pdf">poster pdf</a>
</LI>
<LI> <a href="http://blog.informationgeometry.org/resources/Slides-ICASSP2010.pdf">slides pdf</a>
</LI>
</UL>
</p>
<p>It will be on display as follows:</p>
<pre>
SPTM-P8: Signal and System Modeling and Estimation I
Location:   Poster Area B
Time:   Friday, March 19, 10:00 - 12:00
</pre>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Maximum likelihood estimators</title>
<link>http://blog.informationgeometry.org/article.php?id=89</link>
<pubDate>Tue, 23 Feb 2010 13:25:36 +0900</pubDate>
<description>It is often very instructive to look back at the seminal paper that forge the field.
Fisher is dubbed the father of stat...</description>
<content:encoded>
<![CDATA[<p>It is often very instructive to look back at the seminal paper that forge the field.
Fisher is dubbed the father of statistics and paved the way to statistical science by defining the notions of unbiasedness, consistency, efficiency and sufficiency.
<BR>
The seminal short paper introducing the MLE in disguise dates back of 1912:</p>
<p>
<A HREF="http://digital.library.adelaide.edu.au/coll/special/fisher/1.pdf">On an absolute criterion for fitting frequency curves</A>
</p>
<p>Worth a glance!<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Havrda-Charvat entropy=Tsallis entropy</title>
<link>http://blog.informationgeometry.org/article.php?id=88</link>
<pubDate>Mon, 22 Feb 2010 15:27:58 +0900</pubDate>
<description>Non-extensive entropy has become widely popular in statistical physics.
At its core is the so-called Tsallis entropy tha...</description>
<content:encoded>
<![CDATA[<p>Non-extensive entropy has become widely popular in statistical physics.
At its core is the so-called Tsallis entropy that generalizes the Boltzmann-Shannon entropy.<BR>
In fact, this family of parameterized entropies was historically first studied by researchers in information theory:<BR>
</p>
<p>
<A HREF="http://dml.cz/bitstream/handle/10338.dmlcz/125526/Kybernetika_03-1967-1_3.pdf">
Quantification method of classification processes. Concept of structural a-entropy
</A>
<BR>
from a 1967 paper of Jan Havrda and Frantisek Charvat.</p>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>The geometry of a hyperbolic horosphere is Euclidean</title>
<link>http://blog.informationgeometry.org/article.php?id=87</link>
<pubDate>Tue, 16 Feb 2010 12:01:44 +0900</pubDate>
<description>To grasp non-euclidean geometries, we are familiar with the Riemannian mapping theorem that tells us that we can see any...</description>
<content:encoded>
<![CDATA[<p>To grasp non-euclidean geometries, we are familiar with the Riemannian mapping theorem that tells us that we can see any Riemannian geometry as the geometry defined by a surface in high-dimensional Euclidean space.
Hyperbolic geometry admits several models (conformal or not) like the Poincare upper space/ball or Beltrami-Klein ball.
However, one might ask: "Can I view Euclidean geometry in hyperbolic geometry?" After all, we are not sure we are living in a Euclidean space. The question is yes, and was answered back in the 18th century by a student of Gauss, Wachter. He showed that a horosphere is metrically equivalent to Euclidean geometry. Therefore we have a non-Euclidean model of Euclidean geometry too. <BR>
</p>
<p>More in the book of Jeremy Gray<BR>
Ideas of space: Euclidean, non-Euclidean, and relativistic.</p>
<p>
<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Mixture of beta distributions</title>
<link>http://blog.informationgeometry.org/article.php?id=86</link>
<pubDate>Tue, 26 Jan 2010 18:28:48 +0900</pubDate>
<description>Mixtures of beta distributions are useful in various areas of computational science for modeling underlying distribution...</description>
<content:encoded>
<![CDATA[<p>Mixtures of <A HREF="http://en.wikipedia.org/wiki/Beta_distribution">beta distributions</A> are useful in various areas of computational science for modeling underlying distributions of datasets. Although beta distributions are not the most famous mixture models (leaving the place to unthroned Gaussian mixture models or GMMs for short) they are convenient for a number of reasons. The two parameters alpha and beta give flexibility for modeling various shapes, and they are prior conjugate of Bernoulli distributions for Bayesian inference.<BR>
</p>
<p>In bioinformatics, beta mixtures have been proven useful for analyzing gene expressions. 
Either the same gene observed under different modalities (say, different maker microarrays or radioactivity labeled DNAc) or for extracting pathways (co-expressed genes). The basic feature is the correlation number of pairwise expressions. Modeling those correlation coefficient distribution allows one to fit beta mixtures with two components: The similar and dissimilar without using a priori threshold (often set arbitrarily to 0.5).
<BR>
A standard EM algorithm using numerical optimization let us fit beta mixture.
What is the best method and best tool for doing that? If there are biologists surfing here around, let me know please.
<BR>
The reference of the paper is:</p>
<pre>
Bioinformatics. 2005 May 1;21(9):2118-22. Epub 2005 Feb 15.
Applications of beta-mixture models in bioinformatics.
Ji Y, Wu C, Liu P, Wang J, Coombes KR.
</pre>
<p>See <A HREF="http://www.ncbi.nlm.nih.gov/pubmed/15713737">link</A>
<BR> for retrieving the paper.<BR>
</p>
<p>See also our open source library for handling mixture models: <A HREF="http://www.lix.polytechnique.fr/~nielsen/MEF/">jMEF</A>
<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Experiencing with centroidal Voronoi tesselations</title>
<link>http://blog.informationgeometry.org/article.php?id=85</link>
<pubDate>Wed, 20 Jan 2010 15:02:50 +0900</pubDate>
<description>One cool application of centroidal Voronoi tesselations is stippling.
Stippling is an artistic rendering of images using...</description>
<content:encoded>
<![CDATA[<p>One cool application of centroidal Voronoi tesselations is stippling.
Stippling is an artistic rendering of images using the grey intensities as the underlying point density for uniform sampling.
Here is a (preliminary) result (of myself):<BR>
</p>
<p>
<center>
<img src="http://blog.informationgeometry.org/resources/stipplingfrank600.png" alt="stipplingfrank600.png" />
</center>
As you can see there is some artifacts if we proceed  the straight way. Hence the many papers dealing with CVTs and noise analysis<BR>
Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Variational distance is the only metric f-divergence</title>
<link>http://blog.informationgeometry.org/article.php?id=84</link>
<pubDate>Thu, 07 Jan 2010 22:40:41 +0900</pubDate>
<description>A quite interesting property is that the variational divergence (a f-divergence)  and its multiples are the only metric ...</description>
<content:encoded>
<![CDATA[<p>A quite interesting property is that the variational divergence (a f-divergence)  and its multiples are the only metric f-divergences. This was recently shown <BR>
<A HREF="http://ece.iut.ac.ir/faculty/khosravi/index_files/Page600.htm">Confliction of the Convexity and Metric Properties in f-Divergences</A>  (IEICE 2007).
<BR>
</p>
<p>If instead, we assume f strictly positive concave (and self-dual f(x)=x f(1/x)) then the formula of f-divergences yields metrics.<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Synthetical information geometry (versus analytical geometry)</title>
<link>http://blog.informationgeometry.org/article.php?id=83</link>
<pubDate>Wed, 23 Dec 2009 07:54:11 +0900</pubDate>
<description>Let us give some examples of information manifolds:

Statistical manifolds (parametric distributions),
Neural manifolds ...</description>
<content:encoded>
<![CDATA[<p>Let us give some examples of <em>information manifolds</em>:
<UL>
<LI>Statistical manifolds (parametric distributions),
<LI>Neural manifolds (Boltzmann machines with fixed topology, i.e., number of nodes),
<LI>ARMA(p,q) time-series manifolds  (e-flat=-1-flat)
</UL>
</p>
<p>Strictly speaking, geometrizing information-theoretic problems does not provide a more powerful framework in theory.
This is because synthetical and analytical geometries are equivalent. 
Informally, that means that we can do geometry by algebraic equations.</p>
<p>However, geometrizing problems help grab intuition on the problem at hand.
Geometry also yields novel notions to mathematical theories.
For example, let us cite the two curvature notions in statistics: exponential and mixture curvatures emanating from conjugate connections.
So although synthetical geometry provides the same power as analytical geometry, the third-order asymptotic theory of statistics has been obtained so far only from synthetical information geometry.</p>
<p>Dual differential geometries are also useful to tackle information-theoretic problems such as
<UL>
<LI>Multiterminal problems met in information theory,
<LI>Linear programming problems (e.g., continuous Karmarkar inner method walking along the m-geodesic),
<LI>Clustering (negative entropy and dual Legendre log-normalizer conjugate for soft/hard clustering).
</UL>
</p>
<p>
<BR>
Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Beta divergence as representational Bregman divergences</title>
<link>http://blog.informationgeometry.org/article.php?id=82</link>
<pubDate>Mon, 21 Dec 2009 22:53:17 +0900</pubDate>
<description>In the paper 

The dual Voronoi diagrams with respect to representational Bregman divergences
 (ISVD 2009), 
Internation...</description>
<content:encoded>
<![CDATA[<p>In the paper 
<em>
<A HREF="http://www.lix.polytechnique.fr/~nielsen/ISVD09-GenBregmanVD.pdf">The dual Voronoi diagrams with respect to representational Bregman divergences</A>
</em> (ISVD 2009), 
International Symposium on Voronoi Diagrams,  we show that Beta divergences is a (representational) Bregman divergence with
<UL>
<LI> Beta=1 -> Kullback-Leibler
<LI> Beta=0 -> Itakura-Saito
<LI> Beta=2 -> squared Euclidean distance 
</UL>
</p>
<p>In the paper, we derive formula for the beta left and right-sided centroids.
The program <a href="http://blog.informationgeometry.org/resources/RepresentationalBetaBregman.java">RepresentationalBetaBregman.java</a> shows this equivalence (up to numerical errors).</p>
<pre>
Shows that beta divergences can be obtained from representational Bregman divergences
beta-div=0.0060607537896566754  equals Bregman rep. div=0.006060753789656717
beta-div=0.03596836005227946    equals Bregman rep. div=0.03596836005227946
beta-div=0.03786639961675385    equals Bregman rep. div=0.03786639961675385
beta-div=0.015356733711556145   equals Bregman rep. div=0.015356733711556173
beta-div=0.16665973512136045    equals Bregman rep. div=0.16665973512136045
beta-div=0.006143185064308276   equals Bregman rep. div=0.006143185064308207
beta-div=0.012777128199946086   equals Bregman rep. div=0.012777128199946072
beta-div=2.42453303134018E-4    equals Bregman rep. div=2.4245330313402494E-4
beta-div=0.07962156613977964    equals Bregman rep. div=0.07962156613977961
beta-div=2.6549301732092453E-4  equals Bregman rep. div=2.6549301732092974E-4
Press any key to continue...
</pre>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Population space and Rao's distance</title>
<link>http://blog.informationgeometry.org/article.php?id=81</link>
<pubDate>Sun, 20 Dec 2009 02:00:32 +0900</pubDate>
<description>The seminal paper of Rao written before he joined Cambridge for his PhD is available online at:...</description>
<content:encoded>
<![CDATA[<p>The seminal paper of Rao written before he joined Cambridge for his PhD is available online at:<BR</p>
<p>
<A HREF="http://books.google.fr/books?id=tNKkI3QX-UoC&amp;printsec=frontcover&amp;source=gbs_v2_summary_r&amp;cad=0#v=onepage&amp;q=&amp;f=false">Breakthroughs in Statistics</A>
<img src="http://blog.informationgeometry.org/resources/BreakthroughsStatsVol1.jpg"  />
<BR>
page 235 is a reprint of:<BR>
Rao, Calyampudi (1945). "Information and the accuracy attainable in the estimation of statistical parameters". Bulletin of the Calcutta Mathematical Society 37: 81?89.
<BR>
</p>
<p>There we find three essential results:
<UL>
<LI>Cramer-Rao bound</p>
<p>
<LI>Population space and riemannian geometry using the Fisher information metric as the quadratic differential form.</p>
<p>
<LI>Test of significance (and classification)</p>
<p>
</UL>
</p>
<p>Information geometry has since then spreaded, with the work of Chentsov on alpha-connections and its investigation by Amari. Historically, the space of distributions, was called the "population space".</p>
<p>The Rao distance for 1D normals is also given.<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Bhattacharyya metric for multivariate gaussians</title>
<link>http://blog.informationgeometry.org/article.php?id=80</link>
<pubDate>Sat, 19 Dec 2009 21:20:21 +0900</pubDate>
<description>The Bhattacharyya metric for multivariate gaussians
 is given by


The geometry is spherical on renormalized density fun...</description>
<content:encoded>
<![CDATA[<p>The Bhattacharyya metric for multivariate gaussians
 is given by<BR>
<img src="http://blog.informationgeometry.org/resources/BattGaussian.png"  />
</p>
<p>The geometry is spherical on renormalized density functions.</p>
<p>Bhattacharyya with Mahalanobis distances were precursors of the Fisher-Rao Riemannian distances.<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>alpha-means with respect to alpha-divergences</title>
<link>http://blog.informationgeometry.org/article.php?id=79</link>
<pubDate>Thu, 17 Dec 2009 22:39:23 +0900</pubDate>
<description>In this note alphadivergencemeans.pdf, we summarize the following work


Shun-ichi Amari, Integration of Stochastic Mode...</description>
<content:encoded>
<![CDATA[<p>In this note <a href="http://blog.informationgeometry.org/resources/alphadivergencemeans.pdf">alphadivergencemeans.pdf</a>, we summarize the following work</p>
<p>
<UL>
<LI>Shun-ichi Amari, Integration of Stochastic Models by Minimizing \alpha-Divergence, 
Neural Computation (NECO), (19)10:2780-2796, October 2007.</p>
<p>
<LI>F. Nielsen and R. Nock, The dual Voronoi diagrams with respect to representational Bregman divergences, International Symposium on Voronoi Diagrams (ISVD), June 2009.</p>
<p>
<LI>
F. Nielsen and R. Nock, Sided and Symmetrized Bregman Centroids, IEEE transactions on information theory (2009), vol. 55, no. 6, pp. 2882-2904</p>
<p>
</UL>
</p>
]]>
</content:encoded>
</item><item>
<title>jMEF in Matlab: Mixture of Exponential Families</title>
<link>http://blog.informationgeometry.org/article.php?id=78</link>
<pubDate>Sat, 12 Dec 2009 21:49:39 +0900</pubDate>
<description>The jMEF library is easily interfaced with Matlab. You can compute Gaussian Mixture Models (GMMs) and manipulate them ea...</description>
<content:encoded>
<![CDATA[<p>The jMEF library is easily interfaced with Matlab. You can compute Gaussian Mixture Models (GMMs) and manipulate them easily now in Matlab.<BR>
</p>
<p>
<A HREF="http://www.lix.polytechnique.fr/~nielsen/MEF/matlab/jMEF_matlab.html">jMEF in Matlab</A>
<BR>
</p>
<p>
<center>
<img src="http://blog.informationgeometry.org/resources/gmmoriginal.png"  />
<BR>
Source GMM<BR>
<img src="http://blog.informationgeometry.org/resources/samplesfromgmm.png"  />
Sample points from GMM<BR>
<img src="http://blog.informationgeometry.org/resources/gmmfromsamples.png"  />
<BR>
GMM obtained using Bregman soft clustering (expectation maximization)
</center>
</p>
]]>
</content:encoded>
</item><item>
<title>Alpha divergences as representational Bregman divergences</title>
<link>http://blog.informationgeometry.org/article.php?id=77</link>
<pubDate>Fri, 11 Dec 2009 18:09:37 +0900</pubDate>
<description>In my paper at ISVD 2009 on 

The dual Voronoi diagrams with respect to
representational Bregman divergences


slides


...</description>
<content:encoded>
<![CDATA[<p>In my paper at ISVD 2009 on <BR>
</p>
<p>The dual Voronoi diagrams with respect to
representational Bregman divergences<BR>
</p>
<p>
<A HREF="http://www.lix.polytechnique.fr/~nielsen/Slides-ISVD2009.pdf">slides</A>
<BR>
</p>
<p>I show that by using a representational function, we can obtain alpha-divergences as a representational Bregman divergences. Therefore, it is easy to extend algorithms tailored to Bregman divergences to alpha divergences.<BR</p>
<p>Here is a code snippet in Java: <a href="http://blog.informationgeometry.org/resources/RepresentationalAlphaBregman.java">RepresentationalAlphaBregman.java</a>
<BR>
</p>
<p>Running it, you get something like</p>
<pre>
Shows that alpha divergences can be obtained from representational Bregman divergences
alpha-div=0.9008838766115398    equals Bregman rep. div=0.9008838766115399
alpha-div=0.14730849849416264   equals Bregman rep. div=0.14730849849416267
alpha-div=0.05455969651963932   equals Bregman rep. div=0.054559696519639225
alpha-div=1.2439567444853374    equals Bregman rep. div=1.2439567444853372
alpha-div=0.15345391125768915   equals Bregman rep. div=0.15345391125768917
alpha-div=0.12118392973570616   equals Bregman rep. div=0.12118392973570632
alpha-div=1.038494366079179 equals Bregman rep. div=1.0384943660791794
alpha-div=0.08541142546071197   equals Bregman rep. div=0.08541142546071195
alpha-div=0.06842729068201092   equals Bregman rep. div=0.06842729068201084
alpha-div=0.6941500904965174    equals Bregman rep. div=0.6941500904965173
</pre>
<p>In the ISVD 2009 paper, we give closed-form solutions for centroids of representational Bregman divergences, including alpha-means et beta-means.<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Checking the information monotonicity of the Kullback-Leibler divergence</title>
<link>http://blog.informationgeometry.org/article.php?id=76</link>
<pubDate>Wed, 09 Dec 2009 22:03:02 +0900</pubDate>
<description>I wrote a small program to illustrate the information monotonicity property of f-divergences (including Kullback-Leibler...</description>
<content:encoded>
<![CDATA[<p>I wrote a small program to illustrate the information monotonicity property of f-divergences (including Kullback-Leibler).
Here is a numerical example for histograms of 8 bins that we reduce to histograms in 4 and 2 bins. The KL measure is less at coarser resolution than higher resolution.</p>
<pre>
Check the information monotonicity of Kullback-Leibler divergence:
by merging bins into a coarser histogram, the Kullback-Leibler divergence is less than the higher resolution:
0.08522624581487719 0.1320022228947157 0.13591019441965485 0.06674528382980667 0.05029196864402132 0.19946790829780184 0.20516964900877682 0.1251865270903457
0.02648151081895093 0.17500830227728026 0.2779122146863664 0.06203415984687348 0.2535969883830692 0.04424529112439828 0.1320533102395284 0.028668222623533027
                KL(p,q)=0.46400208957858724     KL(q,p)=0.4558758820302893
4 bins  KL(p,q)=0.10555546240269079     KL(q,p)=0.09733318810652078
2 bins  KL(p,q)=0.029647768170771346    KL(q,p)=0.029836929625659273
Information monotonicity holds only for Csiszar's f-divergences
</pre>
]]>
</content:encoded>
</item><item>
<title>jMEF: Exponential families library</title>
<link>http://blog.informationgeometry.org/article.php?id=75</link>
<pubDate>Thu, 26 Nov 2009 15:22:50 +0900</pubDate>
<description>The jMEF library, Exponential families library,  has been updated with new tutorials.
A supplemental report listing form...</description>
<content:encoded>
<![CDATA[<p>The <A HREF="http://www.lix.polytechnique.fr/~nielsen/MEF/">jMEF library</A>, Exponential families library,  has been updated with new tutorials.
A supplemental report listing formulas has been posted on arxiv:
<A HREF="http://arxiv.org/abs/0911.4863">Statistical exponential families: A digest with flash cards</A>
</p>
]]>
</content:encoded>
</item><item>
<title>Several ways to solve for the geodesic equations in Rao's distance computation</title>
<link>http://blog.informationgeometry.org/article.php?id=74</link>
<pubDate>Thu, 26 Nov 2009 15:17:25 +0900</pubDate>
<description>
Rao's Distance Measure, by Colin Atkinson and Ann F. S. Mitchell, 1981 Indian Statistical Institute.


Rao's distance i...</description>
<content:encoded>
<![CDATA[<p>
<A HREF="http://www.jstor.org/pss/25050283">Rao's Distance Measure</A>, by Colin Atkinson and Ann F. S. Mitchell, 1981 Indian Statistical Institute.<BR>
</p>
<p>
Rao's distance is the Riemannian geodesic distance induced by the Fisher information matrix as the tensor.
Computing Rao's distance for given parametric distributions (let d be the number of parameters), thus involved to compute explicitly the geodesic.
The distance is the length of that geodesic, the sum of its infinitesimal elements along the shortest path curve.
It is thus quite complicated to compute in practice the exact Rao's distance as we need to solve the differential equation of the geodesic stated by the Euler-Lagrange equations.
In this paper, three different approaches are proposed;
<ul>
<li>Classic Euler-Lagrange equations (d second order differential equations)
<li>Hamilton's equations  (2d first order differential equations)
<li>Hamilton-Jacobi equations (nonlinear partial differential equation)
</ul>
For uniparameter distribution, the geodesic length (Rao's distance) becomes easy, and the list of such distances are given 
for Poisson, binomial, exponential, chi-squared (gamma).
For multi-parameter distributions, it becomes more difficult. 
The two classical examples are the normal distributions (which Rao geometry is hyperbolic geometry) and the multinomials (which Rao geometry is the spherical geometry).
For same-mean multivariate normals, the Rao distance is given (an unpublished result due to Jensen in 1976).
One problem we typically face with this distance computation is to know whether it admits a closed-form equation or not.

</p>
<p>Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Fisher information of Gamma distributions</title>
<link>http://blog.informationgeometry.org/article.php?id=73</link>
<pubDate>Tue, 24 Nov 2009 03:45:18 +0900</pubDate>
<description>Computing the Rao distance for Gamma distributions
by F. Reverter and J. M. Oller


The Gamma distribution belongs to th...</description>
<content:encoded>
<![CDATA[<p>Computing the Rao distance for Gamma distributions<BR>
by F. Reverter and J. M. Oller<BR>
</p>
<p>
The Gamma distribution belongs to the exponential families.
Therefore, the Fisher information metric is $I(\theta)=\nabla^2 F(\theta)$.
However, integrating the square root of the information matrix is difficult (no closed form solution).
The author proceeds by characterizing the Riemannian geodesic using the differential equation relying on Christoffel symbols.
Geodesics on the Gamma manifold are unique since the manifold is simply connected, complete with all sectional curvatures nonpositive. The authors come up with a Newton-like numerical optimization algorithm that depends on a good initialization. First, they show that the metric is bounded by Poincare metrics for which closed form equations of the geodesics are known. This yields a good starting tangent vector.<BR>
It is quite <b>impressive</b> to look at the formula of the closed-form equation of the Poincare geodesics.
Those formula are surprisingly quite complicated. <BR>
The authors implemented their algorithm in FORTRAN and show that the algorithm always convergence on the domain examples, with high numerical precisions.
</p>
]]>
</content:encoded>
</item><item>
<title>A library for exponential families</title>
<link>http://blog.informationgeometry.org/article.php?id=72</link>
<pubDate>Tue, 03 Nov 2009 21:57:18 +0900</pubDate>
<description>We have developed a library for manipulating exponential families in statistics:


jMEF

We can learn mixture of exponen...</description>
<content:encoded>
<![CDATA[<p>We have developed a library for manipulating exponential families in statistics:<BR>
</p>
<p>
<A HREF="http://www.lix.polytechnique.fr/~nielsen/MEF/">jMEF</A>
</p>
<p>We can learn mixture of exponential families (such as Gaussian mixture models), mixture of Poisson, mixture of Laplacians, etc.</p>
]]>
</content:encoded>
</item><item>
<title>Hyperbolic Voronoi diagrams</title>
<link>http://blog.informationgeometry.org/article.php?id=71</link>
<pubDate>Sat, 24 Oct 2009 01:17:21 +0900</pubDate>
<description>
Geometries are abstract by essence but we visualize them using embedding into the good old Euclidean 2D/3D spaces: our ...</description>
<content:encoded>
<![CDATA[<p>
Geometries are abstract by essence but we visualize them using embedding into the good old Euclidean 2D/3D spaces: our sheet of paper, or 3D browser. So consider hyperbolic geometry, it has many different realizations: conformal or not.
In a conformal representation such as Poincare upper-space or ball, angles are preserved. That is, angles measured in the Euclidean geometry coincide with angles in the hyperbolic geometry. Conformal representations are therefore often wished for mapping because it tends to minimize distortions locally.
</p>
<p>The Voronoi diagram on hyperbolic geometry has thus been studied in conformal Poincare upper-plane representation.
However, geodesics are visualized by arcs of circles and it makes computation more difficult if not tricky. Another problem is that it requires more numerical precision to carry out predicate evaluations.</p>
<p>
Now consider the Klein ball realization. In this non-conformal representation, geodesics are straight line segment (but the mid Euclidean point is not the mid hyperbolic point). Bisectors are also hyperplanes and the diagram is therefore affine, and can be computed from an equivalent power diagram. So hyperbolic Voronoi diagrams are handy and do not require more specific implementation than a weighted Voronoi diagram.</p>
<p>The details are explained in the following report, illustrated with an application on image browsing<BR>
<A HREF="http://arxiv.org/abs/0903.3287">PDF</A>
</p>
<p>
<HR>
<center>
<img src="http://blog.informationgeometry.org/resources/KleinPoincare.png"  alt="KleinPoincare.png" />
</center>
Blue: Affine hyperbolic Voronoi diagram in Klein non-conformal ball model.<BR>
Red: Hyperbolic Voronoi diagram in Poincare conformal ball model.<BR>
</p>
]]>
</content:encoded>
</item><item>
<title>Singular Value Decomposition: Ultimate Matrix Factorization</title>
<link>http://blog.informationgeometry.org/article.php?id=70</link>
<pubDate>Tue, 22 Sep 2009 18:36:13 +0900</pubDate>
<description>I am teaching the fundamentals of 3D at Ecole Polytechnique (INF555). We are currently looking at various matrix decompo...</description>
<content:encoded>
<![CDATA[<p>I am teaching the fundamentals of 3D at Ecole Polytechnique (INF555). We are currently looking at various matrix decompositions and their use in visual computing.</p>
<p>
To compute the PCA of high-dim datasets, we just need to compute the <A HREF="http://en.wikipedia.org/wiki/Singular_value_decomposition">SVD</A> of the covariance matrix of zero-mean normalized data sets. So I looked for a good source of explanations of SVD and I came across the lecture of Gilles Strang:<BR>

<A HREF="http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures/detail/lecture29.htm">
SVD lecture
</A>
</p>
<p>Here, the 4 subspaces (image and nullspace) of column/row matrices are reviewed and it is shown how to compute the SVD by simply solving left/right eigenproblems.<BR>
</p>
<p>Definitively worth watching! (you'll see on one example a problem with the sign in a SVD decomposition to solve!!!)</p>
]]>
</content:encoded>
</item><item>
<title>Java var args in action</title>
<link>http://blog.informationgeometry.org/article.php?id=69</link>
<pubDate>Fri, 29 May 2009 00:53:56 +0900</pubDate>
<description>I am writing this library for manipulating exponential families, bregman divergences, and so on.
I am using Java. Today,...</description>
<content:encoded>
<![CDATA[<p>I am writing this library for manipulating exponential families, bregman divergences, and so on.
I am using Java. Today, I discovered that Java can handle arbitrary number of parameters using the three dots ... syntax as follows:</p>
<pre>
class VarargsSum
{
    public static double cumul(double... elements)
    {
    double cumul=0.0d;
    for (double el:elements)
        cumul+=el;
    return cumul;   
    }
    
    public static void write(String... records)
    {
    for (String record: records)
      System.out.println(record);
    }
    public static void main(String [] args)
    {
    Double sum=cumul(5.0,4.0,23.0); 
    write("Computational","Information","Geometry",sum.toString());// explicit cast needed
    }   
}
</pre>
<p>Compiling and running this above code, you will get</p>
<pre>
Computational
Information
Geometry
32.0
</pre>
<p>Let me see how to best use this in the library now...
Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Approximating the smallest enclosing ball of balls</title>
<link>http://blog.informationgeometry.org/article.php?id=68</link>
<pubDate>Fri, 15 May 2009 07:03:35 +0900</pubDate>
<description>Approximating the smallest enclosing ball of balls
Here is an applet to play with:


applet

Frank.
...</description>
<content:encoded>
<![CDATA[<p>Approximating the smallest enclosing ball of balls<BR>
Here is an applet to play with:<BR>
</p>
<p>
<A HREF"http://www.lix.polytechnique.fr/~nielsen/MINIBALLBALL/Smallball.html">applet</A>
<BR>
Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Quaternions and Sir Hamilton's bridge</title>
<link>http://blog.informationgeometry.org/article.php?id=67</link>
<pubDate>Sat, 09 May 2009 23:26:37 +0900</pubDate>
<description>Hamilton's bridge and the birth of quaternions as 4D normed  division algebra.
It cannot exist for 3D vectors... 
See al...</description>
<content:encoded>
<![CDATA[<p>Hamilton's bridge and the birth of quaternions as 4D normed  division algebra.
It cannot exist for 3D vectors... <BR>
See also octonions and 2**n Caley constructions. 
There are also called hypercomplex numbers.<BR>
</p>
<p>For the small story, here is the inscription plate:<BR>
</p>
<p>
<img src="http://blog.informationgeometry.org/resources/plaque4.jpg" >
<BR>
</p>
<p>Here as he walked by 
on the 16th of October 1843
Sir William Rowan Hamilton 
in a flash of genius discovered
the fundamental formula for 
quaternion multiplication 
i2 = j2 = k2 = ijk = -1 
& cut it on a stone of this bridge
</p>
<p>How much progress has been done in a century! (eg., Lie groups and algebras)<BR>
Frank.</p>
]]>
</content:encoded>
</item><item>
<title>Learning a kernel in SVM the conformal way</title>
<link>http://blog.informationgeometry.org/article.php?id=66</link>
<pubDate>Wed, 06 May 2009 20:36:13 +0900</pubDate>
<description>Support vector machines (SVMs) are one of the key tool for classification in machine learning.
Suppose you have two sets...</description>
<content:encoded>
<![CDATA[<p>Support vector machines (SVMs) are one of the key tool for classification in machine learning.
Suppose you have two sets of high-dimensional points (say, +1 class and -1 class, or red and blue points if you prefer) to separate.
The SVM is seeking for the unique hyperplane that separaters the +1-labelled points to the -1-labelled points that maximizes the margin: The distance to the hyperplane.
The points touched by translating the hyperplane on the left and right sides are called support vectors. 
There are in general d+1 such points in dimension d.
In practice,  red/blue points cannot be linearly separated.
The trick is then to use a function f to map these points in a higher dimensional feature space, where they can be linearly separated.
It is always possible to do so. Now, we can manipulate these feature points implicitly with a kernel function  k(x,x')=f(x).f(x'), where '.' denotes the innerproduct.
This is the so-called kernel trick (geometry in a Hilbert space with a Riemannian metric).
Choosing the best kernel is difficult. One way is to learn it by bootstrapping the learning machines as follows:<BR>
First, learn a SVM and detect the support vectors,<BR>
Then adjust the kernel by choosing K(x,x')=D(x)D(x')k(x,x') for a positive function D().<BR>
The idea is to enlarge the spatial resolution around the boundary separating surface.<BR>
Finally, repeat these steps as much as possible, avoiding overfitting.<BR>
</p>
<p>All technical details are described in the paper:<BR>
S.Wu and S. Amari, Conformal Transformation of Kernel Functions: A Data-Dependent
Way to Improve Support Vector Machine Classifiers, Neural Processing Letters, 15, pp.
59-67, 2002.<BR>
</p>
<p>Frank.</p>
]]>
</content:encoded>
</item>
</channel>
</rss>