## Jan 26, 2010

### Mixture of beta distributions

Mixtures of beta distributions are useful in various areas of computational science for modeling the underlying distribution of a dataset. Although beta mixtures are not the best-known mixture models (that place belongs to the undisputed Gaussian mixture models, or GMMs for short), they are convenient for a number of reasons. The two shape parameters, alpha and beta, give flexibility for modeling various shapes on the unit interval, and the beta distribution is the conjugate prior of the Bernoulli distribution in Bayesian inference.
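
To make the model concrete, here is the usual way of writing a beta mixture. The notation ($K$ components, mixing weights $\pi_k$) is my own choice for illustration and is not taken from the paper cited below.

$$
f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}, \qquad 0 < x < 1,\ \alpha > 0,\ \beta > 0,
$$

$$
p(x) = \sum_{k=1}^{K} \pi_k \, f(x;\alpha_k,\beta_k), \qquad \pi_k \ge 0,\ \sum_{k=1}^{K} \pi_k = 1,
$$

where $B(\alpha,\beta)$ is the beta function. The two-component case discussed below corresponds to $K = 2$.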

In bioinformatics, beta mixtures have proven useful for analyzing gene expression, either for comparing the same gene observed under different modalities (say, microarrays from different makers or radioactively labeled cDNA) or for extracting pathways (co-expressed genes). The basic feature is the correlation coefficient of pairwise expression profiles. Modeling the distribution of those correlation coefficients with a two-component beta mixture separates the similar pairs from the dissimilar ones without an a priori threshold (often set arbitrarily to 0.5).

A standard EM algorithm with numerical optimization in the M-step lets us fit a beta mixture. What is the best method and the best tool for doing that? If any biologists are browsing here, please let me know.
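
As a rough illustration of such an EM fit, here is a minimal Python sketch, assuming NumPy and SciPy are available. It is not the method of the paper: the function name `fit_beta_mixture`, the initialization, and the use of Nelder-Mead on the log-parameters are all my own assumptions. The E-step computes responsibilities, the weights are updated in closed form, and each component's (alpha, beta) pair is updated by numerically maximizing its weighted log-likelihood, since the beta distribution has no closed-form maximum-likelihood estimate.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta


def fit_beta_mixture(x, n_components=2, n_iter=50, eps=1e-6):
    """Fit a beta mixture to data in (0, 1) with EM (illustrative sketch)."""
    # Keep the data strictly inside (0, 1) so every log-density is finite.
    x = np.clip(np.asarray(x, dtype=float), eps, 1 - eps)
    weights = np.full(n_components, 1.0 / n_components)
    # Hypothetical initialization: component means spread over (0, 1),
    # each with alpha + beta = 10.
    means = np.linspace(0.25, 0.75, n_components)
    params = np.column_stack([means, 1 - means]) * 10.0

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        log_p = np.log(weights) + np.column_stack(
            [beta.logpdf(x, a, b) for a, b in params]
        )
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step, weights: closed-form update.
        weights = resp.mean(axis=0)

        # M-step, shapes: numerical maximization of the weighted
        # log-likelihood, optimizing log(alpha), log(beta) to stay positive.
        for k in range(n_components):
            def neg_ll(theta, r=resp[:, k]):
                a, b = np.exp(theta)
                return -np.sum(r * beta.logpdf(x, a, b))

            res = minimize(neg_ll, np.log(params[k]), method="Nelder-Mead")
            params[k] = np.exp(res.x)

    return weights, params


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.beta(2, 8, 500), rng.beta(8, 2, 500)])
    w, p = fit_beta_mixture(data, n_iter=20)
    print("weights:", w)
    print("alpha, beta per component:", p)
```

If the inputs are correlation coefficients in [-1, 1], they would first have to be mapped into (0, 1), for instance with (r + 1) / 2. That rescaling is an assumption on my part, not something specified above.
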
The reference for the paper is:

Bioinformatics. 2005 May 1;21(9):2118-22. Epub 2005 Feb 15.
Applications of beta-mixture models in bioinformatics.
Ji Y, Wu C, Liu P, Wang J, Coombes KR.