Code covered by the BSD License  

Variational Bayesian Inference for Gaussian Mixture Model



Mo Chen


The variational Bayes (mean field) method for GMMs can automatically determine the number of components.


File Information

This is the variational Bayesian procedure (also called mean field) for inference in a Gaussian mixture model, i.e. the Bayesian treatment of the Gaussian mixture model.

Unlike the EM algorithm (maximum likelihood estimation), it can automatically determine the number of mixture components k.

Example code:
load data;

The data set consists of 3 Gaussian components. You only need to set a number (say 10) that is larger than the intrinsic number of components. The algorithm will automatically find the right k.

A detailed description of the algorithm can be found in the reference.

Reference: Pattern Recognition and Machine Learning by Christopher M. Bishop (p. 474)


This file inspired Em Algorithm For Gaussian Mixture Model.

MATLAB release MATLAB 7.13 (R2011b)
Comments and Ratings (12)
19 Aug 2014 Johannes


I solved my problem by removing components whose weight approaches zero. I added this after line 128:

% remove components that are (nearly) empty
nk = sum(R,1); % 10.51
idx = find(nk<0.01);
% idx = find(nk<4);

if ~isempty(idx)
    R(:,idx) = [];
    alpha(:,idx) = [];
    kappa(:,idx) = [];
    m(:,idx) = [];
    v(:,idx) = [];
    M(:,:,idx) = [];

    % renormalize the responsibilities so that each row sums to 1
    tmp = sum(R,2);
    R = bsxfun(@times,R,1./tmp);
    logR = log(R);

    model.alpha = alpha;
    model.kappa = kappa;
    model.m = m;
    model.v = v;
    model.M = M; % Wishart: M = inv(W)
end

I also added a small value to the diagonal of the matrix in line 176, as suggested below:

V = chol(Xs*Xs'/nk(i) + diag(ones(1,d)*1e-3)); % add some noise to avoid singular matrix

I hope this helps somebody else too. By the way, is there a better way to exchange this information? I would have sent you the updated file, but I couldn't find out how to do that here.

17 Jul 2014 Johannes

This is a great help for me. However, I also have the problem with multidimensional data.

Here is how to reproduce the error.
I create some dummy data (10 Gaussian clusters in 3D):

numClusters = 10;

allClusters = [];

for ii = 1:numClusters
    sigmaX = 20;
    sigmaY = 20;
    sigmaZ = 80;
    sigmas = diag([sigmaX sigmaY sigmaZ]);
    imageSize = diag([ 1000 1000 20 ]);

    simGauss = sigmas * randn(3,2e2);       % 200 points with per-axis standard deviations
    mu = imageSize * rand(3,1);             % random cluster center inside the volume
    cluster = bsxfun(@plus, simGauss,mu);

    allClusters = cat(2, allClusters, cluster);
end

Error using chol
Matrix must be positive definite.

Sometimes it works, but most of the time it does not. I traced the problem to entries of nk becoming zero, followed by a division by zero.
I have tried replacing the zero values with small ones, but that didn't help. I have no solution so far :-/
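
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code from vbgm.m: only the expression Xs*Xs'/nk(i) appears in the error trace quoted further down this page, and the way xbar and Xs are formed here is an assumption based on the usual weighted-mean and weighted-scatter formulas. It shows that a component with zero total responsibility turns its mean and scatter matrix into NaN, which chol then rejects.

```matlab
% Toy data: d = 2 dimensions, n = 3 points (one point per column).
X = [1 2 3; 4 5 6];

% Responsibilities (n x k). Component 2 has lost all of its points.
R = [1 0; 1 0; 1 0];

nk = sum(R, 1);                          % effective counts: [3 0]
xbar = bsxfun(@times, X*R, 1./nk);       % weighted means; column 2 is 0*Inf = NaN

% Weighted, centered data and scatter matrix for the empty component
% (assumed form, see the note above).
i = 2;
Xs = bsxfun(@times, bsxfun(@minus, X, xbar(:,i)), sqrt(R(:,i))');
S = Xs*Xs'/nk(i);

emptyComponentIsNaN = all(isnan(S(:)));  % true: chol(S) cannot succeed
```

Replacing the zero in nk by a small number does not repair xbar or Xs once they contain NaN, which would be consistent with the observation that this workaround did not help.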

03 Jun 2014 Bogdan

Thanks for this! Is there any way to use it with multi-dimensional data points? It's giving me an error about the matrix not being positive definite.


31 May 2014 Jose Caballero

Thanks for the code!

Could you explain how to derive the variance of components from the output?
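
One possible way to get point estimates from the posterior parameters, as a hedged sketch: the field names alpha, m, v and M are taken from the snippet posted by Johannes above (including its note that M = inv(W)), and their interpretation as the Dirichlet, Gaussian and Wishart parameters from Bishop's chapter 10 is an assumption, not something confirmed by the author. The numbers in the struct are made up for illustration. Under that assumption, the precision of component i follows a Wishart(W, v) distribution, so the expected covariance is inv(W)/(v - d - 1).

```matlab
% Hypothetical posterior parameters for k = 2 components in d = 2 dimensions.
d = 2;
model.alpha = [30 20];                          % Dirichlet parameters, 1 x k
model.m     = [0 5; 0 5];                       % posterior means, d x k
model.v     = [32 22];                          % Wishart degrees of freedom, 1 x k
model.M     = cat(3, 29*eye(d), 38*eye(d));     % M = inv(W), d x d x k

k = numel(model.alpha);

w  = model.alpha / sum(model.alpha);            % expected mixing weights
mu = model.m;                                   % component means

Sigma = zeros(d, d, k);
for i = 1:k
    % Mean of the inverse-Wishart: E[inv(Lambda)] = inv(W) / (v - d - 1),
    % valid only when v > d + 1.
    Sigma(:,:,i) = model.M(:,:,i) / (model.v(i) - d - 1);
end

variances = [diag(Sigma(:,:,1)) diag(Sigma(:,:,2))];   % per-dimension variances, d x k
```

An alternative point estimate is the inverse of the expected precision, model.M(:,:,i) / model.v(i); the two differ noticeably only when v is small.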

04 Feb 2014 Chi-Fu

18 Jun 2013 Jia Tsing Ng

In response to Venkat R,

That problem is common. The easiest 'hack' is to add some noise on the diagonal of your matrix before chol-ing it.
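
A minimal illustration of that hack, using a deliberately singular toy matrix rather than anything from vbgm.m. The jitter size 1e-3 is the value used in Johannes's comment above; a suitable value depends on the scale of the data.

```matlab
% A singular scatter matrix: 2 dimensions but only one data point.
Xs = [1; 1];
S  = Xs*Xs';                      % [1 1; 1 1], rank 1

% With two outputs chol does not throw; p > 0 signals failure.
[~, p] = chol(S);

% Add a small constant to the diagonal (same effect as diag(ones(1,d)*1e-3)).
d = size(S, 1);
Sreg = S + 1e-3*eye(d);
[V, pReg] = chol(Sreg);           % pReg == 0: factorization succeeded, Sreg = V'*V
```

Note that this changes the matrix slightly, so any quantity computed from V (for example a log-determinant) is biased by the added constant.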

15 Apr 2013 Paul

Easy to use and quick for my data (~4000 pts, yielding ~20 clusters).

That having been said, is there any chance of getting more documentation on the outputs?

I need to take the identified clusters and use the model to classify future data. I've read and understood Bishop, but it is still difficult to reverse engineer the code (the variable names are terse) to actually use the identified model.
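
A rough sketch of how new points could be assigned to components once point estimates of the weights, means and covariances are available. This is an approximation that plugs point estimates into ordinary Gaussian densities; the exact variational predictive density in Bishop is a mixture of Student's t distributions. The variables w, mu, Sigma and Xnew are hypothetical and are not outputs of vbgm.m.

```matlab
% Hypothetical point estimates for k = 2 components in d = 2 dimensions.
w     = [0.6 0.4];                       % mixing weights, 1 x k
mu    = [0 5; 0 5];                      % means, d x k
Sigma = cat(3, eye(2), 2*eye(2));        % covariances, d x d x k

% New data, one point per column: (0.1, -0.2) and (4.8, 5.3).
Xnew = [0.1 4.8; -0.2 5.3];

[d, n] = size(Xnew);
k = numel(w);
logp = zeros(n, k);                      % log(w_i * N(x | mu_i, Sigma_i))
for i = 1:k
    U = chol(Sigma(:,:,i));                      % Sigma = U'*U
    Q = U' \ bsxfun(@minus, Xnew, mu(:,i));      % whitened residuals, d x n
    logp(:,i) = log(w(i)) - 0.5*sum(Q.^2, 1)' ...
                - sum(log(diag(U))) - 0.5*d*log(2*pi);
end

[~, label] = max(logp, [], 2);           % most probable component for each point
```

Working in the log domain avoids underflow when a point is far from every component.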

14 Jan 2013 Andrew

Hi Chandra,
It will likely work if you replace every ~ in the file with an unused variable name (e.g. 'trash'). In newer versions of MATLAB, a ~ can be used in place of an output variable when that output is not needed.
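
For illustration, the two forms side by side on a generic call (max is used here only as an example; it is not claimed to be the statement on line 33 of vbgm.m). To my knowledge the ~ placeholder was introduced in R2009b, which would explain the syntax error in 2009a.

```matlab
x = [3 9 4];

% Newer MATLAB releases: discard the first output with ~
[~, idx] = max(x);

% Older releases: assign the unwanted output to a throwaway variable
[trash, idxOld] = max(x);
```

Both calls return the same index; the second form merely leaves an unused variable in the workspace.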

19 Dec 2012 Chandra Shekhar

Hello Mo Chen.
When I run this code in MATLAB 2009a, I get the following error:

??? Error: File: vbgm.m Line: 33 Column: 3
Expression or statement is incorrect--possibly unbalanced (, {, or

Could you please explain this?

13 Dec 2012 yu

12 Oct 2012 Venkat R

I was able to run the test case successfully. But when I give my data as input, I get the error
Error using ==> chol
Matrix must be positive definite.

Error in ==> vbgm>vbound at 176
V = chol(Xs*Xs'/nk(i));

Error in ==> vbgm at 28
L(t) = vbound(X,model,prior)/n;

My data (286x162) doesn't contain any row or column that is entirely zero, although it does contain zeros in a few isolated places. Does this method have a limitation?


27 Mar 2012 Nicolás de la Maza

Hello Michael,

Could you please explain the outputs of this algorithm? I want to know where to find the mean, covariance and mixture component weight vectors in order to compare them with MATLAB's function.

Best regards!
