Why do I get two different covariance matrices?
Hi all,
Prof. Andrew Ng says in his ML class that we can calculate the covariance matrix as 1/m*X'*X,
where
the examples are in the rows of X,
X' is the transpose of X,
and m is the number of examples.
For example:
X=randi(12,[6,2]);
cov1=1/size(X,1)*X'*X
And the covariance from the cov function is:
cov2=cov(X)
As you can see, cov1 is different from cov2!
What is the reason for that? Do you have any idea?
Thanks
Accepted Answer
  Paul
 on 14 Jul 2022
        Hi Ali,
Perhaps Prof. Ng makes some additional assumptions about the data that aren't included in your question. To compute the covariance we have to subtract the mean of each column first. Whether the resulting product should then be scaled by 1/m or 1/(m-1) depends on assumptions about the underlying data. IIRC, if we know the data is drawn from a Normal distribution, dividing by m gives the maximum-likelihood estimate (perhaps for other distributions as well?), but typically we don't assume that and so divide by m-1 for unbiased estimation. As can be seen below, cov subtracts the mean and divides by m-1.
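To make the gap between the two results in the question explicit, here is a short derivation. It is only a sketch, assuming x_i denotes the i-th row of X written as a column vector, \bar{x} the vector of column means, and S the matrix returned by cov(X); these symbols are my notation, not from the original posts.

```latex
\frac{1}{m} X^{\top} X
  = \frac{1}{m} \sum_{i=1}^{m} x_i x_i^{\top}
  = \frac{1}{m} \sum_{i=1}^{m} (x_i - \bar{x})(x_i - \bar{x})^{\top} + \bar{x}\,\bar{x}^{\top}
  = \frac{m-1}{m}\, S + \bar{x}\,\bar{x}^{\top},
\qquad
S = \frac{1}{m-1} \sum_{i=1}^{m} (x_i - \bar{x})(x_i - \bar{x})^{\top}.
```

So 1/m*X'*X differs from cov(X) by the factor (m-1)/m and by the extra term \bar{x}\bar{x}^{\top}, which vanishes only when every column of X already has zero mean.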
rng(100); % fix the seed so the example is reproducible
X=randi(12,[6,2])
Xc=X-mean(X); % subtract each column's mean
cov1=1/(size(X,1)-1)*(Xc'*Xc)
cov(X) % matches cov1
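If the 1/m scaling from the lecture is what you want, the following sketch may help. It assumes the data are meant to be mean-centered first (an assumption on my part, not something stated in the question) and uses the second argument of cov, where a value of 1 requests normalization by m instead of m-1. The variable names covML and covBuiltin are just illustrative.

```matlab
rng(100);                      % reproducible example data
X = randi(12,[6,2]);
m = size(X,1);                 % number of examples (rows)
Xc = X - mean(X);              % center each column (implicit expansion, R2016b or later)
covML = (Xc'*Xc)/m;            % 1/m scaling applied to the centered data
covBuiltin = cov(X,1);         % cov with weight 1 normalizes by m
norm(covML - covBuiltin)       % should be on the order of round-off
```

With centering in place, the only remaining difference between the two conventions is the scale factor (m-1)/m.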
As for the second question
sigma=[6 2;2 3]; % cov matrix
[a1,v]   = eig(sigma)
[a2,s,~] = svd(sigma)
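we see that eig and svd return the same results in a different order: the eigenvalues of this matrix are 2 and 7, eig lists them in ascending order here, and svd lists the singular values in descending order. The corresponding columns can also differ in sign.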
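One way to check this is to reorder the eig output and compare it with the svd output. This is a minimal sketch for this particular symmetric positive definite matrix only; the names V, d, idx, U and S are my own choices.

```matlab
sigma = [6 2; 2 3];                  % covariance matrix from above
[V,D] = eig(sigma);                  % eigenvectors in V, eigenvalues on diag(D)
[d,idx] = sort(diag(D),'descend');   % reorder eigenvalues to match svd's ordering
V = V(:,idx);                        % reorder the eigenvectors the same way
[U,S,~] = svd(sigma);                % singular vectors in U, singular values on diag(S)
disp([d diag(S)])                    % the two columns should agree
disp(abs(sum(V.*U,1)))               % column-wise |dot products|; 1 means equal up to sign
```

The absolute value in the last line is there because an eigenvector is only determined up to sign, so V and U may have columns that point in opposite directions.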
3 Comments
  Paul
 on 14 Jul 2022
Which one of eig() or svd() to use? It would never occur to me to use svd() to get the eigenvectors of a symmetric matrix. I don't know whether eig() or svd() is better for that special case. Asking that as a new question is more likely to get the attention of knowledgeable people who can answer.
More Answers (1)
  ali yaman
 on 14 Jul 2022
        4 Comments
  John D'Errico
 on 14 Jul 2022
      Edited: John D'Errico
 on 14 Jul 2022
Since this is probably of some general interest, I'll actually post a question of my own, then answer it myself, discussing the relative issues between eig and svd.