Why do these two codes give different covariance matrices?

1st code:
S =[ 0.6661 - 0.7458i 0.4210 - 0.9071i 1.0000 + 0.0000i;-0.9127 + 0.4086i -0.8429 + 0.5381i 1.0000 + 0.0000i;1.0000 + 0.0000i 1.0000 + 0.0000i 1.0000 + 0.0000i;-0.9127 - 0.4086i -0.8429 - 0.5381i 1.0000 - 0.0000i;0.6661 + 0.7458i 0.4210 + 0.9071i 1.0000 - 0.0000i];
Rmm=cov(S);
2nd code:
S =[ 0.6661 - 0.7458i 0.4210 - 0.9071i 1.0000 + 0.0000i;-0.9127 + 0.4086i -0.8429 + 0.5381i 1.0000 + 0.0000i;1.0000 + 0.0000i 1.0000 + 0.0000i 1.0000 + 0.0000i;-0.9127 - 0.4086i -0.8429 - 0.5381i 1.0000 - 0.0000i;0.6661 + 0.7458i 0.4210 + 0.9071i 1.0000 - 0.0000i];
Rmm= S*S'/length(S(1,:));

Answers (1)

Hi Sadiq,
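cov subtracts the column mean from each column of S and divides by (size(S,1)-1), the number of observations minus one. Your second code does neither: it skips the mean subtraction and divides by length(S(1,:)), which is the number of columns. S and S' also have to be multiplied in the correct order, S'*S rather than S*S'.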
S =[ 0.6661 - 0.7458i 0.4210 - 0.9071i 1.0000 + 0.0000i;
-0.9127 + 0.4086i -0.8429 + 0.5381i 1.0000 + 0.0000i;
1.0000 + 0.0000i 1.0000 + 0.0000i 1.0000 + 0.0000i;
-0.9127 - 0.4086i -0.8429 - 0.5381i 1.0000 - 0.0000i;
0.6661 + 0.7458i 0.4210 + 0.9071i 1.0000 - 0.0000i];
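Rmm = cov(S)                      % built-in covariance, 3x3
S1 = S-mean(S);                   % subtract each column's mean (implicit expansion, R2016b or later)
Rmm1= S1'*S1/(size(S1,1)-1)       % [3x5] x [5x3], normalized by N-1; matches Rmm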
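
For reference, the computation in the code above can be written out entry by entry. This is only a sketch in notation that does not appear in the thread: N is the number of rows (observations) of S, \mu_j is the mean of column j, and the overbar is the complex conjugate that the ' operator applies.

```latex
\mu_j = \frac{1}{N}\sum_{i=1}^{N} S_{ij}, \qquad
C_{jk} = \frac{1}{N-1}\sum_{i=1}^{N} \overline{\left(S_{ij}-\mu_j\right)}\,\left(S_{ik}-\mu_k\right)
```

Here N = 5 and j, k run over the 3 columns, so C is 3x3.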

3 Comments

The obvious answer is that for matrix multiplication sizes, [3x5] x [5x3] = [3x3], but [5x3] x [3x5] = [5x5]. Since cov is coming up with [3x3], you know what the multiplication order has to be. A better answer is that there are 5 observations and 3 variables. For each pair of variables you are trying to see how correlated the observations are, as determined by the dot product of (observations of the first variable) and (observations of the second variable). That means S'*S is correct.
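
The size argument can be checked directly. The snippet below is a minimal sketch that uses a random complex 5x3 matrix as a stand-in for S (an assumption for illustration, not the data from the question); only the shapes matter here.

```matlab
% Stand-in data (hypothetical): 5 observations (rows) of 3 variables (columns).
rng(0);
S = complex(randn(5,3), randn(5,3));

A = S' * S;    % [3x5] x [5x3]: one entry per pair of variables
B = S * S';    % [5x3] x [3x5]: one entry per pair of observations

disp(size(A))        % 3 3
disp(size(B))        % 5 5
disp(size(cov(S)))   % 3 3, same shape as S'*S
```

Only S'*S has the same shape as cov(S), which is the quickest sign that S*S' is the wrong product.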
Well, the second code is actually not correct, and the different size of Rmm compared to cov(S) is one indication.
Actually it's pretty easy to say the second code is wrong, since it gives totally different results from MATLAB's cov, which is correct.
Besides which, YouTube videos are hardly infallible. Anyone can post one, and a few are wrong, so you have to take them with a grain of salt. This video is not incorrect, but I would say it is very misleading.
In the video, his table is in the standard form, with samples running down the columns and one column per variable. For each variable, its samples form a column vector. The key is that the covariance between two variables is an inner product, a scalar quantity. For two columns of sample data g and h, the inner product is g' * h [matrix multiplication of row g' x column h].
However, for purposes of illustration he enters the data as row vectors g and h and makes no comment about the distinction. Then the inner product is expressed the other way round, g * h'.
Since S has its samples in columns, you can see which multiplication is required: S'*S.
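
The row versus column distinction can be sketched with a small example. The vectors g and h below are made-up real-valued data (not taken from the video or the question), and the names gc, hc, c_col, c_row are just for illustration.

```matlab
% Hypothetical data: 5 samples each of two variables, stored as columns.
g = [1; 2; 3; 4; 5];
h = [2; 1; 4; 3; 6];
n = numel(g);

gc = g - mean(g);               % remove the mean of each variable
hc = h - mean(h);

c_col = gc' * hc / (n - 1);     % columns: row x column gives a scalar

gr = gc.';                      % same data entered as row vectors
hr = hc.';
c_row = gr * hr' / (n - 1);     % rows: the scalar is now g * h'

C = cov(g, h);                  % 2x2 matrix; C(1,2) is the covariance of g and h
disp([c_col, c_row, C(1,2)])    % all three agree

disp(size(gc * hc'))            % 5 5: with columns, g * h' is an outer product
```

With column data, writing g * h' does not fail; it quietly returns a 5x5 outer product instead of a scalar, which is the same thing that happens with S*S' in the second code.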

Asked: 5 May 2021
Edited: 10 May 2021
