Why must the covariance matrix be symmetric and positive definite for Mahalanobis distance?
I am trying to use the Mahalanobis distance in the pdist command, but sometimes I get an error that the covariance matrix must be symmetric and positive definite. To make it symmetric, we can multiply the matrix by its transpose, but if the original covariance matrix is only positive semi-definite, the product is still semi-definite. My question is: why does the matrix need to be positive definite and symmetric in the first place?
John D'Errico on 31 Jul 2022
Edited: John D'Errico on 31 Jul 2022
SIGH. Multiplying a covariance matrix by its transpose is NOT what you want to do! If it is already a covariance matrix (and therefore symmetric), that operation will SQUARE the eigenvalues. So it is completely incorrect. You will no longer have the same covariance matrix, or anything reasonably close to what you started with!
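To see the eigenvalue squaring concretely, here is a small sketch (my illustration, not part of the original answer): for a symmetric matrix C, the eigenvalues of C*C' are exactly the squares of the eigenvalues of C.

```matlab
% A small symmetric positive definite matrix with eigenvalues 1 and 3.
C = [2 1; 1 2];
eig(C)        % returns 1 and 3
eig(C*C')     % returns 1 and 9 -- the eigenvalues were squared, not preserved
```

So the "fix" of multiplying by the transpose changes the covariance structure itself, which is why it is the wrong repair.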
Perhaps more to the point: why does the Mahalanobis distance computation require a POSITIVE DEFINITE AND SYMMETRIC matrix?
The reason is that the distance computation uses a Cholesky decomposition, which requires a symmetric matrix that is at least positive semi-definite. The distance computation then uses the inverse of the Cholesky factor, and that inverse will not exist if your matrix is singular.
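A sketch of that computation (my illustration of the standard technique, not the actual pdist source): with R = chol(C), so that C = R'*R, the Mahalanobis distance between x and y is the ordinary Euclidean norm of (x-y) mapped through the inverse of R.

```matlab
% Cholesky-based Mahalanobis distance between two points.
C = [4 1; 1 3];                            % symmetric positive definite covariance
R = chol(C);                               % upper triangular factor, C = R'*R
x = [1 2];  y = [3 5];
d_chol   = norm((x - y)/R);                % (x-y)*inv(R), done via mrdivide
d_direct = sqrt((x - y)*inv(C)*(x - y)');  % textbook formula, same value
```

The two results agree because inv(C) = inv(R)*inv(R'). If C is singular, chol either fails outright or R is singular, and (x-y)/R is no longer well defined.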
I suppose you might ask why it needs to use a matrix factorization at all? That gets into the meaning of Mahalanobis distance, and for this I would probably need to teach an entire class on the subject, and a deep explanation of the linear algebra. But think of Mahalanobis distance as a variable ruler. The ruler varies in length, depending on which direction you point it in. (A strange, anisotropic ruler at that.) And the various directions in turn depend on the eigenvectors of your covariance matrix. If we look in the direction of an eigenvector with a zero eigenvalue, then the ruler is infinitely short. And that means any distance then computed with an infinitely short ruler will appear to be infinitely large as a distance.
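The ruler analogy can be made concrete with a toy example (mine, not from the original answer): along an eigendirection whose eigenvalue is (nearly) zero, the ruler is (nearly) infinitely short, so a unit step in that direction looks like an enormous distance.

```matlab
% Covariance whose second eigendirection has almost zero variance.
C = diag([1, 1e-12]);
x = [0 0];
d1 = sqrt((x-[1 0])*inv(C)*(x-[1 0])')   % unit step along the "long ruler": 1
d2 = sqrt((x-[0 1])*inv(C)*(x-[0 1])')   % same step along the "short ruler": 1e6
```

The same physical step of length 1 measures a million times longer in the degenerate direction, and in the limit of an exactly zero eigenvalue the distance is infinite.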
Is there a solution? Perhaps. I say that because you can use the tool I posted on the File Exchange to find the NEAREST positive definite matrix to a given matrix. It will adjust your matrix minimally, so that the result is now positive definite. However, will that really help you? It may, or it may not.
You can find nearestSPD here for free download:
Still, you need to recognize that a distance is meaningless for a singular covariance matrix; even for a nearly singular matrix it will give you meaningless results, with the predicted distance essentially infinite. Perhaps an example will be best.
First, some numbers. I'll start with a set of points in 9 dimensions that lies entirely in a 5-dimensional affine subspace, so its covariance matrix must be singular.
T = randn(5,9);                      % maps 5-d coordinates into 9-d
xyz = randn(100,5)*T + randn(1,9);   % 100 points confined to a 5-d affine subspace
Now, if I compute the covariance matrix of that set, it will be singular.
format long g
C = cov(xyz);
eig(C)
As you should see, some of the eigenvalues are negative, but only by a tiny amount, as is common with floating point arithmetic. But chol will fail on C, as would then using C to compute a Mahalanobis distance.
The problem is, you DO NOT want to multiply C by itself. That would be completely inappropriate here. But we can find a matrix close to C that IS both symmetric and positive definite.
Chat = nearestSPD(C);
Is it symmetric?
(Chat - Chat') == 0
Is it positive definite? A simple test is whether chol succeeds:
[~,p] = chol(Chat)
Here p is 0, which means chol succeeded; nearestSPD perturbs the matrix until that is true.
Numerically, it is still essentially rank 5 only, since the perturbation is far below the default rank tolerance.
And the difference between C and Chat is tiny.
norm(C - Chat)
But now I can use pdist:
D = pdist(xyz,'mahalanobis',Chat);
So all of the interpoint distances between members of xyz are well-posed and finite.
Be careful though: what happens if we compute the distance for random points that are NOT in the hyperplane of the data used to generate that covariance matrix?
xyzrand = randn(5,9);
D2 = pdist2(xyzrand,xyz,'mahalanobis',Chat);
As you can see, all of those distances are now virtually infinite, at least as well as pdist2 can determine them using a Mahalanobis distance.
But if I had tried to use pdist with the matrix C instead, it would have failed in either case.