Inconsistency in behavior of SVD

This came out of an example that I gave my class on computing SVD of low-rank flags (based on an exercise from Gilbert Strang's book). I had students compute low-rank approximations to different flags and then plot the singular values of the flags on a semilog scale. Some students' plots ended up looking different than others, and I realized it depended on whether they called svd with one or three output arguments. Here is a minimal working example:
F = zeros(1000,1800); F(:,1:600) = 37; F(:,601:1200) = 255; F(:,1201:end) = 102; %A tricolor flag, eg France or Italy
>> [u,s,v] = svd(F); s(100,100) %s is a matrix
ans = 2.1438e-11
>> s = svd(F); s(100) %s is a vector
ans = 2.1902e-113
Now, this flag has rank 1, so mathematically the 100th singular value is exactly zero. But what accounts for the roughly 100 orders of magnitude difference between the two results? (It certainly looks very different on a semilog plot!) My assumption was that the SVD calculation should be the same whether one requests the entire SVD or just the singular values, but clearly something different is happening behind the scenes.

Accepted Answer

Christine Tobler
Christine Tobler on 23 Dec 2021
Firstly, I agree completely with John's great explanations above: Any singular value below eps*first singular value should be treated as zero.
The behavior here isn't wrong: in both cases the outputs are accurate up to the expected round-off. That said, it is clearly puzzling, and we would like the two calls' results to match more closely (so that they look similar when plotted). We don't aim for agreement down to the last decimal between different calls like these, since we prefer to get the best possible performance for each individual call.
Some background: Computing the SVD of a matrix A is done in two stages:
  1. A = X*B*Y', where X and Y are orthogonal matrices and B is a bidiagonal matrix. This incurs a round-off error of eps*norm(A).
  2. B = Q*S*P' where Q and P are orthogonal matrices and S is a diagonal matrix of singular values.
For step 2, there is a specialized algorithm for cases where the scale changes a lot along the diagonals of B; it can accurately compute singular values that differ by much more than a factor of eps from largest to smallest. However, this algorithm computes only the singular values, not the singular vectors.
That algorithm has been added to the LAPACK library (a standard linear algebra library that MATLAB calls for SVD and other computations), where it is used when no singular vectors are requested. This is what we call in MATLAB, and it is why you are seeing the difference in behavior. We can't practically change this behavior from MATLAB without always computing the singular vectors even when they aren't requested, which would be a huge decrease in performance.
Some more comments:
  • These differences in behavior largely affect matrices with large differences in scale (either the norms of each row or the norms of each column have large differences between them). Other matrices are usually not noticeably affected.
  • While the singular values computed in the one-output case are accurate for the bidiagonal matrix B above, they are only accurate for the original matrix A if the round-off in step 1 was much smaller than usual (for example, if A was already bidiagonal to start with). Otherwise, we first introduce round-off in step 1, and then compute a very accurate picture of the singular values of a matrix that already fails to match A at the level of accuracy step 2 provides.
So in short, the reason for the difference is that a more accurate algorithm is used in the one-output case which doesn't exist for the three-output case. However, this algorithm is only more accurate if the input matrix satisfies some very special preconditions; otherwise it just represents the round-off error slightly differently, with no effect on accuracy. Singular values below eps*norm(A) should not be trusted, unless you have some very specific knowledge about your input matrix.
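To make that cutoff concrete: any singular value below eps*norm(A) sits at the round-off noise floor. Here is a small sketch (in Python rather than MATLAB, so it runs outside this session; the singular values are the ones quoted in this thread, and `sys.float_info.epsilon` plays the role of MATLAB's eps):

```python
import sys

sigma1 = 2.1466e+05                            # largest singular value of the flag
noise_floor = sys.float_info.epsilon * sigma1  # eps * norm(A), about 4.8e-11

# The two competing "100th singular values" from the question:
for sv in (2.1438e-11, 2.1902e-113):
    print(sv < noise_floor)   # True both times: each is numerically zero
```

Both values fall below the floor, so neither curve on the semilog plot carries meaning past the first singular value.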
Thomas Humphries
Thomas Humphries on 23 Dec 2021
Thanks for this great answer, Christine!


More Answers (2)

John D'Errico
John D'Errico on 21 Dec 2021
Edited: John D'Errico on 21 Dec 2021
F = zeros(1000,1800); F(:,1:600) = 37; F(:,601:1200) = 255; F(:,1201:end) = 102; %A tricolor flag, eg France or Italy
[U,S,V] = svd(F);
s = svd(F);
s(1)          % largest singular value
ans = 2.1466e+05
eps(s(1))     % spacing of doubles at that magnitude
ans = 2.9104e-11
S(100,100)    % 100th singular value from the three-output call
ans = 2.1438e-11
s(100)        % 100th singular value from the one-output call
ans = 2.1902e-113
So that 100th singular value is roughly no larger than eps*s(1). In fact, even the second singular value is roughly that small.
The first rule of computing, and especially of linear algebra, is to NEVER trust anything in the least significant bits, especially when subtly different computations are involved on different algorithmic paths.
It looks like they use different code, IF you just want the singular values. And that makes some sense. Since it is all trash at that point, see rule number 1.
John D'Errico
John D'Errico on 22 Dec 2021
M = rand(3);
[U,S1,V] = svd(M,"vector");
format long g
S1 = 3×1
          1.42429794861722
         0.567016326172269
         0.286243430498377
S2 = svd(M)
S2 = 3×1
          1.42429794861722
         0.567016326172269
         0.286243430498377
S1 == S2
ans = 3×1 logical array
   0
   0
   0
norm(S1 - S2)
ans =
So even though the two vectors look the same, the least significant bits differ just slightly. This is expected, and it is true of almost any algebraic computation: when two operations are mathematically equivalent but performed differently, you should NEVER trust that they will produce the same result down to the least significant bits.
A = rand(10);
B = rand(10);
C = rand(10);
(A + B - C) == (-C + B + A)
ans = 10×10 logical array
   1   1   1   1   1   1   1   1   1   1
   1   1   1   1   0   1   1   1   1   0
   1   1   1   1   1   0   1   1   1   0
   0   1   1   1   1   1   1   1   0   1
   1   1   1   1   1   0   0   0   0   0
   1   1   1   1   1   1   1   1   1   0
   0   1   0   1   1   1   1   0   0   1
   1   1   1   0   1   1   1   0   1   1
   1   1   0   1   1   1   1   1   1   1
   1   0   1   1   1   0   1   1   1   0
You should not be surprised at finding tiny differences there, any more than at the SVD producing differences on the order of machine epsilon when a different path is likely chosen in the code. Should MathWorks document that the two paths inside SVD use different code? As I said in my answer, that just invites more questions about why you should not trust the LSBs of those numbers. And worse: suppose next year they change the SVD code so that it does use the same path for both calls? Then they would need to change the documentation again and re-educate people.
The simple rule is best: NEVER trust the LSBs of your numbers in floating point computations.
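That rule can be demonstrated deterministically in two lines; here is the same associativity effect in plain Python (the classic 0.1 + 0.2 example, nothing MATLAB-specific):

```python
lhs = 0.1 + 0.2 - 0.3    # evaluated left to right: (0.1 + 0.2) - 0.3
rhs = -0.3 + 0.2 + 0.1   # mathematically identical, different order
print(lhs == rhs)        # False: both are O(eps), just not the same bits
```

Both residues are tiny, and both are "correct" answers; they simply round differently along the two evaluation paths.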


Chunru on 21 Dec 2021
F = zeros(1000,1800);
F(:,1:600) = 37; F(:,601:1200) = 255; F(:,1201:end) = 102; %A tricolor flag, eg France or Italy
This is a rank-1 matrix, as shown by:
rank(F)
ans = 1
For a rank-1 matrix, there is only one nonzero singular value. MATLAB arranges the singular values in descending order, so the first value is the desired singular value. All the rest should ideally be zero, but are subject to computational accuracy.
So both s(100,100) for the matrix s and s1(100) for the vector s1 correspond to a true singular value of 0.
[u,s,v] = svd(F);
s(100,100) %s is a matrix
ans = 2.1438e-11
s1 = svd(F);
s1(100) %s is a vector
ans = 2.1902e-113
We don't know the exact internal implementation of svd inside MATLAB, but the results from the two calls are both accurate up to the computing accuracy of double precision.
format longg
[diag(s) s1]
ans = 1000×2
        214659.730736811          214659.730736811
    2.97801336572204e-10      2.97104202825509e-10
     2.1438088706804e-11      1.03894314937022e-12
     2.1438088706804e-11      1.06656371608964e-23
     2.1438088706804e-11      7.52960822699182e-24
     2.1438088706804e-11      7.33435968746888e-24
     2.1438088706804e-11      5.73717383511201e-24
     2.1438088706804e-11      4.92384115399226e-24
     2.1438088706804e-11      4.28392707240299e-24
     2.1438088706804e-11      3.40957046848442e-24
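As a cross-check on the one real singular value in the table: for a rank-1 matrix F = u*v', the single nonzero singular value is norm(u)*norm(v), which can be computed by hand for this flag (sketched in Python for independence from the MATLAB session):

```python
import math

# The flag is the outer product of a length-1000 column of ones and a
# length-1800 row containing 600 copies each of 37, 255 and 102.
u_norm = math.sqrt(1000)
v_norm = math.sqrt(600 * (37**2 + 255**2 + 102**2))
sigma1 = u_norm * v_norm
print(sigma1)   # 214659.73..., matching the first row of [diag(s) s1]
```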

