MATLAB precision (?) error that escalates to infinity

Assume some given vectors mu and cc (of length m) and an m-by-m matrix AA. Then, I repeatedly apply (e.g. 2000 times) the following update:
mu = cc + AA*mu
For every update, I tried 2 different ways:
  • I directly compute cc + AA*mu and assign it to mu, or
  • For every i = 1,...,m I compute: mu(i) = cc(i) + AA(i,:)*mu
Theoretically, the two techniques should be exactly equivalent. However, as the updates go on, a small error accumulates and grows to infinity (as shown in the figure produced by the attached code). I also attach a .mat file used to initialize the arrays.
So, which of the two techniques is the correct one to use in MATLAB, and which one is not? Is the reason for the error MATLAB's floating-point precision, or something else?
Note: I am running R2021a.
%%%%% Initialization
load 'my_mats.mat';                    % provides mu_init, cc, AA
max_iters = 2000;
mu = mu_init;                          % iterate for the element-wise technique
MU = mu_init;                          % iterate for the matrix-product technique
[mu2, MU2] = deal(mu, MU);
[mus, MUS] = deal(mu, MU);             % histories of the iterates

%%%%% Run the multiplications
iter = 0;
while iter <= max_iters
    % Technique 2: one dot product per entry
    for idx = 1:length(mu_init)
        mu2(idx) = cc(idx) + AA(idx,:)*mu;
    end
    % Technique 1: a single matrix-vector product
    MU2 = cc + AA*MU;
    mus = [mus mu2];
    MUS = [MUS MU2];
    [mu, MU] = deal(mu2, MU2);
    iter = iter + 1;
end

%%%%%%% PROBLEM: The difference is not zero. Why???
diffrnce = mus - MUS;
diffs = squeeze(vecnorm(diffrnce, 1)).';
figure;
semilogy(0:max_iters+1, diffs); grid on;
  3 Comments
Torsten on 10 Jun 2022 (edited)
I think it's not focusing on the wrong things. I find it natural that, for a person inexperienced in numerical maths, it's of course surprising that, starting with exactly the same input data and using two equivalent techniques, the results already differ after the first iteration.
By the way: In this special case, it's also surprising to me.
Stephen23 on 10 Jun 2022 (edited)
"By the way: In this special case, it's also surprising for me."
My guess is that the vectorized code uses some BLAS (or similar) routine, while the loop ... well, something else, perhaps something written by TMW. There are various differences that those routines could have, such as the use/number of extended-precision bits, the algorithm used, the order of operations, etc.
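As a quick illustrative sketch of that order-of-operations point (the data here is arbitrary; this is not what BLAS actually does internally): summing the same values with the built-in sum and with a naive left-to-right loop typically gives slightly different results.

rng(0);                       % fixed seed so the example is reproducible
x = randn(1e6, 1);
s_builtin = sum(x);           % built-in sum; its internal ordering is unspecified
s_loop = 0;
for k = 1:numel(x)            % naive left-to-right accumulation
    s_loop = s_loop + x(k);
end
fprintf('difference: %g\n', s_builtin - s_loop);   % typically a tiny nonzero value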
"I find it natural that for an unexperienced person in numerical maths..."
Should be aware of how errors accumulate, even before they touch a computer: accumulated error applies to data measurements and any calculations on it, not just those on computers using floating point numbers.


Accepted Answer

Torsten on 10 Jun 2022
Your iteration diverges.
Both mu2 and MU2 contain numbers on the order of 1e165 by the time iter reaches max_iters.
So you shouldn't worry about the difference: it's on the order of rounding error for numbers of this size.
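One quick way to confirm the divergence (a sketch, assuming the AA from the attached my_mats.mat is loaded): the fixed-point iteration mu = cc + AA*mu diverges whenever the spectral radius of AA exceeds 1, and then any rounding difference gets amplified at every step.

rho = max(abs(eig(AA)));                      % spectral radius of the iteration matrix
fprintf('spectral radius of AA: %g\n', rho);  % a value > 1 explains the blow-up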
  2 Comments
Georgios Apostolakis on 10 Jun 2022
Yes, but if you decrease the number of iterations you will see that the difference is still there. So, is the only reason for its existence the computational error of MATLAB?
Torsten on 10 Jun 2022
The reason the error exists at all is floating-point arithmetic.
The reason the error is amplified so much is that your matrix iteration diverges.
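A sketch of what that amplification looks like, reusing the variables from the script in the question after it has run: the relative difference between the two schemes should stay near machine precision even while the absolute difference blows up. (The first point is exactly zero and will not show on the log scale.)

rel = vecnorm(mus - MUS, 1) ./ vecnorm(MUS, 1);   % relative 1-norm difference per iterate
figure;
semilogy(0:size(mus,2)-1, rel); grid on;          % stays tiny, unlike the absolute difference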


More Answers (1)

Walter Roberson on 10 Jun 2022
Matrix multiplication calculates each entry of the result as the sum of a number of pairwise products. Sometimes this is expressed using the dot product operator: dot(A(P,:), B(:,Q)) = A(P,1)*B(1,Q) + A(P,2)*B(2,Q) + ... and so on.
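A tiny made-up instance where the grouping of such a sum of products changes the double-precision answer:

a = [1e16, 1, -1e16];
b = [1, 1, 1];
left_to_right = (a(1)*b(1) + a(2)*b(2)) + a(3)*b(3)   % 0: the +1 is absorbed by 1e16
regrouped     = (a(1)*b(1) + a(3)*b(3)) + a(2)*b(2)   % 1: the large terms cancel first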
Now, if you were working to indefinite precision, the associative and commutative properties would hold, and the order in which you did the additions would not matter. It would be perfectly fine to divide the products up into four groups, use a core to sum within each group (working in parallel), and then total the four partial sums at the end. With indefinite precision, you could add 2^1024 + 2^-1024 to get something different from 2^1024, and then subtract off 2^1024 to get exactly 2^-1024 as the result. Furthermore, you could add those 2^1024 terms millions of times before subtracting out an equal number of them.
It follows that if you use indefinite precision, your calculation system must be able to use pretty much all of your memory to store a single number, since A+B-A-B+A+B-A-B repeated trillions of times with indefinite precision would mathematically have to give an exact zero, as you could rewrite it as (A+A+A+... A) + (B+B+B+... B) - (A+A+A+... A) - (B+B+B+... B). Make A large in absolute magnitude and B small in absolute magnitude and you can obviously drive the number of bits needed to represent the intermediate numbers as large as you want, and every bit would have to be preserved just in case what is being subtracted does not exactly balance until trillions more calculations are done.
And so far we are just dealing with addition, subtraction, and pairwise multiplication of finite precision numbers. Even without having gotten into transcendental calculations such as trig or exponentials we have shown that if you want mathematical properties of association and commutation to hold, then you need to use a calculation system that can use all of your memory to represent a single number. (Well, really you need to be able to use more memory than that!)
The reality, of course, is that hardware representation of numbers has a fixed maximum number of bits for each number. And because of that, the associative and commutative laws cannot hold. Whenever you fix the maximum number of bits to represent something, then for any given number A, there exists a non-zero number B such that A+B is indistinguishable from A. The same logic holds for decimal representation, and holds whether you use 32 bits or 64 bits or 256 bits or 65536 bits per number: if your numeric representation for each number cannot grow to as much memory as you have, then the associative and commutative laws must fail. This is not a MATLAB bug: it is Information Theory. Any fixed representation can only represent a finite number of states, but the associative and commutative laws require an unbounded number of states.
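A concrete instance of that claim in IEEE double precision:

A = 1;
B = eps(A)/4;       % nonzero, but less than half the spacing of doubles near 1
disp(A + B == A)    % displays 1 (true): B is absorbed without a trace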
And so, getting back to the sum of products for the matrix multiplication, it follows that the order of the additions must matter for any representation that has a maximum size for numeric values. Depending on how you add the terms, you may get a different result. This is not a MATLAB bug; it is a fundamental limitation of representations of limited length.
When you take the two different approaches, you are going down different paths in determining the order in which to do the additions. The full matrix multiplication is probably being sent to a high-performance routine that probably splits the calculation between different cores, creating a partial sum for each core and then adding the partial sums. The other calculation might be considered too small to be worth splitting across multiple cores, as there is a communication cost to dividing the load ("A human can outrun a horse over short distances.")
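Here is a sketch of that partial-sum effect; the four-way chunking below is hypothetical and only mimics what a multi-core split might do:

x = randn(4e6, 1);
chunks = reshape(x, [], 4);    % pretend each column is handled by one core
partials = sum(chunks);        % one partial sum per "core"
sum(partials) - sum(x)         % typically a small nonzero value, not exactly 0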
Which one is "right"? Neither of them. Switch to the Symbolic Toolbox if you need the "right" answer.
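For instance (this requires the Symbolic Math Toolbox), the absorption from the double-precision examples above disappears under exact symbolic arithmetic:

a = sym(10)^16;
(a + 1) - a        % returns 1 exactly, unlike the equivalent double computation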
