Can you get matlab to add a really small number to a big number and retain precision

I am trying to calculate small numbers from equations like the one below.
p = -1e17 + sqrt(1e34 + 1e18)
I have tried using 'vpa(-1e17 + sqrt(1e34 + 1e18), 30) but the output always comes out as p = 0.0. But p is not exactly zero and this propagates errors in subsequent calculations that involve p. Is there another way to retain sig figs to avoid rounding errors?

2 Comments

Open the Neural Net Fitting App
  • Before R2026a: MATLAB Toolstrip: On the Apps tab, under Machine Learning and Deep Learning, click the app icon.
  • MATLAB command prompt: Enter nftool.
The implication is that it can only be opened from the command line in R2026a.
This was posted in response to a comment from FUN-AKEDE about being unable to find the Neural Fitting app in the toolstrip. They have removed the comment since then.

Sign in to comment.

 Accepted Answer

p = -1e17 + sqrt(1e34 + 1e18)
p = 0
p is in fact exactly 0 if you do the calculation in double precision.
p == 0 % true
ans = logical
1
Why didn't it work with vpa? The way you called vpa:
p = vpa(-1e17 + sqrt(1e34 + 1e18), 30)
p = 
0.0
you're doing the calculations to generate the input argument with which vpa() will be called in double precision and passing that double precision result into vpa. By the time vpa gets called, p is already 0.
You could convert each of the numbers into sym and then do the calculations. If you do this, MATLAB will do the calculations symbolically; note the radical sign in the display of p.
x17 = sym(1e17);
x34 = sym(1e34);
x18 = sym(1e18);
p = -x17 + sqrt(x34+x18)
p = 
Then you can call vpa on that symbolic answer to evaluate it using variable precision arithmetic.
vpa(p, 30)
ans = 
4.99999999999999987499995984189

8 Comments

There is actually something else going here that is worth discussing. That is, you are relying on the sym engine to guess your intent correctly and get the desired sym numbers. E.g., 1e34 cannot be represented exactly in IEEE double precision. This is what is actually stored in the background:
fprintf('%50f\n',1e34)
9999999999999999455752309870428160.000000
But the sym engine is trying to be nice to you and guesses that you really meant exactly 1e34 instead of what is actually stored in the double precision value, so that is what this gets converted to. Rather than relying on that (which may or may not be correct depending on your application), I always spell it out for the sym engine so there are no misunderstandings that bite me down the road. E.g.,
x34 = sym(10)^34
x34 = 
10000000000000000000000000000000000
This way you are not relying on black box logic inside the sym engine to guess your intent correctly by looking for nearby "nice" numbers close to your actual double precision value. For this case it didn't matter, but for other cases that are slightly more complicated, it can make a difference.
This is documented behavior (with a little handwaving) of the sym function. As stated on the documentation page, the default value for the flag input argument that controls how sym treats the first input is 'r' which does:
"When sym uses the rational mode, it converts floating-point numbers obtained by evaluating expressions of the form p/q, p*pi/q, sqrt(p), 2^q, and 10^q (for modest-sized integers p and q) to the corresponding symbolic form. For example, sym(1/10,"r") returns 1/10. This mode effectively compensates for the round-off error involved in the original evaluation but might not represent the floating-point value precisely. If sym cannot find a simple rational approximation, then it uses the same technique as it would use with the flag "f"."
The handwaving bit is the fact that we don't specifically mention the threshold for what is a "modest-sized integer".
But if you didn't want to depend upon this, you could absolutely use the technique you described. And if you want to do something that is in a form similar to one of those expressions but not exactly one of those forms (3^q for example) or one where the value would overflow double, it would be safer (or in the overflow case necessary) to convert the integer to sym then raise it to the appropriate power.
x1 = sym(3^100)
x1 = 
515377520732011324194596268868618440852459487232
x2 = sym(3)^100
x2 = 
515377520732011331036461129765621272702107522001
x3 = sym(2^10000)
x3 = 
x4 = sym(2)^10000
x4 = 
19950631168807583848837421626835850838234968318861924548520089498529438830221946631919961684036194597899331129423209124271556491349413781117593785932096323957855730046793794526765246551266059895520550086918193311542508608460618104685509074866089624888090489894838009253941633257850621568309473902556912388065225096643874441046759871626985453222868538161694315775629640762836880760732228535091641476183956381458969463899410840960536267821064621427333394036525565649530603142680234969400335934316651459297773279665775606172582031407994198179607378245683762280037302885487251900834464581454650557929601414833921615734588139257095379769119277800826957735674444123062018757836325502728323789270710373802866393031428133241401624195671690574061419654342324638801248856147305207431992259611796250130992860241708340807605932320161268492288496255841312844061536738951487114256315111089745514203313820202931640957596464756010405845841566072044962867016515061920631004186422275908670900574606417856951911456055068251250406007519842261898059237118054444788072906395242548339221982707404473162376760846613033778706039803413197133493654622700563169937455508241780972810983291314403571877524768509857276937926433221599399876886660808368837838027643282775172273657572744784112294389733810861607423253291974813120197604178281965697475898164531258434135959862784130128185406283476649088690521047580882615823961985770122407044330583075869039319604603404973156583208672105913300903752823415539745394397715257455290510212310947321610753474825740775273986348298498340756937955646638621874569499279016572103701364433135817214311791398222983845847334440270964182851005072927748364550578634501100852987812389473928699540834346158807043959118985815145779177143619698728131459483783202081474982171858011389071228250905826817436220577475921417653715687725614904582904992461028630081535583308130101987675856234343538955409175623400844887526162643568648833519463720377293240094456246923254350400678027273837755376406726898636241037491410966718557050759098100246789880178271925953381282421954028302759408448955014676668389697996886241636313376393903373455801407636741877711055384225739499110186468219696581651485130494222369947714763069155468217682876200362777257723781365331611196811280792669481887201298643660768551639860534602297871557517947385246369446923087894265948217008051120322365496288169035739121368338393591756418733850510970271613915439590991598154654417336311656936031122249937969999226781732358023111862644575299135758175008199839236284615249881088960232244362173771618086357015468484058622329792853875623486556440536962622018963571028812361567512543338303270029097668650568557157505516727518899194129711337690149916181315171544007728650573189557450920330185304847113818315407324053319038462084036421763703911550639789000742853672196280903477974533320468368795868580237952218629120080742819551317948157624448298518461509704888027274721574688131594750409732115080498190455803416826949787141316063210686391511681774304792596709376
"But if you didn't want to depend upon this ..."
I appreciate the effort the coders of the sym engine put into these "be nice to the user" features, but frankly I would personally never rely on it in code I wrote. I mean, why bother trying to remember the rules and limits for what the sym engine will recognize and what it won't? I sure don't. So yes, I would always spell it out unambiguously in my code when defining sym constants, etc.
Testing the boundaries:
format long g
V = pi
V =
3.14159265358979
W = sym(pi)
W = 
π
for K = 1 : 17
if sym(round(V,K)) == W
fprintf('pi match at %d digits\n', K)
else
fprintf('no pi match at %d digits\n', K);
end
end
no pi match at 1 digits no pi match at 2 digits no pi match at 3 digits no pi match at 4 digits no pi match at 5 digits no pi match at 6 digits no pi match at 7 digits no pi match at 8 digits no pi match at 9 digits no pi match at 10 digits no pi match at 11 digits no pi match at 12 digits
pi match at 13 digits pi match at 14 digits pi match at 15 digits pi match at 16 digits pi match at 17 digits
fprintf('\n');
V = sqrt(7)
V =
2.64575131106459
W = sqrt(sym(7))
W = 
for K = 1 : 17
if sym(round(V,K)) == W
fprintf('sqrt match at %d digits\n', K)
else
fprintf('no sqrt match at %d digits\n', K);
end
end
no sqrt match at 1 digits no sqrt match at 2 digits no sqrt match at 3 digits no sqrt match at 4 digits no sqrt match at 5 digits no sqrt match at 6 digits no sqrt match at 7 digits no sqrt match at 8 digits no sqrt match at 9 digits no sqrt match at 10 digits no sqrt match at 11 digits no sqrt match at 12 digits
sqrt match at 13 digits sqrt match at 14 digits sqrt match at 15 digits sqrt match at 16 digits sqrt match at 17 digits
V = rand
V =
0.516779586083504
W = sym(V,'f')
W = 
for K = 1 : 17
if sym(round(V,K)) == W
fprintf('rand match at %d digits\n', K)
else
fprintf('no rand match at %d digits\n', K);
end
end
no rand match at 1 digits no rand match at 2 digits no rand match at 3 digits no rand match at 4 digits no rand match at 5 digits no rand match at 6 digits no rand match at 7 digits no rand match at 8 digits no rand match at 9 digits no rand match at 10 digits no rand match at 11 digits no rand match at 12 digits no rand match at 13 digits no rand match at 14 digits no rand match at 15 digits no rand match at 16 digits
rand match at 17 digits
And another subtle difference for large numbers is as follows for trig functions:
sin(1e34)
ans = -0.5082
double(sin(sym(1e34)))
ans = -0.6861
What the heck happened? Well, it turns out that double precision trig functions are very literal with the input arguments. In the above sin(1e34) call, the sin( ) function in the background is effectively doing a high precision mod(input_argument,2π) using the exact double precision value of the input with an extended precision π value. (I don't know the exact algorithm it uses, but it is very good at it.) It doesn't try to guess that you really meant exactly 1e34 in this case. So you get a difference in the result. E.g.,
s34 = sprintf('%50f\n',1e34) % the exact double precision value of 1e34 as a string
s34 =
' 9999999999999999455752309870428160.000000 '
x34 = sym(s34) % the exact sym version of this
x34 = 
9.9999999999999994557523098704282e+33
double(sin(x34)) % sin( ) of this symbolic converted string matches double precision sin( )
ans = -0.5082
So you can see what the double precision sin( ) is effectively doing with the input argument. There is no nudging it to a nearby "nice" value. It uses the exact input argument along with an effective extended precision mod with 2π.
As a side note, if you want to have a mod(___,2π) function that matches what the MATLAB trig functions effectively do in the background, you could do this (albeit this is computation intensive and produces result in range -π to +π):
function y = mod2pi(x)
y = atan2(sin(x),cos(x)); % force MATLAB to do the extended precision mod(__,2pi) with the sin( ) and cos( ) calls
end
E.g.,
x = 1e34
x = 1.0000e+34
y = atan2(sin(x),cos(x)) % the extended precision y = mod(x,2pi) version of x
y = -2.6085
sin(x)
ans = -0.5082
sin(y)
ans = -0.5082
Note that you don't get the same result with a naive double precision mod:
mod(x,2*pi)
ans = 0
sin(mod(x,2*pi))
ans = 0
See this Cleve's Corner article from 2002 for a discussion of how some of the special functions are implemented. It doesn't go into a lot of detail about the specifics of the implementation, but you still might find it interesting.
I took a quick look. It appears the library employed for this uses 396 hex digits of 2/π in order to calculate the extended precision mod values.
Just for fun (yes, I think this type of stuff is actually fun), here is BING:
And GOOGLE:
So it appears they both use the double precision value as input rather than symbolic, and use an extended precision mod in the background for argument reduction. Possibly same library as MATLAB?

Sign in to comment.

More Answers (4)

HI Stephen,
It's certainly doable, but rather than have the software crank away at a precise enough calculation, in this case I think it works better to recognize that a very small parameter is running the show, and sidestep the precision problem all together:
p = -a + sqrt(a^2+b^2) % a = 1e17 b = 1e9 b<<a
= -a + a*sqrt(1 + b^2/a^2)
= -a + a*[1 + (1/2)(b^2/a^2) + ((1/2)(-1/2)/2!)(b^2/a^2)^2 ... ] % series expansion
= -a + a + (1/2)(b^2/a) -(1/8)(b^4/a^3) ...
= (1/2)(1e18/1e17) -(1/8)(1e36/1e51) ...
= 5 - 1.25e-16 ...
which seems accurate enough for most purposes (the next term is +6.25e-33).

1 Comment

A good point, that good numerical analysis can often obviate the need for high precision computations. Well written double precision code will be sufficient for most tasks. I even recall the blog posts from Cleve looking at half precision arithmetic, but there you need to be very careful.

Sign in to comment.

p = sym(-1e17) + sqrt(sym(1e34) + sym(1e18))
p = 
vpa(p, 30)
ans = 
4.99999999999999987499995984189
The problem with doing vpa(-1e17 + sqrt(1e34 + 1e18)) is that when you do that, the expression -1e17 + sqrt(1e34 + 1e18) is calculated in double precision, and the double precision result is then converted to symbolic.
In MATLAB, if you use one of the int*() calls or one of the uint*() calls, and inside the () you specify a constant with no complex component, then those int*() or uint*() calls take effect at parse time.
In all other cases, for all other function calls, the parameters are evaluated with default double precision with the function being applied after the resulting expression.
For example
int64(9223372036854775806)
ans = int64 9223372036854775806
int64(+9223372036854775806)
ans = int64 9223372036854775807
int64(9223372036854775806.)
ans = int64 9223372036854775807
The +9223372036854775806 and the 9223372036854775806. are not considered integer literal expressions, so their content is evaluated in double precision, with int64() then applied to the double precision result.
So it is that vpa(-1e17 + sqrt(1e34 + 1e18)) calculates the expression in double precision and applies vpa() to the double precision result.

1 Comment

You have to be careful about how you create the expressions
sym(9223372036854775806)
ans = 
9223372036854775808
so as described, if you give a numeric argument to sym() then it is evaluated in default double precision before sym() gets control.
If you quote on the other hand,
sym('9223372036854775806')
ans = 
9223372036854775806
but
p1 = sym('-1e17')
p1 = 
p2 = sym('1e34')
p2 = 
1.0e+34
p3 = sym('1e18')
p3 = 
1000000000000000000.0
p = p1 + sqrt(p2 + p3)
p = 
4.9999999999999998750000127814531
this shows that if you use quoting that the result is a symbolic floating point result, not a symbolic rational.
The fact that parameters get evaluated in default double precision shows that it is unreliable to pass larger or higher precision floating point values. The fact that quoted parameters become symbolic floating point numbers means that you often do not want to pass quoted parameters either.
So it is often better to construct the symbolic numbers, such as
p1 = sym(10)^17
p1 = 
100000000000000000
p2 = sym(10)^34
p2 = 
10000000000000000000000000000000000
p3 = sym(10)^18
p3 = 
1000000000000000000
p = p1 + sqrt(p2 + p3)
p = 
q = sym(12345)/sym(10)^4
q = 
feval(symengine, 'coerce', sym('1e34'), 'DOM_RAT')
ans = 
10000000000000000000000000000000000
Using feval() to coerce a number is decidedly obscure !!!

Sign in to comment.

Can you get matlab to add a really small number to a big number and retain precision
You can, but it is better to just reorganize the computation in a way that doesn't require it.
From the quadratic formula, you can see that p is the right hand root of the quadratic equation,
0.5x^2 + 1e17x - 0.5e18 = 0
It looks like this equation came from a quadratic fit to data that is essentially linear, resulting in small numeric junk for the second degree term. If so, you should just repeat the fit using a linear model only. Or, just disregard the 2nd degree term and solve for the root of the linear part,
1e17x - 0.5e18 = 0
which will then give x=5.
using a first order approximation of the power function f(x) = (1+x)^n
this can be approximated by f1(x) ~= (1+n*x) for x << 1
thus p = -1e17 + sqrt(1e34 + 1e18) can be rewritten :
p = -1e17*(1 - sqrt(1 + 1e18/1e34))
p = -1e17*(1 - sqrt(1 + 1e-16))
p = -1e17*(1 - (1 + 0.5*1e-16))
p = 1e17 * 0.5 * 1e-16 = 5

Products

Release

R2025a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!