Significant digits of floating-point numbers
MuPAD® notebooks will be removed in a future release. Use MATLAB® live scripts instead.
MATLAB live scripts support most MuPAD functionality, though there are some differences. For more information, see Convert MuPAD Notebooks to MATLAB Live Scripts.
The environment variable DIGITS determines
the number of significant decimal digits in floating-point numbers.
The default value is
DIGITS = 10.
Possible values: a positive integer larger than 1 and smaller than 2^29 + 1.
Floating-point numbers are created by applying the function
float to exact numbers or numerical expressions.
Elementary objects are approximated by the resulting floats with a
relative precision of
10^(-DIGITS), i.e., the first
DIGITS decimal digits are correct. See Example 1.
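As a rough check of the relative-precision statement, the following sketch compares a 10-digit approximation of 1/7 with the exact rational value, measuring the error at a higher working precision. The names approx and relerr are illustrative and not part of MuPAD:

```mupad
DIGITS := 10:
approx := float(1/7):                // approximation with 10 significant digits
DIGITS := 30:                        // raise the precision to measure the error
relerr := abs(approx - 1/7)/(1/7):   // relative error against the exact value
float(relerr);                       // should be of order 10^(-10) or smaller
delete approx, relerr, DIGITS:
```

Because of internal guard digits, the measured error may turn out considerably smaller than 10^(-10).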
In arithmetical operations with floating-point numbers, only
DIGITS decimal digits are taken into
account. The numerical error propagates and may grow in the course
of computations. See Example 2.
If a real floating-point number is entered directly (e.g., by
x := 1.234), a number with at least
DIGITS decimal digits is created.
If a real float is entered with more than DIGITS digits,
the internal representation stores the extra digits. However, they
are not taken into account in arithmetical operations, unless DIGITS is
increased accordingly. See Example 3.
In particular, complex floating-point numbers are created by adding the real and imaginary part. This addition truncates extra decimal places in the real and imaginary part.
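The difference between real and complex input can be explored with a sketch like the following. It enters the same 30-digit real part once on its own and once as part of a complex number, then raises DIGITS to inspect which digits were kept. The names x and z are illustrative:

```mupad
DIGITS := 10:
x := 1.23456789012345678901234567890:          // real float: extra digits are stored
z := 1.23456789012345678901234567890 + 2.0*I:  // complex float: created by an addition
DIGITS := 30:
x, Re(z);                                      // compare the digits kept in each case
delete x, z, DIGITS:
```

According to the description above, the real part of z is expected to have lost the extra decimal places that x still carries.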
The value of
DIGITS may be changed at any
time during a computation. If
DIGITS is decreased,
only the leading digits of existing floating-point numbers are taken into
account in the following arithmetical operations. If DIGITS is
increased, existing floating-point numbers are internally padded with
trailing binary zeroes. See Example 4.
Depending on DIGITS, certain functions such
as the trigonometric functions may give wrong results if floats as
arguments are too inaccurate. See Example 5.
Depending on DIGITS, only significant digits
of floating-point numbers are displayed on the screen. Preferences
such as Pref::trailingZeroes can
be used to modify the screen output. See Example 4.
At least one digit after the decimal point is displayed; if it is insignificant, it is replaced by zero. See Example 6.
Internally, floating-point numbers are created and stored with some extra “guard digits.” These are also taken into account by the basic arithmetical operations.
For example, for
DIGITS = 10, the function
float converts exact
numbers to floats with some more decimal digits. The number of guard
digits depends on DIGITS.
At least 2 internal guard digits are available for any value of DIGITS.
Environment variables such as DIGITS are
global variables. Upon return from a procedure that changes DIGITS,
the new value is valid outside the context of the procedure as well!
Use save DIGITS to restrict the modified value
of DIGITS to the procedure. See Example 8.
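To make the global side effect concrete, here is a minimal sketch of a procedure that changes DIGITS without declaring save DIGITS. The procedure name leaky is hypothetical and chosen only for this illustration:

```mupad
// Hypothetical helper: sets DIGITS but does not declare "save DIGITS"
leaky := proc(x)
begin
  DIGITS := 30;
  float(x)
end_proc:

DIGITS := 10:
leaky(PI):
DIGITS;                  // still the value set inside leaky, not 10
delete leaky, DIGITS:
```

Adding save DIGITS; before begin should confine the change to the procedure body.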
The default value of DIGITS is 10; DIGITS has
this value after starting or resetting the system via
reset. Also the command
delete DIGITS; restores the default value.
See the help page of float for further information.
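A short sketch of restoring the default with delete, assuming a session in which DIGITS was changed earlier:

```mupad
DIGITS := 25:     // some non-default precision
delete DIGITS:    // restore the default
DIGITS;           // the default value 10 again
```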
We convert some exact numbers and numerical expressions to floating-point approximations:
DIGITS := 10: float(PI), float(1/7), float(sqrt(2) + exp(3)), float(exp(-20))
DIGITS := 20: float(PI), float(1/7), float(sqrt(2) + exp(3)), float(exp(-20))
We illustrate error propagation in numerical computations. The
following rational number approximates exp(2) to
17 decimal digits:
r := 738905609893065023/100000000000000000:
The function float converts both exp(2) and r to floating-point
approximations. The approximation errors propagate and are amplified
in the following numerical expression:
DIGITS := 6: float(10^20*(r - exp(2)))
None of the digits in this result is correct. A better result
is obtained by increasing DIGITS:
DIGITS := 20: float(10^20*(r - exp(2)))
delete r, DIGITS:
In the following, only 10 of the entered 30 digits are regarded as significant. The extra digits are stored internally, anyway:
DIGITS := 10: a := 1.23456789666666666666666666666; b := 1.23456789444444444444444444444
We increase DIGITS. Because the internal representations of a and b are
correct to 30 decimal places, the difference can be computed correctly
to 20 decimal places:
DIGITS := 30: a - b
delete a, b, DIGITS:
We compute a floating-point number with a precision of 10 digits.
Internally, this number is stored with some guard digits. When DIGITS is
increased to 100, the correct guard digits become visible. With the call
Pref::trailingZeroes(TRUE), trailing zeroes of the decimal representation become visible:
DIGITS := 10: a := float(1/9)
Pref::trailingZeroes(TRUE): DIGITS := 100: a
Pref::trailingZeroes(FALSE): delete a, DIGITS:
For the float evaluation of the sine function, the argument
is reduced to the standard interval [0, 2π].
For this reduction, the argument must be known to some digits after
the decimal point. For small
DIGITS, the digits
after the decimal point are pure round-off if the argument is a large
floating-point number:
DIGITS := 10: sin(float(2*10^30))
When DIGITS is increased to 50, the argument of
the sine function has about 20 correct digits after the decimal
point. The first 20 digits of the following result are reliable:
DIGITS := 50: sin(float(2*10^30))
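The dependence on the working precision can be observed by evaluating the same expression for several values of DIGITS. The number 2*10^30 has 31 digits before the decimal point, so values of DIGITS up to 31 leave no correct digits after it. The particular list of values below is an arbitrary choice for illustration:

```mupad
for DIGITS in [10, 20, 35, 50] do
  print("DIGITS" = DIGITS, sin(float(2*10^30)))
end_for:
delete DIGITS:
```

Only the results for the larger values of DIGITS should agree in their leading digits.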
At least one digit after the decimal point is always displayed. In the following example, the number 39.9 is displayed as 40.0 because “40.” is not a valid MuPAD® input:
DIGITS := 2: float(10*PI), 39.9, -30.2
We compute float(10^40*8/9) with various values of
DIGITS. Rounding takes into account all
guard digits, i.e., the resulting integer makes all guard digits visible:
for DIGITS in [7, 8, 9, 17, 18, 19, 26, 27, 28] do print("DIGITS" = DIGITS, round(float(10^40*8/9))) end_for:
The following procedure computes numerical approximations
with a specified precision without changing DIGITS as
a global variable. Internally,
DIGITS is set to
the desired precision and the float approximation is computed. Because
of save DIGITS, the value of DIGITS is
not changed outside the procedure:
myfloat := proc(x, digits) save DIGITS; begin DIGITS := digits: float(x); end_proc:
The float approximation of the following value x suffers
from numerical cancellation. The procedure myfloat is
used to approximate
x with 30 digits. The result
is displayed with only 7 digits because of the value DIGITS =
7 valid outside the procedure. However, all displayed digits are correct:
x := PI^7 - exp(8013109200945801/1000000000000000): DIGITS := 7: float(x), myfloat(x, 30)
delete myfloat, x, DIGITS:
If a floating-point number
x has been created
with high precision, and the computation is to continue at a lower
precision, the easiest method to get rid of memory-consuming insignificant
digits is x := x + 0.0.
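A minimal sketch of this idiom, assuming a value that was first computed with 100 digits and is only needed with 10 digits afterwards:

```mupad
DIGITS := 100:
x := float(PI):    // created with high precision
DIGITS := 10:      // continue at a lower precision
x := x + 0.0:      // the addition is carried out at the current precision
x;
delete x, DIGITS:
```

The addition creates a new float at the current precision, so the digits beyond it no longer need to be stored.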