Significant digits of floating-point numbers

MuPAD® notebooks will be removed in a future release. Use MATLAB® live scripts instead.

MATLAB live scripts support most MuPAD functionality, though there are some differences. For more information, see Convert MuPAD Notebooks to MATLAB Live Scripts.


The environment variable DIGITS determines the number of significant decimal digits in floating-point numbers. The default value is DIGITS = 10.

Possible values: a positive integer larger than 1 and smaller than 2^29 + 1.

Floating-point numbers are created by applying the function float to exact numbers or numerical expressions. Elementary objects are approximated by the resulting floats with a relative precision of 10^(-DIGITS), i.e., the first DIGITS decimal digits are correct. See Example 1.

In arithmetical operations with floating-point numbers, only the first DIGITS decimal digits are taken into account. The numerical error propagates and may grow in the course of computations. See Example 2.

If a real floating-point number is entered directly (e.g., by x := 1.234), a number with at least DIGITS internal decimal digits is created.

If a real float is entered with more than DIGITS digits, the internal representation stores the extra digits. However, they are not taken into account in arithmetical operations, unless DIGITS is increased accordingly. See Example 3.

In particular, complex floating-point numbers are created by adding the real and imaginary parts. This addition truncates extra decimal places in the real and imaginary parts.
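
The truncation for complex floats can be checked with a short sketch. The identifier z and the entered digits are arbitrary choices for illustration. If the extra digits are dropped as described, the real and imaginary parts should agree with the entered values only to about 10 digits (plus guard digits), even after DIGITS is increased:

```mupad
DIGITS := 10:
// both parts are entered with 20 digits; the addition with I
// is carried out with DIGITS = 10
z := 1.2345678901234567890 + 9.8765432109876543210*I:
// raise DIGITS to see how many of the entered digits survived
DIGITS := 20:
Re(z), Im(z)
delete z, DIGITS:
```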

The value of DIGITS may be changed at any time during a computation. If DIGITS is decreased, only the leading digits of existing floating-point numbers are taken into account in subsequent arithmetical operations. If DIGITS is increased, existing floating-point numbers are internally padded with trailing binary zeroes. See Example 4.
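
As a small illustration of decreasing DIGITS (the identifier a and the chosen values are arbitrary), the float below is created with 20 digits and then used with DIGITS = 5. Only its leading digits should enter the sum and the display:

```mupad
// create a float with 20 significant digits
DIGITS := 20: a := float(1/3):
// lower the precision; a keeps its internal digits,
// but arithmetic and output use only the leading ones
DIGITS := 5: a, a + 1
delete a, DIGITS:
```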

Depending on DIGITS, certain functions such as the trigonometric functions may give wrong results if their floating-point arguments are too inaccurate. See Example 5.

Depending on DIGITS, only significant digits of floating-point numbers are displayed on the screen. The preferences Pref::floatFormat and Pref::trailingZeroes can be used to modify the screen output. See Example 4.
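
A hedged sketch of switching the output format follows. It assumes that Pref::floatFormat accepts the strings "e" (exponential notation) and "g" (the default format); check the Pref::floatFormat help page before relying on these values. The identifier x is arbitrary:

```mupad
DIGITS := 10: x := float(1/7):
// assumed format string: always use exponential notation
Pref::floatFormat("e"): x
// assumed format string: switch back to the default format
Pref::floatFormat("g"): x
delete x, DIGITS:
```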

At least one digit after the decimal point is displayed; if it is insignificant, it is replaced by zero. See Example 6.

Internally, floating-point numbers are created and stored with some extra “guard digits.” These are also taken into account by the basic arithmetical operations.

For example, for DIGITS = 10, the function float converts exact numbers to floats with some more decimal digits. The number of guard digits depends on DIGITS.

At least 2 internal guard digits are available for any value of DIGITS.

See Example 4 and Example 7.

Environment variables such as DIGITS are global variables. Upon return from a procedure that changes DIGITS, the new value is valid outside the context of the procedure as well! Use save DIGITS to restrict the modified value of DIGITS to the procedure. See Example 8.

The default value of DIGITS is 10; DIGITS has this value after starting or resetting the system via reset. The command delete DIGITS; also restores the default value.
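
For instance, the following short sequence changes DIGITS and then restores the default (the value 25 is an arbitrary choice):

```mupad
DIGITS := 25:
// deleting the environment variable resets it to its default
delete DIGITS:
DIGITS  // shows the default value 10 again
```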

See the help page of float for further information.


Example 1

We convert some exact numbers and numerical expressions to floating-point approximations:

DIGITS := 10: 
float(PI), float(1/7), float(sqrt(2) + exp(3)), float(exp(-20))

DIGITS := 20:
float(PI), float(1/7), float(sqrt(2) + exp(3)), float(exp(-20))

delete DIGITS:

Example 2

We illustrate error propagation in numerical computations. The following rational number approximates exp(2) to 17 decimal digits:

r := 738905609893065023/100000000000000000:

The following float call converts exp(2) and r to floating-point approximations. The approximation errors propagate and are amplified in the following numerical expression:

DIGITS := 6: float(10^20*(r - exp(2)))

None of the digits in this result is correct. A better result is obtained by increasing DIGITS:

DIGITS := 20: float(10^20*(r - exp(2)))

delete r, DIGITS:

Example 3

In the following, only 10 of the entered 30 digits are regarded as significant. The extra digits are nevertheless stored internally:

DIGITS := 10:
a := 1.23456789666666666666666666666;
b := 1.23456789444444444444444444444

We increase DIGITS. Because the internal representation of a and b is correct to 30 decimal places, the difference can be computed correctly to 20 decimal places:

DIGITS := 30: a - b

delete a, b, DIGITS:

Example 4

We compute a floating-point number with a precision of 10 digits. Internally, this number is stored with some guard digits. Increasing DIGITS to 30, the correct guard digits become visible. With the call Pref::trailingZeroes(TRUE), trailing zeroes of the decimal representation become visible:

DIGITS := 10: a := float(1/9)

DIGITS := 30: a

Pref::trailingZeroes(TRUE): DIGITS := 100: a
Pref::trailingZeroes(FALSE): delete a, DIGITS:

Example 5

For the float evaluation of the sine function, the argument is reduced to the standard interval [0, 2 π]. For this reduction, the argument must be known to some digits after the decimal point. For small DIGITS, the digits after the decimal point are pure round-off if the argument is a large floating-point number:

DIGITS := 10: sin(float(2*10^30))

Increasing DIGITS to 50, the argument of the sine function has about 30 correct digits after the decimal point. The first 30 digits of the following result are reliable:

DIGITS := 50: sin(float(2*10^30))

delete DIGITS:

Example 6

At least one digit after the decimal point is always displayed. In the following example, the number 39.9 is displayed as 40.0 because “40.” is not a valid MuPAD® input:

DIGITS := 2: float(10*PI), 39.9, -30.2

delete DIGITS:

Example 7

We compute float(10^40*8/9) with various values of DIGITS. Rounding takes into account all guard digits, i.e., the resulting integer makes all guard digits visible:

for DIGITS in [7, 8, 9, 17, 18, 19, 26, 27, 28] do
    print("DIGITS" = DIGITS, round(float(10^40*8/9)))
end_for:
delete DIGITS:

Example 8

The following procedure computes numerical approximations with a specified precision without changing DIGITS as a global variable. Internally, DIGITS is set to the desired precision and the float approximation is computed. Because of save DIGITS, the value of DIGITS is not changed outside the procedure:

myfloat := proc(x, digits)
           save DIGITS;
           begin
             DIGITS := digits:
             float(x);
           end_proc:

The float approximation of the following value x suffers from numerical cancellation. The procedure myfloat is used to approximate x with 30 digits. The result is displayed with only 7 digits because DIGITS = 7 is in effect outside the procedure. However, all displayed digits are correct:

x := PI^7 - exp(8013109200945801/1000000000000000):
DIGITS := 7: 
float(x), myfloat(x, 30)

delete myfloat, x, DIGITS:
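
For contrast, the following sketch uses a hypothetical procedure leakyfloat (a name introduced here for illustration only) that omits save DIGITS. Since environment variables are global, the value set inside the procedure should remain in effect after the call:

```mupad
// hypothetical helper: like myfloat, but without save DIGITS
leakyfloat := proc(x, digits)
              begin
                DIGITS := digits:
                float(x);
              end_proc:
DIGITS := 7:
leakyfloat(PI, 30):
// DIGITS was changed by the procedure and is not restored
DIGITS
delete leakyfloat, DIGITS:
```

Adding save DIGITS; before begin, as in myfloat, restores the caller's value of DIGITS on return.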


If a floating-point number x has been created with high precision, and the computation is to continue at a lower precision, the easiest method to get rid of memory-consuming insignificant digits is x := x + 0.0.
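
A minimal sketch of this idiom, with arbitrarily chosen precisions: x is created with 100 digits and then replaced by the result of an addition carried out with DIGITS = 10. Raising DIGITS again afterwards is only a check; the original 100 digits should no longer be available:

```mupad
// high-precision float
DIGITS := 100: x := float(PI):
// continue at lower precision and drop the insignificant digits
DIGITS := 10: x := x + 0.0:
// check only: the discarded digits do not reappear
DIGITS := 100: x
delete x, DIGITS:
```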