# Floating points in JavaScript

## Numeric data types

Programming languages usually have more than one numeric data type. Numbers can be categorized mathematically, and each category may have an optimal format for storing it in computer memory or for processing it under the hood. Furthermore, number representations may have mathematical properties that conflict with the limits of computer memory. For instance, the decimal representation of one third involves an infinite sequence of the digit 3 after the decimal point, but a computer cannot store an infinite number of digits.

Typical numeric data types in programming languages are, for instance: *integers* (non-fractional numbers), *floating points* (decimal numerals), *fixed-point* types (for representing monetary values) and *bignum* types. Programming languages may even have different specific sub-types of integers, floating points, etc.

JavaScript has only two numeric data types: *number* and, a more recent addition to the language, *BigInt*.
The number data type represents *floating point numbers*, or *floating points* for short,
and is the general numeric data type for both integers and fractions.
The BigInt data type is a bignum type, using arbitrary-precision numbers to represent arbitrarily large integers.

## What are floating points?

Just about all programming languages have floating point data types. Floating-point representation allows computers to store a very wide range of numbers, from very small to very large, in an efficient way. But floating points have some pitfalls you should be aware of. The next statement is a famous example of unexpected results that calculations with floating points may yield.

```
console.log( (0.1 + 0.2) === 0.3 ); // returns: false.
```

To be able to understand the quirks associated with floating points, we need to dive deeper into the details of how they work.

### Decimal numerals

First some general background. A number, i.e. a real number, can be:

- An *integer*. Integers are whole numbers, without a fractional part, like 1 or −313.
- A *rational number*. A rational number is a ratio of two integers, like 1/3. Every integer is also a rational number (3 = 3/1).
- An *irrational number*. Real numbers that are not rational, like `√2` or `π`, are irrational. Irrational numbers cannot be written as a ratio of two integers.

Generally we use the decimal (base-10) numeral system to represent numbers; these representations are called *decimal numerals* or *decimal numbers*.
All integers can be precisely represented in a positional numeral system like this.
With rational numbers this is not always the case, and with irrational numbers it is never the case.
Only *decimal fractions* (which include all integers) can be written precisely as decimal numerals.
*Decimal fractions* are rational numbers whose denominator is (or can be rewritten as) an integer power of ten:

3 = 3/1 = 30/10

1/2 = 5/10 = 0.5

1/4 = 25/100 = 0.25

1/5 = 2/10 = 0.2

313/100 = 3.13

3/200 = 15/1000 = 0.015

Thus, a **non**-decimal fraction is any fraction whose fully reduced denominator
has a prime factor that is *not* a prime factor of the numeral system's base.
In the decimal system the base is 10 and its prime factors are 2 and 5 (10 = 2 × 5).
For example, 1/6 = 1/(2 × 3) is a non-decimal fraction, since 3 is not a prime factor of 10.
Non-decimal fractions cannot be precisely represented as decimal numerals.
All non-decimal fractions and all irrational numbers are only approximations when represented as decimal numerals.

1/3 ≈ 0.333

−1/6 ≈ −0.166667

π ≈ 3.1415927

In the examples above two rational numbers and an irrational number are approximated by decimal fractions.
The *decimal expansion* (the sequence of digits) is
actually infinitely long to the right, but rounding it off to a certain precision
turns it into a decimal fraction.

These round-offs may cause errors big enough to become a problem in your application. When performing mathematical operations on fractions and irrationals represented as decimal numerals, you should be aware of these possible errors. An example:

3 × 1/3 = 3/3 = 1

3 × 0.333333 = 0.999999

The first expression above shows an exact calculation, following the strict arithmetic rules for fractions. The second expression converts the fraction to a rounded-off decimal numeral (0.333333) before multiplying. This rounding introduces an error that yields a false result: 1 ≠ 0.999999. The error is small, and becomes smaller as the decimal expansion gets longer (rounding after more digits), but errors may accumulate in iterative calculations, and comparisons may behave unexpectedly (like the code example at the beginning of this chapter).
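The same effect shows up directly in JavaScript; the sketch below (our own example, not the author's) multiplies the rounded stand-in for 1/3 by 3:

```javascript
// 0.333333 is a rounded decimal stand-in for 1/3, so the product misses 1.
console.log(3 * 0.333333 === 1); // logs: false
console.log(3 * 0.333333 < 1);   // logs: true // a small rounding error remains
```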

### Scientific notation

Now, floating-point number representation is the digital analogue of *scientific notation*.
Scientific notation is directly derived from the decimal numeral system and is a more compact and,
especially for very large or very small numbers, more accessible way to express numbers.

The general form of a number notated in floating-point representation (or scientific notation) is:

`significand` × `base` ^{exponent}

Some examples (in base-10):

12 345 678.9 = 1.23456789 × 10^{7}.

0.000000123456789 = 1.23456789 × 10^{−7}.

9999 = 9.999 × 10^{3} (= 10^{4} − 1).

In *normalized* scientific notation
the significand has exactly one non-zero digit before the decimal point.
The examples above are in normalized scientific notation.

Calculators and computer programs often use *e-notation*
as an alternative format of scientific notation with base 10.

1.23 × 10^{7} = 1.23e7.
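JavaScript number literals support e-notation directly, and `Number.prototype.toExponential()` formats a number in it; a quick sketch:

```javascript
// E-notation literals are ordinary numbers:
console.log(1.23e7);  // logs: 12300000
console.log(1.23e-2); // logs: 0.0123

// toExponential() formats a number in e-notation:
console.log((12345678.9).toExponential()); // logs: 1.23456789e+7
```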

Floating-point representation allows computers to store a very wide range of numbers, from very small to very large, with an acceptable fixed relative precision, in a fixed format with a limited number of digits. The computer uses base-2 (binary numeral system) though, because the computer's digital electronic circuits under the hood work with two-state electric signals: 0 and 1.

## IEEE 754

IEEE 754 is a standard for floating-point arithmetic.
JavaScript uses this standard for its number data type.
More specifically, JavaScript uses 64-bit IEEE 754, known as the double-precision format, as opposed to single precision (32-bit).
In other programming languages, like C, this is known as the `double` data type.
So, JavaScript uses 64 bits (binary digits) to store a floating point: one bit for the sign (for a positive or negative number),
11 bits for the exponent and 52 bits for the *significand* (aka *mantissa*).
In figure 4 this is visualized for storing the decimal number 0.1.

| sign (1) | exponent (11) | significand (52) |
|---|---|---|
| 0 | 01111111011 | 1001100110011001100110011001100110011001100110011010 |

Fig.4 - Binary representation of decimal 0.1 in 64-bit IEEE 754 format.

Used converter: baseconvert.com

The number of bits for the significand (length of the significand) determines the relative precision (relative to the order of magnitude)
in which numbers can be represented.
The length of the exponent determines the possible range of orders of magnitude, from very small to very large numbers.
The exponent can be positive or negative. The 11 bits store the exponent in the range 1 ... 2046, because
0 and 2047 (2^{11}−1) are reserved for special cases (`NaN` etc.).
A fixed exponent bias of 1023 is subtracted from this range to
get an exponent value in the range −1022 ... +1023.
An exponent of all zeros (0) and a zero significand (all zero bits) represents `-0` or `+0`
(however: `-0 === +0 && -0 === 0`).
An exponent of all ones (2047) and a zero significand represents `-Infinity` or `+Infinity`.
An exponent of all ones (2047) and a non-zero significand represents `NaN`.

In IEEE 754 a number is stored in **normalized** scientific form. This means that the
*separator point* (i.e. the *decimal point*, or in binary, the *binary point*) sits just after the leftmost digit of the significand.
This leftmost digit in binary is always 1. Therefore only the bits on the right side of the binary point are stored as the significand.
When the number is read back from memory, the implied first bit 1 and the binary point are automatically prepended to the stored significand.
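You can inspect this stored bit pattern from JavaScript itself, for example with a `DataView` over an 8-byte buffer (the helper name `toBits` is our own):

```javascript
// Return the 64-bit IEEE 754 pattern of a number as a string of 0s and 1s.
function toBits(num) {
  const buffer = new ArrayBuffer(8);
  new DataView(buffer).setFloat64(0, num); // big-endian by default
  let bits = "";
  for (const byte of new Uint8Array(buffer)) {
    bits += byte.toString(2).padStart(8, "0");
  }
  return bits;
}

const b = toBits(0.1);
console.log(b.slice(0, 1));  // sign:        0
console.log(b.slice(1, 12)); // exponent:    01111111011 (= 1019; 1019 − 1023 = −4)
console.log(b.slice(12));    // significand: 1001100110011001100110011001100110011001100110011010
```

Note that the implied leading 1 does not appear in the stored 52 significand bits.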

## Max/min floating points in JavaScript

Given the 52 bits for the significand, the maximum safe number in JS should be the binary number:

1.111...111 (the implied first 1, followed by the 52 ones of the significand) × 2^{52} = 1111...111 (53 ones)

A binary number of 53 ones equals: 2^{53} − 1 = 9 007 199 254 740 991 ≈ 9 × 10^{15}.

```
console.log(Number.MAX_SAFE_INTEGER); // logs: 9007199254740991
console.log(Number.MIN_SAFE_INTEGER); // logs: -9007199254740991
```

Two things may catch your attention:

- These maximum and minimum numbers are integers.
- Given the range of the exponent (−1022 ... +1023), the exponent should allow for much bigger and much smaller integers.

The maximum and minimum safe numbers are integers and are based on the length of the significand.
An exponent of 52 moves the binary point all the way to the end, after the rightmost digit of the significand.
There are no digits left for a fractional part; hence, the maximum and minimum safe numbers are integers.
The bigger the exponent, the bigger the order of magnitude and the number itself, the fewer digits available
for a possible fractional part, and the lower the absolute precision. The **relative** precision however, relative to the order of magnitude (determined by the exponent),
is fixed and determined by the number of bits available for the significand.
Fractional digits (or rightmost digits in general) are more significant to a number of small order of magnitude than to
one of large order of magnitude.

The exponent can be much bigger than 52 and much smaller than −52.
So the maximum and minimum representable numbers must be much bigger and much smaller than the above mentioned
`MAX_SAFE_INTEGER` and `MIN_SAFE_INTEGER`.
This is true, but the representation of numbers (only integers) bigger than `MAX_SAFE_INTEGER`
or smaller than `MIN_SAFE_INTEGER` is not reliable.
That is why these minimum and maximum numbers have "SAFE" in their name.
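JavaScript offers a built-in check for this safe range, `Number.isSafeInteger()`:

```javascript
console.log(Number.isSafeInteger(9007199254740991)); // logs: true  // 2^53 − 1
console.log(Number.isSafeInteger(9007199254740992)); // logs: false // 2^53
console.log(Number.isSafeInteger(3.5));              // logs: false // not an integer
```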
Given the range of the exponent −1022 ... +1023:

Maximum representable number: (2 − 2^{−52}) × 2^{1023} ≈ 1.8 × 10^{308}

Minimum representable number: −(2 − 2^{−52}) × 2^{1023} ≈ −1.8 × 10^{308}

```
console.log(Number.MAX_VALUE); // logs: 1.7976931348623157e+308
```

The property `Number.MIN_VALUE` does *not* represent the minimum representable number, as you might expect.
It represents the minimum positive number, that is, the closest representable positive number to zero:

2^{−1022 − 52} = 2^{−1074} ≈ 5 × 10^{−324}

```
console.log(Number.MIN_VALUE); // logs: 5e-324
```

But what happens between `MAX_SAFE_INTEGER` and `MAX_VALUE`, and between `MIN_SAFE_INTEGER` and `-MAX_VALUE`?
Suppose we add one to `MAX_SAFE_INTEGER`. This results in a binary number written as a 1 followed by **53** zeros.

(2^{53} − 1) + 1 = 1 × 2^{53}.

The significand stores **52** zeros and the exponent represents 53. The 53rd zero of the significand is not stored (there are only 52 bits), but it is "added back" by the multiplication by 2^{53}. So, this still works. But when we add 2 to `MAX_SAFE_INTEGER`, the 53rd bit of the significand should be 1, which cannot be stored.
Thus, `MAX_SAFE_INTEGER + 1` "equals" `MAX_SAFE_INTEGER + 2`.
Adding 3 to `MAX_SAFE_INTEGER` results in the 52nd bit becoming 1, which is stored, and the 53rd bit becoming 0,
which is "added back" by the multiplication by 2^{53}. So, this number can be stored correctly again.

```
console.log( (Number.MAX_SAFE_INTEGER + 1) === (Number.MAX_SAFE_INTEGER + 2) ); // logs: true
console.log( 9007199254740992 === 9007199254740993 ); // logs: true
console.log( (Number.MAX_SAFE_INTEGER + 3) === 9007199254740994 ); // logs: true
console.log( 9007199254740993 === 9007199254740994 ); // logs: false
```

So, numbers outside the safe range but inside the minimum/maximum range are all processed as finite numbers, but a
considerable share of them is not stored correctly. Numbers larger than `MAX_VALUE` are represented as `Infinity`.
For integers greater than `MAX_SAFE_INTEGER` or less than `MIN_SAFE_INTEGER`, the data type BigInt is available.

`MAX_SAFE_INTEGER` has 16 significant decimal digits. The 16th digit, however, does not
always guarantee a precise conversion, as shown above.
The number of bits for the significand (length of the significand) determines the relative precision of the floating point format.
The relative precision is 53 bits, which is equivalent to 15.95... decimal digits.

2^{53} = 10^{n} ⇔
`n` = 53 log_{10}(2) = 15.95...
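This conversion from binary to decimal precision can be checked with a one-liner:

```javascript
// 53 bits of relative precision expressed in decimal digits:
console.log(53 * Math.log10(2)); // ≈ 15.95
```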

Using numbers with more than 15 significant decimal digits in operations may yield improper results. Results of calculations are presented as decimal numbers rounded to at most 16 to 17 significant decimal digits. Significant digits do not include leading and trailing zeros (0.0123e4, 1.23, 1.2300e2, 123, 123.0 and 123000 all have 3 significant digits).

Conclusion: be careful with operations that involve numbers (positive or negative) with more than 15 significant decimal digits. Also adding or subtracting numbers of very different scales or magnitudes may give unexpected results. In general: When there are too many significant digits, the number is rounded to match up with the limited number of bits in the significand, which possibly introduces rounding errors.

```
console.log( 0.30000000000000004 === 0.30000000000000002 ); // logs: true // more than 15 significant digits
console.log( 100_000_000 + 0.000000002 ); // logs: 100000000 // adding numbers of very different orders of magnitude
console.log( 0.3 + 0.00000000000000002 ); // logs: 0.3 // adding numbers of very different orders of magnitude

// In the next script a very small number (smallNum) is added to a
// very large number (largeNum) a million times.
//
// This has no effect though. Whether you add it once or a million times,
// every single time the result gets rounded back to 1e9.
let largeNum = 1e9,  // this is 1000000000
    smallNum = 1e-9; // this is 0.000000001
for (let n = 0; n < 1000000; n++) {
  largeNum += smallNum;
}
console.log(largeNum); // logs: 1000000000 // = 1e9
```

## Numbers with infinitely many significant digits

As we have seen, floating points with too many significant digits need to be rounded. Consequently, numbers with an infinite sequence of digits are always rounded. As discussed before, all irrational numbers and some rational numbers have infinite sequences of digits after the decimal/binary point. Which rational numbers have an infinite expansion of digits depends on the base of the numeral system they are represented in.

Now, let's go back to the JavaScript statement presented at the beginning of this article: `(0.1 + 0.2) !== 0.3`.
All three involved fractions are decimal fractions. Represented as decimal numerals they have finite sequences of digits; therefore
they are not approximated by rounding, and they can be precisely added as decimal numerals.
However, numbers are stored as floating points in binary form instead of decimal form, and
in binary form the three involved fractions *are* approximated by rounding!
In binary form only *binary fractions*, that is, fractions whose fully reduced denominator is a power of two, can be
precisely represented with a finite number of digits.
None of the three fractions is a binary fraction:
0.1 = 1/10,
0.2 = 2/10 and
0.3 = 3/10.
This means that the stored bit patterns for both `0.1` and `0.2` are rounded, and
after adding them, the bit pattern for the sum is rounded again. This result is then compared with the
bit pattern of the other side of the comparison, `0.3`, which is rounded only once.
Thus, due to different rounding errors, the sum `0.1 + 0.2` evaluates as unequal to the lone `0.3`.
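You can make these stored approximations visible with `Number.prototype.toFixed()`, which prints more digits of the underlying binary value:

```javascript
console.log((0.1).toFixed(20));  // logs: 0.10000000000000000555
console.log((0.2).toFixed(20));  // logs: 0.20000000000000001110
console.log((0.25).toFixed(20)); // logs: 0.25000000000000000000 // a binary fraction: exact
```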

Note in the next example that, as mentioned before, using numbers with more than 15 significant decimal digits may cause problems, yet results may be printed with more than 15 significant decimal digits.

```
console.log( (0.1 + 0.2) === 0.3 ); // logs: false
```

```
0.1 + 0.2: 0011111111010011001100110011001100110011001100110011001100110100
0.3:       0011111111010011001100110011001100110011001100110011001100110011
```

```
console.log(0.1+0.2); // logs 0.30000000000000004
console.log( (0.3) === 0.30000000000000004 ); // logs: false
console.log( 0.30000000000000004 === 0.30000000000000002 ); // logs: true
```

Some more examples:

```
console.log( (0.3 + 0.6) === (0.45 + 0.45) ); // logs: false
console.log( (0.3 + 0.6) >= (1.1 - 0.2) ); // logs: false
console.log(0.3 + 0.6); // logs: 0.8999999999999999
console.log(0.45 + 0.45); // logs: 0.9
console.log(1.1 - 0.2); // logs: 0.9000000000000001
```

Rounding errors in operations involving non-binary fractions do not always cause flawed results; rounding errors may cancel each other out.

```
console.log( (0.5 + 0.1) === 0.6 ); // returns: true.
```

```
0.5 + 0.1: 0011111111100011001100110011001100110011001100110011001100110011
0.6:       0011111111100011001100110011001100110011001100110011001100110011
```

In the next example all involved fractions are binary fractions, therefore no rounding occurs at all.

```
console.log( (0.5 + 0.25) === 0.75 ); // logs: true
```

Irrational numbers, too, have infinitely many significant digits/bits after the decimal/binary point.

```
console.log( Math.sin(Math.PI) === 0 ); // logs: false
console.log( Math.sin(Math.PI) ); // logs: 1.2246467991473532e-16 // very close to, but not exactly, 0
```

Scientific calculators or computer algebra systems usually handle irrational numbers and fractions in an exact mathematical way.
Their source code includes an extensive set of algorithms to perform symbolic mathematical operations directly,
instead of using approximate floating point values for each intermediate calculation:
3 × 1/3 = 3/3 = 1 instead of 3 × 0.333333 = 0.999999.
A system like this will return exactly 0 for the input sin(`π`).

## Comparing floating point numbers

From what we have learned so far we can conclude: be careful with directly comparing two floating points, for instance in conditional statements or loops. Before comparing we could round the numbers to some desired number of decimals after the decimal point:

```
let toNdecimals = 6; // What is an appropriate round-off for your application?

console.log( roundNumber(0.1+0.2) === roundNumber(0.3) ); // logs: true
console.log( roundNumber(10000000000.1+0.2) === roundNumber(10000000000.3) ); // logs: false
console.log( roundNumber(10000000000.1+0.2) ); // logs: 10000000000.300001

function roundNumber(num) {
  return Math.round(num * Math.pow(10, toNdecimals)) / Math.pow(10, toNdecimals);
}
```

A less ponderous method is to check whether the difference between the two numbers is small enough, that is, whether the two numbers are "close enough" for your application.

```
let epsilon = 1e-6;
console.log( Math.abs((0.1+0.2) - 0.3) < epsilon ); // logs: true.
console.log( Math.abs((10000000000.1+0.2) - 10000000000.3) < epsilon ); // logs: false.
```

The difficulty is to find an appropriate approximation error,
often called *epsilon*, to compare the difference to.
An epsilon too small makes the comparison always return `false`;
an epsilon too big makes your condition test too inaccurate: it returns `true`
even if the two numbers are too different.
As mentioned before, the precision of the IEEE 754 floating point system is *relative*, due to the fixed number of bits in the significand.
Rounding happens in the last bit. If the number is larger, the last bit represents a larger value,
and thus the absolute rounding error becomes larger. However, multiple rounding errors in operations or calculations may still (partly)
cancel each other out. The resulting total error may even be negative, hence the `Math.abs()` method used in the examples.

```
console.log( Math.abs((0.1+0.2) - 0.3) ); // logs: 5.551115123125783e-17
console.log( Math.abs((1.1+2.2) - 3.3) ); // logs: 4.440892098500626e-16
console.log( Math.abs((11.1+22.2) - 33.3) ); // logs: 0
console.log( Math.abs((111.1+222.2) - 333.3) ); // logs: 5.684341886080802e-14 // the difference is negative
console.log( Math.abs((1111.1+2222.2) - 3333.3) ); // logs: 4.547473508864641e-13 // the difference is negative
console.log( Math.abs((11111.1+22222.2) - 33333.3) ); // logs: 0
console.log( Math.abs((111111.1+222222.2) - 333333.3) ); // logs: 5.820766091346741e-11
```

In the above example we see that, in general, the larger the involved numbers are, the larger the difference is. Choosing a very small
epsilon, say 1e-16, would work for `0.1 + 0.2 - 0.3`, but not for the rest of the example.

In discussions online it is often proposed to use the `Number.EPSILON` property as a fixed absolute epsilon.
The `Number.EPSILON` property represents the *machine epsilon* quantity.

```
console.log(Number.EPSILON); // logs: 2.220446049250313e-16
```

The purpose of this "epsilon" is **not** to serve as a fixed absolute epsilon in comparisons.
For many, many applications this "epsilon" would be (way) too small. Forget about the existence of `Number.EPSILON`; you will probably never need it.

So, we compare an absolute error to an absolute epsilon, while the occurring error is relative to the order of magnitude of the involved numbers. A solution could be to make the difference or the epsilon relative as well. You can find functions online that do just that, but they all have issues, even the most clever ones.
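A minimal sketch of such a relative comparison (the name `nearlyEqual` and the default tolerance are our own choices; note this naive version breaks down when both numbers are very close to zero):

```javascript
// Compare using a tolerance that scales with the magnitude of the inputs.
function nearlyEqual(a, b, relEps = 1e-9) {
  return Math.abs(a - b) <= relEps * Math.max(Math.abs(a), Math.abs(b));
}

console.log(nearlyEqual(0.1 + 0.2, 0.3));                // logs: true
console.log(nearlyEqual(111111.1 + 222222.2, 333333.3)); // logs: true // absolute error ≈ 5.8e-11
console.log(nearlyEqual(1.0, 1.001));                    // logs: false
```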

The easiest and most accessible way may be to simply choose a fixed epsilon
that is appropriate for your application. In practice, nine out of ten times acceptable errors in your application
are way larger than the rounding errors that may occur. If orders of magnitude do not differ too much you generally can choose
a value for epsilon where a smaller difference between two numbers has no practical meaning in your application.
And real nerds choose binary fractions for epsilons:
0.5 = 2^{−1}, 0.25 = 2^{−2}, 0.125 = 2^{−3}, 0.0625 = 2^{−4}, 0.0009765625 = 2^{−10}, etc.

## Alternative solutions

Usually the use of floating points does not cause problems, as long as the mentioned precautions are taken.
If your application really needs clean decimal calculations, like calculations with money,
and simply rounding the results to a fixed number of decimal places when displaying them is not a sufficient solution,
you might consider "internally" using integers only, e.g. calculating everything in cents.
This avoids binary fractions with infinitely many significant bits.
Drawbacks are that the code becomes less comprehensible and that the involved integers are more likely to become too large.
The latter can be tackled with the JavaScript BigInt data type, particularly
in applications where values greater than 2^{53} are reasonably expected.
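A sketch of the cents approach (the price and tax rate are made up for illustration):

```javascript
// All amounts are integer cents; only the final display divides by 100.
const priceCents = 19999;                       // 199.99 in some currency
const taxCents = Math.round(priceCents * 0.21); // 21% tax, rounded to whole cents
const totalCents = priceCents + taxCents;
console.log((totalCents / 100).toFixed(2)); // logs: 241.99
```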

There are some JavaScript libraries available that provide an imitation decimal number type for arbitrary-precision arithmetic on all decimal numbers (integers and fractions). Only use one, though, if no other reasonable solution is available and if it is really worth the seriously worse run-time performance.

## BigInts in JavaScript

In JavaScript **BigInt** is a primitive integer data type, especially for representing arbitrarily large integers, greater than 2^{53} - 1.

A BigInt uses the same literal format as the number type, except
with a lowercase letter `n` suffix, with no decimal point (it's an integer) and with no e-notation allowed.
A BigInt literal cannot begin with a leading zero followed by other digits (`0n` itself is fine).

```
console.log(-123n); // logs: -123n
console.log(0n); // logs: 0n
console.log(123_456_789n); // logs: 123456789n
```

Arithmetic operations may be used with BigInt values. The built-in `Math` object does *not* work with BigInts. Operations on BigInts return BigInts.
Operations with a fractional result are truncated, i.e. the result is rounded towards zero.
Floating points (number data type) cannot be mixed with BigInts in arithmetic operations,
and neither can booleans.

```
console.log(3n - 2n); // logs: 1n
console.log(3n ** 2n); // logs: 9n
console.log(3n / 2n); // logs: 1n // not 1.5n
console.log(3n + 2n); // logs: 5n
console.log(3n + "hello"); // logs: "3hello" // not "3nhello"
```

```
console.log(3n - true); // throws: TypeError
console.log(3n - 2); // throws: TypeError
console.log(Math.pow(3n, 2n)); // throws: TypeError
```

The examples above would work if JS coerced BigInts into numbers.
For safety reasons JS does not do this: BigInts can be arbitrarily large, so
automatically converting them to floating points could unintentionally change the value of the number.
BigInts need to be *explicitly* converted first (see later).

Coercion of BigInts into booleans does not lead to a loss of precision. Therefore logical operators and conditional statements work with mixed BigInts and floating points. Also (non-strict) comparison operators accept mixed operands.

```
console.log(0n || "hello"); // logs: "hello"
console.log(!3n); // logs: false
console.log(Boolean(!0n)); // logs: true
if (0n) { console.log("This will not be logged.") }
console.log(2n > 1); // logs: true
console.log(2n == 2); // logs: true
console.log(2n === 2); // logs: false // different data types
```

Explicit conversion from BigInt to floating point and vice versa can be done with the constructor functions
`BigInt(number)` and `Number(bigint)`. Using the unary plus operator on a BigInt throws a TypeError exception.

```
console.log( BigInt(Number.MAX_SAFE_INTEGER) + 2n ); // logs: 9007199254740993n
console.log( +2n ); // throws: TypeError
console.log( BigInt(1.5) ); // throws: RangeError
```

However, converting between floating points and BigInt values can lead to loss of precision!
So, only use BigInt values when values greater than 2^{53} are reasonably expected,
and once you use them in your application, stick to them! An alternative is to pass a **string**
to `BigInt("number")` instead of the number.

```
console.log( BigInt(123456789123456789) ); // logs: 123456789123456784n // the floating point already lost precision before passing it to BigInt().
console.log( BigInt("123456789123456789") ); // logs: 123456789123456789n
```