Why canβt a computer represent all real numbers exactly?
Because there are uncountably many real numbers but only finitely many bit patterns
What do we want from an encoding of real numbers?
It should be precise (as accurate as possible) and unambiguous (each bit pattern corresponds to exactly one number in the system).
General form of scientific notation in base 10?
Number = π Γ 10 ^ x , where π is the mantissa (just the number) and π₯ is the exponent.
What are the three parts of scientific notation?
Mantissa (significand), base (radix), and exponent.
In decimal scientific notation, what is the base?
The base (radix) is 10
In normalized decimal scientific notation, what range must the mantissa satisfy?
1 β€ π < 10
How is zero represented in normalized scientific notation?
m = 0
x = 0
Convert 16.37 to scientific notation.
16.37 = 1.637 Γ 10^1
List three virtues of scientific notation.
It is simple to generate, understand and manipulate;
It can represent numbers of widely varying magnitudes in a compact representation;
It is accurate to the precision of the representation
What happens when the mantissa has a fixed number of digits?
Numbers with more significant digits than the mantissa can hold must be approximated (rounded).
Give two disadvantages of scientific notation.
It approximates numbers that are mathematically exact (e.g. 1/3 or pi)
Rounding errors can accumulate and grow in complex calculations.
General form of binary floating point representation?
Number = Β± π Γ 2^π₯ , with 1 β€ |m| < 2 for normalized values.
What is special about the mantissa in normalized binary floating point?
It always has a leading 1 to the left of the binary point (except for zero).
Normalize
101.1101 in binary scientific notation.
2
101.1101 (2) = 1.011101 (2) Γ 2^2
What does the sign bit π represent in floating point?
π = 0 means positive
π = 1 means negative
How to convert IEEE 754 Single Precision (32 bits) to decimal
First separate into its 3 parts (sign, exponent, mantissa)
The most left bit represents the sign
Next 8 bits represent the exponent
Next 23 bits represent the mantissa
Convert the exponent from binary to decimal then subtract 127.
For the mantissa do 1 + 2^m + 2^n where m,n represents the positions of all 1’s in the mantissa.
Then multiply the above number by 2^x where x represents exponent - 127
How to convert the integer 13.1875 to single precision floating point number.
The number is positive so sign = 0
Mantissa:
13 in binary = 1101
0.1875 - keep multiplying by 2 until you reach the 0 after the point, or notice repetition
0.1875 x 2 = 0.375 0.375 x 2 = 0.75 0.75 x 2 = 1.5 0.5 x 2 = 1
we ignore the 1 in 1.5, only care about decimal part
LOOK AT FIRST NUMBER EITHER 0/1
Join the 2 numbers together : 1101.0011
Normalise it to 1.1010011 by moving binary point 3 times to the left so it is:
1.1010011 x 2^3 -> First 1 is not stored, so: 1010011
Add the bias so: 3 + 127 = 130 = 10000010
Answer:
01000001010100110000000000000000
How to do floating point addition using the example : (9.6 x 10^2) + (6.6 x 10^1)
1) De-normalise the smaller operand and adjust its exponent to equal that of the other operand: (9.60 x 10^2) + (0.66 x 10^2)
2) Add the mantissae and retain the common exponent: (10.26 x 10^2)
3) Re-normalise the mantissa and adjust the exponent if necessary: (1.03 x 10^3)
How to do floating point multiplication using example: (5.2 x 10^-4) x (2.6 x 10^7)
1) Multiply the mantissae and add the exponents. This will yield: (13.56 x 10^3)
2) Renormalise the mantissa and adjust the exponent if necessary: (1.356 x 10^4)