Practice Exercise 21 Answers

Write the following numbers in scientific notation:

               1                            1.0 x 10⁰
               3064                         3.064 x 10³
               -0.0000016384                -1.6384 x 10⁻⁶
               -268                         -2.68 x 10²
               100,000,500,006,783          1.00000500006783 x 10¹⁴
               12345678.9101112             1.23456789101112 x 10⁷
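
The conversions above can be cross-checked with a short Python sketch built on the standard `decimal` module. The helper name `to_scientific` is made up for this illustration, and the sketch assumes the input is a plain decimal string (commas allowed).

```python
from decimal import Decimal

def to_scientific(text):
    """Return (coefficient, exponent) with 1 <= |coefficient| < 10.

    Hypothetical helper, not part of the exercise.
    """
    d = Decimal(text.replace(",", ""))
    if d == 0:
        return Decimal(0), 0
    exponent = d.adjusted()             # power of ten of the leading digit
    coefficient = d.scaleb(-exponent)   # shift the decimal point
    return coefficient, exponent

for number in ["1", "3064", "-0.0000016384", "-268",
               "100,000,500,006,783", "12345678.9101112"]:
    coefficient, exponent = to_scientific(number)
    print(f"{number:>22}  =  {coefficient} x 10^{exponent}")
```

`Decimal` is used instead of `float` so that all 15 digits of the longer numbers survive exactly.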

Write the same numbers above using the decimal floating point system that is presented in Chapter 24. (There are 5 digits for the mantissa, 2 digits for the exponent, which is written in excess-50 notation, and a leading 1 or 0 indicates - or +, respectively.)

                                   excess 50
                sign   mantissa    exponent                   Notes & calculations
               ------+-----------+-----------+----------------------------------------------
                 0   |  10000    |    51     |  .1 x 10¹            50+1 = 51
                 0   |  30640    |    54     |  .3064 x 10⁴         50+4 = 54
                 1   |  16384    |    45     |  -.16384 x 10⁻⁵       50-5 = 45
                 1   |  26800    |    53     |  -.268 x 10³         50+3 = 53
                 0   |  10000    |    65     |  .100000500006783 x 10¹⁵   can't store the entire mantissa
                                             |  .10000 x 10¹⁵       50+15 = 65
                 0   |  12345    |    58     |  .12345 x 10⁸        50+8 = 58
               
               (Notice that if the mantissa is longer than 5 digits, some of them will
                get chopped off.)
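
The encoding steps in the table can be sketched in Python as follows. The function name `encode` and the tuple layout `(sign, mantissa, stored exponent)` are choices made for this illustration. The sketch assumes a nonzero input, chops (rather than rounds) extra mantissa digits as the note above describes, and does not check that the exponent fits in two digits.

```python
from decimal import Decimal

def encode(text, digits=5, bias=50):
    """Encode a nonzero decimal string as (sign, mantissa, excess-50 exponent).

    Hypothetical helper; the value is read as .ddddd x 10**exponent.
    """
    d = Decimal(text.replace(",", ""))
    sign, digit_tuple, _ = d.as_tuple()        # sign is 0 for +, 1 for -
    exponent = d.adjusted() + 1                # +1 because the point sits before the first digit
    all_digits = "".join(str(x) for x in digit_tuple)
    mantissa = all_digits[:digits].ljust(digits, "0")   # chop or pad to 5 digits
    return sign, mantissa, bias + exponent

print(encode("3064"))                  # (0, '30640', 54)
print(encode("-0.0000016384"))         # (1, '16384', 45)
print(encode("100,000,500,006,783"))   # (0, '10000', 65)
```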

Which of the following numbers has the smallest value?

               a.) -6.4 x 10⁻³⁸
                       this one is close to 0, but it is still larger than (b)
               
               b.) -6.4 x 10³⁸
                       this one is the most negative, hence the smallest
               
               c.) 6.4 x 10⁻³⁷
                       close to 0 but still positive
               
               d.) 6.4 x 10³⁷

Which of the above numbers has the smallest magnitude (smallest absolute value)?

               a.) -6.4 x 10⁻³⁸
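
The difference between "smallest value" and "smallest magnitude" can be checked in a few lines of Python. The dictionary below simply restates the four choices; ordinary Python floats (64-bit) have enough range to hold all of them.

```python
choices = {"a": -6.4e-38, "b": -6.4e38, "c": 6.4e-37, "d": 6.4e37}

# Smallest value: the number farthest to the left on the number line.
smallest_value = min(choices, key=choices.get)

# Smallest magnitude: the number closest to zero, ignoring sign.
smallest_magnitude = min(choices, key=lambda k: abs(choices[k]))

print(smallest_value, smallest_magnitude)   # b a
```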

Perform Floating Point addition on the following pairs of values, which are written in scientific notation. Assume that there are only 5 places of accuracy in the mantissa.

               a.)      7.3892 x 10¹⁷
                      + 1.8901 x 10¹⁹
                      ---------------
               
               after doing exponent adjustment...
               
                  0.073892 x 10¹⁹
               +  1.890100 x 10¹⁹
               ----------------------
                  1.9639|92 x 10¹⁹
                         ^
                         round?  if yes, then 1.9640 x 10¹⁹
                                 if no,  then 1.9639 x 10¹⁹
               
               b.)      -7.3892 x 10¹⁷
                      + +1.8901 x 10¹⁹
                      -----------------
               
               after doing exponent adjustment...
               
                 -0.073892 x 10¹⁹
               + +1.890100 x 10¹⁹
               ----------------
                  1.816208 x 10¹⁹
                       round to 1.8162 x 10¹⁹
               
               c.)       7.3892 x 10¹⁴
                       + 1.8901 x 10²⁷
                       ----------------
               
               after doing exponent adjustment
               
                 0.00000000000073892 x 10²⁷
               + 1.8901 x 10²⁷
               ------------------
                 1.8901 x 10²⁷   after truncation down to 5 significant digits
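
One way to check these sums is to let Python's `decimal` module stand in for a 5-digit mantissa. This is only a sketch: `fp_add` is a name invented here, and the library computes the exact sum and then rounds it, which gives the same result as aligning the exponents by hand. The `rounding` argument covers both the "round" and "don't round" cases from part (a).

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_HALF_UP

def fp_add(x, y, digits=5, rounding=ROUND_HALF_UP):
    """Add two decimal strings, keeping only `digits` significant digits."""
    ctx = Context(prec=digits, rounding=rounding)
    return ctx.add(Decimal(x), Decimal(y))

print(fp_add("7.3892E17", "1.8901E19"))                       # 1.9640E+19 (rounded)
print(fp_add("7.3892E17", "1.8901E19", rounding=ROUND_DOWN))  # 1.9639E+19 (chopped)
print(fp_add("-7.3892E17", "1.8901E19"))                      # 1.8162E+19
print(fp_add("7.3892E14", "1.8901E27"))                       # 1.8901E+27
```

In the last case the smaller operand lies entirely below the fifth digit, so it has no effect on the result.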

Perform Floating Point multiplication on the following pairs of values:

               a.    7.3892 x 10¹⁴
                   x 1.8901 x 10²⁷
                   ---------------
                    13.966326 x 10¹⁴⁺²⁷ = 13.966326 x 10⁴¹
                                       = 1.3966326 x 10⁴²
               
               
               b.    3.6000 x 10¹⁴
                   x 5.0000 x 10¹⁷
                   ---------------
                     18.0 x 10¹⁴⁺¹⁷ = 18.0 x 10³¹ = 1.80 x 10³²
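
The three steps used above (multiply the mantissas, add the exponents, renormalize) can be sketched in Python. The name `fp_multiply` is invented for this example, and the sketch assumes both mantissas are already normalized to one digit before the decimal point. It keeps every digit of the product, as the answers above do.

```python
from decimal import Decimal

def fp_multiply(m1, e1, m2, e2):
    """Multiply (m1 x 10**e1) by (m2 x 10**e2), with 1 <= |m| < 10."""
    mantissa = Decimal(m1) * Decimal(m2)   # multiply the mantissas
    exponent = e1 + e2                     # add the exponents
    if abs(mantissa) >= 10:                # renormalize: at most one shift is needed
        mantissa = mantissa / 10
        exponent += 1
    return mantissa, exponent

print(fp_multiply("7.3892", 14, "1.8901", 27))   # mantissa 1.396632692, exponent 42
print(fp_multiply("3.6000", 14, "5.0000", 17))   # mantissa 1.8 (with trailing zeros), exponent 32
```

The exact product in part (a) is 13.96632692; the worked answer shows only its first eight digits.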

Now perform some additions and multiplications using the decimal floating point system which we developed in Chapter 24. All results should be normalized and overflow should be indicated. Remember that the exponent is written in excess 50 notation.

               a.   0 83021 53                0 00083 56
                  + 0 93011 56              + 0 93011 56
                  ------------              ------------
                                              0 93094 56  <--final answer
               
               
               b.   0 83021 53                10³ x 10⁶ = 10³⁺⁶ = 10⁹
                  x 0 93011 56
                  ------------
                       .83021 x .93011 = .7721866
               
                                              0 77218 59  <--final answer
               
               
               c.   0 83021 52                0 08302 53
                  + 1 93011 53                1 93011 53         .93011-.08302 = .84709
                  ------------              ------------
                    ^                         1 84709 53  <--final answer
                  negative so subtract!
               
               
               d.   0 83021 52                0 00000 68    <--decimal adjust zeros out #
                  + 0 93011 68                0 93011 68
                  ------------              ------------
                                              0 93011 68  <--final answer
               
               e.   1 83021 23     both negative so result will be positive
                  x 1 93011 71
                  ------------
                    .83021 x .93011 = .7721866
                    10^(23-50) x 10^(71-50) = 10⁻²⁷ x 10²¹ = 10⁻⁶
                                       50 + -6 = 44
               
                                              0 77218 44  <--final answer
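
The five answers above can be reproduced with a small Python model of the format. This is a sketch under stated assumptions, not the hardware of Chapter 24: a number is a tuple `(sign, mantissa, stored_exponent)` with the mantissa held as an integer from 0 to 99999, digits that fall off the right end are chopped, and an exponent field outside 00 to 99 raises `OverflowError`. All function names are invented here.

```python
BIAS = 50      # excess-50 exponent
DIGITS = 5     # mantissa digits

def check_exponent(stored):
    # Assumption: any exponent field outside 00..99 is reported as overflow.
    if not 0 <= stored <= 99:
        raise OverflowError(f"exponent field {stored} does not fit in 2 digits")
    return stored

def fp_add(a, b):
    (s1, m1, e1), (s2, m2, e2) = a, b
    if e1 < e2:                                  # keep the larger exponent first
        (s1, m1, e1), (s2, m2, e2) = (s2, m2, e2), (s1, m1, e1)
    m2 //= 10 ** (e1 - e2)                       # shift right, chopping digits
    total = (-m1 if s1 else m1) + (-m2 if s2 else m2)
    sign, total, exp = (1 if total < 0 else 0), abs(total), e1
    if total >= 10 ** DIGITS:                    # carry out of the top digit
        total //= 10
        exp += 1
    while 0 < total < 10 ** (DIGITS - 1):        # renormalize after cancellation
        total *= 10
        exp -= 1
    return sign, total, check_exponent(exp)

def fp_mul(a, b):
    (s1, m1, e1), (s2, m2, e2) = a, b
    sign = s1 ^ s2                               # like signs give +, unlike give -
    product = m1 * m2                            # up to 10 digits
    exp = (e1 - BIAS) + (e2 - BIAS)              # add the true exponents
    if 0 < product < 10 ** (2 * DIGITS - 1):     # leading zero: shift left once
        product *= 10
        exp -= 1
    mantissa = product // 10 ** DIGITS           # chop to 5 digits
    return sign, mantissa, check_exponent(exp + BIAS)

print(fp_add((0, 83021, 53), (0, 93011, 56)))   # (0, 93094, 56)
print(fp_mul((0, 83021, 53), (0, 93011, 56)))   # (0, 77218, 59)
print(fp_add((0, 83021, 52), (1, 93011, 53)))   # (1, 84709, 53)
print(fp_add((0, 83021, 52), (0, 93011, 68)))   # (0, 93011, 68)
print(fp_mul((1, 83021, 23), (1, 93011, 71)))   # (0, 77218, 44)
```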

Again referring to the decimal floating point system given in Chapter 24,

     a.)  How many representable real numbers are there?  Consider all
     normalized, unnormalized and denormalized numbers.
     
             00000 to 99999      there are 100,000 numbers in this range,
                                 so there are 100,000 different mantissas
     
             00 to 99 is the range of exponents, so there are 100 different
                      exponents
     
             there are 2 signs, positive and negative
     
             100 x 100,000 x 2 = 20 million
     
     b.) Again referring to the decimal floating point system, how many
     normalized numbers are there?
     
            You must exclude those mantissas whose 1st digit is 0
     
            1 0000 ... 1 9999    --> 10,000 mantissas
            2 0000 ... 2 9999    --> 10,000 mantissas
            3 0000 ... 3 9999    --> 10,000 mantissas
                ...
            9 0000 ... 9 9999    --> 10,000 mantissas
                                     ------
                                     90,000 mantissas
     
            90,000 x 100 x 2 = 18 million
     
     c.) How many denormalized numbers are there?
     
         2 signs     mantissas    exponent (all denorms have exp=00)
            +/-        0 XXXX        00
     
             2    x     10,000    x   1   = 20,000
     
         Some systems exclude 0 00000 00 and 1 00000 00 since they are pure 0.
         So in this case, there would be 19,998 denormalized numbers.
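
The three counts can be confirmed by brute force in Python. The variable names are illustrative only; the rules applied (a normalized mantissa has a nonzero first digit, a denormalized one has a zero first digit and exponent 00) are the ones used in the answers above.

```python
DIGITS, EXPONENTS, SIGNS = 5, 100, 2

all_mantissas = 10 ** DIGITS                                   # 00000 .. 99999
normalized_mantissas = sum(1 for m in range(all_mantissas)
                           if m >= 10 ** (DIGITS - 1))         # first digit 1..9
denormal_mantissas = all_mantissas - normalized_mantissas      # first digit 0

total = SIGNS * all_mantissas * EXPONENTS
normalized = SIGNS * normalized_mantissas * EXPONENTS
denormalized = SIGNS * denormal_mantissas * 1                  # exponent fixed at 00
denormalized_nonzero = denormalized - 2                        # drop +0 and -0

print(total, normalized, denormalized, denormalized_nonzero)
# 20000000 18000000 20000 19998
```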

Referring to the floating point addition hardware in Section 24.8:

     a.) How many stages are there in the addition hardware of Fig. 7?
     
           3 stages
     
     b.) What would be the expected speedup?
     
           speedup would be somewhat less than 3 since writing into the
           intermediate registers takes a little time
     
     c.) How many stages are in the floating point multiplication hardware of
           Fig. 8?
     
               2
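
The "somewhat less than 3" answer can be illustrated with a rough timing model in Python. Everything in this sketch is an assumption made for illustration: all stages take the same time, each intermediate register adds a fixed `latch_time` per cycle, and the numbers passed in are made up rather than taken from Section 24.8.

```python
def pipeline_speedup(stages, n_ops, stage_time=1.0, latch_time=0.0):
    """Estimated speedup of a pipeline over doing each operation start to finish."""
    unpipelined = n_ops * stages * stage_time
    cycle = stage_time + latch_time               # register write lengthens each cycle
    pipelined = (stages + n_ops - 1) * cycle      # fill the pipe, then one result per cycle
    return unpipelined / pipelined

ideal = pipeline_speedup(3, 1000)                        # just under 3
with_latches = pipeline_speedup(3, 1000, latch_time=0.1) # noticeably under 3
print(round(ideal, 3), round(with_latches, 3))
```

Even with no register overhead the speedup stays just below 3, because the first result still needs three cycles to emerge.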

Suppose you are working with a 4-bit binary number system, using excess 8 notation. Write down all 16 bit patterns and show what their values would be in unsigned binary, sign-magnitude, 2's complement and excess-8.

               raw bits  unsigned  sign-magnitude   excess-8    2's complement
               -------------------------------------------------------------
                 0000         0         +0              -8            0
                 0001         1         +1              -7           +1
                 0010         2         +2              -6           +2
                 0011         3         +3              -5           +3
                 0100         4         +4              -4           +4
                 0101         5         +5              -3           +5
                 0110         6         +6              -2           +6
                 0111         7         +7              -1           +7
                 1000         8         -0               0           -8
                 1001         9         -1              +1           -7
                 1010        10         -2              +2           -6
                 1011        11         -3              +3           -5
                 1100        12         -4              +4           -4
                 1101        13         -5              +5           -3
                 1110        14         -6              +6           -2
                 1111        15         -7              +7           -1
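
A table like this can be generated mechanically. The Python sketch below uses an invented helper, `interpretations`, that returns the four readings of one bit pattern. One caveat: Python integers have no negative zero, so the sign-magnitude pattern 1000 comes out as plain 0 rather than -0.

```python
def interpretations(bits, width=4):
    """Return (unsigned, sign-magnitude, excess-8, 2's complement) for one pattern."""
    sign_bit = bits >> (width - 1)
    magnitude = bits & ((1 << (width - 1)) - 1)        # the low 3 bits
    sign_magnitude = -magnitude if sign_bit else magnitude
    excess = bits - (1 << (width - 1))                 # subtract the bias of 8
    twos = bits - (1 << width) if sign_bit else bits   # subtract 16 if top bit is set
    return bits, sign_magnitude, excess, twos

print("bits  unsigned  sign-mag  excess-8  2's comp")
for pattern in range(16):
    u, sm, ex, tc = interpretations(pattern)
    print(f"{pattern:04b}  {u:8d}  {sm:+8d}  {ex:+8d}  {tc:+8d}")
```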

How is the value 38 represented in a true binary floating point system, like the one presented in Chapter 24? The first digit of the mantissa is assumed (not stored), since all values are stored in normalized form.

                    38 = 100110₂    (32 + 4 + 2)
               
                       = .100110₂ x 2⁶
               
                          1     0     0     1       1     0
                         1/2 + 1/4 + 1/8 + 1/16 + 1/32 + 1/64
                         1/2       +       1/16 + 1/32
                         .5        +      .0625 + .03125       = .59375
               
               
                      2⁶ = 64, so .59375 x 64 = 38
               
                 +---+-----------+---------------------------------+
                 | 0 | 10000110  | 00110 ....... (all 0's) ....    |
                 +---+-----------+---------------------------------+
                                  ^
                                  .1 is implied
               
                  exponent is 8 bits, excess 128,    6 + 128 = 134,   134=10000110₂
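
The same encoding can be sketched in Python with `math.frexp`, which splits a number into a fraction in [0.5, 1) and a power of two, exactly the .1xxxx₂ x 2ⁿ form used above. The function name `encode_binary` is invented, the input is assumed to be nonzero, and the default of 23 stored mantissa bits is an assumption, since the answer above does not give the mantissa width.

```python
import math

def encode_binary(value, stored_bits=23, bias=128):
    """Return (sign, exponent field, stored mantissa bits) for a nonzero value."""
    sign = 1 if value < 0 else 0
    fraction, exponent = math.frexp(abs(value))   # 38 -> (0.59375, 6)
    fraction = fraction * 2 - 1                   # drop the implied leading .1
    bits = []
    for _ in range(stored_bits):                  # peel off one bit per doubling
        fraction *= 2
        bit = int(fraction)
        bits.append(str(bit))
        fraction -= bit
    return sign, format(exponent + bias, "08b"), "".join(bits)

sign, exponent_field, mantissa_field = encode_binary(38)
print(sign, exponent_field, mantissa_field)
# 0 10000110 00110000000000000000000
```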