Wednesday, May 3, 2017

NaN and other special values of Intel x87 processors

Compiling certain numerical libraries usually requires setting up several non-portable subroutines that define machine-dependent constants, i.e. numbers describing the arithmetic environment. For example, the subroutines R1MACH/D1MACH from SLATEC define five basic values:

C   D1MACH( 1) = B**(EMIN-1), the smallest positive magnitude.
C   D1MACH( 2) = B**EMAX*(1 - B**(-T)), the largest magnitude.
C   D1MACH( 3) = B**(-T), the smallest relative spacing.
C   D1MACH( 4) = B**(1-T), the largest relative spacing.
C   D1MACH( 5) = LOG10(B)
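To make the five formulas concrete, here is a minimal Python sketch that evaluates them for an IEEE 754 double. The parameters B = 2, T = 53, EMIN = -1021 and EMAX = 1024 are my assumption (the usual SLATEC-style model values for this format); the dictionary name `d1mach` is only illustrative.

```python
import math
import sys

# Assumed model parameters for an IEEE 754 double in the SLATEC convention;
# they are not stated in the text above.
B, T, EMIN, EMAX = 2, 53, -1021, 1024

d1mach = {
    1: float(B) ** (EMIN - 1),               # B**(EMIN-1), smallest positive magnitude
    # B**EMAX alone overflows a double, so scale the factor (1 - B**(-T)) instead.
    # math.ldexp multiplies by a power of two, which is valid here because B = 2.
    2: math.ldexp(1.0 - float(B) ** (-T), EMAX),   # largest magnitude
    3: float(B) ** (-T),                     # B**(-T), smallest relative spacing
    4: float(B) ** (1 - T),                  # B**(1-T), largest relative spacing
    5: math.log10(B),                        # LOG10(B)
}

for index, value in d1mach.items():
    print(f"D1MACH({index}) = {value!r}")
```

Under these assumptions the first, second and fourth values should coincide with `sys.float_info.min`, `sys.float_info.max` and `sys.float_info.epsilon`.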

The library I have compiled needed several additional values:

C   DMACH(6) = not-a-number.
C   DMACH(7) = positive machine infinity.
C   DMACH(8) = negative machine infinity.

where, according to the documentation, the not-a-number should be a "quiet NaN". The IEEE 754 standard for binary floating-point arithmetic specifies a quiet NaN as the result of invalid operations such as 0/0.

The well-known program "dpara.f" from the paranoia package determines the NaN of the current processor as the result of the operation 0/0. The upper 32-bit word of the resulting double is

                            sig exp 11bit   mant 52bit
-524288=4294443008=0xFFF80000=1 11111111111 1000000000...

This value has the sign bit set, which was different from the value suggested in the code. 
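The three notations of that word (signed decimal, unsigned decimal, hexadecimal) can be checked with a short Python sketch. It assumes the lower 32 bits of the double are all zero, which the trailing "..." in the listing suggests but does not state; the variable names are illustrative.

```python
import math
import struct

HIGH_WORD = 0xFFF80000   # upper 32 bits reported for 0/0
LOW_WORD = 0x00000000    # lower 32 bits, assumed to be zero

# The same 32 bits read as an unsigned and as a signed (two's complement) integer.
unsigned_value = HIGH_WORD
signed_value = struct.unpack("<i", struct.pack("<I", HIGH_WORD))[0]

# Split the full 64-bit pattern into sign, 11-bit exponent and 52-bit mantissa.
bits = (HIGH_WORD << 32) | LOW_WORD
sign = bits >> 63
exponent = (bits >> 52) & 0x7FF
mantissa = bits & ((1 << 52) - 1)

# Reinterpret the 64 bits as a double and confirm that it is a NaN.
value = struct.unpack("<d", struct.pack("<Q", bits))[0]

print(unsigned_value, signed_value)
print(sign, f"{exponent:011b}", f"{mantissa:052b}")
print(math.isnan(value))
```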
The IEEE 754 Converter is also very helpful, although it handles the single precision case only. Another interesting source of information is this blog post:
For single precision (s: significand, e: exponent)
e=0x0,  s=0:  => +-0.0
e=0xff, s=0:  => +-Inf
e=0xff, s!=0: => NaN

and for double (s: significand, e: exponent)
e=0x0,   s=0:  => +-0.0
e=0x7ff, s=0:  => +-Inf
e=0x7ff, s!=0: => NaN
7ff0 0000 0000 0000   = Inf
fff0 0000 0000 0000   = -Inf
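The rules for the double precision case translate directly into a few bit operations. The following is a sketch only; the function names `double_bits` and `classify_double` are hypothetical and not taken from the quoted post.

```python
import struct

def double_bits(x):
    """Return the 64-bit pattern of a Python float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def classify_double(x):
    """Classify a float using the exponent (e) and significand (s) rules above."""
    bits = double_bits(x)
    e = (bits >> 52) & 0x7FF          # 11-bit exponent field
    s = bits & ((1 << 52) - 1)        # 52-bit significand field
    if e == 0x7FF:
        return "Inf" if s == 0 else "NaN"
    if e == 0 and s == 0:
        return "zero"
    return "finite"                   # any other normal or subnormal number

for x in (0.0, -0.0, float("inf"), float("-inf"), float("nan"), 1.0):
    print(f"{double_bits(x):016x}  {classify_double(x)}")
```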

I found the definitive answer in this description:

Both types of NaNs are represented by the largest biased exponent allowed by the format (single- or double-precision) and a mantissa that is non-zero.
  • The bit pattern of the mantissa for a signalling NaN has the most significant digit set to zero and at least one of the remaining digits set to one.
  • The bit pattern of the mantissa for a quiet NaN has the most significant digit set to one
For single-precision values:
  • Positive infinity is represented by the bit pattern 7F800000
  • Negative infinity is represented by the bit pattern FF800000
  • A signalling NaN (NANS) is represented by any bit pattern
    between 7F800001 and 7FBFFFFF or between FF800001 and FFBFFFFF
  • A quiet NaN (NANQ) is represented by any bit pattern
    between 7FC00000 and 7FFFFFFF or between FFC00000 and FFFFFFFF
For double-precision values:
  • Positive infinity is represented by the bit pattern 7FF0000000000000
  • Negative infinity is represented by the bit pattern FFF0000000000000
  • A signalling NaN is represented by any bit pattern
    between 7FF0000000000001 and 7FF7FFFFFFFFFFFF or
    between FFF0000000000001 and FFF7FFFFFFFFFFFF
  • A quiet NaN is represented by any bit pattern
    between 7FF8000000000000 and 7FFFFFFFFFFFFFFF or
    between FFF8000000000000 and FFFFFFFFFFFFFFFF
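The quiet/signalling distinction described above depends only on the most significant mantissa bit, so it can be tested on the integer bit pattern without ever loading the value into a floating-point register. A minimal sketch, with the hypothetical helper name `nan_kind`:

```python
def nan_kind(bits, width=64):
    """Return 'quiet', 'signalling' or None for a 32- or 64-bit pattern.

    None means the pattern is not a NaN (a finite number or an infinity).
    """
    exp_bits, man_bits = (8, 23) if width == 32 else (11, 52)
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = bits & ((1 << man_bits) - 1)
    # A NaN needs the largest biased exponent and a non-zero mantissa.
    if exponent != (1 << exp_bits) - 1 or mantissa == 0:
        return None
    quiet_bit = 1 << (man_bits - 1)   # most significant mantissa bit
    return "quiet" if mantissa & quiet_bit else "signalling"

print(nan_kind(0x7FC00000, 32))            # lower bound of the quiet range
print(nan_kind(0x7FBFFFFF, 32))            # upper bound of the signalling range
print(nan_kind(0xFFF8000000000000))        # the 0/0 result found earlier
```

The sign bit is ignored on purpose, which is why each category in the list above covers two ranges.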
To conclude: the bit pattern is stored in the floating-point variable through an EQUIVALENCE with an integer array, one INTEGER per REAL*4 value or two INTEGERs per REAL*8 value.

For the single precision version:

      EQUIVALENCE (RMACH, IRMACH)
C NaN                             sig exp      mant
C       /2143289344/ = 0x7FC00000 = 0 11111111 10000000000000000000000
      DATA IRMACH(6)/2143289344/
C +Inf                             sig exp     mant
C orig: /2139095040/ = 0x7F800000 = 0 11111111 00000000000000000000000
      DATA IRMACH(7)/2139095040/
C -Inf                             sig exp     mant
C orig: /4286578688/ = 0xFF800000 = 1 11111111 00000000000000000000000
C 4286578688 does not fit a signed 32-bit INTEGER, so the same bits
C are written as the two's complement value -8388608
      DATA IRMACH(8)/-8388608/
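The three integers can be cross-checked outside Fortran. The Python sketch below reinterprets each signed 32-bit integer as a single precision value, which mimics what the EQUIVALENCE does on a little-endian machine; the helper name `real4_from_integer` is made up for this illustration.

```python
import math
import struct

def real4_from_integer(i):
    """Reinterpret a signed 32-bit integer as a REAL*4 bit pattern."""
    return struct.unpack("<f", struct.pack("<i", i))[0]

nan4 = real4_from_integer(2143289344)    # IRMACH(6), 0x7FC00000
pinf4 = real4_from_integer(2139095040)   # IRMACH(7), 0x7F800000
ninf4 = real4_from_integer(-8388608)     # IRMACH(8), 0xFF800000

print(nan4, pinf4, ninf4)
```

Passing 4286578688 to `struct.pack("<i", ...)` raises `struct.error`, because the value is outside the signed 32-bit range; this is the same reason the DATA statement uses -8388608.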

and the double precision version (Intel processors are little-endian, so the low-order word of each pair comes first):

      EQUIVALENCE (RMACH, IRMACH)
...
C NaN RMACH(6)                    sig exp 11bit   mant 52bit
C       /2146959360/ = 0x7FF80000 = 0 11111111111 10000000000000000000
      DATA IRMACH(11)/0/
      DATA IRMACH(12)/2146959360/
C +Inf RMACH(7)                   sig exp 11bit   mant 52bit
C       /2146435072/ = 0x7FF00000 = 0 11111111111 00000000000000000000
      DATA IRMACH(13)/0/
      DATA IRMACH(14)/2146435072/
C -Inf RMACH(8)                   sig exp 11bit   mant 52bit
C       /  -1048576/ = 0xFFF00000 = 1 11111111111 00000000000000000000
      DATA IRMACH(15)/0/
      DATA IRMACH(16)/-1048576/
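The word pairs can be verified the same way. This sketch packs the low and high 32-bit integers in little-endian order and reads the eight bytes back as a double; `real8_from_integers` is a hypothetical helper name.

```python
import math
import struct

def real8_from_integers(low, high):
    """Reinterpret two signed 32-bit integers (low word first) as a REAL*8."""
    return struct.unpack("<d", struct.pack("<ii", low, high))[0]

nan8 = real8_from_integers(0, 2146959360)    # IRMACH(11), IRMACH(12)
pinf8 = real8_from_integers(0, 2146435072)   # IRMACH(13), IRMACH(14)
ninf8 = real8_from_integers(0, -1048576)     # IRMACH(15), IRMACH(16)

# With the two words swapped the exponent field is zero, so the result is
# a tiny subnormal number instead of +Inf.
swapped = real8_from_integers(2146435072, 0)

print(nan8, pinf8, ninf8, swapped)
```

The swapped case shows why the word order matters: on a big-endian machine the odd and even IRMACH indices would have to be exchanged.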
