
[ns] Fix: ns and compiling with -O2 on Linux x86




In ns-2.1b7a the FPU was not being set up correctly on Intel x86
based Linux machines. This led to validation failures, portability
problems and inconsistent floating point calculations when compiler
optimisations were used.
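To illustrate the kind of inconsistency meant here, the sketch below shows
how the same sum gives different answers at 53-bit (double) and wider
(long double) precision. It is only an illustration, not code from ns: the
function names are hypothetical, and it assumes long double is the x87
extended format, as it normally is with gcc on Linux x86. With
optimisation on, the compiler may keep intermediate doubles in x87
registers at the wider precision instead of rounding them to double in
memory, which is one way results can come to depend on the compiler flags.

```cpp
#include <cfloat>

// Hypothetical helper: true when long double carries more significand bits
// than double (e.g. the 64-bit significand of the x87 extended format).
bool long_double_is_wider() {
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}

// Adds 1 to 1e16 and stores the result in a 64-bit double.
// 1e16 + 1 is not representable with a 53-bit significand, so the stored
// sum rounds back to 1e16 and the difference is 0.
double lost_in_double() {
    volatile double a = 1e16;       // volatile forces a real store to memory
    volatile double sum = a + 1.0;
    return sum - a;
}

// The same sum carried in long double. Where long double is wider than
// double, the +1 survives and the difference is 1.
long double kept_in_long_double() {
    volatile long double a = 1e16L;
    volatile long double sum = a + 1.0L;
    return sum - a;
}
```

Comparing lost_in_double() with kept_in_long_double() on a given machine
shows whether the extra precision is available there at all.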

Looking at the CVS tree, there have been attempted fixes for this since CVS
revision 1.16 of tclAppInit.cc.

My fix for this is to explicitly set up the floating point unit to use
double precision (a 53-bit significand) and round-to-nearest when ns
starts. There was code to do this in tclAppInit.cc, but it appears not to
set the control word correctly. The attached patch simplifies the code and
gets this setup working again using the glibc header fpu_control.h. I've
tried it on a RH6.2 Pentium III and a RH6.1 Celeron, both using
egcs-2.91.66 with -O2, and it passes the validation suites.
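As a rough way to check that such a setup took effect, the sketch below
sets the precision and rounding fields and then reads the control word
back. It is a variation on the patch rather than a copy of it: the
function name is hypothetical, and it uses read-modify-write so the
existing exception mask bits are left alone, whereas the patch builds the
whole word from zero. It assumes glibc's <fpu_control.h> on x86 and does
nothing elsewhere. On x86-64, ordinary double arithmetic normally uses SSE,
so there the control word only affects x87 (long double) code.

```cpp
// Use <fpu_control.h> only where it exists (glibc on x86); elsewhere the
// helper below becomes a no-op.
#if (defined(__i386__) || defined(__x86_64__)) && defined(__has_include)
#  if __has_include(<fpu_control.h>)
#    include <fpu_control.h>
#    define HAVE_X87_CONTROL 1
#  endif
#endif
#ifndef HAVE_X87_CONTROL
#  define HAVE_X87_CONTROL 0
#endif

// Hypothetical helper: select double precision and round-to-nearest in the
// x87 control word, then read the word back to confirm both fields.
bool set_and_check_double_precision() {
#if HAVE_X87_CONTROL
    fpu_control_t cw;
    _FPU_GETCW(cw);

    // Precision-control field (mask _FPU_EXTENDED): select double.
    cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;
    // Rounding-control field (mask _FPU_RC_ZERO): select nearest.
    cw = (cw & ~_FPU_RC_ZERO) | _FPU_RC_NEAREST;
    _FPU_SETCW(cw);

    fpu_control_t check;
    _FPU_GETCW(check);
    return (check & _FPU_EXTENDED) == _FPU_DOUBLE &&
           (check & _FPU_RC_ZERO) == _FPU_RC_NEAREST;
#else
    return true;  // no x87 control word to set on this platform
#endif
}
```

Calling set_and_check_double_precision() once at startup, before any
floating point work, would mirror where ns calls fix_i386_linux_floats().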

This should mean that -O2 optimisations can be used on Linux x86. It
should also fix the Linux floating point portability problems described
in:
http://www.isi.edu/nsnam/archive/ns-users/webarch/1999/msg04185.html

Please see the attached patch to tclAppInit.cc, made against ns-2.1b7a.

Tom

P.S. Looking through the source, there are several possible "int main(int
argc, char **argv)" entry points for ns, in tclAppInit.cc, tkAppInit.cc
and ns_tclsh.cc. Only tclAppInit.o actually gets linked into the ns
binary. Do ns_tclsh.cc and tkAppInit.cc actually do anything, or are they
cruft that should be removed?

--- tclAppInit.cc.orig	Fri Mar  2 11:44:50 2001
+++ tclAppInit.cc	Fri Mar  2 11:46:02 2001
@@ -63,63 +63,34 @@
 
 
 #if defined(linux) && defined(i386)
-#ifndef HAVE_FESETPRECISION
-/*
- * From:
- |  Floating-point environment <fenvwm.h>                                    |
- | Copyright (C) 1996, 1997, 1998, 1999                                      |
- |                     W. Metzenthen, 22 Parker St, Ormond, Vic 3163,        |
- |                     Australia.                                            |
- |                     E-mail   [email protected]                          |
- * used here with permission.
- */
-#define FE_FLTPREC       0x000
-#define FE_INVALIDPREC   0x100
-#define FE_DBLPREC       0x200
-#define FE_LDBLPREC      0x300
-/*
- * From:
- * fenvwm.c
- | Copyright (C) 1999                                                        |
- |                     W. Metzenthen, 22 Parker St, Ormond, Vic 3163,        |
- |                     Australia.  E-mail   [email protected]              |
- * used here with permission.
- */
-/*
-  Set the precision to prec if it is a valid
-  floating point precision macro.
-  Returns 1 if precision set, 0 otherwise.
-  */
-int fesetprecision(int prec)
-{
-  unsigned short cw;
-  asm volatile ("fnstcw %0":"=m" (cw));
-  if ( !(prec & ~FE_LDBLPREC) && (prec != FE_INVALIDPREC) )
-    {
-      cw = (cw & ~FE_LDBLPREC) | (prec & FE_LDBLPREC);
-      asm volatile ("fldcw %0":"=m" (cw));
-      return 1;
-    }
-  else
-    return 0;
-}
-#endif /* !HAVE_FESETPRECISION */
+#include <fpu_control.h>
 
 /*
- * Linux i386 uses 60-bit floats for calculation,
+ * Linux x86 uses 60-bit floats for calculation,
  * not 56-bit floats, giving different results.
  * Fix that.
- *
- * See <http://www.linuxsupportline.com/~billm/faq.html>
+ * 
+ * See <http://www.suburbia.net/~billm/floating-point/index.html>
  * for why we do this fix.
  *
- * This function is derived from wmexcep
+ * The macros and constants set in <fpu_control.h> are used here. 
  *
  */
+
 void
 fix_i386_linux_floats()
 {
-	fesetprecision(FE_DBLPREC);
+  unsigned short mode =0;
+
+  // masks for exceptions  (see fpu_control.h)
+  mode |= _FPU_MASK_IM | _FPU_MASK_DM; // invalid op and denormalized op 
+  mode |= _FPU_MASK_ZM | _FPU_MASK_PM; // zerodiv and precision
+  mode |= _FPU_MASK_UM | _FPU_MASK_OM; // overflow and underflow
+  
+  // set precision to double and rounding to nearest (as IEEE 754)
+  mode |= _FPU_DOUBLE | _FPU_RC_NEAREST;
+
+  _FPU_SETCW(mode);  
 }
 #endif