
PATCH for CalendarScheduler; FIXES numerical stability



I provide below my patch for the CalendarScheduler, which fixes the
"backwards in time" problem that I have shown to be caused by numerical
instability (accumulated roundoff in repeated FP additions) in the existing
implementation.

Compile scheduler.cc with -DCAL_FIX to build with the patch.

With this patch, my 1000-second simulation of 300 mobile nodes and 900
TCP connections does not cause the CalendarScheduler to give the "backwards
in time" error. Without the patch, ns-2.1b6 does give this error.

I've run the full set of validation tests (./validate) of my fixed
CalendarScheduler against results generated by the HeapScheduler. All
tests pass!

My fix is quite simple. Each time the "year" of the queue wraps (from
the last bucket to the first), I use direct FP multiplication to compute
the next value of buckettop_. That is:

	buckettop_ = prevtop_ + nbuckets_ * width_;

(where prevtop_ is the buckettop_ value from the immediately previous
year-wrap transition.)

Note that I must take care to maintain prevtop_ correctly across expansions
of the bucket count--I do this.

The result is that *within* a year, fast FP adds are used to compute
buckettop_. These adds accumulate error. But at the end of every year, I
"pull" buckettop_ back to where it should be by doing the above direct
multiplication. Because the FP multiply happens once per year, the effect
on the code's performance should be negligible for large calendars, whose
years have many days (buckets).
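
To make the effect concrete, here is a small standalone sketch (not the
patch itself; the bucket count, width, and starting value are made up purely
for illustration). It compares a buckettop_ advanced only by repeated FP
adds with one that is pulled back by a single multiplication at each year
wrap:

/* Standalone sketch (not from the patch): bucket count, width, and the
   starting value of buckettop_ are invented for illustration. */
#include <stdio.h>
#include <math.h>

int main()
{
	const int nbuckets = 4096;	/* buckets ("days") per year */
	const double width = 1.0e-6;	/* bucket width */
	const int years = 1000;

	double start = 1000.0;		/* pretend buckettop_ is already large */
	double naive = start;		/* advanced by += only */
	double fixed = start;		/* += within a year, recomputed at wrap */
	double prevtop = start;		/* buckettop_ at the last year wrap */

	for (int y = 0; y < years; y++) {
		for (int b = 0; b < nbuckets; b++) {
			naive += width;
			fixed += width;
		}
		/* year wrap: pull buckettop_ back with one multiplication */
		fixed = prevtop + nbuckets * width;
		prevtop = fixed;
	}

	/* reference value computed with a single multiplication */
	double ref = start + (double)years * nbuckets * width;
	printf("error with adds only:       %g\n", fabs(naive - ref));
	printf("error with year-wrap fixup: %g\n", fabs(fixed - ref));
	return 0;
}

The adds-only value drifts a little more with every addition, while the
corrected value only accumulates error once per year.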

Note that with this fix, it *may* be the case that MIN_WIDTH can be made
smaller than 1.0e-6, since the accumulation of roundoff error will be
limited to within one year. Someone may want to pursue this--the result
would be shorter linked lists in buckets, and improved insertion performance.

There is one other thing about the CalendarQueue code that makes me uneasy.
When choosing a new width_, the code enforces only that the new width_ be at
least MIN_WIDTH * (clock_+1.0). This makes no sense in my mobile simulations:
so many events are queued up using "ns at" while clock_ is still zero that
the number of buckets grows to 256K before the simulation even starts (in
fact, it does not grow again for the life of the simulation). Once the
simulation gets rolling and clock_ reaches 1000 or greater, the width_ chosen
at clock_ == 0 is still in use, but it is far too small to avoid precision
loss in FP adds of clock_ + width_ when clock_ is so much larger.
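
As a rough standalone illustration (not code from ns-2; the constants below
simply mirror MIN_WIDTH and the clock_ values discussed above), compare the
increment recovered from clock_ + width_ at clock_ == 0 with the one
recovered at clock_ == 1000:

/* Standalone illustration (not from ns-2): the constants mirror MIN_WIDTH
   and the clock_ values discussed in the text. */
#include <stdio.h>
#include <math.h>
#include <float.h>

int main()
{
	const double width = 1.0e-6;	/* MIN_WIDTH-sized width_ chosen at clock_ == 0 */
	const double early = 0.0;	/* clock_ when width_ was chosen */
	const double later = 1000.0;	/* clock_ once the simulation is rolling */

	/* spacing of representable doubles near each sum */
	printf("ulp near %g: %g\n", early + width,
	       nextafter(early + width, DBL_MAX) - (early + width));
	printf("ulp near %g: %g\n", later + width,
	       nextafter(later + width, DBL_MAX) - (later + width));

	/* the increment actually recovered from each add */
	printf("(0 + width) - 0       = %.17g\n", (early + width) - early);
	printf("(1000 + width) - 1000 = %.17g\n", (later + width) - later);
	return 0;
}

At clock_ == 0 the add is essentially exact; at clock_ == 1000 the recovered
increment already differs from width_ at the scale of a ULP of 1000.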

It's possible, in fact, that this is the essence of the "backwards in time"
bug--the choice of minimum width_ can be made inappropriately when many
events are inserted before the clock starts running, and no resizing may ever
occur thereafter, leaving too small a width_ for numerical stability. This
leaves the question of *what* value, if not clock_, is the right one to
use when computing a minimum width_. Not even max_ is the right choice--if
the number of buckets never changes again, clock_ and max_ *both* can be
much greater later in the simulation.

And so, my fix may be the best way to deal with this after all: because it
limits the accumulation of FP roundoff error to within a year, it becomes
less crucial to guess the correct future width_ at the moment the number of
buckets is doubled.

-Brad, [email protected]

Patch follows:

--- ../ns.pre-CVS/scheduler.cc	Wed Jun  9 14:23:34 1999
+++ ./scheduler.cc	Wed Aug 18 16:24:04 1999
@@ -31,12 +31,12 @@
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
- * @(#) $Header: /usr/src/mash/repository/vint/ns-2/scheduler.cc,v 1.43 1999/06/09 21:23:34 kfall Exp $
+ * @(#) $Header: /home/sphinx/u0/bkarp/cvsroot/ns-2.1b6/scheduler.cc,v 1.6 1999/08/17 21:27:57 bkarp Exp $
  */
 
 #ifndef lint
 static const char rcsid[] =
-    "@(#) $Header: /usr/src/mash/repository/vint/ns-2/scheduler.cc,v 1.43 1999/06/09 21:23:34 kfall Exp $ (LBL)";
+    "@(#) $Header: /home/sphinx/u0/bkarp/cvsroot/ns-2.1b6/scheduler.cc,v 1.6 1999/08/17 21:27:57 bkarp Exp $ (LBL)";
 #endif
 
 #include <stdlib.h>
@@ -366,6 +366,9 @@
 	double width_;
 	double oneonwidth_;
 	double buckettop_;
+#ifdef CAL_FIX
+	double prevtop_;
+#endif
 	double last_clock_;
 	int nbuckets_;
 	int buckbits_;
@@ -451,9 +454,21 @@
 				resize(nbuckets_/2);
 			return e;
 		} else {
-			if (++i == nbuckets_)
+			if (++i == nbuckets_) {
 				i = 0;
-			buckettop_ += width_;
+#ifdef CAL_FIX
+				/* Brad Karp, [email protected]:
+				   at the end of each year, eliminate roundoff
+				   error in buckettop_ */
+				buckettop_ =
+					prevtop_ + nbuckets_ * width_;
+				prevtop_ = buckettop_;
+#endif
+			}
+#ifdef CAL_FIX
+			else
+#endif
+				buckettop_ += width_;
 		}
 	} while (i != lastbucket_);
 
@@ -478,6 +493,9 @@
 	last_clock_ = min->time_;
 	unsigned long n = (unsigned long)(min->time_ * oneonwidth_);
 	buckettop_ = (n + 1) * width_ + 0.5 * width_;
+#ifdef CAL_FIX
+	prevtop_ = buckettop_ - lastbucket_ * width_;
+#endif
 
 	return deque();
 }
@@ -495,6 +513,9 @@
 	long n = (long)(start * oneonwidth_);
 	lastbucket_ = n % nbuck;
 	buckettop_ = (n + 1) * width_ + 0.5 * width_;
+#ifdef CAL_FIX
+	prevtop_ = buckettop_ - lastbucket_ * width_;
+#endif
 	bot_threshold_ = nbuck / 2 - 2;
 	top_threshold_ = 2 * nbuck;
 }
@@ -539,12 +560,18 @@
 	// dequue and requeue sample events to gauge width
 	double olp = clock_;
 	double olt = buckettop_;
+#ifdef CAL_FIX
+	double olbt = prevtop_;
+#endif
 	int olb = lastbucket_;
 	for (int i = 0; i < nsamples; i++) hold[i] = deque();
 	// insert in the inverse order and using insert2 to take care of same-time events.
 	for (int j = nsamples-1; j >= 0; j--) insert2(hold[j]);
 	clock_ = olp;
 	buckettop_ = olt;
+#ifdef CAL_FIX
+	prevtop_ = olbt;
+#endif
 	lastbucket_ = olb;
 
 	// try to work out average cluster separation