The Network Simulator ns-2: Debugging Tips

Memory Leaks

Tcl-level Debugging

Ns supports Don Libes' Tcl debugger (see its PostScript documentation and source code). Install the program or leave the source code in a directory parallel to ns-2 and it will be built. Unlike expect, described in the tcl-debug documentation, we do not support the -D flag. To enter the debugger, add the line "debug 1" to your script at the appropriate location.

The command $ns gen-map lists all objects in a raw form.

This is useful to correlate the position and function of an object given its name. The name of the object is the OTcl handle, usually of the form ``_o###''. For TclObjects, this is also available in a C++ debugger such as gdb as this->name_.

C-Level Debugging

Any standard debugger should do the trick.

The following macro for gdb makes it easier to see what happens in subroutines that take Tcl arguments (like TcpAgent::command()):

## for dumping Tcl-passed arguments
define pargvc
set $i=0
while $i < argc
  p argv[$i]
  set $i=$i+1
  end
end
document pargvc
Print out argc argv[i]'s common in Tcl code.
(presumes that argc and argv are defined)
end

Mixing Tcl and C debugging

(Always a fun concept, right?)
It is a painful reality that when looking at the Tcl code and debugging Tcl-level stuff, one wants to get at the C-level classes, and vice versa. This is a smallish hint on how one can make that task easier. If you are running ns through gdb, then
  1. The following incantation (the call Tcl::instance().eval("debug 1") line in the session below) gets you access to the Tcl debugger. Notes on how you can then use this debugger and what you can do with it are documented elsewhere.
    (gdb) run
    Starting program: /nfs/prot/kannan/PhD/simulators/ns/ns-2/ns 
    Breakpoint 1, AddressClassifier::AddressClassifier (this=0x12fbd8)
    (gdb) p this->name_
    $1 = 0x2711e8 "_o73"
    (gdb) call Tcl::instance().eval("debug 1")
    15: lappend auto_path $dbg_library
    dbg15.3> w
    *0: application
     15: lappend auto_path /usr/local/lib/dbg
    dbg15.4> Simulator info instances
    dbg15.5> _o1 now
    dbg15.6> # and other fun stuff
    dbg15.7> _o73 info class
    dbg15.8> _o73 info vars
    slots_ shift_ off_ip_ offset_ off_flags_ mask_ off_cmn_
    dbg15.9> c
    (gdb) w
    Ambiguous command "w": while, whatis, where, watch.
    (gdb) where
    #0  AddressClassifier::AddressClassifier (this=0x12fbd8)
    #1  0x5c68 in AddressClassifierClass::create (this=0x10d6c8, argc=4, 
        argv=0xefffcdc0) at
  2. In a like manner, if you have started ns through gdb, then you can always get gdb's attention by sending an interrupt, usually ^C on berkeloidrones.
However, note that these do tamper with the stack frame, and on occasion may (sometimes can (and rarely, does)) screw up the stack so that you may not be in a position to resume execution. To its credit, gdb appears to be smart enough to warn you about such instances, when you should tread softly and carry a big stick.

Memory Debugging

The first thing to do if you run out of memory is to make sure you can use all the memory on your system. Some systems by default limit the memory available to individual programs to something less than all available memory. To relax this, use the limit or ulimit command. These are shell functions---see the manual page for your shell for details. Limit is for csh, ulimit is for sh/bash.
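
As a minimal sketch for sh/bash, the following inspects the data-segment limit and raises the soft limit as far as the hard limit allows. Which resource actually constrains ns differs between systems (data segment with -d, virtual memory with -v, and so on), so the choice of -d here is an assumption; csh users would use limit datasize instead.

```shell
#!/bin/bash
# Show the current soft and hard limits on the data segment
# (reported in KB, or as "unlimited").
soft_before=$(ulimit -S -d)
hard=$(ulimit -H -d)
echo "data segment: soft=$soft_before hard=$hard"

# Raise the soft limit to the hard limit. This affects only this shell
# and the programs started from it.
ulimit -S -d "$hard"
soft_after=$(ulimit -S -d)
echo "data segment soft limit is now $soft_after"
```

Run ns from the same shell afterwards so that it inherits the raised limit.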

Simulations of large networks can consume a lot of memory. Ns-2.0b17 supports Gray Watson's dmalloc library (see its web documentation and source code). To add it, install it on your system or leave its source in a directory parallel to ns-2 and specify --with-dmalloc when configuring ns. Then build all components of ns for which you want memory information with debugging symbols (this should include at least ns-2, possibly tclcl and otcl, maybe also tcl).

To use dmalloc:

  1. Define an alias (csh: alias dmalloc 'eval `\dmalloc -C \!*`'; bash: function dmalloc { eval `command dmalloc -b $*`; })
  2. Turn debugging on by typing dmalloc -l logfile low
  3. Run your program (which was configured and built with dmalloc as described above)
  4. Interpret logfile by running dmalloc_summarize ns <logfile (You need to download dmalloc_summarize separately.)
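
The four steps can be strung together in one bash function, sketched below. The function name dmalloc_session, the ./ns path, and the default log file name are placeholders chosen for this example, and ns is assumed to have been configured with --with-dmalloc as described above.

```shell
#!/bin/bash
# dmalloc_session SCRIPT [LOGFILE]
# Hypothetical helper: run one simulation under dmalloc and summarize it.
dmalloc_session() {
    local script=$1 logfile=${2:-logfile}

    # Bail out early if the dmalloc utility is not installed.
    if ! command -v dmalloc >/dev/null 2>&1; then
        echo "dmalloc utility not found in PATH" >&2
        return 1
    fi

    # Steps 1-2: "dmalloc -b" prints Bourne-shell commands that set
    # DMALLOC_OPTIONS; eval applies them to the current shell.
    eval "$(command dmalloc -b -l "$logfile" low)"

    # Step 3: run the simulation; the dmalloc library writes to $logfile.
    ./ns "$script"

    # Step 4: summarize allocations (dmalloc_summarize is a separate download).
    dmalloc_summarize ns < "$logfile"
}
```

It would be invoked as, for example, dmalloc_session cmcast-100.tcl from the directory containing the ns binary.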

On some platforms you may need to link things statically to get dmalloc to work. On Solaris this is done by linking with these options: "-Xlinker -B -Xlinker -static {libraries} -Xlinker -B -Xlinker -dynamic -ldl -lX11 -lXext". (You'll need to change Makefile. Thanks to Haobo Yu and Doug Smith for working this out.)

We can interpret a sample summary produced from this process on ns-2/tcl/ex/newmcast/cmcast-100.tcl with an exit statement after the 200th duplex-link-of-interfaces statement:

Dmalloc_summarize must map function names to and from their addresses. It often can't resolve addresses for shared libraries, so if you see lots of memory allocated by things beginning with ``ra='', that's what it is. The best way to avoid this problem is to build ns statically (if not all, then as much as possible).

Dmalloc's memory allocation scheme is somewhat expensive, and there are bookkeeping costs on top of that. Programs linked against dmalloc will consume more memory than the same programs linked against most standard mallocs.

Dmalloc can also diagnose other memory errors (duplicate frees, buffer overruns, etc.). See its documentation for details.

Memory Conservation Tips

Some tips for saving memory (some of these use examples from the cmcast-100.tcl script):
(Also see the page on large simulations for related information.)

If you have many links or nodes:

avoid trace-all
$ns trace-all $f causes trace objects to be pushed on all links. If you only want to trace one link, there's no need for this overhead. Saving is about 14 KB/link.

use arrays for sequences of variables
Each variable, say n$i in set n$i [$ns node], has a certain overhead. If a sequence of nodes is created as an array, i.e. n($i), then only one variable is created, consuming much less memory. Saving is about 40+ bytes/variable.

avoid unnecessary variables
If an object will not be referred to later on, avoid naming the object. E.g. set cmcast(1) [new CtrMcast $ns $n(1) $ctrmcastcomp [list 1 1]] would be better if replaced by new CtrMcast $ns $n(1) $ctrmcastcomp [list 1 1]. Saving is about 80 bytes/variable.

run on top of FreeBSD
malloc() overhead on FreeBSD is less than on some other systems. We will eventually port that allocator to other platforms.

dynamic binding (NEW)
Using bind() in C++ consumes memory for each object you create. This approach can be very expensive if you create many identical objects. Changing bind()'s to delay_bind() changes this memory requirement to per-class. See ~ns/ for an example of how to do binding, either way.

Some statistics collected by dmalloc (Investigating the bottleneck...)

Memory use in KBytes:

                              cmcast-50.tcl   cmcast-100.tcl
                              (217 links)     (950 links)
trace-all                     8,084           28,541
turn off trace-all            5,095           15,465
use array                     5,091           15,459
remove unnecessary variables  5,087           15,451
on SunOS                      5,105           15,484
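
The figure of about 14 KB/link quoted for trace-all can be checked against these statistics: divide the drop in memory when trace-all is turned off by the number of links. A quick sketch using awk (the helper name kb_per_link is made up for this example):

```shell
#!/bin/sh
# kb_per_link TRACED_KB UNTRACED_KB LINKS
# Prints the memory saved per link, in KB, to one decimal place.
kb_per_link() {
    awk -v traced="$1" -v untraced="$2" -v links="$3" \
        'BEGIN { printf "%.1f\n", (traced - untraced) / links }'
}

kb_per_link 8084 5095 217     # cmcast-50.tcl  -> 13.8
kb_per_link 28541 15465 950   # cmcast-100.tcl -> 13.8
```

Both scripts come out at roughly 13.8 KB per link, which suggests the cost of trace-all scales with the number of links rather than with the rest of the topology.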