
problem in newtimer() (?)



 
I think I have run across a bug in ns' one-way TCP implementation.
I can only really produce this effect in a big simulation with lots
of stuff going on, so I haven't been able to fully nail down what
should be happening.  But, it goes something like this...

We send a load of packets in slow start and they are nearly all
lost.  Recovery happens naturally for a little while.  Then, things
just end...  t_seqno_ is set to 406.  I get an ACK for 406.  This
ACK doesn't let me transmit anything (using SACK version of fast
recovery).  However, according to TcpAgent::newtimer() I cancel the
RTO timer, even though I have sent many more than 406 packets.  I
get a duplicate ACK, which also doesn't let me transmit anything.
Now, the transfer is done.  It is hung because there is no rexmt
timer, I think.  So, I made a patch that seems to help the
situation.  Here is my version of the newtimer() function.

void TcpAgent::newtimer(Packet* pkt)
{
	hdr_tcp *tcph = hdr_tcp::access(pkt);
#ifdef MALLMAN
	if ((t_seqno_ > tcph->seqno()) || (tcph->seqno() < maxseq_))
#else
	if (t_seqno_ > tcph->seqno())
#endif /* MALLMAN */
		set_rtx_timer();
	else
		cancel_rtx_timer();
}

In other words, we don't cancel the RTO timer if we have not
received an ACK for the last thing we have sent.  
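
To make the difference between the two conditions concrete, here is a
minimal standalone sketch that models them outside of ns.  The function
names are made up for illustration, and the value 500 for maxseq_ is an
assumption (the traces only show that many more than 406 packets had
been sent).  It is not the simulator code, just the two boolean tests.

```cpp
#include <cassert>

// Standalone model of the two newtimer() conditions; this is not ns code.
// Each function returns true when the retransmit timer would be (re)armed
// and false when it would be cancelled.

// Original test: keep the timer only if t_seqno_ is ahead of the ACK.
constexpr bool original_keeps_timer(int t_seqno, int ack_seqno) {
    return t_seqno > ack_seqno;
}

// Patched test: also keep the timer while the ACK is below maxseq_,
// i.e. while something we have sent is still unacknowledged.
constexpr bool patched_keeps_timer(int t_seqno, int ack_seqno, int maxseq) {
    return (t_seqno > ack_seqno) || (ack_seqno < maxseq);
}

// The stall described above: t_seqno_ is 406 and an ACK for 406 arrives.
inline void check_stall_scenario() {
    const int t_seqno = 406;
    const int ack_seqno = 406;
    const int maxseq = 500;  // hypothetical; only "many more than 406" is known

    assert(!original_keeps_timer(t_seqno, ack_seqno));        // timer cancelled
    assert(patched_keeps_timer(t_seqno, ack_seqno, maxseq));  // timer kept
}
```

With these numbers the original test cancels the timer while the
patched test keeps it running.  Once the ACK reaches maxseq_ and
t_seqno_ is no longer ahead of it, both tests cancel the timer.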

This seems to have fixed the problem.  However, I am not sure if
that introduces other problems.  I'd appreciate some feedback on
what this could affect.  Also, I can provide the packet traces I
have if someone wants to dig into this a little more.

allman