Detecting Internet Outages with Precise Active Probing (extended)
Lin Quan, John Heidemann, and Yuri Pradkin
USC/Information Sciences Institute
Abstract
Parts of the Internet are down every day, from the intentional shutdown of the Egyptian Internet in Jan. 2011 and natural disasters such as the Mar. 2011 Japanese earthquake, to the thousands of small outages caused by localized accidents, human error, maintenance, or choices. Understanding these events requires efficient and accurate detection methods, motivating our new system to detect network outages by active probing. We show that a single computer can track outages across the entire analyzable IPv4 Internet, probing a sample of 20 addresses in all 2.5M responsive /24 address blocks. We show that our approach is significantly more accurate than the best current methods, with 31% fewer false conclusions, while providing 14% greater coverage and requiring about the same probing traffic. We develop new algorithms to identify outages and cluster them to events, providing the first visualization of outages. We carefully validate our approach, showing consistent results over two years and from three different sites. Using public BGP archives and news sources we confirm 83% of large events. For a random sample of 50 observed events, we find 38% in partial control-plane information, reaffirming prior work that small outages are often not caused by BGP. Through controlled emulation we show that our approach detects 100% of full-block outages that last at least twice our probing interval. Finally, we report on Internet stability as a whole, and the size and duration of typical outages, using core-to-edge observations with much larger coverage than prior mesh-based studies. We find that about 0.3% of the Internet is likely to be unreachable at any time, suggesting the Internet provides only 2.5 "nines" of availability.
Availability
This paper is available in several formats: an abstract web page with pointers and cites, and PDF. Paper copies can be obtained by mailing a request to the authors. Copyright terms for this paper appear below.
Reference
- Quan12a
- Lin Quan, John Heidemann, and Yuri Pradkin. Detecting Internet Outages with Precise Active Probing (extended). Technical Report ISI-TR-2012-678b, USC/Information Sciences Institute, February, 2012. Updated May 2012; TR-678 supersedes ISI-TR-2011-672. <http://www.isi.edu/~johnh/PAPERS/Quan12a.html>.
@techreport{Quan12a,
author = "Lin Quan and John Heidemann and Yuri Pradkin",
title = "Detecting Internet Outages with Precise
Active Probing (extended)",
institution = "USC/Information Sciences Institute",
year = "2012",
number = "ISI-TR-2012-678b",
month = "February",
note = "Updated May 2012; TR-678 supersedes ISI-TR-2011-672",
keywords = "routing outage detection, active probing,
network outages, revision of [Quan11a]",
url = "http://www.isi.edu/~johnh/PAPERS/Quan12a.html",
pdfurl = "http://www.isi.edu/~johnh/PAPERS/Quan12a.pdf",
otherurl = "ftp://ftp.isi.edu/isi-pubs/tr-678.pdf",
myorganization = "USC/Information Sciences Institute",
copyrightholder = "authors",
}
Copyright
This paper is copyright © 2012 by its authors. Permission to make digital or hard copies of part or all of this work for personal use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that new copies bear this notice and the full citation on the first page. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission of the authors.