<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ANT Research News</title>
	<atom:link href="http://www.isi.edu/ant/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.isi.edu/ant/blog</link>
	<description>Updates about research by the ANT group (Analysis of Internet Traffic)</description>
	<lastBuildDate>Tue, 15 Sep 2009 22:26:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>new tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic”</title>
		<link>http://www.isi.edu/ant/blog/2009/09/15/new-tech-report-%e2%80%9cparametric-methods-for-anomaly-detection-in-aggregate-traffic%e2%80%9d/</link>
		<comments>http://www.isi.edu/ant/blog/2009/09/15/new-tech-report-%e2%80%9cparametric-methods-for-anomaly-detection-in-aggregate-traffic%e2%80%9d/#comments</comments>
		<pubDate>Tue, 15 Sep 2009 22:26:46 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=24</guid>
		<description><![CDATA[We just posted a tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic” at &#60;ftp://ftp.isi.edu/isi-pubs/tr-663.pdf&#62;.  This paper represents quite a bit of work looking at how to apply parametric detection as part of the NSF-sponsored MADCAT project.
From the abstract:

This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics in [...]]]></description>
			<content:encoded><![CDATA[<p>We just posted a tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic” at &lt;<a href="ftp://ftp.isi.edu/isi-pubs/tr-663.pdf">ftp://ftp.isi.edu/isi-pubs/tr-663.pdf</a>&gt;.  This paper represents quite a bit of work looking at how to apply parametric detection as part of the NSF-sponsored <a href="http://www.isi.edu/ant/madcat/">MADCAT</a> project.</p>
<p>From the abstract:</p>
<blockquote><p>
This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics in contrast to other works requiring flow separation, even when the anomaly is a small fraction of the total traffic.  By adopting simple statistical models for anomalous and background traffic in the time-domain, one can estimate model parameters in real-time, thus obviating the need for a long training phase or manual parameter tuning.  The detection mechanism uses a sequential probability ratio test, allowing for control over the false positive rate while examining the trade-off between detection time and the strength of an anomaly.  Additionally, it uses both traffic-rate and packet-size statistics, yielding a bivariate model that eliminates most false positives.  The method is analyzed using the bitrate SNR metric, which is shown to be an effective metric for anomaly detection.  The performance of the bPDM is evaluated in three ways:  first, synthetically generated traffic provides for a controlled comparison of detection time as a function of the anomalous level of traffic.  Second, the approach is shown to be able to detect controlled artificial attacks over the USC campus network in varying real traffic mixes.  Third, the proposed algorithm achieves rapid detection of real denial-of-service attacks as determined by the replay of previously captured network traces.  The method developed in this paper is able to detect all attacks in these scenarios in a few seconds or less.
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2009/09/15/new-tech-report-%e2%80%9cparametric-methods-for-anomaly-detection-in-aggregate-traffic%e2%80%9d/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ANT extensions for bzip2-splitting to appear in Hadoop</title>
		<link>http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/</link>
		<comments>http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/#comments</comments>
		<pubDate>Fri, 11 Sep 2009 17:49:53 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bzip2]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[map/reduce]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[splitting]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=19</guid>
		<description><![CDATA[The ANT project is happy to announce that our extensions to Hadoop to support splitting of bzip2-compressed files have been accepted to appear in the next Hadoop release (will be 0.21.0).
Support for compression is important in map/reduce because it reduces the amount of I/O, and because important input files (for us, our Internet address censuses) [...]]]></description>
			<content:encoded><![CDATA[<p>The ANT project is happy to announce that our extensions to Hadoop to support splitting of bzip2-compressed files have been accepted to appear in the next Hadoop release (will be 0.21.0).</p>
<p>Support for compression is important in map/reduce because it reduces the amount of I/O, and because important input files (for us, our <a href="http://www.isi.edu/ant/address/">Internet address censuses</a>) are provided in compressed format.</p>
<p>Splitting is important in map/reduce, because splitting allows many computers to process <em>parts</em> of a few big files.  Since the whole point of Hadoop and map/reduce is processing <em>big</em> files (for us, 4GB or more) with <em>many</em> computers (for us, dozens to hundreds), splitting is really <em>essential</em>.</p>
<p>Until now, Hadoop did not support splitting of compressed files.  Instead, if input data was compressed, you got at most one computer per file.  Some workarounds were possible, but they were unpleasant and often required rewriting all the input data in some other format.</p>
<p>Our extensions (see <a href="https://issues.apache.org/jira/browse/HADOOP-4012">HADOOP-4012</a> and <a href="https://issues.apache.org/jira/browse/MAPREDUCE-830">MAPREDUCE-830</a>, plus <a href="https://issues.apache.org/jira/browse/HADOOP-3646">HADOOP-3646</a>, which went into 0.19.0) support <strong>Hadoop execution over bzip2 files with automatic splitting</strong>.  Getting this done was trickier than one might expect:  Hadoop really wants to decide where to split files, yet bzip2 can only support splits at specific locations that are different, and users don&#8217;t care about either of these but instead only about <em>their</em> record boundaries.  Fortunately, we were able to align all of these constraints, and deal with the corner cases that inevitably arise.  (What if the bzip2 marker appears in normal data?  What happens when markers exactly align, or are off-by-one?)</p>
<p>Abdul Qadeer did this work in 2008, working with Yuri Pradkin and me (John Heidemann), and continued to work on the patch until it was committed.  We especially thank Chris Douglas at Yahoo for shepherding the patch through the Hadoop bug tracking system, including helping clean it up and add test cases.  And we thank Doug Cutting for initially <a href="http://www.mail-archive.com/hadoop-user@lucene.apache.org/msg01971.html">suggesting bzip2</a> as a splittable compression scheme.</p>
<p>This work was supported by NSF through the <a href="http://www.isi.edu/ant/mrnet/index.html">MR-Net research project</a> (CNS-0823774).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>new tech report “Understanding Address Usage in the Visible Internet”</title>
		<link>http://www.isi.edu/ant/blog/2009/02/23/new-tech-report-%e2%80%9cunderstanding-address-usage-in-the-visible-internet%e2%80%9d/</link>
		<comments>http://www.isi.edu/ant/blog/2009/02/23/new-tech-report-%e2%80%9cunderstanding-address-usage-in-the-visible-internet%e2%80%9d/#comments</comments>
		<pubDate>Mon, 23 Feb 2009 20:20:41 +0000</pubDate>
		<dc:creator>xuecai</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Internet address space]]></category>
		<category><![CDATA[Internet address usage]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=14</guid>
		<description><![CDATA[We just posted a tech report “Understanding Address Usage in the Visible Internet” at &#60;ftp://ftp.isi.edu/isi-pubs/tr-656.pdf&#62;.
The abstract summarizes the tech report:
Although the Internet is widely used today, there are few sound estimates of network demographics. Decentralized network management means questions about Internet use cannot be answered by a central authority, and firewalls and sensitivity to probing [...]]]></description>
			<content:encoded><![CDATA[<p>We just posted a tech report “Understanding Address Usage in the Visible Internet” at &lt;<a href="ftp://ftp.isi.edu/isi-pubs/tr-656.pdf">ftp://ftp.isi.edu/isi-pubs/tr-656.pdf</a>&gt;.</p>
<p>The abstract summarizes the tech report:</p>
<blockquote><p>Although the Internet is widely used today, there are few sound estimates of network demographics. Decentralized network management means questions about Internet use cannot be answered by a central authority, and firewalls and sensitivity to probing mean that active measurements must be done carefully and validated against known data. Building on frequent ICMP probing of 1% of the Internet address space, we develop a clustering algorithm to estimate how Internet addresses are used. We show that adjacent addresses often have similar characteristics and are used for similar purposes (61% of addresses we probe are consistent blocks of 64 neighbors or more). We then apply this block-level clustering to provide data to explore several open questions in how networks are managed. First, the nearing full allocation of IPv4 addresses makes it increasingly important to estimate the costs of better management of the IPv4 space as a component of an IPv6 transition. We provide data about how effectively network address blocks appear to be used, finding that a significant number of blocks are only lightly used (about one-fifth of /24 blocks have most addresses in use less than 10% of the time). Second, we provide new measurements about dynamically managed address space, showing nearly 40% of /24 blocks appear to be dynamically allocated, and dynamic addressing is most widely used in countries more recently to the Internet (more than 80% in China, while less than 30% in the U.S.).</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2009/02/23/new-tech-report-%e2%80%9cunderstanding-address-usage-in-the-visible-internet%e2%80%9d/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>new paper &#8220;Uses and Challenges for Network Datasets&#8221;</title>
		<link>http://www.isi.edu/ant/blog/2009/02/07/new-paper-uses-and-challenges-for-network-datasets/</link>
		<comments>http://www.isi.edu/ant/blog/2009/02/07/new-paper-uses-and-challenges-for-network-datasets/#comments</comments>
		<pubDate>Sat, 07 Feb 2009 18:35:25 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[lander]]></category>
		<category><![CDATA[madcat]]></category>
		<category><![CDATA[network datasets]]></category>
		<category><![CDATA[papers]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=12</guid>
		<description><![CDATA[We just posted a pre-print of the paper &#8220;Uses and Challenges for Network Datasets&#8221;, to appear at IEEE CATCH in March.  The pre-print is at &#60;http://www.isi.edu/~johnh/PAPERS/Heidemann09a.html&#62;.
The abstract summarizes the paper:
Network datasets are necessary for many types of network research.  While there has been significant discussion about specific datasets, there has been less about the overall [...]]]></description>
			<content:encoded><![CDATA[<p>We just posted a pre-print of the paper &#8220;Uses and Challenges for Network Datasets&#8221;, to appear at IEEE CATCH in March.  The pre-print is at <a href="http://www.isi.edu/~johnh/PAPERS/Heidemann09a.html">&lt;http://www.isi.edu/~johnh/PAPERS/Heidemann09a.html&gt;</a>.</p>
<p>The abstract summarizes the paper:</p>
<blockquote><p>Network datasets are necessary for many types of network research.  While there has been significant discussion about specific datasets, there has been less about the overall state of network data collection.  The goal of this paper is to explore the research questions facing the Internet today, the datasets needed to answer those questions, and the challenges to using those datasets.  We suggest several practices that have proven important in use of current data sets, and open challenges to improve use of network data.</p></blockquote>
<p>More specifically, the paper tries to answer the question Jody Westby put to PREDICT PIs, which is &#8220;why take data, what is it good for&#8221;?  While a simple question, it&#8217;s not easy to answer (at least, my attempt to dash off a quick answer in e-mail failed).  The paper is an attempt at a more thoughtful answer.</p>
<p>The paper tries to summarize and point to a lot of ongoing work, but I know that our coverage was insufficient.  We welcome feedback about what we&#8217;re missing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2009/02/07/new-paper-uses-and-challenges-for-network-datasets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>IMC paper on Internet Census described in MIT Tech Review</title>
		<link>http://www.isi.edu/ant/blog/2008/10/15/imc-paper-on-internet-census-described-in-mit-tech-review/</link>
		<comments>http://www.isi.edu/ant/blog/2008/10/15/imc-paper-on-internet-census-described-in-mit-tech-review/#comments</comments>
		<pubDate>Thu, 16 Oct 2008 04:59:03 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Internet address space]]></category>
		<category><![CDATA[Internet Measurement Conference]]></category>
		<category><![CDATA[papers]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=6</guid>
		<description><![CDATA[The IMC paper &#8220;Census and Survey of the Visible Internet&#8221; was described in an article &#8220;Probe Sees Unused Internet&#8221; in the MIT Technology Review by Robert Lemos.
The article provides a nice summary of the issues, but it reaches a conclusion that is stronger than the study supports.  The subhead of the article is &#8220;A survey [...]]]></description>
			<content:encoded><![CDATA[<p>The IMC paper &#8220;Census and Survey of the Visible Internet&#8221; was <a href="http://www.technologyreview.com/web/21528/page1/">described in an article</a> &#8220;Probe Sees Unused Internet&#8221; in the MIT Technology Review by Robert Lemos.</p>
<p>The article provides a nice summary of the issues, but it reaches a conclusion that is stronger than the study supports.  The subhead of the article is &#8220;A survey shows that addresses are not running out as quickly as we&#8217;d thought&#8221;, and the article draws the conclusion: &#8220;the problem [of IPv4 address exhaustion] may not be as bad as many fear.&#8221;</p>
<p>The article&#8217;s conclusion, I think, oversimplifies matters: it is only true if the &#8220;better things we should be doing in managing the IPv4 address space&#8221; are <em>free</em>.  The Internet Census we carried out supports the <em>opportunity</em> for better IPv4 address space management.  But an open question is the <em>cost</em> of such management.  Historically, with plentiful IPv4 addresses, IPv4 management costs have been small, but <em>potential better IPv4 management</em> will likely be much more costly.  This cost of ongoing IPv4 management needs to be weighed against the one-time cost of conversion to IPv6 followed by lower IPv6 management costs.</p>
<p>To me, one exciting conclusion from the Internet Census we carried out is that we now have data that allows us to start evaluating these trade-offs.  The answer may be that more careful IPv4 management gets us a few years, <em>or</em> that the cost of more careful IPv4 management makes IPv6 an obvious choice.  In either case, resolving this transition is important for the Internet community.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2008/10/15/imc-paper-on-internet-census-described-in-mit-tech-review/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>new paper about Internet address space census and survey</title>
		<link>http://www.isi.edu/ant/blog/2008/08/26/new-paper-about-internet-address-space-census-and-survey/</link>
		<comments>http://www.isi.edu/ant/blog/2008/08/26/new-paper-about-internet-address-space-census-and-survey/#comments</comments>
		<pubDate>Wed, 27 Aug 2008 05:11:30 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Internet address space]]></category>
		<category><![CDATA[Internet Measurement Conference]]></category>
		<category><![CDATA[papers]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=4</guid>
		<description><![CDATA[We are happy to report that the paper &#8220;Census and Survey of the Visible Internet&#8221; has been accepted to appear at the Internet Measurement Conference in Vouliagmeni, Greece in October 2008.
A preprint is available at http://www.isi.edu/~johnh/PAPERS/Heidemann08c.html, and an extended version is available as an updated technical report at http://www.isi.edu/~johnh/PAPERS/Heidemann08a.html.
]]></description>
			<content:encoded><![CDATA[<p>We are happy to report that the paper &#8220;Census and Survey of the Visible Internet&#8221; has been accepted to appear at the Internet Measurement Conference in Vouliagmeni, Greece in October 2008.</p>
<p>A preprint is available at <a href="http://www.isi.edu/~johnh/PAPERS/Heidemann08c.html">http://www.isi.edu/~johnh/PAPERS/Heidemann08c.html</a>, and an extended version is available as an updated technical report at <a href="http://www.isi.edu/~johnh/PAPERS/Heidemann08a.html">http://www.isi.edu/~johnh/PAPERS/Heidemann08a.html</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2008/08/26/new-paper-about-internet-address-space-census-and-survey/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Internet address space: new technical paper and browsable map</title>
		<link>http://www.isi.edu/ant/blog/2008/02/06/internet-address-space-new-technical-paper-and-browsable-map/</link>
		<comments>http://www.isi.edu/ant/blog/2008/02/06/internet-address-space-new-technical-paper-and-browsable-map/#comments</comments>
		<pubDate>Wed, 06 Feb 2008 21:00:26 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Internet address space]]></category>
		<category><![CDATA[papers]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/2008/02/06/internet-address-space-new-technical-paper-and-browsable-map/</guid>
		<description><![CDATA[A number of folks expressed interest in our ANT census of the Internet address space at &#60;http://www.isi.edu/ant/address/&#62;.
We have three recent updates: a new TECHNICAL REPORT, a BROWSABLE INTERNET ADDRESS MAP, and a PROJECT BLOG.
We have released a new TECHNICAL REPORT describing the methodology,  ISI-TR-2008-649 at &#60;http://www.isi.edu/~johnh/PAPERS/Heidemann08a.html&#62;.
This report should completely supersede our previous report [...]]]></description>
			<content:encoded><![CDATA[<p>A number of folks expressed interest in our ANT census of the Internet address space at &lt;<a href="http://www.isi.edu/ant/address/">http://www.isi.edu/ant/address/</a>&gt;.</p>
<p>We have three recent updates: a new TECHNICAL REPORT, a BROWSABLE INTERNET ADDRESS MAP, and a PROJECT BLOG.</p>
<p>We have released a new TECHNICAL REPORT describing the methodology,  ISI-TR-2008-649 at &lt;<a href="http://www.isi.edu/~johnh/PAPERS/Heidemann08a.html">http://www.isi.edu/~johnh/PAPERS/Heidemann08a.html</a>&gt;.<br />
This report should completely supersede our previous report (#640), adding:</p>
<ul>
<li>evaluation of ping accuracy, both absolutely and relative to TCP probing</li>
<li>estimation of error in our evaluations of host and server counts</li>
<li>validation of our approach to firewall detection</li>
<li>significant improvements in organization and presentation</li>
</ul>
<p>We have also put up a BROWSABLE INTERNET ADDRESS MAP at &lt;<a href="http://www.isi.edu/ant/address/browse/">http://www.isi.edu/ant/address/browse/</a>&gt;.</p>
<p>With the Google maps engine, this map lets you zoom from an overview to any part of the address space, including showing individual hosts (permuted for anonymization).</p>
<p>Finally, we now have a PROJECT BLOG to allow folks to track future developments: &lt;<a href="http://www.isi.edu/ant/blog/">http://www.isi.edu/ant/blog/</a>&gt;.  We plan to make all future announcements via the blog rather than with general e-mail messages, so folks can opt in to what they want to hear.</p>
<p>We welcome any comments about the map or technical report, either to our group mailing list (ant, then at isi.edu), or to individuals.</p>
<p>-ANT folks (John Heidemann, Yuri Pradkin, Ramesh Govindan, Christos Papadopoulos, Genevieve Bartlett, Joseph Bannister)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2008/02/06/internet-address-space-new-technical-paper-and-browsable-map/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hello world!</title>
		<link>http://www.isi.edu/ant/blog/2008/02/06/hello-world/</link>
		<comments>http://www.isi.edu/ant/blog/2008/02/06/hello-world/#comments</comments>
		<pubDate>Wed, 06 Feb 2008 19:33:01 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[administrative]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=1</guid>
		<description><![CDATA[Welcome to the ANT Project Blog.  Folks are welcome to subscribe to the RSS feed for this blog if they wish to track research related to the analysis of Internet traffic in the ANT group at ISI, USC, and CSU.  We expect this blog to be very low traffic (research takes time!).
]]></description>
			<content:encoded><![CDATA[<p>Welcome to the ANT Project Blog.  Folks are welcome to subscribe to the RSS feed for this blog if they wish to track research related to the analysis of Internet traffic in the ANT group at ISI, USC, and CSU.  We expect this blog to be very low traffic (research takes time!).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2008/02/06/hello-world/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
