<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ANT Research News &#187; software</title>
	<atom:link href="http://www.isi.edu/ant/blog/tag/software/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.isi.edu/ant/blog</link>
	<description>Updates about research by the ANT group (Analysis of Internet Traffic)</description>
	<lastBuildDate>Tue, 14 Jun 2011 23:39:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>new conference paper &#8220;Towards an AS-to-Organization Map&#8221; to appear at IMC</title>
		<link>http://www.isi.edu/ant/blog/2010/09/18/new-paper-towards-an-as-to-organization-map-to-appear-at-imc/</link>
		<comments>http://www.isi.edu/ant/blog/2010/09/18/new-paper-towards-an-as-to-organization-map-to-appear-at-imc/#comments</comments>
		<pubDate>Sat, 18 Sep 2010 12:51:18 +0000</pubDate>
		<dc:creator>xuecai</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[amite]]></category>
		<category><![CDATA[AS-to-organization mapping]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[Internet Measurement Conference]]></category>
		<category><![CDATA[Internet topology]]></category>
		<category><![CDATA[lander]]></category>
		<category><![CDATA[papers]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[web tools]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=133</guid>
		<description><![CDATA[The paper “Towards an AS-to-Organization Map” was accepted by IMC’10 in Melbourne, Australia (available at http://www.isi.edu/~johnh/PAPERS/Cai10c.html). From the abstract: An understanding of Internet topology is central to answer various questions ranging from network resilience to peer selection or data center &#8230; <a href="http://www.isi.edu/ant/blog/2010/09/18/new-paper-towards-an-as-to-organization-map-to-appear-at-imc/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The paper “Towards an AS-to-Organization Map” was accepted by IMC’10 in Melbourne, Australia (available at <a href="http://www.isi.edu/~johnh/PAPERS/Cai10c.html">http://www.isi.edu/~johnh/PAPERS/Cai10c.html</a>).</p>
<p>From the abstract:</p>
<blockquote><p>An understanding of Internet topology is central to answer various questions ranging from network resilience to peer selection or data center location. While much of prior work has examined AS-level connectivity, meaningful and relevant results from such an abstract view of Internet topology have been limited. For one, semantically, AS relationships capture business relationships and not physical connectivity. Additionally, many organizations often use multiple ASes, either to implement different routing policies, or as legacies from mergers and acquisitions. In this paper, we move beyond the traditional AS graph view of the Internet to define the problem of AS-to-organization mapping. We describe our initial steps at automating the capture of the rich semantics inherent in the AS-level ecosystem where routing and connectivity intersect with organizations. We discuss preliminary methods that identify multi-AS organizations from WHOIS data and illustrate the challenges posed by the quality of the available data and the complexity of real-world organizational relationships.</p></blockquote>
<p>Citation: Xue Cai, John Heidemann, Balachander Krishnamurthy, and Walter Willinger. Towards an AS-to-Organization Map. In Proceedings of the ACM Internet Measurement Conference, p. to appear. Melbourne, Australia, ACM. November, 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2010/09/18/new-paper-towards-an-as-to-organization-map-to-appear-at-imc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ANT extensions for bzip2-splitting to appear in Hadoop</title>
		<link>http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/</link>
		<comments>http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/#comments</comments>
		<pubDate>Fri, 11 Sep 2009 17:49:53 +0000</pubDate>
		<dc:creator>johnh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bzip2]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[map/reduce]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[splitting]]></category>

		<guid isPermaLink="false">http://www.isi.edu/ant/blog/?p=19</guid>
		<description><![CDATA[The ANT project is happy to announce that our extensions to Hadoop to support splitting of bzip2-compressed files have been accepted to appear in the next Hadoop release (will be 0.21.0). Support for compression is important in map/reduce because it &#8230; <a href="http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The ANT project is happy to announce that our extensions to Hadoop to support splitting of bzip2-compressed files have been accepted to appear in the next Hadoop release (which will be 0.21.0).</p>
<p>Support for compression is important in map/reduce because it reduces the amount of I/O, and because important input files (for us, our <a href="http://www.isi.edu/ant/address/">Internet address censuses</a>) are provided in compressed format.</p>
<p>Splitting is important in map/reduce, because splitting allows many computers to process <em>parts</em> of a few big files.  Since the whole point of Hadoop and map/reduce is processing <em>big</em> files (for us, 4 GB or more) with <em>many</em> computers (for us, dozens to hundreds), splitting is really <em>essential</em>.</p>
<p>Until now, Hadoop did not support splitting of compressed files.  Instead, if input data was compressed, you got at most one computer per file.  Some work-arounds were possible, but they were basically unpleasant, and often required rewriting all the input data in some other format.</p>
<p>Our extensions (see <a href="https://issues.apache.org/jira/browse/HADOOP-4012">HADOOP-4012</a> and <a href="https://issues.apache.org/jira/browse/MAPREDUCE-830">MAPREDUCE-830</a>, plus <a href="https://issues.apache.org/jira/browse/HADOOP-3646">HADOOP-3646</a>, which went into 0.19.0) support <strong>Hadoop execution over bzip2 files with automatic splitting</strong>.  Getting this done was trickier than one might expect:  Hadoop really wants to decide where to split files, yet bzip2 can only support splits at specific locations, which are generally different from the ones Hadoop picks, and users don&#8217;t care about either of these but instead only about <em>their</em> record boundaries.  Fortunately, we were able to align all of these constraints, and deal with the corner cases that inevitably arise.  (What if the bzip2 marker appears in normal data?  What happens when markers exactly align, or are off-by-one?)</p>
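<p>To illustrate why bzip2 split points are awkward, here is a minimal Python sketch (not the code in the Hadoop patch, and not Java) that locates bzip2 block boundaries by scanning for the 48-bit block marker 0x314159265359.  The function name <code>find_block_bit_offsets</code> and the synthetic input records are assumptions made only for this example; the scan is a naive bit-string search, so it would also report any place where the marker happens to occur inside compressed data.</p>
<pre>
```python
import bz2

# 48-bit magic number that begins every bzip2 compressed block,
# written out as a string of 48 binary digits.
BLOCK_MAGIC = format(0x314159265359, "048b")


def find_block_bit_offsets(compressed):
    """Return the bit offsets at which the block marker starts.

    Hypothetical helper for illustration: it converts the whole
    stream to a string of bits and searches it, which is simple
    but memory-hungry and only suitable for small inputs.
    """
    bits = "".join(format(byte, "08b") for byte in compressed)
    offsets = []
    pos = bits.find(BLOCK_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = bits.find(BLOCK_MAGIC, pos + 1)
    return offsets


# Synthetic line-oriented input, large enough to need several blocks
# at compresslevel=1 (100 kB of uncompressed data per block).
records = "".join("record %d\n" % i for i in range(50000)).encode("ascii")
compressed = bz2.compress(records, compresslevel=1)

offsets = find_block_bit_offsets(compressed)
byte_aligned = [off for off in offsets if off % 8 == 0]
print("markers found:", len(offsets))
print("markers on a byte boundary:", len(byte_aligned))
```
</pre>
<p>The first marker sits at bit offset 32, directly after the 4-byte stream header, but later markers can start at any bit position.  A splitter therefore cannot simply seek to a byte offset chosen by Hadoop: it has to scan forward to the next marker, and then still find the next record boundary in the decompressed data.</p>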
<p>Abdul Qadeer did this work in 2008, working with Yuri Pradkin and me (John Heidemann), and continued to work on the patch until it was committed.  We especially thank Chris Douglas at Yahoo for shepherding the patch through the Hadoop bug tracking system, including helping to clean it up and add test cases.  And we thank Doug Cutting for initially suggesting bzip2 as a splittable compression scheme.</p>
<p>This work was supported by NSF through the <a href="http://www.isi.edu/ant/mrnet/index.html">MR-Net research project</a> (CNS-0823774).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.isi.edu/ant/blog/2009/09/11/ant-extensions-for-bzip2-splitting-to-appear-in-hadoop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

