<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Mosharaf Chowdhury</title>
	
	<link>http://www.mosharaf.com</link>
	<description>To V or not to V that is the question</description>
	<lastBuildDate>Fri, 11 Jun 2010 22:03:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/mosharaf" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="mosharaf" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>MSR Cambridge, here I come!</title>
		<link>http://www.mosharaf.com/blog/2010/05/21/msr-cambridge-here-i-come/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=msr-cambridge-here-i-come</link>
		<comments>http://www.mosharaf.com/blog/2010/05/21/msr-cambridge-here-i-come/#comments</comments>
		<pubDate>Sat, 22 May 2010 06:56:18 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1409</guid>
		<description><![CDATA[I&#8217;m going to spend this Summer in stealth mode at Microsoft Research, Cambridge working with Christos Gkantsidis and Hitesh Ballani on a super-secret project. Hopefully, we&#8217;ll have some cool results on a hot topic. This will be my first time in England/UK as well. Looking forward to the English weather that I&#8217;ve heard so much [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m going to spend this Summer in stealth mode at Microsoft Research, Cambridge working with Christos Gkantsidis and Hitesh Ballani on a super-secret project. Hopefully, we&#8217;ll have some cool results on a hot topic.</p>
<p>This will be my first time in England/UK as well. Looking forward to the English weather that I&#8217;ve heard so much about! A Schengen visa is also accompanying me. So don&#8217;t be surprised if I&#8217;m seen in random European cities.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2010/05/21/msr-cambridge-here-i-come/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PolyViNE has been accepted at VISA’2010</title>
		<link>http://www.mosharaf.com/blog/2010/05/18/polyvine-has-been-accepted-at-visa2010/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=polyvine-has-been-accepted-at-visa2010</link>
		<comments>http://www.mosharaf.com/blog/2010/05/18/polyvine-has-been-accepted-at-visa2010/#comments</comments>
		<pubDate>Tue, 18 May 2010 07:38:29 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Virtualization]]></category>
		<category><![CDATA[PolyViNE]]></category>
		<category><![CDATA[ViNE-Yard]]></category>
		<category><![CDATA[VISA]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1401</guid>
		<description><![CDATA[Our paper, &#8220;PolyViNE: Policy-based Virtual Network Embedding Across Multiple Domains&#8221; is set to appear at the VISA&#8217;2010 workshop (with SIGCOMM&#8217;2010) in New Delhi. I worked on it during my last few months in Waterloo (circa Winter/Spring 2009), and it has been lying around ever since because everyone has been busy. Finally, it&#8217;s going to wake up [...]]]></description>
			<content:encoded><![CDATA[<p>Our paper, &#8220;PolyViNE: Policy-based Virtual Network Embedding Across Multiple Domains&#8221; is set to appear at the VISA&#8217;2010 workshop (with SIGCOMM&#8217;2010) in New Delhi. I worked on it during my last few months in Waterloo (circa Winter/Spring 2009), and it has been lying around ever since because everyone has been busy. Finally, it&#8217;s going to wake up and smell a workshop.</p>
<blockquote><p>Intra-domain virtual network embedding (ViNE) is a well studied problem in the network virtualization literature. For most practical purposes, however, virtual networks (VNs) must be provisioned across heterogeneous administrative domains managed by multiple infrastructure providers (InPs).</p>
<p>In this paper we present PolyViNE, a policy-based inter-domain VN embedding framework that embeds end-to-end VNs in a decentralized manner. PolyViNE introduces a distributed protocol that coordinates the VN embedding process across participating InPs and ensures competitive prices for service providers (SPs), i.e., VN owners. We also present a location aware VN request forwarding mechanism &#8212; based on a hierarchical addressing scheme (COST) and a location awareness protocol (LAP) &#8212; to allow faster embedding and outline scalability and performance characteristics of PolyViNE through quantitative and qualitative evaluations.</p></blockquote>
<p>As always, the paper can be found in my <a href="http://www.mosharaf.com/publications/">publications page</a>.<span style="text-decoration: line-through;"> once I upload it (not yet).</span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2010/05/18/polyvine-has-been-accepted-at-visa2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Berkeley computer science networking Prelim reading list (Spring’10)</title>
		<link>http://www.mosharaf.com/blog/2010/05/11/berkeley-computer-science-networking-prelim-reading-list-spring10/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=berkeley-computer-science-networking-prelim-reading-list-spring10</link>
		<comments>http://www.mosharaf.com/blog/2010/05/11/berkeley-computer-science-networking-prelim-reading-list-spring10/#comments</comments>
		<pubDate>Tue, 11 May 2010 18:54:52 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Random]]></category>
		<category><![CDATA[Prelim]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1381</guid>
		<description><![CDATA[Somehow I never managed to upload the reading list after I finished my Prelim. Here it is finally, or they are &#8211; too many of them (not my fault). This zipped file contains reading lists for the grad networking/network security courses from different years along with the grandfather of all reading lists for networking prelim [...]]]></description>
			<content:encoded><![CDATA[<p>Somehow I never managed to upload the reading list after I finished my Prelim. Here it is finally, or they are &#8211; too many of them (not my fault).</p>
<p><a href="http://www.mosharaf.com/wp-content/uploads/net-prelim-reading-list-spring-10.zip">This zipped file</a> contains reading lists for the grad networking/network security courses from different years along with the grandfather of all reading lists for networking prelim compiled a decade ago. I didn&#8217;t read all of them, but most of them are good/great papers anyway.</p>
<p>All credits to <a href="http://www.eecs.berkeley.edu/~alspaugh/">Sara</a> and <a href="http://www.eecs.berkeley.edu/~sameerag/">Sameer</a> for prodding me to put the files together. Best of luck to them!</p>
<p class="download"><strong>Download:</strong> Berkeley CS Networking Prelim Reading List (Spring&#8217;10 Edition) [<a href="http://www.mosharaf.com/wp-content/uploads/net-prelim-reading-list-spring-10.zip">ZIP</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2010/05/11/berkeley-computer-science-networking-prelim-reading-list-spring10/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark short paper has been accepted at HotCloud’10</title>
		<link>http://www.mosharaf.com/blog/2010/05/08/spark-short-paper-has-been-accepted-at-hotcloud10/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=spark-short-paper-has-been-accepted-at-hotcloud10</link>
		<comments>http://www.mosharaf.com/blog/2010/05/08/spark-short-paper-has-been-accepted-at-hotcloud10/#comments</comments>
		<pubDate>Sat, 08 May 2010 23:26:35 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[data parallel systems]]></category>
		<category><![CDATA[HotCloud]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Spark]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1374</guid>
		<description><![CDATA[An initial overview of our ongoing work on Spark, an iterative and interactive framework for cluster computing, has been accepted at HotCloud&#8217;10. I joined the project last February, while Matei has been working on it since last Fall. I have uploaded the paper to the publications page. [...]]]></description>
			<content:encoded><![CDATA[<p>An initial overview of our ongoing work on Spark, an iterative and interactive framework for cluster computing, has been accepted at HotCloud&#8217;10. I joined the project last February, while <a href="http://www.cs.berkeley.edu/~matei/">Matei</a> has been working on it since last Fall. I <span style="text-decoration: line-through;">will</span> have uploaded the paper to the <a href="http://www.mosharaf.com/publications/">publications page</a>. <span style="text-decoration: line-through;">once we have taken care of the reviewer comments/suggestions, meanwhile you can read the <a href="http://www.mosharaf.com/wp-content/uploads/EECS-2010-53.pdf">technical report</a> version.</span></p>
<p>This year HotCloud accepted 18 papers (24% of the submitted papers), and the PC are thinking about extending the workshop to a 2nd day from next year.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2010/05/08/spark-short-paper-has-been-accepted-at-hotcloud10/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Prelim, begone, I will have no more of thee!</title>
		<link>http://www.mosharaf.com/blog/2010/02/07/prelim-begone-i-will-have-no-more-of-thee/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=prelim-begone-i-will-have-no-more-of-thee</link>
		<comments>http://www.mosharaf.com/blog/2010/02/07/prelim-begone-i-will-have-no-more-of-thee/#comments</comments>
		<pubDate>Sun, 07 Feb 2010 21:41:14 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Prelim]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1332</guid>
		<description><![CDATA[Update: I&#8217;ve finally uploaded the reading list. Here it is! Took the networking Prelim last Thursday and passed. I was never a fan of oral interrogations; heck, I am not a fan of any interrogation for that matter. And this one was really a close call; I am just happy to have come out alive at the [...]]]></description>
			<content:encoded><![CDATA[<p class="alert"><strong>Update:</strong> I&#8217;ve finally uploaded the reading list. <a href="http://www.mosharaf.com/blog/2010/05/11/berkeley-computer-science-networking-prelim-reading-list-spring10/">Here</a> it is!</p>
<p>Took the networking Prelim last Thursday and passed. I was never a fan of oral interrogations; heck, I am not a fan of any interrogation for that matter. And this one was really a close call; I am just happy to have come out alive at the other end of the tunnel. What a relief! Thanks to everyone, especially Ganesh and Matei, who helped me prepare with mock Prelims and everything else.</p>
<p>I will post the (unofficial) reading list for this Prelim in a future post.</p>
<p>On another note, do watch <a href="http://www.imdb.com/title/tt0063522/">Rosemary&#8217;s Baby</a> to find out where the title of this post <a href="http://www.imdb.com/title/tt0063522/quotes">originated</a>. Believe it or not, I have had this post in mind ever since I heard Rosemary say &#8220;Pain, begone, I will have no more of thee!&#8221; It feels great to finally be able to use it!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2010/02/07/prelim-begone-i-will-have-no-more-of-thee/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Skilled in the Art of Being Idle: Reducing Energy Waste in Networked Systems</title>
		<link>http://www.mosharaf.com/blog/2009/11/29/skilled-in-the-art-of-being-idle-reducing-energy-waste-in-networked-systems/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=skilled-in-the-art-of-being-idle-reducing-energy-waste-in-networked-systems</link>
		<comments>http://www.mosharaf.com/blog/2009/11/29/skilled-in-the-art-of-being-idle-reducing-energy-waste-in-networked-systems/#comments</comments>
		<pubDate>Mon, 30 Nov 2009 04:50:30 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Chandrashekar]]></category>
		<category><![CDATA[energy]]></category>
		<category><![CDATA[Liu]]></category>
		<category><![CDATA[Nedevschi]]></category>
		<category><![CDATA[Nordman]]></category>
		<category><![CDATA[Ratnasamy]]></category>
		<category><![CDATA[Taft]]></category>
		<category><![CDATA[UCB CS268 F09]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1292</guid>
		<description><![CDATA[S. Nedevschi, J. Chandrashekar, J. Liu, B. Nordman, S. Ratnasamy, N. Taft, &#8220;Skilled in the Art of Being Idle: Reducing Energy Waste in Networked Systems,&#8221; NSDI&#8217;09, (April 2009). [PDF] Summary This paper argues that putting networked end-systems into low-power sleep modes, instead of keeping them in higher-power-consuming idle states, can result in significant energy savings. [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">S. Nedevschi, J. Chandrashekar, J. Liu, B. Nordman, S. Ratnasamy, N. Taft, &#8220;Skilled in the Art of Being Idle: Reducing Energy Waste in Networked Systems,&#8221; <em>NSDI&#8217;09</em>, (April 2009). [<a href="http://tier.cs.berkeley.edu/docs/nedevschi_nsdi09.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>This paper argues that putting networked end-systems into low-power sleep modes, instead of keeping them in higher-power-consuming idle states, can result in significant energy savings. However, a sleeping device loses its network presence and cannot run scheduled tasks. Two main approaches address these issues: first, Wake-on-LAN (WoL) mechanisms that wake machines up on the arrival of specific packets; and second, a proxy that handles packets on behalf of the sleeping machine and wakes it up only when absolutely necessary. The authors study the pros and cons of the two approaches based on data collected from 250 enterprise machines and present a proxy architecture with a narrow API to support different idle states.</p>
<p>From their measurement data, the authors found that the machines under study are active for only 10% of the time and idle for about 50% of the time on average; only a small fraction are put to sleep at all. This observation suggests that there is an enormous opportunity to save energy by exploiting sleep states. The authors also notice that there is always a steady flow of packets, which makes WoP (wake-on-packet) an infeasible choice.</p>
<p>In the proxy-based solution, packets destined for a sleeping host are intercepted by its proxy. The proxy decides whether to ignore/drop the packet, respond to it on the host&#8217;s behalf, or wake up the machine it represents. To make judicious decisions, the authors identified different classes of packets (e.g., incoming/outgoing, unicast/broadcast/multicast) and found that broadcast and multicast traffic are largely responsible for poor sleep (80% more sleep time in the home environment and 50% more in the office environment). They deconstructed the different classes of traffic and ended up with some ground rules about what a proxy should do when it sees different types of packets.</p>
<p>Idle-time traffic is differentiated along two dimensions. The first classifies traffic by the need to proxy the protocol in question into three categories (don&#8217;t-wake protocols, don&#8217;t-ignore protocols, and policy-dependent protocols). The second identifies the complexity of decision making (ignorable (drop), handled via mechanical response, and requiring specialized processing). The end result is a firewall-like architecture, where the power-proxy table consists of a list of rules, each of which is a &lt;trigger, action, timeout&gt; tuple. For incoming packets, the proxy matches the rules and performs the necessary actions. The authors also built a simple Click-based prototype of their proposed solution.</p>
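<p>To make the rule table concrete, here is a minimal sketch of first-match rule processing. It is not the Click-based prototype from the paper. The packet format (a plain dict), the action names, the default action, and the reading of the timeout as a rule lifetime are all assumptions made for illustration.</p>

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical packet representation: a dict with "proto" and optional "dst_port".
Packet = dict


@dataclass
class Rule:
    trigger: Callable[[Packet], bool]  # predicate deciding whether the rule applies
    action: str                        # "drop", "respond", or "wake" (illustrative names)
    timeout: float                     # assumed meaning: seconds the rule stays valid


class PowerProxy:
    def __init__(self, default_action="drop"):
        self.rules = []                # list of (install_time, Rule)
        self.default_action = default_action

    def install(self, rule, now):
        self.rules.append((now, rule))

    def handle(self, packet, now):
        # The first matching, unexpired rule wins, as in a firewall rule list.
        for installed_at, rule in self.rules:
            if now - installed_at > rule.timeout:
                continue
            if rule.trigger(packet):
                return rule.action
        return self.default_action


proxy = PowerProxy()
proxy.install(Rule(lambda p: p.get("proto") == "arp", "respond", 3600.0), now=0.0)
proxy.install(
    Rule(lambda p: p.get("proto") == "tcp" and p.get("dst_port") == 22, "wake", 3600.0),
    now=0.0,
)
```

<p>With these two hypothetical rules, an ARP request is answered by the proxy, an incoming SSH connection wakes the host, and everything else falls through to the default action.</p>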
<h2>Comments</h2>
<p>This is yet another very good measurement paper. In particular, the study of various types of traffic in different network environments is an eye-opener.</p>
<p>The solution presented in the paper is very similar to a firewall, so an educated guess would be that a lot of the firewall literature might come in handy in solving the complexities of matching rules.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2009/11/29/skilled-in-the-art-of-being-idle-reducing-energy-waste-in-networked-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cutting the Electric Bill for Internet-Scale Systems</title>
		<link>http://www.mosharaf.com/blog/2009/11/29/cutting-the-electric-bill-for-internet-scale-systems/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=cutting-the-electric-bill-for-internet-scale-systems</link>
		<comments>http://www.mosharaf.com/blog/2009/11/29/cutting-the-electric-bill-for-internet-scale-systems/#comments</comments>
		<pubDate>Sun, 29 Nov 2009 08:28:37 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Balakrishnan]]></category>
		<category><![CDATA[energy]]></category>
		<category><![CDATA[Guttag]]></category>
		<category><![CDATA[Maggs]]></category>
		<category><![CDATA[Qureshi]]></category>
		<category><![CDATA[UCB CS268 F09]]></category>
		<category><![CDATA[Weber]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1295</guid>
		<description><![CDATA[A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, B. Maggs, &#8220;Cutting the Electric Bill for Internet-Scale Systems,&#8221; ACM SIGCOMM Conference, (August 2009). [PDF] Summary Large organizations like Google, Microsoft, and Yahoo! annually consume tens of millions of dollars worth of electricity. The traditional approach toward reducing energy costs by reducing the amount of energy consumption [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, B. Maggs, &#8220;Cutting the Electric Bill for Internet-Scale Systems,&#8221; <em>ACM SIGCOMM Conference</em>, (August 2009). [<a href="http://nms.lcs.mit.edu/papers/sigcomm372-aqureshi.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>Large organizations like Google, Microsoft, and Yahoo! annually consume tens of millions of dollars worth of electricity. The traditional approach of reducing energy costs by reducing energy consumption has not been that successful. The authors of this paper propose a new method based on two key observations: first, electricity prices exhibit both temporal and geographical variation; and second, large distributed systems incorporate request routing and replication. Based on these observations, they posit that cost-aware routing of computations to zones with low electricity prices can save a significant amount in energy expenses per annum without increasing other costs (e.g., bandwidth). It should be noted that the paper deals with reducing energy cost, not energy consumption.</p>
<p>Electricity is generally produced somewhere else and then transmitted to consumer locations. The whole process is regulated by Regional Transmission Organizations or RTOs (there are eight reliability regions in the US). RTOs use auctioning mechanisms to match buyers with sellers in multiple parallel wholesale markets such as day-ahead, hour-ahead, and real-time markets. This paper is concerned only with the real-time market and depends on its variation over time and locality. Through an empirical study of average prices from January 2006 to April 2009, the authors found significant uncorrelated variation of prices across time, geographic locations, and different wholesale markets. These variations are due to differences in the source of electricity and to time differences between regions (which put peak hours at different absolute times).</p>
<p>Routing computations to different, possibly more distant, geographical locations can increase client-perceived latency and bandwidth cost. Based on their study of Akamai data, the authors found that for Akamai-like large corporations such problems are manageable, i.e., they do have some impact but the total operating cost still decreases. In the end, they provide a simple strategy to reduce energy costs by moving computations to the lowest-priced location within a certain radius of the original location.</p>
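<p>The radius-limited strategy can be sketched in a few lines. This is an illustration only, not the routing scheme evaluated in the paper: the site names, coordinates, and prices below are made up, and distance is approximated with the haversine formula.</p>

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_site(origin, sites, prices, radius_km):
    """Return the cheapest site within radius_km of the origin site.

    The origin itself is always a candidate (distance 0), so the result
    falls back to the origin when nothing cheaper is in range.
    """
    candidates = [s for s in sites
                  if haversine_km(sites[origin], sites[s]) <= radius_km]
    return min(candidates, key=lambda s: prices[s])


# Hypothetical data centers (lat, lon) and hypothetical real-time prices in $/MWh.
SITES = {"east": (40.7, -74.0), "midwest": (41.9, -87.6), "west": (37.8, -122.4)}
PRICES = {"east": 60.0, "midwest": 40.0, "west": 20.0}
```

<p>With a 1500 km radius, requests originating at the hypothetical "east" site would move to "midwest"; the cheaper "west" site is out of range and is only chosen once the radius is large enough to include it.</p>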
<p>The results presented in this paper are highly dependent on the energy-proportionality of data centers/clusters, i.e., the property that energy consumption is proportional to load. Unfortunately, that is not the case in today&#8217;s hardware, but the authors hope for something like that in the future. Anyhow, they did find significant temporal and geographic variation in prices to exploit, and through trace-based simulation they showed that 40% savings can be achieved in an ideal energy-proportional scenario. However, on real hardware the total savings drop to only ~5-6%.</p>
<h2>Critique</h2>
<p>The authors have presented an excellent overview of the contemporary electricity economy, and they have obviously done a great job in shedding new light on a (sort of) well-established problem.</p>
<p>While the paper is excellent in its observations and measurements, the solution is straightforward and relies on several simplifying assumptions. One can imagine a multi-variable optimization problem here that considers energy, bandwidth, and other costs together, with multiple constraints on latency, bandwidth, etc. On the other hand, such optimization problems are really hard to solve in real time, and people may well end up using the straightforward approach anyway.</p>
<p>The choice of 1500 km as the radius seemed pretty arbitrary. The authors did try to justify the number by some significant jumps in cost and distance at that value, but I did not find it very convincing.</p>
<p>Also, one might ask why no one is using such measures if they do save some cost. A possible reason is that the overhead of practically implementing something like this outweighs its savings. But this is only the first step in this direction; there might be ways to find a practical solution in the future.</p>
<p>Also, it seemed to me that the paper does not have much networking or communication content per se. Anyhow, it was an interesting read.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2009/11/29/cutting-the-electric-bill-for-internet-scale-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>BotGraph: Large Scale Spamming Botnet Detection</title>
		<link>http://www.mosharaf.com/blog/2009/11/24/botgraph-large-scale-spamming-botnet-detection/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=botgraph-large-scale-spamming-botnet-detection</link>
		<comments>http://www.mosharaf.com/blog/2009/11/24/botgraph-large-scale-spamming-botnet-detection/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 09:45:48 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[BotGraph]]></category>
		<category><![CDATA[botnet]]></category>
		<category><![CDATA[Chen]]></category>
		<category><![CDATA[Gillum]]></category>
		<category><![CDATA[Ke]]></category>
		<category><![CDATA[UCB CS268 F09]]></category>
		<category><![CDATA[Xie]]></category>
		<category><![CDATA[Yu]]></category>
		<category><![CDATA[Zhao]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1281</guid>
		<description><![CDATA[Y. Zhao, Y. Xie, F. Yu, Q. Ke, Y. Yu, Y. Chen, E. Gillum, &#8220;BotGraph: Large Scale Spamming Botnet Detection,&#8221; NSDI&#8217;09, (April 2009). [PDF] Summary Analyzing large volumes of logged data to identify abnormal patterns is one of the biggest and most frequently faced challenges in the network security community. This paper presents BotGraph, a [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Y. Zhao, Y. Xie, F. Yu, Q. Ke, Y. Yu, Y. Chen, E. Gillum, &#8220;BotGraph: Large Scale Spamming Botnet Detection,&#8221; <em>NSDI&#8217;09</em>, (April 2009). [<a href="http://research.microsoft.com/pubs/79413/botgraph.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>Analyzing large volumes of logged data to identify abnormal patterns is one of the biggest and most frequently faced challenges in the network security community. This paper presents BotGraph, a mechanism based on random graph theory, to detect botnet spamming attacks on major web email providers based on millions of log entries and describes its implementation as a data parallel system. The authors posit that even though individually detecting bot-users (false email accounts/users created by botnets) is difficult, they do have some aggregate behavior (e.g., they share IP addresses when they log in and send emails). BotGraph detects abnormal sharing of IP addresses among bot-users to separate them from human-users.</p>
<p>BotGraph has two major components: <em>aggressive sign-up detection</em> that identifies a sudden increase in signup activity from the same IP address and <em>stealthy bot detection</em> that detects sharing of one IP address by many bot-users as well as sharing of many IP addresses by a single bot-user by creating a user-user graph. In this paper, the authors consider multiple IP addresses from the same AS as one shared IP address. It is also assumed that groups of legitimate users do not normally use the same set of IP addresses from different ASes.</p>
<p>In order to identify bot-user groups, the authors create a user-user graph where each user is a vertex and each link between two vertices carries some weight based on a similarity metric of the two vertices. The authors&#8217; assumption is that the bot-users will create a <em>giant connected component</em> in the BotGraph (since they will share IP addresses) that will collectively distinguish them from the normal users (who create much smaller connected components). They show that there exists some threshold on edge weights, which, if decreased, will suddenly result in large components. It is proven, using random graph theory, that if there is IP address sharing, then the giant component will appear with high probability.</p>
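<p>A toy version of the user-user graph may help here. The sketch below is not the implementation from the paper: it assumes that the edge weight is simply the number of IP addresses two users share (the paper uses a similarity metric and groups addresses by AS), and it uses union-find to extract connected components above a weight threshold. The login records are invented.</p>

```python
from collections import defaultdict
from itertools import combinations


def build_user_graph(logins):
    """logins: iterable of (user, ip). Edge weight = number of IPs two users share."""
    users_by_ip = defaultdict(set)
    for user, ip in logins:
        users_by_ip[ip].add(user)
    weights = defaultdict(int)
    for users in users_by_ip.values():
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return dict(weights)


def components(users, weights, threshold):
    """Connected components using only edges with weight >= threshold (union-find)."""
    parent = {u: u for u in users}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for (u, v), w in weights.items():
        if w >= threshold:
            parent[find(u)] = find(v)
    groups = defaultdict(set)
    for u in users:
        groups[find(u)].add(u)
    return sorted(groups.values(), key=len, reverse=True)


# Invented log: three accounts share two IPs, two humans share one, one is alone.
LOGINS = [("b1", "ip1"), ("b2", "ip1"), ("b3", "ip1"),
          ("b1", "ip2"), ("b2", "ip2"), ("b3", "ip2"),
          ("h1", "ip3"), ("h2", "ip3"), ("h3", "ip4")]
USERS = {user for user, _ in LOGINS}
WEIGHTS = build_user_graph(LOGINS)
```

<p>At threshold 2 only the three accounts that share two addresses remain connected, while at threshold 1 the two humans behind a common address also form a (much smaller) component.</p>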
<p>The detection algorithm works iteratively, starting with a small threshold value to create a large component, and then recursively increasing the threshold to extract connected sub-components (until the size of the connected components becomes smaller than some constant). The resultant output is a hierarchical tree. The algorithm then uses some statistical measurements (using histograms) to separate bot-user group components from their real counterparts (this part of the algorithm is questionable!). After the pruning stage, BotGraph goes through another phase of traversing the hierarchical tree to consolidate bot-user group information.</p>
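<p>The recursive extraction step can be sketched as follows. This only illustrates the idea of raising the threshold to split components into a tree; it leaves out the histogram-based pruning and the final consolidation pass. The stopping parameters <code>min_size</code> and <code>max_threshold</code>, the step size of 1, and the example weights are assumptions.</p>

```python
from collections import defaultdict


def components(users, weights, threshold):
    """Connected components among users over edges with weight >= threshold."""
    adj = defaultdict(set)
    for (u, v), w in weights.items():
        if w >= threshold and u in users and v in users:
            adj[u].add(v)
            adj[v].add(u)
    seen, result = set(), []
    for start in sorted(users):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # iterative depth-first search
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        result.append(comp)
    return result


def extract_tree(users, weights, threshold, min_size, max_threshold):
    """Split a component by raising the threshold; returns a nested dict tree."""
    node = {"users": set(users), "threshold": threshold, "children": []}
    if len(users) < min_size or threshold >= max_threshold:
        return node
    for comp in components(users, weights, threshold + 1):
        if len(comp) > 1:  # singletons are not interesting as groups
            node["children"].append(
                extract_tree(comp, weights, threshold + 1, min_size, max_threshold))
    return node


# Hypothetical edge weights between five users.
WEIGHTS = {("a", "b"): 3, ("b", "c"): 3, ("c", "d"): 1, ("d", "e"): 2}
TREE = extract_tree({"a", "b", "c", "d", "e"}, WEIGHTS,
                    threshold=1, min_size=2, max_threshold=3)
```

<p>In this example the root holds all five users at threshold 1, and raising the threshold to 2 splits it into the groups {a, b, c} and {d, e}.</p>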
<p>Applying BotGraph poses two problems: first, constructing the graph, which can contain millions of vertices, and then processing that huge graph to actually run the detection algorithm. The authors propose two approaches: one using the currently popular data-parallel processing mechanism MapReduce, and another using an extension of it that adds selective filtering and two more interfaces beyond the basic Map and Reduce. The extended version is found to be the more resource-efficient, scalable, and faster of the two.</p>
<p>Evaluation using the data parallel systems shows that BotGraph could identify a large number of previously unknown bot-users and bot-user accounts with low false positive rates.</p>
<h2>Critique</h2>
<p>The whole concept seems to rest on the assumption that botnet operators are &#8220;stupid&#8221;. The criteria used in the paper are pretty straightforward, and it shouldn&#8217;t be too hard for the botnet operators to figure them out and work around them.</p>
<p>The MapReduce part in the middle of the paper, which took 4 pages, does not seem SO new; they could&#8217;ve just left it out of the paper or at least shortened that part and saved everyone some time. After all, MapReduce was created to take care of terabytes of data; making it work for 500 GB of data should not be that awe-inspiring. To give the authors credit, it should be mentioned that they are most probably the first ones to use it in this context.</p>
<p>One more thing: I wish someone would publish what GMail does to filter spam, because I can bet they are pretty darn good at what they are doing and much ahead of other web mail providers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2009/11/24/botgraph-large-scale-spamming-botnet-detection/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Not-a-Bot: Improving Service Availability in the Face of Botnet Attacks</title>
		<link>http://www.mosharaf.com/blog/2009/11/23/not-a-bot-improving-service-availability-in-the-face-of-botnet-attacks/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=not-a-bot-improving-service-availability-in-the-face-of-botnet-attacks</link>
		<comments>http://www.mosharaf.com/blog/2009/11/23/not-a-bot-improving-service-availability-in-the-face-of-botnet-attacks/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 06:12:43 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Balakrishnan]]></category>
		<category><![CDATA[botnet]]></category>
		<category><![CDATA[Gummadi]]></category>
		<category><![CDATA[Maniatis]]></category>
		<category><![CDATA[Ratnasamy]]></category>
		<category><![CDATA[UCB CS268 F09]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1266</guid>
		<description><![CDATA[R. Gummadi, H. Balakrishnan, P. Maniatis, S. Ratnasamy, &#8220;Not-a-Bot: Improving Service Availability in the Face of Botnet Attacks,&#8221; NSDI&#8217;09, (April 2009). [PDF] Summary In recent years, botnets have become the major originators of email spam, DDoS attacks, and click fraud on advertisement-based web sites. This paper argues that separating human-generated traffic from botnet-generated activities can improve [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">R. Gummadi, H. Balakrishnan, P. Maniatis, S. Ratnasamy, &#8220;Not-a-Bot: Improving Service Availability in the Face of Botnet Attacks,&#8221; <em>NSDI&#8217;09</em>, (April 2009). [<a href="http://berkeley.intel-research.net/maniatis/publications/2009NSDINAB.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>In recent years, botnets have become the major originators of email spam, DDoS attacks, and click fraud on advertisement-based web sites. This paper argues that separating human-generated traffic from botnet-generated activity can improve the reliability of various web-based services against botnet attacks. But identifying human-generated traffic in the absence of strong unique identities can be challenging. In this paper, the authors propose NAB (Not-A-Bot), a system to approximately identify and certify human-generated activity in a non-intrusive way.</p>
<h3>NAB Architecture</h3>
<p>NAB consists of an attester, a <em>small</em> piece of trusted software that runs locally at a host (isolated from the <em>untrusted</em> OS) and generates attestations in response to requests from applications, as well as an external verifier that validates these attestations at a remote site. Four main requirements drive the design of the NAB architecture (attester and verifier):</p>
<ol>
<li>Attestations must be generated automatically in response to human requests.</li>
<li>Attestations must not be transferable from the client on which they are generated to attest traffic originating from another client.</li>
<li>NAB must benefit users that deploy it without hurting those that do not.</li>
<li>NAB must preserve the existing privacy and anonymity semantics of applications.</li>
</ol>
<p>Requirements 1 and 2 are implemented/met in the attester and requirements 3 and 4 are ensured in the verifier.</p>
<h3>Attester</h3>
<p>The attester runs on a trusted computing base (<a href="http://en.wikipedia.org/wiki/Trusted_computing_base">TCB</a>), which is implemented by taking advantage of the Trusted Platform Module (TPM) available in most modern systems. The authors use the TPM to create a trusted path between physical input devices and the human activity attester.</p>
<p>The attester&#8217;s sole purpose is to create attestations &#8211; when asked for by an application &#8211; for legitimate human activity. The authors use a simple t-δ attester, where an attestation is created if there has been any input activity in the last δ time units. Even though user activity can be forged or harvested under this simpler approach, the authors argue that a botnet will be limited by the frequency of human activity, which reduces the number of attacks it can mount.</p>
<p>NAB generates responder-specific, content-specific, and, if appropriate, challenger-specific attestations and employs existing cryptographic methods to secure them. It also ensures that attestations cannot be double-spent and cannot be misused by botnets (botnets can forge attestations only for a <em>very limited</em> time window).</p>
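<p>To make the t-δ rule concrete, here is a minimal sketch of the policy only. The class and method names are my own (hypothetical), time is passed in explicitly so the logic is easy to follow, and nothing here models the TPM, the trusted input path, or the cryptography used in the paper.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TDeltaAttester:
    """Toy t-delta policy: attest only if input was seen within the last delta units."""

    delta: float                        # allowed gap between input and request
    last_input: Optional[float] = None  # time of the most recent input event

    def record_input(self, now: float) -> None:
        # In NAB this event would arrive over the trusted path from the
        # input devices; here it is just a method call (assumption).
        self.last_input = now

    def attest(self, request: bytes, now: float) -> Optional[dict]:
        # Refuse when no input has been seen, or the last input is too old.
        if self.last_input is None or now - self.last_input > self.delta:
            return None
        # A real attestation would be signed; this dict is a placeholder.
        return {"request": request, "issued_at": now}
```

<p>With this rule, a bot on the same machine can only obtain attestations while the user is actually typing or clicking, which is the rate limit the authors rely on.</p>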
<h3>Verifier</h3>
<p>The verifier is co-located with the server processing requests. When invoked, the verifier is passed both the attestation and the request. Based on this information (plus the cryptographic details that are in the paper), the verifier checks the validity of the attestation. The authors also discuss different application-specific verification policies for spam, DDoS, and click fraud.</p>
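<p>The checks described above (an attestation is bound to a responder and to the request content, and cannot be spent twice) can be sketched roughly as follows. This is not the paper&#8217;s scheme: a shared-secret HMAC stands in for its actual cryptography, and <code>make_attestation</code>, <code>Verifier</code>, and the nonce field are hypothetical names chosen for illustration.</p>

```python
import hashlib
import hmac


def make_attestation(key: bytes, request: bytes, responder: bytes, nonce: bytes) -> dict:
    """Bind an attestation to one responder and one request (stand-in for the attester)."""
    msg = responder + b"|" + nonce + b"|" + hashlib.sha256(request).digest()
    tag = hmac.new(key, msg, hashlib.sha256).digest()
    return {"responder": responder, "nonce": nonce, "tag": tag}


class Verifier:
    """Toy verifier: checks responder, content binding, and single use."""

    def __init__(self, key: bytes, responder: bytes) -> None:
        self.key = key
        self.responder = responder
        self.seen = set()  # nonces of attestations already accepted

    def verify(self, attestation: dict, request: bytes) -> bool:
        # Responder-specific: reject attestations minted for another service.
        if attestation["responder"] != self.responder:
            return False
        # Content-specific: recompute the tag over the request we actually got.
        expected = make_attestation(
            self.key, request, attestation["responder"], attestation["nonce"]
        )["tag"]
        if not hmac.compare_digest(expected, attestation["tag"]):
            return False
        # No double spending: each nonce is accepted at most once.
        if attestation["nonce"] in self.seen:
            return False
        self.seen.add(attestation["nonce"])
        return True
```

<p>What the server does with a failed check is a policy decision; since NAB does not penalize unattested traffic, a request that fails verification would presumably be treated like one that carries no attestation at all.</p>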
<h3>Evaluation Results</h3>
<p>Major results from the evaluation are:</p>
<ul>
<li>TCB size can be really small (500 SLOC)</li>
<li>Attester CPU cost is 10<sup>7</sup> instructions/attestation</li>
<li>For simple applications, fewer than 250 SLOC of changes are enough to make them NAB-enabled</li>
<li>In the worst case, NAB can suppress 92% of spam, 89% of non-human (possibly DDoS) activity, and 87% of automated clicks without false positives</li>
<li>The verifier can withstand a 100,000-bot DDoS and can handle more than 1000 requests/second</li>
</ul>
<h2>Discussion</h2>
<p>The authors have done a good job of discussing several alternatives in multiple cases instead of sticking to their chosen one, which provides insights into botnet behavior and explains why some solutions might not work. Their solution still has some loopholes, but will possibly work to some extent. The requirement of changing all the applications could be a turn-off. However, the fact that NAB does not discriminate against unattested traffic makes that problem go away and allows incremental deployment.</p>
<p>One problem that I can think of right now regarding this approach is that the applications are not in the TCB and the requests for attestations do not seem to go through a trusted channel. What are the fallback options if a malicious entity corrupts or removes attestations just after they leave the trusted base of the attester? Also, the verifiers do not seem to be in a TCB. What if one of them is compromised?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2009/11/23/not-a-bot-improving-service-availability-in-the-face-of-botnet-attacks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scalable Application Layer Multicast</title>
		<link>http://www.mosharaf.com/blog/2009/11/22/scalable-application-layer-multicast/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=scalable-application-layer-multicast</link>
		<comments>http://www.mosharaf.com/blog/2009/11/22/scalable-application-layer-multicast/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 04:41:43 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Banerjee]]></category>
		<category><![CDATA[Bhattacharjee]]></category>
		<category><![CDATA[Kommareddy]]></category>
		<category><![CDATA[multicast]]></category>
		<category><![CDATA[UCB CS268 F09]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1246</guid>
		<description><![CDATA[S. Banerjee, B. Bhattacharjee, C. Kommareddy, &#8220;Scalable Application Layer Multicast,&#8221; ACM SIGCOMM Conference, (August 2002). [PDF] Summary Deployment of network layer multicast requires support from the underlying infrastructure. Like many other proposals/protocols with similar requirements, it faced the same fate: no one deployed it. This paper presents a scalable application layer multicast protocol, NICE, designed [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">S. Banerjee, B. Bhattacharjee, C. Kommareddy, &#8220;Scalable Application Layer Multicast,&#8221; <em>ACM SIGCOMM Conference</em>, (August 2002). [<a href="http://pages.cs.wisc.edu/~suman/pubs/sigcomm02.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>Deployment of network layer multicast requires support from the underlying infrastructure. Like many other proposals/protocols with similar requirements, it faced the same fate: no one deployed it. This paper presents a scalable application layer multicast protocol, NICE, designed for low-bandwidth data streaming applications with large receiver sets. Data packets in NICE are replicated at end hosts, which makes it technically an overlay, and so it does not need support from the network providers.</p>
<p>The NICE protocol arranges the set of end hosts into a hierarchy and maintains the hierarchy through different events. The members at the very top of the hierarchy maintain (soft) state about O(log N) other members. Logically each member keeps detailed state about other members that are near (calculated based on end-to-end latency) in the hierarchy and only has limited knowledge about other members in the group. Hosts in each layer are partitioned into a set of clusters of sizes between k and 3k-1, where k is a constant. Each cluster has a leader that is the graph-theoretic center of the cluster. The leaders in layer L<sub>i</sub> are the members of the layer L<sub>i+1</sub>, and the same process continues. The following properties hold for the distribution of hosts in different layers:</p>
<ul>
<li>A host belongs to a single cluster at any layer.</li>
<li>If a host is present in layer L<sub>i</sub>, it must be the cluster leader in each of the lower layers (L<sub>i-1</sub>&#8230;L<sub>0</sub>).</li>
<li>If a host is not present in layer L<sub>i</sub>, it cannot be present in any higher layers (L<sub>i+1</sub>&#8230;).</li>
<li>Each cluster is size bounded between k and 3k-1.</li>
<li>There are at most log<sub>k</sub> N layers, and the highest layer has only a single member.</li>
</ul>
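<p>A rough way to see how these properties fit together is to build such a hierarchy bottom-up. The sketch below is only an illustration under simplifying assumptions: hosts are clustered in list order rather than by latency, the first member of a cluster is taken as its leader (the paper uses the graph-theoretic center), and a layer with fewer than k members forms a single cluster. The function names are hypothetical.</p>

```python
def cluster(members, k):
    """Split members into consecutive clusters whose sizes stay within k..3k-1."""
    chunks = [members[i:i + k] for i in range(0, len(members), k)]
    # A short trailing chunk is merged into the previous one, giving a
    # cluster of at most 2k-1 members, which is still within the bound.
    if len(chunks) > 1 and k > len(chunks[-1]):
        tail = chunks.pop()
        chunks[-1].extend(tail)
    return chunks


def build_layers(hosts, k):
    """Return [L0, L1, ...], where each layer holds the leaders of the layer below."""
    if 2 > k:
        raise ValueError("k must be at least 2")
    layers = [list(hosts)]
    while len(layers[-1]) > 1:
        # Assumption: the first member of each cluster acts as its leader.
        leaders = [c[0] for c in cluster(layers[-1], k)]
        layers.append(leaders)
    return layers
```

<p>By construction, every host in a layer also appears in all lower layers as a cluster leader, and the loop stops once a layer contains a single member.</p>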
<p>There are separate control and data planes in NICE. The neighbors on the control topology exchange periodic soft state refreshes and form a clique in each layer. The data topology in each layer is a source-specific tree.</p>
<p>NICE assumes a special host, the rendezvous point (RP), which is known a priori to all members. When a new host wants to join a multicast group, it contacts the RP. The RP responds by returning the members in the highest layer. The joining node then contacts the closest of these members, which returns the members of its cluster in the next lower layer (that member is present in the higher layer precisely because it leads a cluster in the lower one). This process continues iteratively until the joining node finds its appropriate location in the lowest layer. The message overhead of the joining process is O(k log N) query-response pairs. When a node leaves, it removes itself from all the layers and new leaders are chosen. Periodic heartbeat messages are used to keep the state information fresh and to refine (split/merge) the hierarchy if necessary.</p>
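<p>The top-down join walk can be sketched as below. Everything in it is a toy assumption: hosts live on a line so that distance stands in for measured latency, the hierarchy is a hand-written dictionary mapping (layer, leader) to the cluster that leader heads, and the names are hypothetical. The point is only that each step probes one cluster, so the total number of probes is bounded by the cluster size times the number of layers.</p>

```python
# Hypothetical three-layer hierarchy: "a" is the single top-layer member.
POSITIONS = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 10.0, "e": 11.0, "f": 12.0}
CLUSTERS = {
    (1, "a"): ["a", "d"],       # cluster that "a" leads in layer 1
    (0, "a"): ["a", "b", "c"],  # clusters in the lowest layer
    (0, "d"): ["d", "e", "f"],
}


def join(new_pos, rp_members, clusters, positions, top_layer):
    """Walk down from the top layer; return (L0 leader, L0 cluster, probes sent)."""
    candidates = rp_members  # the RP hands back the members of the highest layer
    closest = None
    probes = 0
    for layer in range(top_layer - 1, -1, -1):
        # Probe every candidate and keep the nearest one.
        probes += len(candidates)
        closest = min(candidates, key=lambda h: abs(positions[h] - new_pos))
        # The nearest member returns the cluster it leads one layer down.
        candidates = clusters[(layer, closest)]
    return closest, candidates, probes
```

<p>For a host at position 10.5, <code>join(10.5, ["a"], CLUSTERS, POSITIONS, 2)</code> probes "a", then "a" and "d", and ends up in the lowest-layer cluster led by "d".</p>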
<p>The authors compare the performance of NICE against the NARADA protocol (which was possibly the best at that time) along the following dimensions: quality of the data path (using stretch and stress measures), recovery from host failure, and control traffic overhead. NICE is similar to NARADA in terms of the stretch of data paths and failure recovery, but it puts lower stress on routers and links. The most important result is that NICE has orders of magnitude lower control overhead for groups of size &gt; 32, which makes it much more scalable. In addition to simulations, the paper also presents some smaller-scale wide-area experiments, whose results are similar to the simulation results.</p>
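<p>The two data-path measures are not defined above. As they are commonly used in the overlay multicast literature, stress is the number of identical copies of a packet that a physical link carries, and stretch is the ratio of the latency along the overlay path to the latency of the direct unicast path. A small sketch with made-up topology data (the function names and the link labels are hypothetical):</p>

```python
from collections import Counter


def stretch(overlay_latency, unicast_latency):
    """Ratio of overlay path latency to direct unicast latency (1.0 is ideal)."""
    return overlay_latency / unicast_latency


def link_stress(overlay_hops, route):
    """Count how many copies of one packet cross each physical link.

    overlay_hops: overlay edges (src, dst) that each forward one copy.
    route: maps an overlay edge to the list of physical links it traverses.
    """
    counts = Counter()
    for hop in overlay_hops:
        counts.update(route[hop])
    return counts
```

<p>For example, if a source S forwards separate copies to A and to B and both unicast routes share the physical link S-R, that link has a stress of 2, whereas network layer multicast would put a single copy on it.</p>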
<h2>Critique</h2>
<p>The idea presented in this paper is neat and straightforward, but at the same time obvious. By creating hierarchies, a lot of problems can be solved; but such solutions come at the expense of fault tolerance. The authors just assume that the RP won&#8217;t fail or, even if it fails, that it will recover quickly, but they avoid the details of what might happen if things go wrong. In short, there is no clear notion of reliability.</p>
<p>NICE is just a glorified overlay. The reasons why application layer multicast should work have also been found to be not-so-correct. Network operators do not like protocols that try to bypass them and mess with their traffic engineering. Eventually, they end up blocking or throttling those protocols. The same might happen to NICE, if anyone were using it. The paper does not discuss this aspect of NICE-underlay interaction.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2009/11/22/scalable-application-layer-multicast/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
