<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Mosharaf Chowdhury</title>
	
	<link>http://www.mosharaf.com</link>
	<description>UC Berkeley</description>
	<lastBuildDate>Sat, 19 May 2012 04:02:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/mosharaf" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="mosharaf" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Ahimsa accepted at HotCloud’2012</title>
		<link>http://www.mosharaf.com/blog/2012/05/10/ahimsa-accepted-at-hotcloud2012/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=ahimsa-accepted-at-hotcloud2012</link>
		<comments>http://www.mosharaf.com/blog/2012/05/10/ahimsa-accepted-at-hotcloud2012/#comments</comments>
		<pubDate>Thu, 10 May 2012 21:57:06 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Ahimsa]]></category>
		<category><![CDATA[FairCloud]]></category>
		<category><![CDATA[HotCloud]]></category>
		<category><![CDATA[Orchestra]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=2089</guid>
		<description><![CDATA[Update: Camera-ready is available online! Do let us know what you think in the comments section. Our exploratory paper on the complexity of a transfer, &#8220;Redefining Network Fairness to Support Data Parallelism,&#8221; has been accepted for publication at this year&#8217;s HotCloud workshop! In Orchestra, we defined the notion of transfers in the context of cluster computing, and [...]]]></description>
			<content:encoded><![CDATA[<p class="alert"><strong>Update:</strong> Camera-ready is <a href="http://www.mosharaf.com/wp-content/uploads/ahimsa-hotcloud12.pdf">available</a> online! Do let us know what you think in the comments section.</p>
<p>Our exploratory paper on the complexity of a transfer, &#8220;Redefining Network Fairness to Support Data Parallelism,&#8221; has been accepted for publication at this year&#8217;s HotCloud workshop!</p>
<p>In <a href="http://www.mosharaf.com/blog/2011/04/29/orchestra-has-been-accepted-at-sigcomm2011/">Orchestra</a>, we defined the notion of transfers in the context of cluster computing, and in <a href="http://www.mosharaf.com/blog/2012/05/05/faircloud-has-been-accepted-at-sigcomm2012/">FairCloud</a>, we argued for fairness across multiple transfers. However, we have so far been considering transfers independently of the computations they enable. <a href="http://www.eecs.berkeley.edu/~gautamk/">Gautam</a> observed that not all transfers are created equal: when we scale the input to a computation up or down, the input to its transfers does not always scale linearly (e.g., partitioned transfers like shuffles in a MapReduce program scale linearly, whereas broadcast has a super-linear scaling factor). As a result, network fairness, when defined in terms of bandwidth, does not always match the simple goal of data parallelism: &#8220;given <em>n</em> times more resources, a data parallel application can expect to complete <em>n</em> times faster.&#8221; Ahimsa explores the notion of network fairness that can match this goal.</p>
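<p>To make the scaling argument concrete, here is a toy model of my own (not the definition used in the paper). It assumes that the input and the number of machines are scaled together by a factor <em>k</em>, that a shuffle delivers each byte to exactly one receiver, and that a broadcast delivers a full copy to every machine. The function names are hypothetical.</p>
<pre>
```python
def shuffle_bytes(input_bytes: int, machines: int) -> int:
    # Partitioned transfer: each byte goes to exactly one receiver,
    # so the total does not depend on the number of machines.
    return input_bytes


def broadcast_bytes(input_bytes: int, machines: int) -> int:
    # Broadcast: every machine receives a full copy of the data.
    return input_bytes * machines


def growth(transfer, input_bytes: int, machines: int, k: int) -> float:
    # Ratio of bytes moved after scaling both the input and the
    # cluster size by k (an assumption of this sketch).
    before = transfer(input_bytes, machines)
    after = transfer(input_bytes * k, machines * k)
    return after / before


if __name__ == "__main__":
    for name, transfer in [("shuffle", shuffle_bytes), ("broadcast", broadcast_bytes)]:
        print(name, growth(transfer, 1000, 10, 4))
```
</pre>
<p>Under these assumptions the shuffle grows by a factor of <em>k</em> while the broadcast grows by <em>k</em> squared, which is why an equal bandwidth share does not translate into equal speedup for the two.</p>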
<p>This year, HotCloud accepted 24 out of 75 submissions, six of which have at least one Berkeley author :)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2012/05/10/ahimsa-accepted-at-hotcloud2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>“Surviving Failures in Bandwidth-Constrained Datacenters” at SIGCOMM’2012</title>
		<link>http://www.mosharaf.com/blog/2012/05/06/surviving-failures-in-bandwidth-constrained-datacenters-at-sigcomm2012/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=surviving-failures-in-bandwidth-constrained-datacenters-at-sigcomm2012</link>
		<comments>http://www.mosharaf.com/blog/2012/05/06/surviving-failures-in-bandwidth-constrained-datacenters-at-sigcomm2012/#comments</comments>
		<pubDate>Sun, 06 May 2012 20:39:51 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[SIGCOMM]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=2079</guid>
		<description><![CDATA[My internship work from last summer has been accepted for publication at SIGCOMM&#8217;2012 as well; yay!! In this piece of work, we try to allocate machines for datacenter applications with bandwidth and fault-tolerance constraints, which are at odds: allocation for bandwidth tries to put machines closer together, whereas a fault-tolerant allocation spreads machines out across multiple fault [...]]]></description>
			<content:encoded><![CDATA[<p>My internship work from last summer has been accepted for publication at SIGCOMM&#8217;2012 <a href="http://www.mosharaf.com/blog/2012/05/05/faircloud-has-been-accepted-at-sigcomm2012/">as well</a>; yay!! In this piece of work, we try to allocate machines for datacenter applications with bandwidth and fault-tolerance constraints, which are at odds: allocation for bandwidth tries to put machines closer together, whereas a fault-tolerant allocation spreads machines out across multiple fault domains.</p>
<blockquote><p>Datacenter networks have been designed to tolerate failures of network equipment and provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase service latency. Unfortunately, there exists an inherent tradeoff between achieving high fault tolerance and reducing bandwidth usage in network core; spreading servers across fault domains improves fault tolerance, but requires additional bandwidth, while deploying servers together reduces bandwidth usage, but also decreases fault tolerance. We present a detailed analysis of a large-scale Web application and its communication patterns. Based on that, we propose and evaluate a novel optimization framework that achieves both high fault tolerance and significantly reduces bandwidth usage in network core by exploiting the skewness in the observed communication patterns.</p></blockquote>
<p>During my Master&#8217;s, I worked on several variations of a similar problem called <em>virtual network embedding</em> in the context of network virtualization (<a href="http://www.mosharaf.com/wp-content/uploads/vineyard-infocom09.pdf">INFOCOM&#8217;2009</a>, <a href="http://www.mosharaf.com/wp-content/uploads/tarmvine-networking10.pdf">Networking&#8217;2010</a>, <a href="http://www.mosharaf.com/wp-content/uploads/polyvine-visa10.pdf">VISA&#8217;2010</a>, <a href="http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5951812">ToN&#8217;2012</a>).</p>
<p>This year, 32 out of 235 papers have been accepted at SIGCOMM, <a href="http://lmbgp.tumblr.com/post/21826884051/berkeley-papers-accepted-to-sigcomm">seven</a> of which have at least one Berkeley author.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2012/05/06/surviving-failures-in-bandwidth-constrained-datacenters-at-sigcomm2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FairCloud has been accepted at SIGCOMM’2012</title>
		<link>http://www.mosharaf.com/blog/2012/05/05/faircloud-has-been-accepted-at-sigcomm2012/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=faircloud-has-been-accepted-at-sigcomm2012</link>
		<comments>http://www.mosharaf.com/blog/2012/05/05/faircloud-has-been-accepted-at-sigcomm2012/#comments</comments>
		<pubDate>Sat, 05 May 2012 20:46:19 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[FairCloud]]></category>
		<category><![CDATA[Orchestra]]></category>
		<category><![CDATA[SIGCOMM]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=2075</guid>
		<description><![CDATA[This is kinda old news now, but still as exciting as it was a few days ago. Our paper &#8220;FairCloud: Sharing the Network in Cloud Computing&#8221; has been accepted for publication at this year&#8217;s SIGCOMM. We explore the design space of sharing networks, identify tradeoffs, and categorize different strategies based on their characteristics. In case [...]]]></description>
			<content:encoded><![CDATA[<p>This is kinda old news now, but still as exciting as it was a few days ago. Our paper &#8220;FairCloud: Sharing the Network in Cloud Computing&#8221; has been accepted for publication at this year&#8217;s SIGCOMM. We explore the design space of sharing networks, identify tradeoffs, and categorize different strategies based on their characteristics. In case you are following our <a href="http://www.mosharaf.com/blog/2011/04/29/orchestra-has-been-accepted-at-sigcomm2011/">Orchestra</a> work, FairCloud sits in the Inter-Transfer Controller (ITC) of the Orchestra hierarchy.</p>
<blockquote><p>The network, similar to CPU and memory, is a critical and shared resource in the cloud. However, unlike other resources, it is neither shared <em>proportionally to payment</em>, nor do cloud providers offer <em>minimum guarantees</em> on network bandwidth. The reason is that networks are more difficult to share, since the network allocation of a VM X depends not only on the VMs running on the same machine with X, but also on the other VMs that X communicates with, as well as on the cross-traffic on each link used by X. In this paper, we start from the above requirements, payment proportionality and minimum guarantees, and show that the network-specific challenges lead to fundamental tradeoffs when sharing datacenter networks. We then propose a set of properties to explicitly express these tradeoffs. Finally, we propose three allocation policies that allow us to navigate the tradeoff space. We evaluate their characteristics through simulation and testbed experiments, showing that they are able to provide minimum guarantees and achieve better proportionality with the per-VM payment than known allocations.</p></blockquote>
<p>This year, 32 out of 235 papers have been accepted at SIGCOMM. In other news, Berkeley has <a href="http://lmbgp.tumblr.com/post/21826884051/berkeley-papers-accepted-to-sigcomm">seven papers</a> at this SIGCOMM!!!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2012/05/05/faircloud-has-been-accepted-at-sigcomm2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark wins the Best Paper Award at NSDI’2012</title>
		<link>http://www.mosharaf.com/blog/2012/04/25/spark-wins-the-best-paper-award-at-nsdi2012/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=spark-wins-the-best-paper-award-at-nsdi2012</link>
		<comments>http://www.mosharaf.com/blog/2012/04/25/spark-wins-the-best-paper-award-at-nsdi2012/#comments</comments>
		<pubDate>Wed, 25 Apr 2012 17:23:14 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Awards]]></category>
		<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Best Paper]]></category>
		<category><![CDATA[NSDI]]></category>
		<category><![CDATA[Spark]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=2047</guid>
		<description><![CDATA[Spark (Resilient Distributed Datasets/RDDs) has won the Best Paper award at NSDI 2012. Woohoo! We were also nominated for the inaugural Community Award for open-sourcing the project.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.spark-project.org/">Spark</a> (Resilient Distributed Datasets/RDDs) has won the Best Paper award at NSDI 2012. Woohoo! We were also nominated for the inaugural Community Award for open-sourcing the project.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2012/04/25/spark-wins-the-best-paper-award-at-nsdi2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I’ve won the Facebook Fellowship 2012-2013</title>
		<link>http://www.mosharaf.com/blog/2012/02/17/ive-won-the-facebook-fellowship-2012-2013/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=ive-won-the-facebook-fellowship-2012-2013</link>
		<comments>http://www.mosharaf.com/blog/2012/02/17/ive-won-the-facebook-fellowship-2012-2013/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 18:47:15 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Awards]]></category>
		<category><![CDATA[Recent News]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=2053</guid>
		<description><![CDATA[Great news! I&#8217;ve been selected as one of the 12 recipients of the 2012-2013 Facebook Fellowship :) We have two more winners from Berkeley. Go BEARS!!! The official release from Facebook can be found here.]]></description>
			<content:encoded><![CDATA[<p>Great news! I&#8217;ve been selected as one of the 12 recipients of the 2012-2013 Facebook Fellowship :) We have two more winners from Berkeley. Go BEARS!!!</p>
<p>The official release from Facebook can be found <a href="https://www.facebook.com/note.php?note_id=311040305614050">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2012/02/17/ive-won-the-facebook-fellowship-2012-2013/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TAing for CS162: Operating Systems and Systems Programming</title>
		<link>http://www.mosharaf.com/blog/2012/01/29/taing-for-cs162-operating-systems-and-systems-programming/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=taing-for-cs162-operating-systems-and-systems-programming</link>
		<comments>http://www.mosharaf.com/blog/2012/01/29/taing-for-cs162-operating-systems-and-systems-programming/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 23:34:01 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=2029</guid>
		<description><![CDATA[This semester I&#8217;m TAing for CS162: Operating Systems and Systems Programming taught by Anthony Joseph and Ion Stoica. Here in Berkeley, we call a TA a GSI (Graduate Student Instructor). This is my first GSI appointment at Berkeley, and it comes after a break of more than two years. The last time I TAed, I was at [...]]]></description>
			<content:encoded><![CDATA[<p>This semester I&#8217;m TAing for <a href="http://inst.eecs.berkeley.edu/~cs162/sp12/">CS162: Operating Systems and Systems Programming</a> taught by Anthony Joseph and Ion Stoica. Here in Berkeley, we call a TA a GSI (Graduate Student Instructor). This is my first GSI appointment at Berkeley, and it comes after a break of more than two years. The last time I TAed, I was at UWaterloo and received the best TA award. It&#8217;ll be very hard to match that, but I am hoping not to disappoint my students. We&#8217;ll see how it goes.</p>
<p class="error"><strong>PS:</strong> What is the proper verb form of &#8220;TA&#8221;? TAing, TA-ing, TA&#8217;ing or what? Someone must&#8217;ve solved this problem already :(</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2012/01/29/taing-for-cs162-operating-systems-and-systems-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark has been accepted at NSDI’2012</title>
		<link>http://www.mosharaf.com/blog/2011/12/13/spark-has-been-accepted-at-nsdi2012/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=spark-has-been-accepted-at-nsdi2012</link>
		<comments>http://www.mosharaf.com/blog/2011/12/13/spark-has-been-accepted-at-nsdi2012/#comments</comments>
		<pubDate>Wed, 14 Dec 2011 01:09:14 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Recent News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[NSDI]]></category>
		<category><![CDATA[Spark]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1991</guid>
		<description><![CDATA[Our paper &#8220;Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing&#8221; has been accepted at NSDI&#8217;2012. This is Matei&#8217;s brainchild and the joint work of many people including, but not limited to, TD, Ankur, Justin, Murphy, and professors Ion Stoica, Scott Shenker, and Michael Franklin. Unlike many other systems papers, Spark is [...]]]></description>
			<content:encoded><![CDATA[<p>Our paper &#8220;Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing&#8221; has been accepted at NSDI&#8217;2012. This is <a href="http://www.cs.berkeley.edu/~matei/">Matei</a>&#8217;s brainchild and the joint work of many people including, but not limited to, <a href="http://www.eecs.berkeley.edu/~tdas/">TD</a>, Ankur, <a href="http://www.cs.berkeley.edu/~jtma/">Justin</a>, Murphy, and professors Ion Stoica, Scott Shenker, and Michael Franklin. Unlike many other systems papers, Spark is actively developed and used by many people. You can also <a href="http://www.spark-project.org/">download</a> and use it in no time to solve all your problems; well, at least the ones that require analyzing big data in little time. We focus on the concept of resilient distributed datasets or RDDs in this paper, and show how we can perform fast, in-memory iterative and interactive jobs with low-overhead fault-tolerance.</p>
<blockquote><p>We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools. In both cases, keeping data in memory can improve performance by an order of magnitude. To achieve fault tolerance efficiently, RDDs provide a restricted form of shared memory, based on coarse-grained transformations rather than fine-grained updates to shared state. However, we show that RDDs are expressive enough to capture a wide class of computations, including current specialized programming models for iterative jobs like Pregel. We have implemented RDDs in a system called Spark, which we evaluate through a variety of benchmarks and user applications.</p></blockquote>
<p>The NSDI&#8217;2012 PC accepted 30 out of 169 papers. In other news, this time Berkeley will have a big presence at NSDI with several other papers. Go Bears!!!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/12/13/spark-has-been-accepted-at-nsdi2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Memory Management in the Cloud</title>
		<link>http://www.mosharaf.com/blog/2011/12/05/memory-management-in-the-cloud/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=memory-management-in-the-cloud</link>
		<comments>http://www.mosharaf.com/blog/2011/12/05/memory-management-in-the-cloud/#comments</comments>
		<pubDate>Mon, 05 Dec 2011 18:13:50 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[PACMan]]></category>
		<category><![CDATA[RAMCloud]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1985</guid>
		<description><![CDATA[Stanford, &#8220;The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM,&#8221; SIGOPS Operating Systems Review, Vol. 43, No. 4, December 2009, pp. 92-105. [PDF] AMP Lab, &#8220;PACMan: Coordinated Memory Caching for Parallel Jobs,&#8221; Secret Draft. Update: PACMan has been accepted at NSDI&#8217;2012. Secret draft won&#8217;t remain secret anymore :) Summary Cloud applications require storage systems [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Stanford, &#8220;The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM,&#8221; <em>SIGOPS Operating Systems Review</em>, Vol. 43, No. 4, December 2009, pp. 92-105. [<a href="http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf">PDF</a>]</p>
<p class="alert">AMP Lab, &#8220;PACMan: Coordinated Memory Caching for Parallel Jobs,&#8221; <em>Secret Draft</em>.</p>
<p class="alert"><strong>Update:</strong> PACMan has been accepted at NSDI&#8217;2012. Secret draft won&#8217;t remain secret anymore :)</p>
<h2>Summary</h2>
<p>Cloud applications require storage systems that provide low latency and high throughput for large amounts of data. While traditional disks cannot meet such requirements, given the trend in DRAM price and capacity, it is possible to envision a future where most of the storage needs can be fulfilled by DRAM; RAMCloud is such a system. PACMan, on the other hand, suggests that even today, most of the workloads can be kept in DRAM using better caching mechanisms.</p>
<h3>RAMCloud</h3>
<p>The core idea in RAMCloud is to keep everything in DRAM, with disks used only as backups. The biggest challenge is to make sure that the storage system can be recovered quickly upon failure. RAMCloud uses buffered logging. The authors claim that replication is not necessary to achieve high performance; rather, replicas are used only for parallel recovery. In steady state, there is a single copy of the data present in DRAM. Recovery is performed using a massively parallel read of data from disks.</p>
<h3>PACMan</h3>
<p>PACMan is a caching mechanism and corresponding system for HDFS and similar distributed file systems. The key idea is that current clusters have a large amount of unused memory that can be used to cache frequently-used data blocks, and that traditional caching strategies like LRU or LFU do not work well on cluster jobs. The authors propose the all-or-nothing property: either all data blocks for a given job should be cached across the cluster, or none at all.</p>
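<p>A toy model of my own (not PACMan&#8217;s actual cost model) shows why partial caching buys so little. It assumes one task per block, all tasks running in a single parallel wave, and made-up read times for memory and disk; the name <code>job_time</code> is hypothetical.</p>
<pre>
```python
def job_time(block_cached, mem_time=1.0, disk_time=10.0):
    # One task per input block, all running in parallel in a single wave.
    # The job finishes only when its slowest task finishes.
    # mem_time and disk_time are illustrative values, not measurements.
    return max(mem_time if cached else disk_time for cached in block_cached)


if __name__ == "__main__":
    print("nothing cached:", job_time([False, False, False, False]))
    print("half cached:   ", job_time([True, True, False, False]))
    print("all cached:    ", job_time([True, True, True, True]))
```
</pre>
<p>In this sketch the half-cached job is exactly as slow as the uncached one, so the memory spent on its two cached blocks would have been better used to fully cache some other job.</p>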
<h2>Comments</h2>
<p>RAMCloud is a more general system than PACMan, but clearly, it is more expensive as well. RAMCloud trades off price for speed, but it is likely to be used in many future systems if prices of DRAM and high-speed network equipment keep going down. PACMan, at a high level, may seem to be a more short-term fix for existing clusters. However, the insight of all-or-nothing is important and will be useful even in the future. Also, PACMan can have a quicker impact because it does not ask for any investment to reap the possible gains.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/12/05/memory-management-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Confidentiality and Security in the Cloud</title>
		<link>http://www.mosharaf.com/blog/2011/11/27/confidentiality-and-security-in-the-cloud/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=confidentiality-and-security-in-the-cloud</link>
		<comments>http://www.mosharaf.com/blog/2011/11/27/confidentiality-and-security-in-the-cloud/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 01:09:33 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Balakrishnan]]></category>
		<category><![CDATA[CryptDB]]></category>
		<category><![CDATA[Popa]]></category>
		<category><![CDATA[Redfield]]></category>
		<category><![CDATA[Ristenpart]]></category>
		<category><![CDATA[Savage]]></category>
		<category><![CDATA[Shacham]]></category>
		<category><![CDATA[Tromer]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>
		<category><![CDATA[Zeldovich]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1975</guid>
		<description><![CDATA[Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan, &#8220;CryptDB: Protecting Confidentiality with Encrypted Query Processing,&#8221; SOSP, 2011. [PDF] Thomas Ristenpart, Eran Tromer, Hovav Shacham, Stefan Savage, &#8220;Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds,&#8221; CCS, 2009. [PDF] Summary With the increase in popularity of cloud computing [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan, &#8220;CryptDB: Protecting Confidentiality with Encrypted Query Processing,&#8221; <em>SOSP</em>, 2011. [<a href="http://people.csail.mit.edu/nickolai/papers/raluca-cryptdb.pdf">PDF</a>]</p>
<p class="alert">Thomas Ristenpart, Eran Tromer, Hovav Shacham, Stefan Savage, &#8220;Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds,&#8221; <em>CCS</em>, 2009. [<a href="http://cseweb.ucsd.edu/~hovav/dist/cloudsec.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>With the increase in popularity of cloud computing as a scalable, elastic, and cost-effective infrastructure solution, concerns about the security, privacy, and confidentiality of user data hosted on public clouds are also increasing. Curious administrators might breach trust, malicious entities can try to restrict/deny services, and adversaries might gain access to confidential data.</p>
<h3>CryptDB</h3>
<p>CryptDB stores user data in an SQL-aware encrypted form with multi-layered encryption onions. Each layer provides different levels of security and restricts execution of  SQL queries to limited sets. Depending on user queries, layers are dynamically ripped off one after another. Eventually, the database reaches a steady-state that strikes a balance between confidentiality of data and usability of the database. Encryption keys are chained together with user passwords to survive security breaches of both database and application servers.</p>
<h3>Hey You, Get Off of My Cloud!</h3>
<p>This paper discusses the risks of shared public clouds by demonstrating how an attacker can map the network topology of a cloud provider (e.g., Amazon EC2) to get a VM that co-resides with a victim VM and extract information from the victim. The goal is more to show that these risks existed in 2009 (it is questionable how big a risk they are, and how hard it is to avoid them) than to show how to address them.</p>
<h2>Comments</h2>
<p>CryptDB is undoubtedly the more practical of the two papers with a usable solution to a real problem. However, it has its weaknesses: CryptDB should require N times more space for N layers of the onion, creation/update of new onions with the change of user passwords and corresponding encryption key chains will be expensive, and for databases with mostly long-running and persistent connections, information of most users will be exposed when database and application servers are compromised.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/11/27/confidentiality-and-security-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graph-parallel frameworks</title>
		<link>http://www.mosharaf.com/blog/2011/11/18/graph-parallel-frameworks/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=graph-parallel-frameworks</link>
		<comments>http://www.mosharaf.com/blog/2011/11/18/graph-parallel-frameworks/#comments</comments>
		<pubDate>Sat, 19 Nov 2011 03:59:29 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[GraphLab]]></category>
		<category><![CDATA[Pregel]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1966</guid>
		<description><![CDATA[Google, &#8220;Pregel: A System for Large-Scale Graph Processing,&#8221; SIGMOD, 2010. [PDF] Carnegie Mellon, &#8220;GraphLab: A New Framework for Parallel Machine Learning,&#8221; arXiv:1006.4990, 2010. [PDF] Summary Data-parallel frameworks such as MapReduce and Dryad are good at performing embarrassingly parallel jobs. These frameworks are not ideal for iterative jobs and for jobs where data-dependencies across stages are [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Google, &#8220;Pregel: A System for Large-Scale Graph Processing,&#8221; <em>SIGMOD</em>, 2010. [<a href="http://kowshik.github.com/JPregel/pregel_paper.pdf">PDF</a>]</p>
<p class="alert">Carnegie Mellon, &#8220;GraphLab: A New Framework for Parallel Machine Learning,&#8221; <em>arXiv:1006.4990</em>, 2010. [<a href="http://arxiv.org/pdf/1006.4990v1">PDF</a>]</p>
<h2>Summary</h2>
<p>Data-parallel frameworks such as MapReduce and Dryad are good at performing embarrassingly parallel jobs. These frameworks are not ideal for iterative jobs or for jobs where data dependencies across stages are sparse (in MapReduce, by contrast, each reducer is likely to depend on every mapper). However, there are many problems, especially in machine learning, that can be intuitively expressed using graphs with sparse computational dependencies, require multiple iterations to converge, and have variable convergence rates for different parameters. Pregel and GraphLab are two frameworks optimized for this type of graph-based problem.</p>
<p>A typical graph-parallel problem is expressed using a graph with vertices and edges, where each vertex and edge has associated data. In every iteration, vertex and edge data are updated and a number of messages are exchanged between neighboring entities. The update function is typically the same for every vertex, and it is written by the user. There may or may not be a synchronization step at the end of every iteration. In a distributed setting, the graph is cut and divided across multiple nodes, and updates from a collection of vertices in one node are communicated to another using message passing.</p>
<h3>Pregel vs GraphLab</h3>
<p>The key difference between Pregel and GraphLab is that Pregel has a barrier at the end of every iteration, whereas GraphLab is completely asynchronous. Asynchrony in GraphLab allows it to prioritize more complex vertices over others, but it also calls for consistency models to maintain sanity of results. GraphLab proposes three consistency models: full, edge, and vertex consistency, to allow different levels of parallelism. Another difference is that Pregel allows dynamic modifications to the graph structure, whereas GraphLab does not.</p>
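<p>For intuition, here is a minimal single-machine sketch of the barrier-synchronized style, propagating the maximum value through a graph. It is my own illustration and not Pregel&#8217;s API: the function name, the dictionary-based graph, and the inbox/outbox buffers are all assumptions of the sketch.</p>
<pre>
```python
def propagate_max(values, edges):
    # values: dict mapping vertex to a number.
    # edges: dict mapping vertex to a list of out-neighbors.
    values = dict(values)

    # Superstep 0: every vertex sends its value to its neighbors.
    inbox = {v: [] for v in values}
    for v, val in values.items():
        for n in edges.get(v, []):
            inbox[n].append(val)
    supersteps = 1

    # Keep running supersteps while any message is in flight.
    while any(inbox.values()):
        outbox = {v: [] for v in values}
        for v, messages in inbox.items():
            if not messages:
                continue  # no incoming messages: the vertex stays idle
            best = max(messages)
            if best > values[v]:
                values[v] = best
                for n in edges.get(v, []):
                    outbox[n].append(best)
        # Barrier: messages sent in this superstep become visible
        # only in the next one.
        inbox = outbox
        supersteps += 1
    return values, supersteps


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(propagate_max({"a": 3, "b": 6, "c": 2}, graph))
```
</pre>
<p>Swapping the message buffers at the end of each loop iteration plays the role of the barrier. An asynchronous engine in the GraphLab style would instead let a vertex read its neighbors&#8217; latest values right away, which is where the consistency models come in.</p>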
<h2>Comments</h2>
<p>Pregel and GraphLab sit at two ends of the &#8220;power of framework&#8221; vs &#8220;ease of use&#8221; tradeoff space. Allowing asynchrony makes GraphLab more general and powerful than Pregel, but it is more complex and requires users to understand which consistency model is suitable for them. Pregel is simpler (common for most frameworks in Google&#8217;s arsenal), but still capable of handling a wide variety of problems. Given its origin at Google and open-source clones like Giraph, Pregel&#8217;s model is more likely to succeed in the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/11/18/graph-parallel-frameworks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

