<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Cirrus Minor</title>
	
	<link>http://arnon.me</link>
	<description>Musings of a holistic architect by Arnon Rotem-Gal-Oz</description>
	<lastBuildDate>Thu, 02 May 2013 09:20:10 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/CirrusMinor" /><feedburner:info uri="cirrusminor" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Fallacies of massively distributed computing</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/7IQzKVik8H4/</link>
		<comments>http://arnon.me/2013/04/fallacies-massively-distributed-computing/#comments</comments>
		<pubDate>Mon, 29 Apr 2013 04:01:39 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured Posts]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Software Architecture]]></category>
		<category><![CDATA[tradeoffs]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1178</guid>
		<description><![CDATA[<p>Over the last few years we have seen the advent of highly distributed systems. Systems that have clusters with lots of servers are no longer the sole realm of the Googles and Facebooks of the world, and we are beginning to see multi-node and big data systems in enterprises. For example, 5-6 years ago I don&#8217;t think a company such as Nice (the company I work for) would have released a Hadoop-based analytics platform and solutions, something we did just last week.</p>
<p>So now that large(r) clusters are more prevalent, I thought it would be a good time to reflect on the fallacies of distributed computing and whether they are still relevant or should be changed.
If you don&#8217;t know about the fallacies, you can see the list and read the article I wrote about them at the link mentioned above. In a few words ... <a href="http://arnon.me/2013/04/fallacies-massively-distributed-computing/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<p>Over the last few years we have seen the advent of highly distributed systems. Systems that have clusters with lots of servers are no longer the sole realm of the Googles and Facebooks of the world, and we are beginning to see multi-node and big data systems in enterprises. For example, 5-6 years ago I don&#8217;t think a company such as Nice (the company I work for) would have released a Hadoop-based analytics platform and solutions, <a href="http://www.cmswire.com/cms/customer-experience/nice-systems-launches-customer-engagement-analytics-platform-020640.php">something we did just last week</a>.</p>
<p>So now that large(r) clusters are more prevalent, I thought it would be a good time to reflect on the <a href="https://blogs.oracle.com/jag/resource/Fallacies.html">fallacies of distributed computing</a> and whether they are still relevant or should be changed.<br />
If you don&#8217;t know about the fallacies, you can see the list and read the article I wrote about them <a href="https://blogs.oracle.com/jag/resource/Fallacies.html">at the link mentioned above</a>. In a few words, I&#8217;d just say that these are statements, originally drafted by Peter Deutsch, Tom Lyon and others in 1991-2, about assumptions we are tempted to make when working on distributed systems, which turn out to be fallacies and cost us dearly.</p>
<p>So the fallacies help us keep in mind that distributed systems are different, and they do seem to hold, even after the 20 years that have passed. I think, however, that when working with larger clusters we should also consider the following 3 as fallacies we&#8217;re likely to assume:</p>
<ul>
<li><span style="font-size: 13px; line-height: 19px;">Instances are free</span></li>
<li><span style="font-size: 13px; line-height: 19px;">Instances have identities</span></li>
<li><span style="font-size: 13px; line-height: 19px;">Map/Reduce is a panacea</span></li>
</ul>
<p><strong>Instances are free</strong><br />
A lot of the new technologies of the big-data and NoSQL era bring with them the promise of massive scalability. If you see a performance problem, you can just (a famous <a href="http://www.ayeconference.com/lullaby-language/">lullaby word</a>) add another server. In most cases that is even true: you can indeed add more servers and get better performance. What these technologies don&#8217;t tell you is that instances have costs. More instances mean increased TCO, starting with management effort (monitoring, configuring etc.) and continuing with operations costs: the hardware itself, the rented space and electricity in a hosted solution, or the hourly usage in a cloud environment. So from the development side of the fence the solution is easy &#8211; add more hardware. In reality, it is sometimes better to make the effort and optimize your code/design. Just the other week we got a more than 10-fold improvement in query performance by removing query parts that were no longer needed after a change in the data flow of the system &#8211; that was way cheaper than adding 2-3 more nodes to achieve the same results.</p>
<p><strong>Instances have identities</strong><br />
I remember, sometime in the Jurassic age, when I set up a network for the first time (Novell NetWare 3.11, if you must ask), it had just one server. Naturally that server was treated with a lot of respect. It had a printer connected, it had a name, nobody could touch it but me. One server to rule all them clients. Moving on, I had server farms, and a list of random names began to be a problem, so we started to use themes like gods, single malts (&#8220;can you reboot the Macallan please&#8221;) etc. Anyway, that&#8217;s all nice and dandy, and if you are starting small with a (potentially) big data project you might be tempted to do something similar. If you are tempted &#8211; don&#8217;t. When you have tens of servers (and naturally even worse when you have hundreds or thousands) you no longer care about the individual server. You want to look at the world as pools of server types. You have a pool of data nodes in your Hadoop cluster, a pool of application servers, a pool of servers running configuration x and another with configuration y. You&#8217;d need tools like <a href="http://abiquo.org/display/ABI24/Abiquo+Concepts">abiquo</a> and/or <a href="http://www.opscode.com/chef/">chef</a> and/or <a href="http://ansible.cc/">ansible</a> or similar products to manage this mess. But again, you won&#8217;t care much about the XYZ2011 server, and even if it runs Tomcat today, tomorrow it may make more sense to make it part of the Cassandra cluster. What matters are the roles in the pools of resources and that the pool sizes are enough to handle the capacity needed.</p>
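<p>To make the &#8220;pools, not pets&#8221; idea a bit more concrete, here is a minimal Python sketch. It is not how abiquo, chef or ansible work; the host names, role names and required pool sizes are all made up for illustration. The point is only that capacity questions are asked per role, and that a server changes pools by changing its role, not its name.</p>

```python
from collections import defaultdict

# Hypothetical inventory: host names and roles are invented for this sketch.
inventory = [
    {"host": "xyz2011", "role": "tomcat"},
    {"host": "xyz2012", "role": "tomcat"},
    {"host": "xyz2013", "role": "tomcat"},
    {"host": "xyz2014", "role": "cassandra"},
    {"host": "xyz2015", "role": "hadoop-datanode"},
    {"host": "xyz2016", "role": "hadoop-datanode"},
]

# Hypothetical capacity plan: how many servers each pool needs.
required = {"tomcat": 2, "cassandra": 2, "hadoop-datanode": 2}


def pool_sizes(servers):
    """Count servers per role; individual host names play no part."""
    sizes = defaultdict(int)
    for server in servers:
        role = server["role"]
        sizes[role] += 1
    return dict(sizes)


def undersized_pools(servers, plan):
    """Return {role: shortfall} for every pool smaller than the plan asks for."""
    sizes = pool_sizes(servers)
    shortfalls = {}
    for role, needed in plan.items():
        have = sizes.get(role, 0)
        if needed > have:
            shortfalls[role] = needed - have
    return shortfalls


def reassign(servers, from_role, to_role, count):
    """Return a new inventory with `count` servers moved between pools."""
    moved = 0
    result = []
    for server in servers:
        if count > moved and server["role"] == from_role:
            result.append({"host": server["host"], "role": to_role})
            moved += 1
        else:
            result.append(dict(server))
    return result


# Today a box runs Tomcat, tomorrow it joins the Cassandra pool.
rebalanced = reassign(inventory, "tomcat", "cassandra", 1)
```

<p>With the numbers above, <code>undersized_pools(inventory, required)</code> reports that the cassandra pool is one server short, and after the reassignment no pool is undersized. Which physical box moved is irrelevant.</p>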
<p><strong>Map/Reduce is a panacea</strong><br />
Hadoop seems to be the VHS of large clusters. It might not be the ultimate solution, but it does seem to be the one that gets the most traction &#8211; a lot of vendors old (like IBM, Microsoft, Oracle etc.) and new (Hortonworks, Cloudera, Pivotal etc.) offer Hadoop distros and many other solutions offer Hadoop adaptors (MongoDB, Cassandra, Vertica etc.). And Hadoop, well, Hadoop is about the distributed file system and, well, map/reduce.<br />
Map/Reduce, which was <a href="http://research.google.com/archive/mapreduce.html">introduced in 2004 by Google</a>, is an efficient programming model for going over a large distributed data set without moving the data (map) and then producing aggregated or merged results (reduce). Map/Reduce is great and it is a very useful paradigm applicable to a large set of problems.<br />
However, it shouldn&#8217;t be the only tool in your tool set, as map/reduce is inefficient when there&#8217;s a need to do multiple iterations on the data (e.g. graph processing) or when you have to do many incremental updates to the data but don&#8217;t need to touch all of it. Also there&#8217;s the matter of ad-hoc reports (which I&#8217;ll probably blog about separately). Google solved these in <a href="http://googleresearch.blogspot.co.il/2009/06/large-scale-graph-computing-at-google.html">pregel</a>, <a href="http://research.google.com/pubs/pub36726.html">percolator</a> and <a href="http://research.google.com/pubs/pub36632.html">dremel</a> in 2009/2010, and now the rest of the world is playing catch-up as it did with map/reduce a few years ago &#8211; but even if the solutions are not mature yet, you should keep in mind that they are coming.</p>
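<p>For readers who have not written a map/reduce job, here is a minimal single-process Python sketch of the paradigm (word count, the usual example). It is not Hadoop code; the function names and the sample partitions are made up, and the in-memory <code>shuffle</code> stands in for what a real cluster does over the network. Note that a job always walks over all of its input, which is why iterative or incremental workloads are a poor fit.</p>

```python
from collections import defaultdict


def map_phase(record):
    """Map: runs where the record lives and emits (key, value) pairs."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group all values by key (done by the framework on a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: merge all the values seen for one key."""
    return key, sum(values)


def run_job(partitions):
    """Run one full pass: every record of every partition is mapped."""
    mapped = (
        pair
        for partition in partitions
        for record in partition
        for pair in map_phase(record)
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Two hypothetical partitions, as if stored on two different data nodes.
partitions = [
    ["big data big clusters"],
    ["big clusters need care"],
]
counts = run_job(partitions)
```

<p>An algorithm that needs ten iterations over the data would call <code>run_job</code> ten times, re-reading everything each time. That is the overhead the graph-processing and incremental-update systems mentioned above try to avoid.</p>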
<p>Instances are free; instances have identities; and map/reduce is a panacea &#8211; these are my suggested additions to the fallacies of distributed computing when talking about large clusters. I&#8217;d be happy to hear what you think and/or if there are other things to keep in mind that I&#8217;ve missed.</p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2013/04/fallacies-massively-distributed-computing/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<feedburner:origLink>http://arnon.me/2013/04/fallacies-massively-distributed-computing/</feedburner:origLink></item>
		<item>
		<title>SOA Patterns is “deal of the day” on Manning’s site (Apr. 14th)</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/gwEcvh1iJJM/</link>
		<comments>http://arnon.me/2013/04/soa-patterns-deal-day-mannings-site-apr-14th/#comments</comments>
		<pubDate>Sat, 13 Apr 2013 19:37:02 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured Posts]]></category>
		<category><![CDATA[SOA Patterns]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[soa patterns]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1156</guid>
		<description><![CDATA[
<p>I just got a notice from Manning that my book SOA patterns will be featured as &#8220;deal of the day&#8221; on Apr 14th &#8211; that means that it will be available for 50% off starting Midnight US ET of April 14th (and considering it&#8217;s a world-wide offer it would actually last for more than 24 hours).</p>
<p>To get the 50% discount use code dotd0414au at www.manning.com/rotem</p>
<p>If you&#8217;re not familiar with my book (which I guess is unlikely if you&#8217;re reading my blog, but anyway), you might want to check out the SOA Patterns page on my site, read one or more of the pattern drafts or check out the book reviews.</p>
<p>Reviews of SOA patterns
</p>

Cameron McKenzie @ TheServerSide.com
Tad Anderson @ Java Developers Journal
Roberto Casadei @ robertocasadei.it
Colin Jack @ losTechies (half a book review)
Jan Van Ryswyck @ ElegantCode.com (half a book review)
Karsten Strøbæk @ ... <a href="http://arnon.me/2013/04/soa-patterns-deal-day-mannings-site-apr-14th/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<div class="aside"><a href="http://www.manning.com/rotem"><img class=" wp-image-5 alignleft" alt="SOAPatterns" src="http://arnon.me/wp-content/uploads/2010/05/SOAPatterns.png" width="150" height="188" /></a></div>
<p>I just got a notice from Manning that my book <a href="http://www.manning.com/rotem/">SOA patterns</a> will be featured as &#8220;deal of the day&#8221; on Apr 14th &#8211; that means that it will be available for 50% off starting Midnight US ET of April 14th (and considering it&#8217;s a world-wide offer it would actually last for more than 24 hours).</p>
<p>To get the 50% discount use code <strong>dotd0414au</strong> at <a href="http://www.manning.com/rotem">www.manning.com/rotem</a></p>
<p>If you&#8217;re not familiar with my book (which I guess is unlikely if you&#8217;re reading my blog, but anyway), you might want to check out the <a title="SOA Patterns" href="http://arnon.me/soa-patterns/">SOA Patterns page</a> on my site, read one or more of the pattern drafts or check out the book reviews.</p>
<p><strong>Reviews of SOA patterns<br />
</strong></p>
<ul>
<li><a href="http://www.theserverside.com/feature/SOA-Patterns-Solve-Recurring-Distributed-Programming-Problems">Cameron McKenzie @ TheServerSide.com</a></li>
<li><a href="http://java.sys-con.com/node/2383897">Tad Anderson @ Java Developers Journal</a></li>
<li><a href="http://blog.robertocasadei.it/2013/04/review-soa-patterns/">Roberto Casadei @ robertocasadei.it</a></li>
<li><a href="http://lostechies.com/colinjack/2009/05/24/book-review-soa-patterns-first-5-chapters/">Colin Jack @ losTechies</a> (half a book review)</li>
<li><a href="http://elegantcode.com/2009/04/18/half-a-book-review-soa-patterns/">Jan Van Ryswyck @ ElegantCode.com</a> (half a book review)</li>
<li><a href="http://blog.strobaek.org/2012/04/23/review-of-books-soa-patterns-by-arnon-rotem-gal-oz/">Karsten Strøbæk @ strobaek.org</a></li>
<li><a href="http://www.amazon.com/SOA-Patterns-Arnon-Rotem-Gal-Oz/product-reviews/1933988266/ref=cm_cr_dp_see_all_btm?ie=UTF8&amp;showViewpoints=1&amp;sortBy=bySubmissionDateDescending">Reviews on Amazon</a></li>
</ul>
<p><strong>Patterns Drafts:<br />
</strong></p>
<ul>
<li><a href="http://rgoarchitects.bit.ly/6wTGjM">Edge Component (pdf) </a></li>
<li><a href="http://rgoarchitects.bit.ly/5LdAcf">Gridable Service (pdf) </a></li>
<li><a href="http://rgoarchitects.bit.ly/4IePlB">Service Firewall (html @ InfoQ) </a></li>
<li><a href="http://rgoarchitects.bit.ly/4xNwpp">Saga (pdf) </a></li>
<li><a href="http://rgoarchitects.bit.ly/5VvPNT">The Knot Antipattern (pdf) </a></li>
<li><a href="http://rgoarchitects.bit.ly/5XguD8">Blogjecting Watchdog (pdf) </a></li>
<li><a href="http://rgoarchitects.bit.ly/5NILfx">Reservation (pdf) </a></li>
<li><a href="http://arnon.me/wp-content/uploads/2010/10/TransactionalIntegration.pdf">Transactional Integration anti-pattern (pdf)</a></li>
<li><a href="http://arnon.me/wp-content/uploads/2010/10/Nanoservices.pdf">Nanoservices anti-pattern (pdf)</a></li>
<li><a href="http://arnon.me/wp-content/uploads/2011/10/Composite-Frontend.pdf">Composite Frontend pattern</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2013/04/soa-patterns-deal-day-mannings-site-apr-14th/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://arnon.me/2013/04/soa-patterns-deal-day-mannings-site-apr-14th/</feedburner:origLink></item>
		<item>
		<title>It’s open source, so the source, you know, is open…</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/bHxuKY4jMsQ/</link>
		<comments>http://arnon.me/2013/04/open-source-source-open/#comments</comments>
		<pubDate>Mon, 01 Apr 2013 05:53:57 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Lesson learned]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[tidbits]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1124</guid>
		<description><![CDATA[
<p>Even though I mostly sit at work trying to look busy, every so often someone does stumble into my office with a question or a problem, so I&#8217;ve got to do something.</p>
<p>Interestingly enough, a lot of problems can be handled by some pretty basic stuff, like reminding people that a .jar/.war file is a zip file and you can take a look inside for what&#8217;s there or what&#8217;s missing; or sending people to read the log files (turns out these buggers actually contain useful information) etc. &#8211; so now for today&#8217;s lesson: &#8220;It&#8217;s open source, so the source, you know, is open&#8230;&#8221;</p>
<p>We use a lot of open source projects at Nice (we&#8217;re also, slowly, starting to give something back to the community but that&#8217;s another story). One of these is HBase. One of our devs was working on enabling ... <a href="http://arnon.me/2013/04/open-source-source-open/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<div class="aside"><img class="alignleft" alt="" src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Opensource.svg/500px-Opensource.svg.png" width="240" height="216" /></div>
<p>Even though I mostly sit at work <a href="http://www.youtube.com/watch?v=wC8PzhNuh7w">trying to look busy</a>, every so often someone does stumble into my office with a question or a problem, so I&#8217;ve got to do something.</p>
<p>Interestingly enough, a lot of problems can be handled by some pretty basic stuff, like reminding people that a .jar/.war file is a zip file and you can take a look inside for what&#8217;s there or what&#8217;s missing; or sending people to read the log files (turns out these buggers actually contain useful information) etc. &#8211; so now for today&#8217;s lesson: &#8220;It&#8217;s open source, so the source, you know, is open&#8230;&#8221;</p>
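<p>As a small illustration of the &#8220;a jar is just a zip&#8221; reminder, here is a sketch using Python&#8217;s standard <code>zipfile</code> module. The helper names and the paths in the usage note are made up for the example; <code>unzip -l</code> or <code>jar tf</code> on the command line would do the same job.</p>

```python
import zipfile


def list_archive(path):
    """Return the entry names inside a .jar/.war (or any other zip) file."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def missing_entries(path, expected):
    """Return the expected entries that are not present in the archive."""
    present = set(list_archive(path))
    return [name for name in expected if name not in present]
```

<p>For example, <code>missing_entries("app.war", ["WEB-INF/web.xml"])</code> (a hypothetical archive and entry) returns an empty list when the descriptor is packaged, and the missing names otherwise.</p>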
<p>We use a lot of open source projects at <a href="http://www.nice.com">Nice</a> (we&#8217;re also, slowly, <a href="https://github.com/nicesystems">starting to give something back to the community</a> but that&#8217;s another story). One of these is HBase. One of our devs was working on enabling and testing <a href="http://hbase.apache.org/book/compression.html">compression on HBase</a>. Looking at the HBaseAdmin API (actually <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html#COMPRESSION_COMPACT">the column descriptor</a>) he saw there was an option for setting the compression of a column family and an option for setting the compression of compaction. The question he came with was whether I knew how HBase behaves when you set one and not the other, and how the two work together.</p>
<p>Well, I knew about HBase compression but I hadn&#8217;t heard about compaction compression, and the documentation on this is, well, lacking. Luckily HBase is an open source project, so I took a peek. I started with <a href="http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.94.0/org/apache/hadoop/hbase/io/hfile/HFile.java/">HFile.java</a>, which reads and writes HBase data to Hadoop. Well, it seems that the writer gets a compression algorithm as a parameter and that the reader gets the compression algorithm from the header. So essentially different HFiles can have different compressions and HBase will not care. We start to see the picture, but to be sure we need to see where the compression is set. So we look in the regionserver&#8217;s <a href="http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.94.1/org/apache/hadoop/hbase/regionserver/Store.java?av=f">Store.java</a> file and we see:</p>
<script src="https://gist.github.com/d04a6a6a34d7c6c82e65.js?file=gistfile1.java"></script><noscript><p>View the code on <a href="https://gist.github.com/d04a6a6a34d7c6c82e65">Gist</a>.</p></noscript>
<p>Bottom line: by reading through the HBase code I was able to understand exactly how the feature in question behaves and also get a better understanding of the internal workings of HBase (HFiles describe their own structure, so different files can have different attributes like compression etc.)</p>
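<p>The &#8220;files describe their own structure&#8221; idea can be sketched in a few lines of Python. To be clear, this is not HBase&#8217;s actual HFile layout; the 4-byte codec tags and the length field are invented for the illustration. It only shows why a reader does not need to be told which compression was used when each file records it.</p>

```python
import bz2
import struct
import zlib

# Hypothetical 4-byte codec tags mapped to (compress, decompress) functions.
CODECS = {
    b"NONE": (lambda data: data, lambda data: data),
    b"ZLIB": (zlib.compress, zlib.decompress),
    b"BZ2_": (bz2.compress, bz2.decompress),
}


def write_blob(payload, codec):
    """Writer: gets the codec as a parameter and records it in the header."""
    compress, _ = CODECS[codec]
    body = compress(payload)
    # Header = codec tag + big-endian body length.
    return codec + struct.pack(">I", len(body)) + body


def read_blob(blob):
    """Reader: learns the codec from the header, not from configuration."""
    codec = blob[:4]
    (length,) = struct.unpack(">I", blob[4:8])
    _, decompress = CODECS[codec]
    return decompress(blob[8:8 + length])
```

<p>Files written with different codecs can sit side by side and <code>read_blob</code> handles each of them, which mirrors the behaviour described above where changing a compression setting affects newly written files without breaking old ones.</p>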
<p>Another example of how reading code can help is Yammer&#8217;s monitoring library <a href="http://metrics.codahale.com/">metrics</a>. While building the monitoring solution for our platform we also collect JMX counters (like everybody else I guess :) ). So I stumbled upon metrics, and the <a href="http://metrics.codahale.com/manual/core/">manual</a> did a good job of showing the different features and explaining why this is an interesting library. I asked one of our architects to POC it and see if it is a good fit. He tried, but it so happens that it is rather hard to understand how to put everything together and actually use it just from the documentation. Luckily the metrics code has unit tests (not for all of it by the way, which is a shame, but at least for enough of it), e.g. the following (taken from <a href="https://github.com/codahale/metrics/tree/master/metrics-jersey/src/test/java/com/yammer/metrics/jersey/tests">here</a>), which shows how to instrument a Jersey service:</p>
<script src="https://gist.github.com/5283401.js"></script><noscript><p>View the code on <a href="https://gist.github.com/5283401">Gist</a>.</p></noscript>
<p>Again, we see that having the code available is a great benefit. You don&#8217;t have to rely on the documentation being complete (something we all do so well, but those other people writing code don&#8217;t, so, you know&#8230;) or hope for a good Samaritan to help you on Stack Overflow or some other forum. And that&#8217;s just from reading the code&#8230; imagine what you could do if you could actually offer fixes to problems you encounter. But, oh wait, you can&#8230;</p>
<p>Ok, I think I&#8217;ve done enough for today, got to get back to trying to look busy.</p>
<hr />
<p>illustration licensed under creative commons attribution 2.5 by <a href="http://opensource.org/">opensource.org</a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2013/04/open-source-source-open/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://arnon.me/2013/04/open-source-source-open/</feedburner:origLink></item>
		<item>
		<title>Herding Apache Pig – using pig with perl and python</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/yul7HNxhf0I/</link>
		<comments>http://arnon.me/2013/03/herding-apache-pig-pig-perl-python/#comments</comments>
		<pubDate>Mon, 04 Mar 2013 07:50:43 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured Posts]]></category>
		<category><![CDATA[Apache Pig]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hadoop streaming]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[pitfalls]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[UDF]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1102</guid>
		<description><![CDATA[
<p>In the past week or so we got some new data that we had to process quickly. There are quite a few technologies out there to quickly churn out map/reduce jobs on Hadoop (Cascading, Hive, Crunch, Jaql to name a few of many); my personal favorite is Apache Pig. I find that the imperative nature of pig makes it relatively easy to understand what&#8217;s going on and where the data is going, and that it produces efficient enough map/reduces. On the down side, pig lacks control structures, so working with pig also means you need to extend it with user defined functions (UDFs) or Hadoop streaming. Usually I use Java or Scala for writing UDFs but it is always nice to try something new, so we decided to check out some other technologies &#8211; namely perl and python. This post highlights some of ... <a href="http://arnon.me/2013/03/herding-apache-pig-pig-perl-python/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<div class="aside"><a href="http://arnon.me/wp-content/uploads/2012/04/Spider-Pig-spiderpig-236326_1024_768.jpg"><img class=" wp-image-611 alignleft" alt="duh" src="http://arnon.me/wp-content/uploads/2012/04/Spider-Pig-spiderpig-236326_1024_768-300x225.jpg" width="300" height="225" /></a></div>
<p>In the past week or so we got some new data that we had to process quickly. There are quite a few technologies out there to quickly churn out map/reduce jobs on Hadoop (<a href="http://www.cascading.org/">Cascading</a>, <a href="http://hive.apache.org/">Hive</a>, <a href="https://github.com/cloudera/crunch">Crunch</a>, <a href="https://code.google.com/p/jaql/">Jaql</a> to name a few of many); my personal favorite is <a href="http://pig.apache.org/">Apache Pig</a>. I find that the imperative nature of pig makes it relatively easy to understand what&#8217;s going on and where the data is going, and that it produces efficient enough map/reduces. On the down side, pig lacks control structures, so working with pig also means you need to extend it with user defined functions (UDFs) or <a href="http://wiki.apache.org/hadoop/HadoopStreaming">Hadoop streaming</a>. Usually I use Java or Scala for writing UDFs but it is always nice to try something new, so we decided to check out some other technologies &#8211; namely perl and python. This post highlights some of the pitfalls we met and how to work around them.</p>
<p>Yuval, who was working with me on this mini-project, likes perl (to each his own, I suppose), so we started with that. Searching for pig and perl examples, we found something like the following:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="pig" style="font-family:monospace;">A = LOAD 'data';
B = STREAM A THROUGH `stream.pl`;</pre></td></tr></table></div>

<p>The first pitfall here is that the perl script name is surrounded by backticks (the character on the tilde (~) key) and not by single quotes (so in the script above 'data' is surrounded by single quotes and `stream.pl` is surrounded by backticks).</p>
<p>The second pitfall was that the code above works nicely when you use pig in local mode (pig -x local) but it failed when we tried to run it on the cluster. It took some head scratching and some trial and error, but eventually Yuval came up with the following:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="pig" style="font-family:monospace;">DEFINE CMD `perl stream.pl` SHIP ('/PATH/stream.pl');
A = LOAD 'data';
B = STREAM A THROUGH CMD;</pre></td></tr></table></div>

<p>Basically we&#8217;re telling pig to copy the perl script to HDFS so that it is accessible on all the nodes.</p>
<p>So, perl worked pretty well, but since we&#8217;re using Hadoop Streaming and get the data via stdin, we lose all the context of the data that pig knows. We also need to emulate the textual representations of bags and tuples so the returned data will be available to pig for further work. This is all workable but not fun to work with (in my opinion anyway).</p>
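<p>To show what &#8220;emulating the textual representation&#8221; looks like in practice, here is a minimal sketch of a streaming script. It is written in Python rather than perl, it is not the script we actually used, and it assumes Pig&#8217;s default streaming format of one record per line with tab-separated fields, plus the parenthesized, comma-separated notation for tuples. The function names are made up.</p>

```python
import sys


def to_pig_tuple(fields):
    """Render a list of strings in the textual tuple notation: (a,b,c)."""
    return "(" + ",".join(fields) + ")"


def transform(line):
    """Turn 'key<TAB>f1<TAB>f2...' into 'key<TAB>(f1,f2,...)'.

    All we get from stdin is text: no field names, no types, no schema.
    """
    fields = line.rstrip("\n").split("\t")
    key, rest = fields[0], fields[1:]
    return key + "\t" + to_pig_tuple(rest)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin and write transformed records to stdout."""
    for line in stdin:
        stdout.write(transform(line) + "\n")

# When used as a real streaming script, call main() here at the bottom.
```

<p>Everything the script knows about the data is whatever it can guess from the position of a field in the line, and it is up to the script to print output that Pig can parse back. That is the context loss mentioned above.</p>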
<p>I decided to write pig UDFs in python. Python can be used with Hadoop streaming, like perl above, but it also integrates more tightly with Pig via Jython (i.e. the python UDF is compiled into java and shipped to the cluster as part of the jar pig generates for the map/reduce anyway).</p>
<p>Pig UDFs are better than streaming as you get Pig&#8217;s schema for the parameters and you can tell Pig the schema you return for your output. UDFs in python are especially nice as the code is almost 100% regular python and Pig does the mapping for you (for instance a bag of tuples in pig is translated to a list of tuples in python etc.). Actually the only difference is that if you want Pig to know about the data types you return from the python code you need to annotate the method with @outputSchema. For example, here is a simple UDF that gets the month as an int from a date string in the format YYYY-MM-DD HH:MM:SS:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">@</span>outputSchema<span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;num:int&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">def</span> getMonth<span style="color: black;">&#40;</span>strDate<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">try</span>:
        dt<span style="color: #66cc66;">,</span> _<span style="color: #66cc66;">,</span> _ <span style="color: #66cc66;">=</span> strDate.<span style="color: black;">partition</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;.&quot;</span><span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #dc143c;">datetime</span>.<span style="color: black;">strptime</span><span style="color: black;">&#40;</span>dt<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">&quot;%Y-%m-%d %H:%M:%S&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">month</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">AttributeError</span>:
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff4500;">0</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">IndexError</span>:
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff4500;">0</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">ValueError</span>:
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff4500;">0</span></pre></td></tr></table></div>
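<p>If you want to try a UDF like this outside of Pig, a self-contained variant along the following lines should do. This is only a sketch: as far as I understand, Pig supplies the outputSchema decorator when the file is registered via jython, so the stand-in defined here is an assumption that exists purely so the file also runs under a plain python interpreter.</p>

```python
from datetime import datetime

try:
    outputSchema  # expected to be supplied by Pig when registered via jython
except NameError:
    def outputSchema(schema):
        """Stand-in decorator for local runs: returns the function unchanged."""
        def wrap(func):
            return func
        return wrap


@outputSchema("num:int")
def getMonth(strDate):
    """Return the month of a 'YYYY-MM-DD HH:MM:SS[.fraction]' string, or 0."""
    try:
        # Drop an optional fractional-seconds suffix before parsing.
        dt, _, _ = strDate.partition(".")
        return datetime.strptime(dt, "%Y-%m-%d %H:%M:%S").month
    except (AttributeError, IndexError, ValueError):
        # None input or an unparsable string both map to 0.
        return 0
```

<p>With this in place, getMonth(&quot;2013-03-15 10:20:30.123&quot;) can be checked from a regular python prompt before the script ever reaches the cluster.</p>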

<p>Using the UDF is as simple as declaring the python file where the UDF is defined. Assuming our UDF is in a file called utils.py, it would be declared as follows:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
</pre></td><td class="code"><pre class="pig" style="font-family:monospace;">Register utils.py using jython as utils;</pre></td></tr></table></div>

<p>And then using that UDF would go something like:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
</pre></td><td class="code"><pre class="pig" style="font-family:monospace;">A = LOAD 'data' using PigStorage('|') as (dateString:chararray);
B = FOREACH A GENERATE utils.getMonth(dateString) as month;</pre></td></tr></table></div>

<p>Again, like in the perl case, there are a few pitfalls here. For one, the python script and the pig script need to be in the same directory (relative paths only work in local mode). The more annoying pitfall hit me when I wanted to import some python libs (e.g. datetime in the example, which is imported using &#8220;from datetime import datetime&#8221;). I couldn&#8217;t find a straightforward way to make this work. The solution I did come up with eventually was to take a jython standalone .jar (a jar with the common python libraries included) and replace Pig&#8217;s jython jar (in the pig lib directory) with the standalone one. There&#8217;s probably a nicer way to do this (and I&#8217;d be happy to hear about it) but this worked for me. It only has to be done on the machine where you run the pig script, as the python code gets compiled and shipped to the cluster as part of the jar file Pig generates anyway.</p>
<p>Working with Pig and python has been really nice. I liked writing Pig UDFs in python much more than writing them in Java, or Scala for that matter. There are two main reasons for that: a lot of the Java cruft for integrating with Pig is just not there, so I can focus on solving the business problem, and with both Pig and python being &#8220;scripts&#8221;, the feedback loop from making a change to seeing it work is much shorter. Anyway, Pig also supports Javascript and Ruby UDFs but these will have to wait for next time :)</p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2013/03/herding-apache-pig-pig-perl-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://arnon.me/2013/03/herding-apache-pig-pig-perl-python/</feedburner:origLink></item>
		<item>
		<title>The Saga pattern and that architecture vs. design thing</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/nWoUUgnsZ0Y/</link>
		<comments>http://arnon.me/2013/01/saga-pattern-architecture-design/#comments</comments>
		<pubDate>Wed, 23 Jan 2013 22:43:57 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured Posts]]></category>
		<category><![CDATA[SOA Patterns]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[Design]]></category>
		<category><![CDATA[saga]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[soa patterns]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1078</guid>
		<description><![CDATA[
<p>It has been a few months since SOA Patterns was published and so far the book has sold somewhere between 2K-3K copies, which I guess is not bad for an unknown author &#8211; so first off, thanks to all of you who bought a copy (by the way, if you found the book useful I&#8217;d be grateful if you could also rate it on Amazon so that others would know about it too).</p>
<p>I know at least a few of you actually read the book as from time to time I get questions about it :). Not all the questions are interesting to &#8220;the general public&#8221; but some are. One interesting question I got is about the so-called &#8220;Canonical schema pattern&#8221;. I have a post in the making (for too long now, sorry about that Bill) that explains why I don&#8217;t consider it ... <a href="http://arnon.me/2013/01/saga-pattern-architecture-design/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<div class="aside"><a href="http://arnon.me/2013/01/saga-pattern-architecture-design/oslo_school_of_architecture_and_design/" rel="attachment wp-att-1095"><img class=" wp-image-1095 alignleft" alt="Oslo_school_of_architecture_and_design" src="http://arnon.me/wp-content/uploads/2013/01/Oslo_school_of_architecture_and_design-300x199.jpg" width="300" height="199" /></a></div>
<p>It has been a few months since <a href="http://www.manning.com/rotem/">SOA Patterns</a> was published and so far the book has sold somewhere between 2K-3K copies, which I guess is not bad for an unknown author &#8211; so first off, thanks to all of you who bought a copy (by the way, if you found the book useful I&#8217;d be grateful if you could also <a href="http://www.amazon.com/dp/1933988266">rate it on Amazon</a> so that others would know about it too).</p>
<p>I know at least a few of you actually read the book as from time to time I get questions about it :). Not all the questions are interesting to &#8220;the general public&#8221; but some are. One interesting question I got is about the so-called &#8220;<a href="http://en.wikipedia.org/wiki/Canonical_schema_pattern">Canonical schema pattern</a>&#8221;. I have a post in the making (for too long now, sorry about that Bill) that explains why I don&#8217;t consider it a pattern and why I think it verges on being an anti-pattern. Another question I got more recently, which is also the subject of this post, was about the <a href="http://arnon.me/soa-patterns/saga">Saga pattern</a>.<br />
Here is (most of) the email I got from Ashic:</p>
<blockquote><p>&#8220;Garcia-Molina&#8217;s paper focuses on failure management and compensation so as to prevent partial success. It discusses a variety of approaches &#8211; with an SEC, with application code outside of the database, backward-forward and even forward-only (the latter having no &#8220;compensate&#8221; step per activity, rather a forward flow that takes care of the partial success). Nowadays, I see two viewpoints regarding sagas:</p>
<p>1. People calling process managers sagas, which is obviously incorrect. [e.g. NServiceBus "sagas".]<br />
2. People focusing very strongly on a &#8220;context&#8221; of work, whereby the context gets passed around from activity to activity. For linear up front workflows, routing slips are an easy solution. An example of this can be found at Clemens&#8217;s post here: http://vasters.com/clemensv/2012/09/01/Sagas.aspx . For more complicated workflows, graph-like slips may be used.</p>
<p>After discussing with some enthusiasts, they seem very keen to suggest that the context has to move along. They seem to reject the notion of a saga where a central coordinator controls the process. In other words, even if a process manager takes care of only routing messages, and that routing includes compensations to alleviate partial successes, they are unwilling (sometimes vehemently) to call that a saga. They acknowledge it can be useful, but say that is not a saga. I find this to be confusing. In this case the process manager acts as the SEC would in a Garcia-Molina saga capable database. This approach still allows interleaved transactions (or steps) without a global lock. Why would this not be a saga?</p>
<p>In your book, I did see you mentioned orchestration as a way of implementing sagas. However, when this was brought up, the proponents of point 2 suggest that that is not what you really mean. To me it seems quite clear, and it aligns with Hector&#8217;s paper. I just want to make sure I have this right. I&#8217;d love your thoughts on this.&#8221;</p></blockquote>
<p>Let&#8217;s start with the answer to the question:</p>
<p>When I think about the Saga pattern I see it as the application of the notions in the Garcia-Molina paper (which talked about databases) to SOA. In other words, I see sagas as the notion of getting distributed agreement on a process with reduced guarantees (vs. distributed transactions that promise ACID guarantees across systems). So, basically, a Saga is a loose transaction-like flow where, in case of failures, the involved services perform compensation steps (which may be nothing, a complete undo or something else entirely). The Saga pattern can augment this process with temporary promises (which I call <a href="http://arnon.me/soa-patterns/reservation/">reservations</a>).</p>
<p>Under this definition both centrally managed processes and &#8220;choreographed&#8221; processes are Sagas &#8211; as long as the semantics and intent mentioned above are kept. The centrally managed orchestration provides visibility of processes, ease of management etc. The cooperative, event-based, context-sharing sagas provide flexibility and allow serendipity of new processes. Both have merit and both have a place, at least in my opinion :)</p>
<p>The main reason both of these, very different, approaches are valid designs and implementations for the Saga pattern is that the Saga pattern (like others in the book) is an Architectural pattern and not a Design pattern. Which brings us to the second reason for this post, the difference between &#8220;Architecture&#8221; and &#8220;Design&#8221;. In a nutshell, architecture is a type of design where the focus is quality attributes and wide(er) scope whereas design focuses on functional requirements and more localized concerns.</p>
<p>The Saga pattern is an architectural pattern that focuses on the integrity and reliability quality attributes and pertains to the communication patterns between services. When it comes to designing the implementation of the pattern, you need to decide how to implement the concerns and roles defined in the pattern, e.g. controlling the flow and the status of the saga. One decision can be to implement it centrally and use orchestration; another decision can be to decentralize it and use context&#8230;</p>
<p>Design decisions can be so significant that sometimes it is hard to find what&#8217;s left of the architecture &#8211; consider for example the whole idea behind blogging and RSS feeds. The architectural notion is a publish/subscribe system where the blog writer publishes an &#8220;event&#8221; (a new post) and subscribers get a copy. When it came to design and implementation, considering it was implemented on top of HTTP and REST, where there is no publish/subscribe capability, it was actually designed as a pull system where the publisher provides a list of recent changes (the feed) and subscribers sample it and check if anything changed since the last time. So architecturally it is pub/sub, while the design is a pull from a centralized server that exposes the latest changes &#8211; a really big difference.</p>
<p>Does it matter at all? I think yes. Architecture lets us think about the system at a higher level of abstraction and thus tackle more complex systems. When we design and focus on more local issues we can tackle the nitty-gritty details and make sure things actually work. We need to check the effects of design on architecture and vice versa to make sure the whole thing sticks together and actually does what we want/need.</p>
<p>Note that architecture and design are not the complete story &#8211; another variable is the technology (e.g. HTTP in the example above), which affects the design decisions and thus also the architecture (you can read a little more about it in my <a href="http://arnon.me/saf">posts on SAF</a>).</p>
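<p>To make the centrally coordinated option a bit more concrete, here is a minimal python sketch of a coordinator that runs the steps in order and, on failure, asks the steps that already completed to compensate. The names SagaStep, run_saga and demo, as well as the hotel/payment scenario, are hypothetical and only meant as an illustration of one possible design, not as the implementation of the pattern.</p>

```python
class SagaStep:
    """One activity of the saga together with its compensation."""

    def __init__(self, name, action, compensate):
        self.name = name
        self.action = action
        self.compensate = compensate


def run_saga(steps, log):
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed = []
    for step in steps:
        try:
            step.action()
            log.append("done:" + step.name)
            completed.append(step)
        except Exception:
            log.append("failed:" + step.name)
            for done in reversed(completed):
                # A compensation may be a full undo, something else, or nothing.
                done.compensate()
                log.append("compensated:" + done.name)
            return False
    return True


def _fail():
    raise RuntimeError("payment service unavailable")


def demo():
    """Hypothetical flow: the room is reserved, then charging the card fails."""
    log = []
    steps = [
        SagaStep("reserve_room", lambda: None, lambda: None),
        SagaStep("charge_card", _fail, lambda: None),
    ]
    ok = run_saga(steps, log)
    return ok, log
```

<p>A decentralized design would keep the same semantics but carry the list of completed steps along with the message (the context) instead of holding it in a coordinator.</p>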
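<p>The pull side of that design can be sketched in a few lines of python. This is an illustration under assumptions: the function name new_items is made up, the feed is passed in as a string rather than fetched over HTTP, and items are identified by their guid element only.</p>

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen_guids):
    """Return titles of items whose guid was not seen on a previous poll.

    seen_guids is a set owned by the subscriber; it is updated in place so
    that the next poll only reports what changed since this one.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid is not None and guid not in seen_guids:
            fresh.append(item.findtext("title"))
            seen_guids.add(guid)
    return fresh
```

<p>The subscriber has to remember what it has already seen and call this on a schedule; the publisher never knows who is listening, which is exactly where the design departs from the pub/sub architecture.</p>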
<hr />
<p>illustration by <a href="http://www.flickr.com/photos/boedker/3872055286/">Mads Boedker</a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2013/01/saga-pattern-architecture-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://arnon.me/2013/01/saga-pattern-architecture-design/</feedburner:origLink></item>
		<item>
		<title>Killing the HBase zombie table</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/MkH9dvEwQJo/</link>
		<comments>http://arnon.me/2013/01/killing-hbase-zombie-table/#comments</comments>
		<pubDate>Tue, 15 Jan 2013 21:22:24 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[hbase]]></category>
		<category><![CDATA[Lesson learned]]></category>
		<category><![CDATA[troublshooting]]></category>
		<category><![CDATA[ZooKeeper]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1068</guid>
		<description><![CDATA[
<p>One of our team leaders approached me in the hall today and asked if I could lend a hand in troubleshooting something. He and our QA lead were configuring one of our test Hadoop clusters after an upgrade and they had a problem with one table they were trying to set up:</p>

When they tried to create the table in HBase shell they got an error that the table exists
When they tried to delete the table they got an error that the table does not exist
HBase ships with a health-check and fix util called hbck (use: hbase hbck to run. see here for details) &#8211; they ran it and it reported that everything was fine and dandy

<p>Hmm, the first thing I tried to do was to look at the .META. table. This is where HBase keeps the tables and the regions they use. I ... <a href="http://arnon.me/2013/01/killing-hbase-zombie-table/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<div class="aside"><a href="http://arnon.me/2013/01/killing-hbase-zombie-table/zombie/" rel="attachment wp-att-1069"><img class="size-full wp-image-1069 alignleft" alt="zombie" src="http://arnon.me/wp-content/uploads/2013/01/zombie.jpg" width="240" height="240" /></a></div>
<p>One of our team leaders approached me in the hall today and asked if I could lend a hand in troubleshooting something. He and our QA lead were configuring one of our test Hadoop clusters after an upgrade and they had a problem with one table they were trying to set up:</p>
<ul>
<li>When they tried to create the table in HBase shell they got an error that the table exists</li>
<li>When they tried to delete the table they got an error that the table does not exist</li>
<li>HBase ships with a health-check and fix util called hbck (use <em>hbase hbck</em> to run it; see <a style="font-size: 13px; line-height: 19px;" href="http://hbase.apache.org/book/hbck.in.depth.html">here</a> for details) &#8211; they ran it and it reported that everything was fine and dandy</li>
</ul>
<p>Hmm, the first thing I tried to do was to look at the .META. table. This is where HBase keeps the tables and the regions they use. I thought maybe there was some junk there, but it didn&#8217;t look like that. I tried to do a major compaction for it and that didn&#8217;t help either.</p>
<p>The next thing I tried actually found the problem. I ran the ZooKeeper client (I used <em>hbase zkcli</em> but you can also run it via the ZooKeeper scripts) and looked at /hbase/table (<em>ls /hbase/table</em>) &#8211; the zombie table was listed right there with all the legit tables. HBase stores some schema data and the state of each table in ZooKeeper to be able to coordinate between all the regionservers, and it seems that during the upgrade process the system was restarted a few times. One of these restarts coincided with a removal of the table and caught it in the middle.</p>
<p>Ok, so that is the problem &#8211; what&#8217;s the solution? Simple: just remove the offending znode from ZooKeeper (<em>rmr /hbase/table/TABLE_NAME</em>) and restart the cluster (since the data is cached in the regionservers/HBase master to save trips to ZooKeeper). Also be careful not to remove any other node or you&#8217;ll cause problems for other tables.</p>
<p>The role of ZooKeeper in HBase is not documented very well. The only online <a href="http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases">account of ZooKeeper&#8217;s role with HBase</a> I found (save looking at the code itself of course) is really outdated. Hopefully this post will save some head scratching and time for others who find themselves with the same problem.</p>
<p>Anyway, I hope the next post I&#8217;ll do on <a href="http://zookeeper.apache.org/">ZooKeeper</a> will be about something <a href="https://github.com/NiceSystems/zcache">much nicer</a> :)</p>
<hr />
<p>illustration by <a href="http://www.flickr.com/photos/jamesrdoe/6282892279/">jamesrdoe</a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2013/01/killing-hbase-zombie-table/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://arnon.me/2013/01/killing-hbase-zombie-table/</feedburner:origLink></item>
		<item>
		<title>Introducing H-Rider</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/5KVG7R-cO0U/</link>
		<comments>http://arnon.me/2012/12/introducing-hrider/#comments</comments>
		<pubDate>Sun, 09 Dec 2012 22:45:08 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[hbase]]></category>
		<category><![CDATA[Nice systems]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1057</guid>
		<description><![CDATA[<p>In the last year and a half or so (since I joined Nice Systems) we&#8217;ve been hard at work building our big data platform based on a lot of open source technologies including Hadoop and HBase and quite a few others. Building on open source brings a lot of benefits and helps cut development time by building on the knowledge and effort of others.</p>
<p>I personally think that this has to be a two-way street, and as a company benefits from open source it should also give something back. This is why I am very happy to introduce Nice&#8217;s first (hopefully first of many) contribution back to the open source community: a UI dev tool for working with HBase called h-rider. H-rider offers a convenient user interface to poke around data stored in HBase, which our developers find very useful both for development and debugging.</p>
<p>h-rider ... <a href="http://arnon.me/2012/12/introducing-hrider/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<p>In the last year and a half or so (since I joined <a href="http://www.nice.com/">Nice Systems</a>) we&#8217;ve been hard at work building our big data platform based on a lot of open source technologies including Hadoop and HBase and quite a few others. Building on open source brings a lot of benefits and helps cut development time by building on the knowledge and effort of others.</p>
<p>I personally think that this has to be a two-way street, and as a company benefits from open source it should also give something back. <span style="font-size: 13px; line-height: 19px;">This is why I am very happy to introduce Nice&#8217;s first (hopefully first of many) contribution back to the open source community: </span><span style="font-size: 13px; line-height: 19px;">a UI dev tool for working with HBase called h-rider. H-rider offers a convenient user interface to poke around data stored in HBase, which our developers find very useful both for development and debugging.</span></p>
<p>h-rider is developed and maintained by <strong>Igor Cher</strong>, one of the best developers</p>
<p>Here is some more blurb from the <a href="https://github.com/NiceSystems/hrider/wiki">hrider git wiki</a>:</p>
<h3>What is h-rider?</h3>
<p>The h-rider is a UI application created to provide an easier way to view or manipulate the data saved in the distributed database - <a href="http://hbase.apache.org/">HBase™</a> - that supports structured data storage for large tables.</p>
<h3>Getting started</h3>
<p>To get started, begin here:</p>
<ol>
<li><a href="https://github.com/NiceSystems/hrider/wiki/The-hrider-manual">Learn about</a> h-rider by reading the manual.</li>
<li><a href="https://github.com/NiceSystems/hrider/wiki/The-hrider-releases">Download</a> h-rider from the release page (or git clone from the <a href="https://github.com/NiceSystems/hrider">repo</a>)
</li>
</ol>
<p><a href="http://arnon.me/wp-content/uploads/2012/12/h-rider.png"><img src="http://arnon.me/wp-content/uploads/2012/12/h-rider-300x145.png" alt="" title="h-rider" width="300" height="145" class="aligncenter size-medium wp-image-1058" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2012/12/introducing-hrider/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://arnon.me/2012/12/introducing-hrider/</feedburner:origLink></item>
		<item>
		<title>Ignorance I tell you, it is all ignorance.</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/smLaC2l4ZSs/</link>
		<comments>http://arnon.me/2012/11/ignorance/#comments</comments>
		<pubDate>Mon, 19 Nov 2012 07:00:19 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Design]]></category>
		<category><![CDATA[lessons learned]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1044</guid>
		<description><![CDATA[
<p>I was poking around my old blog (rgoarchitects.com) and I found this post from 2007 which I think is worth re-iterating:</p>
<p>In a post called &#8220;Ignorance vs. Negligence&#8221;, Ayende blows off some steam about some of the so-called &#8220;professionals&#8221; that he met along the way. You know&#8230; those with a fancy title that don&#8217;t know jack and design some of the nightmares we see from time to time. I&#8217;ve seen this phenomenon in a lot of projects I consulted/reviewed:</p>

The senior security expert who recommended something which isn&#8217;t supported by the platform
The senior architect who threw the system down to hell by basing the whole system on a clunky asynchronous solution that should only be used by a tiny portion of the application.
The geniuses that built this wonderful code generator that generated code with so many dependencies and singletons that made ... <a href="http://arnon.me/2012/11/ignorance/">Read More &#187;</a>]]></description>
				<content:encoded><![CDATA[<div class="aside"><img class="alignleft" title="bliss" src="http://farm1.staticflickr.com/25/35198403_2c4f9921bc.jpg" alt="" width="180" height="135" /></div>
<p>I was poking around my old blog (rgoarchitects.com) and I found this post from 2007 which I think is worth re-iterating:</p>
<p>In a post called &#8220;<a href="http://ayende.com/blog/2955/ignorance-vs-negligence">Ignorance vs. Negligence</a>&#8221;, Ayende blows off some steam about some of the so-called &#8220;professionals&#8221; that he met along the way. You know&#8230; those with a fancy title that don&#8217;t know jack and design some of the nightmares we see from time to time. I&#8217;ve seen this phenomenon in a lot of projects I consulted/reviewed:</p>
<ul>
<li><span style="font-size: 13px; line-height: 19px;">The senior security expert who recommended something which isn&#8217;t supported by the platform</span></li>
<li><span style="font-size: 13px; line-height: 19px;">The senior architect who threw the system down to hell by basing the whole system on a clunky asynchronous solution that should only be used by a tiny portion of the application.</span></li>
<li><span style="font-size: 13px; line-height: 19px;">The geniuses that built this wonderful code generator that generated code with so many dependencies and singletons that made the solution unusable</span></li>
<li><span style="font-size: 13px; line-height: 19px;">The chief architect that created this wonderful performance hog, and then kept poking around to make sure we don&#8217;t fix it too much.</span></li>
<li><span style="font-size: 13px; line-height: 19px;">The architect that partitioned a distributed solution based on functions &#8211; so that each and every business process has to go through all the tiers and components. The solution made everything more complicated by a few orders of magnitude (scale, synchronization, availability, performance and whatnot)</span></li>
<li><span style="font-size: 13px; line-height: 19px;">The architect that designed his own distributed transaction mechanism (basically duplicating COM+) &#8211; naturally with less than satisfactory results&#8230;</span></li>
</ul>
<p>Etc.<br />
Ayende says<br />
&#8220;They all have a few things in common, they represent themselves as experts, senior, knowledgeable people. In all those cases, they have actively acted to harm the business they were working for, by action, inaction or misaction</p>
<p>I have no issue with people not knowing any better, but I do expect people that ought to know better to&#8230; actually do know better.&#8221;</p>
<p>I don&#8217;t think that this is negligence involved here- I think all of these people <strong>wanted</strong> to do the right thing, they probably <strong>believed</strong> they were right. They were probably also pretty good at their past jobs that led them to the current position.</p>
<p>What they didn&#8217;t learn is to <strong>&#8220;know that they don&#8217;t know&#8221;</strong>. This is a hard lesson to learn. I <del>think</del> hope I learned my lesson after the first time I tried to distribute a (naive) solution I was so proud of. Well, at least I stayed around long enough to both see the results and learn how to fix the problem.<br />
I think not staying around long enough is one of the reasons for this ignorance &#8211; since at the outset things usually look good enough and seem to work. If by the time the problem rears its ugly head you&#8217;ve already moved on to a shiny new job, you&#8217;ve missed the opportunity to learn from your mistakes.</p>
<p>Another cause of ignorance is not looking around and learning only from your own experience. For instance, I am now interviewing a lot of people, and when I ask a question like &#8220;tell me about something interesting you recently read &#8211; a book, an article, a blog, anything&#8221; &#8211; I usually get blank stares. Few people tell me about an article they read that is related to a problem they had, and fewer still tell me about something without a direct relation to their work. If you don&#8217;t look beyond the keyboard you will never know better. Learning only from your own mistakes can be problematic &#8211; especially if we also consider the previous point (people don&#8217;t stay around).</p>
<p>Ignorance is bliss, they say. Maybe so &#8211; but I think ignorance has a lot to do with the crappy systems we see all around us, and it&#8217;s one of the reasons writing software remains more of an art than a science or craft.</p>
<hr />
<p>illustration by <a href="http://www.flickr.com/photos/parl/35198403/">parl</a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2012/11/ignorance/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		<feedburner:origLink>http://arnon.me/2012/11/ignorance/</feedburner:origLink></item>
		<item>
		<title>The NoSQL landscape in diagrams</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/HKzIkWHKVJs/</link>
		<comments>http://arnon.me/2012/11/nosql-landscape-diagrams/#comments</comments>
		<pubDate>Sat, 03 Nov 2012 14:51:33 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[NoSql]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1039</guid>
		<description><![CDATA[<p>Here&#8217;s the NoSQL landscape in 3 slides (and hey, at least mine looks different :) )</p>
<p>451 Research published their view of the NoSQL/NewSQL world in a unified diagram.</p>
<p>Infochimps published a similar diagram</p>
<p>And here&#8217;s mine from SOA Patterns chapter 10 (discussing &#8220;SOA &#38; big data&#8221;)</p>
]]></description>
				<content:encoded><![CDATA[<p>Here&#8217;s the NoSQL landscape in 3 slides (and hey, at least mine looks different :) )</p>
<p><a href="http://blogs.the451group.com/information_management/2012/11/02/updated-database-landscape-graphic">451 Research published their view of the NoSQL/NewSQL</a> world in a unified diagram.</p>
<p><img class="alignnone" title="451 Research NoSQL" src="http://blogs.the451group.com/information_management/files/2012/11/DB-landscape.jpg" alt="" width="720" height="540" /></p>
<p><a href="http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/">Infochimps published a similar diagram</a></p>
<p><img class="alignnone" title="Infochimps NoSQL landscape" src="https://lh4.googleusercontent.com/tw1CEKK-Fp73oFe2Lw41_cS3jUI4vq0-Mw6AfVjpWhfsZeoiRjBUcUbc697P_qhdEsFnHTR5iRiLCO2uKnuBDJBXuv9XcsQb9Lfktk5zWBvdNgQRNWI" alt="" width="720" height="540" /></p>
<p>And here&#8217;s mine from <a title="SOA Patterns" href="http://arnon.me/soa-patterns/">SOA Patterns</a> chapter 10 (discussing &#8220;SOA &amp; big data&#8221;)</p>
<p><a href="http://arnon.me/wp-content/uploads/2012/11/noSQL.png"><img class="alignleft  wp-image-1040" title="noSQL" src="http://arnon.me/wp-content/uploads/2012/11/noSQL.png" alt="" width="720" height="540" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2012/11/nosql-landscape-diagrams/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://arnon.me/2012/11/nosql-landscape-diagrams/</feedburner:origLink></item>
		<item>
		<title>SOA &amp; Big Data</title>
		<link>http://feedproxy.google.com/~r/CirrusMinor/~3/xGxLrQxI5b4/</link>
		<comments>http://arnon.me/2012/10/soa-big-data/#comments</comments>
		<pubDate>Thu, 11 Oct 2012 19:27:50 +0000</pubDate>
		<dc:creator>Arnon Rotem-Gal-Oz</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[SOA Patterns]]></category>
		<category><![CDATA[presentations]]></category>
		<category><![CDATA[Service Oriented Architecture]]></category>
		<category><![CDATA[SOA]]></category>

		<guid isPermaLink="false">http://arnon.me/?p=1031</guid>
		<description><![CDATA[<p>I gave a presentation on SOA and big data at the IGTCloud forum.</p>

]]></description>
				<content:encoded><![CDATA[<p>I gave a presentation on SOA and big data at the <a href="http://www.meetup.com/IGTCloud/">IGTCloud forum</a>.</p>
<iframe src="http://www.slideshare.net/slideshow/embed_code/14689528" width="400" height="337" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe><br/><br/>
]]></content:encoded>
			<wfw:commentRss>http://arnon.me/2012/10/soa-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://arnon.me/2012/10/soa-big-data/</feedburner:origLink></item>
	</channel>
</rss>
