<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom">
 
 <title>Marginally Interesting by Mikio L. Braun</title>
 
 <link href="http://blog.mikiobraun.de/" />
 <updated>2013-05-21T15:51:35+02:00</updated>
 <id>http://blog.mikiobraun.de/</id>
 <author>
   <name>Mikio L. Braun</name>
   <uri>http://mikiobraun.de/</uri>
   <email>mikiobraun@gmail.com</email>
 </author>
 
 
   <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/MarginallyInteresting" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="marginallyinteresting" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
   <title type="html">A Trip to Silicon Valley: First impressions</title>
   <link href="http://blog.mikiobraun.de/2013/05/silicon-valley-trip.html" />
   <updated>2013-05-21T15:41:00+02:00</updated>
   <published>2013-05-21T15:41:00+02:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/05/silicon-valley-trip</id>
   <content type="html">&lt;p&gt;&lt;a title='Exit from the 101 to University Avenue in Palo Alto.' href='/images/bay-area/univ-ave.jpg' rel='group' class='fancybox teaser-pic'&gt;&lt;img src='/images/bay-area/t-univ-ave.jpg' /&gt;&lt;/a&gt;
&lt;p&gt;At the end of April, &lt;a href='http://twitter.com/thinkberg'&gt;Leo&lt;/a&gt; and I went on a one week trip to the Valley. Over the years, we had built up a number of connections in the Valley and we thought that now was the time to go over there and meet face to face. We end up having 20 meetings over 6 days, which made for quite a schedule.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s one thing to know abstractly that the Silicon Valley is home to most computer related companies, and to drive down Highway 101 and see another well known company every 30 seconds or so. &amp;#8220;Oh look, there&amp;#8217;s Evernote&amp;#8221;&amp;#8212;&amp;#8220;There&amp;#8217;s Intel&amp;#8221;&amp;#8212;&amp;#8220;I think I just saw Salesforce&amp;#8221;, and so on.&lt;/p&gt;

&lt;p&gt;The situation gets even more intense once you get to San Francisco. Particularly SoMa, an area probably 2 times 4 kilometers wide, seems to host every Internet startup you have ever heard of, and some bigger ones, too. &lt;a href='http://twitter.com/'&gt;Twitter&lt;/a&gt;, &lt;a href='http://trulia.com/'&gt;Trulia&lt;/a&gt;, &lt;a href='http://flurry.com/'&gt;Flurry&lt;/a&gt;, &lt;a href='http://dropbox.com/'&gt;Dropbox&lt;/a&gt;, &lt;a href='http://zendesk.com/'&gt;Zendesk&lt;/a&gt;, etc. are all in that area. It&amp;#8217;s as if the whole Internet industry has their offices in Berlin Mitte in the area between Unter den Linden and Leipziger Straße.&lt;/p&gt;
&lt;div class='centered'&gt;
&lt;a title='Paris Baguette in Palo Alto' href='/images/bay-area/paris-baguette.jpg' rel='gallery' class='fancybox'&gt;
&lt;img src='/images/bay-area/thumb-paris-baguette.jpg' /&gt;&lt;/a&gt;
&lt;a title='In the time you&amp;apos;ve read this text, you&amp;apos;ve already hit the guy in front of you' href='/images/bay-area/rear-view-mirror.jpg' rel='gallery' class='fancybox'&gt;
&lt;img src='/images/bay-area/thumb-rear-view-mirror.jpg' /&gt;&lt;/a&gt;
&lt;a title='And always the highway.' href='/images/bay-area/highway.jpg' rel='gallery' class='fancybox'&gt;
&lt;img src='/images/bay-area/thumb-highway.jpg' /&gt;&lt;/a&gt;
&lt;a title='SF at night' href='/images/bay-area/san-francisco-night.jpg' rel='gallery' class='fancybox'&gt;
&lt;img src='/images/bay-area/thumb-san-francisco-night.jpg' /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;p&gt;We spent a lot of time in coffee shops for free Wifi, especially in the &lt;a href='https://www.facebook.com/PBPaloAlto'&gt;Paris Baguette&lt;/a&gt; on University Avenue in Palo Alto, a Korean coffee shop which reminded Leo of his time in Seoul. To get Wifi access, you had to check into the Paris Baguette on Facebook, something I found pretty neat. As it turned out, this wasn&amp;#8217;t a new general feature of Facebook, but something being test-driven in a few coffee shops in the Valley first. We found a few more such examples, like being able to pay with bitcoins in the &lt;a href='http://www.coupacafe.com/'&gt;Coupa Cafe&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It sometimes felt as if the whole Valley was turned into one big sandbox to try out new business ideas and new pieces of technology. Already on our first day, when we sleepily sat in the sun trying to shake off our jetlag, we noticed that everyone seemed to be an entrepreneur. People were discussing business models, hacking on their websites, pitching, wherever we went. Later, people would complain that it&amp;#8217;s so hard to hire anyone because everyone wants to be a founder.&lt;/p&gt;

&lt;p&gt;As we started to talk to people, we also noticed people being quite open and supportive. It&amp;#8217;s probably our German bias, but when you talk to people in Germany about your business, they quickly get defensive and start to question the merit of your whole approach. &amp;#8220;Hasn&amp;#8217;t that been done before&amp;#8221; or &amp;#8220;I think I still don&amp;#8217;t understand what&amp;#8217;s so great about that&amp;#8221; are the kinder things friends of you would say. In contrast, people in the Valley seemed much more open as if there&amp;#8217;s a general understanding that it doesn&amp;#8217;t hurt to try. Even if people weren&amp;#8217;t impressed by your approach they&amp;#8217;d offer some piece of advice. It was also very common that people offered to connect you to other people which might be interested.&lt;/p&gt;

&lt;p&gt;Originally, we had no meetings scheduled for Friday, the day when we were flying back to Germany in the evening, but in the end we had three meetings more or less back to back just because of these introduction. It felt as if we could have stayed for another week without getting bored. People later told us that they know of people who came over for three months and still could have gone on.&lt;/p&gt;

&lt;p&gt;As someone said: The funny thing about the Valley is that although it&amp;#8217;s all about the Internet and being connected online, actually meeting face to face counts so much.&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/05/silicon-valley-trip.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/7IyBbHcUWUk" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Reclaim your data, own a piece of the cloud!</title>
   <link href="http://blog.mikiobraun.de/2013/04/reclaim-your-data.html" />
   <updated>2013-04-05T16:45:00+02:00</updated>
   <published>2013-04-05T16:45:00+02:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/04/reclaim-your-data</id>
   <content type="html">&lt;p&gt;&lt;p&gt;Lately I&amp;#8217;ve been discussing quite a bit with &lt;a href='http://twitter.com/thinkberg'&gt;Leo&lt;/a&gt; about the current state of the &amp;#8216;Net. Sure it&amp;#8217;s nice to get all those services in the cloud for free, but in the end, you either have to worry about what exactly happens with your data, or what you can do against companies shutting down cloud based services like the Google Reader, leaving you with a pile of useless XML files, a bit like letting you take home the remnants of your car after compactification.&lt;/p&gt;

&lt;p&gt;I used to say that the main problem is that the user is the product, not the customer, but &lt;a href='https://medium.com/i-m-h-o/eb22fa75e39b'&gt;this post by Derek Powazek&lt;/a&gt; convinced me that even paying for the service won&amp;#8217;t ensure that you get decent support and control over your data.&lt;/p&gt;

&lt;p&gt;In the end, the only thing that helps is to reclaim your data and the service itself. Just like your wordpress powered blog on your own root server will stay around as long as you pay the bills, both data and the software to make it come alive should run on a computer you control.&lt;/p&gt;

&lt;p&gt;But how have ended up in a situation like this anyway? Here is my little history of networked computing.&lt;/p&gt;
&lt;div class='figure'&gt;
  &lt;img src='/images/cloud-mainframe.jpg' /&gt;
&lt;/div&gt;
&lt;p&gt;In the mainframe era, computers where huge bulky machines, and only large institutions could afford to have one. People invented time-sharing operating systems to make those computers usable to many people concurrently, which were usually connected through &lt;a href='http://en.wikipedia.org/wiki/IBM_3270'&gt;dumb text-terminals&lt;/a&gt;. Those terminals mostly worked in a block-oriented manner, meaning that they presented you with a form which you would submit to the server to get the results, a bit like a form on a web page.&lt;/p&gt;

&lt;p&gt;Obviously, services where hosted centrally, and you very much depended on the mainframe for storage and providing the service.&lt;/p&gt;
&lt;div class='figure'&gt;
  &lt;img src='/images/cloud-internet.jpg' /&gt;
&lt;/div&gt;
&lt;p&gt;All this changed with the advent of the Internet and the home computer. Instead of a relatively small number of large mainframe computers you got a large network of small machines. Services like mail, ftp, and even http were designed in a way that they could run in a decentralized manner. In principle, anyone could hook up a computer to the network and run the services he was interested on his server.&lt;/p&gt;

&lt;p&gt;Of course, you had to solve a number of technical problems, getting good bandwidth to your home was a problem, you had to use a dynamic DNS service to map changing dial-up IP addresses to a DNS entry, you had to know Linux or some other variant of UNIX, but it was possible.&lt;/p&gt;
&lt;div class='figure'&gt;
  &lt;img src='/images/cloud-vm.jpg' /&gt;
&lt;/div&gt;
&lt;p&gt;Server virtualization made things a lot easier. People realized that most of the times, computers were sitting idle anyway, so why not combine them virtually in a server. That also made it possible to host a large number of servers in data centers, where they also had constant internet access, for relatively small amounts of money. (BTW, virtualization already existed in the mainframe era.)&lt;/p&gt;
&lt;div class='figure'&gt;
  &lt;img src='/images/cloud-cloud.jpg' /&gt;
&lt;/div&gt;
&lt;p&gt;Server virtualization and the resulting technology of putting lots of PC-type servers into racks (which look a lot like the mainframes of old from the outside) allowed companies to create massive server farms for their data intensive services.&lt;/p&gt;

&lt;p&gt;It probably all started with Google search and Amazon. Google, because they needed to store an index of the whole web somewhere, Amazon, because millions of people wanted to use the website each day.&lt;/p&gt;

&lt;p&gt;Lead by this example, other companies followed, and nowadays it&amp;#8217;s entirely normal to rent out thousands of servers in the cloud (virtual or otherwise) and build services on that private armada of computers.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;m not the first to point out that this is really just the same setup like the mainframe era, only with different technological means. While your computer is in principle able to store enormous amounts of data, and it can provide the same services as the machines in the cloud, it&amp;#8217;s reduced to a screen to run some web browser.&lt;/p&gt;

&lt;p&gt;So we went full circle from centralization to decentralization and back, gaining and losing control over our data and the services we need.&lt;/p&gt;
&lt;div class='figure'&gt;
  &lt;img src='/images/cloud-reclaim.png' /&gt;
&lt;/div&gt;
&lt;p&gt;But there is a way out. Now it&amp;#8217;s easier than ever to rent a piece of the cloud. We already spend enough dollars per month on our smartphones, and probably also for some cloud based services like cloud storage, why not spend a bit more money and also own a small machine somewhere in the cloud?&lt;/p&gt;

&lt;p&gt;If that seems odd to you, have you ever noticed that a smartphone is already a bit like a small server in the cloud? It runs Linux (well at least some do), is always connected, comes typically with a few GB of local storage. In principle, you could install some dynamic DNS program on it to become a full Internet server.&lt;/p&gt;

&lt;p&gt;Of course, managing virtual machines is still much too technical for the ordinary person. We would also need a new type of cloud based service which would keep only data which needs to be global in the company&amp;#8217;s server farm while offloading user specific data to the user&amp;#8217;s servers.&lt;/p&gt;

&lt;p&gt;But technically, it&amp;#8217;s all possible. And wouldn&amp;#8217;t it be cool? &lt;code&gt;;)&lt;/code&gt;&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/04/reclaim-your-data.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/NDuU9Sxk4tU" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Misconceptions about the CAP Theorem</title>
   <link href="http://blog.mikiobraun.de/2013/03/misconceptions-about-cap-theorem.html" />
   <updated>2013-03-20T21:35:00+01:00</updated>
   <published>2013-03-20T21:35:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/03/misconceptions-about-cap-theorem</id>
   <content type="html">&lt;p&gt;&lt;img class='teaser-pic' src='/images/cap-teaser.png' /&gt;
&lt;p&gt;If you&amp;#8217;ve ever listened to a NoSQL talk, you&amp;#8217;ve probably come across the CAP theorem. The argument usually goes like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Traditional databases guarantee consistency.&lt;/li&gt;

&lt;li&gt;The CAP theorem tells you that you cannot have consistency, availability, and fault-tolerance at the same time.&lt;/li&gt;

&lt;li&gt;But we want to build scalable databases, so we forget about consistency.&lt;/li&gt;

&lt;li&gt;Oh and by the way, who needs consistency anyway?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To be honest, to me this always looked like some poor excuse to not really discuss the design decisions of some NoSQL database. It&amp;#8217;s probably just me, but I much prefer at least an attempt at an unbiased analysis of the pros and cons so that I can make an informed decision whether it fits my needs or not. But pulling this theorem out of the hat is like saying &amp;#8220;we don&amp;#8217;t even need to discuss this, because this theorem says impossible, ok!&amp;#8221;&lt;/p&gt;

&lt;p&gt;While searching for discussions of the CAP theorem, I found this excellent (but lengthy) article by Eric Brewer, one of the original authors of the CAP theorem: &lt;em&gt;&lt;a href='http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed'&gt;CAP Twelve Years Later: How the &amp;#8220;Rules&amp;#8221; Have Changed&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here is my summary:&lt;/p&gt;

&lt;p&gt;First of all, the interpretation that the CAP theorem says &amp;#8220;you can only have 2 out of 3&amp;#8221; is misleading. It&amp;#8217;s not like the original proof discussed all possible choices and showed that you can have only 2 out of 3.&lt;/p&gt;

&lt;p&gt;Instead, the original proof discusses the following situation: Say you have a distributed system which is in a consistent state (whatever that means), and now there is a &lt;strong&gt;P&lt;/strong&gt;artition of the system, either an actual network failure, or some other way in which machines cannot talk to one another anymore.&lt;/p&gt;

&lt;p&gt;Now consider what options you have when there is write request. You could wait for the partition to end in order to make sure that your system stays in a consistent state (thereby sacrificing &lt;strong&gt;A&lt;/strong&gt;vailability), or you could do the update partially (thereby sacrificing &lt;strong&gt;C&lt;/strong&gt;onsistency). So you can have only C or A in case of P but not both.&lt;/p&gt;

&lt;p&gt;Note that there is really no way in which you could &amp;#8220;choose P&amp;#8221;, it was always about how to handle partitions (which are often not really partitions, but timeouts), and that includes how to detect partitions, how to behave when you are in a &amp;#8220;partition state&amp;#8221;, and how to bring the system back to a consistent state after a partition.&lt;/p&gt;

&lt;p&gt;The article stresses that these are no binary decisions, but that there is rather a whole spectrum of possibly actions and strategies to choose from. It&amp;#8217;s not about saying &amp;#8220;I can&amp;#8217;t have consistency and availability, so I&amp;#8217;ll just forget about consistency&amp;#8221;, it&amp;#8217;s about saying &amp;#8220;in case of a failure, availability is more important to me, therefore I will accept temporary inconsistencies, and implement strategies to clean up afterwards&amp;#8221;.&lt;/p&gt;

&lt;p&gt;When you look at it that way, you get a much clearer picture of how a database like Cassandra fits into this, and how their &lt;a href='http://www.datastax.com/docs/1.1/dml/data_consistency#about-cassandra-s-built-in-consistency-repair-features'&gt;read repair&lt;/a&gt;, &lt;a href='http://www.datastax.com/dev/blog/modern-hinted-handoff'&gt;hinted handoff&lt;/a&gt; features work to regain consistency, although in a very lax (and &lt;em&gt;eventual&lt;/em&gt;) way.&lt;/p&gt;

&lt;p&gt;But it also becomes clear that it&amp;#8217;s just not true that you cannot have distributed databases which are highly available and come with consistency guarantees at all. The article goes on to discuss recent research results which try to achieve exactly that, strategies to minimize the impact of a partition on availability and consistency, how to re-establish consistency after a partition (also in the broader database sense of having consistent cross-references between tables and satisfying other invariants)&lt;/p&gt;

&lt;p&gt;So the next time someone tells you he doesn&amp;#8217;t care about consistency because of the CAP theorem, ask him how he chooses P, and how he deals with the detection, handling, and cleanup of partitions.&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/03/misconceptions-about-cap-theorem.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/lrLfVbq31Ug" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Lunch talk: Google Reader</title>
   <link href="http://blog.mikiobraun.de/2013/03/lunch-talk-google-reader.html" />
   <updated>2013-03-14T15:35:00+01:00</updated>
   <published>2013-03-14T15:35:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/03/lunch-talk-google-reader</id>
   <content type="html">&lt;p&gt;&lt;p&gt;As you&amp;#8217;ve probably heard by now, Google is shutting down Google Reader on July 1, 2013. Reactions are mixed, ranging from people who say that RSS is dead anyway, or that they get all their info from other services like &lt;a href='http://getprismatic.com/'&gt;prismatic&lt;/a&gt; anyway.&lt;/p&gt;
&lt;blockquote class='twitter-tweet tw-align-center'&gt;
	&lt;p&gt;RSS was dead years ago &lt;a href='https://twitter.com/search/%23getoverit'&gt;#getoverit&lt;/a&gt;&lt;/p&gt;&amp;mdash; Nick Halstead (@nik) &lt;a href='https://twitter.com/nik/status/312127843238821888'&gt;March 14, 2013&lt;/a&gt;
&lt;/blockquote&gt;&lt;script charset='utf-8' async='*' src='//platform.twitter.com/widgets.js'&gt; &lt;/script&gt;&lt;blockquote class='twitter-tweet tw-align-center'&gt;
	&lt;p&gt;Google Reader retiring is of no impact 2 me. I moved to @&lt;a href='https://twitter.com/prismatic'&gt;prismatic&lt;/a&gt; quite some time back. It's time for @&lt;a href='https://twitter.com/prismatic'&gt;prismatic&lt;/a&gt; 2 pull up 2 the next level&lt;/p&gt;&amp;mdash; Debasish Ghosh (@debasishg) &lt;a href='https://twitter.com/debasishg/status/312123506844381184'&gt;March 14, 2013&lt;/a&gt;
&lt;/blockquote&gt;
&lt;p&gt;Meanwhile web sites are posting &lt;a href='http://mashable.com/2013/03/14/google-reader-alternatives/'&gt;lists&lt;/a&gt; &lt;a href='http://www.ghacks.net/2013/03/14/the-ultimate-google-reader-alternatives-list/'&gt;of&lt;/a&gt; &lt;a href='http://lifehacker.com/5990456/google-reader-is-getting-shut-down-here-are-the-best-alternatives'&gt;Google&lt;/a&gt; &lt;a href='http://crave.cnet.co.uk/software/google-reader-alternatives-including-flipboard-feedly-and-more-50010640/'&gt;Reader&lt;/a&gt; &lt;a href='http://marketingland.com/12-google-reader-alternatives-36158'&gt;alternatives&lt;/a&gt; like &lt;a href='http://feedly.com/'&gt;feedly&lt;/a&gt;, &lt;a href='http://flipboard.com/'&gt;flipboard&lt;/a&gt;, or &lt;a href='http://www.newsblur.com/'&gt;newsblur&lt;/a&gt; are compiled.&lt;/p&gt;

&lt;p&gt;So obviously, &lt;a href='http://thinkberg.com/'&gt;thinkberg&lt;/a&gt;, who is a big fan of Reader, and I had a lot to discuss over lunch. So what is our take on this?&lt;/p&gt;

&lt;h2 id='rss_is_dead'&gt;RSS is dead?&lt;/h2&gt;

&lt;p&gt;I agree that RSS is probably much too technical for the average user, but what people seem to forget about RSS is that it&amp;#8217;s half of the open decentralized social network framework guys like &lt;a href='http://join.app.net/'&gt;App.net&lt;/a&gt; or &lt;a href='http://github.com/diaspora/diaspora'&gt;diaspora&lt;/a&gt; are trying to build. You can host your own content, you can subscribe to content no matter where it&amp;#8217;s hosted and assemble your own timeline. Also, RSS&amp;#8217;s use is much more widespread than you&amp;#8217;d probably think. All news sites, and blogs support RSS feeds. The obvious exception are the big social network sites like Google+, and Twitter (although you can export real-time searches as RSS feeds, which is pretty nice), because they want you to spend time on their site while they show you some ads.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s not perfect, of course, there are no comments or discussions (but tumblr is also much less interactive than facebook, for example), it&amp;#8217;s pull, not push, but the openness and extensibility (in the sense that you could hook up your coffee-machine if you want to) is undisputable.&lt;/p&gt;

&lt;h2 id='google_doesnt_get_social__or_enduser_products'&gt;Google doesn&amp;#8217;t &amp;#8220;get&amp;#8221; social - or end-user products&lt;/h2&gt;

&lt;p&gt;I think Google shutting down the Reader is mostly another example how Google is relentlessly redesigning their services to meet some grand plan without caring much about whether the user will be happy with the end result or not. It&amp;#8217;s almost as if they are taking an engieering approach and have a hard time taking into account anything which is not a technical service requirement.&lt;/p&gt;

&lt;p&gt;I guess it&amp;#8217;s one thing to optimize Google&amp;#8217;s infrastructure internally, where your dependencies are relatively few and clearly defined and you have other Google people at the other end of the line who know that these changes are necessary. Contrast this with a service used by millions of people, most of whom are don&amp;#8217;t really want anything to change.&lt;/p&gt;

&lt;p&gt;For example, I think people didn&amp;#8217;t really care whether Picasa got integrated with Google+, or whether Google Docs became part of Google Drive. So it might have made sense for Google&amp;#8217;s long-term goal to have a single social platform which incorporates all these services somehow, but most people are lost somewhere along the line.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s not as if Google isn&amp;#8217;t trying to get social right, however. &lt;a href='https://www.quora.com/Google-Reader-Shut-Down-March-2013/Why-is-Google-killing-Google-Reader'&gt;Brian Shih&lt;/a&gt;, who once was product manager for Google Reader, says on Quora:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ironically, I think the reason Google always wanted to pull the Reader team off to build these other social products was that the Reader team actually understood social (and tried a lot of experiments over the years that informed the larger social features at the company). Reader&amp;#8217;s social features also evolved very organically in response to users, instead of being designed top-down like some of Google&amp;#8217;s other efforts.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id='google_to_the_rescue'&gt;Google+ to the rescue?&lt;/h2&gt;

&lt;p&gt;So what about the alternatives? What about Google+?&lt;/p&gt;

&lt;p&gt;As &lt;a href='http://thinkberg.com/'&gt;thinkberg&lt;/a&gt; nicely pointed out, the difference is that Google Reader is (was) a productivity tool, while Google+ is a social network site. Google Reader is optimized to help you sift through many news items quickly, while Google+ wants you to linger (possible to show you as many ads as possible). In the compressed layout, Reader uses exactly one line per item, showing you dozens on a single screen. In constrast, in Google+, you seldom get more than three items per screen.&lt;/p&gt;

&lt;p&gt;Of course, you can share items on Google+, but the old Reader was excellent in that it presented shared items also in the same compact view and by users, shared items are simply lost in the your timeline stream.&lt;/p&gt;

&lt;p&gt;Finally, and I&amp;#8217;m still amazed that this hasn&amp;#8217;t changed ever since Google+ was launched, there is no easy bookmark solution in Google+. In Reader, you could star an item to bookmark it for later reading, but there is no way to see all the items you +1&amp;#8217;d on Google+. The closes thing I&amp;#8217;ve come across is to have a circle called &amp;#8220;Bookmarked&amp;#8221; and reshare to that circle to bookmark posts.&lt;/p&gt;

&lt;p&gt;&lt;a href='http://thinkberg.com/'&gt;thinkberg&lt;/a&gt; said that this makes perfect sense for Google because they&amp;#8217;re only interested in the +1s to improve their search results, not to provide value to the users. Of course, he was just joking, but there is some truth to this. After all, Google+ is a free service and to make sense to Google they need to leverage the information collected there in some other ways, for example in personalized search.&lt;/p&gt;

&lt;p&gt;As &lt;a href='http://thinkberg.com/'&gt;thinkberg&lt;/a&gt; tweeted earlier&lt;/p&gt;
&lt;blockquote class='twitter-tweet tw-align-center'&gt;&lt;p&gt;We will have to rethink the culture of free products (ad supported incl.). It’s most always a loss for the customer in the long run.&lt;/p&gt;&amp;mdash; Matthias L. Jugel (@thinkberg) &lt;a href='https://twitter.com/thinkberg/status/312106126600724480'&gt;March 14, 2013&lt;/a&gt;&lt;/blockquote&gt;
&lt;p&gt;the actual problem might also be that these are all free services. Such services are always hard because companies must find ways to exploit the data in some way to still make money, and customers or users have a hard time complaining if the service is changed in fundamental ways.&lt;/p&gt;

&lt;p&gt;Or put differently, if I&amp;#8217;d paid for Google Reader, I&amp;#8217;d be pretty pissed now.&lt;/p&gt;

&lt;p&gt;Having paying customers on the other hand makes everything so much easier. The company knows why it&amp;#8217;s having a product, and it can also support small user bases. The problem is, especially with new technology, to get people to realize that a service is worth real money, but that is another topic.&lt;/p&gt;

&lt;h2 id='alternative_rss_feed_readers_there_are_none'&gt;Alternative RSS feed readers: there are none&lt;/h2&gt;

&lt;p&gt;So what about the other alternatives? Well, the problem is that Google Reader pretty much killed the market for classical RSS feed readers like bloglines. Most of the other companies mentioned in the various lists have taken a different approach, trying to create some nice looking newspaper-like experience from your feeds. Unfortunately, as I said above, the strength of Google Reader was to be able to quickly sift through huge amounts of information, and that&amp;#8217;s something these newspaper-like apps cannot accomplish.&lt;/p&gt;

&lt;p&gt;I personally like &lt;a href='http://getprismatic.com/'&gt;prismatic&lt;/a&gt; a lot, but that&amp;#8217;s again something different. It&amp;#8217;s pretty good at finding stuff which is interesting, but you cannot track hand selected news sources that way.&lt;/p&gt;

&lt;h2 id='what_id_like_to_have'&gt;What I&amp;#8217;d like to have&lt;/h2&gt;

&lt;p&gt;So just to compile this for the posterity, what are the features I&amp;#8217;d like to see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Track a number of RSS feeds (probably also other sources).&lt;/li&gt;

&lt;li&gt;Choice between compact view (one line per item is enough) and expanded view.&lt;/li&gt;

&lt;li&gt;Web view and mobile app, synchronized.&lt;/li&gt;

&lt;li&gt;Bookmarking capabilities.&lt;/li&gt;

&lt;li&gt;Sharing feature, but don&amp;#8217;t just put everything in my timeline, let me see only shared items or even by specific user.&lt;/li&gt;

&lt;li&gt;Hide stuff I&amp;#8217;ve already seen.&lt;/li&gt;

&lt;li&gt;I&amp;#8217;d be ready to pay around 5$ per month for this.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Basically, someone give me the old Reader, and you&amp;#8217;ll have me as a customer instantly.&lt;/p&gt;

&lt;p&gt;Oh and in case you haven&amp;#8217;t seen this yet:&lt;/p&gt;
&lt;div class='centered'&gt;
&lt;iframe frameborder='0' height='315' width='560' src='http://www.youtube.com/embed/eKTntSh4uSQ' allowfullscreen='*'&gt; 
&lt;/iframe&gt;
&lt;/div&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/03/lunch-talk-google-reader.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/-_klDZ2QHw0" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">More Google Big Data papers: Megastore and Spanner</title>
   <link href="http://blog.mikiobraun.de/2013/03/more-google-papers-megastore-spanner-voted-commits.html" />
   <updated>2013-03-11T15:35:00+01:00</updated>
   <published>2013-03-11T15:35:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/03/more-google-papers-megastore-spanner-voted-commits</id>
   <content type="html">&lt;p&gt;&lt;img class='teaser-pic' src='/images/megastore-spanner.png' /&gt;
&lt;p&gt;I &lt;a href='/2013/02/big-data-beyond-map-reduce-googles-papers.html'&gt;recently&lt;/a&gt; reviewed a number of Google Big Data papers. Today I&amp;#8217;ll talk a bit about &lt;a href='http://research.google.com/pubs/pub36971.html'&gt;Megastore&lt;/a&gt; and &lt;a href='http://research.google.com/archive/spanner.html'&gt;Spanner&lt;/a&gt;, two distributed databases I wasn&amp;#8217;t aware of before. Both are distributed databases with interesting features like full consistency and transactions, something most NoSQL proponents will tell you is impossible to scale.&lt;/p&gt;

&lt;p&gt;Now I should mention that I&amp;#8217;m not a distributed systems guy by training, and the papers are sufficiently vague in the exact details of how they work (although they otherwise do an excellent job of describing all the essential components of such a complex system), but anyway, here is what I understood ;)&lt;/p&gt;

&lt;p&gt;While these papers are probably more Big Data Base than Big Data Science, I still think these make an interesting read because they tell you a lot about what it takes to build a scalable and robust system.&lt;/p&gt;

&lt;h2 id='megastore_bigtable__transactions__schema'&gt;Megastore: BigTable + transactions + schema&lt;/h2&gt;

&lt;p&gt;You might not have heard of Megastore, but if you&amp;#8217;ve used any of Google&amp;#8217;s products beside search, you will have interacted with it as things like GMail, Calendar, Picasa (now morphed into Google+), Android Market (now &amp;#8220;Google Play&amp;#8221;), or AppEngine run on Megastore.&lt;/p&gt;

&lt;p&gt;Megastore is build upon BigTable, Google&amp;#8217;s key-value store which inspired countless open-source NoSQL databases like Apache Cassandra. However, Megastore isn&amp;#8217;t schema-free but lets you define tables like in a standard SQL database. The mapping to BigTable columns is straightforward and you can also specify to collocate dependent tables in the same BigTable for better performance.&lt;/p&gt;

&lt;p&gt;Probably the most interesting feature is that Megastore is globally consistent (or &lt;a href='http://en.wikipedia.org/wiki/ACID'&gt;ACID compliant&lt;/a&gt;, to be more exact). The authors argue that looser consistency is nice for scaling, but for the application developer, consistency is so much easier to work with. And actually, I think they&amp;#8217;re right. We&amp;#8217;ve all heard the &amp;#8220;eventually consistency is enough&amp;#8221; talk a few times and have come to believe that often, it doesn&amp;#8217;t really matter. But the truth is, to know that the value you wrote is actually there, or to know that a group of related updates is rolled back if your program crashes, is very valuable.&lt;/p&gt;

&lt;h2 id='consistency_through_distributed_commit_log'&gt;Consistency through distributed commit log&lt;/h2&gt;

&lt;p&gt;So how does Megastore achieve distributed consistency (and by the way I&amp;#8217;m using the term &lt;em&gt;consistency&lt;/em&gt; you already see I&amp;#8217;m not a distributed systems guy ;))?&lt;/p&gt;

&lt;p&gt;The main idea seems to be that Megastore manages a distributed commit log. So-called write ahead logs are pretty common in databases to guard against failures. Every write action is first recorded in a log so that you can pick up after a crash and reapply the write operations in the log. In addition, the log also gives you a time ordering for the transactions.&lt;/p&gt;

&lt;p&gt;Contrast this with the write actions in, for example, Cassandra: &lt;a href='http://www.datastax.com/docs/1.1/dml/data_consistency'&gt;There&lt;/a&gt;, writes can be initiated by any node and make sure a certain number of replicates have acknowledged the write. The end effect is that different nodes might see different orderings for the updates, or some not at all. Cassandra has other mechanisms like &lt;a href='http://www.datastax.com/docs/1.1/dml/data_consistency#builtin-consistency'&gt;read-repair&lt;/a&gt; to make sure all nodes eventually have the same data, but there is no guarantee.&lt;/p&gt;

&lt;p&gt;So how does Megastore achieve a distributed commit log? The standard way for distributed commits seems to be &lt;a href='http://en.wikipedia.org/wiki/Two-phase_commit_protocol'&gt;two-phase commit&lt;/a&gt;, which however requires a master node. Instead, Google used the Paxos protocol, which is a basically a voting scheme to ensure that a majority of agents agree on a proposed value. Just to clarify, it is not about voting between a number of alternatives, it&amp;#8217;s really about agreeing on a given value in a robust manner to ensure that at least half of the agents have noticed and agreed to the number.&lt;/p&gt;

&lt;h2 id='the_paxos_algorithm'&gt;The Paxos algorithm&lt;/h2&gt;

&lt;p&gt;This algorithm was originally published by Leslie Lamport (yes, the LaTeX Leslie Lamport) in a &lt;a href='http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#lamport-paxos'&gt;paper&lt;/a&gt; which was written as if it reported on some historical Greek community (also includes a bunch of Greek symbols), but if you&amp;#8217;re interested, I recommend the &amp;#8221;&lt;a href='http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#paxos-simple'&gt;Paxos made simple&lt;/a&gt;&amp;#8221; paper which explains it in plain English.&lt;/p&gt;

&lt;p&gt;So just to give you an idea how this algorithm works, the algorithm goes through numbered rounds. In each round, the number of the round is first announced to all nodes, and the nodes return a promise to forget about all previous rounds. If a promise was received from the majority of nodes, another attempt is made to get a majority to accept the value. If that works, the value is broadcast to all who are interested. It seems there are ways to make this all robust, including the election of a leader and the identification of the round number such that they are always larger than all previous rounds, just check the original paper for the details ;).&lt;/p&gt;

&lt;p&gt;Now voting on a value sounds pretty abstract, but the trick used in Megastore is to use Paxos for reaching agreement on what will be appended to the commit log. Basically, the initiating node says &amp;#8220;OK, I&amp;#8217;d like to work to commit on this transaction here next&amp;#8221;, and if all agree, it is ensured that the majority of nodes has synchronized commit logs.&lt;/p&gt;

&lt;h2 id='wrapup_read_and_write_in_megastore'&gt;Wrap-up: Read and write in Megastore&lt;/h2&gt;

&lt;p&gt;So on each transaction start, Megastore first identifies the last transaction committed and identifies a node which is up-to-date, or brings a node up-to- date. Then, Megastore reads the values of the transaction&amp;#8217;s timestamp (BigTable stores multiple versions with timestamp of each value). To commit the transaction, Paxos is used to get agreement on appending the transaction to the commit log.&lt;/p&gt;

&lt;p&gt;These are the main ideas, there is a host of other features, like &amp;#8220;dirty reads&amp;#8221;, &amp;#8220;entity groups&amp;#8221; which partition the data (and also restrict the consistency to within entity groups), a messaging system, two-phase commit for transactions across entity groups, optimizations over plain Paxos to get fewer rounds of communications, management servers to track which nodes are up-to-date, and so on.&lt;/p&gt;

&lt;h2 id='spanner_one_atomic_clock_per_data_center'&gt;Spanner: one atomic clock per data center&lt;/h2&gt;

&lt;p&gt;I&amp;#8217;ll just briefly mention &lt;a href='http://research.google.com/archive/spanner.html'&gt;Spanner&lt;/a&gt;, yet another distributed database. The main improvements over Megastore are a SQL-like query language, better write performance (Megastore writes seem to usually take a few hundred milliseconds), and global transactions and consistency (not just within entitiy groups as in Megastore). So basically, Spanner is a distributed database which doesn&amp;#8217;t feel different from a &amp;#8220;normal&amp;#8221; SQL database. Spanner has been developed to be the backend for &lt;a href='http://research.google.com/pubs/pub38125.html'&gt;F1&lt;/a&gt;, yet another RDBMS behind Google&amp;#8217;s online ad business, and it has replaced a farm of sharded MySQL servers.&lt;/p&gt;

&lt;p&gt;Spanner again heavily uses Paxos for various forms of coordination, but also classical two-phase commits. The main feature seems to be that Spanner is able to assign globally meaningful timestamps to commits through the use of GPS and atomic clocks. Yes, I&amp;#8217;m not kidding, apparently, you can have &lt;a href='https://www.google.com/search?q=atomic+clock+rack+mount'&gt;rack mount atomic clocks&lt;/a&gt; for a few thousand dollars.&lt;/p&gt;

&lt;p&gt;Using such clocks, Spanner globally agrees on a commit timestamp using Paxos, giving you global consistency: On a read, you get the timestamp of the last commit and retrieve the date at that timestamp. That way, Spanner also realises lock-free read-only transactions.&lt;/p&gt;

&lt;h2 id='distributed_consistency_is_hard_but_not_impossible'&gt;Distributed Consistency is hard, but not impossible&lt;/h2&gt;

&lt;p&gt;So what can we learn from these examples? What I sort of admire is Google&amp;#8217;s courage to take on such complex problems. You often hear that global consistency is impossible to scale or that distributed transactions are a nightmare. While this is certainly massively complex technology, it is possible nevertheless. You might want to invest in some atomic clocks, though ;)&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve been bashing Cassandra a bit, but one also has to keep in mind that global consistency comes at a cost. Megastore writes are slow. It&amp;#8217;s ok if you&amp;#8217;re in an interactive application and only need to perform one transaction to complete an action, but it&amp;#8217;s probably the wrong backend for data crunching. Likewise, Spanner&amp;#8217;s latency is reported to be 8ms for reads on average, and 72-100ms for commits on average, which is incredibly fast given what they accomplish, but still slower than the performance you get out of Cassandra.&lt;/p&gt;

&lt;p&gt;So it might be hard, but it&amp;#8217;s possible. All too often, people tell us it&amp;#8217;s impossible but sometimes they&amp;#8217;re just defending their favorite tools feature set.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Related posts:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href='/2013/02/big-data-beyond-map-reduce-googles-papers.html'&gt;Big Data beyond MapReduce: Google&amp;#8217;s Big Data papers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/03/more-google-papers-megastore-spanner-voted-commits.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/UrNjAAuRyIQ" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Stream Processing has no Query Layer</title>
   <link href="http://blog.mikiobraun.de/2013/03/stream-processing-has-no-query-layer.html" />
   <updated>2013-03-01T16:47:00+01:00</updated>
   <published>2013-03-01T16:47:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/03/stream-processing-has-no-query-layer</id>
   <content type="html">&lt;p&gt;&lt;p&gt;When it comes to real-time big data, stream processing frameworks are an interesting alternative to MapReduce. Instead of storing and crunching data in batches, they process the data as it comes along, which immediately makes much more sense if you&amp;#8217;re dealing with event streams. Frameworks like Twitter&amp;#8217;s &lt;a href='http://storm-project.net/'&gt;Storm&lt;/a&gt; and Yahoo&amp;#8217;s &lt;a href='http://incubator.apache.org/s4/'&gt;S4&lt;/a&gt; allow you to scale such computations. Similar to MapReduce jobs, you specify small worker threads which are then deployed and monitored automatically to provide robust scalability.&lt;/p&gt;

&lt;p&gt;So at first you may think &amp;#8220;stream processing is basically MapReduce for events&amp;#8221;, but there is an important and significant difference: There is no query layer in stream processing (well at least, there isn&amp;#8217;t in Storm and S4).&lt;/p&gt;

&lt;p&gt;With &lt;em&gt;query layer&lt;/em&gt;, I mean the capability to query the results of your computations. For stream processing, in particular, this means while the computations are still running, because you are typically consuming a never-ending stream of new events.&lt;/p&gt;

&lt;p&gt;For example, if we consider the ubiquituous &lt;a href='http://engineering.twitter.com/2011/08/storm-is-coming-more-details-and-plans.html'&gt;word count example&lt;/a&gt;, where you pipe in some constant stream of sentences (let&amp;#8217;s say, tweets), how can you query the counts for a given word at a given time?&lt;/p&gt;

&lt;p&gt;The answer is a bit surprising to most people I&amp;#8217;ve talked to: There is no way you can query the result (at least from within the stream processing framework). The information is there, distributed over numerous worker thread who all see and process a part of the stream, but there is no way to access that information. Instead, results have to be periodically output, either to screen or to some persistent storage.&lt;/p&gt;
&lt;div class='figure'&gt;
	&lt;img src='/images/spnq-stream.png' /&gt;
&lt;/div&gt;
&lt;p&gt;Now these are only toy examples, of course, but it also means that for real-world applications, you need some database backend to store your results. Depending on the amount of data you process and the level of aggregation you do (or don&amp;#8217;t do), this also means that your stock MySQL won&amp;#8217;t suffice to keep up with your stream processing cluster.&lt;/p&gt;

&lt;p&gt;The same can be said of MapReduce jobs which run periodically to update some statistics, but the difference is that MapReduce doubles as the storage solution while you need an additional backend for stream processing.&lt;/p&gt;
&lt;div class='figure'&gt;
	&lt;img src='/images/spnq-mapreduce.png' /&gt;
&lt;/div&gt;
&lt;p&gt;So I think stream processing is good when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you have a high frequency event stream,&lt;/li&gt;

&lt;li&gt;have to do quite complex analyses on each event independently,&lt;/li&gt;

&lt;li&gt;do a lot of aggregation so that there is a huge reduction in data volume.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But it&amp;#8217;s not generally applicable when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you need to do a lot of persistent updates which each event,&lt;/li&gt;

&lt;li&gt;need to query results while the analysis is still ongoing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let me know if I&amp;#8217;m wrong. I&amp;#8217;d be interested in learning about some real-world experiences with scaling stream processing!&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/03/stream-processing-has-no-query-layer.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/_FRZux3sGwY" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Big Data beyond MapReduce: Google's Big Data papers</title>
   <link href="http://blog.mikiobraun.de/2013/02/big-data-beyond-map-reduce-googles-papers.html" />
   <updated>2013-02-22T16:45:00+01:00</updated>
   <published>2013-02-22T16:45:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/02/big-data-beyond-map-reduce-googles-papers</id>
   <content type="html">&lt;p&gt;&lt;img class='teaser-pic' src='/images/google-big-data-teaser.png' /&gt;
&lt;p&gt;Mainstream Big Data is all about MapReduce, but when looking at real-time data, limitations of that approach are starting to show. In this post, I&amp;#8217;ll review Google&amp;#8217;s most important Big Data publications and discuss where they are (as far as they&amp;#8217;ve disclosed).&lt;/p&gt;

&lt;h2 id='mapreduce_google_file_system_and_bigtable_the_mother_of_all_big_data_algorithms'&gt;MapReduce, Google File System and Bigtable: the mother of all big data algorithms&lt;/h2&gt;

&lt;p&gt;Chronologically the first paper is on the &lt;a href='http://research.google.com/archive/gfs.html'&gt;Google File System&lt;/a&gt; from 2003, which is a distributed file system. Basically, files are split into chunks which are stored in a redundant fashion on a cluster of commodity machines (Every article about Google has to include the term &amp;#8220;commodity machines&amp;#8221;!)&lt;/p&gt;

&lt;p&gt;Next up is the &lt;a href='http://research.google.com/archive/mapreduce.html'&gt;MapReduce&lt;/a&gt; paper from 2004. MapReduce has become synonymous with Big Data. Legend has it that Google used it to compute their search indices. I imagine it worked like this: They have all the crawled web pages sitting on their cluster and every day or so they ran MapReduce to recompute everything.&lt;/p&gt;

&lt;p&gt;Next up is the &lt;a href='http://research.google.com/archive/bigtable.html'&gt;Bigtable&lt;/a&gt; paper from 2006 which has become the inspiration for countless NoSQL databases like Cassandra, HBase, and others. About half of the architecture of Cassandra is modeled after BigTable, including the data model, SSTables, and write-ahead-logs (the other half being Amazon&amp;#8217;s Dynamo database for the peer-to-peer clustering model).&lt;/p&gt;

&lt;h2 id='percolator_handling_individual_updates'&gt;Percolator: Handling individual updates&lt;/h2&gt;

&lt;p&gt;Google didn&amp;#8217;t stop with MapReduce. In fact, with the exponential growth of the Internet, it became impractical to recompute the whole search index from scratch. Instead, Google developed a more incremental system, which still allowed for distributed computing.&lt;/p&gt;

&lt;p&gt;Now here is where it&amp;#8217;s getting interesting, in particular compared to what common messages from mainstream Big Data are. For example, they have reintroduced transactions, something NoSQL still tells you that you don&amp;#8217;t need or cannot have if you want to have scalability.&lt;/p&gt;

&lt;p&gt;In the &lt;a href='http://research.google.com/pubs/pub36726.html'&gt;Percolator&lt;/a&gt; paper from 2010, they describe how the Google is keeping its web search index up to date. Percolator is built on existing technologies like Bigtable, but adds transactions and locks on rows and tables, as well as notifications for change in the tables. These notifications are then used to trigger the different stages in a computation. This way, the individual updates can &amp;#8220;percolate&amp;#8221; through the database.&lt;/p&gt;

&lt;p&gt;This approach is reminiscent of stream processing frameworks (SPFs) like Twitter&amp;#8217;s &lt;a href='http://storm-project.net/'&gt;Storm&lt;/a&gt;, or Yahoo&amp;#8217;s &lt;a href='http://incubator.apache.org/s4/'&gt;S4&lt;/a&gt;, but with an underlying data base. SPFs usually use message passing and no shared data. This makes it easier to reason about what is happening, but also has the problem that there is no way to access the result of the computation unless you manually store it somewhere in the end.&lt;/p&gt;

&lt;h2 id='pregel_scalable_graph_computing'&gt;Pregel: Scalable graph computing&lt;/h2&gt;

&lt;p&gt;Eventually, Google also had to start mining graph data like the social graph in an online social network, so they developed &lt;a href='http://kowshik.github.com/JPregel/pregel_paper.pdf'&gt;Pregel&lt;/a&gt;, published in 2010.&lt;/p&gt;

&lt;p&gt;The underlying computational model is much more complex than in MapReduce: Basically, you have worker threads for each node which are run in parallel iteratively. In each so-called superstep, the worker threads can read messages in the node&amp;#8217;s inbox, send messages to other nodes, set and read values associated with nodes or edges, or vote to halt. Computations are run till all nodes have voted to halt. In addition, there are also Aggregators and Combiners which compute global statistics.&lt;/p&gt;

&lt;p&gt;The paper shows how to implement a number of algorithms like Google&amp;#8217;s PageRank, shortest path, or bipartite matching. My personal feeling is that Pregel requires even more rethinking on the side of the implementor than MapReduce or SPFs.&lt;/p&gt;

&lt;h2 id='dremel_online_visualizations'&gt;Dremel: Online visualizations&lt;/h2&gt;

&lt;p&gt;Finally, in another paper from 2010, Google describes &lt;a href='http://research.google.com/pubs/pub36632.html'&gt;Dremel&lt;/a&gt;, which is an interactive database with an SQL-like language for structured data. So instead of tables with fixed fields like in an SQL database, each row is something like a JSON object (to be more exact, the data is modeled by Google&amp;#8217;s &lt;a href='http://code.google.com/p/protobuf/'&gt;protocol buffer&lt;/a&gt; format which imposes restrictions on what fields are allowed). Internally, data is stored in a special format which makes sweeps through the data very efficient. Queries are pushed down to servers and then aggregated on their way back up and use some clever data format for maximum performance.&lt;/p&gt;

&lt;h2 id='big_data_beyond_mapreduce'&gt;Big Data beyond MapReduce&lt;/h2&gt;

&lt;p&gt;Google didn&amp;#8217;t stop with MapReduce, but they developed other approaches for applications where MapReduce wasn&amp;#8217;t a good fit, and I think this is an important message for the whole Big Data landscape. You cannot solve everything with MapReduce. You can make it faster by getting rid of the disks and moving all the data to in-memory, but there are tasks whose inherent structure makes it hard for MapReduce to scale.&lt;/p&gt;

&lt;p&gt;Open source projects have picked up on the more recent ideas and papers by Google. For example, Apache &lt;a href='http://incubator.apache.org/drill/'&gt;Drill&lt;/a&gt; is reimplementing the Dremel framework, while projects like Apache &lt;a href='http://incubator.apache.org/giraph/'&gt;Giraph&lt;/a&gt; and Stanford&amp;#8217;s &lt;a href='http://infolab.stanford.edu/gps/'&gt;GPS&lt;/a&gt; are inspired by Pregel.&lt;/p&gt;

&lt;p&gt;There are still other approaches as well. I&amp;#8217;m personally a big fan of stream mining (not to be confused with stream processing) which aims to process event streams with bounded computational resources by resorting to approximation algorithms. Noel Welsh has some &lt;a href='http://noelwelsh.com
/streaming-algorithms/2012/11/22/streaming-algorithms-scala-exchange-edition/'&gt;interesting slide&amp;#8217;s&lt;/a&gt; on the topic.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Related posts:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href='/2011/10/one-does-not-simply-scale-into-realtime-processing.html'&gt;One does not simply scale into real-time&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href='/2012/12/announcing-streamdrill-beta.html'&gt;Announcing streamdrill&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href='/2013/03/more-google-papers-megastore-spanner-voted-commits.html'&gt;More Google Big Data papers: Megastore and Spanner&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Edits: Feb 25, 2013 Clarified the Text on Dremel a bit based on &lt;a href='https://plus.google.com/117122816629542437147/posts/PV3a6W7FjWW'&gt;Dmitriy Belenko&amp;#8217;s comments&lt;/a&gt;, corrected post time.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Mar 12, 2013 Added link to follow-up post.&lt;/em&gt;&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/02/big-data-beyond-map-reduce-googles-papers.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/_VZHgJnu9mw" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Yet another Big Data whitepaper</title>
   <link href="http://blog.mikiobraun.de/2013/01/yet-another-big-data-whitepaper.html" />
   <updated>2013-01-17T15:45:00+01:00</updated>
   <published>2013-01-17T15:45:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/01/yet-another-big-data-whitepaper</id>
   <content type="html">&lt;p&gt;&lt;p&gt;I recently read the white paper &lt;a href='http://cra.org/ccc/docs/init/bigdatawhitepaper.pdf'&gt;&amp;#8220;Challenges and Opportunities with Big Data&amp;#8221;&lt;/a&gt; published by the Computing Community Consortium of the CRA. It was an interesting read, but I found it a bit too database centric, which is probably no suprise given its set of authors.&lt;/p&gt;

&lt;p&gt;White papers are an interesting class of documents. They&amp;#8217;re usually written from a high-level point of view and are often used to represent opinions. I often wonder what their purpose is. At least one is to form a citable piece of authority which others can then use to support whatever agenda they&amp;#8217;re up to. For example, McKinsey&amp;#8217;s &lt;a href='http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation'&gt;Big Data study&lt;/a&gt; is continually cited to say that Big Data will have a huge economic impact in the next years.&lt;/p&gt;

&lt;p&gt;The CCC report of course also believes that Big Data is The Next Big Thing, and discusses what it believes are the upcoming challenges. It first defines a Big Data Analysis Pipeline consisting of the steps&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Acquisition/Recording,&lt;/li&gt;

&lt;li&gt;Extraction/Cleaning/Annotation,&lt;/li&gt;

&lt;li&gt;Integration/Aggregation/Representation,&lt;/li&gt;

&lt;li&gt;Analysis/Modeling, and&lt;/li&gt;

&lt;li&gt;Interpretation,&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;as well as overall properties like Heterogeneity, Scale, Timeliness, Privacy, and Human Collaboration (i.e. cloud sourcing). The report then discusses each of these aspects by giving a brief overview of the state of the art and then discussing future and current challenges.&lt;/p&gt;

&lt;p&gt;It becomes pretty clear quickly that most of the authors come from a data base background, because the report outlines an approach similar to what has worked so well in the data base world. Basically, Big Data will be solved by the following two steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Transform raw data into a format which is more suitable for automated analysis, also dealing with noisy or incomplete data.&lt;/li&gt;

&lt;li&gt;Separate the &amp;#8220;what&amp;#8221; from the &amp;#8220;how&amp;#8221;, just as a SQL query declaratively says what to compute, leaving it up to the database to break this down into an optimal set of primitive search operations, which we would then know how to scale.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I have my doubts concerning both steps. Don&amp;#8217;t get me wrong, having a declarative query language which automatically scales to petabytes of data would be really cool, as well as automated ways to represent data effectively.&lt;/p&gt;

&lt;p&gt;But from an ML point of view, data is always noisy and also highly unstructured, in particular in a way we don&amp;#8217;t yet know how to interpret automatically. For example, if you take text, it&amp;#8217;s pretty clear for us as humans that there is certain information encoded in the text. We have taught computers to understand some of that information using representations like n-grams, some robust form of parsing, and so on, but it is also pretty clear that there is a lot of information we just don&amp;#8217;t know how to preserve. In other word, we just don&amp;#8217;t know how to preserve the information contained in the data for many media types, and existing techniques will likely destroy a substantial part of that information.&lt;/p&gt;

&lt;p&gt;Concerning a declarative query language, I think data base people probably don&amp;#8217;t have a good idea of just how divers Machine Learning methods can be. They are inspired by biology, physics, statistics, usually use vectorial representations and linear algebra for computations, or run complex Monte Carlo simulation, all of which cannot be naturally expressed using JOINs and similar things in a relational data base system. So when the report complains that&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Today&amp;#8217;s analysts are impeded by a tedious process of exporting data from the database, performing a non-SQL process and bringing the data back.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;the main reason is that the non-SQL process is so different from a relational world view that this is actually the most effective way to do the computation.&lt;/p&gt;

&lt;p&gt;Moreover, ML algorithms are often iterative and have a lot of state, meaning that you don&amp;#8217;t just run a query of sorts, but rather complex computations which update a lot of state again and again until your results have converged.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s probably possible to build such DSLs for subgroups of algorithms (for example, Bayesian algorithms, distributed linear algebra libraries, etc.), I&amp;#8217;m unsure whether there is a general query language which is not just a generic cluster job system. Existing approaches like map reduce or stream processing come with very little assumptions on your data and mostly manage just the parallelization aspect of your computations.&lt;/p&gt;

&lt;p&gt;Still, an interesting paper from an infrastructure point of view, but I feel like they&amp;#8217;re missing the point when it comes to the data analysis part.&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/01/yet-another-big-data-whitepaper.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/x_ZCM9cHWjs" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">What is streamdrill's trick?</title>
   <link href="http://blog.mikiobraun.de/2013/01/what-is-streamdrills-trick.html" />
   <updated>2013-01-15T16:46:00+01:00</updated>
   <published>2013-01-15T16:46:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/01/what-is-streamdrills-trick</id>
   <content type="html">&lt;p&gt;&lt;p&gt;In the previous posts I talked about what &lt;a href='http://streamdrill.com'&gt;streamdrill&lt;/a&gt; is &lt;a href='/2013/01/what-is-streamdrill-good-for.html'&gt;good for&lt;/a&gt; and &lt;a href='/2013/01/streamdrill-compared-top-k-problem.html'&gt;how it compares to other Big Data approaches&lt;/a&gt; to real-time event processing. Streamdrill solves the &lt;em&gt;top-k problem&lt;/em&gt; which consists in aggregating activities in an event stream over different timescales and identifying the most active types of events in real-time.&lt;/p&gt;

&lt;p&gt;So, how does streamdrill manage to deal with such large data volumes on a single node?&lt;/p&gt;

&lt;p&gt;Streamdrills efficiency is based on two algorithmic choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It uses exponential decay for aggregation.&lt;/li&gt;

&lt;li&gt;It bounds its resource usage by selectively discarding inactive entries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let&amp;#8217;s take these two things one at a time.&lt;/p&gt;

&lt;h2 id='exponential_decay_vs_exact_time_window'&gt;Exponential Decay vs. Exact Time Window&lt;/h2&gt;

&lt;p&gt;Usually, when you do trending, you aggregate over some fixed time window. The easiest way to do this is to keep counting and reset the counters at a fixed point in time. For example, you could compile daily stats by taking the values at midnight and resetting the counters.&lt;/p&gt;

&lt;p&gt;The problem with that approach is that you have to wait for the current interval to end before you have some numbers, which is hardly real-time. You could take the current value of the counter nevertheless and then try to extrapolate in some way, which is also hard if the rate at which events come in varies a lot.&lt;/p&gt;

&lt;p&gt;Another approach is to do a &amp;#8220;rolling window&amp;#8221;, for example, having the counts between now and the the same time yesterday. The problem with that approach is that you need to keep all the data somewhere so that you can subtract an event when it falls out of the window. If you want to aggregate over longer time intervals, say a month, this gets hard, as you have to potentially store millions of events somewhere.&lt;/p&gt;
&lt;div class='figure'&gt;
     &lt;img src='/images/intro-timescales.png' /&gt;
&lt;/div&gt;
&lt;p&gt;Another approach is to use some form of decay. The idea here is that the &amp;#8220;value&amp;#8221; of an event decays continually over time until it is eventually zero. That way, you again get an aggregate over some amount of time, although it&amp;#8217;s not exactly the same thing as a fixed time window. In the end, it doesn&amp;#8217;t really matter if you understand what it is exactly you&amp;#8217;re measuring.&lt;/p&gt;

&lt;p&gt;The good thing is that decaying counters can often be implemented in a way which does not require us to keep all of the information, but only a few numbers per entry.&lt;/p&gt;

&lt;p&gt;Streamdrill does exactly this, using exponential decay. This means that after a specified amount of time, the value of an event has continually reduced to one half, and then again to a quarter, and so on. We chose exponential decay because it has the nice property that a counter always decays to half its original value after a given amount of time, irrespective of the initial value of the counter. That is not the case, for example, for linear decay, where a counter starting at 1000 would take twice as long as one starting at 500.&lt;/p&gt;

&lt;h2 id='keeping_resource_usage_bounded'&gt;Keeping resource usage bounded&lt;/h2&gt;

&lt;p&gt;As I said above, streamdrill throws data away. More specifically, it removes the most inactive entries to make room for new ones. The main purpose of this technique is to ensure that the resource usages of streamdrill are bounded which is important to keep streamdrill&amp;#8217;s performance constant.&lt;/p&gt;

&lt;p&gt;This is also something we&amp;#8217;ve found people having a hard time to digest. After all, isn&amp;#8217;t the whole purpose of Big Data to never throw away data?&lt;/p&gt;

&lt;p&gt;First of all, note that there really is no way around throwing data away if you want to have a system which runs in a stable manner. Otherwise, data will just accumulate and your system is bound to become slower, unless you are able to grow it. But that costs money and adds a layer of complexity to your system you should not underestimate.&lt;/p&gt;

&lt;p&gt;Throwing data for inactive entries away also has the nice effect that you focus on the data which is much more likely to make the top-k entries.&lt;/p&gt;

&lt;p&gt;Second of all, as I&amp;#8217;ve also discussed in &lt;a href='/2012/08/why-you-dont-want-real-time-analytics-to-be-exact.html'&gt;another post&lt;/a&gt;, for many applications, in particular if you&amp;#8217;re really interested in finding the most active elements, it&amp;#8217;s completely ok if you get only approximate results. The counters you get will be approximate, but the overall ranking of the elements will be correct with high probability.&lt;/p&gt;

&lt;p&gt;In fact, approximative algorithms are nothing to be afraid of. &lt;a href='http://en.wikipedia.org/wiki/Approximation_algorithm'&gt;Approximation algorithms&lt;/a&gt; have been around for a long time. For many &lt;a href='http://sigact.org/Prizes/Godel/2010.html'&gt;hard optimization problems&lt;/a&gt;, approximative algorithms are the only way we know how to tackle those problems. Such algorithms trade exactness for resource usage, but come with performance guarantees, meaning that if you can afford the computation time or memory usage, you can get the approximation error as small as you wish. The same is true for streamdrill. If you have enough memory to keep all the events, you will get the correct answers.&lt;/p&gt;

&lt;p&gt;Streamdrill is even attractive in use cases where you need exact results (for example, in billing), because you can combine it with an (supposedly already existing) batch system, to get real-time analytics without having to invest heavily to scale your batch system to real-time.&lt;/p&gt;

&lt;p&gt;So this concludes the mini-series for streamdrill. If you&amp;#8217;re interested, head over to &lt;a href='http://streamdrill.com'&gt;streamdrill&lt;/a&gt; and download the demo, or contact &lt;a href='http://twitter.com/thinkberg'&gt;Leo&lt;/a&gt; or &lt;a href='http://twitter.com/mikiobraun'&gt;me&lt;/a&gt; on Twitter, or post your comments and questions below.&lt;/p&gt;

&lt;p&gt;We&amp;#8217;re currently planning what features to add for the 1.0 version and thinking the details of the licensing model. Currently we&amp;#8217;re thinking about both standalone licenses, as well as SaaS-type offerings.&lt;/p&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/01/what-is-streamdrills-trick.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/1ltKylF0avg" height="1" width="1"/&gt;</content>
 </entry>
 

 
   <entry>
   <title type="html">Streamdrill compared to other approaches for the Top-K-Problem</title>
   <link href="http://blog.mikiobraun.de/2013/01/streamdrill-compared-top-k-problem.html" />
   <updated>2013-01-10T23:11:00+01:00</updated>
   <published>2013-01-10T23:11:00+01:00</published>
   <author>
     <name>Mikio L. Braun</name>
     <uri>http://mikiobraun.de</uri>
     <email>mikiobraun@gmail.com</email>
   </author>
   <id>http://blog.mikiobraun.de/2013/01/streamdrill-compared-top-k-problem</id>
   <content type="html">&lt;p&gt;&lt;p&gt;In my &lt;a href='/2013/01/what-is-streamdrill-good-for.html'&gt;last post&lt;/a&gt; I&amp;#8217;ve discussed what streamdrill does: It solves the top-k problem in real-time which consists in counting activities of different event times over a certain time interval.&lt;/p&gt;

&lt;p&gt;So far, so good, but you might wonder why you couldn&amp;#8217;t just do this by yourself. After all, it&amp;#8217;s just &lt;em&gt;counting&lt;/em&gt;, right? Actually, the problem that streamdrill solves is simple, as long as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the data volume is small&lt;/li&gt;

&lt;li&gt;the time windows are small compared to the data volume&lt;/li&gt;

&lt;li&gt;the number different kinds of events is small and bounded&lt;/li&gt;

&lt;li&gt;you don&amp;#8217;t need the results in real-time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But as soon as you have millions of events per day, wish to aggregate over days, weeks, or even months, and potentially have several million of different types of events, the problem gets quite complicated.&lt;/p&gt;

&lt;p&gt;You may end up in such a situation faster than you think:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Event rates may rise up beyond what you originally envisioned (in particular if you&amp;#8217;re successful. &lt;tt&gt;;-)&lt;/tt&gt;)&lt;/p&gt;
&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;The number of different event types may explode. This might either be because the underlying sets are already large (say, IP addresses, or users in a social network), or because you are tracking combinations such that the sizes multiply.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let&amp;#8217;s dig deeper into this problem to better understand why this problem quickly gets hard.&lt;/p&gt;

&lt;p&gt;One thing you need to keep in mind is that the other solutions we will discuss still involve some amount of coding. So if you compare streamdrill against &lt;a href='http://hadoop.apache.org/'&gt;Hadoop&lt;/a&gt;, you would need to do a non-trivial amount of coding for Hadoop, because Hadoop is a general purpose framework taking care of the scaling, but doesn&amp;#8217;t solve the top-k problem out of the box.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
	&lt;tr&gt;
	&lt;td /&gt;
	&lt;th&gt;streamdrill&lt;/th&gt;
	&lt;th&gt;Standard SQL&lt;/th&gt;
	&lt;th&gt;Counters&lt;/th&gt;
	&lt;th&gt;Stream Processing&lt;/th&gt;
	&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;solves top-k out of the box&lt;/td&gt;
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;
&lt;/tr&gt;	
&lt;tr&gt;&lt;td&gt;real-time&lt;/td&gt;
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;
	&lt;td&gt;(&amp;#x2713; \$\$\$)&lt;/td&gt;
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;focusses computation on "hot set"&lt;/td&gt;
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;memory based for high throughput&lt;/td&gt;
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;persistent&lt;/td&gt;
	&lt;td&gt;(&amp;#x2713; &amp;#x231A;)&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
	&lt;td&gt;(&amp;#x2713;)&lt;/td&gt;		
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;scales to cluster&lt;/td&gt;
	&lt;td&gt;(&amp;#x2713; &amp;#x231A;)&lt;/td&gt;		
	&lt;td&gt;&amp;#x2715;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;		
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;exact results&lt;/td&gt;
	&lt;td&gt;(&amp;#x2713; &amp;#x231A;)&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;		
	&lt;td&gt;&amp;#x2713;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&amp;#10003; = yes, &amp;#10005; = no, (&amp;#10003;) = possible, (&amp;#10003; \$\$\$) = possible, but expensive, (&amp;#10003; &amp;#8986;) = possible, but not yet.&lt;/p&gt;

&lt;h2 id='approach_1_store_and_crunch_later'&gt;Approach 1: Store and crunch later&lt;/h2&gt;
&lt;div class='figure'&gt;
&lt;img src='/images/streamdrill-store-and-crunch.png' /&gt;
&lt;/div&gt;
&lt;p&gt;Let&amp;#8217;s start by using a traditional approach based on some SQL database. To do that, you would create a table with a timestamp column and columns to reflect the fields in your data. Then you would just pipe in each event into the database. To get the counts you would run something like &lt;code&gt;SELECT count(*) FROM events WHERE timestamp
&amp;gt; '2012-12-01 00:00' AND timestamp &amp;lt; '2012-12-31 23:59'&lt;/code&gt; potentially also grouping to focus on certain types of events, and adding an &lt;code&gt;ORDER BY count(*)&lt;/code&gt; clause to get the most active elements.&lt;/p&gt;

&lt;p&gt;There a number of problems with this approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you use a normal disk based database, you will be able to add only a few hundred, at best thousands, events per second,&lt;/li&gt;

&lt;li&gt;As you add more data, your database will become slower over time (and you will be adding a lot of data!). &lt;a href='http://twitter.com/thinkberg'&gt;Leo&lt;/a&gt; has a number of nice benchmarks on his blog for &lt;a href='http://thinkberg.com/space/start/2009-09-25/1#NoSQL_:_MongoDB_performance_testing_%28part_1:_insert%29...'&gt;MongoDB&lt;/a&gt; and &lt;a href='http://thinkberg.com/space/start/2010-05-12/1#NoSQL_revisited:_Cassandra_/_Scala'&gt;Cassandra&lt;/a&gt; insertion performance.&lt;/li&gt;

&lt;li&gt;Just adding the data is not enough, you also need to crunch the whole data to compute the activities. But the longer the time window, the longer will the query take to run.&lt;/li&gt;

&lt;li&gt;While the query runs, there will be considerable load on your server, making the addition of events even slower.&lt;/li&gt;

&lt;li&gt;Eventually, you will get so slow that your results will already be a few minutes or even hours old once you get them. Hardly real-time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What&amp;#8217;s more, you&amp;#8217;re probably only interested in the top 100 active elements, so most of the computation is spent on data you&amp;#8217;re not interested in.&lt;/p&gt;

&lt;p&gt;In other words, putting the data into some big database and crunching it later won&amp;#8217;t really scale. If you&amp;#8217;ve got a lot of money on the side, you can employ some form of clustering using map reduce or a similar approach to bring down the times, but the bottom line is the same: You crunch a lot of data which you don&amp;#8217;t really need, and the problem will only become harder if you get more and more data. And &amp;#8220;harder&amp;#8221; also means a lot of coding and operations work (which you don&amp;#8217;t have if you just use streamdrill &lt;tt&gt;;)&lt;/tt&gt;).&lt;/p&gt;

&lt;h2 id='approach_2_just_keeping_the_counters'&gt;Approach 2: Just keeping the counters&lt;/h2&gt;
&lt;div class='figure'&gt;
&lt;img src='/images/streamdrill-counters.png' /&gt;
&lt;/div&gt;
&lt;p&gt;So just storing the data away and crunching it later won&amp;#8217;t work. So how about doing the counting on the spot? That way, you wouldn&amp;#8217;t have to store all those duplicate events. Note that just keeping the counters isn&amp;#8217;t sufficient, you also need to maintain the global index such that you can quickly identify the top-k entries.&lt;/p&gt;

&lt;p&gt;Using a complex event processing framework like &lt;a href='http://esper.codehaus.org/'&gt;Esper&lt;/a&gt; would also be a way to reduce the coding load, as Esper comes with a nice query language which let&amp;#8217;s you formulate averages over time windows in compact way.&lt;/p&gt;

&lt;p&gt;Let&amp;#8217;s assume your data doesn&amp;#8217;t fit into memory anymore (otherwise it won&amp;#8217;t be Big Data, right?). One option is to again store the counters in a database. However, just as in the previous example this restricts the number of updates you can handle. Also, you will generate a lot of changes on the database and not all databases handle that amount of write throughput gracefully. For example, &lt;a href='http://cassandra.apache.org/'&gt;Cassandra&lt;/a&gt; only marks old entries for deletion and cleans up during the compaction phases. &lt;a href='http://de.slideshare.net/mikiobraun/cassandra-an-introduction'&gt;In our experience&lt;/a&gt;, such compactions will eventually take hours and put significant load on your system, cutting the throughput in half.&lt;/p&gt;

&lt;p&gt;And again, most of the time is spent on elements which will never make the top-k entries.&lt;/p&gt;

&lt;h2 id='approach_3_stream_processing_frameworks'&gt;Approach 3: Stream processing frameworks&lt;/h2&gt;
&lt;div class='figure'&gt;
&lt;img src='/images/streamdrill-stream-processing.png' /&gt;
&lt;/div&gt;
&lt;p&gt;Instead of keeping counters in a database, you could also try and scale out using a stream processing framework like Twitter&amp;#8217;s &lt;a href='http://storm-project.net/'&gt;Storm&lt;/a&gt;, or Yahoo&amp;#8217;s &lt;a href='http://incubator.apache.org/s4/'&gt;S4&lt;/a&gt;. Such systems let you define the computation tasks in the form of small worker threads which are then distributed over a cluster automatically by the framework, also keeping the counters in memory.&lt;/p&gt;

&lt;p&gt;While this looks appealing (and in fact, allows you to scale to several hundred thousand events per second), note that this only solves the counting part, but not the global index of all activities. Computing that in a way which scales is non-trivial. You can collect the counter updates at a worker thread which then maintains the index, but what if it doesn&amp;#8217;t fit into memory? You could partition the index, but then you&amp;#8217;d have to aggregate the data to compute queries, and you&amp;#8217;d have to do this yourself, so again, a lot of complexity. The above stream processing frameworks also don&amp;#8217;t come with easy support for query, so you&amp;#8217;d need to build some infrastructure to collect the results yourself.&lt;/p&gt;

&lt;p&gt;And again, you also have a lot of computation for elements which will never show up in your top 100.&lt;/p&gt;

&lt;h2 id='in_summary'&gt;In summary&lt;/h2&gt;

&lt;p&gt;While conceptually simple (in the end, you just count, right), the top-k problem which streamdrill addresses becomes hard if there are more things to count than fit into memory, and the event rate is higher than what you can write to disk.&lt;/p&gt;

&lt;p&gt;Finally, let&amp;#8217;s discuss some of the differences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;As streamdrill is memory based, all the data is lost when streamdrill crashes or is restarted. However, we already have functionality in the engine to write snapshots of all the data, but those aren&amp;#8217;t available yet via the API streamdrill.&lt;/p&gt;
&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;Right now, streamdrill does not support clustering. We just haven&amp;#8217;t found it necessary so far, but it&amp;#8217;s something that is possible and will be included soon.&lt;/p&gt;
&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;Finally, as I&amp;#8217;m going to explain in more depth in the next post, streamdrill is based on approximate algorithms which trade exactness versus performance. Again, if exactness is really an issue, you can get it by combining with one of the other technologies. This is possible, but not our top priority for now.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the next post, I&amp;#8217;ll explain how streamdrill approaches the problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related posts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href='/2013/01/what-is-streamdrill-good-for.html'&gt;What is streamdrill good for?&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href='/2011/10/one-does-not-simply-scale-into-realtime-processing.html'&gt;One does not simply scale into real-time&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href='/2012/12/download-streamdrill-demo.html'&gt;Download the streamdrill demo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;
   &lt;p&gt;&lt;a href="http://blog.mikiobraun.de/2013/01/streamdrill-compared-top-k-problem.html"&gt;Click here for the full article&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/MarginallyInteresting/~4/VdDzwo3uZ_8" height="1" width="1"/&gt;</content>
 </entry>
 

 
 
</feed>
