<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>DataStax » Blog Post – Corporate</title>
	
	<link>http://www.datastax.com</link>
	<description>DataStax - Software, support, and training for Apache Cassandra</description>
	<lastBuildDate>Wed, 22 Feb 2012 20:22:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/datastax" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="datastax" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>DataStax Enterprise 1.0.2 Service Pack Now Available</title>
		<link>http://www.datastax.com/2012/02/datastax-enterprise-1-0-2-service-pack-now-available</link>
		<comments>http://www.datastax.com/2012/02/datastax-enterprise-1-0-2-service-pack-now-available#comments</comments>
		<pubDate>Fri, 17 Feb 2012 13:38:33 +0000</pubDate>
		<dc:creator>Robin Schumacher</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=9361</guid>
		<description><![CDATA[
We&#8217;re pleased to let you know that DataStax Enterprise service pack 1.0.2 is now available for <a href="http://www.datastax.com/download/enterprise/versions">download</a>. Please see the <a href="http://www.datastax.com/docs/1.0/datastax_enterprise/dse_release_notes">release notes</a> for the changes included in the service pack and the <a href="http://www.datastax.com/docs">online documentation</a> for upgrade instructions. ]]></description>
			<content:encoded><![CDATA[<p>
We&#8217;re pleased to let you know that DataStax Enterprise service pack 1.0.2 is now available for <a href="http://www.datastax.com/download/enterprise/versions">download</a>. Please see the <a href="http://www.datastax.com/docs/1.0/datastax_enterprise/dse_release_notes">release notes</a> for the changes included in the service pack and the <a href="http://www.datastax.com/docs">online documentation</a> for upgrade instructions. ]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/02/datastax-enterprise-1-0-2-service-pack-now-available/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why people care about big data and multi-datacenter replication</title>
		<link>http://www.datastax.com/2012/02/why-people-care-about-big-data-and-multi-datacenter-replication</link>
		<comments>http://www.datastax.com/2012/02/why-people-care-about-big-data-and-multi-datacenter-replication#comments</comments>
		<pubDate>Sun, 05 Feb 2012 03:00:17 +0000</pubDate>
		<dc:creator>Billy Bosworth</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=9230</guid>
		<description><![CDATA[This is my final post in a series (<a href="http://www.datastax.com/2012/01/why-should-i-use-cassandra" target="_blank">here</a>, <a href="http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale" target="_blank">here</a>, and <a href="http://www.datastax.com/2012/01/when-failure-is-not-an-option-for-your-big-data-system" target="_blank">here</a>) breaking down the following paragraph from our recent <a href="http://www.datastax.com/2012/01/datastax-take-apache-cassandra-mainstream-in-2011-poised-for-growth-and-innovation-in-2012" target="_blank">press release</a> that touches on some of the key reasons why people choose Cassandra.  It reads:
<span style="color: #808080;">Customers this year chose</span>&#8230;]]></description>
			<content:encoded><![CDATA[This is my final post in a series (<a href="http://www.datastax.com/2012/01/why-should-i-use-cassandra" target="_blank">here</a>, <a href="http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale" target="_blank">here</a>, and <a href="http://www.datastax.com/2012/01/when-failure-is-not-an-option-for-your-big-data-system" target="_blank">here</a>) breaking down the following paragraph from our recent <a href="http://www.datastax.com/2012/01/datastax-take-apache-cassandra-mainstream-in-2011-poised-for-growth-and-innovation-in-2012" target="_blank">press release</a> that touches on some of the key reasons why people choose Cassandra.  It reads:
<blockquote><span style="color: #808080;">Customers this year chose Cassandra time and time again over competing solutions. The peer-to-peer design allows for high performance with linear scalability and no single points of failure, even <span style="color: #000080;"><strong>across multiple data centers</strong></span>.  Combine this with native <span style="color: #000080;"><strong>optimization for the cloud</strong> </span>and an extremely robust data model and Cassandra clearly stands apart from the competition for enterprise, mission-critical systems. </span>[emphasis added]</blockquote>
For many, the idea of spanning multiple datacenters with a single database conjures images of late nights, amazing complexity, and delicate &#8220;bubble gum and shoestring&#8221; solutions that, once created, you would never even think about touching for fear of watching them crumble.  The architectures were so challenging that the cost and complexity simply outweighed the benefits.  But in today&#8217;s world, spanning datacenters is becoming more than a benefit &#8212; it&#8217;s now a requirement.  Let me share a few examples:
<ul>
	<li>Disaster avoidance.  Some companies need to plan for a worst-case scenario where they lose contact with an entire datacenter (or &#8220;region&#8221; in Amazon&#8217;s cloud).  During the outage, their application running on the database must not fail.</li>
	<li>Performance.  Taking the processing to the locality where the application is interacting with its users.</li>
	<li>Scale.  One customer of ours is running an application simultaneously collecting massive amounts of device data from its infrastructure in more than ten (10) datacenters across the globe.</li>
	<li>Security.  Certain industries have regulations that require local copies of the data, but nobody wants to wait on things like log shipping and batch loads.</li>
</ul>
The world has shrunk for businesses, and IT is often in a race to catch up.  The fact is that big data now comes from a wide variety of geographic locations.  It only makes sense that developers and operations teams are going to require that their underlying database keep pace.  But because of all the &#8220;scar tissue&#8221; formed over many years of trying to keep even two (2) datacenters in sync (let alone more), people have a healthy skepticism as to whether it can really be done.

Not only &#8220;can&#8221; it be done, it &#8220;must&#8221; be done as that is going to be a common requirement in this big data world.

I think about it a little like wireless.  When wireless networking became a reality, many were ultra skeptical.  Remember how hard everyone fought to get their computers and phones in their homes and offices in just the right places with just the right wiring?  Wireless, at first, seemed far too complicated and a &#8220;pie in the sky&#8221; vision.  But once we started using it, there was no going back.  The benefits were too great, and we all started demanding it.

When I talk to customers whose businesses now live and love multi-datacenter replication, I can&#8217;t help but think that in a few years, we&#8217;ll all be looking back and asking: &#8220;How did I ever build applications on just one, or even two, datacenters?&#8221;

&nbsp;]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/02/why-people-care-about-big-data-and-multi-datacenter-replication/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Big Statements about Big Data</title>
		<link>http://www.datastax.com/2012/02/big-statements-about-big-data</link>
		<comments>http://www.datastax.com/2012/02/big-statements-about-big-data#comments</comments>
		<pubDate>Wed, 01 Feb 2012 15:13:38 +0000</pubDate>
		<dc:creator>Robin Schumacher</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=9175</guid>
		<description><![CDATA[I was surprised at how many <a href="http://www.datastax.com/2011/11/the-priority-of-big-data-in-2012">news stories</a> at the end of 2011 contained predictions that big data would be ‘big’ in 2012, and thought that the chorus of voices talking about big data would lessen a bit in the early part of this year.

How wrong I&#8230;]]></description>
			<content:encoded><![CDATA[<p>
I was surprised at how many <a href="http://www.datastax.com/2011/11/the-priority-of-big-data-in-2012">news stories</a> at the end of 2011 contained predictions that big data would be ‘big’ in 2012, and thought that the chorus of voices talking about big data would lessen a bit in the early part of this year.

How wrong I was.

Instead, the momentum and grandeur of the statements being made have only grown. Witness <a href="http://online.wsj.com/article/SB10001424052970203471004577140413041646048.html?mod=wsj_share_in_bot">this story in the Wall Street Journal</a> that calls out big data as one of the three technologies that will help pull the present economy out of recession, just as electrification, telephony, and the automobile did for the recession of 1912.

That’s bold.

ZDNet columnist Dion Hinchcliffe was ahead of the WSJ and put big data as the foundation of what he believes are <a href="http://www.zdnet.com/blog/hinchcliffe/the-big-five-it-trends-of-the-next-half-decade-mobile-social-cloud-consumerization-and-big-data/1811?tag=mantle_skin;content">the five major shifts in 21<sup>st</sup> century information technology</a>.

<a rel="attachment wp-att-9176" href="http://www.datastax.com/wp-content/uploads/2012/02/5-shifts-in-21st-century-technology.jpg" rel="facebox"><img class="aligncenter size-medium wp-image-9176" title="5 shifts in 21st century technology" src="http://www.datastax.com/wp-content/uploads/2012/02/5-shifts-in-21st-century-technology-232x300.jpg" alt="" width="232" height="300" /></a>

As a reminder, big data doesn’t necessarily only mean dealing with petabytes of data, although that can certainly be part of the equation. Big data, as defined by Gartner Group, encompasses:
<ul>
	<li>Velocity – the speed at which data is coming in. That data may sum to only GBs rather than TBs or PBs in the end, but the pace at which it enters the database (whether in bursts or continuously) is extremely rapid</li>
	<li>Variety – this includes structured, semi-structured, and unstructured data, perhaps all in the same database</li>
	<li>Volume – indeed big data can mean “big” data.  The desire of many companies to have large volumes of corporate data at the ready for either real-time or batch analysis has never been higher</li>
	<li>Complexity – big data can bring a lot of complication in the form of heavy data distribution, multi-geography and data center topologies, separation of workloads (e.g. real-time, batch), ETL processes that keep data on the move, and more</li>
</ul>
These four dimensions comprise very real challenges and very real opportunities for those who can effectively tame big data and make it work for them. A good example of the types of benefits that flow from big data is seen in <a href="http://money.cnn.com/news/newsfeeds/gigaom/articles/cleantech_10_ways_big_data_is_remaking_energy.html?iid=SF_T_LN">this recent article from CNN Money</a> that highlights the use cases of big data in the energy marketplace.

We’re excited at DataStax about seeing our customers succeed in their big data projects through a blend of open source technology and commercial software. Our <a href="http://www.datastax.com/products/enterprise">DataStax Enterprise big data platform</a> provides the best of both worlds and supplies proven technology (Cassandra and Hadoop) that future-proofs any application, along with cost savings of 80-90% compared with what the RDBMS vendors charge.

For more information, download our <a href="http://www.datastax.com/wp-content/uploads/2011/10/WP-DataStax-BigData.pdf">big data white paper</a> along with a <a href="http://www.datastax.com/download/enterprise">copy of DataStax Enterprise</a> (free for development use – no limits or strings attached).

&nbsp;

&nbsp;]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/02/big-statements-about-big-data/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Apache Cassandra on Windows? Absolutely!</title>
		<link>http://www.datastax.com/2012/01/apache-cassandra-on-windows-absolutely</link>
		<comments>http://www.datastax.com/2012/01/apache-cassandra-on-windows-absolutely#comments</comments>
		<pubDate>Tue, 31 Jan 2012 13:14:56 +0000</pubDate>
		<dc:creator>Robin Schumacher</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=9165</guid>
		<description><![CDATA[Outside of politics and religion, nothing can get the blood rolling in tech folk’s bodies more than a healthy debate on the merits of operating systems, such as Linux vs. Windows vs. Mac. Personally, I’ve never used Mac in a production environment, but I have plenty of experience with&#8230;]]></description>
			<content:encoded><![CDATA[<p>
Outside of politics and religion, nothing can get the blood rolling in tech folk’s bodies more than a healthy debate on the merits of operating systems, such as Linux vs. Windows vs. Mac. Personally, I’ve never used Mac in a production environment, but I have plenty of experience with databases on Linux and Windows. And I have a confession to make: I’ve always found Windows to be a pleasant experience.

I’ve worked with Oracle, SQL Server, MySQL, and PostgreSQL on Windows and in general was happy with the results. I remember a case at a large company where I worked: we benchmarked Oracle for a major trading system on a large HP-UX system against a Windows configuration that cost 75% less, and the Windows box won in 90% of the test cases.

In my experience, I also never saw all the supposed outages, performance issues, etc., that some report on Windows. That doesn’t mean my experience of managing databases on Windows was perfect, but overall, Windows did a very nice job for the systems I was assigned to care for.

A few months back we ran a poll asking what development platforms you used, and you said you used Microsoft Windows a lot (it was second on the list behind Mac). So we’ve listened to you and now made available a bundled installer for Windows that includes the latest version of Apache Cassandra, all utilities including the CQL interface, and our DataStax OpsCenter community edition. The MSI installer does everything for you, including creating all the Windows services you need, and takes about one minute to complete.

For a run-through of what the new Windows package looks like along with some pointers on getting started on Windows, see a <a href="http://www.datastax.com/resources/articles/getting-started-with-cassandra-on-windows">new article</a> I’ve posted here on using our new installer on a single Windows box. The installer is primarily designed for single Windows installs on workstations and laptops, but you can create multi-node setups with a little manual tweaking – something I demonstrate in <a href="http://www.datastax.com/resources/articles/setup-and-monitor-a-multi-node-cassandra-cluster-on-windows">this article</a> that shows how to create and monitor a new Cassandra cluster on Windows.

Today, we support Windows 7 and Windows Server 2008, both 32- and 64-bit, for development work only. Production support will be coming soon.

<a href="http://www.datastax.com/download/community">Download our new Windows package</a> and let us know what you think. And thanks for continuing to support Apache Cassandra and DataStax!

&nbsp;]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/apache-cassandra-on-windows-absolutely/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>DataStax OpsCenter Now Available on Mac</title>
		<link>http://www.datastax.com/2012/01/datastax-opscenter-now-available-on-mac</link>
		<comments>http://www.datastax.com/2012/01/datastax-opscenter-now-available-on-mac#comments</comments>
		<pubDate>Tue, 31 Jan 2012 13:02:42 +0000</pubDate>
		<dc:creator>Robin Schumacher</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=9158</guid>
		<description><![CDATA[Well, you asked and we’ve responded. You said you wanted OpsCenter on Mac on one of the polls we recently took, and now you’ve got your wish.  You can now download and run DataStax OpsCenter on your Mac.

If you want a detailed run-through of how easy it is&#8230;]]></description>
			<content:encoded><![CDATA[<p>
Well, you asked and we’ve responded. You said you wanted OpsCenter on Mac on one of the polls we recently took, and now you’ve got your wish.  You can now download and run DataStax OpsCenter on your Mac.

If you want a detailed run-through of how easy it is to install both Cassandra and OpsCenter on Mac, see a new article I’ve posted <a href="http://www.datastax.com/resources/articles/working-with-apache-cassandra-on-mac-os-x">here</a>.

Keep in mind that we currently support OpsCenter on Mac (both Community and Enterprise versions) for development purposes only.

<a href="http://www.datastax.com/download/community">Download DataStax OpsCenter for Mac</a> and let us know what you think. And thanks, as always, for your support of DataStax and Apache Cassandra.

&nbsp;]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/datastax-opscenter-now-available-on-mac/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DataStax Community Server 1.0.7 Now Available</title>
		<link>http://www.datastax.com/2012/01/datastax-community-server-1-0-7-now-available</link>
		<comments>http://www.datastax.com/2012/01/datastax-community-server-1-0-7-now-available#comments</comments>
		<pubDate>Mon, 23 Jan 2012 13:21:14 +0000</pubDate>
		<dc:creator>Robin Schumacher</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=8937</guid>
		<description><![CDATA[
We&#8217;ve updated the DataStax Community Server to Apache Cassandra 1.0.7 and it&#8217;s now available for <a href="http://www.datastax.com/download/community">download on our community downloads page</a>. All changes can be found in the CHANGES.txt file in the main installation directory. Enjoy! ]]></description>
			<content:encoded><![CDATA[<p>
We&#8217;ve updated the DataStax Community Server to Apache Cassandra 1.0.7 and it&#8217;s now available for <a href="http://www.datastax.com/download/community">download on our community downloads page</a>. All changes can be found in the CHANGES.txt file in the main installation directory. Enjoy! ]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/datastax-community-server-1-0-7-now-available/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My thoughts on Amazon’s DynamoDB</title>
		<link>http://www.datastax.com/2012/01/my-thoughts-on-amazons-dynamodb</link>
		<comments>http://www.datastax.com/2012/01/my-thoughts-on-amazons-dynamodb#comments</comments>
		<pubDate>Thu, 19 Jan 2012 01:41:39 +0000</pubDate>
		<dc:creator>Billy Bosworth</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=8818</guid>
		<description><![CDATA[Earlier today Amazon announced <a href="http://aws.amazon.com/DynamoDB">DynamoDB</a>, a hosted database for Amazon Web Services, which prompted several friends and colleagues of mine to ping me for my thoughts.  They asked because Cassandra was initially created as a combination of Amazon Dynamo&#8217;s fully distributed architecture and Google Bigtable&#8217;s&#8230;]]></description>
			<content:encoded><![CDATA[<p>
Earlier today Amazon announced <a href="http://aws.amazon.com/DynamoDB">DynamoDB</a>, a hosted database for Amazon Web Services, which prompted several friends and colleagues of mine to ping me for my thoughts.  They asked because Cassandra was initially created as a combination of Amazon Dynamo&#8217;s fully distributed architecture and Google Bigtable&#8217;s rich data model.  So, naturally, they wondered if I saw this as a competitor to Cassandra.  I don&#8217;t typically get into the &#8220;feeds and speeds&#8221; differences in offerings, but if you are interested in that, I suggest you take a look at <a href="http://www.datastax.com/dev/blog/8761" target="_blank">this post</a> from Jonathan Ellis, the Apache Cassandra Chairman.  For my part, I&#8217;ll share a few of my thoughts at the business level.</p>

<p>Personally, I have never believed that other post-relational (aka NoSQL/Hadoop) database companies were our primary competition.  The brute fact of the matter is that if you put us all together, we are still not statistically relevant compared to the overall DBMS market.  In order to change <em>that</em>, we need to change the ecosystem itself.  Many decision makers who are excited about big data technologies are also frustrated at the lack of human resources who really know how to leverage them.  That isn&#8217;t just about ease of use either.  It&#8217;s about a fundamental <a href="http://www.datastax.com/2011/10/nosql-and-the-power-of-good-choices" target="_blank">shift in thinking</a> to solve problems in new and interesting ways.</p>

<p>When it comes to that kind of ecosystem evolution, we need more momentum than any of us new players can generate on our own &#8212; or even collectively.  That&#8217;s why I <a href="http://www.datastax.com/2011/10/nothing-like-having-oracle-validate-your-mission" target="_blank">got excited last year when Oracle</a> entered the market with a real-time, NoSQL solution of their own.  Today, I feel the same sense of excitement about Amazon&#8217;s announcement.  Their release is another huge validation stamp on what now looks like an irreversible path toward the need for real-time, big data databases.  It is a big step on the journey toward building a massive ecosystem of people who understand how to harness the power of these new technologies.</p>

<p>At this point you may say: &#8220;All well and good, but he&#8217;s dodging the question&#8230; are they competitive?&#8221;</p>

<p>Sure they are.  So is every other database technology out there today, in some form or fashion.  And you know who wins when we are all competing?  Customers!  It forces us all to create better and better solutions, like what we&#8217;re doing here at DataStax to leverage Cassandra in <a href="http://www.datastax.com/products/enterprise" target="_blank">really exciting ways</a>, not to mention the massive amount of time and <a href="http://www.datastax.com/wp-content/uploads/2011/09/WP-DataStax-Cassandra.pdf" target="_blank">improvements</a> that have been put into Cassandra itself.</p>

<p>Competition is always exciting.  So much so, that if no real competition exists, we humans will go to great lengths to manufacture it through games and contests!  (Just put two kids in a room for a while and watch what happens.)  In the big data world, it&#8217;s more than exciting &#8230; it&#8217;s necessary.</p>

<p>I had only one real personal fear coming into this market: That I would sink a big portion of my life into something that would never take hold in the mainstream.  I suspect that would be a truly awful ending for all of us in this space. But thanks to companies like Amazon and Oracle, that feels highly unlikely now, and that is a great thing.</p>]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/my-thoughts-on-amazons-dynamodb/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>When failure is not an option for your big data system</title>
		<link>http://www.datastax.com/2012/01/when-failure-is-not-an-option-for-your-big-data-system</link>
		<comments>http://www.datastax.com/2012/01/when-failure-is-not-an-option-for-your-big-data-system#comments</comments>
		<pubDate>Mon, 16 Jan 2012 18:09:52 +0000</pubDate>
		<dc:creator>Billy Bosworth</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=8450</guid>
		<description><![CDATA[Having tackled the fundamentals of the peer-to-peer architecture in my <a href="http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale" target="_blank">last post</a>, I now want to take a look at the section of this paragraph from our recent <a href="http://www.datastax.com/2012/01/datastax-take-apache-cassandra-mainstream-in-2011-poised-for-growth-and-innovation-in-2012" target="_blank">press release</a> that touches on one of the most important aspects of running a mission-critical, big data system.&#8230;]]></description>
			<content:encoded><![CDATA[Having tackled the fundamentals of the peer-to-peer architecture in my <a href="http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale" target="_blank">last post</a>, I now want to take a look at the section of this paragraph from our recent <a href="http://www.datastax.com/2012/01/datastax-take-apache-cassandra-mainstream-in-2011-poised-for-growth-and-innovation-in-2012" target="_blank">press release</a> that touches on one of the most important aspects of running a mission-critical, big data system.  The paragraph reads:
<blockquote><span style="color: #888888">Customers this year chose Cassandra time and time again over competing solutions. The peer-to-peer design allows for high performance with linear scalability </span>and <span style="color: #000080"><strong>no single points of failure</strong><span style="color: #808080">, even across multiple data centers</span></span>.  <span style="color: #808080">Combine this with native optimization for the cloud and an extremely robust data model and Cassandra clearly stands apart from the competition for enterprise, mission-critical systems. </span>[emphasis added]</blockquote>
While there are definitely some difficult concepts in the world of big data, &#8220;no single point of failure&#8221; isn&#8217;t one of them.  Someone a little more business oriented may well prefer the phrase &#8220;continuous availability&#8221;, which is the <em>result</em> of having a system with no single points of failure.  In either case, it basically means your system will remain available even under extreme circumstances because it is designed for failure.

You may have read that last sentence and said, &#8220;Typo!  You surely didn&#8217;t mean your system is designed for failure!&#8221;

But that&#8217;s precisely what I mean.  Notice that I didn&#8217;t say &#8220;your system is designed to fail.&#8221;  I said your &#8220;system is designed for failure,&#8221; meaning that architecturally it is built on the assumption that the components that make up the system will individually fail (maybe even very frequently) but that the system as a whole will remain available.  As Google said many years ago in <a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/gfs-sosp2003.pdf" target="_blank">this paper</a> on its distributed file system: &#8220;First, component failures are the norm rather than the exception&#8221;.

This idea of accounting for component failures is often equated with the idea of &#8220;scaling out&#8221; or &#8220;scaling horizontally&#8221;, which is the opposite of &#8220;scaling up&#8221; or &#8220;scaling vertically&#8221;.  When you scale horizontally, you add more machines to your system to increase capacity.  When you scale vertically, you add more capacity to your single machine.  For years, relational databases have handled increased capacity by scaling vertically, and aside from the other challenges that approach causes, it introduces a single point of failure that jeopardizes continuous availability.

But here&#8217;s the thing that many people don&#8217;t realize: scaling horizontally does not eliminate the challenge of a single point of failure.  To truly achieve continuous availability, you have to understand the system architecture, which goes back to the discussion in my <a href="http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale" target="_blank">last post</a> around the differences between &#8220;master/slave&#8221; and &#8220;distributed peer-to-peer.&#8221;  Read slaves in the master/slave architecture introduce <a href="http://adam.heroku.com/past/2009/7/6/sql_databases_dont_scale/" target="_blank">certain limitations</a>, and there&#8217;s no easy way of getting around that.

Conversely, Cassandra&#8217;s fully distributed architecture means every node is the same.  Every node is a master, and every node is a slave.  You never have to worry about how and when to add nodes of a certain type to increase capacity.  And that also means you never have to worry about losing a node in your system.  Failover testing isn&#8217;t required in Cassandra because Cassandra is constantly failing over from the moment you start your first cluster.

Or said more succinctly, Cassandra is truly built for continuous availability when failure is not an option.]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/when-failure-is-not-an-option-for-your-big-data-system/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Does anyone see the irony in Oracle’s Big Data Appliance (asked a friend of mine…)</title>
		<link>http://www.datastax.com/2012/01/does-anyone-see-the-irony-in-oracles-big-data-appliance-asked-a-friend-of-mine</link>
		<comments>http://www.datastax.com/2012/01/does-anyone-see-the-irony-in-oracles-big-data-appliance-asked-a-friend-of-mine#comments</comments>
		<pubDate>Wed, 11 Jan 2012 06:26:56 +0000</pubDate>
		<dc:creator>Billy Bosworth</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=8619</guid>
		<description><![CDATA[Indulge me in a little fun on this one.

Did you hear the one about the world&#8217;s largest database company <a href="http://www.computerworld.com/s/article/9223302/Oracle_Cloudera_unveil_Hadoop_appliance" target="_blank">partnering</a> with&#8230; umm&#8230; an open source&#8230; uhh&#8230; database company&#8230; to serve as a software layer on their&#8230; hardware?  Oh, never mind &#8212; who would believe a tale like&#8230;]]></description>
			<content:encoded><![CDATA[<p>Indulge me in a little fun on this one.</p>

<p>Did you hear the one about the world&#8217;s largest database company <a href="http://www.computerworld.com/s/article/9223302/Oracle_Cloudera_unveil_Hadoop_appliance" target="_blank">partnering</a> with&#8230; umm&#8230;. an open source&#8230; uhh&#8230; database company&#8230; to serve as a software layer on their&#8230;. hardware?  Oh, never mind &#8212; who would believe a tale like that?  But in the world of big data, anything is possible.  Last year we saw Oracle do a sudden <a href="http://www.datastax.com/2011/10/nothing-like-having-oracle-validate-your-mission" target="_blank">about-face</a> on NoSQL in general, and now they are partnering to try to quickly fill the gaps in their offering.  I think there are two broad takeaways here:</p>

<ol>
<li>Big data solutions are going mainstream fast.</li>
<li>These ain&#8217;t your father&#8217;s data problems.</li>
</ol>

<p>2012 is going to be such an exciting year in the big data space.  In fact, I think I just figured out what happened to the Mayan calendar&#8230; they hit a big data problem and their calendar app crashed!  We&#8217;ll rewrite it on Cassandra. <img src='http://www.datastax.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/does-anyone-see-the-irony-in-oracles-big-data-appliance-asked-a-friend-of-mine/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Choosing the right architecture for big data scale</title>
		<link>http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale</link>
		<comments>http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale#comments</comments>
		<pubDate>Wed, 11 Jan 2012 00:06:00 +0000</pubDate>
		<dc:creator>Billy Bosworth</dc:creator>
				<category><![CDATA[Blog Post]]></category>
		<category><![CDATA[Blog Post - Corporate]]></category>

		<guid isPermaLink="false">http://www.datastax.com/?p=8444</guid>
		<description><![CDATA[Following up from my <a href="http://www.datastax.com/2012/01/why-should-i-use-cassandra" target="_blank">last post</a>, let&#8217;s now take a look at the section of this paragraph from our recent <a href="http://www.datastax.com/2012/01/datastax-take-apache-cassandra-mainstream-in-2011-poised-for-growth-and-innovation-in-2012" target="_blank">press release</a> that deals with the foundation of Cassandra: its architecture.  The paragraph reads:
<span style="color: #808080">Customers this year chose Cassandra time and time again over</span>&#8230;]]></description>
			<content:encoded><![CDATA[<p>Following up from my <a href="http://www.datastax.com/2012/01/why-should-i-use-cassandra" target="_blank">last post</a>, let&#8217;s now take a look at the section of this paragraph from our recent <a href="http://www.datastax.com/2012/01/datastax-take-apache-cassandra-mainstream-in-2011-poised-for-growth-and-innovation-in-2012" target="_blank">press release</a> that deals with the foundation of Cassandra: its architecture.  The paragraph reads:</p>
<blockquote><span style="color: #808080">Customers this year chose Cassandra time and time again over competing solutions.</span> <span style="color: #333399"><strong>The peer-to-peer design allows for high performance with linear scalability </strong></span><span style="color: #808080">and no single points of failure, even across multiple data centers.  Combine this with native optimization for the cloud and an extremely robust data model and Cassandra clearly stands apart from the competition for enterprise, mission-critical systems.</span> [emphasis added]</blockquote>

<p>When dealing with new technologies, one of the easiest things to overlook is the architecture.  Talking about architecture is not sexy or glamorous, but it is the absolute foundation of everything you will do for years to come.  Make a mistake up front with most systems, and unwinding it later can be difficult.  Make a mistake with your big data architecture, and unwinding it later can be downright ugly if not impossible.</p>

<p>Today&#8217;s big data architectures come primarily in two flavors:  one where a single machine coordinates all activities for other machines in the cluster (aka master/slave); and one where all machines in the cluster are equal in type and function (aka peer-to-peer, or what others may call &#8220;fully distributed&#8221;).  Cassandra is built on the latter&#8211;a fully distributed peer-to-peer architecture based on something called Amazon Dynamo.  (For those who want to geek out on the details of Amazon Dynamo, you can read this <a href="http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf" target="_blank">paper</a>.)</p>

<p>The decision between the two is vitally important.  In master/slave architectures you have, by definition, a single point of failure in your master coordination node, and you have introduced some complexity into scaling.  There are techniques and tricks to try to mitigate this issue, but at the end of the day there is simply no free lunch, and dealing with it at some level is inescapable. We will touch more on this in my next post.</p>

<p>Another benefit of the architecture comes in terms of performance. The Cassandra developers are absolutely fanatical about performance and it <a href="http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-performance" target="_blank">shows</a>.  But what really still stuns me about Cassandra is not just its amazing performance, but that the performance scales <em>linearly</em>.  Think of the advantages this provides to the operations and capacity planning teams.  You don&#8217;t have to worry about what node types to add at what point along the way; you just keep adding nodes to the cluster and scale keeps going up exactly, mathematically, how you would expect it to.</p>

<p>An even more incredible aspect of this linear scale is that it is not limited to on-premises solutions.  This graph shows how one of our customers achieved perfectly linear scalability entirely in the cloud.</p>

<a rel="attachment wp-att-8586" href="http://www.datastax.com/wp-content/uploads/2012/01/scale1.png" rel="facebox"><img class="aligncenter size-large wp-image-8586" src="http://www.datastax.com/wp-content/uploads/2012/01/scale1-1024x768.png" alt="" width="550" height="412" /></a>

<p>When it comes to big data, choosing the right backend system is absolutely critical to the long-term success of your application and it&#8217;s worth a little up-front time investigating it.</p>]]></content:encoded>
			<wfw:commentRss>http://www.datastax.com/2012/01/choosing-the-right-architecture-for-big-data-scale/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

