<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Scalable web architectures</title>
	
	<link>http://www.royans.net/arch</link>
	<description>Building reliable, high performance, highly available clusters</description>
	<lastBuildDate>Sat, 13 Mar 2010 16:26:13 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
	
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/arch" /><feedburner:info uri="arch" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId>arch</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Scalability links for March 13th 2010</title>
		<link>http://feedproxy.google.com/~r/arch/~3/yGnUm7wug2c/</link>
		<comments>http://www.royans.net/arch/scalability-links-for-march-13th-2010/#comments</comments>
		<pubDate>Sat, 13 Mar 2010 15:39:04 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[cassandra]]></category>
		<category><![CDATA[updates]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/scalability-links-for-march-13th-2010/</guid>
		<description>For some reason there has been a disproportionately high number of news items on Cassandra lately. Some of those are included below, but also included are some other interesting updates which you might have missed.


Rackspace and Drizzle: Its time to rethink everything
Haproxy 1.4 – Now supports mysql health checks – This is a big deal [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-links-for-feb-28th-2010/' rel='bookmark' title='Permanent Link: Scalability links for Feb 28th 2010'&gt;Scalability links for Feb 28th 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-updates-for-feb-18-2010/' rel='bookmark' title='Permanent Link: Scalability updates for Feb 18, 2010'&gt;Scalability updates for Feb 18, 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-updates-for-jan-26th-2010/' rel='bookmark' title='Permanent Link: Scalability Updates for Jan 26th 2010'&gt;Scalability Updates for Jan 26th 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/links-on-scalability-performance-and-problems/' rel='bookmark' title='Permanent Link: Links on scalability, performance and problems'&gt;Links on scalability, performance and problems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-updates/' rel='bookmark' title='Permanent Link: Scaling updates for Feb 10, 2010'&gt;Scaling updates for Feb 10, 2010&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p>For some reason there has been a disproportionately high number of news items on Cassandra lately. Some of those are included below, but also included are some other interesting updates which you might have missed.</p>
<p><img style="float: right" alt="Cassandra" src="http://creatr.cc/creatr/logo/Cassandra.png?1268494660" width="158" height="77" /></p>
<ul>
<li><a href="http://www.rackspacecloud.com/blog/2010/03/13/rackspace-and-drizzle-its-time-to-rethink-everything/">Rackspace and Drizzle: Its time to rethink everything</a></li>
<li><a href="http://agiletesting.blogspot.com/2010/02/use-haproxy-14-if-you-need-mysql-health.html">Haproxy 1.4 – Now supports mysql health checks</a> – This is a big deal for anyone using HAProxy to load-balance MySQL servers. </li>
<li><a href="http://www.techcrunchit.com/2010/03/12/video-developer-doug-cutting-talks-about-the-founding-of-hadoop/">Video: Doug cutting : Founding of Hadoop</a> </li>
<li><a href="http://xaasblog.wordpress.com/2010/03/11/whiteboard-wednesday-horizontal-scaling-aka-load-balancing/">Horizontal scaling:&#160; aka loadbalancers</a> – This is a video. As they say, a picture is worth a thousand words. </li>
<li><a href="http://highscalability.com/blog/2010/3/10/how-farmville-scales-the-follow-up.html">How Farmville Scales – The Follow up</a> </li>
<li><a href="http://googleappengine.blogspot.com/2010/03/app-engine-joins-google-over-ipv6.html">Appengine joins IPv6</a> – This is the odd one out. I don’t think anyone other than Google is worried about IPv6 at this point. Still, IPv4 addresses really are running out, and IPv6 address space is easy to get. If you have customers on IPv6, it’s very helpful if the server is also on IPv6. But the infrastructure may not be there yet… </li>
<li><a href="http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/">Automated, faster, repeatable, scalable deployments</a>&#160; </li>
<li>Datastores </li>
<ul>
<li><a href="http://pl.atyp.us/wordpress/?p=2733">Concerns: NoSQL and Data Models</a> – Jeff had an issue with one of my previous posts. His concerns are interesting to read about. </li>
<li>HBase
<ul>
<li><a href="http://www.slideshare.net/ghelmling/hbase-at-meetup">HBase @ Meetup</a> </li>
</ul>
</li>
<li>Cassandra
<ul>
<li><a href="http://about.digg.com/blog/saying-yes-nosql-going-steady-cassandra">Saying yes to NoSQL, Going steady with Cassandra</a> – This is no surprise. It’s a huge endorsement for Cassandra. </li>
<li><a href="http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/">Cassandra: RandomPartitioner vs OrderPreservingPartitioner</a> </li>
<li>As I <a href="http://www.royans.net/arch/reddit-learning-from-mistakes/">suspected</a>, <a href="http://blog.reddit.com/2010/03/she-who-entangles-men.html">Reddit is now moving to Cassandra</a> – Another endorsement </li>
<li><a href="http://wiki.apache.org/cassandra/ArchitectureInternals">Architecture Internals</a> </li>
</ul>
</li>
<li>MongoDB
<ul>
<li><a href="http://www.paperplanes.de/2010/2/25/notes_on_mongodb.html">Notes on MongoDB</a> </li>
<li><a href="http://ivoras.sharanet.org/blog/tree/2010-02-20.mongodb-and-durability.html">MongoDB and durability</a> </li>
</ul>
</li>
</ul>
</ul>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/scalability-links-for-feb-28th-2010/' rel='bookmark' title='Permanent Link: Scalability links for Feb 28th 2010'>Scalability links for Feb 28th 2010</a></li>
<li><a href='http://www.royans.net/arch/scaling-updates-for-feb-18-2010/' rel='bookmark' title='Permanent Link: Scalability updates for Feb 18, 2010'>Scalability updates for Feb 18, 2010</a></li>
<li><a href='http://www.royans.net/arch/scalability-updates-for-jan-26th-2010/' rel='bookmark' title='Permanent Link: Scalability Updates for Jan 26th 2010'>Scalability Updates for Jan 26th 2010</a></li>
<li><a href='http://www.royans.net/arch/links-on-scalability-performance-and-problems/' rel='bookmark' title='Permanent Link: Links on scalability, performance and problems'>Links on scalability, performance and problems</a></li>
<li><a href='http://www.royans.net/arch/scaling-updates/' rel='bookmark' title='Permanent Link: Scaling updates for Feb 10, 2010'>Scaling updates for Feb 10, 2010</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/scalability-links-for-march-13th-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/scalability-links-for-march-13th-2010/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=scalability-links-for-march-13th-2010</feedburner:origLink></item>
		<item>
		<title>Automated, faster, repeatable, scalable deployments</title>
		<link>http://feedproxy.google.com/~r/arch/~3/oYQg6Y8aLm8/</link>
		<comments>http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 08:58:07 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[automation]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/</guid>
		<description>While efficient automated deployment tools like Puppet and Capistrano are a big step in the right direction, they are not a complete solution for an automated deployment process. This post explores some of the less-discussed issues that are just as important for automated, fast, repeatable, scalable deployments.&amp;#160; 
Rapid Build and Integration with tests

Use Source control [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-deployments/' rel='bookmark' title='Permanent Link: Scaling deployments'&gt;Scaling deployments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalable-logging-using-syslog/' rel='bookmark' title='Permanent Link: Scalable logging using Syslog'&gt;Scalable logging using Syslog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/' rel='bookmark' title='Permanent Link: Service registry (ESB) for scalable web applications.'&gt;Service registry (ESB) for scalable web applications.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/heroku-platform-for-scalable-applications/' rel='bookmark' title='Permanent Link: Heroku platform for scalable web applications'&gt;Heroku platform for scalable web applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/book-building-scalable-web-sites/' rel='bookmark' title='Permanent Link: Book: Building Scalable Web Sites'&gt;Book: Building Scalable Web Sites&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p>While efficient automated deployment tools like Puppet and Capistrano are a big step in the right direction, they are not a complete solution for an automated deployment process. This post explores some of the less-discussed issues that are just as important for automated, fast, repeatable, scalable deployments.&#160; </p>
<h4 align="left">Rapid Build and Integration with tests</h4>
<ul>
<li>Use Source control to build an audit trail: Put everything possible in it, including configurations and deployment scripts.</li>
<li>Continuous Builds triggered by code check-ins can detect and report problems early <a href="http://www.royans.net/arch/wp-content/uploads/2010/03/image1.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; float: right; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://www.royans.net/arch/wp-content/uploads/2010/03/image_thumb1.png" width="216" height="425" /></a>
<ul>
<li>Use tools that give targeted feedback about build failures. This reduces noise and improves overall quality faster. </li>
<li>The sooner a build runs after a check-in, the better the chances that bugs get fixed quickly. Delays can be costly, since a broken build can block other developers as well. </li>
<li>Build smaller components (fail fast) </li>
</ul>
</li>
<li>Continuous integration tests of all components can detect errors which may not be caught at build time. </li>
</ul>
<h4>Automated database changes</h4>
<p>Can database changes be automated? This is probably one of the most interesting challenges for automation, especially if the app requires data migrations which can’t be rolled back. While it would be nice to have only incremental changes introduced into each deployment (which are guaranteed to be forward and backward compatible), there might be some need for non-trivial changes once in a while. As long as there is a process to separate the trivial changes from the non-trivial ones, it should be possible to automate most database changes.</p>
<p>Tracking which migrations have been applied and which are pending is a very application-specific problem for which there are no silver bullets.</p>
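<p>As a rough illustration of the tracking problem, here is a minimal sketch in Python using SQLite. The <code>schema_migrations</code> table, the <code>MIGRATIONS</code> list and the function names are my own assumptions for this example, not something taken from a particular tool. Each applied version is recorded in the database itself, so the next deployment can work out what is still pending.</p>
<pre>
```python
import sqlite3

# Hypothetical migrations as (version, SQL) pairs; the schema is illustrative only.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]


def applied_versions(conn):
    """Return the set of migration versions already recorded in this database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)"
    )
    return {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}


def migrate(conn, migrations=MIGRATIONS):
    """Apply pending migrations in version order and return the versions applied."""
    done = applied_versions(conn)
    newly_applied = []
    for version, sql in sorted(migrations):
        if version in done:
            continue  # already applied by an earlier deployment
        with conn:  # commits the recorded version when the block succeeds
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
            )
        newly_applied.append(version)
    return newly_applied


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(migrate(connection))  # first run applies everything pending
    print(migrate(connection))  # second run finds nothing left to do
```
</pre>
<p>Running the same script on every deployment is safe because versions that are already recorded are skipped. Non-trivial, irreversible data migrations would still need the separate manual process described above.</p>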
<h4 align="left">Configuration management </h4>
<h5 align="left">Environment-specific properties</h5>
<p>It’s not unusual to have different sets of configuration for dev and production. But creating different build packages for different target environments is not the right solution. If you need to change properties between environments, pick a better way to do it:</p>
<ul>
<li>Either externalize the configuration properties to a file/directory location outside your app folder, such that repeated deployments don’t overwrite properties.</li>
<li>Or, update the right properties automatically during deployment using a deployment framework which is capable of that. </li>
</ul>
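<p>The first option can be sketched in a few lines of Python. The environment variable name <code>APP_CONFIG_FILE</code>, the JSON format and the property names are assumptions made up for this example. The same build package runs everywhere, and only the file outside the app folder differs per environment.</p>
<pre>
```python
import json
import os

# Defaults shipped inside the build package; identical for every environment.
DEFAULTS = {"db_host": "localhost", "db_port": 5432, "debug": False}


def load_config(env=os.environ):
    """Merge packaged defaults with overrides from a file outside the app folder.

    APP_CONFIG_FILE is a hypothetical variable name chosen for this sketch.
    """
    config = dict(DEFAULTS)
    path = env.get("APP_CONFIG_FILE")
    if path and os.path.exists(path):
        with open(path) as handle:
            config.update(json.load(handle))  # environment-specific overrides win
    return config
```
</pre>
<p>Because the override file lives outside the deployed folder, redeploying the package never overwrites it, and a machine without the file simply runs with the packaged defaults.</p>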
<h5 align="left">Pushing at deployment time or pulling at run time</h5>
<p>In some cases pulling new configuration files dynamically after application startup might make more sense. This is especially true for applications on an infrastructure like AWS/EC2. If the application is already deployed on the base OS image, it will come up automatically when the system boots. Some folks keep only minimal information in the base OS image and download the latest configuration from a datastore like S3. In a private network where using S3 is not possible, you could replace it with some kind of shared store such as SVN, NFS, FTP, SCP, HTTP, etc.</p>
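<p>A pull-at-run-time step might look roughly like the sketch below, assuming the shared store is reachable over plain HTTP and serves JSON. The function names and the fallback policy are my own choices for illustration. If the store can’t be reached, the application boots with whatever was baked into the image.</p>
<pre>
```python
import json
import urllib.request


def http_fetch(url, timeout=5):
    """Download and decode a JSON document from a shared store over HTTP."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pull_config(url, baked_in, fetch=http_fetch):
    """Return the latest remote configuration, falling back to the image's copy."""
    try:
        remote = fetch(url)
    except (OSError, ValueError):
        # Store unreachable or payload malformed: boot with what the image has.
        return dict(baked_in)
    merged = dict(baked_in)
    merged.update(remote)  # remote values override the baked-in ones
    return merged
```
</pre>
<p>The <code>fetch</code> parameter is there so that a different transport (S3, NFS, SCP and so on) can be swapped in without touching the merge and fallback logic.</p>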
<h4 align="left">Deployment frameworks</h4>
<h5 align="left">3rd Party frameworks</h5>
<ul>
<li><a href="http://docs.fabfile.org/1.0a/">Fabric</a> &#8211; Fabric is a Python library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks. </li>
<li><a href="http://reductivelabs.com/trac/puppet/wiki">Puppet</a> &#8211; Put simply, Puppet is a system for automating system administration tasks. </li>
<li><a href="http://www.capify.org/index.php/Capistrano">Capistrano</a> &#8211; It is designed with repeatability in mind, letting you easily and reliably automate tasks that used to require login after login and a small army of custom shell scripts.&#160; ( also check out <a href="http://labs.peritor.com/webistrano">webistrano</a> ) </li>
<li><a href="http://trac.mcs.anl.gov/projects/bcfg2">Bcfg2</a> &#8211; Bcfg2 helps system administrators produce a consistent, reproducible, and verifiable description of their environment, and offers visualization and reporting tools to aid in day-to-day administrative tasks. </li>
<li><a href="http://wiki.opscode.com/display/chef/Home">Chef</a> &#8211; Chef is a systems integration framework, built to bring the benefits of configuration management to your entire infrastructure. </li>
<li><a href="http://code.google.com/p/slack/">Slack</a> &#8211; slack is an evolution from the usual &quot;put files in some central directory&quot; that is fairly common practice. </li>
<li><a href="http://github.com/samuel/kokki">Kokki</a> &#8211; System configuration management framework influenced by Chef </li>
</ul>
<h5>Custom or Mixed frameworks</h5>
<p>The tools listed above are not the only ones available. Simple bash/sh scripts, <a href="http://ant.apache.org/">ant</a> scripts, even tools like <a href="http://cruisecontrol.sourceforge.net/">cruisecontrol</a> and <a href="https://hudson.dev.java.net/">hudson</a> can be used for automated deployments. Here are some other interesting observations:</p>
<ul>
<li>Building huge monolithic applications is a thing of the past. Understanding how to break them up into self-contained, less inter-dependent components is the challenge. </li>
<li>If all of your servers get the same exact copy of application and configuration, then you don’t need to worry about configuration management. Just find a tool which deploys files fast. </li>
<li>If your deployments have a lot of inter-dependencies between components then choose a tool which gives you a visual interface of the deployment process if required. </li>
<li>Don’t be shy about writing wrapper scripts to automate more tasks. </li>
</ul>
<h5>Push/Pull/P2P Frameworks</h5>
<p><a href="http://agiletesting.blogspot.com/2010/03/automated-deployment-systems-push-vs.html">Grig</a> has an interesting post about <a href="http://agiletesting.blogspot.com/2010/03/automated-deployment-systems-push-vs.html">Push vs Pull</a> where he lists the pros and cons of both approaches. What he didn’t mention is P2P, which is the way <a href="http://torrentfreak.com/twitter-uses-bittorrent-for-server-deployment-100210/">Twitter is going</a> for its deployment. P2P combines advantages of both push and pull architectures but comes with its own set of challenges. I haven’t seen an open source tool using P2P yet, but I’m sure it’s not too far off.</p>
<h4 align="left">Outage windows</h4>
<p align="left">Though deployments are easier with long outage windows, those are hard to come by. In an ideal world one would have a parallel set of servers which one could cut over to with a flip of a switch. Unfortunately, if user data is involved this is almost impossible to do. The next best alternative is to do “rolling updates” in small batches of servers. This can be challenging because the deployment tool needs to make sure the app has really completed initialization before it moves on to the next set of servers. </p>
<p>This can be further complicated by the fact that at times there are version dependencies between different applications. In such cases there needs to be a robust infrastructure to facilitate discovery of the right version of applications.</p>
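<p>The rolling update loop described above can be sketched as follows. This is only an outline under my own assumptions: <code>deploy</code> and <code>is_healthy</code> are hypothetical callables you would supply (for example an SSH push and an HTTP health check), and the batch size and retry counts are arbitrary.</p>
<pre>
```python
import time


def batches(servers, size):
    """Split the server list into consecutive batches of at most `size`."""
    return [servers[i:i + size] for i in range(0, len(servers), size)]


def wait_until_healthy(server, is_healthy, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll the health check until it passes or the attempts run out."""
    for _ in range(attempts):
        if is_healthy(server):
            return True
        sleep(delay)
    return False


def rolling_update(servers, deploy, is_healthy, batch_size=2, **wait_options):
    """Deploy batch by batch; stop at the first batch that fails to initialize."""
    updated = []
    for batch in batches(servers, batch_size):
        for server in batch:
            deploy(server)
        for server in batch:
            if not wait_until_healthy(server, is_healthy, **wait_options):
                raise RuntimeError(f"{server} did not become healthy; halting rollout")
        updated.extend(batch)
    return updated
```
</pre>
<p>Halting on the first unhealthy batch keeps the remaining servers on the old version, so a bad release only ever takes out one small batch instead of the whole cluster.</p>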
<h4>Conclusion</h4>
<p>Deployment automation, in my personal opinion, is about the process, not the tool. If you have any interesting observations, ideas or comments, please feel free to write to me or leave a comment on this blog.</p>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/scaling-deployments/' rel='bookmark' title='Permanent Link: Scaling deployments'>Scaling deployments</a></li>
<li><a href='http://www.royans.net/arch/scalable-logging-using-syslog/' rel='bookmark' title='Permanent Link: Scalable logging using Syslog'>Scalable logging using Syslog</a></li>
<li><a href='http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/' rel='bookmark' title='Permanent Link: Service registry (ESB) for scalable web applications.'>Service registry (ESB) for scalable web applications.</a></li>
<li><a href='http://www.royans.net/arch/heroku-platform-for-scalable-applications/' rel='bookmark' title='Permanent Link: Heroku platform for scalable web applications'>Heroku platform for scalable web applications</a></li>
<li><a href='http://www.royans.net/arch/book-building-scalable-web-sites/' rel='bookmark' title='Permanent Link: Book: Building Scalable Web Sites'>Book: Building Scalable Web Sites</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=automated-faster-repeatable-scalable-deployments</feedburner:origLink></item>
		<item>
		<title>Disaster Recovery: Impressive RPO and RTO objectives set by Google Apps Operations</title>
		<link>http://feedproxy.google.com/~r/arch/~3/RRUbKVL3q40/</link>
		<comments>http://www.royans.net/arch/disaster-recovery-impressive-rpo-and-rto-objectives-set-by-google-apps-operations/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 15:44:58 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[disaster recovery]]></category>
		<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/disaster-recovery-impressive-rpo-and-rto-objectives-set-by-google-apps-operations/</guid>
		<description>Unless you are running a fly by night shop, DR (Disaster recovery) should be one of the top issues for your operations team. In a “Scalable architecture” world, the complexity of DR can become a disaster in itself.  
Yesterday Google announced that it finally has a DR plan for Google Apps. While this is nice, [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/google-app-engine-and-social-apps/' rel='bookmark' title='Permanent Link: Google App Engine and Social Apps'&gt;Google App Engine and Social Apps&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-stories-15th-sept-mysql-proxy-cluster-fire-system-facebook-apps-and-twitter/' rel='bookmark' title='Permanent Link: Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter'&gt;Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/google-app-engine-java-edition/' rel='bookmark' title='Permanent Link: Google app engine review (Java edition)'&gt;Google app engine review (Java edition)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/app-engine-datastore/' rel='bookmark' title='Permanent Link: Working with Google App engine&amp;rsquo;s datastore'&gt;Working with Google App engine&amp;rsquo;s datastore&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/google-patents-map-reduce-system-and-method-for-efficient-large-scale-data-processing/' rel='bookmark' title='Permanent Link: Google patents Map reduce &amp;ldquo;System and method for efficient large-scale data processing&amp;rdquo;'&gt;Google patents Map reduce &amp;ldquo;System and method for efficient large-scale data processing&amp;rdquo;&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p>Unless you are running a fly by night shop, DR (Disaster recovery) should be one of the top issues for your operations team. In a “Scalable architecture” world, the complexity of DR can become a disaster in itself.  <img style="display: inline; float: right; margin-left: 0px; margin-right: 0px;" src="http://jedwinmedia.ca/home/images/stories/googleapps.jpg" alt="" width="220" height="164" align="right" /></p>
<p>Yesterday Google announced that it finally has a <a href="http://googleenterprise.blogspot.com/2010/03/disaster-recovery-by-google.html">DR plan for Google Apps</a>. While this is nice, one should always take such messages with a pinch of salt until they prove that they can do it. Look at the <a href="https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf">DR plan for Google App engine</a>, which was also in place, yet the service still suffered an outage of more than 2 hours because of incomplete documentation, insufficient training and probably the lack of someone to make a quick, decisive call at the time of failure.</p>
<p>But back to Google Apps for now. These guys are planning for an <a href="http://en.wikipedia.org/wiki/Recovery_point_objective">RPO of 0 seconds</a>, which means multiple datacenters will be in a consistent state all the time.  And they want the <a href="http://en.wikipedia.org/wiki/Recovery_time_objective">RTO to be instant failover</a> as well! This is an incredible DR plan, and requires technical expertise in all 7 layers of the <a href="http://en.wikipedia.org/wiki/OSI_model">OSI Model</a> to achieve it.</p>
<blockquote><p>In larger businesses, companies will add a storage area network (SAN), which is a consolidated place for all storage. SANs are expensive, and even then, you&#8217;re out of luck if your data center goes down. So the largest enterprises will build an entirely new data center somewhere else, with another set of identical mail servers, another SAN and more people to staff them.</p>
<p>But if, heaven forbid, disaster strikes both your data centers, you&#8217;re toast (<a href="http://www.youtube.com/watch?v=R7nMGARCCwU">check out this customer&#8217;s experience with a fire</a>). So big companies will often build the second data center far away, in a different &#8216;threat zone&#8217;, which creates even more management headaches. Next they need to ensure the primary SAN talks to the backup SAN, so they have to implement robust bandwidth to handle terabytes of data flying back and forth without crippling their network. There are other backup options as well, but the story&#8217;s the same: as redundancy increases, cost and complexity multiplies.</p>
<p>How do you know if your disaster recovery solution is as strong as you need it to be? It&#8217;s usually measured in two ways: RPO (<a href="http://en.wikipedia.org/wiki/Recovery_point_objective">Recovery Point Objective</a>) and RTO (<a href="http://en.wikipedia.org/wiki/Recovery_time_objective">Recovery Time Objective</a>). RPO is how much data you&#8217;re willing to lose when things go wrong, and RTO is how long you&#8217;re willing to go without service after a disaster.</p>
<p>For a large enterprise running SANs, the RTO and RPO targets are an hour or less: the more you pay, the lower the numbers. That can mean a large company spending the big bucks is willing to lose all the email sent to them for up to an hour after the system goes down, and go without access to email for an hour as well. Enterprises without SANs may be literally trucking tapes back and forth between data centers, so as you can imagine their RPOs and RTOs can stretch into days. As for small businesses, often they just have to start over.</p>
<p>For Google Apps customers, our RPO design target is zero, and our RTO design target is instant failover. We do this through live or synchronous replication: every action you take in Gmail is simultaneously replicated in two data centers at once, so that if one data center fails, we nearly instantly transfer your data over to the other one that&#8217;s also been reflecting your actions.</p></blockquote>
<p>This is one of the most ambitious DR plans I’ve ever read about for such a huge customer base. They not only have to replicate all the user data into multiple data centers, they have to do it synchronously (or almost synchronously) across a huge distance (latency can slow down synchronous operations) without impacting users. And to top it all, they have to do a complete site failover if the primary datacenter goes down.</p>
<p>I am impressed, and wouldn’t mind learning more about how they do it.</p>
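<p>To make the idea of synchronous replication concrete, here is a toy sketch in Python. It is emphatically not how Google does it; the <code>DataCenter</code> class and both functions are hypothetical stand-ins I made up. The point is only that a write is acknowledged after both copies have it, which is what makes an RPO of zero possible.</p>
<pre>
```python
class DataCenter:
    """In-memory stand-in for one data center's storage (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.log = []

    def append(self, record):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.log.append(record)


def replicated_write(record, primary, secondary):
    """Acknowledge a write only after both data centers have stored it."""
    primary.append(record)
    try:
        secondary.append(record)
    except ConnectionError:
        primary.log.pop()  # undo so the two copies never diverge
        raise
    return True


def failover_reads(primary, secondary):
    """Serve from the primary, or from the secondary if the primary is down."""
    source = primary if primary.online else secondary
    return list(source.log)
```
</pre>
<p>Every acknowledged write exists in both places, so losing the primary loses nothing the user was told had succeeded. The price is that each write has to wait for the slower of the two sites, which is exactly where distance and latency start to hurt.</p>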


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/google-app-engine-and-social-apps/' rel='bookmark' title='Permanent Link: Google App Engine and Social Apps'>Google App Engine and Social Apps</a></li>
<li><a href='http://www.royans.net/arch/scalability-stories-15th-sept-mysql-proxy-cluster-fire-system-facebook-apps-and-twitter/' rel='bookmark' title='Permanent Link: Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter'>Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter</a></li>
<li><a href='http://www.royans.net/arch/google-app-engine-java-edition/' rel='bookmark' title='Permanent Link: Google app engine review (Java edition)'>Google app engine review (Java edition)</a></li>
<li><a href='http://www.royans.net/arch/app-engine-datastore/' rel='bookmark' title='Permanent Link: Working with Google App engine&rsquo;s datastore'>Working with Google App engine&rsquo;s datastore</a></li>
<li><a href='http://www.royans.net/arch/google-patents-map-reduce-system-and-method-for-efficient-large-scale-data-processing/' rel='bookmark' title='Permanent Link: Google patents Map reduce &ldquo;System and method for efficient large-scale data processing&rdquo;'>Google patents Map reduce &ldquo;System and method for efficient large-scale data processing&rdquo;</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/disaster-recovery-impressive-rpo-and-rto-objectives-set-by-google-apps-operations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/disaster-recovery-impressive-rpo-and-rto-objectives-set-by-google-apps-operations/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=disaster-recovery-impressive-rpo-and-rto-objectives-set-by-google-apps-operations</feedburner:origLink></item>
		<item>
		<title>The Reddit problem: Learning from mistakes</title>
		<link>http://feedproxy.google.com/~r/arch/~3/a81x9N56lLE/</link>
		<comments>http://www.royans.net/arch/reddit-learning-from-mistakes/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 05:30:29 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[consistent hashing]]></category>
		<category><![CDATA[failure]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[memcachedb]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/reddit-learning-from-mistakes/</guid>
		<description>Reddit has a very interesting post about what not to do when trying to build a scalable system. While the error is tragic, I think it’s an excellent design mistake to learn from.
Though the post lacked a detailed technical report, we might be able to recreate what happened. They mentioned they are using a MemcacheDB datastore, with [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-updates/' rel='bookmark' title='Permanent Link: Scaling updates for Feb 10, 2010'&gt;Scaling updates for Feb 10, 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/' rel='bookmark' title='Permanent Link: Brewers CAP Theorem on distributed systems'&gt;Brewers CAP Theorem on distributed systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/weekend-reading-material/' rel='bookmark' title='Permanent Link: Weekend reading material'&gt;Weekend reading material&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p><a href="http://blog.reddit.com/2010/03/and-fun-weekend-was-had-by-all.html">Reddit</a> has a very interesting post about what not to do when trying to build a scalable system. While the error is tragic, I think it’s an excellent design mistake to learn from.</p>
<p>Though the post lacked a detailed technical report, we might be able to recreate what happened. They mentioned they are using a <a href="http://memcachedb.org/">MemcacheDB</a> datastore, with 2GB RAM per node, to keep some data which they need very often.</p>
<p>Don’t confuse <a href="http://memcachedb.org/">MemcacheDB</a> with <a href="http://memcached.org/">memcached</a>. While memcached is a distributed cache engine, MemcacheDB is actually a persistent datastore. And because they both use the same protocol, applications often use memcached libraries to connect to a MemcacheDB datastore. <a href="http://www.royans.net/arch/wp-content/uploads/2010/03/redditheader.png"><img style="display: inline; float: right; margin-left: 0px; margin-right: 0px; border: 0px;" title="redditheader" src="http://www.royans.net/arch/wp-content/uploads/2010/03/redditheader_thumb.png" border="0" alt="redditheader" width="141" height="63" align="right" /></a></p>
<p>Both memcached and MemcacheDB rely on the clients to figure out how the keys are distributed across multiple nodes. Reddit chose <a href="http://en.wikipedia.org/wiki/MD5">MD5</a> to generate the keys for their key/value pairs. The algorithm Reddit used to pick the node for a key may well have depended on the number of nodes in the system. One popular way to do this is the “<a href="http://en.wikipedia.org/wiki/Modular_arithmetic">modulo</a>” function: in a three-node cluster, key “k” would be stored on node “n” where “n = k modulo 3”. [ If k=101, then n=2 ]</p>
<p><a href="http://www.royans.net/arch/wp-content/uploads/2010/03/image.png"><img style="display: block; float: none; margin-left: auto; margin-right: auto; border-width: 0px;" title="image" src="http://www.royans.net/arch/wp-content/uploads/2010/03/image_thumb.png" border="0" alt="image" width="389" height="101" /></a></p>
<p>Though MemcacheDB uses <a href="http://en.wikipedia.org/wiki/Berkeley_DB">BDB</a> to persist data, it seems like they relied heavily on keeping all the data in RAM. At some point they might have hit the upper limit on what could be cached in RAM, which caused disk I/O and, in turn, slower response times. In a scalable architecture one should have been able to add new nodes and let the system scale.</p>
<p>Unfortunately, though the modulo scheme works beautifully during regular operation, it fails as soon as you add or remove a node (that is, when the number of nodes, and therefore the modulus, changes). At that point you can’t guarantee that data previously stored on a given node will still map to the same node.</p>
<p>And while this algorithm may still be OK for “memcached” cache clusters, where a misplaced key is just a cache miss, it’s really bad for a persistent store like MemcacheDB, which needs something like “<a href="http://weblogs.java.net/blog/2007/11/27/consistent-hashing">consistent hashing</a>”.</p>
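<p>To make the difference concrete, here is a rough sketch comparing the two placement schemes when a cluster grows from three nodes to four. This is only an illustration: the node names, the <code>HashRing</code> class and the choice of 100 virtual nodes are my own assumptions, not anything Reddit or MemcacheDB actually uses.</p>

```python
import bisect
import hashlib


def key_hash(key):
    # MD5 is used here only to spread keys evenly over a large integer range.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def modulo_node(key, node_count):
    # Naive placement: the node index depends directly on the cluster size.
    return key_hash(key) % node_count


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=100):
        # Each node owns several points on the ring to even out its share.
        self._ring = sorted(
            (key_hash("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, key_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(keys, before, after):
    # Share of keys whose owning node differs between the two placements.
    moved = sum(1 for k in keys if before(k) != after(k))
    return moved / len(keys)


keys = ["key%d" % i for i in range(10000)]

modulo_moved = moved_fraction(
    keys, lambda k: modulo_node(k, 3), lambda k: modulo_node(k, 4))

ring3 = HashRing(["node0", "node1", "node2"])
ring4 = HashRing(["node0", "node1", "node2", "node3"])
ring_moved = moved_fraction(keys, ring3.node_for, ring4.node_for)

print("modulo: %.0f%% of keys moved" % (modulo_moved * 100))
print("ring:   %.0f%% of keys moved" % (ring_moved * 100))
```

<p>With evenly spread hashes one would expect the modulo scheme to relocate roughly three quarters of the keys when going from three to four nodes, while the ring should relocate only about the share taken over by the new node, and every relocated key should land on that new node.</p>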
<p><img src="http://2.bp.blogspot.com/_swahP4sgx0k/S4v_wE8cr5I/AAAAAAAAACk/CmV9zw4O8O4/s1600/RedditArchDiagramWhiteBG.png" alt="[RedditArchDiagramWhiteBG.png]" /></p>
<p>Reddit today announced that they have increased RAM on these MemcacheDB servers from 2GB to 6GB, which allows 94% of their DB to be kept in memory. They have realized their mistake (they probably figured this out a long time ago) and are thinking about how to fix it. The simplest solution, adding a few nodes, requires re-hashing their keys, which would take days according to their estimate. And of course just adding nodes without some kind of “consistent hashing” is still not a scalable solution.</p>
<p>I personally learnt two things:</p>
<ul>
<li>Don’t mix up MemcacheDB and memcached. They are not designed to solve the same problem.</li>
<li>Don’t simply replace memcached with MemcacheDB without thinking twice.</li>
</ul>
<p>There are many different products out there today which do a better job at scaling, so I won’t be surprised if they abandon MemcacheDB completely as well.</p>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/scaling-updates/' rel='bookmark' title='Permanent Link: Scaling updates for Feb 10, 2010'>Scaling updates for Feb 10, 2010</a></li>
<li><a href='http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/' rel='bookmark' title='Permanent Link: Brewers CAP Theorem on distributed systems'>Brewers CAP Theorem on distributed systems</a></li>
<li><a href='http://www.royans.net/arch/weekend-reading-material/' rel='bookmark' title='Permanent Link: Weekend reading material'>Weekend reading material</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/reddit-learning-from-mistakes/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/reddit-learning-from-mistakes/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=reddit-learning-from-mistakes</feedburner:origLink></item>
		<item>
		<title>Scalability links for Feb 28th 2010</title>
		<link>http://feedproxy.google.com/~r/arch/~3/mcjZIJ5Uu_M/</link>
		<comments>http://www.royans.net/arch/scalability-links-for-feb-28th-2010/#comments</comments>
		<pubDate>Sun, 28 Feb 2010 08:31:04 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[mongodb]]></category>
		<category><![CDATA[redis]]></category>
		<category><![CDATA[updates]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/scalability-links-for-feb-28th-2010/</guid>
		<description>State of current NoSQL databases : A very detailed post about many NoSQL solutions. A lot of work went into this one.
Truth about joins: Google App Engine datastore’s limitation of not allowing joins might soon be a thing of the past. Simple joins may now be possible on GAE if you are using Java. It’s [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-links-for-march-13th-2010/' rel='bookmark' title='Permanent Link: Scalability links for March 13th 2010'&gt;Scalability links for March 13th 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-updates-for-feb-18-2010/' rel='bookmark' title='Permanent Link: Scalability updates for Feb 18, 2010'&gt;Scalability updates for Feb 18, 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-updates-for-jan-26th-2010/' rel='bookmark' title='Permanent Link: Scalability Updates for Jan 26th 2010'&gt;Scalability Updates for Jan 26th 2010&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/links-on-scalability-performance-and-problems/' rel='bookmark' title='Permanent Link: Links on scalability, performance and problems'&gt;Links on scalability, performance and problems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-updates/' rel='bookmark' title='Permanent Link: Scaling updates for Feb 10, 2010'&gt;Scaling updates for Feb 10, 2010&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<ul>
<li><a href="http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html">State of current NoSQL databases</a> : A very detailed post about many NoSQL solutions. A lot of work went into this one.</li>
<li><a href="http://gae-java-persistence.blogspot.com/2010/02/truth-about-joins.html">Truth about joins</a>: Google App Engine datastore’s limitation of not allowing joins might soon be a thing of the past. Simple joins may now be possible on GAE if you are using Java. It’s still in beta, but the fact that this is being tested is very encouraging. </li>
<li><a href="http://www.rackspacecloud.com/blog/2010/02/25/should-you-switch-to-nosql-too/">Should you switch to NoSQL too</a> ? </li>
<li><a href="http://www.paperplanes.de/2010/2/25/notes_on_mongodb.html">Notes on MongoDB</a>: A very nice summary of MongoDB. </li>
<li><a href="http://nosql.mypopescu.com/post/414014237/redis-usecase-find-out-who-is-online">Redis real-life examples</a> [<a href="http://www.paperplanes.de/2010/2/16/a_collection_of_redis_use_cases.html">More here</a>] : I’ve been seeing a lot of discussions around Redis lately. Here are some use cases I’ve gathered from a couple of posts. I haven’t yet seen it being used by a large organization.
<ul>
<li><a href="http://www.lukemelia.com/blog/archives/2010/01/17/redis-in-practice-whos-online/">Who’s Online</a> </li>
<li><a href="http://techno-weenie.net/2010/2/3/where-s-waldo-track-user-locations-with-node-js-and-redis">Track user locations with Node.js and Redis</a> </li>
<li><a href="http://nosql.mypopescu.com/post/408913109/presentation-redis-remote-dictionary-server-by-ezra">Remote Dictionary server</a> </li>
<li><a href="http://code.google.com/p/redis/wiki/TwitterAlikeExample">Twitter clone using Redis</a> </li>
<li><a href="http://www.dorkalev.com/2010/02/sikwamic-simple-key-value-with-comet.html">Sikwamic: Simple Key-value with comet</a> </li>
</ul>
</li>
<li><a href="http://blog.griddynamics.com/2010/02/4-levels-of-replication-technologies.html">Replication technologies</a> </li>
<li><a href="http://cd34.com/blog/webserver/using-varnish-to-assist-with-ab-testing/">Using Varnish to assist with AB testing</a> – Testing new, half-baked features with external customers gets more difficult as products become more mature and stable. Tools like “varnish” could be used to test different pages/features. </li>
<li><a href="http://www.google.com/reader/view/#stream/user%2F09020763963835953964%2Fstate%2Fcom.google%2Fstarred">Redundant Array of Independent Datacenters</a> </li>
<li><a href="http://gigaom.com/2010/02/22/twitter-reports-it-has-grown-to-50m-daily-tweets/">Twitter at 50M Daily Tweets</a>&#160; </li>
<li><a href="http://www.readwriteweb.com/cloud/2010/02/hitler-rants-about-cloud-secur.php">Funny: Hitler rants about cloud security and updates his Facebook page</a> </li>
</ul>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/scalability-links-for-march-13th-2010/' rel='bookmark' title='Permanent Link: Scalability links for March 13th 2010'>Scalability links for March 13th 2010</a></li>
<li><a href='http://www.royans.net/arch/scaling-updates-for-feb-18-2010/' rel='bookmark' title='Permanent Link: Scalability updates for Feb 18, 2010'>Scalability updates for Feb 18, 2010</a></li>
<li><a href='http://www.royans.net/arch/scalability-updates-for-jan-26th-2010/' rel='bookmark' title='Permanent Link: Scalability Updates for Jan 26th 2010'>Scalability Updates for Jan 26th 2010</a></li>
<li><a href='http://www.royans.net/arch/links-on-scalability-performance-and-problems/' rel='bookmark' title='Permanent Link: Links on scalability, performance and problems'>Links on scalability, performance and problems</a></li>
<li><a href='http://www.royans.net/arch/scaling-updates/' rel='bookmark' title='Permanent Link: Scaling updates for Feb 10, 2010'>Scaling updates for Feb 10, 2010</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/scalability-links-for-feb-28th-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/scalability-links-for-feb-28th-2010/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=scalability-links-for-feb-28th-2010</feedburner:origLink></item>
		<item>
		<title>Cassandra as a communication medium – A service Registry and Discovery tool</title>
		<link>http://feedproxy.google.com/~r/arch/~3/xK6PgZjYVQY/</link>
		<comments>http://www.royans.net/arch/cassandra-as-a-communication-medium-service-registrydiscovery/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 08:51:25 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[CAP]]></category>
		<category><![CDATA[cassandra]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[eventually consistent]]></category>
		<category><![CDATA[scalable]]></category>
		<category><![CDATA[discovery]]></category>
		<category><![CDATA[registry]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/cassandra-as-a-communication-medium-service-registrydiscovery/</guid>
		<description>A few weeks ago, while I was mulling over what kind of service registry/discovery system to use for a scalable application deployment platform, I realized that for mid-size organizations with a complex set of services, building one from scratch may be the only option.
I also found out that many AWS/EC2 customers have already been using S3 and [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/investigating-cassandra-to-build-service-registrydiscovery-service/' rel='bookmark' title='Permanent Link: Cassandra for service registry/discovery service'&gt;Cassandra for service registry/discovery service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/' rel='bookmark' title='Permanent Link: Service registry (ESB) for scalable web applications.'&gt;Service registry (ESB) for scalable web applications.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/cassandra-inverted-index/' rel='bookmark' title='Permanent Link: Cassandra : inverted index'&gt;Cassandra : inverted index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/amazon-launches-relational-database-as-a-service/' rel='bookmark' title='Permanent Link: Amazon launches Relational Database as a service'&gt;Amazon launches Relational Database as a service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/' rel='bookmark' title='Permanent Link: Brewers CAP Theorem on distributed systems'&gt;Brewers CAP Theorem on distributed systems&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p>A few weeks ago, while I was mulling over what kind of service <a href="http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/"><em>registry/discovery system</em></a> to use for a scalable application deployment platform, I realized that for mid-size organizations with a complex set of services, building one from scratch may be the only option.</p>
<p>I also found out that many AWS/EC2 customers have already been using S3 and SimpleDB to&#160; publish/discover services. That discussion eventually led me to investigate <a href="http://www.royans.net/arch/investigating-cassandra-to-build-service-registrydiscovery-service/">Cassandra as the service registry datastore</a> in an enterprise network. </p>
<p>Here are some of the observations I made as I played with <a href="http://www.royans.net/arch/wp-content/uploads/2010/02/column_oriented.jpg"><img style="border-bottom: 0px; border-left: 0px; display: inline; float: right; border-top: 0px; border-right: 0px" title="column_oriented" border="0" alt="column_oriented" src="http://www.royans.net/arch/wp-content/uploads/2010/02/column_oriented_thumb.jpg" width="260" height="150" /></a>Cassandra for this purpose.&#160; I welcome feedback from readers if you think I’m doing something wrong or if you think I can improve the design further.</p>
<ul>
<li>The biggest issue I noticed with Cassandra was the <a href="http://www.royans.net/arch/cassandra-inverted-index/"><em>absence of an inverted index</em></a>, which can be worked around as I have <a href="http://www.royans.net/arch/cassandra-inverted-index/"><em>blogged here</em></a>. I later realized there is also something called <a href="http://github.com/tjake/Lucandra">Lucandra</a>, which I need to look at some time. </li>
<li>The keyspace structure I used was very simple… ( I skipped some configuration lines to keep it simple ) </li>
</ul>
<blockquote><p>&lt;Keyspace Name=&quot;devkeyspace&quot;&gt;      <br />&#160;&#160;&#160; &lt;ColumnFamily&#160; CompareWith=&quot;UTF8Type&quot; Name=&quot;forward&quot;&#160; /&gt;       <br />&#160;&#160;&#160; &lt;ColumnFamily&#160; CompareWith=&quot;UTF8Type&quot; Name=&quot;reverse&quot;&#160; /&gt;       <br />&lt;/Keyspace&gt; <a href="http://www.royans.net/arch/cassandra-inverted-index/"><img src="http://www.royans.net/arch/wp-content/uploads/2010/02/image.png" width="447" height="94" /></a></p>
</blockquote>
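<p>The idea behind the two column families can be sketched without Cassandra at all. The snippet below is a minimal in-memory stand-in, assuming “forward” maps a service id to its attributes and “reverse” maps an “attribute=value” string back to service ids. The <code>Registry</code> class, the key format and the sample service names are hypothetical and only meant to show how the application keeps both sides in step.</p>

```python
from collections import defaultdict


class Registry:
    """In-memory stand-in for the "forward" and "reverse" column families."""

    def __init__(self):
        # forward: row key is a service id, columns are attribute names.
        self.forward = defaultdict(dict)
        # reverse: row key is "attribute=value", columns are service ids.
        self.reverse = defaultdict(set)

    def register(self, service_id, attributes):
        # Drop stale reverse entries first so the two sides never disagree.
        self.unregister(service_id)
        for name, value in attributes.items():
            self.forward[service_id][name] = value
            self.reverse["%s=%s" % (name, value)].add(service_id)

    def unregister(self, service_id):
        for name, value in self.forward.pop(service_id, {}).items():
            self.reverse["%s=%s" % (name, value)].discard(service_id)

    def lookup(self, service_id):
        # Forward read: everything known about one service.
        return dict(self.forward.get(service_id, {}))

    def find(self, name, value):
        # Reverse read: which services carry this attribute value.
        return sorted(self.reverse.get("%s=%s" % (name, value), set()))


registry = Registry()
registry.register("auth-1", {"type": "auth", "dc": "east"})
registry.register("auth-2", {"type": "auth", "dc": "west"})
print(registry.find("type", "auth"))
```

<p>In a real deployment the two writes go to separate rows and are not atomic, so the application has to tolerate a reverse entry that briefly points at a missing or changed service.</p>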
<ul>
<li>Using an “OrderPreservingPartitioner” seemed important to do “range scans”.&#160; The order-preserving partitioner keeps objects with similar-looking keys together to allow bulk reads and writes. By default Cassandra distributes objects across the cluster by hashing their keys, which spreads load evenly but means neighbouring keys do not end up next to each other. </li>
<li>I eventually plan to use this application across two datacenters. The best way to mirror data across datacenters in Cassandra is by using “RackAwareStrategy”. If you select this option, it tells Cassandra to try to pick replicas of each token from different datacenters/racks. The default algorithm uses IP addresses to determine if two nodes are part of the same rack/datacenter, but there are other interesting ways to do it as well. </li>
<li>Some of the APIs changed significantly between the versions I was playing with. Cassandra developers will remind you that this is expected in a product which is still at version 0.5. What amazes me, however, is the fact that <a href="http://www.facebook.com/note.php?note_id=24413138919">Facebook</a>, <a href="http://about.digg.com/blog/looking-future-cassandra">Digg</a> and now <a href="http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king">Twitter</a> have been using this product in production without bringing down everything. </li>
<li>I was eventually able to build a thin Java webapp to front-end Cassandra, which provided the REST/JSON interface for the registry/discovery service. This is also the app which managed the inverted indexes.
<ul>
<li>Direct Cassandra access from remote services was disabled for security/stability reasons. </li>
<li>The app used DNS to loadbalance queries across multiple servers. </li>
</ul>
</li>
<li>My initial performance tests on this cluster went miserably because I forgot that all of my requests were hitting the same node. The right way to test Cassandra’s capacity is by loadbalancing requests across all Cassandra nodes.
<ul>
<li>I also realized that, by default, the logging mode was set to “DEBUG”, which is very verbose. Turning that down seemed to speed up response times as well. </li>
</ul>
</li>
<li>Playing with different consistency levels for reading and writing was also an interesting experience, especially when I started killing nodes just to see the app break. This is what <a href="http://www.royans.net/arch/tag/cap/">tweaking CAP</a> is all about. </li>
<li>Due to an interesting problem related to “eventual consistency”, <a href="http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html">Cassandra doesn’t immediately delete data which was marked for deletion or was intentionally changed</a>. In the default configuration that data is kept around for 10 days before it’s completely removed from the system. </li>
<li>Some documentation on the core <a href="http://wiki.apache.org/cassandra/Operations">operational aspects of Cassandra</a> exists, but it would be nice if there were more. </li>
</ul>
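<p>For the consistency-level experiments above, the usual rule of thumb is that a read is guaranteed to see the latest write only when the number of replicas read plus the number of replicas written exceeds the replication factor. Here is a small sketch of that arithmetic, assuming a replication factor of 3 and the ONE, QUORUM and ALL levels. The helper names are mine, and the sketch ignores details such as hinted handoff and read repair.</p>

```python
def quorum(replicas):
    # Smallest majority of the replica set.
    return replicas // 2 + 1


def read_sees_latest_write(replicas, write_acks, read_acks):
    # The read set and the write set must share at least one replica,
    # which is guaranteed only when R + W is larger than N.
    return read_acks + write_acks > replicas


def tolerated_failures(replicas, acks):
    # An operation needing `acks` replies still succeeds with this many
    # replicas down.
    return replicas - acks


N = 3  # assumed replication factor
levels = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

for write_name, w in levels.items():
    for read_name, r in levels.items():
        print("write=%-6s read=%-6s overlap=%-5s survives %d down" % (
            write_name, read_name,
            read_sees_latest_write(N, w, r),
            min(tolerated_failures(N, w), tolerated_failures(N, r))))
```

<p>The table this prints makes the trade-off visible: writing and reading at QUORUM keeps the overlap while still surviving one dead replica, whereas ONE/ONE survives two dead replicas but gives up the overlap guarantee.</p>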
<p>Cassandra was designed as a scalable, highly available datastore. But because of its self-healing and “RackAware” features, it can become an interesting communication medium as well.</p>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/investigating-cassandra-to-build-service-registrydiscovery-service/' rel='bookmark' title='Permanent Link: Cassandra for service registry/discovery service'>Cassandra for service registry/discovery service</a></li>
<li><a href='http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/' rel='bookmark' title='Permanent Link: Service registry (ESB) for scalable web applications.'>Service registry (ESB) for scalable web applications.</a></li>
<li><a href='http://www.royans.net/arch/cassandra-inverted-index/' rel='bookmark' title='Permanent Link: Cassandra : inverted index'>Cassandra : inverted index</a></li>
<li><a href='http://www.royans.net/arch/amazon-launches-relational-database-as-a-service/' rel='bookmark' title='Permanent Link: Amazon launches Relational Database as a service'>Amazon launches Relational Database as a service</a></li>
<li><a href='http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/' rel='bookmark' title='Permanent Link: Brewers CAP Theorem on distributed systems'>Brewers CAP Theorem on distributed systems</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/cassandra-as-a-communication-medium-service-registrydiscovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/cassandra-as-a-communication-medium-service-registrydiscovery/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=cassandra-as-a-communication-medium-service-registrydiscovery</feedburner:origLink></item>
		<item>
		<title>Talk on “database scalability”</title>
		<link>http://feedproxy.google.com/~r/arch/~3/KMdNDXqhDkE/</link>
		<comments>http://www.royans.net/arch/talk-on-database-scalability/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 16:12:45 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[cassandra]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[talk]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/talk-on-database-scalability/</guid>
		<description>This is a very interesting talk by Jonathan Ellis on database scalability. He designed and implemented multi-petabyte storage for Mozy and is currently the project chair for Apache Cassandra.

What every developer should know about database scalability, PyCon 2010
View more presentations from jbellis.


Scalability is not improving latency, but increasing throughput 
But overall performance shouldn’t degrade 
Throw [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/cloud-architecture-notes-from-an-amazon-talk/' rel='bookmark' title='Permanent Link: Cloud architecture: Notes from an Amazon talk'&gt;Cloud architecture: Notes from an Amazon talk&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/amazon-launches-relational-database-as-a-service/' rel='bookmark' title='Permanent Link: Amazon launches Relational Database as a service'&gt;Amazon launches Relational Database as a service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/what-is-scalability/' rel='bookmark' title='Permanent Link: What is scalability ?'&gt;What is scalability ?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-killers-the-art-of-scalability/' rel='bookmark' title='Permanent Link: Scalability Killers (The art of scalability)'&gt;Scalability Killers (The art of scalability)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-links-for-march-13th-2010/' rel='bookmark' title='Permanent Link: Scalability links for March 13th 2010'&gt;Scalability links for March 13th 2010&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p>This is a very interesting talk by <a href="http://spyced.blogspot.com/">Jonathan Ellis</a> on database scalability. He designed and implemented multi-petabyte storage for <a href="http://mozy.com/">Mozy</a> and is currently the project chair for <a href="http://en.wikipedia.org/wiki/Cassandra">Apache Cassandra</a>.</p>
<p><embed src="http://blip.tv/play/g4VigciebAI%2Em4v" type="application/x-shockwave-flash" width="480" height="350" allowscriptaccess="always" allowfullscreen="true"></embed></p>
<div style="width: 425px" id="__ss_3228070"><strong style="margin: 12px 0px 4px; display: block"><a title="What every developer should know about database scalability, PyCon 2010" href="http://www.slideshare.net/jbellis/what-every-developer-should-know-about-database-scalability-pycon-2010">What every developer should know about database scalability, PyCon 2010</a></strong><object width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=databasescalability2010-100219145240-phpapp01&amp;stripped_title=what-every-developer-should-know-about-database-scalability-pycon-2010" /><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=databasescalability2010-100219145240-phpapp01&amp;stripped_title=what-every-developer-should-know-about-database-scalability-pycon-2010" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="padding-bottom: 12px; padding-left: 0px; padding-right: 0px; padding-top: 5px">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/jbellis">jbellis</a>.</div>
</p></div>
<ul>
<li>Scalability is not improving latency, but increasing throughput </li>
<li>But overall performance shouldn’t degrade </li>
<li>Throw hardware, not people at the problem </li>
<li>Traditional databases use b-tree indexes, but these require the entire index to be in memory in one place. </li>
<li>Easy bandaid #1 – SSD storage is better for b-tree indexes which need to hit disk </li>
<li>Easy bandaid #2 – Buy a faster server every 2 years, as long as your userbase doesn’t grow faster than Moore’s law </li>
<li>Easy bandaid #3 – Use (distributed) caching to handle hotspots </li>
<li>Memcache server failures can change which server each hashed key is kept on </li>
<li>Consistent hashing solves the problem by mapping keys to tokens. The tokens can move around as servers are added or removed, and apps can still figure out which keys are where. </li>
</ul>
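<p>To make the consistent hashing point concrete, here is a minimal Python sketch. It is not taken from the talk: the server names, the use of MD5 and the <code>tokens_per_server</code> value are all assumptions picked for illustration. Each server owns a number of tokens on a ring, and a key belongs to the first token found clockwise from the key's own position.</p>

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; each server owns several tokens."""

    def __init__(self, servers, tokens_per_server=100):
        self.tokens_per_server = tokens_per_server
        self.ring = []  # sorted list of (token, server) pairs
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.tokens_per_server):
            bisect.insort(self.ring, (token("%s#%d" % (server, i)), server))

    def remove(self, server):
        self.ring = [entry for entry in self.ring if entry[1] != server]

    def lookup(self, key):
        # First token clockwise from the key's position, wrapping around.
        index = bisect.bisect(self.ring, (token(key), ""))
        return self.ring[index % len(self.ring)][1]


# Hypothetical cache servers and keys.
keys = ["user:%d" % n for n in range(1000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {key: ring.lookup(key) for key in keys}

ring.remove("cache-b")  # simulate a memcache server failure
after = {key: ring.lookup(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
# Only keys that lived on the removed server change owner.
print(len(moved), "of", len(keys), "keys moved")
```

<p>With plain <code>hash(key) % number_of_servers</code>, losing one server would remap most keys. In the sketch above, only the keys owned by the failed server move.</p>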


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/cloud-architecture-notes-from-an-amazon-talk/' rel='bookmark' title='Permanent Link: Cloud architecture: Notes from an Amazon talk'>Cloud architecture: Notes from an Amazon talk</a></li>
<li><a href='http://www.royans.net/arch/amazon-launches-relational-database-as-a-service/' rel='bookmark' title='Permanent Link: Amazon launches Relational Database as a service'>Amazon launches Relational Database as a service</a></li>
<li><a href='http://www.royans.net/arch/what-is-scalability/' rel='bookmark' title='Permanent Link: What is scalability ?'>What is scalability ?</a></li>
<li><a href='http://www.royans.net/arch/scalability-killers-the-art-of-scalability/' rel='bookmark' title='Permanent Link: Scalability Killers (The art of scalability)'>Scalability Killers (The art of scalability)</a></li>
<li><a href='http://www.royans.net/arch/scalability-links-for-march-13th-2010/' rel='bookmark' title='Permanent Link: Scalability links for March 13th 2010'>Scalability links for March 13th 2010</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/talk-on-database-scalability/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/talk-on-database-scalability/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=talk-on-database-scalability</feedburner:origLink></item>
		<item>
		<title>Scalable logging using Syslog</title>
		<link>http://feedproxy.google.com/~r/arch/~3/qjoSZS4HYUk/</link>
		<comments>http://www.royans.net/arch/scalable-logging-using-syslog/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 05:18:57 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[logging]]></category>
		<category><![CDATA[scalable]]></category>
		<category><![CDATA[syslog]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/scalable-logging-using-syslog/</guid>
		<description>Syslog is a commonly used transport mechanism for system logs. But people sometimes forget it could be used for a lot of other purposes as well. 
Take, for example, the interesting challenge of aggregating web server logs from 100 different servers into one server and then figuring out how to merge them. If you have [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalable-products-kfs-released/' rel='bookmark' title='Permanent Link: Scalable products: KFS released'&gt;Scalable products: KFS released&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/' rel='bookmark' title='Permanent Link: Automated, faster, repeatable, scalable deployments'&gt;Automated, faster, repeatable, scalable deployments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/' rel='bookmark' title='Permanent Link: Service registry (ESB) for scalable web applications.'&gt;Service registry (ESB) for scalable web applications.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/heroku-platform-for-scalable-applications/' rel='bookmark' title='Permanent Link: Heroku platform for scalable web applications'&gt;Heroku platform for scalable web applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/book-building-scalable-web-sites/' rel='bookmark' title='Permanent Link: Book: Building Scalable Web Sites'&gt;Book: Building Scalable Web Sites&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p><a href="http://en.wikipedia.org/wiki/Syslog">Syslog</a> is a commonly used transport mechanism for system logs. But people sometimes forget it could be used for a lot of other purposes as well. </p>
<p>Take, for example, the interesting challenge of aggregating web server logs from 100 different servers into one server and <a href="http://www.royans.net/arch/wp-content/uploads/2010/02/syslog.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; float: right; border-top: 0px; border-right: 0px" title="syslog" border="0" alt="syslog" src="http://www.royans.net/arch/wp-content/uploads/2010/02/syslog_thumb.png" width="240" height="88" /></a>then figuring out how to merge them. If you have built your own tool to do this, you will have figured out by now how expensive it is to poll all the servers and how out-of-date these logs can be by the time you process them. If you are not inserting them into some kind of datastore which sorts the rows by timestamp, you also have to take up the challenge of building a merge-sort script.</p>
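<p>For a sense of what such a merge-sort script involves, here is a minimal Python sketch. The log lines and server names are made up, and it assumes every line starts with an ISO-8601 timestamp and that each per-server log is already sorted.</p>

```python
import heapq

# Hypothetical per-server logs; each is already sorted by its ISO-8601 timestamp.
web1 = [
    "2010-02-26T05:00:01 web1 GET /index.html 200",
    "2010-02-26T05:00:07 web1 GET /about.html 200",
]
web2 = [
    "2010-02-26T05:00:03 web2 GET /index.html 200",
    "2010-02-26T05:00:05 web2 POST /login 302",
]


def timestamp(line):
    """The leading field is the timestamp; ISO-8601 strings sort chronologically."""
    return line.split(" ", 1)[0]


# heapq.merge streams its inputs lazily, so it also works on open file objects.
merged = list(heapq.merge(web1, web2, key=timestamp))
for line in merged:
    print(line)
```

<p>This only works after the files have been collected, so the polling cost and the staleness problem remain.</p>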
<p>There is nothing which stops applications from using syslog as well. If your apps are in Java, you should try out the Syslog appender for log4j [<a href="http://threebit.net/mail-archive/tomcat-users/msg00219.html">Ref 1</a>] [<a href="http://www.5341.com/msg/229868.html">Ref 2</a>]. Not only do you get central logging, you also get to see a real-time “tail -f” of events as they happen in a merged file. If there are issues anywhere in your network, you have just one place to look. If your logging volume is high, you would have to use other tools (or build your own) to do log analysis.</p>
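<p>The same idea is available outside Java. As a rough Python analogue of the log4j appender (not the log4j configuration itself), the standard library's <code>SysLogHandler</code> sends log records to a syslog server over UDP. The address, the facility and the "webapp" name below are placeholders I picked for the sketch.</p>

```python
import logging
import logging.handlers

# Assumption: a syslog daemon listens on UDP port 514 at this address.
# 127.0.0.1 is a placeholder for your central syslog server.
SYSLOG_ADDRESS = ("127.0.0.1", 514)

handler = logging.handlers.SysLogHandler(
    address=SYSLOG_ADDRESS,
    facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
)
# The "webapp" tag is a made-up application name used to filter on the server.
handler.setFormatter(logging.Formatter("webapp: %(levelname)s %(message)s"))

log = logging.getLogger("webapp")
log.setLevel(logging.INFO)  # keep "debug" out of production
log.addHandler(handler)

log.info("user login ok")
log.debug("this line is filtered out before it reaches the network")
```

<p>Because the transport is UDP, the calls do not block on the server, and a message is silently lost if nothing is listening.</p>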
<p>Here are some things you might have to think about if you plan to use syslog for your environment.</p>
<ol>
<li>Set up different syslog servers for each of your datacenters, using split DNS or different hostnames.</li>
<li>Try not to send logs across WAN links</li>
<li>Rotate logs on a nightly basis, or depending on the log volume</li>
<li>Reduce amount of logging (don’t do “debug” in production for example) </li>
<li>Write tools to detect changes in logging volume in dev/qa environments. If you follow good logging practice, you should be able to identify the components responsible for an increase very quickly.</li>
<li>Identify log patterns which could be causes for concern and set up some kind of alerting using your regular monitoring service (nagios for example). Don’t be afraid to use 3rd party tools which do this very well.</li>
<li>Syslog over UDP is non-blocking, but the syslog server can be overloaded if logging volume is not controlled. The most expensive part of logging is disk i/o. If you notice high i/o</li>
<li>UDP doesn’t guarantee that every log event will make it to the syslog server. Find out if that level of uncertainty in logging is ok for your environment. </li>
</ol>
<p>Other interesting observations</p>
<ol>
<li>The number of changes required to make a Java app which already uses log4j log to a syslog server is trivial</li>
<li>Logging to local files can be disabled, which means you don’t have to worry about disk storage on each server.</li>
<li>If you are using or want to use tools like <a href="http://www.splunk.com/">splunk</a> or hadoop/hbase for log analysis, syslog is probably the easiest way to get there. </li>
<li>You can always loadbalance syslog servers by using DNS loadbalancing.</li>
<li>Apache webservers can’t do syslog out of the box, but you can still <a href="http://www.oreillynet.com/pub/a/sysadmin/2006/10/12/httpd-syslog.html">make it happen</a> </li>
<li>I personally like <a href="http://haproxy.1wt.eu/">haproxy</a> more and it does do syslog out of the box.</li>
<li>If you want to log events from startup/shutdown scripts, you can use the “logger” *nix command to send events to the syslog server. </li>
</ol>
<p>How are logs aggregated in your environment?</p>
<p>References</p>
<ul>
<li><a href="http://www.johnandcailin.com/blog/john/setting-syslog-distributed-application-logging">Setting syslog distributed application logging</a></li>
<li><a href="http://www.cyberciti.biz/tips/howto-linux-unix-write-to-syslog.html">Write message to syslog</a></li>
<li><a href="http://www.javaworld.com/javaworld/jw-04-2001/jw-0406-syslog.html">Robust event logging with syslog</a></li>
<li><a href="http://php.net/manual/en/function.syslog.php">PHP syslog function</a></li>
<li><a href="http://www.oreillynet.com/pub/a/sysadmin/2006/10/12/httpd-syslog.html">Sending apache logs to syslog</a></li>
</ul>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/scalable-products-kfs-released/' rel='bookmark' title='Permanent Link: Scalable products: KFS released'>Scalable products: KFS released</a></li>
<li><a href='http://www.royans.net/arch/automated-faster-repeatable-scalable-deployments/' rel='bookmark' title='Permanent Link: Automated, faster, repeatable, scalable deployments'>Automated, faster, repeatable, scalable deployments</a></li>
<li><a href='http://www.royans.net/arch/service-registry-esb-for-scalable-web-applications/' rel='bookmark' title='Permanent Link: Service registry (ESB) for scalable web applications.'>Service registry (ESB) for scalable web applications.</a></li>
<li><a href='http://www.royans.net/arch/heroku-platform-for-scalable-applications/' rel='bookmark' title='Permanent Link: Heroku platform for scalable web applications'>Heroku platform for scalable web applications</a></li>
<li><a href='http://www.royans.net/arch/book-building-scalable-web-sites/' rel='bookmark' title='Permanent Link: Book: Building Scalable Web Sites'>Book: Building Scalable Web Sites</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/scalable-logging-using-syslog/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/scalable-logging-using-syslog/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=scalable-logging-using-syslog</feedburner:origLink></item>
		<item>
		<title>SimpleDB now allows you to tweak consistency levels</title>
		<link>http://feedproxy.google.com/~r/arch/~3/zbfe-lUcpDM/</link>
		<comments>http://www.royans.net/arch/simpledb-now-allows-you-to-tweak-consistency-levels/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 14:59:03 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[CAP]]></category>
		<category><![CDATA[simpledb]]></category>
		<category><![CDATA[updates]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/simpledb-now-allows-you-to-tweak-consistency-levels/</guid>
		<description>We discussed Brewer’s Theorem a few days ago and how it’s challenging to obtain Consistency, Availability and Partition tolerance in any distributed system. We also discussed that many of the distributed datastores allow CAP to be tweaked to attain certain operational goals. 
Amazon SimpleDB, which was released as an “Eventually Consistent” datastore,&amp;#160; today launched a [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/eventual-consistency-is-just-caching/' rel='bookmark' title='Permanent Link: Eventual consistency is just caching ?'&gt;Eventual consistency is just caching ?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/experimenting-with-simpledb-flagthiscom/' rel='bookmark' title='Permanent Link: Experimenting with SimpleDB (Flagthis.com)'&gt;Experimenting with SimpleDB (Flagthis.com)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/' rel='bookmark' title='Permanent Link: Brewers CAP Theorem on distributed systems'&gt;Brewers CAP Theorem on distributed systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/weekend-reading-material/' rel='bookmark' title='Permanent Link: Weekend reading material'&gt;Weekend reading material&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-stories-for-oct-22-2007/' rel='bookmark' title='Permanent Link: Scalability stories for Oct 22, 2007'&gt;Scalability stories for Oct 22, 2007&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p>We discussed <a href="http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/">Brewer’s Theorem</a> a few days ago and how it’s challenging to obtain Consistency, Availability and Partition tolerance in any distributed system. We also discussed that many of the <img style="float: right" alt="Amazon Web Services" src="http://awsmedia.s3.amazonaws.com/logo_aws.gif" width="164" height="60" />distributed datastores allow <strong>CAP</strong> to be tweaked to attain certain operational goals. </p>
<p>Amazon SimpleDB, which was released as an “Eventually Consistent” datastore, today launched a few features to do just that. </p>
<ul>
<li><strong>Consistent reads</strong>: Select and GetAttributes requests now include an optional Boolean flag “ConsistentRead” which asks the datastore to return only consistent results. If you have noticed scenarios where a read right after a write returned an old value, that should no longer happen when the flag is set. </li>
<li><strong>Conditional puts and deletes</strong>: When you provide a “condition” in the form of a key/value pair, SimpleDB executes the operation only if the condition holds and discards it otherwise. This might look like a minor feature, but it can go a long way in providing reliable datastore operations. </li>
</ul>
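<p>To show why a conditional put matters, here is a small Python sketch of the idea. It is an in-memory toy, not the SimpleDB API: the <code>ToyStore</code> class, its <code>expected</code> argument and the "stock" attribute are all invented for illustration.</p>

```python
class ConditionFailed(Exception):
    """Raised when the expected value does not match the stored one."""


class ToyStore:
    """In-memory stand-in for one item's attributes; NOT the SimpleDB client."""

    def __init__(self):
        self.attributes = {}

    def put(self, name, value, expected=None):
        # expected is a (name, value) pair that must match before writing.
        if expected is not None:
            expected_name, expected_value = expected
            if self.attributes.get(expected_name) != expected_value:
                raise ConditionFailed(expected_name)
        self.attributes[name] = value


store = ToyStore()
store.put("stock", 5)

# Two clients both read stock == 5 and try to decrement it.
store.put("stock", 4, expected=("stock", 5))      # first writer wins
try:
    store.put("stock", 4, expected=("stock", 5))  # stale read, rejected
except ConditionFailed:
    print("conflict detected; re-read and retry")

print(store.attributes["stock"])
```

<p>Without the condition, the second write would be accepted silently and one decrement would be lost.</p>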
<blockquote><p><em>Even though SimpleDB now enables operations that support a stronger consistency model, under the covers SimpleDB remains the same highly-scalable, highly-available, and highly durable structured data store. Even under extreme failure scenarios, such as complete datacenter failures, SimpleDB is architected to continue to operate reliably. However when one of these extreme failure conditions occurs it may be that the stronger consistency options are briefly not available while the software reorganizes itself to ensure that it can provide strong consistency. Under those conditions the default, eventually consistent read will remain available to use.</em></p>
</blockquote>
<p><strong>References</strong></p>
<ul>
<li><a href="http://www.allthingsdistributed.com/2010/02/strong_consistency_simpledb.html">Strong consistency in SimpleDB</a> </li>
<li><a href="http://aws.typepad.com/aws/2010/02/amazon-simpledb-consistency-enhancements.html">Amazon SimpleDB Consistency Enhancements</a> </li>
<li><a href="http://perspectives.mvdirona.com/2010/02/24/ILoveEventualConsistencyBut.aspx">I love eventual consistency but..</a> </li>
</ul>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/eventual-consistency-is-just-caching/' rel='bookmark' title='Permanent Link: Eventual consistency is just caching ?'>Eventual consistency is just caching ?</a></li>
<li><a href='http://www.royans.net/arch/experimenting-with-simpledb-flagthiscom/' rel='bookmark' title='Permanent Link: Experimenting with SimpleDB (Flagthis.com)'>Experimenting with SimpleDB (Flagthis.com)</a></li>
<li><a href='http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/' rel='bookmark' title='Permanent Link: Brewers CAP Theorem on distributed systems'>Brewers CAP Theorem on distributed systems</a></li>
<li><a href='http://www.royans.net/arch/weekend-reading-material/' rel='bookmark' title='Permanent Link: Weekend reading material'>Weekend reading material</a></li>
<li><a href='http://www.royans.net/arch/scalability-stories-for-oct-22-2007/' rel='bookmark' title='Permanent Link: Scalability stories for Oct 22, 2007'>Scalability stories for Oct 22, 2007</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/simpledb-now-allows-you-to-tweak-consistency-levels/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/simpledb-now-allows-you-to-tweak-consistency-levels/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=simpledb-now-allows-you-to-tweak-consistency-levels</feedburner:origLink></item>
		<item>
		<title>NoSQL in the Twitter world</title>
		<link>http://feedproxy.google.com/~r/arch/~3/a-dYYziAFe8/</link>
		<comments>http://www.royans.net/arch/nosql-in-the-twitter-world/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 15:42:58 +0000</pubDate>
		<dc:creator>Royans</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.royans.net/arch/nosql-in-the-twitter-world/</guid>
		<description>NoSQL solutions have one thing in common. They are generally designed for horizontal scalability. So it’s no wonder that a lot of applications in the “twitter” world have picked NoSQL based datastores for their persistence layer. Here is a collection of these apps from MyNoSQL blog. 

Twitter uses Cassandra
MusicTweets used Redis [ Ref ] – The [...]


Related posts:&lt;ol&gt;&lt;li&gt;&lt;a href='http://www.royans.net/arch/scalability-stories-15th-sept-mysql-proxy-cluster-fire-system-facebook-apps-and-twitter/' rel='bookmark' title='Permanent Link: Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter'&gt;Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href='http://www.royans.net/arch/scaling-updates-for-feb-18-2010/' rel='bookmark' title='Permanent Link: Scalability updates for Feb 18, 2010'&gt;Scalability updates for Feb 18, 2010&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
			<content:encoded><![CDATA[
<p><a href="http://en.wikipedia.org/wiki/NoSQL">NoSQL</a> solutions have one thing in common. They are generally designed for horizontal scalability. So it’s no wonder that a lot of applications in the “twitter” world have picked NoSQL based datastores for their <img style="float: right" alt="Twitter.com" src="http://a0.twimg.com/a/1266879478/images/twitter_logo_header.png" width="155" height="36" />persistence layer. Here is a collection of these apps from the <a href="http://nosql.mypopescu.com/post/406672401/more-nosql-based-twitter-apps">MyNoSQL</a> blog. </p>
<ol>
<li><a href="http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king">Twitter uses Cassandra</a></li>
<li>MusicTweets used <a href="http://code.google.com/p/redis/">Redis</a> [ <a href="http://rfw.posterous.com/musictweets-a-rediswebsocket-powered-experime">Ref</a> ] – The site is dead, but you can still read about it </li>
<li><a href="http://github.com/yssk22/tstore">Tstore</a> uses <a href="http://couchdb.apache.org/">CouchDB</a> </li>
<li><a href="http://retwis.antirez.com/">Retwis</a> uses <a href="http://code.google.com/p/redis/">Redis</a> </li>
<li><a href="http://retwisrb.danlucraft.com/login">Retwis-RB</a> uses <a href="http://code.google.com/p/redis/">Redis</a> and <a href="http://www.sinatrarb.com/">Sinatra ??</a>&#160; &#8211; No idea what sinatra is. Will have to look into it. <em>[ Update: Sinatra is not a DB store ]</em> </li>
<li><a href="http://locomotivation.squeejee.com/post/148492725/twitter-apps-sing-on-mongodb">Floxee</a> uses <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB</a> </li>
<li><a href="http://github.com/ieure/Twidoop">Twidoop</a> uses <a href="http://hadoop.apache.org/">Hadoop</a> </li>
<li><a href="http://git.chris-lamb.co.uk/?p=swordfish.git;a=tree">Swordfish</a>, built on top of <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a>, comes with a twitter clone app. </li>
<li><a href="http://tweetarium.com/">Tweetarium</a> uses <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> </li>
</ol>
<p><strong>References</strong></p>
<ul>
<li><a href="http://nosql.mypopescu.com/post/319859407/nosql-twitter-applications">NoSQL Twitter Applications</a> </li>
<li><a href="http://nosql.mypopescu.com/post/406672401/more-nosql-based-twitter-apps">More NoSQL Twitter apps</a> </li>
</ul>
<p>Do you know of any more ?</p>


<p>Related posts:<ol><li><a href='http://www.royans.net/arch/scalability-stories-15th-sept-mysql-proxy-cluster-fire-system-facebook-apps-and-twitter/' rel='bookmark' title='Permanent Link: Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter'>Scalability Stories (15th Sept) Mysql Proxy, Cluster Fire System, Facebook apps and Twitter</a></li>
<li><a href='http://www.royans.net/arch/scaling-updates-for-feb-18-2010/' rel='bookmark' title='Permanent Link: Scalability updates for Feb 18, 2010'>Scalability updates for Feb 18, 2010</a></li>
</ol></p>
]]></content:encoded>
			<wfw:commentRss>http://www.royans.net/arch/nosql-in-the-twitter-world/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.royans.net/arch/nosql-in-the-twitter-world/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=nosql-in-the-twitter-world</feedburner:origLink></item>
	</channel>
</rss>
