<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Server Density Blog</title>
	
	<link>http://blog.serverdensity.com</link>
	<description>Interesting devops tech stuff</description>
	<lastBuildDate>Sun, 16 Jun 2013 14:13:54 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/serverdensity" /><feedburner:info uri="serverdensity" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Sysadmin Sunday 131</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/LkM3UVRvP24/</link>
		<comments>http://blog.serverdensity.com/sysadmin-sunday-131/#comments</comments>
		<pubDate>Sun, 16 Jun 2013 14:13:54 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Sysadmin Sunday]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3592</guid>
		<description><![CDATA[<p>The Meteoric Rise of DigitalOcean Yes, you really can make complex webapps responsive Beyond the default Rails environments Effectively managing memory at Gmail scale The Death of Network Engineers; Long [...]</p><p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-131/">Sysadmin Sunday 131</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="http://buff.ly/1bzGnlb"><span></span>The Meteoric Rise of DigitalOcean</a></li>
<li><a href="http://buff.ly/19uH5no"><span></span>Yes, you really can make complex webapps responsive</a></li>
<li><a href="http://buff.ly/16iq1Ns"><span></span>Beyond the default Rails environments</a></li>
<li><a href="http://buff.ly/11HQJ0Z"><span></span>Effectively managing memory at Gmail scale</a></li>
<li><a href="http://buff.ly/11CzSg6"><span></span>The Death of Network Engineers; Long Live Network Engineers</a></li>
<li><a href="http://buff.ly/1364wg7"><span></span>The network is reliable</a></li>
<li><a href="http://buff.ly/1bnOgdy"><span></span>DevOps, the Title Match</a></li>
<li><a href="http://buff.ly/111uSlw"><span></span>ffind: a sane replacement for command line file search</a></li>
<li><a href="http://buff.ly/16SewQu"><span></span>Instagram: Making the Switch to Cassandra from Redis, a 75% &#8216;Insta&#8217; Savings</a></li>
<li><a href="http://buff.ly/11orTTC"><span></span>SELinux&#8217;s toxic mistake</a></li>
<li><a href="http://buff.ly/15E5x1f"><span></span>Google workshop: Web Front-End Latency</a></li>
</ul>
<p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-131/">Sysadmin Sunday 131</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p><img src="http://feeds.feedburner.com/~r/serverdensity/~4/LkM3UVRvP24" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/sysadmin-sunday-131/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/sysadmin-sunday-131/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=sysadmin-sunday-131</feedburner:origLink></item>
		<item>
		<title>Designing and printing (dot) notebooks</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/-59cekfcq7c/</link>
		<comments>http://blog.serverdensity.com/designing-and-printing-dot-notebooks/#comments</comments>
		<pubDate>Thu, 13 Jun 2013 13:29:20 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Devops]]></category>
		<category><![CDATA[sysadmin]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3561</guid>
		<description><![CDATA[<p>Since joining Server Density in October last year, I&#8217;ve been responsible for running the marketing efforts. A key belief of our CEO, David Mytton and I is that conferences are [...]</p><p>The post <a href="http://blog.serverdensity.com/designing-and-printing-dot-notebooks/">Designing and printing (dot) notebooks</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<p>Since joining <a href="https://www.serverdensity.com">Server Density</a> in October last year, I&#8217;ve been responsible for running the marketing efforts. A key belief of our CEO, <a href="https://www.serverdensity.com/people/">David Mytton</a> and I is that conferences are a great source of exposure for Server Density in 2 key ways. 1) They show a desire to give back to wonderful communities that exist. 2) They provide us with a platform to showcase what we&#8217;ve been working on in an engaging and technical context.</p>
<h2>Server Density at conferences</h2>
<p>Our key conference belief is <strong>not to sell anything</strong> &#8211; just nurture an interest. So we never pay for speaking slots and with the limited financial resources available to us, we opt to sponsor more with our time than with money. For that reason we keep an eye on <a href="http://lanyrd.com/server-density" title="Server Density on Lanyrd">lanyrd</a>, organise talk proposals and send our engineers around the world. </p>
<p>Following this theme, we try to offer <em>custom and useful</em> marketing swag to attendees upon registration, not lazy corporate marketing gumf. Instead of paying thousands of dollars to plaster our logo everywhere and gain exposure through forced tweets, we try to provide attendees with real value &#8211; whilst keeping the costs as low as possible.</p>
<h3>Providing valuable swag</h3>
<p>Bearing in mind the overwhelming marketing presence at conferences, the aim for us is to offer something that attendees actually like, take away and use. In the past that has been tape measures, <a href="http://blog.serverdensity.com/the-logistics-of-shipping-promo-lego-usb-drives-to-the-usa/">USB lego men</a> and, more recently, our <em>custom (dot) books</em>.</p>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/otherswag.jpg" alt="Server Density swag" width="100%" height="auto" class="alignnone size-full wp-image-3584" /></p>
<h2>The (dot) book</h2>
<p>The dot book is a 4 colour dot grid notebook. We&#8217;ve designed and printed 2000 so far and offered them out at various conferences over recent months, customised to the theme of the conference.</p>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/dotbooks2.jpg" alt="dotbooks" width="100%" height="auto" class="alignnone size-full wp-image-3582" /></p>
<ol>
<li>.bson notebook</li>
<li>.devops notebook</li>
<li>.pcap notebook</li>
<li>.nib notebook</li>
</ol>
<h3>The name</h3>
<p>The dot name reinforces the point that we&#8217;ve designed and printed the books specifically for people with technical knowledge &#8211; the names represent the filetype, language or framework that each book relates to:</p>
<p><strong>.bson</strong> is an interchange format used mainly as a data storage and network transfer format in MongoDB.</p>
<p><strong>.devops</strong> was not based on a file extension but just took the name of the idea behind combining operations and development.</p>
<p><strong>.pcap</strong> (packet capture) is based on the results of a packet capture session from something like Wireshark.</p>
<p><strong>.nib</strong> is the file extension for NeXT Interface Builder, which is primarily used for iOS development.</p>
<h3>The paper</h3>
<p>Striking the balance between thick enough to convey a message of quality, yet thin enough to close the book was difficult. After many iterations we found that balance with the following paper spec:</p>
<ul>
<li>300gsm uncoated 100% recycled paper cover</li>
<li>90gsm uncoated 100% recycled</li>
</ul>
<p>It was important for us to use <strong>100% fully recycled paper</strong>, although we had to be careful that with uncoated finishing, the paper was able to hold the colour that was being applied, in this instance 4 colour digital.</p>
<h3>The pages</h3>
<p>We&#8217;ve been able to customise the inner notebook pages as much as the outer spreads. Following the (dot) file extension theme, the writing paper is formatted in a dot grid. It has more value than simply reinforcing the name &#8211; dot grid paper has all the benefits of squared / lined paper, so you can write straight and do technical drawings, but the visual guides are more subtle &#8211; you should definitely try it if you haven&#8217;t!</p>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/inside-dot.jpg" alt="inside-dot" width="100%" height="auto" class="alignnone size-full wp-image-3583" /></p>
<p>In the example of our .nib books made for the <a href="http://altwwdc.com/">ALTWWDC conference</a>, we designed iOS wireframes for each right spread &#8211; as well as a dot grid of course!</p>
<h3>The cover pages</h3>
<p>We&#8217;ve made the covers minimal, but with colours that subtly hint at the context:</p>
<ul>
<li>The .bson is green because of MongoDB.</li>
<li>The .nib books are the bluey grey used by Apple on its developer pages.</li>
<li>The .pcap books are red because they were initially made for one of our engineer&#8217;s talks at <a href="https://atmosphere-conference.com/">Atmosphere conference</a>.</li>
<li>The .devops books are blue to follow the <a href="http://devopsdays.org">devopsdays.org</a> colours.</li>
</ul>
<p>The back of the notebooks showcases our only real attempt at &#8216;official / traditional marketing&#8217; with the company logo taking pride of place.</p>
<p>The inside covers are where the books get interesting and provide some real value: they are the pages we customise heavily for each book. They provide quick quips, tips and hints that people using the language <strong>will find useful</strong>.</p>
<h2>Inside the .bson</h2>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/bson.jpg" alt="bson" width="100%" height="auto" class="alignnone size-full wp-image-3586" /></p>
<h2>Inside the .devops</h2>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/devops.jpg" alt="devops" width="100%" height="auto" class="alignnone size-full wp-image-3587" /></p>
<h2>Inside the .pcap</h2>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/pcap.jpg" alt="pcap" width="800" height="631" class="alignnone size-full wp-image-3588" /></p>
<h2>Inside the .nib</h2>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/nib.jpg" alt="nib" width="100%" height="auto" class="alignnone size-full wp-image-3585" /></p>
<h2>The problems with production</h2>
<p>We&#8217;re a lucky lot working with the internet &#8211; we can build and test things instantaneously, deploy things with one click and change / fix bugs post production. Unfortunately, when sending an email with packaged artwork to a printer, you had better be 100% confident that you haven&#8217;t made a <span style="text-decoration: line-through;">tpyo</span> typo.</p>
<p>This was a major concern, not because we&#8217;ve had to send 1000s back, but because we needed to get things right the first time. I would encourage anyone undergoing a journey like this to leave yourself time for at least 2 iterations. Luckily we had enough time (and the printer&#8217;s patience) to go through 3 iteration stages. Here&#8217;s why:</p>
<ol>
<li>The cover pages were too thin and they lacked a quality feel.</li>
<li>The cover pages were too thick so the books wouldn&#8217;t close properly. In addition, we chose an 80gsm inner page stock which was too thin and didn&#8217;t withstand the fountain pen &#8220;seep test&#8221;.</li>
<li>Because the cover page had to wrap all of the inner pages, the design had to increase by 6mm and the inner pages reduce by 6mm towards the middle so all pages sit flush together when the book is closed. This also caused problems with content being off centre, so alignment was altered on a few occasions.</li>
</ol>
<h2>The future of the (dot) book</h2>
<p>So far we&#8217;ve received some great feedback about the (dot) books that we&#8217;ve had produced. We want to keep them coming and make them a Server Density staple. To close this off, is there a single notebook in particular that you might like to see / find useful in the near future?</p>
<p>The post <a href="http://blog.serverdensity.com/designing-and-printing-dot-notebooks/">Designing and printing (dot) notebooks</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p><img src="http://feeds.feedburner.com/~r/serverdensity/~4/-59cekfcq7c" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/designing-and-printing-dot-notebooks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/designing-and-printing-dot-notebooks/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=designing-and-printing-dot-notebooks</feedburner:origLink></item>
		<item>
		<title>Sysadmin Sunday 130</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/J41JdO5BhRk/</link>
		<comments>http://blog.serverdensity.com/sysadmin-sunday-130/#comments</comments>
		<pubDate>Sun, 09 Jun 2013 14:00:41 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Server Density]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3573</guid>
		<description><![CDATA[<p>Security is not the most important thing to most people Learning from other disciplines Where is all the nodejs malware? DevTools at Etsy What Is the Risk That Amazon Will [...]</p><p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-130/">Sysadmin Sunday 130</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="http://buff.ly/18LJ3z6"><span></span>Security is not the most important thing to most people</a></li>
<li><a href="http://buff.ly/14aJ4cl"><span></span>Learning from other disciplines</a></li>
<li><a href="http://buff.ly/11cRAm2"><span></span>Where is all the nodejs malware?</a></li>
<li><a href="http://buff.ly/10J1XSW"><span></span>DevTools at Etsy</a></li>
<li><a href="http://buff.ly/17fPU3w"><span></span>What Is the Risk That Amazon Will Go Down (Again)?</a></li>
<li><a href="http://buff.ly/17fPQAQ"><span></span>Data Driven Security &#8211; Managing Risk at Etsy</a></li>
<li><a href="http://buff.ly/17fPLgw"><span></span>Some tips on getting started with Vagrant and Chef</a></li>
<li><a href="http://buff.ly/17fG2GZ"><span></span>I&#8217;m not a DevOps&#8230;Are you an Agile?</a></li>
<li><a href="http://buff.ly/1aQM4uX"><span></span>The Canonical List of Hypervisors That Suck</a></li>
<li><a href="http://buff.ly/18Gpxl9"><span></span>Culture Hacking With A Staff Database</a></li>
<li><a href="http://buff.ly/146fbKq"><span></span>6 Quick Tips from Google for International Websites</a></li>
<li><a href="http://buff.ly/1aNndrM"><span></span>MPI Latency on Google Compute Engine</a></li>
</ul>
<p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-130/">Sysadmin Sunday 130</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p><img src="http://feeds.feedburner.com/~r/serverdensity/~4/J41JdO5BhRk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/sysadmin-sunday-130/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/sysadmin-sunday-130/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=sysadmin-sunday-130</feedburner:origLink></item>
		<item>
		<title>Multi data center redundancy – sysadmin considerations</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/-tiIiFkk2hM/</link>
		<comments>http://blog.serverdensity.com/multi-data-center-redundancy-sysadmin-considerations/#comments</comments>
		<pubDate>Thu, 06 Jun 2013 12:00:43 +0000</pubDate>
		<dc:creator>David Mytton</dc:creator>
				<category><![CDATA[Server Monitoring]]></category>
		<category><![CDATA[Servers]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3566</guid>
		<description><![CDATA[<p>Last week, I considered the implications for multi data center redundancy on your applications. This post will look at considerations for the sysadmin &#8211; network and server level failover, plus [...]</p><p>The post <a href="http://blog.serverdensity.com/multi-data-center-redundancy-sysadmin-considerations/">Multi data center redundancy – sysadmin considerations</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<p>Last week, I considered the <a href="http://blog.serverdensity.com/multi-data-center-redundancy-application-considerations/">implications for multi data center redundancy on your applications</a>. This post will look at considerations for the sysadmin &#8211; network and server level failover, plus other aspects of infrastructure which need to be considered.</p>
<h2>Network considerations for multi data center redundancy</h2>
<p>The network is the connecting layer between your customers and you, and between all your internal components across data centers. This means there are a few areas where you need to consider how failover works:</p>
<h3>Connecting you to your customers &#8211; IPs and DNS</h3>
<p>If one of your data centers fails then you need a way to redirect traffic to the secondary data center. We achieve this for <a href="http://www.serverdensity.com">Server Density</a> by using the <a href="http://blog.serverdensity.com/global-elastic-ips-multi-region-routing/">global IP service offered by Softlayer</a>. This is similar to Amazon&#8217;s Elastic IP service but allows you to point the IP to any server in any of their data centers, rather than being restricted by region. This allows us to redirect traffic to an entirely different data center without any customer impact &#8211; the IP stays the same, it&#8217;s the internal routing that changes. </p>
<p>However, there is a failure scenario where this doesn&#8217;t work. The change requires the original data center routers to acknowledge the new configuration before the new route is applied, which might fail if there is a network problem in the original data center. This is mitigated by using DNS failover.</p>
<p>Our DNS is set with a low TTL so that we can adjust the IPs we send traffic to. In the event we can&#8217;t reroute our global IP, we can update the DNS to point to an IP in another data center. We have already reserved these IPs and they are ready to run as hot standbys. The downside is that this can cause some downtime because DNS responses are cached for the length of the TTL. Some ISPs cache more aggressively, so the change is not guaranteed to reach all customers at the same time.</p>
<p>An alternative would be to use round robin DNS to provide all the IPs at once but this relies on the client to time out on failed connections and means some connections would always go to the secondary data center, which may not be optimal if you run a primary/secondary data center setup.</p>
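<p>To make the client-side behaviour that round robin DNS relies on more concrete, here is a minimal Python sketch. It is not taken from our setup: the function name, the <code>connect</code> hook and the 3 second timeout are assumptions for illustration only.</p>

```python
import socket


def connect_first_available(addresses, connect=None, timeout=3.0):
    """Try each address in order and return (address, connection) for the first success.

    `addresses` stands in for the records a round robin DNS lookup returns.
    `connect` is a hypothetical hook so the logic can be exercised without a
    network; by default it opens a TCP connection with a timeout.
    """
    if connect is None:
        def connect(addr, t):
            return socket.create_connection(addr, timeout=t)

    errors = []
    for addr in addresses:
        try:
            return addr, connect(addr, timeout)
        except OSError as exc:
            # A dead data center costs the client up to `timeout` seconds here.
            errors.append((addr, exc))
    raise ConnectionError(f"all {len(addresses)} addresses failed: {errors}")
```

<p>The point of the sketch is the cost: every client pays the timeout for each dead address before it reaches a working one, which is why this approach depends so heavily on client behaviour.</p>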
<h3>Internal connectivity</h3>
<p>This is mostly about latency. Within the data center you can usually expect sub-1ms round trip times but as soon as you start to go between data centers then this increases based on distance. </p>
<p>From Washington, DC to San Jose, CA in the USA we see round trip times of around 72ms. Across the Atlantic you can expect anything up to 100ms, trans-Pacific up to 150ms and between Europe and Japan up to 300ms.</p>
<p>The implications of this are relevant for database replication &#8211; can you survive eventual consistency or do you need to guarantee the data gets to all your data centers? <a href="http://aphyr.com/tags/jepsen">Network partitions can also be a concern</a>.</p>
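<p>As a rough illustration of why latency matters for replication, the Python sketch below computes an upper bound on throughput for a single connection that waits for one full round trip to the remote data center before starting the next write. This is a simplifying assumption: real databases batch and pipeline writes, so treat the numbers as a back-of-the-envelope guide only.</p>

```python
def max_serial_writes_per_second(rtt_ms):
    """Upper bound on writes/sec for one connection that waits a full round trip per write."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return 1000.0 / rtt_ms


# Round trip times quoted above; the in-data-center figure uses 1 ms as a round number.
samples = [
    ("same data center", 1),
    ("Washington DC to San Jose", 72),
    ("trans-Atlantic", 100),
    ("trans-Pacific", 150),
    ("Europe to Japan", 300),
]
for label, rtt in samples:
    print(f"{label}: at most {max_serial_writes_per_second(rtt):.1f} serial writes/sec")
```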
<p>It&#8217;s also relevant for file transfers, particularly for things like backup. If you have to restore an offsite backup to your secondary data center, <a href="http://blog.serverdensity.com/ups-delivers-at-11-22mbs/">how long will the file transfer actually take</a>?</p>
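<p>A quick estimate helps answer that question before you need to. The Python sketch below is a simple calculation, not a measurement: the function name and the <code>efficiency</code> parameter (the fraction of nominal bandwidth you actually achieve) are assumptions for illustration.</p>

```python
def transfer_hours(size_gb, link_mbit_per_s, efficiency=1.0):
    """Estimate hours to move `size_gb` gigabytes (decimal, 10**9 bytes) over a link.

    `efficiency` is an assumed fraction of the nominal bandwidth actually achieved.
    """
    if link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    megabits = size_gb * 8 * 1000  # 1 GB = 8,000 megabits
    seconds = megabits / (link_mbit_per_s * efficiency)
    return seconds / 3600


# A hypothetical 500 GB backup over a 100 Mbit/s link takes roughly 11 hours.
print(f"{transfer_hours(500, 100):.1f} hours")
```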
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/ping.png" alt="Internal ping" width="675" height="184" class="alignnone size-full wp-image-3570" /></p>
<h2>Server &#038; ops considerations for multi data center redundancy</h2>
<h3>Load balancer failover</h3>
<p>We run <a href="http://blog.serverdensity.com/how-to-configure-nginx-as-a-load-balancer/">load balancer pairs using nginx</a> &#8211; one active, one hot standby. These are monitored externally and if there is a failure, our global IP is automatically rerouted to the standby load balancer and an on-call alert triggered. </p>
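<p>The decision an external monitor has to make can be sketched in a few lines of Python. This is not our monitoring code: the balancer names and the threshold of 3 consecutive failed checks are hypothetical, and the actual rerouting and alerting are left out.</p>

```python
def choose_target(check_results, active="lb-active", standby="lb-standby", threshold=3):
    """Return (target, alert) after reading a sequence of health check results.

    `check_results` is an iterable of booleans for the active balancer, oldest first.
    The names and the threshold of 3 consecutive failures are illustrative assumptions.
    """
    consecutive_failures = 0
    for healthy in check_results:
        consecutive_failures = 0 if healthy else consecutive_failures + 1
        if consecutive_failures >= threshold:
            # Reroute the IP to the standby and page whoever is on call.
            return standby, True
    return active, False
```

<p>Requiring several consecutive failures means a single dropped check does not trigger a reroute.</p>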
<h3>Gateway access</h3>
<p>Almost none of our servers can be connected to via SSH from the public internet &#8211; all access goes through a single gateway server, including across multiple data centers. Of course, if the data center where this is located goes down then you lose all access. So we have a duplicate in our secondary data center.</p>
<p>An alternative to this is to use a VPN to connect directly into the environment network. I don&#8217;t like this approach because it opens up the entire network to your local system, which can lead to mistakes e.g. connecting to production by accident.</p>
<h3>Self-hosted tools e.g. config management</h3>
<p>We run Puppet to manage configuration across all our servers and amongst other things, it manages our internal hostnames through a centralised <code>/etc/hosts</code> file. When servers need to be replaced or IPs changed, <a href="http://blog.serverdensity.com/deploying-nginx-with-puppet/">we can do this using Puppet</a> but it has no built in redundancy so we have to set up a replacement Puppet slave ready as a hot standby.</p>
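<p>To show what a centralised hosts file amounts to, here is a small Python sketch that renders one from a mapping of hostnames to IPs. It is not how Puppet does it and the hostnames are made up; it only illustrates that replacing a server becomes a one-line change in one place.</p>

```python
def render_hosts(entries):
    """Render /etc/hosts content from a {hostname: ip} mapping, sorted for stable diffs."""
    lines = ["127.0.0.1 localhost"]
    for hostname, ip in sorted(entries.items()):
        lines.append(f"{ip} {hostname}")
    return "\n".join(lines) + "\n"


def replace_server(entries, hostname, new_ip):
    """Return a new mapping with `hostname` pointing at `new_ip`; the input is not modified."""
    updated = dict(entries)
    updated[hostname] = new_ip
    return updated


# Hypothetical internal hostnames and private IPs.
hosts = {"db1.internal": "10.0.0.5", "app1.internal": "10.0.0.4"}
print(render_hosts(replace_server(hosts, "db1.internal", "10.1.0.5")))
```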
<p>This applies to other tools you might be using. Do you have a central logging server? A backup management system? Maybe you even run your own monitoring! The advantage of using SaaS products is you don&#8217;t have to think about redundancy and monitoring for these tools, but some are best run yourself. In those cases, you need to consider the failover of each tool and how your response might be impacted if that tool was unavailable.</p>
<p><a href="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/01/5.png"><img class="alignnone size-medium wp-image-3397" alt="5" src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/01/5-300x226.png" width="300" height="226" /></a> <a href="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/01/7.png"><img class="alignnone size-medium wp-image-3399" alt="7" src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/01/7-300x292.png" width="300" height="292" /></a></p>
<h3>Monitoring</h3>
<p>You need to know when you&#8217;re completely down, but also when certain failure scenarios take place: usually low traffic, high latency, increased errors. This requires a combination of monitoring across your entire infrastructure: from remote tests for response time through to end to end testing of request pipelines. Ideally, you will detect problems before customers start noticing.</p>
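<p>The three failure signals above can be checked against a baseline with very little code. The Python sketch below is only an illustration: the metric names and every threshold are assumptions and would need tuning for each service.</p>

```python
def detect_problems(metrics, baseline, traffic_drop=0.5, latency_factor=2.0, max_error_rate=0.01):
    """Return a list of problem names for one sample compared with a baseline.

    All thresholds are illustrative assumptions and would need tuning per service.
    """
    problems = []
    if metrics["requests_per_s"] < baseline["requests_per_s"] * traffic_drop:
        problems.append("low traffic")
    if metrics["latency_ms"] > baseline["latency_ms"] * latency_factor:
        problems.append("high latency")
    if metrics["error_rate"] > max_error_rate:
        problems.append("increased errors")
    return problems
```

<p>Note that the service can be answering requests and still trip all three checks, which is why an up/down test alone is not enough.</p>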
<p><a href="http://www.serverdensity.com"><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/sd-screenshot.png" alt="Server Density Screenshot" width="925" height="409" class="alignnone size-full wp-image-3567" /></a></p>
<h2>Communication</h2>
<p>The worst thing as a customer is to see problems with a service but have no idea what is happening. This is where <a href="http://status.serverdensity.com">public status pages</a> come in. But it&#8217;s not enough just to post when there are problems &#8211; if your service is critical to your customers, they need to be able to subscribe to be warned when there are issues. </p>
<p>Server Density monitoring is critical to our customers so if we have a service issue where our monitoring stops, they need to know immediately so they can manually monitor their systems. The same with services like PagerDuty &#8211; if their alerting system is down, you have no way to know when your own services are down!</p>
<p>You can build your own fancy status pages like Heroku and AWS do. But there are services like <a href="https://statuspage.io">StatusPage.io</a> which provide a hosted page for you.</p>
<p>It&#8217;s also worth considering <a href="http://community.uservoice.com/blog/where-to-communicate-outages/">where else you should be telling your customers</a> &#8211; many will be looking to social media if there are problems. </p>
<p><a href="http://status.aws.amazon.com/"><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/06/aws-status.png" alt="AWS Status" width="834" height="514" class="alignnone size-full wp-image-3568" /></a></p>
<h2>Automatic failover</h2>
<p>We have <a href="http://blog.serverdensity.com/deploying-nginx-with-puppet/">automated failover of our nginx load balancer pairs</a> but a full data center failover requires a manual process. The problem with automated failover is the <a href="http://blog.serverdensity.com/avoiding-flapping/">potential for flapping</a>, which can make a situation even more confusing.</p>
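<p>One common way to limit flapping is a hold-down period: once a switch has happened, refuse to switch again until a minimum time has passed. The Python sketch below shows the idea only; the class name and the 300 second default are hypothetical and it is not a description of our own tooling.</p>

```python
class FailoverGuard:
    """Allow a switch only if the last one happened at least `hold_seconds` ago."""

    def __init__(self, hold_seconds=300):
        self.hold_seconds = hold_seconds
        self.last_switch = None

    def may_switch(self, now):
        """Return True and record the switch if the hold-down period has passed."""
        if self.last_switch is not None and now - self.last_switch < self.hold_seconds:
            return False  # Too soon: switching again would be flapping.
        self.last_switch = now
        return True
```

<p>Here <code>now</code> is a timestamp in seconds supplied by the caller, which keeps the guard easy to test.</p>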
<p>Going to a full multi data center deployment all serving live traffic is probably the best way to do this because the requirements of that setup naturally mean each data center can serve traffic. Then it&#8217;s just a case of removing a data center from the rotation if it goes down, rather than having to deal with switching and failover.</p>
<p>Whether you do manual or automated failover, checklists are valuable. They help avoid mistakes with manual processes and allow you to run through to make sure everything is confirmed working when an automated failover happens. We have checklists for our manual data center failover process, load balancer failovers and also a recovery checklist to ensure all core functionality is working after an outage.</p>
<h2>Conclusions</h2>
<p>Full multi data center redundancy is a long process and always increases raw hosting costs. The stage of the business determines when this should be done. As you grow in revenue and customers, it becomes more appropriate, especially if people are relying on the service. </p>
<p>Depending on how critical your service is to your customers, it may not be worth it if you assume your downtime will be relatively short (hours). You have to <a href="http://programming.oreilly.com/2013/05/what-is-the-risk-that-amazon-will-go-down-again.html">weigh up whether the likelihood of a prolonged outage</a> from extreme events like Hurricane Sandy, which could mean days of downtime, justifies the effort.</p>
<p>You&#8217;ll also probably find other things not mentioned in this post, so feel free to comment with anything else you find in your own quest for full multi data center redundancy!</p>
<p>The post <a href="http://blog.serverdensity.com/multi-data-center-redundancy-sysadmin-considerations/">Multi data center redundancy – sysadmin considerations</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p><img src="http://feeds.feedburner.com/~r/serverdensity/~4/-tiIiFkk2hM" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/multi-data-center-redundancy-sysadmin-considerations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/multi-data-center-redundancy-sysadmin-considerations/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=multi-data-center-redundancy-sysadmin-considerations</feedburner:origLink></item>
		<item>
		<title>Sysadmin Sunday 129</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/HyrWzUGsTuc/</link>
		<comments>http://blog.serverdensity.com/sysadmin-sunday-129/#comments</comments>
		<pubDate>Sun, 02 Jun 2013 14:00:39 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Server Density]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3563</guid>
		<description><![CDATA[<p>Google Finds NUMA Up to 20% Slower for Gmail and Websearch Watchman: Faster builds with large source trees SSH uses four TCP segments for each character you type automating/managing ssh [...]</p><p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-129/">Sysadmin Sunday 129</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="http://buff.ly/10DalmZ"><span></span>Google Finds NUMA Up to 20% Slower for Gmail and Websearch</a></li>
<li><a href="http://buff.ly/10D9RNC"><span></span>Watchman: Faster builds with large source trees</a></li>
<li><a href="http://buff.ly/18yC9gB"><span></span>SSH uses four TCP segments for each character you type</a></li>
<li><a href="http://buff.ly/18q8HJH"><span></span>automating/managing ssh configurations with python</a></li>
<li><a href="http://buff.ly/117CQZD"><span></span>Where did all the HTTP referrers go?</a></li>
<li><a href="http://buff.ly/117COkt"><span></span>Anatomy of a hack: How crackers ransack passwords like “qeadzcwrsfxv1331”</a></li>
<li><a href="http://buff.ly/1527fJy"><span></span>&#8220;Results of the Debian systemd survey&#8221;</a></li>
<li><a href="http://buff.ly/1ambhgj"><span></span>Linux Bumps Windows On ISS</a></li>
<li><a href="http://buff.ly/18co6xd"><span></span>Improving the security of your SSH private key files</a></li>
<li><a href="http://buff.ly/10RrpkP"><span></span>vnc over gif &#8211; &#8220;Serves screen updates as animated gif over http&#8221;</a></li>
<li><a href="http://buff.ly/10P9QSe"><span></span>Trying out this Go thing</a></li>
<li><a href="http://buff.ly/13O3dF6"><span></span>Amazon DynamoDB &#8211; Parallel Scans, 4x Cheaper Reads, Other Good News</a></li>
<li><a href="http://buff.ly/13NCkkE"><span></span>Sears is Turning Shuttered Stores into Data Centers</a></li>
<li><a href="http://buff.ly/13NCjx2"><span></span>IPv6 address space layout best practices</a></li>
<li><a href="http://buff.ly/13KAgtw"><span></span>XBox One: &#8220;things like Hyper V, virtualization, and 64-bit processors really started in a data center&#8221;</a></li>
</ul>
<p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-129/">Sysadmin Sunday 129</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p><img src="http://feeds.feedburner.com/~r/serverdensity/~4/HyrWzUGsTuc" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/sysadmin-sunday-129/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/sysadmin-sunday-129/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=sysadmin-sunday-129</feedburner:origLink></item>
		<item>
		<title>Multi data center redundancy – application considerations</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/zGD0lLg2M-Y/</link>
		<comments>http://blog.serverdensity.com/multi-data-center-redundancy-application-considerations/#comments</comments>
		<pubDate>Thu, 30 May 2013 12:00:52 +0000</pubDate>
		<dc:creator>David Mytton</dc:creator>
				<category><![CDATA[Server Monitoring]]></category>
		<category><![CDATA[Servers]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3558</guid>
		<description><![CDATA[<p>A few months ago we finished a long running multi data center redundancy project to allow our server monitoring service, Server Density, to survive the complete failure of our primary [...]</p><p>The post <a href="http://blog.serverdensity.com/multi-data-center-redundancy-application-considerations/">Multi data center redundancy &#8211; application considerations</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<p>A few months ago we finished a long running multi data center redundancy project to allow our server monitoring service, <a href="http://www.serverdensity.com">Server Density</a>, to survive the complete failure of our primary data center. We can now failover to another data center either with no customer impact or with minimal downtime (depending on the failure scenario).</p>
<p>There are three main types of multi data center deployments:</p>
<ol>
<li><strong>Disaster recovery</strong>: you maintain servers in an additional data center which allows you to recover data in the event your primary data center is destroyed. It&#8217;s designed as a last resort after a catastrophic event and doesn&#8217;t usually allow you to run normal operations from the secondary data center. This is more like a backup.</li>
<li><strong>Hot standby</strong>: you maintain duplicate servers in an additional data center which are running and ready to take over immediately in the event the primary data center fails. The entire application can fail over very quickly, which means you can survive the destruction of the primary data center or fail over for planned maintenance; basically any event where the primary data center is unavailable.</li>
<li><strong>Live traffic handling</strong>: the same as the hot standby option, but the additional data center (or data centers) serves traffic too, so there is often no real &#8220;primary&#8221;. This is often used to locate the application closer to the user and tends to use some kind of geographical load balancing, such as anycast DNS, to route users to their closest location.</li>
</ol>
<p>These are ordered by complexity and cost and are generally implemented in that order; for example, to have a live traffic handling data center you need all the same things as a hot standby facility. Each one gets more expensive because you have to duplicate servers with sufficient resources to take over live traffic, even if they&#8217;re not used (in the case of hot standby). Server Density has had a disaster recovery setup for several years and we recently upgraded to hot standby ability, with a view to moving up to live traffic handling in the future.</p>
<p><a href="http://www.softlayer.com/about/datacenters"><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/05/softlayer-us-map.png" alt="SoftLayer US POP Map" width="521" height="349" class="alignnone size-full wp-image-3559" /></a></p>
<h2>Application considerations for multi data center redundancy</h2>
<p>There are two main aspects of implementing multi data center redundancy. The first is the sysadmin, network and server engineering work that needs to be done to deploy multiple servers and set up the failover mechanism, which will be covered in a future post. But before that, some preliminary work needs to be done to get your application ready to handle switching data centers.</p>
<h3>Database failover</h3>
<p>Databases are perhaps the most complicated component to scale, and database failover is very product specific. Even so, there are a number of generic things to consider whichever database you&#8217;re actually using:</p>
<ul>
<li>Replication: this is almost certainly how you&#8217;re going to handle failover on the database. Consider how replication is implemented with regard to master/slave and how failover is triggered.</li>
<li>How far behind are your slaves? Across regions there will be some replication lag due to network latency. Can your application handle some data being &#8220;lost&#8221; because it hasn&#8217;t been replicated yet, or do you need to ensure strong consistency?</li>
<li>How does your database handle <a href="http://aphyr.com/tags/jepsen">split brain conditions when the network partitions</a>? Do you need an independent node in a third data center to arbitrate over which node becomes master?</li>
<li>How does your application detect a change in database master? Does this even matter? Will your users get errors or will it happen automatically?</li>
</ul>
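<p>As a rough illustration of the replication lag and arbitration questions above, here is a minimal Python sketch. It is not tied to any particular database: the <code>Replica</code> class, the lag threshold and the voter counts are all hypothetical names chosen for the example, and a real database will have its own election mechanism.</p>

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str            # hypothetical label for the node
    lag_seconds: float   # how far this replica is behind the old master
    reachable: bool      # whether we can currently talk to it


def has_majority(votes_for: int, total_voters: int) -> bool:
    """True when strictly more than half of all voters agree."""
    return votes_for * 2 > total_voters


def pick_new_master(replicas, reachable_voters, total_voters, max_lag_seconds):
    """Return the least-lagged reachable replica, or None if promotion is unsafe.

    reachable_voters counts every node we can see, including an arbiter
    in a third data center if one exists (an assumption of this sketch).
    """
    # Without a majority we may be on the minority side of a partition,
    # so promoting anything here risks a split brain.
    if not has_majority(reachable_voters, total_voters):
        return None
    candidates = [r for r in replicas if r.reachable]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: r.lag_seconds)
    # Promoting a replica that is too far behind would "lose" recent writes.
    if best.lag_seconds > max_lag_seconds:
        return None
    return best
```

<p>With two data centers and no arbiter, a partition leaves each side with one vote out of two, so neither has a majority. That is the case the third, independent node is there to resolve.</p>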
<h3>Functionality</h3>
<p>In hot standby setups, you can make tradeoffs if you don&#8217;t want to/can&#8217;t afford to replicate every single component. This means your application will need to degrade gracefully depending on the failure scenario; something which works well with service orientated architectures. For example, you could temporarily disable profile image uploading rather than duplicating large numbers of photos across data centers.</p>
<p>This requires your application to know when there is a failover situation, which can be done using a manual config flag that&#8217;s set as part of your failover process. Alternatively, you could set environment variables so the application knows which data center it is being served from and handles the situation appropriately.</p>
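<p>A minimal Python sketch of the flag approach might look like the following. The variable names <code>APP_FAILOVER</code> and <code>APP_DATA_CENTER</code>, and the feature name, are assumptions made up for this example rather than anything Server Density actually uses.</p>

```python
import os

# Hypothetical set of features that are switched off during a failover.
DISABLED_IN_FAILOVER = {"profile_image_upload"}


def current_mode(environ=os.environ):
    """Read the (assumed) APP_FAILOVER flag set by the failover process."""
    return "failover" if environ.get("APP_FAILOVER") == "1" else "normal"


def serving_data_center(environ=os.environ):
    """Name of the data center this process runs in, if it was configured."""
    return environ.get("APP_DATA_CENTER", "unknown")


def feature_enabled(feature, environ=os.environ):
    """Features degrade gracefully: only the listed ones are disabled."""
    if current_mode(environ) == "failover":
        return feature not in DISABLED_IN_FAILOVER
    return True
```

<p>Request handlers can then call <code>feature_enabled("profile_image_upload")</code> before accepting an upload and return a friendly message instead of an error when it is off.</p>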
<h3>Assumptions</h3>
<p>Your application might make some assumptions about the availability of local resources. Paths, hostnames or IPs might be hard coded. You&#8217;ll need to audit your code to find out what assumptions have been made and good testing will reveal anything you have missed.</p>
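<p>Part of such an audit can be automated. The sketch below is a simple heuristic in Python that flags anything that looks like a hard coded IPv4 address. It will also match four-part version strings and will not find hostnames or paths, so treat it as a starting point only; the function name is hypothetical.</p>

```python
import re

# Matches dotted-quad candidates such as 10.0.0.5; octet ranges are checked below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_hardcoded_ips(source_text):
    """Return (line_number, address) pairs for every IPv4-looking literal."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        for match in IPV4_CANDIDATE.findall(line):
            # Discard candidates with an octet above 255.
            if all(255 >= int(octet) for octet in match.split(".")):
                hits.append((number, match))
    return hits
```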
<h3>Notifying users</h3>
<p>It can be useful to display a banner to users when there is a failover condition, especially if certain features are disabled or performance drops because of increased latency or cold caches. </p>
<p><img src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/05/facebook-unavailable-1024x222.jpg" alt="Facebook temporarily unavailable" width="1024" height="222" class="alignnone size-large wp-image-3560" /></p>
<h3>Local data e.g. sessions</h3>
<p>Sessions are often implemented using some kind of local storage. This problem tends to be solved as part of needing to balance traffic across multiple servers but it&#8217;s also worth considering how this works across data centers. If you use a database for storing session data then it will be replicated already, but be careful if you are using file or load balancer based session handling.</p>
<h3>Don&#8217;t forget websites</h3>
<p>Your core application needs to be able to fail over to allow existing users to continue using it, but don&#8217;t forget your product websites and billing systems. It&#8217;s good practice to consider your website a separate product with its own redundancy and deployment mechanism because, for many businesses, it&#8217;s the sole location for new customers to find out about and sign up to the service.</p>
<h2>First step completed</h2>
<p>With all these considered and &#8220;fixed&#8221;, the next step is to start duplicating servers in a secondary location&#8230;which will be the topic of the next blog post!</p>
<p>The post <a href="http://blog.serverdensity.com/multi-data-center-redundancy-application-considerations/">Multi data center redundancy &#8211; application considerations</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/multi-data-center-redundancy-application-considerations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/multi-data-center-redundancy-application-considerations/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=multi-data-center-redundancy-application-considerations</feedburner:origLink></item>
		<item>
		<title>Sysadmin Sunday 128</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/_nfIMTBwzyw/</link>
		<comments>http://blog.serverdensity.com/sysadmin-sunday-128/#comments</comments>
		<pubDate>Sun, 26 May 2013 14:00:29 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Sysadmin Sunday]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3556</guid>
		<description><![CDATA[<p>storm is a command line tool to manage your ssh connections What DuckDuckGo can offer you in the way of private communication Phantompy is a headless WebKit engine Firefox Developer [...]</p><p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-128/">Sysadmin Sunday 128</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="http://buff.ly/12rCOtJ"><span></span>storm is a command line tool to manage your ssh connections</a></li>
<li><a href="http://buff.ly/165oiyl"><span></span>What DuckDuckGo can offer you in the way of private communication</a></li>
<li><a href="http://buff.ly/164QxNA"><span></span>Phantompy is a headless WebKit engine</a></li>
<li><a href="http://buff.ly/12S5Q6t"><span></span>Firefox Developer Tool Features for Firefox 23</a></li>
<li><a href="http://buff.ly/162vxXZ"><span></span>Why I left Heroku, and notes on my new AWS setup</a></li>
<li><a href="http://buff.ly/12pj5uG"><span></span>Overwhelmed by JavaScript Dependencies</a></li>
<li><a href="http://buff.ly/10IOUQS"><span></span>Call me maybe: MongoDB</a></li>
<li><a href="http://buff.ly/19Ur45Y"><span></span>Lessons Learned and Questions Raised From Building Distributed Systems </a></li>
<li><a href="http://buff.ly/10Ia1D3"><span></span>Google&#8217;s Scaled Trunk Based Development</a></li>
<li><a href="http://buff.ly/16JuxHJ"><span></span>Deploying Node.js with systemd</a></li>
<li><a href="http://buff.ly/19RZFBx"><span></span>Docker automates the deployment of applications as highly portable, self-sufficient containers</a></li>
<li><a href="http://buff.ly/10CbieR"><span></span>Call me maybe: Carly Rae Jepsen and the perils of network partitions</a></li>
<li><a href="http://buff.ly/12jhUwY"><span></span>skype backdoor confirmation</a></li>
<li><a href="http://buff.ly/13AXaDB"><span></span>fish shell 2.0 &#8211; a fully-equipped command line shell</a></li>
<li><a href="http://buff.ly/13zuRpj"><span></span>AWS Redshift: How Amazon Changed The Game</a></li>
</ul>
<p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-128/">Sysadmin Sunday 128</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/sysadmin-sunday-128/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/sysadmin-sunday-128/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=sysadmin-sunday-128</feedburner:origLink></item>
		<item>
		<title>Sysadmin Sunday 127</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/vsVLY-14zNs/</link>
		<comments>http://blog.serverdensity.com/sysadmin-sunday-127/#comments</comments>
		<pubDate>Sun, 19 May 2013 14:00:33 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Sysadmin Sunday]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3554</guid>
		<description><![CDATA[<p>WordPress Caching with Nginx and Redis How mitmproxy works Facebook June Open Compute hardware hackathon How I &#8216;stole&#8217; $14 million from a bank: A security tester&#8217;s tale Choosing the Right [...]</p><p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-127/">Sysadmin Sunday 127</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="http://buff.ly/13zgzFa"><span></span>WordPress Caching with Nginx and Redis</a></li>
<li><a href="http://buff.ly/13yqkmV"><span></span>How mitmproxy works</a></li>
<li><a href="http://buff.ly/15NEOTH"><span></span>Facebook June Open Compute hardware hackathon</a></li>
<li><a href="http://buff.ly/12blRnk"><span></span>How I &#8216;stole&#8217; $14 million from a bank: A security tester&#8217;s tale</a></li>
<li><a href="http://buff.ly/129uQVY"><span></span>Choosing the Right EC2 Instance Type for Your Application</a></li>
<li><a href="http://buff.ly/19z0dvS"><span></span>Google introduces a new NoSQL Database</a></li>
<li><a href="http://buff.ly/1287YGA"><span></span>This Is the Most Detailed Picture of the Internet Ever (and Making it Was Very Illegal)</a></li>
<li><a href="http://buff.ly/127RrlO"><span></span>The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution</a></li>
<li><a href="http://buff.ly/16tQydr"><span></span>TDD your DevOps with test-kitchen 1.0</a></li>
<li><a href="http://buff.ly/19hVrmB"><span></span>The Human Side of Postmortems (free O&#8217;Reilly eBook)</a></li>
<li><a href="http://buff.ly/16tOC4H"><span></span>Center for Internet Security Linux Benchmark implementation for PuppetLabs</a></li>
<li><a href="http://buff.ly/16jfcNL"><span></span>Facebook aims to knock Cisco down a peg with open network hardware</a></li>
<li><a href="http://buff.ly/192gOIg"><span></span>How the Syrian Electronic Army Hacked The Onion</a></li>
</ul>
<p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-127/">Sysadmin Sunday 127</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/sysadmin-sunday-127/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/sysadmin-sunday-127/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=sysadmin-sunday-127</feedburner:origLink></item>
		<item>
		<title>Sysadmin Sunday 126</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/EbefMLE3kpo/</link>
		<comments>http://blog.serverdensity.com/sysadmin-sunday-126/#comments</comments>
		<pubDate>Sun, 12 May 2013 14:00:52 +0000</pubDate>
		<dc:creator>Rufus</dc:creator>
				<category><![CDATA[Sysadmin Sunday]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3552</guid>
		<description><![CDATA[<p>Wikipedia Adopts MariaDB The Security Benefits of RPM Packaging Doom Your Chef in 3 Easy Steps What TokuDB might mean for MongoDB CDNs fail, but your scripts don&#8217;t have to [...]</p><p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-126/">Sysadmin Sunday 126</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="http://buff.ly/10NN1W4"><span></span>Wikipedia Adopts MariaDB</a></li>
<li><a href="http://buff.ly/10NNgQR"><span></span>The Security Benefits of RPM Packaging</a></li>
<li><a href="http://buff.ly/Zf8u3y"><span></span>Doom Your Chef in 3 Easy Steps</a></li>
<li><a href="http://buff.ly/ZPdznV"><span></span>What TokuDB might mean for MongoDB</a></li>
<li><a href="http://buff.ly/ZUMFem"><span></span>CDNs fail, but your scripts don&#8217;t have to &#8211; fallback from CDN to local jQuery </a></li>
<li><a href="http://buff.ly/13P80Ew"><span></span>How to use MongoDB as a pure in-memory DB (Redis style)</a></li>
<li><a href="http://buff.ly/10uNJQx"><span></span>Managing multiple MongoDB clusters with chef</a></li>
<li><a href="http://buff.ly/10uNKE3"><span></span>Redis partial word match, you (auto)complete me</a></li>
<li><a href="http://buff.ly/10uNQeQ"><span></span>Cassandra anti-patterns: Queues and queue-like datasets</a></li>
<li><a href="http://buff.ly/ZuObzs"><span></span>2 advanced techniques to dramatically increase the performance of your responsive website</a></li>
<li><a href="http://buff.ly/140z6c0"><span></span>Google&#8217;s buildings are hackable</a></li>
<li><a href="http://buff.ly/15muFgl"><span></span>DevOps and cloud: A view from outside the Bay Area bubble</a></li>
<li><a href="http://buff.ly/15mQFrD"><span></span>nginx security advisory (CVE-2013-2028)</a></li>
<li><a href="http://buff.ly/15muG3O"><span></span>DevopsDays 2013 &#8211; we are avoiding culture, why?</a></li>
<li><a href="http://buff.ly/15oSMuX"><span></span>SELinux &#038; Return On Time Invested</a></li>
</ul>
<p>The post <a href="http://blog.serverdensity.com/sysadmin-sunday-126/">Sysadmin Sunday 126</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/sysadmin-sunday-126/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/sysadmin-sunday-126/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=sysadmin-sunday-126</feedburner:origLink></item>
		<item>
		<title>Many projects with Vagrant and Puppet</title>
		<link>http://feedproxy.google.com/~r/serverdensity/~3/-HHCfQifljQ/</link>
		<comments>http://blog.serverdensity.com/many-projects-with-vagrant-and-puppet/#comments</comments>
		<pubDate>Tue, 07 May 2013 11:46:30 +0000</pubDate>
		<dc:creator>Tom Wardill</dc:creator>
				<category><![CDATA[Puppet]]></category>
		<category><![CDATA[Vagrant]]></category>

		<guid isPermaLink="false">http://blog.serverdensity.com/?p=3543</guid>
		<description><![CDATA[<p>When we started Server Density v2, one of the main ideas was to build it as a collection of RESTful services, all talking over HTTP. Initially, these were installed locally [...]</p><p>The post <a href="http://blog.serverdensity.com/many-projects-with-vagrant-and-puppet/">Many projects with Vagrant and Puppet</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></description>
				<content:encoded><![CDATA[<p>When we started <a href="http://www.serverdensity.com/comingsoon/">Server Density v2</a>, one of the main ideas was to build it as a collection of RESTful services, all talking over HTTP.</p>
<p>Initially, these were installed locally on a developer&#8217;s machine and set up via Apache vhosts running each component separately.</p>
<p>This soon became unmaintainable on a daily basis without a lot of work. We were spending too much time discussing whether a certain component was up to date and fighting bugs caused by API incompatibilities between versions. As we added more services to deal with things beyond the core of the product, this just got worse.</p>
<p>The answer was <a href="http://vagrantup.com">Vagrant</a>.</p>
<h2>What is Vagrant?</h2>
<p>Essentially, vagrant is a command line tool for managing virtualbox instances (other backends are now available in the latest version). You use a pre-built box (or package your own) to create a fresh virtual machine with all your tools installed, accessible via SSH from the host machine.</p>
<p>Once a vagrant box is configured, it&#8217;s just a case of <code>vagrant up</code>, waiting a while and then a system is up and running.</p>
<h2>Our vagrant box</h2>
<p>We went through some iterations and experimentation to find something suitable for the way we wanted to work, combined with what we could actually achieve in the box. This is the end result for now, and I&#8217;ll list some modifications that we have planned but haven&#8217;t found the time to do.</p>
<p>The box is split in two, a base and an environment box:</p>
<h3>The Base Box</h3>
<p>The first stage in the vagrant build was to build a customised base box. We based it on Ubuntu 12.04, 64-bit build as that&#8217;s what we planned to use in production. Once that was chosen, I collected a set of dependencies and development tools that were required for each service (MongoDB, Apache, Node.js, vim, screen, etc). These were then deployed into the base box using a fairly standard puppet manifest with some available modules. At this stage, it installs just the tooling, dependencies and makes required system level config changes (networking setup, DNS entries).</p>
<p>Once this box is built, it&#8217;s uploaded to a development webserver so all the team have access to it. The Vagrantfile and the puppet manifests live in git, alongside the development box. This means that the base box can be recreated/tweaked/reviewed by anyone at any time if that&#8217;s a necessity.</p>
<p>Using a base box like this loses some flexibility. Every time you want to add something new at the base level, you have to rebuild and re-upload the entire box. But it saves deployment time in the next stage, which overall results in a win.</p>
<p>You can temporarily work around this by adding a dependency into the environment box, but if you&#8217;re not strict with how you manage this it becomes a bit of a mess of where everything is, and you lose the advantage of having a pre-built base box.</p>
<h3>The development environment</h3>
<p><img class="alignnone size-full wp-image-3546" style="margin-bottom: 2em" alt="Vagrant and Puppet" src="http://serverdensity.wpengine.netdna-cdn.com/wp-content/uploads/2013/05/vagrantandpuppet.gif" width="700" height="150" /></p>
<p>This is a little less straightforward than the base box build.</p>
<p>Starting with a Vagrantfile with the base box imported/declared, we added puppet modules to handle installing our agent into the box so it can report to itself, and a module to install apache vhosts.</p>
<p>Once that was done, we created a puppet module that can handle installing all of the services from git. This includes a clone/update, a build process (buildout/composer), and finally running any test or development data scripts that we have. This was mostly a mash of already available bits and abuse of puppet&#8217;s <code>exec</code> resource.</p>
<p>The Vagrantfile is just a file of Ruby code, so it was easy to add something that set the box hostname based on the username of the host, and import a separate settings file for overriding the defaults for the vagrant settings (code checkout locations being the main one).</p>
<p>The final stage for this was to add a script that will kill and then start all the code that runs as a service (tornado/celery mainly). This grew out of an ugly hack involving starting lots of things in screen, and hasn&#8217;t really been updated to anything else. It does have a convenient advantage that <code>screen -list</code> will tell you exactly what is running, and a total number at the bottom for quick verification that everything started okay.</p>
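<p>The shape of such a kill-then-start script can be sketched in Python as below. This is not our actual script: the service names and commands are placeholders, and it only relies on the standard screen options <code>-dmS</code> (start a detached, named session) and <code>-S name -X quit</code> (ask a named session to exit).</p>

```python
import subprocess

# Hypothetical services; the real names and commands are not shown in this post.
SERVICES = {
    "api": ["python", "run_api.py"],
    "worker": ["python", "run_worker.py"],
}


def stop_command(name):
    """screen command that asks the named session to quit."""
    return ["screen", "-S", name, "-X", "quit"]


def start_command(name, command):
    """screen command that runs the service in a detached, named session."""
    return ["screen", "-dmS", name] + list(command)


def restart_all(services, run=subprocess.call):
    """Kill then start each service; return how many sessions were started."""
    started = 0
    for name, command in services.items():
        # A non-zero exit here usually just means nothing was running yet.
        run(stop_command(name))
        if run(start_command(name, command)) == 0:
            started += 1
    return started
```

<p>Passing a different <code>run</code> function makes it possible to dry-run the script and print the commands instead of executing them.</p>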
<p>The code is checked out into a directory shared between the host and the guest, ultimately living on the host, so we can edit the code using the host editors and tools while it still runs inside the vagrant box. The share uses NFS for performance.</p>
<p>Debugging is taken care of by xdebug being configured to point to the host IP for the PHP services, and some work with WingIDE remote debugging for the python services, again with Wing configured to connect into the vagrant box.</p>
<p>Once the box was up and running, we added settings, configs and a custom domain (using vagrant-dns) to enable decent separation and ensure we don&#8217;t accidentally hardcode production/development URLs (or at least, that these are easier to catch if it does happen).</p>
<p>The main feature of this box is that the puppet manifests run with each provision. These update and redeploy the code each time, simplifying the update process to a single command, across every service and repository that we have deployed.</p>
<h2>The Advantages</h2>
<ul>
<li style="margin-bottom: 1em">Reproducible environment for everyone involved.</li>
<li style="margin-bottom: 1em">The puppet manifests mean that just a <code>vagrant provision</code> then waiting is enough to bring everything up to date with the latest master.</li>
<li style="margin-bottom: 1em">Self contained stack, you can see what is running at any point.</li>
<li style="margin-bottom: 1em">Closer to production. We mostly develop on OS X but deploy to Linux; this gives us both.</li>
<li style="margin-bottom: 1em">Shared URLs for testing. We can pop a URL from our vagrant machines into Hipchat, and other members of the team can use it locally, without having to change it. <a href="https://github.com/BerlinVagrant/vagrant-dns">vagrant-dns</a> is a big win for us there.</li>
<li style="margin-bottom: 1em">Easy to install. Install virtualbox, vagrant, get the Vagrantfiles, run <code>vagrant up</code>.</li>
<li style="margin-bottom: 1em">While there&#8217;s nothing in the vagrant configurations that couldn&#8217;t be as easily done with some scripting for the host machine and remove the need for virtualisation, it&#8217;s handy that when it goes wrong a fresh rebuild is just a <code>destroy</code> and <code>up</code> away. Extremely useful for testing system wide settings.</li>
</ul>
<h2>The Disadvantages</h2>
<ul>
<li style="margin-bottom: 1em">A from-nothing <code>vagrant up</code> takes 25 minutes and downloads about 2GB (1.2GB for the base box, the rest for code + dependencies + extras).</li>
<li style="margin-bottom: 1em"><code>vagrant provision</code> to update to latest can take up to 10 minutes depending on speed of the connection and the host machine, so it&#8217;s not that easy to &#8216;just test&#8217; something. You can update the individual services manually, but then you have the problem that we started with, making sure that everyone has the same code.</li>
<li style="margin-bottom: 1em">It&#8217;s hard work for the host machine. The box we have configured has 2 cores and 2Gb of RAM allocated. On a 4Gb Macbook Air, that can start getting a little close to resource starvation, particularly with an IDE and a debugger running.</li>
<li style="margin-bottom: 1em">Debugging isn&#8217;t as easy as I&#8217;d like, setting up the debugger is fairly involved in settings, and you can only really debug one thing at a time. Can be awkward when you&#8217;re trying to trace values across multiple services.</li>
<li style="margin-bottom: 1em">The configuration we have isn&#8217;t as close to production as I&#8217;d like (no nginx, no caching, no centralised logging) but this is just a matter of spending more time.</li>
<li style="margin-bottom: 1em">It doesn&#8217;t entirely solve &#8216;it works for me&#8217;. It just becomes &#8216;it works on my vagrant&#8217;. Fortunately, instances of that seem to be a lot less common.</li>
<li style="margin-bottom: 1em">Random virtualbox/vagrant/host problems. We&#8217;ve had boxes crash, networks go away and all manner of strange things. At least with the code living on the host machine, we&#8217;ve not lost work when that happens.</li>
</ul>
<p>While it seems that there are more disadvantages than advantages, overall the reproducibility and simplicity of reducing updating to a single command far outweigh the drawbacks of working in a virtualised environment.</p>
<h2>Future plans</h2>
<p>Most of the future plans for this revolve around gradually bringing it in line with the production environment without losing the flexibility that we have gained.</p>
<ul>
<li style="margin-bottom: 1em">Use the production puppet manifests where possible. Our infrastructure is entirely puppet controlled, so I&#8217;d like to increase the reuse where possible.</li>
<li style="margin-bottom: 1em">Create a version that uses the multi-vm capability of vagrant to simulate a cluster, with each service running separately. Could be handy for looking at scaling/communication problems.</li>
<li style="margin-bottom: 1em">See if we can reduce provision time even further, with possible build optimisations. This may then transfer to our deployment system.</li>
<li style="margin-bottom: 1em">Move to vagrant 1.2 and test out some other backends.</li>
<li style="margin-bottom: 1em">Remove the screen based development start script and move to something more production-like. (This is only used in the vagrant box, the live deployments use proper init scripts.)</li>
</ul>
<p>Overall, the use of vagrant has been a big win for us as a company, and has reduced a lot of the problems we were having. There&#8217;s still some work to be done until we&#8217;re completely happy with it, but I&#8217;d recommend that anyone looking at building this type of project take a serious look to see if it suits them.</p>
<p>The post <a href="http://blog.serverdensity.com/many-projects-with-vagrant-and-puppet/">Many projects with Vagrant and Puppet</a> appeared first on <a href="http://blog.serverdensity.com">Server Density Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.serverdensity.com/many-projects-with-vagrant-and-puppet/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		<feedburner:origLink>http://blog.serverdensity.com/many-projects-with-vagrant-and-puppet/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=many-projects-with-vagrant-and-puppet</feedburner:origLink></item>
	</channel>
</rss>
