<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4209465858850350036</id><updated>2026-04-15T02:07:41.267-07:00</updated><category term="transparency"/><category term="downtime"/><category term="google"/><category term="uptime"/><category term="cloud"/><category term="amazon"/><category term="dashboard"/><category term="postmortem"/><category term="dashboard health"/><category term="app engine"/><category term="cloud computing"/><category term="facebook"/><category term="performance"/><category term="trust"/><category term="twitter"/><category term="velocity"/><category term="aws"/><category term="business"/><category term="gmail"/><category term="obama"/><category term="rackspace"/><category term="saas"/><category term="casestudy"/><category term="funny"/><category term="health"/><category term="monitoring"/><category term="mosso"/><category term="salesforce"/><category term="sla"/><category term="slides"/><category term="status"/><category term="system status"/><category term="zoho"/><category term="allspaw"/><category term="azure"/><category term="blogging"/><category term="bp"/><category term="cartoon"/><category term="cloudfront"/><category term="cnn"/><category term="etsy"/><category term="feedburner"/><category term="finalpost"/><category term="foursquare"/><category term="github"/><category term="godaddy"/><category term="gogrid"/><category term="happiness"/><category term="honesty"/><category term="magnolia"/><category term="marketing"/><category term="measurement"/><category term="mediatemple"/><category term="metaphors"/><category term="microsoft"/><category 
term="normalaccident"/><category term="online"/><category term="open"/><category term="opensrs"/><category term="operations"/><category term="oreilly"/><category term="outage"/><category term="pizza"/><category term="politics"/><category term="presentations"/><category term="psychology"/><category term="public"/><category term="quickbooks"/><category term="service level agreement"/><category term="service status"/><category term="seth godin"/><category term="slashdot"/><category term="social media"/><category term="sprint"/><category term="tao"/><category term="twilio"/><category term="user story"/><category term="web"/><category term="wordpress"/><category term="ylastic"/><title type='text'>Transparent Uptime</title><subtitle type='html'>The drive for transparency in the uptime and performance of online services</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default?redirect=false'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default?start-index=26&amp;max-results=25&amp;redirect=false'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>110</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-7255716242726834666</id><published>2011-06-15T17:12:00.000-07:00</published><updated>2011-06-15T17:12:57.193-07:00</updated><title type='text'>Passing the Transparency Torch</title><content type='html'>The torch has been passed! The following is re-posted from a new blog&amp;nbsp;&lt;a href=&quot;http://www.transparentperformance.com/&quot;&gt;Transparent Performance&lt;/a&gt;:&lt;br /&gt;
&lt;blockquote&gt;&quot;&lt;a href=&quot;http://www.transparentuptime.com/2008/08/first-post.html&quot;&gt;Almost three years ago&lt;/a&gt;, I started blogging about the importance of transparency at my blog &lt;a href=&quot;http://www.transparentuptime.com/&quot;&gt;Transparent Uptime&lt;/a&gt;. I chronicled the &lt;a href=&quot;http://www.transparentuptime.com/2010/07/benefits-of-transparency.html&quot;&gt;benefits of being transparent&lt;/a&gt; in your company’s handling of downtime and performance. I did case studies on &lt;a href=&quot;http://www.transparentuptime.com/2010/03/google-app-engine-downtime-postmortem.html&quot;&gt;transparency done right&lt;/a&gt;, and &lt;a href=&quot;http://www.transparentuptime.com/2010/09/case-study-facebook-outage.html&quot;&gt;transparency done wrong&lt;/a&gt;. I &lt;a href=&quot;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&quot;&gt;spoke at conferences preaching the gospel of performance&lt;/a&gt;. What was a strange idea back then is now becoming obvious, especially to &lt;a href=&quot;http://www.transparentuptime.com/2010/10/foursquare-gets-transparency.html&quot;&gt;young companies&lt;/a&gt;. Just as things were ramping up, I was forced to put my &lt;a href=&quot;http://www.transparentuptime.com/2010/11/all-good-thingsmust-come-to-end.html&quot;&gt;blog on hiatus&lt;/a&gt;.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;In the time since, I’ve been looking for someone to hand off the torch of transparency to. I’m really excited to be a part of the launch of &lt;a href=&quot;http://www.transparentperformance.com/&quot;&gt;Transparent Performance&lt;/a&gt;, a new online consortium that will bring together experts from around the industry to continue the transparency movement. Transparency isn’t about altruism, or being “good”. Transparency is a good business decision. 
My hope is that Transparent Performance can help the industry cross the chasm, and make it both obvious and trivial to be transparent in your uptime and performance. If anyone can do it, these guys can.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;And if you’re interested in becoming a contributing editor too, do &lt;a href=&quot;http://www.transparentperformance.com/about/&quot;&gt;drop them a line&lt;/a&gt; or comment below.&quot;&lt;/blockquote&gt;Everyone that still follows this blog, or stumbles across this post, do yourself a favor and go to&amp;nbsp;&lt;a href=&quot;http://www.transparentperformance.com/&quot;&gt;Transparent Performance&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/7255716242726834666/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2011/06/passing-transparency-torch.html#comment-form' title='38 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/7255716242726834666'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/7255716242726834666'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2011/06/passing-transparency-torch.html' title='Passing the Transparency Torch'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>38</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-2530595103935259437</id><published>2010-11-10T10:15:00.000-08:00</published><updated>2010-11-10T10:15:30.151-08:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="finalpost"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>All good things...must come to an end</title><content type='html'>After nearly two and a half years, over one hundred posts, a &lt;a href=&quot;http://www.transparentuptime.com/2010/06/video-of-my-talk-upside-of-downtime-at.html&quot;&gt;presentation at Velocity 2010&lt;/a&gt;, a &lt;a href=&quot;http://www.transparentuptime.com/2010/06/quote-in-wsj.html&quot;&gt;quote in the Wall Street Journal&lt;/a&gt;, an &lt;a href=&quot;http://oreillynet.com/pub/e/1672&quot;&gt;O&#39;Reilly webinar&lt;/a&gt;, and&amp;nbsp;immeasurable&amp;nbsp;friendships, connections, and opportunities that have come as a result of this blog, I am (sadly) putting the blog on&amp;nbsp;indefinite&amp;nbsp;hiatus. As of next week, I will be leaving my job (of nearly 10 years) and&amp;nbsp;pursuing&amp;nbsp;my dream of starting my own company. In that new world I do not foresee having the time to give this blog the attention it deserves, and so to&amp;nbsp;avoid leaving it in a state of perpetual uncertainty, this will be my final post.&lt;br /&gt;
&lt;br /&gt;
It saddens me to bring this site to an end. I&#39;ve gotten more out of it than I could have ever hoped. Reading over my &lt;a href=&quot;http://www.transparentuptime.com/2008/08/first-post.html&quot;&gt;first post&lt;/a&gt;&amp;nbsp;(as painful as that is), I am happy to see that I have met the goals I set out for myself. Things have come a long way since those days, but there is still far more to do. My biggest hope is that you as a reader have gained some nugget of useful knowledge from my writing, and that you continue to push forward on the basic ideas of&amp;nbsp;transparency,&amp;nbsp;openness, and simply helping your company act more human.&lt;br /&gt;
&lt;br /&gt;
As for my startup, I don&#39;t have a lot of details to share just yet, but if you are interested in staying up to date please&amp;nbsp;&lt;a href=&quot;http://twitter.com/lennysan&quot;&gt;&lt;b&gt;follow me on Twitter&lt;/b&gt;&lt;/a&gt;&amp;nbsp;or &lt;a href=&quot;http://www.linkedin.com/in/lennyrachitsky&quot;&gt;&lt;b&gt;LinkedIn&lt;/b&gt;&lt;/a&gt;. I can also give you my personal email address if you would like to contact me for any reason. All I can say at this point is that I will be moving to the one and only city of&amp;nbsp;&lt;a href=&quot;http://www.conferencedemontreal.com/wp-content/uploads/2010/04/port_montreal_aggrandi.jpg&quot;&gt;Montreal&lt;/a&gt; to work with the wonderful folks at &lt;a href=&quot;http://www.yearonelabs.com/&quot;&gt;Year One Labs&lt;/a&gt;. Mysterious, eh?&lt;br /&gt;
&lt;br /&gt;
Below is a list of my favorite (and most popular) posts from the past 2+ years:&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/07/why-transparency-works.html&quot;&gt;Why Transparency Works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/07/benefits-of-transparency.html&quot;&gt;The Benefits of Transparency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/02/tao-of-web-performance-and-uptime.html&quot;&gt;The Tao of Web Performance and Uptime&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;A Guideline for Postmortem Communication&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html&quot;&gt;7 Keys to a Successful Public Health Dashboard&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2008/12/saas-slas-state-of-union.html&quot;&gt;Comprehensive review of SaaS SLAs - A sad state of affairs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/03/google-app-engine-downtime-postmortem.html&quot;&gt;Google App Engine downtime postmortem, nearly a perfect model for others&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/06/amazoncom-goes-down-good-case-study-of.html&quot;&gt;As Amazon goes down, good case study of consumer-facing transparency (or lack thereof)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010_06_20_archive.html&quot;&gt;The Upside of Downtime (slides)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/03/how-to-trust-cloud.html&quot;&gt;How to Trust the Cloud (slides)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/02/cloud-metaphors-weather-pattern.html&quot;&gt;The top 7 most overused cloud metaphors, sorted by weather pattern&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;br /&gt;
Note: If you are doing anything similar that you think readers of this blog would find useful, please let me know in the comments and I&#39;ll update this post.&lt;br /&gt;
&lt;br /&gt;
Signing off,&lt;br /&gt;
Lenny Rachitsky (&lt;a href=&quot;http://twitter.com/lennysan&quot;&gt;@lennysan&lt;/a&gt;,&amp;nbsp;&lt;a href=&quot;http://www.linkedin.com/in/lennyrachitsky&quot;&gt;LinkedIn&lt;/a&gt;)</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/2530595103935259437/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/11/all-good-thingsmust-come-to-end.html#comment-form' title='92 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/2530595103935259437'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/2530595103935259437'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/11/all-good-thingsmust-come-to-end.html' title='All good things...must come to an end'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>92</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-7241731779652359536</id><published>2010-10-08T09:44:00.000-07:00</published><updated>2010-10-08T09:44:54.482-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="allspaw"/><category scheme="http://www.blogger.com/atom/ns#" term="etsy"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Etsy.com opens the kimono and talks frankly about outages</title><content type='html'>&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, 
Helvetica, sans-serif;&quot;&gt;You know when &lt;a href=&quot;http://twitter.com/#!/allspaw&quot;&gt;John Allspaw&lt;/a&gt;&amp;nbsp;(VP of Ops at Etsy, Manager of Operations at Flickr, Infrastructure Architect at Friendster) is involved, you&#39;re going to get a unique perspective on things. A few weeks ago &lt;a href=&quot;http://etsystatus.com/2010/09/15/september-14th-2010-site-outage/&quot;&gt;Etsy.com was down&lt;/a&gt;. John (and his operations department) decided it would be a good opportunity to take what I&#39;ll call an &quot;outage bankruptcy&quot; and basically reset expectations. In an extremely &lt;a href=&quot;http://codeascraft.etsy.com/2010/09/17/frank-talk-about-site-outages/&quot;&gt;detailed and well-thought-out post&lt;/a&gt;&amp;nbsp;(titled &quot;Frank Talk about Site Outages&quot;) he describes the end-to-end processes that go into managing uptime at Etsy. I would recommend reading &lt;a href=&quot;http://codeascraft.etsy.com/2010/09/17/frank-talk-about-site-outages/&quot;&gt;the entire post&lt;/a&gt;, but I thought it would be useful to point out the things that we can all take away from the experience of one of the most well-respected operations people in the industry:&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Metrics&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&quot;Today, &lt;b&gt;we gather a little over 30,000 metrics&lt;/b&gt;, on everything from CPU usage, to network bandwidth, to the rate of listings and re-listings done by Etsy sellers.&amp;nbsp;Some of those metrics are gathered every 20 seconds, 24 hours a day, 365 days a year.&lt;b&gt; About 2,000 metrics will alert someone on our operations staff&lt;/b&gt; (we have an on-call rotation) to wake up in the middle of the night to fix a problem.&quot;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Takeaway: Capture data on every part of your infrastructure, and later decide which metrics are leading indicators of problems. He goes on to talk about the importance of external monitoring (outside of your firewall) to measure the actual end-user experience.&lt;/span&gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Communication&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&quot;When we have an outage or issue that affects a measurable portion of the site’s functionality, &lt;b&gt;we quickly group together to coordinate our response&lt;/b&gt;. We follow the same basic approach as most incident response teams. &lt;b&gt;We assign some people to address the problem and others to update the rest of the staff and post to http://etsystatus.com to alert the community&lt;/b&gt;. Changes that are made to mitigate the outage are largely done in a one-at-a-time fashion, and we track both our time-to-detect as well as our time-to-resolve, for use in a follow-up meeting after the outage, called a “post-mortem” meeting. Thankfully, our average time-to-detect is on the order of 2 minutes for any outages or major site issues in the past year. This is mostly due to continually tuning our alerting system.&quot;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Takeaway: Two important points here. First, communication and collaboration are key to successfully managing issues. Second, and even more interesting, is the need for two teams...one to address the problem and one to communicate status updates both internally and externally. This is often a missing piece for companies, where no updates go out because everyone is busy fixing the problem.&lt;/span&gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Post-Mortems&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&quot;&lt;b&gt;After any outage, we meet to gather information about the incident&lt;/b&gt;. We reconstruct the time-line of events; when we knew of the outage, what we did to fix it, when we declared the site to be stable again. We do a root cause analysis to characterize why the outage happened in the first place. We make a list of remediation tasks to be done shortly thereafter, &lt;b&gt;focused on preventing the root cause from happening again&lt;/b&gt;. These tasks can be as simple as fixing a bug, or as complex as putting in new infrastructure to increase the fault-tolerance of the site. We document this process, for use as a reference point in measuring our progress.&quot;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Takeaway: Fixing the problem and getting back online is not enough. Make it an automatic habit to schedule a postmortem to do a deep dive into the root cause(s) of the problem, and address not only the immediate bugs but also the deeper issues that led to the root cause. The &lt;/span&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/5_Whys&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Five Whys&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt; can help here, as can the Lean methodology of investing a proportional number of hours into the most&amp;nbsp;problematic&amp;nbsp;parts of the infrastructure.&lt;/span&gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Single Point of Failure Reduction&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&quot;As Etsy has grown from a tiny little start-up to the mission-critical service it is today, we’ve had to outgrow some of our infrastructure. One reason we have for this evolution is to avoid depending on single pieces of hardware to be up and running all of the time. Servers can fail at any time, and Etsy.com should be able to keep working if a single server dies. To do that, we have to put our data in multiple places, keep them in sync, and make sure our code can route around any individual failures.&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;So we’ve been working a lot this year to reduce those “single points of failure,” and to put in redundancy as fast as we safely can. Some of this means being very careful (paranoid) as we migrate data from the single instances to multiple or replicated instances. As you can imagine, it’s a bit of a feat to move that volume of data around while still seeing a peak of 15 new listings per second, all the while not interrupting the site’s functionality.&quot;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Takeaway: Reduce single points of failure incrementally. Do what you can in the time you have.&lt;/span&gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Change Management and Risk&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;div style=&quot;display: inline !important; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 10px; padding-right: 10px; padding-top: 10px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&quot;For every type of technical change, we have answers to questions like:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;ul style=&quot;list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 10px; margin-left: 20px; margin-right: 20px; margin-top: 10px; padding-bottom: 0px; padding-left: 20px; padding-right: 20px; padding-top: 0px;&quot;&gt;&lt;li style=&quot;list-style-image: initial; list-style-position: outside; list-style-type: disc; margin-bottom: 10px; margin-left: 30px; margin-right: 0px; margin-top: 10px;&quot;&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;What problem does the change solve?&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/li&gt;
&lt;li style=&quot;list-style-image: initial; list-style-position: outside; list-style-type: disc; margin-bottom: 10px; margin-left: 30px; margin-right: 0px; margin-top: 10px;&quot;&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Has this kind of change happened before? Is there a successful history?&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/li&gt;
&lt;li style=&quot;list-style-image: initial; list-style-position: outside; list-style-type: disc; margin-bottom: 10px; margin-left: 30px; margin-right: 0px; margin-top: 10px;&quot;&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;When is the change going to start? When is it expected to end?&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/li&gt;
&lt;li style=&quot;list-style-image: initial; list-style-position: outside; list-style-type: disc; margin-bottom: 10px; margin-left: 30px; margin-right: 0px; margin-top: 10px;&quot;&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;What is the expected effect of this change on the Etsy community? Is a downtime required for the change?&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/li&gt;
&lt;li style=&quot;list-style-image: initial; list-style-position: outside; list-style-type: disc; margin-bottom: 10px; margin-left: 30px; margin-right: 0px; margin-top: 10px;&quot;&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;What is the rollback plan, if something goes wrong?&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/li&gt;
&lt;li style=&quot;list-style-image: initial; list-style-position: outside; list-style-type: disc; margin-bottom: 10px; margin-left: 30px; margin-right: 0px; margin-top: 10px;&quot;&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 21px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;What test is needed to make sure that the change succeeded?&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;As with all change, the risk involved and the answers to these questions are largely dependent on the judgment of the person at the helm. At Etsy, we believe that if we understand the likely failures, and if there’s a plan in place to fix any unexpected issues, we’ll make progress.&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Just as important, we also track the results of changes. &lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;We have an excellent history with respect to the number of successful changes. This is a good record that we plan on keeping.&quot;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Takeaway: Be prepared for failure by anticipating worst-case scenarios for every change. Be ready to roll back and respond. Just as importantly, track when things go right so that you have a realistic measure of risk.&lt;/span&gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;&lt;b&gt;Other takeaways:&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Declaring &quot;outage bankruptcy&quot; is not the ideal approach. But it is better than simply going along without any authentic communication with your customers throughout a period of instability. Your customers will understand, if you act human.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;Etsy has been doing a great job keeping customers up to date at&amp;nbsp;&lt;a href=&quot;http://etsystatus.com/&quot;&gt;http://etsystatus.com/&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, Helvetica, sans-serif;&quot;&gt;A glance at the &lt;a href=&quot;http://codeascraft.etsy.com/2010/09/17/frank-talk-about-site-outages/&quot;&gt;comments&lt;/a&gt; on the page shows a few upset customers, but a generally positive response.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/7241731779652359536/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/10/etsycom-opens-kimono-and-talks-frankly.html#comment-form' title='73 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/7241731779652359536'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/7241731779652359536'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/10/etsycom-opens-kimono-and-talks-frankly.html' title='Etsy.com opens the kimono and talks frankly about outages'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>73</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-5841969430408518758</id><published>2010-10-06T09:47:00.000-07:00</published><updated>2010-10-06T09:48:37.742-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="foursquare"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Foursquare gets transparency</title><content type='html'>&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Early Monday&amp;nbsp;morning&amp;nbsp;of this week,&amp;nbsp;Foursquare&amp;nbsp;went down hard:&lt;/span&gt;&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDxAg4Fzcbxm0odnSWXeI_CbXKfdVhS_N9140ldYyqXjW0FBQOhtEhT7AI3g14maujAwlaewM48vaKLnqsPdgTmzZ6eq_Jyo8UMxKQlfyPGA2Phw3WZ9S_BHDFpLSO40iAsZfd1CqNhyg/s1600/Twitter+_+@foursquare+support_+The+servers+are+overloaded+....png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;242&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDxAg4Fzcbxm0odnSWXeI_CbXKfdVhS_N9140ldYyqXjW0FBQOhtEhT7AI3g14maujAwlaewM48vaKLnqsPdgTmzZ6eq_Jyo8UMxKQlfyPGA2Phw3WZ9S_BHDFpLSO40iAsZfd1CqNhyg/s400/Twitter+_+@foursquare+support_+The+servers+are+overloaded+....png&quot; width=&quot;400&quot; /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;11 hours later, the #caseofthemondays was over and they were back online. Throughout those 11 hours, users had one of the following experiences:&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;1. &lt;b&gt;When visiting foursquare.com&lt;/b&gt;, they saw:&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;a href=&quot;http://img.skitch.com/20101006-rmn6f2ycwjakw67nknqwgqrurr.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;178&quot; src=&quot;http://img.skitch.com/20101006-rmn6f2ycwjakw67nknqwgqrurr.png&quot; width=&quot;320&quot; /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;2. &lt;b&gt;When using the iPhone/Android/BlackBerry app&lt;/b&gt;, they saw an error telling them the service was down and to try again later.&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;3. &lt;b&gt;When checking Twitter&lt;/b&gt; (the now-default source of downtime information), they saw &lt;a href=&quot;http://twitter.com/#!/search/foursquare%20down&quot;&gt;a lot of people complaining&lt;/a&gt; and, if they thought to check the official &lt;a href=&quot;http://twitter.com/foursquare&quot;&gt;@foursquare&lt;/a&gt; account, the following tweets:&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfsdnTJWeY9jYkaqX6iORkVhR7YpBS_0wLEeDQ4hBOATj_U6vTnRUFd8G-0WL6IPCN8nCWuvSXWsQrfZFgExdx8WQ-Qpj_MC_N1rDK2-AW3IcQROcoAsA_pUzPwaRby-6ZBEqd3-cA67w/s1600/foursquare+(foursquare)+on+Twitter.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;640&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfsdnTJWeY9jYkaqX6iORkVhR7YpBS_0wLEeDQ4hBOATj_U6vTnRUFd8G-0WL6IPCN8nCWuvSXWsQrfZFgExdx8WQ-Qpj_MC_N1rDK2-AW3IcQROcoAsA_pUzPwaRby-6ZBEqd3-cA67w/s640/foursquare+(foursquare)+on+Twitter.png&quot; width=&quot;448&quot; /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Those were the only options available to a user of Foursquare for those 11 hours. &lt;b&gt;An important question we need to answer is whether anyone seriously cared&lt;/b&gt;. Are users of consumer services like Foursquare&amp;nbsp;legitimately&amp;nbsp;concerned with Foursquare&#39;s downtime? Are they going to leave for competing services or just quit the whole check-in game? &lt;b&gt;I&#39;d like to believe that 11 hours of downtime matters, but honestly it&#39;s too early to tell&lt;/b&gt;. This will be a great test of the stickiness and &lt;a href=&quot;http://www.amazon.com/Whuffie-Factor-Social-Networks-Business/dp/0307409503&quot;&gt;Whuffie&lt;/a&gt; that Foursquare has built up.&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;The way I see it, this is one strike against Foursquare (which includes the continued instability they&#39;ve seen since Monday). They probably won&#39;t see a significant impact on their user base. However, if this happens again, and again, and again, the story changes. And as I&#39;ve argued, downtime is inevitable. Foursquare will certainly go down again. &lt;b&gt;The key is not reducing downtime to zero, but handling that downtime so that you avoid giving your competition an opening and, even more importantly, use it to build trust and loyalty with your users.&lt;/b&gt; How do you accomplish this? Transparency.&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;We&#39;ve talked about the &lt;a href=&quot;http://www.transparentuptime.com/2010/07/benefits-of-transparency.html&quot;&gt;benefits of transparency&lt;/a&gt;, &lt;a href=&quot;http://www.transparentuptime.com/2010/07/why-transparency-works.html&quot;&gt;why transparency works&lt;/a&gt;, and &lt;a href=&quot;http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html&quot;&gt;how&lt;/a&gt; &lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;to&lt;/a&gt;&amp;nbsp;&lt;a href=&quot;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&quot;&gt;implement&lt;/a&gt; it. We saw above how Foursquare handled the pre- and intra- downtime steps (not well), so let&#39;s take a look at how they did in the post-downtime phase by reviewing the &lt;a href=&quot;http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/&quot;&gt;public postmortem&lt;/a&gt;&amp;nbsp;(&lt;a href=&quot;http://blog.foursquare.com/2010/10/06/quite-the-way-to-celebrate-our-200-millionth-check-in/&quot;&gt;both of them&lt;/a&gt;)&amp;nbsp;they published. As always, let&#39;s run it through the &lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;gauntlet&lt;/a&gt;.&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;a href=&quot;http://draft.blogger.com/&quot;&gt;&lt;/a&gt;&lt;span id=&quot;goog_2038938476&quot;&gt;&lt;/span&gt;&lt;span id=&quot;goog_2038938477&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Prerequisites:&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Admit failure&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Excellent. The entire first paragraph describes the downtime, and how painful it was to users.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Sound like a human&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Very much. This has never been a problem for Foursquare. The tone is very trustworthy.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Have a communication channel&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Prior to the event, all they had were their &lt;a href=&quot;http://twitter.com/#!/4sqsupport&quot;&gt;Twitter&lt;/a&gt; &lt;a href=&quot;http://twitter.com/#!/foursquare&quot;&gt;accounts&lt;/a&gt; and their API developer forums. As a result of this incident, they have since launched &lt;a href=&quot;http://status.foursquare.com/&quot;&gt;http://status.foursquare.com/&lt;/a&gt;, and have promised to update &lt;a href=&quot;http://twitter.com/#!/4sqsupport&quot;&gt;@4sqsupport&lt;/a&gt; on a regular basis throughout future incidents.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Above all else, be authentic&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;&lt;i&gt;-&lt;/i&gt;&amp;nbsp;This may be the biggest thing going for them.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Requirements:&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;b&gt;Start time and end time of the incident&lt;/b&gt;&amp;nbsp;- Missing. All we know is that they were down for 11 hours. I don&#39;t see this as being critical in this case, but it would have been nice to have.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Who/what was impacted&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- A bit vague, but the impression was that everyone was impacted.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What went wrong&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Extremely well done. I feel very informed, and can sympathize with the situation.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Lessons learned&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Again, extremely well done. I love the structure they used:&amp;nbsp;What happened, What we’ll be doing differently – technically speaking, What we’re doing differently – in terms of process. Very effective.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Bonus:&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Details on the technologies involved - Yes!&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Answers to the&amp;nbsp;&lt;a href=&quot;http://en.wikipedia.org/wiki/5_Whys&quot; style=&quot;color: #336699;&quot;&gt;Five Why&#39;s&lt;/a&gt;&amp;nbsp;- No :(&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Human elements - heroic efforts, unfortunate coincidences, effective teamwork, etc - Yes!&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What others can learn from this experience - Yes!&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: x-large; line-height: 20px;&quot;&gt;&lt;b&gt;Other takeaways:&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 20px;&quot;&gt;Foursquare launched a public health status feed! Check it out at&amp;nbsp;http://status.foursquare.com/.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 20px;&quot;&gt;I really like the structure used in this postmortem. It has inspired me to want to create a basic template for postmortems. Stay tuned...&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 20px;&quot;&gt;Could this be Foursquare&#39;s Friendster moment? I hope not. My &lt;a href=&quot;http://www.assistedserendipity.com/&quot;&gt;personal project&lt;/a&gt; relies completely on Foursquare.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 20px;&quot;&gt;I&#39;ve come to realize that in most cases, downtime is less&amp;nbsp;impactful&amp;nbsp;to the long-term success of a business than site performance. Users understand downtime and just try again later. Slowness eats away at you: you start to hate using the service and jump at an opportunity to use something more fun/fast/pleasant.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Going forward, the big question will be whether Foursquare maintains their new processes, keeps the status blog up to date, and can fix their scalability issues. I for one am rooting for them.&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/5841969430408518758/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/10/foursquare-gets-transparency.html#comment-form' title='26 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5841969430408518758'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5841969430408518758'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/10/foursquare-gets-transparency.html' title='Foursquare gets transparency'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDxAg4Fzcbxm0odnSWXeI_CbXKfdVhS_N9140ldYyqXjW0FBQOhtEhT7AI3g14maujAwlaewM48vaKLnqsPdgTmzZ6eq_Jyo8UMxKQlfyPGA2Phw3WZ9S_BHDFpLSO40iAsZfd1CqNhyg/s72-c/Twitter+_+@foursquare+support_+The+servers+are+overloaded+....png" height="72" 
width="72"/><thr:total>26</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-5169062484193144848</id><published>2010-09-29T10:26:00.000-07:00</published><updated>2010-09-29T10:26:06.152-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="facebook"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Case Study: Facebook outage</title><content type='html'>&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;I&#39;m a bit late to the story (something called a day job getting in the way!) but I can&#39;t pass up an opportunity to discuss how Facebook handled the &quot;worst outage [they&#39;ve] had in over four years&quot;. &amp;nbsp;I &lt;a href=&quot;http://www.transparentuptime.com/2010/09/facebook-downtime.html&quot;&gt;blogged about the intra-incident communication&lt;/a&gt; the day they had the outage, so let&#39;s review the &lt;a href=&quot;http://www.facebook.com/note.php?note_id=431441338919&amp;amp;id=9445547199&amp;amp;ref=mf&quot;&gt;postmortem&lt;/a&gt; that came out after they had recovered, and how they handled the downtime as a whole.&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s1600/Preparation+Framwork.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;300&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s400/Preparation+Framwork.png&quot; width=&quot;400&quot; /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Using the &quot;&lt;a href=&quot;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&quot;&gt;Upside of Downtime&quot; framework&lt;/a&gt;&amp;nbsp;(above)&amp;nbsp;as a guide:&lt;/span&gt;&lt;br /&gt;
&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Prepare&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;: Much room for improvement. The &lt;a href=&quot;http://developers.facebook.com/live_status&quot;&gt;health status feed&lt;/a&gt; is hard to find for the average user/developer, and the information was limited. On the plus side, it exists. &lt;a href=&quot;http://twitter.com/facebook&quot;&gt;Twitter&lt;/a&gt; was also used to communicate updates, but again the information was limited.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Communicate&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;: Without a strong foundation created by the Prepare step, you don&#39;t have much opportunity to excel at the Communicate step. There was an opportunity to use the basic communication channels they had in place (status feed, Twitter) more effectively by communicating throughout the incident, with more actionable information, but alas this was not the case. Instead, there was mass speculation about the root cause and the severity. That is exactly what you want to strive to avoid.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Explain&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;: Let&#39;s find out by running the &lt;a href=&quot;http://www.facebook.com/note.php?note_id=431441338919&amp;amp;id=9445547199&amp;amp;ref=mf&quot;&gt;postmortem&lt;/a&gt; through our &lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;guideline for postmortem communication&lt;/a&gt;...&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; line-height: 18px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Prerequisites:&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Admit failure&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 14px; line-height: 18px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Excellent, almost a textbook&amp;nbsp;admission without hedging or blaming.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt; &lt;/span&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Sound like a human&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Well done. Posted from the personal account of Robert Johnson, Director of Engineering at Facebook, the postmortem was personal and effective in tone and style.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Have a communication channel&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Can be improved greatly. Making the existing health status page easier to find, more public, and more useful would help in all future incidents. I&#39;ve covered how &lt;a href=&quot;http://www.transparentuptime.com/2010/07/facebook-and-transparency.html&quot;&gt;Facebook can improve this page in a previous post&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Above all else, be authentic&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;&lt;i&gt;-&lt;/i&gt;&amp;nbsp;No issues here.&lt;/span&gt;&lt;/li&gt;
&lt;/span&gt;&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; line-height: 18px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Requirements:&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Start time and end time of the incident&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Missing.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt; &lt;/span&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Who/what was impacted&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Partial. I can understand this being difficult in the case of Facebook, but I would have liked to see more specifics around how many users were affected. On one hand, this is a global consumer service that may not be critical to people&#39;s lives. On the other hand, if you treat your users with respect, they&#39;ll reward you for it.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;What went wrong&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Well done, maybe the best part of the postmortem.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Lessons learned&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- Partial. It sounds like many lessons were certainly learned, but they weren&#39;t directly shared. I&#39;d love to know what the &quot;design patterns of other systems at Facebook that deal more gracefully with feedback loops and transient spikes&quot; look like.&lt;/span&gt;&lt;/li&gt;
&lt;/span&gt;&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; line-height: 18px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Bonus:&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Details on the technologies involved&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- No&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-size: 14px; line-height: 18px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt; &lt;/span&gt;&lt;/b&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Answers to the&amp;nbsp;&lt;/span&gt;&lt;/b&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/5_Whys&quot; style=&quot;color: #336699; font-weight: bold;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Five Why&#39;s&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- No&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Human elements - heroic efforts, unfortunate coincidences, effective teamwork, etc&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&amp;nbsp;- No&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;What others can learn from this experience &lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;- Marginal&lt;/span&gt;&lt;/li&gt;
&lt;/span&gt;&lt;/ol&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Biggest lesson for us to take away:&amp;nbsp;&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;Preparation is key to successfully managing outages, and using them to build trust with your users.&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Verdana, sans-serif;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/5169062484193144848/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/09/case-study-facebook-outage.html#comment-form' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5169062484193144848'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5169062484193144848'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/09/case-study-facebook-outage.html' title='Case Study: Facebook outage'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s72-c/Preparation+Framwork.png" height="72" width="72"/><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-4638133386768003067</id><published>2010-09-29T08:46:00.000-07:00</published><updated>2010-09-29T08:46:21.527-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><category scheme="http://www.blogger.com/atom/ns#" term="twitter"/><title type='text'>Transparency in action at Twitter</title><content type='html'>&lt;div class=&quot;separator&quot; 
style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKYwepNldOAv4yntGonfahL71LawsZSCrs9nrCnj5Q6LdN2b_vtqVpQY8e9m5sSWtP2jniZVdGKwgJZRVWaomwbv-6GWaUQPPEWaqvEK1U8Y49p-aIEEZ23UkgYw8SoLdUKnAijmc-BwQ/s1600/Twitter+_+Arnout+Kazemier_+Great+xss+transparency+by+....png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;170&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKYwepNldOAv4yntGonfahL71LawsZSCrs9nrCnj5Q6LdN2b_vtqVpQY8e9m5sSWtP2jniZVdGKwgJZRVWaomwbv-6GWaUQPPEWaqvEK1U8Y49p-aIEEZ23UkgYw8SoLdUKnAijmc-BwQ/s400/Twitter+_+Arnout+Kazemier_+Great+xss+transparency+by+....png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
Enjoyed that &lt;a href=&quot;http://twitter.com/3rdEden/statuses/25139158073&quot;&gt;tweet&lt;/a&gt; from the other day. As you may know, Twitter ran into a very public cross-site scripting (XSS) vulnerability &lt;a href=&quot;http://blog.twitter.com/2010/09/all-about-onmouseover-incident.html&quot;&gt;recently&lt;/a&gt;:&lt;br /&gt;
&lt;blockquote&gt;&quot;The short story: This morning at 2:54 am PDT Twitter was notified of a security exploit that surfaced about a half hour before that, and we immediately went to work on fixing it. By 7:00 am PDT, the primary issue was solved. And, by 9:15 am PDT, a more minor but related issue tied to hovercards was also fixed.&quot;&lt;/blockquote&gt;&lt;div&gt;News of the vulnerability exploded, but Twitter very quickly came out with a fix and, just as importantly, a &lt;a href=&quot;http://blog.twitter.com/2010/09/all-about-onmouseover-incident.html&quot;&gt;detailed explanation of what happened&lt;/a&gt;, what they did about it, and where they are going from here:&lt;br /&gt;
&lt;blockquote&gt;&amp;nbsp;The security exploit that caused problems this morning Pacific time was caused by cross-site scripting (XSS). Cross-site scripting is the practice of placing code from an untrusted website into another one. In this case, users submitted javascript code as plain text into a Tweet that could be executed in the browser of another user.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;We discovered and patched this issue last month. However, a recent site update (unrelated to new Twitter) unknowingly resurfaced it.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;Early this morning, a user noticed the security hole and took advantage of it on Twitter.com. First, someone created an account that exploited the issue by turning tweets different colors and causing a pop-up box with text to appear when someone hovered over the link in the Tweet. This is why folks are referring to this an “onMouseOver” flaw -- the exploit occurred when someone moused over a link.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;Other users took this one step further and added code that caused people to retweet the original Tweet without their knowledge.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;This exploit affected Twitter.com and did not impact our mobile web site or our mobile applications. The vast majority of exploits related to this incident fell under the prank or promotional categories. Users may still see strange retweets in their timelines caused by the exploit. However, we are not aware of any issues related to it that would cause harm to computers or their accounts. And, there is no need to change passwords because user account information was not compromised through this exploit.&lt;/blockquote&gt;&lt;blockquote&gt;We’re not only focused on quickly resolving exploits when they surface but also on identifying possible vulnerabilities beforehand. This issue is now resolved. 
We apologize to those who may have encountered it.&lt;/blockquote&gt;Well done.&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/4638133386768003067/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/09/transparency-in-action-at-twitter.html#comment-form' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4638133386768003067'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4638133386768003067'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/09/transparency-in-action-at-twitter.html' title='Transparency in action at Twitter'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKYwepNldOAv4yntGonfahL71LawsZSCrs9nrCnj5Q6LdN2b_vtqVpQY8e9m5sSWtP2jniZVdGKwgJZRVWaomwbv-6GWaUQPPEWaqvEK1U8Y49p-aIEEZ23UkgYw8SoLdUKnAijmc-BwQ/s72-c/Twitter+_+Arnout+Kazemier_+Great+xss+transparency+by+....png" height="72" width="72"/><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-4936640341863591925</id><published>2010-09-23T14:38:00.000-07:00</published><updated>2010-09-23T15:14:49.343-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="facebook"/><category scheme="http://www.blogger.com/atom/ns#" 
term="transparency"/><title type='text'>Facebook downtime</title><content type='html'>&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_FZuU8t1vIb9wddFwWR_Y7HhcUvBQ7FZCQi-gQJl4skVErsgG7kQZCHctI5D0PX6dNeeToTThvFeaVGXIcov0O495M4iD4-uOPijtv-WY1jb39FU1jA-WIarnGXAtyt8jpQhSSmwqtS8/s1600/Platform+Live+Status+-+Facebook+Developers.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: right; float: right; margin-bottom: 1em; margin-left: 1em;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;257&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_FZuU8t1vIb9wddFwWR_Y7HhcUvBQ7FZCQi-gQJl4skVErsgG7kQZCHctI5D0PX6dNeeToTThvFeaVGXIcov0O495M4iD4-uOPijtv-WY1jb39FU1jA-WIarnGXAtyt8jpQhSSmwqtS8/s320/Platform+Live+Status+-+Facebook+Developers.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;Facebook has been experiencing some &lt;a href=&quot;http://mashable.com/2010/09/23/facebook-down-again/&quot;&gt;major downtime today in various locations around the world&lt;/a&gt;:&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #333333; font-family: &#39;Helvetica Neue&#39;, Arial, sans-serif; font-size: 13px; line-height: 18px;&quot;&gt;&lt;/span&gt;&lt;br /&gt;
&lt;blockquote&gt;&quot;After issues at a third-party networking provider&amp;nbsp;&lt;a href=&quot;http://mashable.com/2010/09/22/facebook-down-for-some/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: black;&quot;&gt;took down Facebook&lt;/span&gt;&lt;/a&gt;&amp;nbsp;for some users on Wednesday, the social networking site is once again struggling to stay online.&lt;/blockquote&gt;&lt;blockquote&gt;The company reports latency issues with its API on its &lt;a href=&quot;http://developers.facebook.com/live_status&quot; target=&quot;_blank&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: black;&quot;&gt;developer site&lt;/span&gt;&lt;/a&gt;, but the problem is clearly broader than that with thousands of users&amp;nbsp;&lt;a href=&quot;http://search.twitter.com/search?q=facebook+down&quot; target=&quot;_blank&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: black;&quot;&gt;tweeting&lt;/span&gt;&lt;/a&gt;&amp;nbsp;about the outage.&lt;/blockquote&gt;&lt;blockquote&gt;On our end when we attempt to access Facebook, we’re seeing the message: “Internal Server Error – The server encountered an internal error or misconfiguration and was unable to complete your request.” Facebook “Like” buttons also appear to be down on our site and across the Web&quot;&lt;/blockquote&gt;&lt;div style=&quot;line-height: 1.5em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;&quot;&gt;Details are still sketchy (there&#39;s &lt;a href=&quot;http://search.twitter.com/search?q=akamai+facebook&quot;&gt;speculation Akamai is at fault&lt;/a&gt;). And that&#39;s the problem. It&#39;s almost all speculation right now.
&lt;a href=&quot;http://developers.facebook.com/live_status&quot;&gt;The official word from Facebook is simply&lt;/a&gt;:&lt;/div&gt;&lt;blockquote&gt;&quot;We are currently experiencing latency issues with the API, and we are actively investigating. We will provide an update when either the issue is resolved or we have an ETA for resolution.&quot;&lt;/blockquote&gt;&lt;div style=&quot;line-height: 1.5em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;&quot;&gt;That&#39;s not going to cut it when you have 500+ million users and countless developers (&lt;a href=&quot;http://zynga.com/&quot;&gt;Zynga&lt;/a&gt; must be freaking out right now). I&#39;m seeing about &lt;a href=&quot;http://search.twitter.com/search?q=facebook+down&quot;&gt;400 tweets/second complaining about the downtime&lt;/a&gt;. Outages will happen. The problem isn&#39;t the downtime itself. Where Facebook is missing the boat is in not using this opportunity to build increased trust with their user and developer community by simply opening up the curtains a bit and telling us something useful. &lt;a href=&quot;http://www.transparentuptime.com/2010/07/facebook-and-transparency.html&quot;&gt;I&#39;ve seen some movement from Facebook on this front before&lt;/a&gt;. But there&#39;s much more they can do, and I&#39;m hoping this experience pushes them in the right direction.&amp;nbsp;&lt;a href=&quot;http://www.transparentuptime.com/2010/07/why-transparency-works.html&quot;&gt;&lt;b&gt;Give us back a sense of control and we&#39;ll be happy&lt;/b&gt;&lt;/a&gt;.&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 1.5em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;&quot;&gt;P.S.
You can watch for updates &lt;a href=&quot;http://developers.facebook.com/live_status&quot;&gt;here&lt;/a&gt;, &lt;a href=&quot;http://mailman.nanog.org/pipermail/nanog/2010-September/thread.html&quot;&gt;here&lt;/a&gt;, and &lt;a href=&quot;http://twitter.com/facebook&quot;&gt;here&lt;/a&gt;.&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/4936640341863591925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/09/facebook-downtime.html#comment-form' title='21 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4936640341863591925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4936640341863591925'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/09/facebook-downtime.html' title='Facebook downtime'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_FZuU8t1vIb9wddFwWR_Y7HhcUvBQ7FZCQi-gQJl4skVErsgG7kQZCHctI5D0PX6dNeeToTThvFeaVGXIcov0O495M4iD4-uOPijtv-WY1jb39FU1jA-WIarnGXAtyt8jpQhSSmwqtS8/s72-c/Platform+Live+Status+-+Facebook+Developers.png" height="72" 
width="72"/><thr:total>21</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-1681309870240798126</id><published>2010-09-22T09:05:00.000-07:00</published><updated>2010-09-22T09:05:17.996-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="bp"/><category scheme="http://www.blogger.com/atom/ns#" term="normalaccident"/><title type='text'>BP portraying Deepwater Horizon explosion as a &quot;Normal Accident&quot;...unknowingly calls for end of drilling</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;http://inhabitat.com/wp-content/blogs.dir/1/files/2010/05/deepwater-horizon.jpg&quot; imageanchor=&quot;1&quot; style=&quot;clear: right; float: right; margin-bottom: 1em; margin-left: 1em;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;238&quot; src=&quot;http://inhabitat.com/wp-content/blogs.dir/1/files/2010/05/deepwater-horizon.jpg&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;While reading last week&#39;s issue of Time magazine, I came across this explanation of BP&#39;s pitch attempting to explain the recent accident in the Gulf:&lt;br /&gt;
&lt;blockquote&gt;&quot;Following a four-month investigation, BP released a report Sept. 8 that tried to divert blame from itself to other companies -- including contractors like Transocean -- for the April 20 explosion that sank the Deepwater Horizon rig, killing 11 people and resulting in the worst oil spill in U.S. history. &lt;b&gt;A team of investigators cited &#39;a complex and interlinked series of mechanical failures, human judgement&#39; and &#39;engineering design&#39; as the ultimate cause of the accident.&lt;/b&gt;&quot;&lt;/blockquote&gt;Though to some it may come off as a naive &quot;it&#39;s not our fault&quot; strategy, the reality (and consequence) is a lot more interesting. I&#39;ve spoken before about the concept of a &quot;&lt;a href=&quot;http://www.hazardcards.com/research.php?aid=36&quot;&gt;Normal Accident&quot;&lt;/a&gt;, but let&#39;s define it again:&lt;br /&gt;
&lt;blockquote&gt;Normal Accident Theory:&amp;nbsp;When a technology has become sufficiently complex and tightly coupled, accidents are inevitable and therefore in a sense &#39;normal&#39;.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;Accidents such as Three Mile Island and a number of others, all began with a mechanical or other technical mishap and then spun out of control through a series of technical cause-effect chains because the operators involved could not stop the cascade or unwittingly did things that made it worse. Apparently &lt;b&gt;trivial errors suddenly cascade through the system in unpredictable ways and cause disastrous results.&lt;/b&gt;&lt;/blockquote&gt;What BP is saying is that their systems are so &quot;complex and interlinked&quot; that they were unable to avert the disaster. In a sense, they are arguing that disaster was&amp;nbsp;inevitable. If &quot;Normal Accident Theory&quot; can be believed, &lt;b&gt;BP is indirectly suggesting deep water oil drilling should be abandoned&lt;/b&gt;:&lt;br /&gt;
&lt;blockquote&gt;&quot;This way of analysing technology has normative consequences: If potentially disastrous technologies, such as nuclear power or biotechnology, cannot be made entirely &#39;disaster proof&#39;, we must consider abandoning them altogether.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;Charles Perrow, the author of Normal Accident Theory, came to the conclusion that &quot;some technologies, such as nuclear power, should simply be abandoned because they are not worth the risk&quot;.&lt;/blockquote&gt;Where do I sign?&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Arial, sans-serif; font-size: 12px;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/1681309870240798126/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/09/bp-portraying-deepwater-horizon.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1681309870240798126'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1681309870240798126'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/09/bp-portraying-deepwater-horizon.html' title='BP portraying Deepwater Horizon explosion as a &quot;Normal Accident&quot;...unknowingly calls for end of drilling'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-6993440742496355262</id><published>2010-09-16T15:13:00.000-07:00</published><updated>2010-09-16T15:13:10.153-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Chase.com goes down due to third party DB issues, apologizes...eventually</title><content type='html'>From &lt;a href=&quot;http://www.datacenterknowledge.com/archives/2010/09/16/chase-site-back-online-after-outage/&quot;&gt;Data Center Knowledge&lt;/a&gt;:&lt;br /&gt;
&lt;blockquote&gt;&quot;The Chase.com online banking portal is back online and processing customer bill payments that were delayed during lengthy outages Tuesday and Wednesday, the company &lt;a href=&quot;https://www.chase.com/index.jsp?pg_name=ccpmapp/shared/marketing/page/outage&quot;&gt;said this morning&lt;/a&gt;.&lt;/blockquote&gt;&lt;blockquote&gt;The Chase web site crashed Monday evening when a third party vendor’s database software corrupted the log-in process, the bank told the &lt;a href=&quot;http://online.wsj.com/article/SB20001424052748703743504575493752756026016.html&quot;&gt;Wall Street Journal&lt;/a&gt;. Chase said no customer data was at risk and that its telephone banking and ATMs functioned as usual throughout the outage.&quot;&lt;/blockquote&gt;Unfortunately, there was no communication during the event; Chase finally got a message out to customers who visited the website four days after the first outage:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;http://img.skitch.com/20100916-r11a3k9qxjn943xjbg8qmayx9h.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;244&quot; src=&quot;http://img.skitch.com/20100916-r11a3k9qxjn943xjbg8qmayx9h.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
The &quot;we&#39;re sorry&quot; message is well done, but overall...not good.</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/6993440742496355262/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/09/chasecom-goes-down-due-to-third-party.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/6993440742496355262'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/6993440742496355262'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/09/chasecom-goes-down-due-to-third-party.html' title='Chase.com goes down due to third party DB issues, apologizes...eventually'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-5276342495423291816</id><published>2010-09-13T14:39:00.000-07:00</published><updated>2010-09-13T14:39:53.270-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="pizza"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Domino&#39;s using transparency as a competitive advantage</title><content type='html'>From the &lt;a href=&quot;http://mediadecoder.blogs.nytimes.com/2010/08/27/bad-pizza-is-subject-of-new-dominos-spot/&quot;&gt;NY Times&lt;/a&gt;:&lt;br /&gt;
&lt;blockquote&gt;&lt;i&gt;Domino’s Pizza is extending its campaign that promises customers transparency along with tasty, value-priced pizza.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;The campaign, by Crispin Porter &amp;amp; Bogusky, part of MDC Partners, began with a reformulation of pizza recipes and continued recently with a pledge to show actual products in advertising rather than enhanced versions lovingly tended to by professional food artists.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;The vow to be more real was accompanied by a request to send Domino’s photographs of the company’s pizzas as they arrive at customers’ homes. A Web site, showusyourpizza.com, was set up to receive the photos.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;A commercial scheduled to begin running on Monday will feature Patrick Doyle, the chief executive of Domino’s, pointing to one of the photographs that was uploaded to the Web site. The photo shows a miserable mess of a delivered pizza; the toppings and a lot of the cheese are stuck to the inside of the box.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;“This is not acceptable,” Mr. Doyle says in the spot, addressing someone he identifies as “Bryce in Minnesota.”&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;i&gt;“You shouldn’t have to get this from Domino’s,” Mr. Doyle continues. “We’re better than this.” He goes on to say that such subpar pizza “really gets me upset” and promises: “We’re going to learn; we’re going to get better.
I guarantee it.”&lt;/i&gt;&lt;/blockquote&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/5276342495423291816/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/09/dominos-using-transparency-as.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5276342495423291816'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5276342495423291816'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/09/dominos-using-transparency-as.html' title='Domino&#39;s using transparency as a competitive advantage'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-414074979187962251</id><published>2010-08-13T16:47:00.000-07:00</published><updated>2010-08-13T16:47:50.720-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="downtime"/><title type='text'>How to Prevent Downtime Due to Human Error</title><content type='html'>Great &lt;a href=&quot;http://www.datacenterknowledge.com/archives/2010/08/13/how-to-prevent-downtime-due-to-human-error/&quot;&gt;post today over at Datacenter Knowledge&lt;/a&gt;, citing the fact that &quot;70 percent of the problems that plague data centers&quot; are caused by human error. 
Below are the best practices to avoid data center failure by human error:&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: Geneva, Arial, Helvetica, sans-serif; font-size: 10px;&quot;&gt;&lt;/span&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;1. &lt;b&gt;Shielding Emergency OFF Buttons&lt;/b&gt;&amp;nbsp;– Emergency Power Off (EPO) buttons are generally located near doorways in the data center. Often, these buttons are not covered or labeled, and are mistakenly shut off during an emergency, which shuts down power to the entire data center. Labeling and covering EPO buttons can prevent someone from accidentally pushing the button. See&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;a href=&quot;http://www.datacenterknowledge.com/archives/2007/05/07/averting-disaster-with-the-epo-button/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: black;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Averting Disaster with the EPO Button&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;and&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;a href=&quot;http://www.datacenterknowledge.com/archives/2008/04/02/best-label-ever-for-an-epo-button/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: black;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Best Label Ever for an EPO Button&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;for more on this 
topic.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;2. &lt;b&gt;Documented Method of Procedure -&lt;/b&gt;&amp;nbsp;A documented step-by-step, task-oriented procedure mitigates or eliminates the risk associated with performing maintenance. Don’t limit the procedure to one vendor, and ensure back-up plans are included in case of unforeseen events.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;3. &lt;b&gt;Correct Component Labeling&lt;/b&gt;&amp;nbsp;-&amp;nbsp;To correctly and safely operate a power system, all switching devices must be labeled correctly, as well as the facility one-line diagram to ensure correct sequence of operation. Procedures should be in place to double check device labeling.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;4. &lt;b&gt;Consistent Operating Practices&lt;/b&gt;&amp;nbsp;– Sometimes data center managers get too comfortable and don’t follow procedures, forget or skip steps, or perform the procedure from memory and inadvertently shut down the wrong equipment.
It is critical to keep all operational procedures up to date and follow the instructions to operate the system.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;5.&lt;b&gt; Ongoing Personnel Training&lt;/b&gt;&amp;nbsp;– Ensure all individuals with access to the data center, including IT, emergency, security and facility personnel, have basic knowledge of equipment so that it’s not shut down by mistake.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;6. &lt;b&gt;Secure Access Policies&lt;/b&gt;&amp;nbsp;– Organizations without data center sign-in policies run the risk of security breaches. Having a sign-in policy that requires an escort for visitors, such as vendors, will enable data center managers to know who is entering and exiting the facility at all times.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;7. &lt;b&gt;Enforcing Food/Drinks Policies&lt;/b&gt;&amp;nbsp;– Liquids pose the greatest risk for shorting out critical computer components. The best way to communicate your data center’s food/drink policy is to post a sign outside the door that states what the policy is, and how vigorously the policy is enforced.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;8. 
&lt;b&gt;Avoiding Contaminants&lt;/b&gt;&amp;nbsp;– Poor indoor air quality can cause unwanted dust particles and debris to enter servers and other IT infrastructure. Much of the problem can be alleviated by having all personnel who access the data center wear antistatic booties, or by placing a mat outside the data center. This includes packing and unpacking equipment outside the data center. Moving equipment inside the data center increases the chances that fibers from boxes and skids will end up in server racks and other IT infrastructure.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/414074979187962251/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/08/how-to-prevent-downtime-due-to-human.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/414074979187962251'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/414074979187962251'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/08/how-to-prevent-downtime-due-to-human.html' title='How to Prevent Downtime Due to Human Error'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-5897828858465374535</id><published>2010-08-12T16:05:00.000-07:00</published><updated>2010-08-16T11:37:44.725-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="downtime"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Downtime, downtime, downtime - DNS Made Easy, Posterous, Evernote</title><content type='html'>&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;It&#39;s been a busy week on the interwebs. Either downtime incidents are becoming more common, or I&#39;m just finding out about more of them. One nice thing about this blog is that readers send me downtime events that they come across. I don&#39;t know if I want to be the first person that people think of when they see downtime, but I&#39;ll take it. In the spirit of this blog, let&#39;s take a look at the recent downtime events to see &lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;what they did right, what they can improve, and what we can all learn from their experience&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: x-large;&quot;&gt;DNS Made Easy&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;On Saturday August 7th, &lt;/span&gt;&lt;a href=&quot;http://www.theregister.co.uk/2010/08/09/dns_service_monster_ddos/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;DNS Made Easy was host to a massive DDoS attack&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;:&lt;/span&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&quot;The firm said it experienced 1.5 hours of actual downtime during the attack, which lasted eight hours. Carriers including Level3, GlobalCrossing, Tinet, Tata, and Deutsche Telekom assisted in blocking the attack, which due to its size flooded network backbones with junk.&quot;&lt;/span&gt;&lt;/blockquote&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: #09111a; font-family: Trebuchet, &#39;Trebuchet MS&#39;, Arial, sans-serif; font-size: 15px; line-height: 19px;&quot;&gt;&lt;/span&gt;&lt;br /&gt;
&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Prerequisites:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Admit failure&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Through a series of customer email communications and &lt;a href=&quot;http://twitter.com/DNSMadeEasy/status/20548505230&quot;&gt;tweets&lt;/a&gt;, there was a clear&amp;nbsp;admission&amp;nbsp;of failure early and often.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Sound like a human&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal;&quot;&gt;&amp;nbsp;- Yes, the communications all sounded genuine and human.&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Have a communication channel&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Marginal. The communication channels were Twitter and email, which are not as powerful as a &lt;a href=&quot;http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html&quot;&gt;health status dashboard&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Above all else, be authentic&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;-&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-style: normal;&quot;&gt; Great job here. All of the communication I saw sounded authentic and heartfelt, including the final postmortem. Well done.&lt;/span&gt;&lt;/span&gt;&lt;/i&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Requirements:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Start time and end time of the incident&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal;&quot;&gt;&amp;nbsp;- Yes, final postmortem email&amp;nbsp;communication&amp;nbsp;included the official start and end times (8:00 UTC - 14:00 UTC).&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Who/what was impacted&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- The postmortem addressed this directly, but didn&#39;t spell out a completely clear picture of who was affected and who wasn&#39;t. This is probably because there isn&#39;t a clear distinction between sites that were and weren&#39;t affected. To address this, they recommended customers review their DNS query traffic to see how they were affected.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What went wrong&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- A good amount of detail on this, and I hope there is more coming. DDoS attacks are a great example of where sharing knowledge and experience helps the community as a whole, so I hope to see more detail come out about this.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Lessons learned&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- The postmortem included some lessons learned, but nothing very specific. I would have liked to see more here.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Bonus:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Details on the technologies involved - Some.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Answers to the&amp;nbsp;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/5_Whys&quot; style=&quot;color: #336699;&quot;&gt;Five Why&#39;s&lt;/a&gt;&amp;nbsp;- Nope.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Human elements - heroic efforts, unfortunate coincidences, effective teamwork, etc - Some.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What others can learn from this experience - Some.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div&gt;Other notes:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;The communication throughout the incident was excellent, though they could have benefited from a public dashboard or status blog that went beyond Twitter and private customer emails.&lt;/li&gt;
&lt;li&gt;I don&#39;t think this is the right way to address the question of whether SLA credits will be issued: &quot;Yes it will be. With thousands paying companies we obviously do not want every organization to submit an SLA form.&quot;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: x-large;&quot;&gt;&lt;b&gt;Posterous&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;Starting Wednesday 8/4, &lt;a href=&quot;http://blog.posterous.com/&quot;&gt;Posterous began to experience various stability issues&lt;/a&gt;:&lt;/div&gt;&lt;blockquote&gt;&quot;As you’re no doubt aware, Posterous has had a rocky six days.&lt;/blockquote&gt;&lt;blockquote&gt;On Wednesday and Friday, our servers were hit by massive Denial of Service (DoS) attacks. We responded quickly and got back online within an hour, but it didn’t matter; the site went down and our users couldn’t post.&lt;/blockquote&gt;&lt;blockquote&gt;On Friday night, our team worked around the clock to move to new data centers, better capable of handling the onslaught. It wasn’t easy. Throughout the weekend we were fixing issues, optimizing the site, some things going smoothly, others less so.&lt;/blockquote&gt;&lt;blockquote&gt;Just at the moments we thought the worst was behind us, we’d run up against another challenge. It tested not only our technical abilities, but our stamina, patience, and we lost more than a few hairs in the process.&quot;&lt;/blockquote&gt;Posterous&amp;nbsp;&lt;a href=&quot;http://blog.posterous.com/&quot;&gt;continued to update their users on their blog&lt;/a&gt;, and on &lt;a href=&quot;http://blog.posterous.com/&quot;&gt;twitter&lt;/a&gt;. They also sent out an email communication to all of their customers to let everyone know about the issues.&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;div style=&quot;margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Prerequisites:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Admit failure&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Clearly yes, both on the blog and on Twitter.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Sound like a human&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Very much so.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Have a communication channel&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- A combination of blog and Twitter. Again, not ideal, as customers may not think about visiting the blog or checking Twitter, especially when the blog is inaccessible during the downtime and they may not be aware of the Twitter account. One of the &lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s1600/Preparation+Framwork.png&quot;&gt;keys to a communication channel&lt;/a&gt; is to host it offsite, which would have been important in this case.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Above all else, be authentic&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;- No issues here, well done.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;div style=&quot;margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Requirements:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Start time and end time of the incident &lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal;&quot;&gt;- A bit vague in the postmortem, but can be calculated from the Twitter communication. Can be improved.&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Who/what was impacted&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- The initial &lt;a href=&quot;http://blog.posterous.com/todays-outage-and-changes-for-custom-domains&quot;&gt;post&lt;/a&gt; described this fairly well: all customers hosted on Posterous.com were affected, including custom domains.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What went wrong&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;-&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;A series of things went wrong in this case, and I believe the issues were described fairly well.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Lessons learned&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Much room for improvement here. There were things put in place to avoid the issues in the future, such as moving to a new datacenter and adding hardware, but I don&#39;t see any real lessons learned in the postmortem posts or other communications as a result of this downtime.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;div style=&quot;margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Bonus:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Details on the technologies involved - Very little.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Answers to the&amp;nbsp;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/5_Whys&quot; style=&quot;color: #336699;&quot;&gt;Five Why&#39;s&lt;/a&gt;&amp;nbsp;- No.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Human elements - Yes, in the final postmortem, well done.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What others can learn from this experience - Not a lot here.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: x-large;&quot;&gt;&lt;b&gt;Evernote&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;From their &lt;a href=&quot;http://blog.evernote.com/2010/08/09/july1/&quot;&gt;blog&lt;/a&gt;:&lt;br /&gt;
&lt;blockquote&gt;&quot;EvernoteEvernote servers. We immediately contacted all affected users via email and our support team walked them through the recovery process. We automatically upgraded all potentially affected users to Evernote Premium (or added a year of Premium to anyone who had already upgraded) because we wanted to make sure that they had access to priority tech support if they needed help recovering their notes and as a partial apology for the inconvenience.&quot;&lt;/blockquote&gt;&lt;div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;div style=&quot;margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Prerequisites:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Admit failure&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Extremely solid, far beyond the bare minimum.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Sound like a human&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Yes.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Have a communication channel&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- A simple &lt;a href=&quot;http://status.evernote.com/&quot;&gt;health status blog&lt;/a&gt; (which according to the comments is not easy to find), a &lt;a href=&quot;http://blog.evernote.com/&quot;&gt;blog&lt;/a&gt;, and a &lt;a href=&quot;http://twitter.com/evernote&quot;&gt;Twitter&lt;/a&gt; channel. Biggest area of improvement here is to make the status blog easier to find. I have no idea how to get to that from the site or the application, and that defeats its purpose.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Above all else, be authentic&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;- The only communication I saw was the &lt;a href=&quot;http://blog.evernote.com/2010/08/09/july1/&quot;&gt;final postmortem&lt;/a&gt;, and I think in that post (and the comments) they were very authentic.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;div style=&quot;margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Requirements:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Start time and end time of the incident&amp;nbsp;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal;&quot;&gt;- Rough timeframe, would have liked to see more detail.&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Who/what was impacted&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- First time I&#39;ve seen an exact figure like &quot;6,323&quot; users. Impressive.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What went wrong&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;-&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;Yes, at the end of the &lt;a href=&quot;http://blog.evernote.com/2010/08/09/july1/&quot;&gt;postmortem&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Lessons learned&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&amp;nbsp;- Marginal. A bit vague and hand-wavy.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;div style=&quot;margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Bonus:&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 1.3em; margin-bottom: 0.75em; margin-left: 0px; margin-right: 0px; margin-top: 0px;&quot;&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Details on the technologies involved - Not bad.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Answers to the&amp;nbsp;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/5_Whys&quot; style=&quot;color: #336699;&quot;&gt;Five Why&#39;s&lt;/a&gt;&amp;nbsp;- No.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;Human elements - No.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-family: inherit;&quot;&gt;What others can learn from this experience - Not a lot here.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: x-large;&quot;&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;Overall, I&#39;m impressed with how these companies are handling downtime. Each communicated early and often. Each&amp;nbsp;admitted&amp;nbsp;failure immediately, and kept their users up to date. Each put out a solid postmortem that detailed the key information. It&#39;s interesting to see how Twitter is becoming the de-facto communication channel during an incident. I still wonder how effective it is in getting news out to all of your users, and how many users are aware of it. Overall, well done guys.&lt;br /&gt;
&lt;br /&gt;
Update: DNS Made Easy just &lt;a href=&quot;http://www.dnsstatus.com/&quot;&gt;launched a public health dashboard&lt;/a&gt;!&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/5897828858465374535/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/08/downtime-downtime-downtime-dns-made.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5897828858465374535'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5897828858465374535'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/08/downtime-downtime-downtime-dns-made.html' title='Downtime, downtime, downtime - DNS Made Easy, Posterous, Evernote'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-2332387785017960436</id><published>2010-08-10T08:18:00.000-07:00</published><updated>2010-08-10T08:18:53.018-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="casestudy"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><category scheme="http://www.blogger.com/atom/ns#" term="twilio"/><title type='text'>Transparency in action at Twilio</title><content 
type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;http://static1.twilio.com/resources/images/company/logos/logos_icon_roundname.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: right; float: right; margin-bottom: 1em; margin-left: 1em;&quot;&gt;&lt;img border=&quot;0&quot; src=&quot;http://static1.twilio.com/resources/images/company/logos/logos_icon_roundname.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;When &lt;a href=&quot;http://blog.twilio.com/2010/07/twilio-open-sources-stashboard-the-status-dashboard.html&quot;&gt;Twilio launched an open-source public health dashboard&lt;/a&gt; tool a couple of weeks ago, I knew I had to learn more about Twilio. I connected with&amp;nbsp;&lt;a href=&quot;http://twitter.com/johndbritton&quot;&gt;John Britton&lt;/a&gt;&amp;nbsp;(Developer Evangelist at Twilio) to get some insight into Twilio&#39;s transparency story. Enjoy...&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What motivated Twilio to launch a public health dashboard and to put resources into transparency?&lt;/b&gt;&lt;br /&gt;
Twilio&#39;s goal is to bring the simplicity and transparency common in the world of web technologies to the opaque world of telephony and communications. &amp;nbsp;Just as Amazon AWS and other web infrastructure providers give customers direct and immediate information on service availability, &lt;a href=&quot;http://www.stashboard.org/&quot;&gt;Stashboard&lt;/a&gt; allows Twilio to provide a dedicated &lt;a href=&quot;http://status.twilio.com/&quot;&gt;status portal&lt;/a&gt; that our customers can visit anytime to get up-to-the-minute information on system health. &amp;nbsp;During the development of Stashboard, we realized how many other companies and businesses could use a simple, scalable status page, so we open sourced it! &amp;nbsp;You can &lt;a href=&quot;http://github.com/twilio/stashboard/downloads&quot;&gt;download&lt;/a&gt; the source code or &lt;a href=&quot;http://github.com/twilio/stashboard&quot;&gt;fork&lt;/a&gt; your own version.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What roadblocks did you encounter on the way to launching the public dashboard, and how did you overcome them?&lt;/b&gt;&lt;br /&gt;
The most difficult part of building and launching Stashboard was creating a robust set of APIs that would encompass Twilio&#39;s services as well as other services from companies interested in running an instance of Stashboard themselves. We looked at existing status dashboards for inspiration, including the Amazon AWS Status Page and the Google Apps Status Page, and settled on a very general design independent of Twilio&#39;s product. The result is a dashboard that can be used to track a variety of APIs and services. &amp;nbsp;For example, a few days after the release of Stashboard, MongoHQ, a hosted MongoDB database provider, launched &lt;a href=&quot;http://status.mongohq.com/&quot;&gt;their own instance of Stashboard&lt;/a&gt; to give their customers API status information.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What benefits have you seen as a result of your transparency initiatives?&lt;/b&gt;&lt;br /&gt;
Twilio&#39;s rapid growth is a great example of how developers at both small and large companies have responded to Twilio&#39;s simple, open approach. &amp;nbsp;The Twilio developer community has grown to more than 15,000 strong and we see more and more applications and developers on the platform every day. &amp;nbsp;Twilio was founded by developers who have a strong background in web services and distributed systems. &amp;nbsp;This is reflected in our adoption of open standards like HTTP and operational transparency with services like &lt;a href=&quot;http://status.twilio.com/&quot;&gt;http://status.twilio.com&lt;/a&gt;. &amp;nbsp;Another example is the community that has grown up around &lt;a href=&quot;http://www.openvbx.org/&quot;&gt;OpenVBX&lt;/a&gt;, a downloadable phone system for small businesses that Twilio developed and open sourced a few weeks ago. &amp;nbsp; We opened OpenVBX to provide developers the simplest way to hack, skin, and integrate it with their own systems.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What is your hope with the open source dashboard framework?&lt;/b&gt;&lt;br /&gt;
The main goal of Stashboard is to give back to the community. &amp;nbsp;We use open source software extensively inside Twilio, and we hope that opening up Stashboard will help other hosted services and improve the whole web services ecosystem.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What would you say to companies considering transparency in their uptime/performance?&lt;/b&gt;&lt;br /&gt;
Openness and transparency are key to building trust with customers. &amp;nbsp;Take the telecom industry as an example. &amp;nbsp;They are known for being completely closed. &amp;nbsp;Customers rarely love or trust their telecom providers. &amp;nbsp; In contrast, Twilio brings the open approach of the web to telecom and the response has been truly amazing. &amp;nbsp;When customers know they can depend on a company to provide accurate data concerning performance and reliability, they are more willing to give that company their business and recommend it to their peers. &amp;nbsp;Twilio&#39;s commitment to transparency and openness has been a huge driver of our success, and Stashboard and projects like OpenVBX are just the beginning.&lt;br /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/2332387785017960436/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/08/transparency-in-action-at-twilio.html#comment-form' title='26 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/2332387785017960436'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/2332387785017960436'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/08/transparency-in-action-at-twilio.html' title='Transparency in action at Twilio'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>26</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-4003621197928601887</id><published>2010-07-28T16:55:00.000-07:00</published><updated>2010-07-28T16:58:47.571-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="casestudy"/><category scheme="http://www.blogger.com/atom/ns#" term="opensrs"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Transparency in action at OpenSRS</title><content type='html'>OpenSRS has long been a company that &quot;gets it&quot;, so I was excited to have the opportunity to interview&amp;nbsp;Ken Schafer, who leads the transparency efforts at &lt;a href=&quot;http://opensrs.com/&quot;&gt;OpenSRS&lt;/a&gt; and &lt;a 
href=&quot;http://tucowsinc.com/&quot;&gt;Tucows&lt;/a&gt;. OpenSRS has an excellent &lt;a href=&quot;http://status.opensrs.com/&quot;&gt;public health dashboard&lt;/a&gt;, and continues to put a lot of effort into transparency. Heather Leson, who works with Ken, has done a lot to raise the bar in the online transparency community. My hope is that the more transparent we all get about our own transparency efforts (too much?) the more we all benefit. Below, Ken tells us how he got the company to accept the need for transparency, what hurdles they had to overcome, and what benefits they&#39;ve seen. Enjoy the interview, and if you have any questions for Ken, please post them as comments below.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. Can you briefly explain your role and how you got involved in your company’s transparency initiative?&lt;/b&gt;&lt;br /&gt;
My formal title is Executive Vice President of Product &amp;amp; Marketing. &amp;nbsp;That means I&#39;m on the overall &lt;a href=&quot;http://tucowsinc.com/&quot;&gt;Tucows&lt;/a&gt; exec team and I&#39;m also responsible for the product strategy and marketing of &lt;a href=&quot;http://opensrs.com/&quot;&gt;OpenSRS&lt;/a&gt;, our wholesale Internet services group.&lt;br /&gt;
&lt;br /&gt;
Tucows is one of the original Internet companies - founded in 1993. We&#39;ve moved well beyond the original &lt;a href=&quot;http://tucows.com/&quot;&gt;software download site&lt;/a&gt; and now the company makes most of its money providing easy-to-use Internet services.&lt;br /&gt;
&lt;br /&gt;
OpenSRS provides end users with over 10 million domain names, millions of mailboxes, and tens of thousands of digital certificates through over 10,000 resellers in over 100 countries. Our resellers are primarily web hosts, ISPs, web developers and designers, and IT consultants.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What has your group done to create transparency for your organization?&lt;/b&gt;&lt;br /&gt;
Given the technical adeptness of our resellers we&#39;ve always tried not to talk down to them and to provide as much information as we can. Our success and the success of our resellers are highly dependent on each other, so we&#39;re very open to sharing. In fact, since the beginning of OpenSRS in 1999 we&#39;ve run mailing lists, blogs, forums, wikis, status pages and a host of other ways for us to communicate better with our resellers.&lt;br /&gt;
&lt;br /&gt;
Transparency is kind of in the nature of the business at this point.&lt;br /&gt;
&lt;br /&gt;
Right now we provide transparency into what we&#39;re doing through a &lt;a href=&quot;http://opensrs.com/blog&quot;&gt;blog&lt;/a&gt;, a &lt;a href=&quot;https://forums.opensrs.com/&quot;&gt;reseller forum&lt;/a&gt;, our &lt;a href=&quot;http://status.opensrs.com/&quot;&gt;Status site&lt;/a&gt; and our activity on a &lt;a href=&quot;http://www.opensrs.com/community/sites&quot;&gt;host of social networks&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What was the biggest hurdle you had to get over to push this through?&lt;/b&gt;&lt;br /&gt;
The biggest challenge is really whether your commitment to transparency can survive the bad times. Being transparent when you&#39;ve got a status board full of green tick marks isn&#39;t that hard. When everything starts turning red and staying that way, THAT&#39;S a lot harder.&lt;br /&gt;
&lt;br /&gt;
We&#39;re generally proud of our uptime and the quality of our services but a few years ago we struggled with scaling some of our applications and, frankly, our communication around the problems we were facing suffered as a result. People here were just too embarrassed to tell our resellers that we&#39;d messed stuff up and in particular to admit to our fellow geeks HOW we&#39;d messed up.&lt;br /&gt;
&lt;br /&gt;
But when we pushed and DID share information, admitted our mistakes and talked about what we could do to make it better, what we found was that our resellers were appreciative AND very sympathetic. They&#39;d all been there too and knew it was hard to fess up to our errors in judgment and they really appreciated it.&lt;br /&gt;
&lt;br /&gt;
One thing we STILL struggle with is how we communicate around network attacks. Our services run a big chunk of the Internet and as such we&#39;re under pretty much constant attack of one sort or another. We handle most of these without anyone noticing. Our operations and security teams do an amazing job of keeping things running smoothly in the face of these attacks but every once in a while something new - in scope, scale or technique - happens that puts pressure on our systems until we can adjust to the new threat.&lt;br /&gt;
&lt;br /&gt;
In those cases we&#39;ve tended to put our desire for transparency aside and give minimal information so as not to show our hand to the bad guys. It&#39;s a struggle between what we share so customers understand what is happening and not showing potential vulnerabilities that others could exploit.&lt;br /&gt;
&lt;br /&gt;
I guess &quot;sharing what is exploitable&quot; is where I draw the line when it comes to transparency.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What benefits have you seen as a result of your transparency?&lt;/b&gt;&lt;br /&gt;
One of the biggest benefits is in the overall quality of the service. When you say that EVERY problem is going to get publicly and permanently posted to a status page it REALLY focusses the organization on quality of service!&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. Can you give us some insight into the processes around your transparency? Specifically who manages the communication, who is responsible for maintaining the dashboard, and what the general process looks like before/during/after a big event.&lt;/b&gt;&lt;br /&gt;
Our communications team (Marketing) is responsible for the OpenSRS Status page. &amp;nbsp;We generally hire marketers that are technically comfortable so they can write to be understood and understand what they&#39;re writing about.&lt;br /&gt;
&lt;br /&gt;
We have someone from Marketing on call 24/7/365 and whenever an issue cannot be resolved in an agreed-to period of time (generally 15 minutes) our Network Operations Center (also 24/7/365) informs Marketing and we post to Status.&lt;br /&gt;
&lt;br /&gt;
Our Status page is a heavily customized version of WordPress plus an email notification system and auto-updates to our Twitter feed.&lt;br /&gt;
&lt;br /&gt;
Marketing and NOC then stay in touch until the issue is resolved, posting updates as material changes occur or at two-hour intervals if the issue is ongoing.&lt;br /&gt;
&lt;br /&gt;
You&#39;ll notice this is a largely manual system. We decided against posting our internal monitoring tools publicly because of the complexity of our operations. Multiple services, each composed of multiple sub-systems running in data centers around the world, mean that the raw data isn&#39;t as useful to resellers as it may be for some less complex environments.&lt;br /&gt;
&lt;br /&gt;
In the event of a serious problem we also have an escalation process - once again managed by Marketing - that brings in additional levels of communications and executives. For major issues we also have a &quot;War Room&quot; procedure that is put in place until the issue is resolved.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Q. What would you say to other organizations that are considering transparency as a strategic initiative?&lt;/b&gt;&lt;br /&gt;
The days of hiding are over. You now have a choice of whether you want to tell the story or have others misrepresent the story on your behalf. It seems scary to admit you have problems but you gain so much by being open and honest that the stress of taking a new approach to communications is easily outweighed.</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/4003621197928601887/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/07/transparency-in-action-at-opensrs.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4003621197928601887'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4003621197928601887'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/07/transparency-in-action-at-opensrs.html' title='Transparency in action at OpenSRS'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-3327621957095788956</id><published>2010-07-27T08:32:00.000-07:00</published><updated>2010-07-27T08:44:15.384-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="oreilly"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>I&#39;m doing an O&#39;Reilly Webcast this Thursday!</title><content type='html'>&lt;a 
href=&quot;http://www.oreillynet.com/pub/e/1672&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 500px;;&quot; src=&quot;http://img.skitch.com/20100727-j2qnyf8em8pchsggfykbgbtc3p.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;The folks at O&#39;Reilly asked me to do a &lt;a href=&quot;http://www.oreillynet.com/pub/e/1672&quot;&gt;webcast&lt;/a&gt; of my &lt;a href=&quot;http://www.transparentuptime.com/2010/06/upside-of-downtime-velocity-2010.html&quot;&gt;talk&lt;/a&gt;, and I was happy to oblige. This talk will be very similar to the one I did at Velocity. I don&#39;t think I&#39;ll be doing this talk for much longer, so this may be your last chance to hear it live. I&#39;d love to have you there and to hear any feedback you may have about the message. The webcast will begin at 10am PST this coming Thursday, and you can &lt;a href=&quot;http://www.oreillynet.com/pub/e/1672&quot;&gt;register here&lt;/a&gt;.&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/3327621957095788956/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/07/im-doing-oreilly-webcast-this-thursday.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3327621957095788956'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3327621957095788956'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/07/im-doing-oreilly-webcast-this-thursday.html' title='I&#39;m doing an O&#39;Reilly Webcast this Thursday!'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-7576914822538721816</id><published>2010-07-20T15:52:00.000-07:00</published><updated>2010-07-20T15:52:44.376-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="psychology"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Why Transparency Works</title><content type='html'>&lt;p class=&quot;p1&quot;&gt;We&#39;ve talked about the &lt;a href=&quot;http://www.transparentuptime.com/2010/07/benefits-of-transparency.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;benefits of transparency&lt;/span&gt;&lt;/a&gt;. We&#39;ve talked about &lt;a href=&quot;http://www.transparentuptime.com/2010/06/upside-of-downtime-velocity-2010.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;implementing transparency&lt;/span&gt;&lt;/a&gt;. We&#39;ve talked about &lt;a href=&quot;http://www.transparentuptime.com/2010/04/zendesk-transparency-in-action.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;transparency&lt;/span&gt;&lt;/a&gt; &lt;a href=&quot;http://www.transparentuptime.com/2010/04/atlassian-has-security-breach-responds.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;in&lt;/span&gt;&lt;/a&gt; &lt;a href=&quot;http://www.transparentuptime.com/2010/04/transparent-censorship.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;action&lt;/span&gt;&lt;/a&gt;. What we haven&#39;t yet talked about is...&lt;b&gt;why the heck does transparency work?&lt;/b&gt; Why does transparency make your users happier? Why do customers trust you more when you are transparent? 
Why do we want to know what&#39;s going on? What allows us to be OK with major problems by simply knowing what is going on? My theory is simple: &lt;b&gt;Transparency gives us a sense of control, and control is required for happiness.&lt;/b&gt; Allow me to elaborate.&lt;/p&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;Downtime and learned helplessness&lt;/span&gt;&lt;/b&gt;&lt;div&gt;&lt;blockquote&gt;&lt;i&gt;The concept of learned helplessness was developed in the 1960s and 1970s by Martin Seligman at the University of Pennsylvania. He found that animals receiving electric shocks, which they had no ability to prevent or avoid, were unable to act in subsequent situations where avoidance or escape was possible. Extending the ramifications of these findings to humans, Seligman and his colleagues found that &lt;b&gt;human motivation [...] is undermined by a lack of control over one&#39;s surroundings&lt;/b&gt;. (&lt;a href=&quot;http://findarticles.com/p/articles/mi_g2602/is_0003/ai_2602000349/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;source&lt;/span&gt;&lt;/a&gt;)&lt;/i&gt;&lt;/blockquote&gt; &lt;/div&gt;&lt;div&gt;Learned helplessness was discovered by accident when Seligman was researching &lt;a href=&quot;http://en.wikipedia.org/wiki/Pavlovian&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;Pavlovian conditioning&lt;/span&gt;&lt;/a&gt;. His experiment was set up to associate a tone with a (harmless) shock, to test whether the animal would learn to run away from just the sound of the tone. In the now famous experiment, one group of dogs was restrained and unable to escape the shock for a period of time (i.e. this group had no control over its situation). Later this group was placed into an area that now allowed them to escape the shock; unexpectedly the dogs stayed put. The shocks continued to come, yet the dogs simply curled up in the corner and whimpered. 
These dogs exhibited depression, and in a sense gave up on life, because these negative events were seemingly random. Seligman &lt;a href=&quot;http://en.wikipedia.org/wiki/Learned_helplessness&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;concluded&lt;/span&gt;&lt;/a&gt; that &quot;the strongest predictor of a depressive response was lack of control over the negative stimulus.&quot; &lt;b&gt;What is downtime if not a lack of control over a negative stimulus?&lt;/b&gt;&lt;/div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://www.flyfishingdevon.co.uk/salmon/year2/psy221depression/dog-shuttle-box.gif&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 362px; height: 208px;&quot; src=&quot;http://www.flyfishingdevon.co.uk/salmon/year2/psy221depression/dog-shuttle-box.gif&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;The Cloud and loss of control&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Many concerns come up when businesses consider the cloud, but as the &lt;a href=&quot;http://blogs.idc.com/ie/wp-content/uploads/2009/12/idc_cloud_challenges_2009.jpg&quot;&gt;survey by IDC&lt;/a&gt; below shows the overriding concern is rooted in a loss of control:&lt;/div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100720-gmskcih2puabjb3kfkac3rxdus.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 500px;&quot; src=&quot;http://img.skitch.com/20100720-gmskcih2puabjb3kfkac3rxdus.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;div&gt;You give up a lot of control in exchange for reduced cost, higher efficiency, and increased flexibility. 
Yet that desire for control persists, and the remaining bits of control you maintain become even more valuable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;Downtime kills our sense of control&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Downtime is quite simply a negative event over which you have almost no control. Especially when using SaaS/cloud services, your remaining semblance of control vanishes as soon as service goes down and you have no insight into what is going on. &lt;i&gt;&lt;b&gt;We&lt;/b&gt;&lt;/i&gt;&lt;b&gt; are the dogs trapped in the shock machine, whimpering in the corner.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://jaydixit.com/wordpress/wp-content/uploads/2009/01/sad-dog.jpg&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px;&quot; src=&quot;http://jaydixit.com/wordpress/wp-content/uploads/2009/01/sad-dog.jpg&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As I described in &lt;a href=&quot;http://www.transparentuptime.com/2010/06/video-of-my-talk-upside-of-downtime-at.html&quot;&gt;my talk&lt;/a&gt;, downtime is inevitable. Thanks to things like risk homeostasis, black swan events, unknown unknowns, and our own nature, there is no way to avoid failure. All we can do is prepare for it, and communicate/explain what is going on. 
And that is the key to keeping us from the fate of a depressed canine.&lt;b&gt; Transparency gives us a sense of control over the uncontrollable.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large; &quot;&gt;How transparency gives us back the sense of control&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Imagine walking through the park, the sun shining, the birds singing. All of a sudden you notice a strong pain in your arm. Your mind jumps to the worst. Are you having a heart attack, did something just bite you, are you getting older and sicker? Then a split second later you remember...your buddy jokingly punched you earlier in the day! The punch must have been harder than you remember, but it explains the pain. Instantly you feel better.&lt;b&gt; Though the pain is the same, you understand the source. You have an explanation for the pain. Transparency delivers that explanation.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;When Amazon goes down, or Gmail isn&#39;t loading, users feel pain. Part of that pain comes from the inconvenience of not being able to do what you want to get done, or the lost revenue that comes with downtime. But just as painful is the sense of fatalistic helplessness, especially if someone is breathing down your neck expecting you to fix the problem. Without insight into what is happening with the service, you are completely without control. 
If on the other hand the service provides an explanation, through a &lt;a href=&quot;http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html&quot;&gt;public health dashboard&lt;/a&gt; or a &lt;a href=&quot;http://www.transparentuptime.com/2010/02/github-system-status-page-most-fun.html&quot;&gt;status blog&lt;/a&gt; or a &lt;a href=&quot;http://www.transparentuptime.com/2010/04/zendesk-transparency-in-action.html&quot;&gt;simple tweet&lt;/a&gt;, your fatalistic reaction turns to concrete concern. Your mind goes from assuming the worst (e.g. this service is terrible, they don&#39;t know what they are doing, it always fails) to focusing on a real and specific problem (e.g. some hard drive in the datacenter failed, they had some user error, this&#39;ll be gone soon). A specific problem is fixable, an unexplained pain is not. Transparency brings the pain down to a specific and knowable problem, while also holding the provider accountable for their issues (which indirectly gives you even more control). Or better said:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;i&gt;Seligman believes it is possible to change people&#39;s explanatory styles to replace learned helplessness with &quot;learned optimism.&quot; To combat (or even prevent) learned helplessness in both adults and children, he has successfully used techniques similar to those used in cognitive therapy with persons suffering from depression. These include &lt;b&gt;identifying negative interpretations of events, evaluating their accuracy, generating more accurate interpretations, and decatastrophizing (countering the tendency to imagine the worst possible consequences for an event)&lt;/b&gt;. 
(&lt;a href=&quot;http://findarticles.com/p/articles/mi_g2602/is_0003/ai_2602000349/&quot;&gt;source&lt;/a&gt;)&lt;/i&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;By providing a sense of control, transparency is one of the keys to keeping us happy, productive, and sane in an increasingly uncontrollable world.&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/7576914822538721816/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/07/why-transparency-works.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/7576914822538721816'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/7576914822538721816'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/07/why-transparency-works.html' title='Why Transparency Works'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-5678245838894242717</id><published>2010-07-08T13:13:00.000-07:00</published><updated>2010-07-09T08:56:07.636-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="facebook"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Facebook and transparency</title><content type='html'>&lt;span class=&quot;Apple-style-span&quot; &gt;As some of you may know, I use 
Facebook as an example of how not-to-do-transparency in &lt;/span&gt;&lt;a href=&quot;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;my&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; &lt;/span&gt;&lt;a href=&quot;http://www.youtube.com/watch?v=6MF2Pu6IW3Q&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;talk&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;. Immediately following my talk at &lt;/span&gt;&lt;a href=&quot;http://en.oreilly.com/velocity2010/public/schedule/detail/12605&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Velocity&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;, I received the following comment from Bret Taylor (CTO of Facebook):&lt;/span&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100708-j7nfwx5sym9pg8rj6eeud684b4.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 694px; height: 188px;border:1px solid #000&quot; src=&quot;http://img.skitch.com/20100708-j7nfwx5sym9pg8rj6eeud684b4.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;The &lt;/span&gt;&lt;a href=&quot;http://developers.facebook.com/live_status&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&quot;Platform Live Status&quot;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span 
class=&quot;Apple-style-span&quot; &gt; page that is mentioned looks like this:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100708-xksau32hebs6jd97ntbu85kk8.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 600px;&quot; src=&quot;http://img.skitch.com/20100708-xksau32hebs6jd97ntbu85kk8.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;There&#39;s some really good stuff here (e.g. it exists, it looks up-to-date, and it has some great features). There is also a lot of room for improvement. Putting aside the fact that this wasn&#39;t meant to be a fully-featured dashboard, and is far better than nothing, let&#39;s run their status page through the &lt;/span&gt;&lt;a href=&quot;http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;rules for a successful public health dashboard&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; and see what we can advise for the next evolution of Facebook&#39;s transparency initiative:&lt;/span&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #1: Must show the current status for each &quot;service&quot; you offer&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Today the status page 
only gives the status of various services through plain text. For example, at the time of this writing, the &lt;/span&gt;&lt;i&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&quot;hello, active and total user counts are currently missing from both public profile pages and the API.&quot; &lt;/span&gt;&lt;/i&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;The two graphs to the right show API response time and error rate across all API functions, not per-API or per-function area. Showing a graph and/or status light for each API/function would add tremendous value for developers that use specific parts of the application and only need to know about those specific areas. It would also make it easier to automate functionality, and to decide which components can be relied on in your architecture.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendation: &lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;A graph and status light for each specific API function and end-point that developers may use. 
See &lt;/span&gt;&lt;a href=&quot;http://code.google.com/status/appengine&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Google&#39;s health dashboard&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; for ideas.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #2: Data must be accurate and timely&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;From the outside this appears to be solid. My big worry is that updates are currently very manual, which isn&#39;t going to scale. I haven&#39;t watched the site long enough to gauge how timely the updates are, but let&#39;s give them the benefit of the doubt. The main reason for this rule, requiring that your data be accurate and timely, comes down to trust. If your users get a hint of inaccuracy or delays in updates, they lose faith in the tool and stop using it. 
Your &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;users will resort back to emailing/tweeting/complaining, which&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; defeats the entire purpose.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendation&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Automate status updates as much as possible. Set up regular monitoring that posts status changes automatically. Create a formal process that requires someone to post a detailed update within a Minimum-Time-To-Communicate. 
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #3&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Must be easy to find&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;This may be the biggest problem today with Facebook&#39;s status page. I&#39;ve been collecting public health blogs/dashboards for a couple of years now, and I&#39;ve never come across it. 
Google&#39;ing for &quot;&lt;/span&gt;&lt;a href=&quot;http://www.google.com/search?sourceid=chrome&amp;amp;ie=UTF-8&amp;amp;q=facebook+uptime&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;facebook uptime&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&quot; or &quot;&lt;/span&gt;&lt;a href=&quot;http://www.google.com/search?sourceid=chrome&amp;amp;ie=UTF-8&amp;amp;q=facebook+status&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;facebook status&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&quot; does not help. There are &lt;/span&gt;&lt;a href=&quot;link:http://developers.facebook.com/live_status&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;over 100 links to the page&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;, but most are from deep within developer forums. 
If Facebook is serious about using transparency to their advantage, this page needs to be linked to from the first place that developers would go when they experience issues with the API.&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendation&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Link to the status page from &lt;/span&gt;&lt;a href=&quot;http://developers.facebook.com/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;here&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; and &lt;/span&gt;&lt;a href=&quot;http://forum.developers.facebook.com/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;here&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;. 
Not being a Facebook developer, I&#39;m not the best judge of this, but I&#39;m sure Facebook has plenty of data to figure this out.&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span style=&quot;font-weight: bold; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #4:&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; Must provide details for events in real time&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;We discussed this already, but it&#39;s very important, especially for API-based developers. The error rate graph is very useful for this, which appears to be real-time. 
I would do more with it.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendations&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Show error rate per API/function (including the types of issues seen), and show historical information to give an impression of what&#39;s &quot;normal&quot;. Developers mostly need to know who is at fault. If you simply let them know that something is up on your end, they&#39;ll feel a lot better and be able to go on with their day. 
See &lt;/span&gt;&lt;a href=&quot;http://trust.salesforce.com/trust/status/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;trust.salesforce.com&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; for an easy way to integrate basic updates into a dashboard (click on an error icon).&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #5&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Provide historical &lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;uptime&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; and performance data&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span 
class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Mostly lacking in this area. The graphs only go back to the start of the current day, and the text status-updates go back about 2 weeks. A historical perspective gives new developers a baseline to go by, and gives existing developers a chance to correlate issues they saw on their end.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendation&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: See &lt;/span&gt;&lt;a href=&quot;http://status.opensrs.com/&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;OpenSRS&#39;s dashboard&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; for a simple way to do historical uptime/performance by service/API. 
Clicking on the &quot;archive&quot; link shows you past updates for every service.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #6&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Provide a way to be notified of status changes&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Facebook is actually doing a great job here. They have both an RSS feed and an email option, which is extremely rare and extremely awesome. This lets developers receive pushed updates and integrate them into their own internal dashboards. 
Great job here.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendations&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: None!&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Rule #7&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Provide details on how the data is 
gathered&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;b&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Currently customers have no insight into how the API response/errors are measured, and what the policy around status updates is. Is it ad-hoc, is it comprehensive, is it automated? It&#39;s hard to rely on this data today without insight into those policies and processes.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Recommendation&lt;/span&gt;&lt;/b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;: Add an explanation to the bottom of the page, or as a link off of the page, going into some of these 
details. You don&#39;t have to reveal your special sauce, just give us confidence that we can rely on this data.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: 15px; line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Bonus&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;The list of top bugs along the left side is a GREAT idea. This takes transparency to another level, and I would highly encourage other sites to adopt this practice. Developers are the target audience for both health issues and outstanding bugs, so why not combine them &lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;(along with the &quot;Developer Updates&quot; feed) into a single dashboard? Brilliant.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;I like how the &quot;Current Status&quot; is broken out into a big yellow box at the top, making it clear what the situation is right now. 
This is much better than the default approach of showing the latest status as simply the top news item in the chronological list. A nice touch.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;Conclusion&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium;&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;The most important takeaway is that Facebook has taken the hardest step toward transparency: getting a status blog/dashboard online. If they were to implement some of the recommendations above, they would see more of the&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: medium; line-height: 19px; &quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt; &lt;/span&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/07/benefits-of-transparency.html&quot;&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;&lt;span class=&quot;Apple-style-span&quot; &gt;benefits that come with transparency&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class=&quot;Apple-style-span&quot; &gt;, and set a great example for other development platforms.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/5678245838894242717/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/07/facebook-and-transparency.html#comment-form' title='21 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5678245838894242717'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/5678245838894242717'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/07/facebook-and-transparency.html' title='Facebook and transparency'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>21</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-3473671177504590945</id><published>2010-07-01T15:35:00.000-07:00</published><updated>2010-07-02T21:21:22.763-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Benefits of Transparency</title><content type='html'>&lt;div&gt;I thought it would be helpful to consolidate a list of the primary benefits of web sites/services being transparent online. If there are any I missed, please leave a comment and I&#39;ll update the list:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-size: large;&quot;&gt;Benefits of Transparency (for online websites and services)&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;1. Build trust with your users&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;2. 
Increase loyalty, reduce churn&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;3. Improve perception of your reliability&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;4. Reduce support costs&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;5. Control the message&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;6. Gain a competitive advantage&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;7. More time to focus on the actual problem&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;8. Reduce stress&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;9. Learn&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;See below for more detail...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;1. Build trust with your users&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Your users have a pretty low bar for how they expect to be treated. They basically expect you to screw them, hide information from them, and do the bare minimum to take their money. 
If you do something good for them, something unexpected like proactively admitting that you have problems and showing your humanity, your users will develop a sense of trust for your service and your company. I believe that trust may be the most important asset you can earn on the web, especially if you deal with things that are really important to your customers (e.g. money, email, photos, etc.).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Example: If a car company does a recall as soon as there is a hint of a problem, you trust them a lot more than if they are forced to do a recall after a number of deaths. &lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The more times you are proactive and admit to problems before you are caught, the stronger the sense of trust gets. If you are instead forced to admit your problems, or your customers complain before you tell them that you are aware of the problem, the harder it gets to convince them that you know what you are doing and that you care about the quality of the service.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;2. Increase loyalty, reduce churn&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Your users don&#39;t expect you to be perfect. They will forgive you when you have a problem. But only if they feel that they can trust you, that you know what you are doing, and that things are improving. Your users will stick with you if they feel like you know what you are doing, that you feel their pain, that you are taking these issues seriously. Apologizing and explaining after the fact is much more difficult. 
It is hard to convince your customers that you know what you are doing and that you care about their issues if you avoid the problem, or worse, pretend that it doesn&#39;t exist.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;i&gt;Example: &lt;/i&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/04/atlassian-has-security-breach-responds.html&quot;&gt;&lt;i&gt;Atlassian&#39;s security breach&lt;/i&gt;&lt;/a&gt;&lt;i&gt; a few months ago...they could have lost a lot of concerned customers questioning their trustworthiness. Instead they increased loyalty and trust by being up front about the situation, explaining what they are doing about it, and improving for the future. If instead the issue had been exposed independently, they would have seen a mass exodus.&lt;/i&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A major downtime event is innately going to lead to unhappy customers. You may as well try to turn it around into something worthwhile, and try to keep as many customers as you can. A nice side benefit is that the more your users learn to trust you, the more loyal and forgiving they become. It&#39;s a powerful loop that you want to get on the right side of. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;3. Improve perception of your reliability&lt;/b&gt;&lt;/div&gt;&lt;div&gt;When users run into a problem with your service, whether it&#39;s their fault or yours, they&#39;ll often assume the fault is on your end. If you instead show them exactly when you are actually having problems, and if you do this reliably and consistently, they&#39;ll know when you really have problems, and end up seeing that you aren&#39;t down as often as they thought. 
It&#39;s ironic that the more open you are about how often you have a problem, the less often your users will think you really are down.&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Example: Consider a complex web application made up of many components, say using Google App Engine, the Foursquare API, and Google ads. You get alerted about a timeout issue...will you assume that Google is at fault, or one of the other components? A quick visit to &lt;/i&gt;&lt;a href=&quot;http://code.google.com/status/appengine&quot;&gt;&lt;i&gt;Google&#39;s public dashboard&lt;/i&gt;&lt;/a&gt;&lt;i&gt; would show you that they are perfectly fine, and that the problem lies with one of the other services (which need their own public dashboards). &lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;4. Reduce support costs&lt;/b&gt;&lt;/div&gt;&lt;div&gt;During a downtime incident your support department gets flooded with the same type of question...&quot;I&#39;m seeing a problem, what&#39;s going on?&quot; and &quot;Is the site down or is it just me?&quot;. If you can allow your customers to serve themselves, or make it easy for your support department to point complaints to a single succinct explanation, they can operate much more efficiently, and focus on higher level issues.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Also, a lot of times support doesn&#39;t even know what&#39;s going on during a downtime event, and having something to check themselves gives them more insight into the health of the system.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Example: Amazon Web Services barely has support. 
They have a &lt;/i&gt;&lt;a href=&quot;http://aws.amazon.com/premiumsupport/&quot;&gt;&lt;i&gt;paid support service&lt;/i&gt;&lt;/a&gt;&lt;i&gt;, and their &lt;/i&gt;&lt;a href=&quot;http://developer.amazonwebservices.com/connect/forumindex.jspa&quot;&gt;&lt;i&gt;forums&lt;/i&gt;&lt;/a&gt;&lt;i&gt;, but otherwise there is very little real-time support. They can do this because they have a &lt;/i&gt;&lt;a href=&quot;http://status.aws.amazon.com/&quot;&gt;&lt;i&gt;real-time public health dashboard&lt;/i&gt;&lt;/a&gt;&lt;i&gt; that addresses 90% of the questions users are going to have in their day-to-day use of the service.&lt;br /&gt;&lt;/i&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;5. Control the message&lt;/b&gt;&lt;/div&gt;&lt;div&gt;If you don&#39;t tell your users what&#39;s going on during an event, they are going to speculate and assume the worst. They&#39;ll assume you aren&#39;t aware of the problem, that it&#39;ll last a long time, and that you&#39;re not taking it seriously. Even a simple update telling users that you are aware of the problem and are working on it gives them confidence that this isn&#39;t going to be the end of the company, and that you feel their pain.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Example: Users of Twitter experience on-and-off issues, but they can always tell how healthy the service is as a whole by visiting their &lt;/i&gt;&lt;a href=&quot;http://dev.twitter.com/status&quot;&gt;&lt;i&gt;public dashboard&lt;/i&gt;&lt;/a&gt;&lt;i&gt; and &lt;/i&gt;&lt;a href=&quot;http://status.twitter.com/&quot;&gt;&lt;i&gt;status blog&lt;/i&gt;&lt;/a&gt;&lt;i&gt;. They don&#39;t have to wonder how far-reaching the downtime is, or how long it&#39;ll last.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;6. 
Gain a competitive advantage&lt;/b&gt;&lt;/div&gt;&lt;div&gt;All else being equal, when prospects are comparing your service to a competitor&#39;s, especially when your service is critical to their own life/business, being able to tell a story about being transparent and open is a powerful differentiator. It gives your prospect a feeling of control, that they won&#39;t be left in the dark when the sh** hits the fan and their boss is breathing down their neck.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;7. More time to focus on the actual problem&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Especially for a small company, you can spend more time resolving the issue and less time fielding calls/emails. The better your process, the less you have to worry about beyond fixing the actual problem.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;8. Reduce stress&lt;/b&gt;&lt;/div&gt;&lt;div&gt;With a defined process, ideally one that is procedural, you keep people from freaking out and having to scramble at the worst possible time. 
The last thing you want to be doing during a downtime event is figuring out who can say what, and how to actually contact your entire customer base about a potential problem.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;font-weight: normal; &quot;&gt;&lt;div&gt;&lt;b&gt;9. Learn&lt;/b&gt;&lt;/div&gt;&lt;div&gt;As Heather Leson noted in a comment on the original post, disasters are an opportunity to help both customers and company staff share in the learning process. The more open you are about your issues, the more opportunity you&#39;ll have to learn from customers who may have had similar experiences, and the more your customers will learn from yours. You aren&#39;t alone. Your customers have a vested interest in helping you succeed. You may be surprised by how forthcoming they are with advice and recommendations for your situation. Google App Engine ended up adding new features after a major downtime event, no doubt based on customer feedback. Amazon added their public health dashboard after one too many outages. 
As Heather put it, &quot;Mutual success is one of the cornerstones of open source/open web organizations.&quot;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/3473671177504590945/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/07/benefits-of-transparency.html#comment-form' title='31 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3473671177504590945'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3473671177504590945'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/07/benefits-of-transparency.html' title='Benefits of Transparency'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>31</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-715463700971039362</id><published>2010-06-30T16:55:00.001-07:00</published><updated>2010-06-30T16:59:25.841-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="amazon"/><category scheme="http://www.blogger.com/atom/ns#" term="downtime"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Quote in 
WSJ</title><content type='html'>&lt;div&gt;In &lt;a href=&quot;http://online.wsj.com/article/SB10001424052748704103904575337430326186048.html?KEYWORDS=amazon&quot;&gt;today&#39;s issue of the Wall Street Journal&lt;/a&gt;:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;i&gt;Lenny Rachitsky, the head of research and development for the website monitoring company Webmetrics.com, said companies can take advantage of unexpected outages by communicating with customers about what is going on—something Amazon didn&#39;t do during the outage, beyond its note to sellers. &quot;Customers don&#39;t expect you to be perfect, as long as they feel that they can trust you,&quot; he said. &quot;All it takes is to give your users some sense of control.&quot;&lt;/i&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;A similar sentiment was posited by &lt;a href=&quot;http://blogs.barrons.com/techtraderdaily/2010/06/30/update-amazon-back-to-normal-so-wheres-the-explanation/?utm_source=feedburner&amp;amp;utm_medium=feed&amp;amp;utm_campaign=Feed:+barrons/techtraderdaily/feed+(BARRONS.com+Blog:+Tech+Trader+Daily)&quot;&gt;Eric Savitz over at Barrons&lt;/a&gt;:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;i&gt;So, here’s the thing: it seems to me that Amazon actually made a bad situation worse by failing to communicate the details of the situation with its customers. My little post Tuesday afternoon on the technical troubles triggered 149 comments, and counting. The company’s customers did not like having the site go down, and even more, they did not like being left in the dark. And so far, the company still has not come clean on what went wrong. Some of the people who commented on my previous post were worried that their personal data might have been compromised. I have no real reason to think that was the case, but it certainly seems odd to me that Amazon has taken what appear to be a defensive and closed-mouth stance on an issue so basic to its customers: the ability to simply use the site. 
Jeff Bezos, your customers deserve better.&lt;/i&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/715463700971039362/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/06/quote-in-wsj.html#comment-form' title='48 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/715463700971039362'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/715463700971039362'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/06/quote-in-wsj.html' title='Quote in WSJ'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>48</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-1415271123813086474</id><published>2010-06-29T15:54:00.000-07:00</published><updated>2010-06-29T17:01:34.383-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="amazon"/><category scheme="http://www.blogger.com/atom/ns#" term="downtime"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><title type='text'>Amazon.com goes down, good case study of consumer-facing transparency (or lack thereof)</title><content type='html'>&lt;div style=&quot;text-align: left;&quot;&gt;One of the questions I received from the audience after &lt;a 
href=&quot;http://www.transparentuptime.com/2010/06/video-of-my-talk-upside-of-downtime-at.html&quot;&gt;my talk&lt;/a&gt; last week was about how B2C companies should handle downtime and transparency. Today we have a great case study, as &lt;a href=&quot;http://www.datacenterknowledge.com/archives/2010/06/29/performance-issues-for-amazon-com/&quot;&gt;Amazon.com was down/degraded for about three hours&lt;/a&gt;:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;i&gt;You often hear about Amazon Web Services having some downtime issues, but it’s rare to see Amazon.com itself have major issues. In fact, I can’t ever remember it happening the past couple of years. But that’s very much the case today as &lt;b&gt;for the past couple of hours the service has been switching back and forth between being totally down and being up&lt;/b&gt;, but showing no products. (&lt;a href=&quot;http://techcrunch.com/2010/06/29/amazon-down-2/&quot;&gt;source&lt;/a&gt;)&lt;/i&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; &quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; &quot;&gt;&lt;img src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6Aq8Iu53AQQE9tNl-eJYkPJJHPt77WC3M_bC8fztwYyRr7oYgkPsrvdl8f3FUhDBdBS5KddA0DPwBGqTEP4vJqk8hGgXHIUTc9CMMhzij8rgUBLI5Q8h6efXeeaJCgVFEanptGHqRXT4/s400/Twitter+_+dan+kagan_+Amazon+is+down.+I+cannot+....png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488339682386508434&quot; style=&quot;display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 111px; &quot; /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: rgb(0, 0, 238); 
-webkit-text-decorations-in-effect: underline; &quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; &quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;The telling quote, and impression that appears to be prevalent across Twitter and other blogs that have picked up the story is this:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;i&gt;Obviously, Twitter is abuzz about this — &lt;b&gt;though there’s no word from Amazon&lt;/b&gt; on Twitter yet about the downtime. Amazon Web Services, &lt;b&gt;meanwhile, all seem to be a go, according to their dashboard&lt;/b&gt;. The mobile apps on the iPhone, iPad and Android devices are sort of working, but it doesn’t appear you can go to actual product pages.&lt;/i&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;Let&#39;s think about this from the perspective of the customer. They visit Amazon.com and see this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpBS2lCPwl5wxrO5rLmrooRuIWIV988GY4Vq9JtZJPWLtrl1TvXoqyM-RCE7bUPbW2xO1DgXtZsJENHlv9CWePPmkJXT_BJqOButQx83E0pcWUfSVdYl2zMgIU9_0hmFmHlgOBTTqwPJo/s1600/Amazon.com_+Amazon.com.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 130px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpBS2lCPwl5wxrO5rLmrooRuIWIV988GY4Vq9JtZJPWLtrl1TvXoqyM-RCE7bUPbW2xO1DgXtZsJENHlv9CWePPmkJXT_BJqOButQx83E0pcWUfSVdYl2zMgIU9_0hmFmHlgOBTTqwPJo/s320/Amazon.com_+Amazon.com.png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488340158902695042&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;They wonder what&#39;s going on. They question whether something is wrong with their computer. 
If they are technical enough they may visit &lt;a href=&quot;https://twitter.com/amazon&quot;&gt;Amazon&#39;s Twitter account&lt;/a&gt; to see if there is anything going on (a whole lot of nothing):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUlfZOtHRWumVoxmZqcJ4lWq_u40dQ8W4ulVzZYa2J0iOj96Hql_Y9feNZEHOS1axquLSIlLkmixwXm9XCJrXr4fHZ4sMFU0fWkY575_yAkczYk4qdhUS5mofYlg9szHXJiGEglOYSAYI/s1600/Amazon+(amazon)+on+Twitter-1.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 291px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUlfZOtHRWumVoxmZqcJ4lWq_u40dQ8W4ulVzZYa2J0iOj96Hql_Y9feNZEHOS1axquLSIlLkmixwXm9XCJrXr4fHZ4sMFU0fWkY575_yAkczYk4qdhUS5mofYlg9szHXJiGEglOYSAYI/s400/Amazon+(amazon)+on+Twitter-1.png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488340753177787570&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Maybe the visitor is even more technical, and knows about the &lt;a href=&quot;http://status.aws.amazon.com/&quot;&gt;public health dashboard that Amazon offers&lt;/a&gt; for their AWS clients. 
Well, that again gives us the wrong impression (all green lights):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyUr3dbFkof_2Uf2f2buzY8guTaHn8HzsA6eRjUVCLx_fLL4EUQBijvPQfAZolKi9AMn9TGEOvJReB4VK5E-6HQ-WGenEkkzN2E9JN9KPvY_jI5uEvdtb_nQhRdl4OAAxkEscWg2Tdero/s1600/AWS+Service+Health+Dashboard+-+Jun+29,+2010.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 305px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyUr3dbFkof_2Uf2f2buzY8guTaHn8HzsA6eRjUVCLx_fLL4EUQBijvPQfAZolKi9AMn9TGEOvJReB4VK5E-6HQ-WGenEkkzN2E9JN9KPvY_jI5uEvdtb_nQhRdl4OAAxkEscWg2Tdero/s400/AWS+Service+Health+Dashboard+-+Jun+29,+2010.png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488341309135142658&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At this point the user is frustrated. She may hop on Twitter and search for something like &lt;a href=&quot;http://search.twitter.com/search?q=amazon+down&amp;amp;result_type=recent&quot;&gt;&quot;amazon down&quot;&lt;/a&gt;, which would show her that a lot of other people are also having the same problem. This would at least make her feel better. 
Otherwise she would be stuck, wondering what is going on, how long it&#39;ll last, and whether to try shopping someplace else.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZ_GaHgIgfrTY3vL-ReySOgJnW8Vlq9P8xy9hxK4wwMhO75Z7n1lI7f1CL5WFoRawIZKXMBN78jcB1uEL28PTfYFS-TTTG_LfWY11FN-Tsc6jKTpiFoVd4SsBVo1hFZHNoVTMAgjwK51M/s1600/Twitter+_+Rachel_+Amazon,+why+must+you+be+do+....png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 141px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZ_GaHgIgfrTY3vL-ReySOgJnW8Vlq9P8xy9hxK4wwMhO75Z7n1lI7f1CL5WFoRawIZKXMBN78jcB1uEL28PTfYFS-TTTG_LfWY11FN-Tsc6jKTpiFoVd4SsBVo1hFZHNoVTMAgjwK51M/s400/Twitter+_+Rachel_+Amazon,+why+must+you+be+do+....png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488342886319695874&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It turns out that Amazon did in fact put out an update about what was going on...in the well hidden &lt;a href=&quot;http://www.amazonsellercommunity.com/forums/thread.jspa?threadID=187232&amp;amp;tstart=0&quot;&gt;Amazon services seller forum&lt;/a&gt;:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_Ldk-ciwOICxhQ5JvPXw-Bg4ShX6A7qQx-JKvXQBb4tVs6l3J9UlxvB0K1RLARxn8DCWgoSiuyilnM-k6zT_rXtcCn8kubyl4dKlf6JfZFEL6yyQXWfXcHS1VKUcqm2ErYzpTZbZl4uw/s1600/Amazon+Seller+Community_+Amazon.com+Site+Latencies+...-1.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 161px;&quot; 
src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_Ldk-ciwOICxhQ5JvPXw-Bg4ShX6A7qQx-JKvXQBb4tVs6l3J9UlxvB0K1RLARxn8DCWgoSiuyilnM-k6zT_rXtcCn8kubyl4dKlf6JfZFEL6yyQXWfXcHS1VKUcqm2ErYzpTZbZl4uw/s400/Amazon+Seller+Community_+Amazon.com+Site+Latencies+...-1.png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488349353752547250&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Realistically, Amazon doesn&#39;t go down very often, and for most people this is more of an annoyance than anything. I don&#39;t see Amazon customers losing trust in Amazon as a result of this incident. As Jesse Robbins put it:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxoGjDLTR0zwsLLlbfd8ma-OQRpvpxhv94j55gaCj-zXYbU-HCJWp_Jr__pbleCFtiMxY4F-O4MSbQIU_bNJejKyLSVioqeG6OuXzirFW65wIGCs4FN93PGCrBkhu_6WWK0x2IyYWN4Ng/s1600/Twitter+_+Jesse+Robbins_+Amazon+operates+one+of+the+....png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 161px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxoGjDLTR0zwsLLlbfd8ma-OQRpvpxhv94j55gaCj-zXYbU-HCJWp_Jr__pbleCFtiMxY4F-O4MSbQIU_bNJejKyLSVioqeG6OuXzirFW65wIGCs4FN93PGCrBkhu_6WWK0x2IyYWN4Ng/s400/Twitter+_+Jesse+Robbins_+Amazon+operates+one+of+the+....png&quot; border=&quot;0&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5488344152418878178&quot; /&gt;&lt;/a&gt;The key here is that now Amazon has a lot less room for error. One more major downtime like this, especially within the year, will begin to eat away at the trust that customers have built up in the service. 
To be proactive in avoiding that problem, and to give themselves more room for error, I would strongly advise Amazon to do the following:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Put some sort of communication out within 24 hours acknowledging the issues.&lt;/li&gt;&lt;li&gt;Put out a &lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;detailed postmortem&lt;/a&gt;, explaining what happened, and what they are doing to improve for the future.&lt;/li&gt;&lt;li&gt;Improve your process around updating the public about amazon.com downtime. The Twitter account is a good start, and it&#39;s very promising that you put out a communication to the public. The problem is that your users saw nothing in the places they looked for updates, and very few of them would ever think to check the forum you posted to. I would launch a new public health dashboard focused on overall Amazon.com health (and make sure to host this outside of your infrastructure!), which would include the AWS health as a subset (or simply a link), along with other increasingly important elements of your company: Kindle download health, shipping health, etc.&lt;/li&gt;&lt;li&gt;Implement the improvements discussed in the postmortem. &lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Other takeaways&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;My sense is that transparency in the B2C world is rarely as critical as in B2B relationships. There are certainly cases where consumers are just as inconvenienced and frustrated when their services are down, but in terms of impact and revenue loss, the bar has to be much higher for B2B businesses. I also believe that consumers are much more forgiving of downtime, and won&#39;t require as much from a company when it goes down. 
This will change however as consumers become more dependent on the cloud for their everyday lives.&lt;/li&gt;&lt;li&gt;Amazon set the bar high for their AWS transparency. Users of those services automatically checked the existing communication channels, which is what you would want. Unfortunately Amazon did not set up a process to connect those two parts of the company.&lt;/li&gt;&lt;li&gt;This also exposed the problem with having different processes and tools for different parts of your organization. Ideally there would be a central place for status across the entire amazon.com property. It&#39;s understandable that AWS is doing things a bit differently, but the consequence as we saw was that users waste time looking at the wrong place. This is something &lt;a href=&quot;http://www.transparentuptime.com/2010/01/advice-for-rackspace-on-communicating.html&quot;&gt;Rackspace has trouble with as well&lt;/a&gt;.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/1415271123813086474/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/06/amazoncom-goes-down-good-case-study-of.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1415271123813086474'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1415271123813086474'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/06/amazoncom-goes-down-good-case-study-of.html' title='Amazon.com goes down, good case study of consumer-facing transparency (or lack thereof)'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6Aq8Iu53AQQE9tNl-eJYkPJJHPt77WC3M_bC8fztwYyRr7oYgkPsrvdl8f3FUhDBdBS5KddA0DPwBGqTEP4vJqk8hGgXHIUTc9CMMhzij8rgUBLI5Q8h6efXeeaJCgVFEanptGHqRXT4/s72-c/Twitter+_+dan+kagan_+Amazon+is+down.+I+cannot+....png" height="72" width="72"/><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-1094282040186053324</id><published>2010-06-28T08:53:00.000-07:00</published><updated>2010-06-28T09:11:42.278-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="downtime"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><category scheme="http://www.blogger.com/atom/ns#" term="velocity"/><title type='text'>Video of my talk (Upside of Downtime) at Velocity 2010</title><content type='html'>Video of my talk has been posted (below), though watching it and listening to myself feels pretty damn weird. I&#39;ve been blown away by the response I&#39;ve gotten to this talk. I know of a handful of companies circulating these slides/notes internally and working to make their companies more transparent. I&#39;ve personally heard from a number of people at the conference who were discussing the ideas with their coworkers and thinking about the best way to take action. 
Even Facebook (the example I used of how not to handle downtime) has found resonance with the talk, and pointed me to a little known &lt;a href=&quot;http://developers.facebook.com/live_status&quot;&gt;status page&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I&#39;m hoping to start a conversation around &lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s1600/Preparation+Framwork.png&quot;&gt;the framework&lt;/a&gt; and continue to evolve it. I&#39;m going to expand on the ideas in this blog, so if there is anything specific you would like me to explore (e.g. hard ROI, B2C examples, cultural differences, etc), please let me know.&lt;br /&gt;&lt;br /&gt;Enjoy the video:&lt;br /&gt;&lt;embed src=&quot;http://blip.tv/play/AYHpmmEC&quot; type=&quot;application/x-shockwave-flash&quot; width=&quot;480&quot; height=&quot;300&quot; allowscriptaccess=&quot;always&quot; allowfullscreen=&quot;true&quot;&gt;&lt;/embed&gt;&lt;br /&gt;&lt;br /&gt;The slides can be found here: &lt;a href=&quot;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&quot;&gt;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/1094282040186053324/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/06/video-of-my-talk-upside-of-downtime-at.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1094282040186053324'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1094282040186053324'/><link rel='alternate' type='text/html' 
href='http://www.transparentuptime.com/2010/06/video-of-my-talk-upside-of-downtime-at.html' title='Video of my talk (Upside of Downtime) at Velocity 2010'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-4018306656136264626</id><published>2010-06-23T16:00:00.000-07:00</published><updated>2010-06-23T17:01:37.682-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="downtime"/><category scheme="http://www.blogger.com/atom/ns#" term="operations"/><category scheme="http://www.blogger.com/atom/ns#" term="transparency"/><category scheme="http://www.blogger.com/atom/ns#" term="velocity"/><title type='text'>The Upside of Downtime (Velocity 2010)</title><content type='html'>Here is the full deck from my talk at Velocity, including two bonus sections at the end:&lt;br /&gt;&lt;div style=&quot;width: 425px;&quot; id=&quot;__ss_4564992&quot;&gt;&lt;strong style=&quot;display: block; margin: 12px 0pt 4px;&quot;&gt;&lt;a href=&quot;http://www.slideshare.net/lennysan/the-upside-of-downtime-velocity-2010-4564992&quot; title=&quot;The Upside of Downtime (Velocity 2010)&quot;&gt;The Upside of Downtime (Velocity 2010)&lt;/a&gt;&lt;/strong&gt;&lt;object id=&quot;__sse4564992&quot; width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;param name=&quot;movie&quot; value=&quot;http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=velocity-upsideofdowntime-100621102257-phpapp01&amp;amp;rel=0&amp;amp;stripped_title=the-upside-of-downtime-velocity-2010-4564992&quot;&gt;&lt;param 
name=&quot;allowFullScreen&quot; value=&quot;true&quot;&gt;&lt;param name=&quot;allowScriptAccess&quot; value=&quot;always&quot;&gt;&lt;embed name=&quot;__sse4564992&quot; src=&quot;http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=velocity-upsideofdowntime-100621102257-phpapp01&amp;amp;rel=0&amp;amp;stripped_title=the-upside-of-downtime-velocity-2010-4564992&quot; type=&quot;application/x-shockwave-flash&quot; allowscriptaccess=&quot;always&quot; allowfullscreen=&quot;true&quot; width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Also, here is the &quot;Upside of Downtime Framework&quot; cheat-sheet (click through to download):&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s1600/Preparation+Framwork.png&quot;&gt;&lt;img style=&quot;text-align: left; display: block; margin: 0px auto 10px; cursor: pointer; width: 400px; height: 300px;&quot; 
src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s400/Preparation+Framwork.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5485360703227431250&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;script src=&quot;http://b.scorecardresearch.com/beacon.js?c1=7&amp;amp;c2=7400849&amp;amp;c3=1&amp;amp;c4=&amp;amp;c5=&amp;amp;c6=&quot;&gt;&lt;/script&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/4018306656136264626/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/06/upside-of-downtime-velocity-2010.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4018306656136264626'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/4018306656136264626'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/06/upside-of-downtime-velocity-2010.html' title='The Upside of Downtime (Velocity 2010)'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4KpqGeunV5EBiBq0ClK_pFvqNF210S5yRzi6DVqhDc5j8X89iuLntE98gZY_alHEDSOzw34P7zdb8ywKvEercyLJptnESm1-4D69ExJN3S0j71bkmlGMpQC7DCuY1zC4Rpz9m55mNhjc/s72-c/Preparation+Framwork.png" height="72" width="72"/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-3109617042651510248</id><published>2010-06-21T15:17:00.000-07:00</published><updated>2010-06-21T15:27:41.650-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="velocity"/><title type='text'>See you at Velocity 2010!</title><content type='html'>&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100621-q25ubjidr44pwnfkwtqrhq8k8f.png&quot;&gt;&lt;img style=&quot;float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 239px; height: 245px;&quot; src=&quot;http://img.skitch.com/20100621-q25ubjidr44pwnfkwtqrhq8k8f.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;Tonight I leave for the (sold out) &lt;a href=&quot;http://en.oreilly.com/velocity2010&quot;&gt;O&#39;Reilly Velocity conference&lt;/a&gt; in Santa Clara, CA. I&#39;ll be presenting &lt;a href=&quot;http://en.oreilly.com/velocity2010/public/schedule/detail/12605&quot;&gt;&quot;The Upside of Downtime: How to Turn a Disaster into an Opportunity&quot;&lt;/a&gt; on Wednesday at 4:35pm. If you&#39;re a reader of this blog and are at the conference, I&#39;d love to meet up! Tweet me &lt;a href=&quot;http://twitter.com/lennysan&quot;&gt;@lennysan&lt;/a&gt; or simply leave a comment here.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As soon as my talk ends, I will be posting the full slide-deck right here on this blog. Stay tuned!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;P.S. 
If you&#39;re reading this post during my talk, here are some of the links I may or may not be referencing:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html&quot;&gt;7 Keys To a Successful Public Health Dashboard&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/03/guideline-for-postmortem-communication.html&quot;&gt;A Guideline for Postmortem Communication&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/03/google-app-engine-downtime-postmortem.html&quot;&gt;Google App Engine Downtime Postmortem&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2009/01/transparency-can-help-business.html&quot;&gt;How Transparency Can Help your Business&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.transparentuptime.com/2010/02/tao-of-web-performance-and-uptime.html&quot;&gt;The Tao of Web Performance and Uptime&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/3109617042651510248/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/06/see-you-at-velocity-2010.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3109617042651510248'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3109617042651510248'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/06/see-you-at-velocity-2010.html' title='See you at Velocity 2010!'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-1816025699605904344</id><published>2010-06-10T08:30:00.000-07:00</published><updated>2010-06-10T08:50:53.119-07:00</updated><title type='text'>Quick update (and Velocity preview)</title><content type='html'>Alas, this blog has been quiet for too long. My pathetic excuse is that I&#39;m channeling the efforts that would normally go to this blog into my &lt;a href=&quot;http://en.oreilly.com/velocity2010/public/schedule/detail/12605&quot;&gt;upcoming talk at Velocity&lt;/a&gt;. To make up for my negligence, here is a sneak peek at the talk:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100610-xi7gw99rej2828q9brftnkhe17.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px&quot; src=&quot;http://img.skitch.com/20100610-xi7gw99rej2828q9brftnkhe17.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100610-e6pkqkw8pyagpnq3b3n9ngafb4.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; &quot; src=&quot;http://img.skitch.com/20100610-e6pkqkw8pyagpnq3b3n9ngafb4.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; 
href=&quot;http://img.skitch.com/20100610-8ce6g26fnmp5t8ny2aa3sy862y.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px;&quot; src=&quot;http://img.skitch.com/20100610-8ce6g26fnmp5t8ny2aa3sy862y.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100610-fct4k2ii728hwcw1bjawi5ww5a.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px&quot; src=&quot;http://img.skitch.com/20100610-fct4k2ii728hwcw1bjawi5ww5a.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100610-nmj6qj7pj2btn59hn3ap7kn9ck.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px;&quot; src=&quot;http://img.skitch.com/20100610-nmj6qj7pj2btn59hn3ap7kn9ck.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://img.skitch.com/20100610-jah37fstix8kes9bcn1s3s4cat.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px&quot; src=&quot;http://img.skitch.com/20100610-jah37fstix8kes9bcn1s3s4cat.png&quot; border=&quot;0&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;I will post the entire slide deck here on the blog immediately following the talk. 
Stay tuned!</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/1816025699605904344/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/06/quick-update.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1816025699605904344'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/1816025699605904344'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/06/quick-update.html' title='Quick update (and Velocity preview)'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4209465858850350036.post-3981346927263099603</id><published>2010-04-30T08:45:00.000-07:00</published><updated>2010-04-30T08:48:14.686-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="performance"/><category scheme="http://www.blogger.com/atom/ns#" term="web"/><title type='text'>A proposal for new community focused on web performance</title><content type='html'>&lt;div&gt;I&#39;ve been really impressed with the StackExchange platform (&lt;a href=&quot;http://www.stackexchange.com/&quot;&gt;http://www.stackexchange.com&lt;/a&gt;,  made by the same people that run &lt;a href=&quot;http://stackoverflow.com/&quot;&gt;stackoverflow.com&lt;/a&gt;),  and I feel that it could be an extremely effective  platform to host a  web performance 
focused community. They built the platform from scratch to address the inherent flaws of regular threaded discussion boards (e.g. Yahoo forums, Google Groups, phpBB, vBulletin). More importantly, the platform walks the line between incentivizing quick answers (for immediate feedback) and keeping answers from becoming obsolete over time.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My hope is that this site becomes an evolving source of definitive answers on web performance best practices, tips, tool tricks, book recommendations, data exchange, etc.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The process to make this a reality is:&lt;/div&gt;&lt;div&gt;1. Submit a proposal for peer review.&lt;/div&gt;&lt;div&gt;2. If there is enough support (votes), it moves on to the next stage.&lt;/div&gt;&lt;div&gt;3. People who would like to participate in the community (and help manage it) sign up.&lt;/div&gt;&lt;div&gt;4. The details of the community get ironed out (moderators, name, tags, etc.).&lt;/div&gt;&lt;div&gt;5. It goes public.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I&#39;ve gone ahead and submitted the initial proposal (step 1):&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;a href=&quot;http://meta.stackexchange.com/questions/5821/proposal-for-stackexchange-site-focused-on-web-site-performance&quot;&gt;&lt;div&gt;http://meta.stackexchange.com/questions/5821/proposal-for-stackexchange-site-focused-on-web-site-performance&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/a&gt; &lt;div&gt;I&#39;m just here to get the ball rolling, but from here on out it&#39;s going to be all about the greater community. This next stage, where everyone votes on the proposals, is going to make or break the concept. It&#39;s already received a good number of votes, but it&#39;s going to take a lot more support to push it forward. 
&lt;b&gt;If you think  this has legs, and can see the value, vote it up!&lt;/b&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.transparentuptime.com/feeds/3981346927263099603/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.transparentuptime.com/2010/04/proposal-for-new-community-focused-on.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3981346927263099603'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4209465858850350036/posts/default/3981346927263099603'/><link rel='alternate' type='text/html' href='http://www.transparentuptime.com/2010/04/proposal-for-new-community-focused-on.html' title='A proposal for new community focused on web performance'/><author><name>Lenny Rachitsky</name><uri>http://www.blogger.com/profile/09606931718004405443</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='29' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibBCvU9PaIYkwUIqBnkhojHYLW7C0snHPTVlZdZY1C5WGxiLhByPHYscasT22NdwmjQfwLla19G3PgFcc7xphPatOJv7GzIwH_oo_CJAsQr0qZZRlZ1SPurOJiJlc-LO8/s220/Lenny-profile.jpg'/></author><thr:total>2</thr:total></entry></feed>