Cloud Developer Tips by Shlomo Swidler

Using the new awscli with Chef and OpsWorks

Shlomo Swidler — Tue, 05 Nov 2013 15:28:52 +0000

We used to go through so much trouble to manipulate Amazon Web Services resources from the command-line. There were separate AWS command-line tools for each service, and each needed to be downloaded and had unique configuration, and each had its own unique output format. Eric Hammond wrote in detail about this. With the new awscli tool by Mitch Garnaat (of python boto fame) and his team at AWS, we have a single tool that does it all, uniformly, and can be simply configured. Thanks, Amazon!

But what if you need to use the awscli from within a Chef recipe? How do you install and configure the awscli with Chef and OpsWorks? I had this very same question on some recent projects. Here’s how to do this easily.

The awscli cookbook

I created the awscli cookbook to install and configure the awscli command line tools. Add this cookbook to your deployment and include the awscli::default recipe in your run list, and watch it work.

This cookbook supports some common deployment scenarios:

Installing the awscli “plain vanilla”, during the Chef “execute” stage.
Installing the awscli during the Chef “compile” phase, so it can be called from within Chef.
Configuring the awscli’s AWS access credentials.
Configuring multiple configuration profiles.

It does not need to be run on an AWS instance – you can use this cookbook to install the awscli anywhere that you can run Chef.

See the cookbook’s README for more details.

Using awscli with IAM Roles

It’s not obvious from the awscli documentation, but when an instance has an associated IAM Role, the awscli will automatically read its credentials from the instance’s IAM metadata. This will often be the case when you are using OpsWorks to launch your instances. In these circumstances, you don’t need to configure any awscli access credentials. But, make sure you use IAM to grant the IAM Role permissions for the AWS API calls you’ll be making with the awscli.

Why use the awscli from within Chef?

Chef recipes are written in Ruby, and it’s easy to parse and manipulate JSON in Ruby. The awscli outputs its responses in JSON format – so it’s easy to parse those responses into your ruby code. The two make a very handy combination. For example, here is how to wait for a newly-created EBS volume to be available (so you can attach it to an instance without an error):

    require 'json'
    volume_id = "vol-12345678"
    region = "us-east-1"
    command = ""
    case node[:platform]
    when 'debian','ubuntu'
      command << "/usr/local/bin/"
    when 'redhat','centos','fedora','amazon','scientific'
      command << "/usr/bin/"
    end
    command << "aws --region #{region} ec2 describe-volumes --volume-ids #{volume_id} --output json"
    Chef::Log.info("Waiting for volume #{volume_id} to be available")
    loop do
      state=""
      shell = Mixlib::ShellOut.new("#{command} 2>&1")
      shell.run_command
      if !shell.exitstatus
        Chef::Log.fatal(output)
        raise "#{command} failed:" + shell.stdout
      end
      jdoc = JSON.parse(shell.stdout)
      vol_state = jdoc["Volumes"].first["State"]
      Chef::Log.debug("#{volume_id} is #{vol_state}")
      if vol_state=="available"
        break
      else
        Chef::Log.debug("Waiting 5 more seconds for volume #{volume_id} to be ready...")
        sleep 5
      end
    end

The last ten lines show how easy it is to access the awscli output within your ruby code (in your chef recipe).

Fragment of heretofore unknown Tractate of Babylonian Talmud discovered

Shlomo Swidler — Wed, 03 Jul 2013 22:06:04 +0000

Ancient wisdom apparently has much to offer modern cloud application architects. This fragment was discovered in a shadowy basement in the Tel Aviv area of Israel.

See a PDF of the fragment

This finding clearly shows that ancient cloud application architects in the great talmudic academies of Babylon struggled with the transition away from classic databases. At the time, apparently, a widely used solution was known as Urim veTumim (“oracle”). Yet this database was unsuited for reliable use in cloud applications, and the text explores the reasons behind that unsuitability.

Okay, here’s the real story: I created this for a client in 2011, and I was delighted to find it on my computer serendipitously today. It reflects the state of the art at the time. Translation into plain English:

1. Oracle RAC does not run on EC2

2. Achieving Oracle high availability on EC2 is a problem: there is no shared device, and relying on NFS is problematic.

3. The cloud frameworks (enStratus, etc.) do not currently support Oracle.

Do you have what it takes to build and run your own private cloud?

Shlomo Swidler — Mon, 24 Sep 2012 21:04:25 +0000

Building and operating a private cloud ourselves is simple, right? Give our experienced IT folks a budget for hardware and software, a clear mandate, and deadlines, and we’re likely to succeed. That’s what we’ve always done, and it’s worked out tolerably well. Private cloud isn’t so different from running our own virtual machine environment, and we do that already, so why wouldn’t we succeed? I hear this sentiment expressed often so it seems to be a widely held belief. But it’s wrong precisely 97.73 percent of the time – I measured. Unfortunately, too many organizations act on this mistaken belief and experience unnecessary hurt as a result.

My good friend Keith Hudgins made this point cogently in his presentation at the CloudConnect conference in Chicago. Keith is one of the few people who really knows what it takes to build and run a cloud: he was part of the core team with me building KT’s public cloud service in South Korea, and he is currently Senior Cloud Engineer at enStratus where he helps enterprises get their private clouds to work properly. Keith’s presentation was sobering for the audience members, several of whom said they plan to reevaluate their private cloud initiatives as a result.

If you want to succeed at your own private cloud you need to consider these four critical aspects: your users, your competition, your skills, and your technology. Know these four and you have a fighting chance to make it. If not, get help – or prepare for disappointment.

Know your users

Different users need different applications, and your cloud must be designed to support the required applications. Private clouds might have three different types of users:

Regular desktop users, the kind of user most likely found in your Finance, HR, Sales, and Marketing organizations, will need your cloud to support the applications your IT department currently supports already: CRM, accounting, ERP, and business applications your IT department is already familiar with.
Software R&D staff will need your cloud to support the development environment, tools, and deployment processes they’re already using. This group will also likely need training to properly architect and operate reliable applications in a distributed environment.
IT staff will need your cloud to support core capabilities such as backup, directory management, and file sharing.

Ask your users what applications they are using and what they need the cloud to support.

Know your competition

Odds are your users already use public cloud services, and – like it or not – your own private cloud will be competing against those public alternatives. How well do public cloud services meet your users’ needs?

How reliable is their service? How does this compare to your own historical reliability?
How easy are they to use? Do public cloud service customers need multiple levels of corporate approval to acquire resources, or is the ordering and fulfillment process seamless and extremely short – a few minutes at most from order to delivery? Can you deliver the same or better usability?
How good is their support? Are there multiple sources for expertise and communities where users can share solutions and experience?
How rich is their ecosystem? How easy is it to employ a vast variety of tools and seamlessly integrated value-added services?

Determining how well your users’ needs are met by existing public cloud alternatives will give you a good idea of what you need to provide in order to achieve successful adoption of your private cloud.

Know your skills

The skills required to build and operate a cloud are not taught in classes, they’re learned by experience. These include:

System administration. Okay, this one is taught in classes, but the issues you’ll experience in a cloud environment are much more complicated and at a larger scale than you will normally encounter elsewhere. Keith suggested that OpenStack and Eucalyptus will be a challenge for your team if they lack linux administration skills.
Storage management skills are critical because storage is the most complicated technical aspect of a cloud. Keith emphasized the importance of NAS and iSCSI setup and management experience.
Networking is a key element of the cloud and networking expertise must be a part of your core cloud engineering team.
Automation will dictate how quickly you can recover from problems, so you must be experienced in its use and integrating automation tools into your workflow. Identical inputs (that is, identical hardware revisions) must always yield identical outputs, and that is only possible at a large scale with automation. Keith pointed out that automation takes time to produce. Make sure your team is proficient with the automation tools that work best with your technology choices.

Get these skills if you don’t have them. They will influence your choice of hardware and software.

Know your technology

Cloud is powered by technology, so you’ll need expertise in at least the following four technical areas to build and run it yourself.

Cloud Management Software is the layer that provisions cloud guests and sits between the API and the hypervisor. If you can’t explain the differences between the four major cloud management software options – CloudStack, OpenStack, Eucalyptus, and OpenNebula, then you do not have the expertise to choose the right one for your needs.
Storage, Keith pointed out, is by far the most complicated technical aspect to get right in a cloud. Local guest storage is the simplest to build and operate, but limits the range of applications that can be easily supported on the cloud. Attached storage increases flexibility at the cost of complexity. And object storage requires lots and lots of disks. Your options will be influenced by your choice of cloud management software, and any change to the default configurations will require serious development and testing to get it right.
Hardware and the way you manage it is the most important choice you can make. Of all the components in a cloud the hardware is what is going to break, so the ease with which you can replace hardware will dictate the maintenance headache. Keith provided some hard-won tips: make sure your hardware vendor can supply identical hardware down to the BIOS revision, create (automated) procedures to verify or reject hardware as it is delivered, and diligently avoid hardware inconsistencies since they greatly increase the complexity of the endeavor.

Your choices in the above three technology categories should be guided by your needs, but remember: you can only choose wisely if you have appropriate expertise in these technologies.

Do you have what it takes?

The above four aspects are critical to successfully building and operating your own private cloud. There are other success factors – know your strategy, know your business model, and use effective project management – that are part of any successful technological change. But the above four are critical in any private cloud initiative. So, how do you rate on these aspects?

Odds are, you don’t have the expertise required to do it yourself.

Developing Means Delivering

Shlomo Swidler — Tue, 17 Jul 2012 20:39:19 +0000

An informal survey of my application developer friends revealed four main types of motivations driving developers. These four types of motivations are represented in this diagram:

Individual developers can be motivated more or less altruistically. Individual developers can also be focused on the external manifestations of success – such as appreciative customers – or on the internal manifestations – such as more power or money. The diagram above is not a judgement of “better” or “worse” motivations – it is simply meant to capture two personality factors and their expression among individual developers. Also note that developers may have different motivations at different times, or even simultaneously in combination.

Just as individual developers can have varying motivations, organizations can also be more or less altruistic, and focused more internally or externally. Internally focused organizations spend an undue proportion of their energy and resources doing work (or inventing work to do) that will not be visible to the outside world, while externally focused organizations measure their results based on their effect on the outside world – market share, profit, and customer satisfaction.

Where do you rate your organization on the above diagram? Most business leaders want organizations firmly motivated by providing value to the customer, and software businesses are no different. Developing software is all about delivering value. Your software development efforts can only provide value if they are successfully delivered to the customer – and in today’s “as-a-serivce” world, that value is constantly provided via the internet. This means that you should spend significant effort ensuring your customer can actually reach your service to receive the value. It means automating your deliveries. It means measuring and improving your delivery speed and success rates. It means involving your customer early on so that you know the value they seek and so you can provide it. It means making sure all your teams work together to get this to happen.

Because ultimately, it means a developer’s job is not finished until the customer has derived value.

Software Delivery Operations

Shlomo Swidler — Mon, 16 Jul 2012 21:42:56 +0000

Like companies producing any kind of offering, software companies require three elements in order to successfully deliver: knowing what to build, knowing how to build it, and knowing how to deliver it to the customer. When each of these three functions – research, engineering, and delivery – does its job independently, you get slow progress and uninspired, sub-par products. But when all three functions work together in a customer-centric fashion, you get excellence – in implementation, innovation – in concept, and world-class results – by delivering those two to the customer.

With the explosion of internet-based services in the past decade delivery has by necessity become an ongoing activity. And so in recent years much attention has been devoted to integrating development with technical operations – DevOps. DevOps integrates into a single team the skills of how to build the service and the skills of how to deliver it. But, as pointed out earlier, DevOps teams often lack the most critical element: the customer’s involvement. DevOps teams can build and deliver really fast, but don’t know what to build.

Integrating development with the customer-facing functions (marketing, sales, etc.) without technical operations results in an organization that knows what to build and how to build it, but doesn’t know how to deliver it to the customer. This organization is full of ideas – even innovative ideas – and implements them, but they don’t see the light of day. I call this a “feasibility practice” – it can show you what is feasible but it can’t deliver it to customers.

Teams that know what to build and how to deliver it to the customer, but not how to build it, also do not provide value to the customer. Such teams are close to the customer both on the inbound (understanding the customer’s needs) and the outbound (explaining the company’s offering) directions. Marketing sits squarely in this space: it knows what to build and how to deliver but can’t transform that knowledge into value for the customer.

As an executive leading a software delivery effort, make sure your organization occupies the very center so it can understand, build, and deliver the right value to the customer. The customer is here in the center as well: he participates in validating the ideas, critiquing the implementation, and accepting the delivery. It is only here that your organization can achieve excellence, innovation, and world-class results.

The missing layer

Shlomo Swidler — Thu, 12 Jul 2012 12:47:46 +0000

Traditional descriptions of cloud computing and the various cloud operating models – IaaS, PaaS, SaaS – focus on the locus of responsibility for various layers of the system: facilities, network, storage, servers, etc. But these descriptions typically omit a critical element. Can you spot it in the diagram below?

The missing layer is the only layer that really matters – the customer. Ensuring that your application can actually be consumed – that delivery can be successful – is a critical part of providing your value. Your application’s facilities, network, and storage may be running properly, but will your customers actually benefit? Without a properly staffed, trained, equipped, and managed operations team, your service won’t last long enough for customers to care. If your customer is important, then so is your operations team.

Poking Holes in CloudFront-Based Sites for Dynamic Content

Shlomo Swidler — Mon, 14 May 2012 18:21:57 +0000

As of Februrary 2011 AWS S3 has been able to serve static websites, giving you superior availability for unchanging (or seldom-changing) content. But most websites today are not static; dynamic elements drive essential features such as personalized pages, targeted advertisements, and shopping carts. Today’s release from AWS CloudFront: Support for Dynamic Content alleviates some of the challenge of running dynamic websites. You can now configure a custom set of URL patterns to always be passed through to the origin server. This allows you to “poke holes” in the CDN cache for providing dynamic content.

Some web sites, such as this one, appear to be static but are driven by dynamic code. WordPress renders each page on every request. Though excellent tools exist to provide caching for WordPress, these tools still require your web server to process WordPress’s PHP scripts. Heavy traffic or poor hosting choices can still overwhelm your web server.

Poking Holes

It’s relatively easy to configure your entire domain to be served from CloudFront. What do you need to think about when you poke holes in a CloudFront distribution? Here are two important items: admin pages and form actions.

Admin pages

The last thing you want is for your site’s control panel to be statically served. You need an accurate picture of the current situation in order to manage your site. In WordPress, this includes everything in the /wp-admin/* path as well as the /wp-login.php page.

Form actions

Your site most likely does something with the information people submit in forms – search with it, store it, or otherwise process it. If not, why collect it? In order to process the submitted information you need to handle it dynamically in your web application, and that means the submit action can’t lead to a static page. Make sure your form submission actions – such as search and feedback links – pass through to the webserver directly.

A great technique for feedback forms is to use WuFoo, where you can visually construct forms and integrate them into your website by simple Javascipt. This means that your page can remain static – the Javascript code dynamically inserts the form, and WuFoo handles the processing, stops the spam, and sends you the results via email.

When Content Isn’t So Dynamic

Sometimes content changes infrequently – for example, your favicon probably changes rarely. Blog posts, once written, seldom change. Serving these items from a CDN is still an effective way to reduce load on your webserver and reduce latency for your users. But when things do change – such as updated images, additional comments, or new posts, how can you use CloudFront to serve the new content? How can you make sure CloudFront works well with your updated content?

Object versioning

A common technique used to enable updating static objects is called object versioning. This means adding a version number to the file name, and updating the link to the file when a new version is available. This technique also allows an entire set of resources to be versioned at once, when you create a versioned directory name to hold the resources.

Object versioning works well with CloudFront. In fact, it is the recommended way to update static resources that change infrequently. The alternative method, invalidating objects, is more expensive and difficult to control.

Combining the Above Techniques

You can use a combination of the above techniques to create a low-latency service that caches sometimes-dynamic content. For example, a WordPress blog could be optimized by integrating these techniques into the WordPress engine, perhaps via a plugin. Here’s what you’d do:

Create a CloudFront distribution for the site, setting its custom origin to point to the webserver.
Poke holes in the distribution necessary for the admin, login, and forms pages.
Create new versions of pages, images, etc. when they change, and new versions of the pages that refer to them.

Even though WordPress generates each page via PHP, this collection of techniques allows the pages to be served via CloudFront and also be updated when changes occur. I don’t know of a plugin that combines all these techniques, but I suspect the good folks at W3-EDGE, producers of the W3 Total Cache performance optimization framework I mentioned above, are already working on it.

Scalability and HA Limitations of AWS Marketplace AMIs

Shlomo Swidler — Wed, 25 Apr 2012 14:54:24 +0000

Reading AWS’s recent announcement of the AWS Marketplace you would think that it provides a catalog of click-to-deploy, highly-available, scalable applications running on EC2. You’d be partially right: the applications available in the AWS Marketplace are deployable in only a few clicks. But highly-available and scalable services will be difficult to build using Marketplace images. Here’s why.

Essential Ingredients of HA and Scalability on AWS

AWS makes it easy to run scalable, HA applications via several features. Not all applications use all of these features, but it would be very difficult to provide scalable and highly available service without using at least one of these:

Elastic Load Balancing
Auto Scaling
Elastic Block Storage volumes

ELB and AutoScaling both enable horizontal scalability: spreading load and controlling deployment size via first-class-citizen tools integrated into the AWS environment. They also enable availability by providing an automated way to recover from the failure of individual instances. [Scalability and availability often move in lock-step; improving one usually improves the other.] EBS volumes provide improved data availability: data can be retrieved off of dying instances – and often are used in RAID configurations to improve write performance.

AWS Marketplace Limitations

The AWS Marketplace has limitations that cripple two of the above features, making highly available and scalable services much more difficult to provide.

Marketplace AMI instances cannot be added to an ELB

Update 17 May 2012: The Product Manager for AWS Marketplace informed me that AWS Marketplace instances are now capable of being used with ELB. This limitation no longer exists.

~~Try it. You’ll get this error message:~~

Error: InvalidInstance: ElasticLoadBalancing does not support the paid AMI or supported AMI of instance i-10abf677.

~~There is no mention of this limitation in the relevant ELB documentation.~~

This constraint severely limits horizontal scalability for Marketplace AMIs. Without ELB it’s difficult to share web traffic to multiple identically-configured instances of these AMIs. The AWS Marketplace offers several categories of AMIs, including Application Stacks (RoR, LAMP, etc.) and Application Servers (JBoss, WebSphere, etc.), that are typically deployed behind an ELB – but that won’t work with these Marketplace AMIs.

Root EBS volumes of Marketplace AMI instances cannot be mounted on non-root devices

Because all Marketplace AMIs are EBS-backed, you might think that there is a quick path to recover data if the instance dies unexpectedly: simply attach the root EBS volume to another device on another instance and get the data from there. But don’t rely on that – it won’t work. Here is what happens when you try to mount the root EBS volume from an instance of a Marketplace AMI on an another instance:

Failed to attach EBS volume 'New-Mongo-ROOT-VOLUME' (/dev/sdj) to 'New-Mongo' due to: OperationNotPermitted: 'vol-98c642f7' with Marketplace codes may not be attached as a secondary device.

This limitation is described here in AWS documentation:

If a volume has an AWS Marketplace product code:

The volume can only be attached to the root device of a stopped instance.

You must be subscribed to the AWS Marketplace code that is on the volume.

The configuration (instance type, operating system) of the instance must support that specific AWS Marketplace code. For example, you cannot take a volume from a Windows instance and attach it to a Linux instance.

AWS Marketplace product codes will be copied from the volume to the instance.

Closing a Licensing Loophole

Why did AWS place these constraints on using Marketplace-derived EBS volumes? To help Sellers keep control of the code they place into their AMI. Without the above limitations it’s simple for the purchaser of a Marketplace AMI to clone the root filesystem and create as many clones of that Marketplace-derived instance without necessarily being licensed to do so and without paying the premiums set by the Seller. It’s to close a licensing loophole.

AWS did a relatively thorough job of closing that hole. Here is a section of the current (25 April 2012) AWS overview of the EC2 EBS API and Command-Line Tools, with relevant Marketplace controls highlighted:

Command and API Action Description

ec2-create-volumeCreateVolume Creates a new Amazon EBS volume using the specified size or creates a new volume based on a previously created snapshot. Any AWS Marketplace product codes from the snapshot are propagated to the volume. For an overview of the AWS Marketplace, go to https://aws.amazon.com/marketplace/help/200900000. For details on how to use the AWS Marketplace, see AWS Marketplace.

ec2-attach-volumeAttachVolume Attaches the specified volume to a specified instance, exposing the volume using the specified device name. A volume can be attached to only a single instance at any time. The volume and instance must be in the same Availability Zone. The instance must be in the running or stoppedstate.

Note

If a volume has an AWS Marketplace product code:

The volume can only be attached to the root device of a stopped instance.

You must be subscribed to the AWS Marketplace code that is on the volume.

The configuration (instance type, operating system) of the instance must support that specific AWS Marketplace code. For example, you cannot take a volume from a Windows instance and attach it to a Linux instance.

AWS Marketplace product codes will be copied from the volume to the instance.

For an overview of the AWS Marketplace, go to https://aws.amazon.com/marketplace/help/200900000. For details on how to use the AWS Marketplace, see AWS Marketplace.

ec2-detach-volumeDetachVolume Detaches the specified volume from the instance it’s attached to. This action does not delete the volume. The volume can be attached to another instance and will have the same data as when it was detached. If the root volume is detached from an instance with an AWS Marketplace product code, then the AWS Marketplace product codes from that volume will no longer be associated with the instance.

ec2-create-snapshotCreateSnapshot Creates a snapshot of the volume you specify. After the snapshot is created, you can use it to create volumes that contain exactly the same data as the original volume. When a snapshot is created, any AWS Marketplace product codes from the volume will be propagated to the snapshot.

ec2-modify-snapshot-attributeModifySnapshotAttribute Modifies permissions for a snapshot (i.e., who can create volumes from the snapshot). You can specify one or more AWS accounts, or specify all to make the snapshot public.

Note

Snapshots with AWS Marketplace product codes cannot be made public.

The constraints above are meant to maintain the AWS Marketplace product code, the mechanism AWS uses to identify resources (AMIs, snapshots, volumes, and instances) that require Marketplace licensing integration. Note that not all AMIs in the AWS Marketplace have a product code – for example, the Amazon Linux AMI does not have one. AMIs that do not require licensing control (such as Amazon Linux, and Ubuntu without support) do not have AWS Marketplace product codes – but the rest do.

A Hole

There remains a hole in this lockdown scheme. Any instance whose kernel allows booting from a volume based on its volume label can be manipulated into booting from a secondary EBS volume. This requires root privileges on the instance. I have successfully booted an instance of the MongoDB AMI in the AWS Marketplace from a secondary EBS volume created from the Amazon Linux AMI. Anyone exploiting this hole can circumvent the product code lockdown.

Plugging the Hole

Sellers want these licensing controls and lockdowns. Here’s how:

Disable the root account.
Disable sudo.
Prevent user-data from being executed. On the Amazon Linux AMI and Ubuntu AMIs, user-data beginning with a hashbang is executed as root during the startup sequence.

Unfortunately these mitigations result in a crippled instance. Users won’t be able to mount EBS volumes – which requires root access – so data can’t be stored on EBS volumes for better recoverability.

Alternatively, you could develop your AWS Marketplace solutions as SaaS applications. For many potential Sellers this would be a long-term effort.

I’m still looking for good ways to enable scalability and HA of Marketplace AMIs. I welcome your suggestions.

Update 27 April 2012: Amazon Web Services PR has contacted me to say they are actively working on a fix for the ELB limitations, and are also working on removing the limitation related to mounting Marketplace-derived EBS volumes on secondary devices. I’ll update this article when this happens. In the meantime, AWS said that users who want to recover data from Marketplace-derived EBS volumes should reach out to AWS Support for help.

Update 17 May 2012: The Product Manager for AWS Marketplace informed me that AWS Marketplace instances are now capable of being used with ELB.

Ten^H^H^H Many Cloud App Design Patterns

Shlomo Swidler — Sun, 08 May 2011 21:54:45 +0000

Today I presented at the Enterprise Cloud Summit at the Interop conference. The talk was officially entitled Ten Cloud Design Patterns, but because my focus is on the application, I re-titled it. And I mention more than ten patterns, hence the final title Many Cloud App Design Patterns.

I explore a number of important issues for cloud applications. Application state and behavior must both be scaled. Availability is dependent on MTTR and MTTF – but one of them is much more important than the other in the cloud. Single Points of Failure are the nemesis of availability, and I walk through some typical patterns for reducing SPOFs in the cloud.

Hat tips to Jonas Bonér for his inspiring presentation Scalability, Availability, Stability Patterns and to George Reese for his blog The AWS Outage: The Cloud’s Shining Moment.

Here’s the presentation – your comments are welcome.

Many Cloud App Design Datterns

View more presentations from Shlomo Swidler.

Roundup: CloudConnect 2011 Platforms and Ecosystems BOF

Shlomo Swidler — Mon, 14 Mar 2011 16:44:47 +0000

The need for cloud provider price transparency. What is a workload and how to move it. “Open”ness and what it means for a cloud service. Various libraries, APIs, and SLAs. These are some of the engaging discussions that developed at the Platforms and Ecosystems “Birds of a Feather”/”Unconference”, held on Tuesday evening March 8th during the CloudConnect 2011 conference. What about the BOF worked? What didn’t? What should be done differently in the future? Below are some observations gathered from early feedback; please leave your comments, too.

Roundup

In true unconference form, the discussions reflected what was on the mind of the audience. Some were more focused than others, and some were more contentious than others. Each turn of the wheel brought a new combination of experts, topics, themes, and participants.

Provider transparency was a hot subject, both for IaaS services and for PaaS services. When you consume a utility, such as water, you have a means to independently verify your usage: you can look at the water meter. Or, if you don’t trust the supplier’s meter, you can install your own meter on your main intake line. But with cloud services there is no way to measure many kinds of usage that you pay for – you must trust the provider to bill you correctly. How many Machine Hours did it take to process my SimpleDB query, or CPU Usage for my Google App Engine request? Is that internal IP address I’m communicating with in my own availability zone (and therefore free to communicate with) or in a different zone (and therefore costs money)? Today, the user’s only option is to trust the provider. Furthermore, it would be useful if we had tools to help estimate the cost of a particular workload. We look forward to more transparency in the future.

As they rotated through the topics, two of the themes generated similar discussions: Workload Portability and Avoiding Vendor Lock-in. The themes are closely related, so this is not surprising. Lesson learned: next time use themes that are more orthogonal, to explore the ecosystem more thoroughly.

In total nine planned discussions took place over the 90 minutes. A few interesting breakaway conversations spun off as well, as people opted to explore other aspects separately from the main discussions. I think that’s great: it means we got people thinking and engaged, which was the goal.

Some points for improvement: The room was definitely too small and the acoustics lacking. We had a great turnout – over 130 people, despite competing with the OpenStack party – but the undersized room was very noisy and some of the conversations were difficult to follow. Next time: a bigger room. And more pizza: the pizza ran out during the first round of discussions.

Participants who joined after the BOF had kicked off told me they were confused about the format. It is not easy to join in the middle of this kind of format and know what’s going on. In fact, I spent most of the time orienting newcomers as they arrived. Lesson learned: next time show a slide explaining the format, and have it displayed prominently throughout the entire event for easy reference.

Overall the BOF was very successful: lots of smart people having interesting discussions in every corner of the room. Would you participate in another event of this type? Please leave a comment with your feedback.

Thanks

Many thanks to the moderators who conducted each discussion, and the experts who contributed their experience and ideas. These people are: Dan Koffler, Ian Rae, Steve Wylie, David Bernstein, Adrian Cole, Ryan Dunn, Bernard Golden, Alex Heneveld, Sam Johnston, David Kavanagh, Ben Kepes, Tony Lucas, David Pallman, Jason Read, Steve Riley, Andrew Shafer, Jeroen Tjepkema, and James Urquhart. Thanks also to Alistair Croll not only for chairing a great CloudConnect conference overall, but also for inspiring the format of this BOF.

And thanks to all the participants – we couldn’t have done it without you.