<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" gd:etag="W/&quot;CkUFR3Y-fip7ImA9WhRaE0U.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716</id><updated>2012-02-16T00:16:56.856-08:00</updated><category term="copper" /><category term="RiotCloud" /><category term="recipes" /><category term="news" /><title>GridCentric Blog</title><subtitle type="html">High-performance virtualization.</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://blog.gridcentriclabs.com/" /><link rel="next" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default?start-index=26&amp;max-results=25&amp;redirect=false&amp;v=2" /><author><name>David Scannell</name><uri>http://www.blogger.com/profile/14713329443907634842</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>26</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/gridcentric" /><feedburner:info uri="gridcentric" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" 
href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;Dk8HRXc7eCp7ImA9WhZVFUw.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-88633153206907523</id><published>2011-05-27T09:47:00.000-07:00</published><updated>2011-05-27T09:47:14.900-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-05-27T09:47:14.900-07:00</app:edited><title>Creating VMs with JavaScript and Node</title><content type="html">&lt;h2&gt;Seriously, JavaScript?&lt;/h2&gt;&lt;p&gt;&lt;a href="http://nodejs.org/"&gt;Node&lt;/a&gt; is a framework built around the speedy &lt;a href="http://code.google.com/p/v8/"&gt;V8&lt;/a&gt; JavaScript engine.  It enables you to use JavaScript to create back-end services for your web application (or any other application). I/O is based around the same event-driven model that is familiar to front-end JavaScript programmers. Putting aside any claims about the inherent performance benefits of that model, it's powerful to use the same language in all parts of a web application and I have no doubt that Node will grow to stand next to the big players (Rails, Django, Struts) in the world of web frameworks.&lt;/p&gt;&lt;p&gt;&lt;a href="http://gridcentric.com/products/copper"&gt;Copper&lt;/a&gt; is focused on enabling dynamic and flexible services on the back-end. We empower programmers and administrators with a programmatic model for scaling applications, embedded within the control flow of the applications. Given that Node is an up-and-coming web framework, a few weekends ago I embarked on the fun exercise of creating some GridCentric V8 bindings.&lt;/p&gt;&lt;p&gt;When I co-founded a systems company a couple years ago I would never have guessed that I'd write a line of JavaScript, but technology brings us to unexpected places.&lt;/p&gt;&lt;p&gt;This post has two simple goals.&lt;br /&gt;
&lt;ol&gt;&lt;li&gt;To enable &lt;a href="http://downloads.gridcentriclabs.com/doxygen/html/index.html"&gt;GridCentric API&lt;/a&gt; calls in Node applications.&lt;/li&gt;
&lt;li&gt;To use those bindings in a nifty demo, creating VMs on-demand to scale an application.&lt;/li&gt;
&lt;/ol&gt;&lt;/p&gt;&lt;p&gt;For a nifty demo, I've chosen to take the standard &lt;a href="http://chat.nodejs.org/"&gt;Node chat demo&lt;/a&gt; and make it &lt;b&gt;scale automatically&lt;/b&gt; as users join chat rooms, &lt;i&gt;using only about a hundred lines of JavaScript&lt;/i&gt;.&lt;/p&gt;&lt;h1&gt;&lt;a name="extension"&gt;&lt;/a&gt;Building the extension&lt;/h1&gt;&lt;p&gt;To skip the details of building the extension, click &lt;a href="#demo"&gt;here&lt;/a&gt; to jump straight to the demo.&lt;/p&gt;&lt;h3&gt;Starting points for V8&lt;/h3&gt;&lt;p&gt;If you want to write a Node extension, without a doubt the best place to start is the useful &lt;a href="https://www.cloudkick.com/blog/2010/aug/23/writing-nodejs-native-extensions/"&gt;blog post from Cloudkick&lt;/a&gt;. Figuring out where to go next once you've exhausted their example, however, is a bit tricky.  I found that the &lt;a href="http://code.google.com/apis/v8/embed.html"&gt;V8 embedder's guide&lt;/a&gt; was mostly useless, but maybe you'll have a different experience. I think the best way to learn more is by looking at the Node source and the source for other native extensions (a relatively complete list of extensions can be found &lt;a href="https://github.com/joyent/node/wiki/modules"&gt;here&lt;/a&gt;).&lt;/p&gt;&lt;h3&gt;Synchronous bindings&lt;/h3&gt;&lt;p&gt;Let's start with the easy stuff.&lt;/p&gt;&lt;p&gt;Simple, synchronous function bindings for Node are quite straightforward. Many of our API functions can be considered non-blocking (reading a value from the kernel, equivalent to a system call) so they are quite simple to wrap. For example, the C function below will return the current &lt;tt&gt;vmid&lt;/tt&gt;.&lt;br /&gt;
&lt;pre class="brush: c"&gt;#include &amp;lt;gridcentric/gc-guest.h&amp;gt;

int func() {
  return gc_vmid();
}
&lt;/pre&gt;In the JavaScript world, we really want this to look like:&lt;br /&gt;
&lt;pre class="brush: javascript"&gt;var gridcentric = require("gridcentric");
var vmid = gridcentric.vmid();
&lt;/pre&gt;To get those semantics, we can take the C example above and wrap it in some V8 voodoo.&lt;br /&gt;
&lt;pre class="brush: cpp"&gt;#include &amp;lt;node/v8.h&amp;gt;
#include &amp;lt;node/node.h&amp;gt;

#include &amp;lt;gridcentric/gc-guest.h&amp;gt;

using namespace node;
using namespace v8;

static Handle&amp;lt;Value&amp;gt; VmId(const Arguments&amp; args)
{
    HandleScope scope;
    Local&amp;lt;Integer&amp;gt; result = Integer::New(gc_vmid());
    return scope.Close(result);
}
&lt;/pre&gt;The V8 function is complete.&lt;/p&gt;&lt;p&gt;Before we have a usable extension, however, we must bind this function appropriately within the native module. When Node loads a native extension, it executes the &lt;tt&gt;init&lt;/tt&gt; symbol, passing in a variable representing the scope created for the module.  Using a &lt;tt&gt;FunctionTemplate&lt;/tt&gt; wrapper, we define an &lt;tt&gt;init&lt;/tt&gt;&amp;#8224; symbol that does the appropriate binding within the module.&lt;br /&gt;
&lt;pre class="brush: cpp"&gt;extern "C" {
  void init (Handle&amp;lt;Object&amp;gt; target)
  {
    Local&amp;lt;FunctionTemplate&amp;gt; vmid = FunctionTemplate::New(VmId);
    target-&amp;gt;Set(String::NewSymbol("vmid"), vmid-&amp;gt;GetFunction());
  }
}
&lt;/pre&gt;&amp;#8224; &lt;font size="-1"&gt;The &lt;tt&gt;init&lt;/tt&gt; function must be wrapped in an &lt;tt&gt;extern "C"&lt;/tt&gt; declaration to prevent &lt;tt&gt;g++&lt;/tt&gt; from name mangling.&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Almost done -- we only need the build script.  To build the module with our &lt;tt&gt;vmid&lt;/tt&gt; function, we first create a &lt;tt&gt;wscript&lt;/tt&gt; file (assuming that our source file is &lt;tt&gt;src/gridcentric.cc&lt;/tt&gt;) used by the Node build tool &lt;tt&gt;node-waf&lt;/tt&gt;.&lt;br /&gt;
&lt;pre class="brush: python"&gt;def set_options(opt):
  opt.tool_options("compiler_cxx")

def configure(conf):
  conf.check_tool("compiler_cxx")
  conf.check_tool("node_addon")

def build(bld):
  obj = bld.new_task_gen("cxx", "shlib", "node_addon")
  obj.cxxflags = ["-Wall"]
  obj.ldflags = ["-lgridcentric"]
  obj.target = "gridcentric"
  obj.source = "src/gridcentric.cc"
&lt;/pre&gt;Finally, we run &lt;tt&gt;node-waf configure &amp;&amp; node-waf build&lt;/tt&gt; to build our extension.&lt;/p&gt;&lt;p&gt;We now have a basic &lt;tt&gt;gridcentric&lt;/tt&gt; module, and the following code works.&lt;br /&gt;
&lt;pre class="brush: javascript"&gt;var gridcentric = require("./build/default/gridcentric");
var vmid = gridcentric.vmid();
console.log("My vmid is " + vmid);
&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;Now we can move on to adding some meat to the module.&lt;/p&gt;&lt;h3&gt;Understanding non-blocking operations&lt;/h3&gt;&lt;p&gt;Node's event-driven semantics require that any function doing significant work (e.g. I/O) be structured using callbacks. Much of the heavy lifting in a binding comes from the need to restructure calls to your library using the callback mechanism provided by Node.&lt;/p&gt;&lt;p&gt;This can be a bit of a pain, but fortunately many of the functions in our guest bindings were well-suited to the asynchronous callback style required by Node (and I think that it's generally not &lt;i&gt;too&lt;/i&gt; difficult to find a nice mapping).  For example, the request ticket operation may take a few hundred milliseconds to make the round-trip to the scheduler, allocate the requested resources and return the result.  Similarly, the clone operation may take seconds, but in a complex control flow you'll likely need to be doing other things during that time.&lt;/p&gt;&lt;p&gt;With our C bindings, the request ticket function call looks like:&lt;br /&gt;
&lt;pre class="brush: c"&gt;#include &amp;lt;gridcentric/gc-guest.h&amp;gt;
...
gc_uuid_t ticket;
if( gc_request_ticket(1, 1, 1, 1000, &amp;ticket) &lt; 0 ) {
   perror("Couldn't request ticket");
}
&lt;/pre&gt;
If we were to translate directly into JavaScript using a synchronous style this might look like:
&lt;pre class="brush: javascript"&gt;var gridcentric = require('gridcentric');
...
var ticket = gridcentric.request_ticket(1, 1, 1, 1000);
if( ticket ) {
  console.log("Successfully allocated ticket " + ticket + ".");
} else {
  console.log("Unable to allocate ticket in 1000 milliseconds.");
}
&lt;/pre&gt;But because the &lt;tt&gt;request_ticket&lt;/tt&gt; operation will &lt;b&gt;block&lt;/b&gt; up to 1000 milliseconds in this case, this function doesn't conform to the non-blocking semantics required by Node.&lt;/p&gt;&lt;p&gt;Instead, we must structure this function to use an asynchronous callback when the ticket request is completed, as follows:
&lt;pre class="brush: javascript"&gt;var gridcentric = require('gridcentric');
...
gridcentric.request_ticket(1, 1, 1, 1000, function(ticket) {
  if( ticket ) {
    console.log("Successfully allocated ticket " + ticket + ".");
  } else {
    console.log("Unable to allocate ticket in 1000 milliseconds.");
  }
});
&lt;/pre&gt;Notice that we pass in a function as the last parameter. This function will be called asynchronously with the return value of the &lt;tt&gt;request_ticket&lt;/tt&gt; function after it has completed. We don't have any guarantees about when that function will be executed.&lt;/p&gt;&lt;p&gt;Once this style is adopted for all functions, we can easily see how to chain operations using closures.  For example, in order to &lt;a href="http://blog.gridcentriclabs.com/2010/10/how-do-i-clone-thee-let-me-count-ways.html"&gt;fork()&lt;/a&gt; the VM we can extend the above:
&lt;pre class="brush: javascript"&gt;var gridcentric = require('gridcentric');
...
gridcentric.request_ticket(1, 1, 1, 1000, function(ticket) {
  if( ticket ) {
    console.log("Successfully allocated ticket " + ticket + ".");
    gridcentric.clone(ticket, function(vmid) {
       if( vmid &gt; 0 ) {
           console.log("On a clone VM.");
       } else if( vmid == 0 ) {
           console.log("Still on the master VM.");
       } else {
           console.log("Error during clone operation.");
       }
    });
  } else {
    console.log("Unable to allocate ticket in 1000 milliseconds.");
  }
});
&lt;/pre&gt;&lt;/p&gt;&lt;h3&gt;Implementing callbacks&lt;/h3&gt;&lt;p&gt;Before implementing these functions, notice that my simple &lt;tt&gt;vmid&lt;/tt&gt; example above did not take any arguments.  The first thing that I will do is define a number of preprocessor macros to sanity-check passed-in arguments.
&lt;pre class="brush: cpp"&gt;#define REQUIRE(I, ISTYPE, CASTTYPE, NAME)                     \
  if( args.Length() &lt;= (I) || !args[I]-&gt;Is##ISTYPE() )         \
    return ThrowException(Exception::TypeError(                \
      String::New("Argument " #I " must be a " #ISTYPE "."))); \
  Local&amp;lt;CASTTYPE&amp;gt; NAME = Local&amp;lt;CASTTYPE&amp;gt;::Cast(args[I]);

#define REQUIRE_STRING(I, NAME) \
        REQUIRE(I, String, String, NAME)
#define REQUIRE_INTEGER(I, NAME) \
        REQUIRE(I, Number, Integer, NAME)
#define REQUIRE_FUNCTION(I, NAME) \
        REQUIRE(I, Function, Function, NAME)
&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;We could now implement a no-op request ticket function that takes the appropriate arguments.
&lt;pre class="brush: cpp"&gt;static Handle&amp;lt;Value&amp;gt; RequestTicket(const Arguments&amp; args)
{
    REQUIRE_INTEGER(0, maxcpus);
    REQUIRE_INTEGER(1, minvms);
    REQUIRE_INTEGER(2, mincpuspervm);
    REQUIRE_INTEGER(3, timeout);
    REQUIRE_FUNCTION(4, cb);
    return Undefined();
}
&lt;/pre&gt;It sanity-checks its input.  Now it needs to &lt;b&gt;do&lt;/b&gt; something.&lt;/p&gt;&lt;p&gt;Node uses &lt;tt&gt;libeio&lt;/tt&gt; as the basis for its thread pool (which, assuming you are not working with raw file descriptors and sockets, you will likely be using). To use &lt;tt&gt;libeio&lt;/tt&gt;, you schedule two functions for future execution: one that does the work and one which will be called when the work is completed. The function called when the work is completed will be executed in the main thread, so it needs to be quick. You are also permitted to pass an &lt;a href="http://en.wikipedia.org/wiki/Opaque_pointer"&gt;opaque pointer&lt;/a&gt;, which will be (indirectly) passed to each of the two functions.&lt;/p&gt;&lt;p&gt;For our example below, we will first define a new class that we can use as an opaque pointer.  This class will hold all data related to the ticket request, the callback function passed in, and the return value to be given.  Since we're going to have three functions involved (the one called by V8, the one scheduled by &lt;tt&gt;libeio&lt;/tt&gt; and the one executed after the work is complete), this class will be used to pass shared information to each of them.&lt;/p&gt;&lt;p&gt;We will also declare two functions ahead of time that we will use for our &lt;tt&gt;libeio&lt;/tt&gt; work, &lt;tt&gt;EIO_RequestTicket&lt;/tt&gt; and &lt;tt&gt;EIO_Post&lt;/tt&gt;.
&lt;pre class="brush: cpp"&gt;class CallbackData
{
public:
    Handle&amp;lt;Value&amp;gt; This;      // The this scope we were called in.
    Persistent&amp;lt;Function&amp;gt; cb; // The callback function passed.
    Handle&amp;lt;Value&amp;gt; rval;      // The return value to be given.

    // The parameters required for request_ticket.
    int maxcpus;
    int minvms;
    int mincpuspervm;
    int timeout;
};

static int EIO_RequestTicket(eio_req* req);
static int EIO_Post(eio_req *req);
&lt;/pre&gt;We use the &lt;tt&gt;This&lt;/tt&gt; variable to track the scope, &lt;tt&gt;cb&lt;/tt&gt; to record the callback the user passes in and &lt;tt&gt;rval&lt;/tt&gt; to store the return value once the work is done.  The rest of the parameters are required for the actual ticket request.&lt;/p&gt;&lt;p&gt;Given these declarations, the actual &lt;tt&gt;RequestTicket&lt;/tt&gt; function is straightforward.
&lt;pre class="brush: cpp"&gt;static Handle&amp;lt;Value&amp;gt; RequestTicket(const Arguments&amp; args)
{
    REQUIRE_INTEGER(0, maxcpus);
    REQUIRE_INTEGER(1, minvms);
    REQUIRE_INTEGER(2, mincpuspervm);
    REQUIRE_INTEGER(3, timeout);
    REQUIRE_FUNCTION(4, cb);

    // Create the opaque pointer.
    CallbackData *data = new CallbackData();

    // Set the scope variable (in case it's needed).
    data-&amp;gt;This = args.This();

    // Set the parameters associated with the ticket request.
    data-&amp;gt;maxcpus = maxcpus-&amp;gt;Value();
    data-&amp;gt;minvms = minvms-&amp;gt;Value();
    data-&amp;gt;mincpuspervm = mincpuspervm-&amp;gt;Value();
    data-&amp;gt;timeout = timeout-&amp;gt;Value();

    // Save the passed callback.
    data-&amp;gt;cb = Persistent&amp;lt;Function&amp;gt;::New(cb);

    // Schedule the EIO functions to be run.
    eio_custom(EIO_RequestTicket, EIO_PRI_DEFAULT, EIO_Post, data);
    ev_ref(EV_DEFAULT_UC);

    return Undefined();
}
&lt;/pre&gt;As required, it doesn't do any real work. It allocates the opaque data pointer (the &lt;tt&gt;CallbackData&lt;/tt&gt; class we defined), schedules the work in the thread pool (the &lt;tt&gt;eio_custom&lt;/tt&gt; and &lt;tt&gt;ev_ref&lt;/tt&gt; calls), and returns &lt;tt&gt;Undefined()&lt;/tt&gt; immediately.&lt;/p&gt;&lt;p&gt;All that remains is for us to actually implement the missing functions.&lt;/p&gt;&lt;p&gt;The first, &lt;tt&gt;EIO_RequestTicket&lt;/tt&gt;, does the work required (calling &lt;tt&gt;gc_request_ticket&lt;/tt&gt;) and sets the return value (&lt;tt&gt;rval&lt;/tt&gt;) in the opaque pointer.
&lt;pre class="brush: cpp"&gt;static int EIO_RequestTicket(eio_req* req)
{
    CallbackData *data = static_cast&amp;lt;CallbackData*&amp;gt;(req-&amp;gt;data);
    gc_uuid_t uuid;

    if( gc_request_ticket(
            data-&amp;gt;maxcpus, data-&amp;gt;minvms, data-&amp;gt;mincpuspervm,
            data-&amp;gt;timeout, &amp;uuid) &amp;lt; 0 ) {
        // Set the result to undefined.
        data-&amp;gt;rval = Undefined();
    } else {
        // Save the resulting ticket as a string.
        data-&amp;gt;rval = String::New(uuid.value);
    }

    return 0;
}
&lt;/pre&gt;The second function takes the given opaque pointer, creates a V8 array using the return value and calls the callback function that was passed in as an argument.  This will also not block.
&lt;pre class="brush: cpp"&gt;static int EIO_Post(eio_req *req)
{
    CallbackData *data = static_cast&amp;lt;CallbackData*&amp;gt;(req-&amp;gt;data);
    ev_unref(EV_DEFAULT_UC);
    Local&amp;lt;Value&amp;gt; argv[1] = { *(data-&amp;gt;rval) };
    TryCatch try_catch;
    data-&amp;gt;cb-&amp;gt;Call(Context::GetCurrent()-&amp;gt;Global(), 1, argv);
    if (try_catch.HasCaught()) {
        FatalException(try_catch);
    }
    data-&amp;gt;cb.Dispose();
    delete data;
    return 0;
}
&lt;/pre&gt;That's it!  All that's left to do is to bind the &lt;tt&gt;RequestTicket&lt;/tt&gt; function appropriately within the extension (see the &lt;tt&gt;vmid&lt;/tt&gt; example above), and then our asynchronous request ticket function will be working like a charm.&lt;/p&gt;&lt;h3&gt;Wrapping objects&lt;/h3&gt;&lt;p&gt;Some of the GridCentric API functions return more complex structures.  Although I would recommend mapping values to V8 primitives wherever possible, the need may arise to return more complex JavaScript objects.&lt;/p&gt;&lt;p&gt;After negative experiences with wrapped objects in V8, I think that, unless you require complex interactions with the JavaScript world, you can return complex objects as simple JavaScript Objects (i.e., no custom prototype).  Below is my example for creating a &lt;tt&gt;TicketInfo&lt;/tt&gt; object.
&lt;pre class="brush: cpp"&gt;#define SET_VALUE(VAR, NAME, TYPE, VAL) \
    VAR-&gt;Set(String::New(NAME), TYPE::New(VAL))
#define SET_INTEGER(VAR, NAME, VAL) \
        SET_VALUE(VAR, NAME, Integer, VAL)
#define SET_STRING(VAR, NAME, VAL) \
        SET_VALUE(VAR, NAME, String, VAL)

class TicketInfo {
public:
    static Handle&amp;lt;Object&amp;gt; Create(gc_ticket_info_t info)
    {
        HandleScope scope;
        Local&amp;lt;Object&amp;gt; obj = Object::New();
        SET_STRING(obj, "id", info.id.value);
        SET_STRING(obj, "status",
           gc_ticket_status_string(info.status));
        SET_INTEGER(obj, "cpus", info.cpus);
        SET_INTEGER(obj, "vms", info.vms);
        SET_INTEGER(obj, "mincpuspervm", info.mincpuspervm);
        return scope.Close(obj);
    }
};
&lt;/pre&gt;&lt;/p&gt;&lt;h3&gt;Pre-processor tricks and gotchas&lt;/h3&gt;&lt;p&gt;If you look at the source for my extension on &lt;a href="http://code.gridcentric.ca/nodejs-bindings"&gt;bitbucket&lt;/a&gt;, you'll see that I didn't explicitly define separate classes and functions for each of the callbacks.  Due to the repetitive nature of the wrapping, I wrapped most of the callback code into hacky pre-processor macros.&lt;/p&gt;&lt;p&gt;I also encountered one annoying &lt;b&gt;gotcha&lt;/b&gt; while building the Node extension. During the clone operation, &lt;tt&gt;libgridcentric&lt;/tt&gt; executes the scripts at &lt;tt&gt;/etc/gridcentric/pre-clone&lt;/tt&gt; and &lt;tt&gt;/etc/gridcentric/post-clone&lt;/tt&gt;.  This execution is simple. Here's some pseudo-C.
&lt;pre class="brush: c"&gt;pid_t child = fork();
if( !child ) {
  exec(script);
} else {
  int rc = waitpid(child,...);
}
&lt;/pre&gt;When executed from within Node, the &lt;tt&gt;waitpid&lt;/tt&gt; fails with return value &lt;tt&gt;-1&lt;/tt&gt; and causes the clone operation to be aborted if there is an &lt;tt&gt;/etc/gridcentric/pre-clone&lt;/tt&gt; script. Why? Apparently, the &lt;tt&gt;waitpid&lt;/tt&gt; fails because a different part of Node gobbles up all child processes and their associated return values. Presumably this is to prevent zombie processes, but it's not a great solution. The workaround for the &lt;tt&gt;gridcentric&lt;/tt&gt; extension is to remove these scripts, but then you lose this functionality.&lt;/p&gt;&lt;h1&gt;&lt;a name="demo"&gt;&lt;/a&gt;
The application&lt;/h1&gt;&lt;p&gt;Enabling the &lt;a href="http://downloads.gridcentriclabs.com/doxygen/html/index.html"&gt;GridCentric API&lt;/a&gt; in an application running on our platform allows it to dynamically scale horizontally by requesting resources and cloning itself, much in the same way &lt;tt&gt;fork()&lt;/tt&gt; works in UNIX. The cloning operation is handled transparently from under the VM in seconds, with state magically propagated. With a Node application, our stack will look something like this.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://adin.scannell.ca/node-drawing.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="176" src="http://adin.scannell.ca/node-drawing.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;With the bindings I've just built, this operation in JavaScript looks like this.
&lt;pre class="brush: javascript"&gt;var gridcentric = require("gridcentric");
gridcentric.request_ticket(1, 1, 1, 1000, function(ticket) {
  if( ticket ) {
    gridcentric.clone(ticket, function(vmid) {
      if( vmid &lt; 0 ) {
        console.log("There was an error.");
      } else if( vmid == 0 ) {
        console.log("I'm on the original VM.");
      } else {
        console.log("I'm on a clone with id " + vmid + ".");
      }
    });
  } else {
    console.log("Unable to allocate resources.");
  }
});
&lt;/pre&gt;
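The nested callbacks above can also be folded into a single helper that reports the outcome through one error-first callback. What follows is only a sketch under stated assumptions: &lt;tt&gt;cloneSelf&lt;/tt&gt; and &lt;tt&gt;fakeGridcentric&lt;/tt&gt; are hypothetical names that are not part of the bindings, and the stand-in object merely mimics the &lt;tt&gt;request_ticket&lt;/tt&gt; and &lt;tt&gt;clone&lt;/tt&gt; callback semantics described in this post so that the snippet runs without a Copper installation.
&lt;pre class="brush: javascript"&gt;// Sketch only: cloneSelf and fakeGridcentric are hypothetical names.
// gc must expose request_ticket(maxcpus, minvms, mincpuspervm, timeout, cb)
// and clone(ticket, cb), with the callback semantics described in this post.
function cloneSelf(gc, opts, done) {
  gc.request_ticket(opts.maxcpus, opts.minvms, opts.mincpuspervm, opts.timeout,
    function(ticket) {
      if( !ticket ) {
        return done(new Error("Unable to allocate ticket."));
      }
      gc.clone(ticket, function(vmid) {
        // Math.sign folds the three documented outcomes into 1, 0 or -1;
        // anything else (such as a missing vmid) is treated as an error.
        switch( Math.sign(vmid) ) {
          case 1:  return done(null, { role: "clone", vmid: vmid });
          case 0:  return done(null, { role: "master", vmid: vmid });
          default: return done(new Error("Error during clone operation."));
        }
      });
    });
}

// Stand-in for the real module (an assumption, so the sketch runs anywhere).
// It always grants a ticket and reports that we are still on the master VM.
var fakeGridcentric = {
  request_ticket: function(maxcpus, minvms, mincpuspervm, timeout, cb) {
    setImmediate(function() { cb("example-ticket"); });
  },
  clone: function(ticket, cb) {
    setImmediate(function() { cb(0); });
  }
};

cloneSelf(fakeGridcentric,
  { maxcpus: 1, minvms: 1, mincpuspervm: 1, timeout: 1000 },
  function(err, result) {
    if( err ) {
      console.log(err.message);
    } else {
      console.log("Running as " + result.role + ".");
    }
  });
&lt;/pre&gt;Passing the module in as an argument keeps both failure paths (no ticket, failed clone) in one place and lets the helper be exercised against a stand-in before it is pointed at the real &lt;tt&gt;gridcentric&lt;/tt&gt; module.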
&lt;/p&gt;&lt;h3&gt;Service structure&lt;/h3&gt;&lt;p&gt;More logic is required to scale a service than simply cloning a VM. To scale any service horizontally, you'll need to implement some kind of proxy or load-balancing mechanism.&lt;/p&gt;&lt;p&gt;Using the completed bindings, I created an &lt;a href="http://code.gridcentric.ca/nodejs-bindings/src/77fca5f99728/demos/autoscale.js"&gt;autoscale.js&lt;/a&gt; module which turns the master VM into a proxy (based on &lt;a href="http://www.catonmat.net/http-proxy-in-nodejs"&gt;this&lt;/a&gt;) and routes
requests to clones which are created automatically. In other words, it turns a regular Node application into an auto-scaling service.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://adin.scannell.ca/node-scaling-drawing.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="179" src="http://adin.scannell.ca/node-scaling-drawing.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;In this case, we create a new VM for every two active users we have and don't synchronize state across different VMs (think of it as a Node chat roulette -- only with cloning VMs).&lt;/p&gt;&lt;p&gt;More specifically, the &lt;tt&gt;autoscale.js&lt;/tt&gt; module implements the following simple algorithm.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;Every second, we fetch the list of clone domains and store their IPs in a global array.&lt;/b&gt;&lt;br /&gt;
This information is used by the proxy to route incoming connections.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;If there is less than one clone for every two active connections, we create the appropriate number of clones.&lt;/b&gt;&lt;br /&gt;
Obviously, this is kind of a bold (ridiculous) metric for measuring load and scaling the system.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;When a new connection arrives, it is mapped to the latest clone VM.&lt;/b&gt;&lt;br /&gt;
We could use a number of more reasonable strategies here, such as round robin, least-loaded or random.  The latest-clone heuristic is actually quite silly, but allows for a deterministic demo.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Integration&lt;/h3&gt;&lt;p&gt;To leverage this service, I modified the chat demo to use the auto-scaling module.  This required adding the following lines at the bottom of the &lt;a href="https://github.com/ry/node_chat/blob/master/server.js"&gt;server.js&lt;/a&gt; file:
&lt;pre class="brush: javascript"&gt;setTimeout(function() {
  var as = require("./autoscale");
  as.setup(80, PORT);
}, 3000);
&lt;/pre&gt;&lt;font size="-1"&gt;I add the 3 second delay so that the service can get started before the first clone operation.&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The following video is a quick demo of the service.  I log on to the auto-scaling chat service with five different users.  Because the service has been configured to create new VMs for every two users, the five users in the demo are routed to three different VMs that are created on-demand, in the span of seconds.  Enable annotations for notes during the video.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;iframe title="YouTube video player" width="540" height="390" src="http://www.youtube.com/embed/Y5jCSKyjgcE" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;
&lt;/div&gt;&lt;h3&gt;Caveats&lt;/h3&gt;&lt;p&gt;Much like Node chat, the demo described here is not intended to be a serious service. Were you to add auto-scaling to a real application, you'd definitely need to do a better job of tracking active hosts, synchronizing necessary state across clones and handling errors in general.&lt;/p&gt;&lt;p&gt;It's worth pointing out, however, that this is miles easier with the &lt;tt&gt;gridcentric&lt;/tt&gt; extension than in the case where you have to provision new VMs from scratch.  Provisioning from scratch, you'll likely need to involve lots of languages and tools (init scripts, Chef or Puppet, configuration files, proxies, synchronization servers) before you even touch the application. The semantics of &lt;tt&gt;clone()&lt;/tt&gt; give the programmer a very powerful primitive on top of which they can build reliable distributed services. Plus, it's pretty awesome.&lt;/p&gt;&lt;h3&gt;Do-it-yourself&lt;/h3&gt;&lt;p&gt;If you have a &lt;a href="http://gridcentric.com/products/copper"&gt;Copper&lt;/a&gt; installation, feel free to install the bindings from &lt;a href="http://search.npmjs.org/#/gridcentric"&gt;NPM&lt;/a&gt; and try them out for yourself.  There are likely a few bugs, but I'd love to hear feedback or complaints. You will need to have the &lt;tt&gt;gc-guest-base&lt;/tt&gt; package installed in the VM to provide &lt;tt&gt;libgridcentric&lt;/tt&gt;; then the module can be installed simply:
&lt;pre class="brush: bash"&gt;$ npm install gridcentric
gridcentric@0.0.1 ./node_modules/gridcentric
&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;If you want to dig more into the bindings (or steal macros -- please go ahead), the full source is available &lt;a href="http://code.gridcentric.ca/nodejs-bindings"&gt;here&lt;/a&gt;.  This source also includes the simple &lt;a href="http://code.gridcentric.ca/nodejs-bindings/src/77fca5f99728/demos/autoscale.js"&gt;autoscale.js&lt;/a&gt; module used above.&lt;/p&gt;&lt;p&gt;Enjoy!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-88633153206907523?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/RzjcjQEq7oU" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/88633153206907523/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2011/05/creating-vms-with-javascript-and-node.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/88633153206907523?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/88633153206907523?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/RzjcjQEq7oU/creating-vms-with-javascript-and-node.html" title="Creating VMs with JavaScript and Node" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="32" height="32" src="http://4.bp.blogspot.com/-8P6v_oedKYM/TYwFBEi9dcI/AAAAAAAAAHY/HrMruNm5VZI/s1600/df9880db6b3fe373113598d001fc2438" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://img.youtube.com/vi/Y5jCSKyjgcE/default.jpg" height="72" width="72" 
/><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2011/05/creating-vms-with-javascript-and-node.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Dk4FRH04eip7ImA9WhZTE0U.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-8069940034944662012</id><published>2011-03-17T11:01:00.000-07:00</published><updated>2011-03-17T11:01:55.332-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-03-17T11:01:55.332-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>Infrastructure performance</title><content type="html">&lt;p&gt;There's a lot of &lt;a href="http://en.wikipedia.org/wiki/Fear,_uncertainty_and_doubt"&gt;FUD&lt;/a&gt; surrounding private cloud. Without getting in to &lt;a href="http://csrc.nist.gov/publications/drafts/800-145/Draft-SP-800-145_cloud-definition.pdf"&gt;definitions&lt;/a&gt;, I figured I would add a bit of noise and have some fun with infrastructure performance in the process.&lt;/p&gt;&lt;h2&gt;&lt;a href="http://www.youtube.com/watch?v=j2RJVtzlCF8"&gt;Cloud vs Claude&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I'm a virtualization guy -- I know that &lt;a href="http://www.youtube.com/watch?v=8UYa6gQC14o"&gt;cloud&lt;/a&gt; can be interpreted a lot of ways, but for now I'm using it to describe Infrastructure-as-a-Service (IaaS).  Public cloud means public infrastructure providers such as Amazon's EC2 or Rackspace's cloud.&lt;/p&gt;&lt;p&gt;From my perspective, public IaaS offerings fundamentally provide two forms of value:&lt;br /&gt;
&lt;ol&gt;&lt;li&gt;Outsourcing of infrastructure costs (hardware, management, etc.)&lt;/li&gt;
&lt;li&gt;Cloud technology as a platform (APIs, billing, image libraries, etc.)&lt;/li&gt;
&lt;/ol&gt;&lt;/p&gt;&lt;p&gt;It's clear that the value of the first item is fully realized by traditional (non-cloud) hosting and co-location services.  Yet, based on how these companies are scrambling to release cloud offerings, the cloud providers must be killing these shops in the market.  It follows that there &lt;em&gt;must&lt;/em&gt; be something more than just vapor to that second point.  Hourly billing alone is not driving people to choose cloud providers over traditional hosts.&lt;/p&gt;&lt;h2&gt;Private Claude&lt;/h2&gt;&lt;p&gt;That brings us to the definition of private cloud.  It's what you get when you have cloud &lt;i&gt;technology&lt;/i&gt; running on your own infrastructure -- no outsourcing of hardware. But &lt;a href="http://www.youtube.com/watch?v=GDeqc8sTLpc"&gt;WTF&lt;/a&gt; is cloud technology? While virtualizing your infrastructure buys you a lot of agility, there's definitely more to it than just virtualization. Here's my take.&lt;/p&gt;&lt;blockquote&gt;&lt;b&gt;The key difference between vanilla virtualization and cloud is &lt;i&gt;automation&lt;/i&gt;.&lt;/b&gt;&lt;/blockquote&gt;&lt;p&gt;Automation isn't just one thing -- there are lots of ways to automate.  Automated provisioning means that you can deploy virtual machines based on standard templates with the click of a button (or an API call).  Automated billing means that resources are accounted for and reports are automatically generated or fed back into a billing system.  Automated scaling means that services are architected to automatically grow and shrink their footprints over time, using infrastructure APIs.&lt;/p&gt;&lt;p&gt;To me, automation is a fundamental tenet of cloud technology.&lt;/p&gt;&lt;p&gt;So where does private cloud make sense? In environments where automation makes sense.&lt;/p&gt;&lt;p&gt;Automation sees return on investment when things change often: workloads, requirements, environments, usage.  These environments are very common, but they are not found in every business.  
Technical computing, content creation, service providers and hosting all may have demanding and unpredictable workloads but may also have strict data requirements (legal or bandwidth) that prevent them from using public infrastructure. Private cloud makes immediate sense here.&lt;/p&gt;&lt;p&gt;If you're not using hosted infrastructure today but are faced with dynamic workload and automation needs, then it's likely that cloud is a revolution that will change how you compute in your own datacenter.&lt;/p&gt;&lt;h3&gt;Where do we come in?&lt;/h3&gt;&lt;p&gt;Our technology transforms how applications interact with infrastructure, simplifying automated provisioning and scaling.&lt;/p&gt;&lt;p&gt;Instead of requiring virtual machine templates to be stored and maintained as complete disk images, &lt;a href="http://gridcentric.com/products/copper"&gt;Copper&lt;/a&gt; enables on-demand cloning of live, running virtual machines in real-time via a &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=GridCentric_API"&gt;powerful API&lt;/a&gt;.  One second you have a single server, the next you have dozens -- and each server is aware of its unique ID allowing complex, distributed services to be created easily.&lt;/p&gt;&lt;blockquote&gt;You provision virtual machines in seconds. &lt;b&gt;So what?&lt;/b&gt; I can deploy a new server on &lt;i&gt;&amp;lt;favorite&amp;nbsp;platform&amp;gt;&lt;/i&gt; in just a few minutes.  That's good enough, I've never needed anything better.&lt;/blockquote&gt;&lt;p&gt;The above remark is a classic (and ludicrous) response I occasionally get when I explain what our technology does.  It's a total straw man. That's good enough for what you do &lt;b&gt;today&lt;/b&gt; because you've never been able to do anything more. 
It's like saying that you'll never need a hover car, or that &lt;a href="http://www.google.com/search?q=640K+ought+to+be+enough+for+anybody"&gt;"640K [of RAM] ought to be enough for anybody"&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Taking minutes to provision an additional virtual machine means that you need to be able to &lt;b&gt;predict&lt;/b&gt; fluctuations in demand and provision in anticipation of load. It also means that if you want to run a large-scale parallel computation, one that could take as little as a few minutes given a few hundred machines, you'll probably wait many, many times longer than necessary: most of the time and cost of your analysis will go to deploying virtual machines instead of actually running code.&lt;/p&gt;&lt;p&gt;Being able to scale in seconds changes the game.  Software can predict and &lt;b&gt;react&lt;/b&gt; in response to demand, in real time.  That's the basis for our upcoming hosted cloud offering, &lt;a href="http://riotcloud.com"&gt;RiotCloud&lt;/a&gt;, and our powerful grid queueing solution &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=GCQ"&gt;GCQ&lt;/a&gt;.&lt;/p&gt;&lt;h2&gt;Taking it on the road&lt;/h2&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-ET7ESw4mF5U/TX57epV890I/AAAAAAAAAHM/HdOfETMnlvc/s1600/steve.png" imageanchor="1" style="clear:right; float:right; margin-left:1em; margin-bottom:1em"&gt;&lt;img border="0" height="320" width="187" src="http://3.bp.blogspot.com/-ET7ESw4mF5U/TX57epV890I/AAAAAAAAAHM/HdOfETMnlvc/s320/steve.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;If automation is a fundamental tenet of cloud technology, then measuring the speed at which infrastructure software reacts to programmatic automation is a logical next step. I'm a massive fan of &lt;a href="http://www.topgear.com"&gt;Top Gear&lt;/a&gt;, despite having a complete lack of interest in cars.  
After realizing that this was quite normal, I thought it would be fun to do a Top Gear tribute by boasting about our raw infrastructure performance.&lt;/p&gt;&lt;p&gt;The below tests were performed on our test &lt;strike&gt;track&lt;/strike&gt; cluster by our tame &lt;strike&gt;race driver&lt;/strike&gt; infrastructure programmer, &lt;strike&gt;the Stig&lt;/strike&gt; Steve, seen here.  Some say that Steve's brain, if hooked up to an IPv6 backbone, is capable of routing over 50GiB per second.&lt;/p&gt;&lt;h3&gt;Acceleration&lt;/h3&gt;&lt;p&gt;A fun test for infrastructure software is how quickly it allows an application to scale, which is tantamount to how fast new machines can be provisioned and booted. A faster time-to-scale means smoother upgrades, fewer resources wasted transferring state and an overall more dynamic infrastructure. For example, consider scaling an analysis application (like &lt;a href="http://blog.gridcentriclabs.com/2010/07/howto-build-hadoop-cluster-in-five.html"&gt;Hadoop&lt;/a&gt;) to perform a time-critical computation. Our software requires zero time to create and bundle virtual machine templates (it's not necessary), so for us this is simply a matter of how quickly we can scale a running virtual machine.  In fact, everything here is done &lt;em&gt;within&lt;/em&gt; the virtual machine being scaled, showing the power of our API.&lt;/p&gt;&lt;p&gt;Before Steve did his thing, we were lucky enough to have Jeremy Clarkson of Top Gear take our software for a spin and give his first impressions.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;iframe title="YouTube video player" width="540" height="390" src="http://www.youtube.com/embed/pqb06h6JkYA" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;&lt;br /&gt;
&lt;p&gt;&lt;em&gt;&lt;font size="-1"&gt;Don't worry: he's fine now, just a little shaken up.&lt;br /&gt;
Also, that's obviously not really Jeremy Clarkson.&lt;/font&gt;&lt;/em&gt;&lt;/p&gt;&lt;/div&gt;&lt;p&gt;In the tests performed by our tame infrastructure programmer, we found a blazing average time of just a little over 9 seconds to scale from a single virtual machine to sixty-one, making repurposing infrastructure a staggeringly fast operation.  Remember that when &lt;a href="http://gridcentric.com/products/copper"&gt;Copper&lt;/a&gt; clones virtual machines, they are already running with all the necessary application state -- 9 seconds is everything; there is no boot time.&lt;/p&gt;&lt;h3&gt;Cornering&lt;/h3&gt;&lt;p&gt;Although acceleration is important, a huge factor in how fast you can make it around the track is how well your machine can handle corners.  Application cornering is a lot like cornering in a race car: you need to slow down in one direction, then speed up in another.&lt;/p&gt;&lt;p&gt;The video below shows Steve apexing a nearly 180 degree corner.  Our infrastructure starts off running one application with over one hundred virtual machines, destroys them all, then scales a second application out to over one hundred virtual machines.  This process takes less than 20 seconds.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;iframe title="YouTube video player" width="540" height="390" src="http://www.youtube.com/embed/J6dRBYTrfCg" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;&lt;br /&gt;
&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;h3&gt;The fine print&lt;/h3&gt;&lt;p&gt;These demos show the power and speed of dynamic infrastructure, but I'm glossing over a few details. When you're scaling applications (in or out), they'd better be ready for it. Although a lot of applications can be adapted, there &lt;b&gt;are&lt;/b&gt; plenty of frameworks that fit the bill already, especially in spaces accustomed to dynamic workloads.&lt;/p&gt;&lt;p&gt;To give some concrete examples of what these applications might be, suppose that application one is for distributed data crunching (like SETI@Home or a batch processing system) and application two does distributed web load testing (like Selenium Grid).  In that case, cornering might consist of putting aside the opportunistic analysis that we have our infrastructure doing in order to load test the latest website release.&lt;/p&gt;&lt;p&gt;When provisioning is so simple and takes seconds, we can easily imagine building new infrastructures that react and scale like never before. Applications can scale in response to load or needs in real time. The example above, where infrastructure is taken over at the flick of a switch, could easily happen every night, every day, every hour or even every time some stupid developer (or umm... CTO) pushes code and breaks the build.&lt;/p&gt;&lt;h2&gt;That's it folks!&lt;/h2&gt;&lt;p&gt;In other news, we've recently released &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=ReleaseNotes"&gt;Copper 1.4.2&lt;/a&gt; and updated all the available &lt;a href="http://downloads.gridcentriclabs.com/guest-images"&gt;guest images&lt;/a&gt;, including Debian 6.0, Fedora 12, Fedora 13, and CentOS 5.5.  
Happy updating.&lt;/p&gt;&lt;p&gt;I also want to add that although we're excited to work every day at changing the world with innovative virtualization software, our team's thoughts and hopes are with those affected by the &lt;a href="http://www.google.com/crisisresponse/japanquake2011.html"&gt;tragedy in Japan&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-8069940034944662012?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/giKNde8ZsC0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/8069940034944662012/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2011/03/infrastructure-performance.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/8069940034944662012?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/8069940034944662012?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/giKNde8ZsC0/infrastructure-performance.html" title="Infrastructure performance" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/-ET7ESw4mF5U/TX57epV890I/AAAAAAAAAHM/HdOfETMnlvc/s72-c/steve.png" height="72" width="72" /><thr:total>1</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2011/03/infrastructure-performance.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;D0UHR308eip7ImA9Wx9XEUo.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-3622930876474706433</id><published>2011-01-04T11:13:00.000-08:00</published><updated>2011-01-04T13:07:16.372-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-01-04T13:07:16.372-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>How fast can you add more worker nodes?</title><content type="html">There is a very important, and often overlooked, question when it comes to the scalability of a system: how fast does the system scale? The main question is still how the system will scale in the first place. But in an age of Internet-powered applications that revolve around sudden, unexpected peaks of traffic, answering only that question is not good enough. The system doesn't just need to be able to scale; it needs to scale instantaneously.&lt;br /&gt;
&lt;br /&gt;
The producer-consumer programming model is a good example of a software design that has a very simple conceptual path to scaling. The model basically consists of three parts: A Producer that creates work, a Queue that stores a backlog of pending work the producer has created, and finally a Worker that takes the pending work from the queue and executes it.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_V1O61leJWWY/TSNSyzQqeTI/AAAAAAAAAD0/FxqqMC5IAjc/s1600/single_worker.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="62" src="http://1.bp.blogspot.com/_V1O61leJWWY/TSNSyzQqeTI/AAAAAAAAAD0/FxqqMC5IAjc/s320/single_worker.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
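The three parts map directly onto a few lines of code. Here is a minimal, single-process sketch of the model. It is only an illustration: it assumes Python 3, uses the standard library queue and threading modules as a stand-in for a real message broker, and the name run_pipeline is made up for this example.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: python"&gt;import queue
import threading

def run_pipeline(num_jobs, num_workers):
    """Push num_jobs through a queue drained by num_workers threads."""
    tasks = queue.Queue()
    done = []
    lock = threading.Lock()

    def worker():
        # The worker: take the next pending job and execute it.
        while True:
            job = tasks.get()
            if job is None:
                # Sentinel value: there is no more work for this worker.
                tasks.task_done()
                return
            with lock:
                done.append(job)
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The producer: create the work and place it on the queue.
    for job_id in range(num_jobs):
        tasks.put(job_id)
    # One sentinel per worker so that every thread exits.
    for _ in threads:
        tasks.put(None)

    for t in threads:
        t.join()
    return sorted(done)

&lt;/pre&gt;&lt;br /&gt;
Calling run_pipeline(20, 4) pushes twenty jobs through four workers. Changing the second argument is all it takes to add workers, which is what makes this model so easy to scale.&lt;br /&gt;
&lt;br /&gt;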
Take, for example, uploading a video to YouTube. The producer would be the web server that creates the work of "process this video file". A worker then takes the next video file to process and performs all the codec conversions, formatting, etc. on it. Once it is done, it picks the next waiting video file from the queue. Suppose a couple hundred people suddenly decide to upload their videos all at once. The work queue would get huge, and the system would start to get bogged down. However, more workers can be added to the system in order to scale it up to handle the increased load. Essentially, the system's throughput is proportional to the number of workers.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/TSNTG35dEaI/AAAAAAAAAD4/CHFykb6D_jA/s1600/two_workers.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="160" src="http://2.bp.blogspot.com/_V1O61leJWWY/TSNTG35dEaI/AAAAAAAAAD4/CHFykb6D_jA/s320/two_workers.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
Here is a quick video that demonstrates the scalable nature of the producer-consumer model. There is a producer node that creates jobs that each take a worker about 5 seconds to complete. The producer submits 2 jobs every second. This is obviously more than a single worker can handle, so more workers are added to the system to keep pace with the producer node and to work through the backlog of jobs.&lt;br /&gt;
&lt;br /&gt;
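The numbers in the demo make the sizing easy to work out by hand: each worker finishes one job every 5 seconds, so matching 2 jobs per second takes 2 x 5 = 10 workers. The same arithmetic as a small sketch (the function name workers_needed is only illustrative, it is not part of the demo code):&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: python"&gt;import math

def workers_needed(jobs_per_second, seconds_per_job):
    """Smallest number of workers whose combined throughput matches
    the rate at which the producer submits jobs."""
    # Each worker completes 1 / seconds_per_job jobs every second, so
    # dividing the arrival rate by that gives rate * seconds_per_job.
    return int(math.ceil(jobs_per_second * seconds_per_job))

# Figures from the demo: 2 jobs submitted per second, about 5 seconds each.
print(workers_needed(2, 5))

&lt;/pre&gt;&lt;br /&gt;
Ten workers only stop the queue from growing. Clearing a backlog that has already built up takes more than that.&lt;br /&gt;
&lt;br /&gt;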
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://i.ytimg.com/vi/pneU79njviQ/0.jpg" height="499" width="600"&gt;&lt;param name="movie" value="http://www.youtube.com/v/pneU79njviQ?f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata" /&gt;&lt;param name="bgcolor" value="#FFFFFF" /&gt;&lt;embed width="600" height="499" src="http://www.youtube.com/v/pneU79njviQ?f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata" type="application/x-shockwave-flash"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;
There are two main virtual machines in the video: a producer machine and a worker machine. The producer machine runs three things: a &lt;a href="http://www.djangoproject.com/"&gt;Django&lt;/a&gt; application, which is used to visualize the system; &lt;a href="http://www.rabbitmq.com/"&gt;RabbitMQ&lt;/a&gt;, an AMQP message broker that supports the producer-consumer model; and a simple shell loop that hits the Django application to submit jobs to the work queue. Here is the code snippet from the Django view.py that handles the job submission:&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: python"&gt;from django.shortcuts import render_to_response
from prdconapp.models import Job, Server

# pika is a simple AMQP library that is used to interact with the
# RabbitMQ instance.
import pika

def give(request, vm_id, num_jobs):
    """
    This gives jobs to the system. It will create num_jobs more
    jobs and submit them to the queue, and update the 
    visualization state with these jobs. The vm_id is just a 
    string to indicate what machine is giving these jobs.
    """
    
    # Connect to the RabbitMQ instance running on the local host and
    # declare the queue called "task_queue".
    connection = pika.AsyncoreConnection(
            pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()
    channel.queue_declare(queue='task_queue')
    
    # Submit the desired number of jobs to RabbitMQ.
    for i in range(int(num_jobs)):
        j = Job(owner=vm_id, status='pending')
        j.save()
        message = "Job id: " + str(j.id)
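        channel.basic_publish(
                exchange='',
                routing_key='task_queue',
                body=message,
                properties=pika.BasicProperties(
                    delivery_mode=2,  # make message persistent
                ))

    # job_dict() is not shown in this snippet; presumably it is a helper
    # in the same module that builds the template context.
    return render_to_response("index.html", job_dict())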

&lt;/pre&gt;&lt;br /&gt;
The worker virtual machine simply runs a single Python script that takes the next job from the RabbitMQ queue, updates the Django application visualization, runs the job, and then loops to take the next job. When the Copper platform &lt;i&gt;live-clones&lt;/i&gt; the virtual machine, this same Python script is already running in each of the cloned machines. This is how the new workers are able to connect to RabbitMQ and start pulling pending jobs off the queue. Here is the worker code:&lt;br /&gt;
&lt;pre class="brush: python"&gt;import time
import urllib2
import sys

# pika is a simple AMQP library that is used to interact with the
# RabbitMQ instance.
import pika

# The gridcentric library binding used to get some information
# about the virtual machine on which this script is running.
from gridcentric import guest as gc

# The host that has the RabbitMQ instance. It is also the host
# running the Django app we need to update.
host = '192.168.1.80'

# We use the gridcentric library to get the unique vmid of this
# machine. This is used to determine if we are a cloned virtual 
# machine, or the original master machine.
VMID = gc.vmid()

# Start a connection to the RabbitMQ server and listen to the 
# task_queue queue.
connection = pika.AsyncoreConnection(
        pika.ConnectionParameters(host=host),
        True,
        pika.connection.SimpleReconnectionStrategy())
channel = connection.channel()
channel.queue_declare(queue='task_queue')

print ' [*] Waiting for messages. To exit press CTRL+C'

# This is the call back function that gets called when this 
# worker receives a message from the server. It essentially 
# updates the visualizer, and runs the task.
def callback(ch, method, header, body):
    # Use the gridcentric library to get the IP address of 
    # this machine in the cluster's private virtual network. 
    # We will use this to uniquely identify this machine in
    # the visualization.
    ip = gc.my_ip()
    
    print " [%s] Received %r" % (ip,body,)
    job_id = body.split(":")[1].strip()
    try:
        # Update the visualizer to tell it that we have taken
        # the job and are going to be running it.
        urllib2.urlopen("http://%s:8000/prdcon/take/%s/%s" % 
                (host, ip, job_id) )

        # All jobs themselves are just sleep jobs. We could 
        # decode the message, and run the job, or better yet 
        # use something like Celery that does all of this for 
        # us. For the purposes of the demo, sleeping 5 seconds 
        # should be fine.
        time.sleep(5)

        # Update the visualizer to tell it that we have completed
        # the job.
        urllib2.urlopen("http://%s:8000/prdcon/done/%s/%s" % 
                (host, ip, job_id) )
        print " [x] Done"
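    except Exception as e:
        # If the visualizer cannot be reached, report it and move on
        # rather than failing silently. The message is still
        # acknowledged below.
        print " [!] Job %s failed: %r" % (job_id, e)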

    # Tell the RabbitMQ server that we have processed the message, 
    # and we are ready for another one.
    ch.basic_ack(delivery_tag = method.delivery_tag)

    # If we detect that our VMID has changed, we basically want 
    # to reconnect back to the RabbitMQ server because we are a 
    # new machine. This is a bit of a hack to get a demo worker, 
    # but basically the worker will exit. This is why we run the
    # worker in a while loop in the shell.
    if gc.vmid() != VMID:
        sys.exit(0)

# Tell the RabbitMQ server that we only want to receive a single 
# message at a time.
channel.basic_qos(prefetch_count=1)

# Register the callback function.
channel.basic_consume(callback,
                      queue='task_queue')

# Just keep looping and waiting for new messages to process.
pika.asyncore_loop()

&lt;/pre&gt;&lt;br /&gt;
The producer-consumer model is a straightforward design whose simple semantics make it easy to reason about how it will scale. But, as I mentioned at the beginning of this post, the mere fact that this model can scale is not good enough. The next, and almost equally important, question is: how fast does it scale?&lt;br /&gt;
&lt;br /&gt;
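To put a rough number on the cost of scaling slowly, consider how many jobs pile up while the extra workers are still being provisioned. The sketch below reuses the rates from the demo (2 jobs per second, about 5 seconds per job, one worker to start with). The provisioning delays are hypothetical values picked for illustration, not measurements, and jobs_delayed is an illustrative name.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: python"&gt;def jobs_delayed(provision_seconds, jobs_per_second, seconds_per_job,
                 workers_before):
    """Backlog built up while waiting for extra workers to arrive."""
    # Arrival rate minus what the existing workers can complete.
    shortfall = jobs_per_second - workers_before / float(seconds_per_job)
    return max(0.0, shortfall * provision_seconds)

# Hypothetical provisioning delays, one per order of magnitude.
delays = [("seconds", 10), ("minutes", 5 * 60), ("hours", 2 * 3600)]
for label, delay in delays:
    print("%-8s %8.0f jobs waiting" % (label, jobs_delayed(delay, 2, 5, 1)))

&lt;/pre&gt;&lt;br /&gt;
The backlog grows linearly with the provisioning delay, so every order of magnitude shaved off the time to add workers shrinks the pile of waiting jobs by the same factor.&lt;br /&gt;
&lt;br /&gt;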
Let's take a look at how long it has taken to scale the producer-consumer programming model over time:&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;&lt;b&gt;Days&lt;/b&gt;: In the 1980s, adding new workers to the system took on the order of days because it was a very manual process: physically setting up the computer and then configuring it by hand. Taking days to scale an application is simply not good enough.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Hours&lt;/b&gt;: In the 1990s, ghosting computer images enabled computers to be preconfigured; however, they still needed to be physically set up, and it was still a largely manual process. Scaling in hours is better, but it still leaves a lot of potential uncaptured.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Minutes&lt;/b&gt;: In the 2000s, virtualization and cloud computing started to become mainstream, removing much of the manual process of setting up new workers. This has helped enable the current explosion in Internet-based applications, but there is still a window of lost potential as the system scales up.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Seconds&lt;/b&gt;: &lt;a href="http://gridcentric.com/download"&gt;GridCentric's Copper platform&lt;/a&gt; can add functioning workers to the system an order of magnitude faster than everyone else. With delays of only seconds, the system becomes far more responsive to peaks in demand. The window of lost potential has gone from minutes down to &lt;i&gt;seconds&lt;/i&gt;. An added bonus is that it is often simpler to add workers using the Copper platform than using other platforms that can only accomplish the task in minutes.&lt;/li&gt;
&lt;/ul&gt;&lt;div&gt;Here is the video again, and as you watch it remember to ask yourself how fast can you add workers to your scalable system, and &lt;i&gt;what are you losing&lt;/i&gt; as you&amp;nbsp;scramble&amp;nbsp;to react to a temporary peak in demand?&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://i.ytimg.com/vi/pneU79njviQ/0.jpg" height="499" width="600"&gt;&lt;param name="movie" value="http://www.youtube.com/v/pneU79njviQ?f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata" /&gt;&lt;param name="bgcolor" value="#FFFFFF" /&gt;&lt;embed width="600" height="499" src="http://www.youtube.com/v/pneU79njviQ?f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata" type="application/x-shockwave-flash"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-3622930876474706433?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/I4P275VlXmI" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/3622930876474706433/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2011/01/how-fast-can-you-add-more-worker-nodes.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/3622930876474706433?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/3622930876474706433?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/I4P275VlXmI/how-fast-can-you-add-more-worker-nodes.html" title="How fast can you add more worker nodes?" 
/><author><name>David Scannell</name><uri>http://www.blogger.com/profile/14713329443907634842</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://1.bp.blogspot.com/_V1O61leJWWY/TSNSyzQqeTI/AAAAAAAAAD0/FxqqMC5IAjc/s72-c/single_worker.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2011/01/how-fast-can-you-add-more-worker-nodes.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEUERno5eyp7ImA9Wx9RGEs.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-2740583220349946781</id><published>2010-12-20T09:09:00.000-08:00</published><updated>2010-12-20T09:30:07.423-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-12-20T09:30:07.423-08:00</app:edited><title>Tis the season...</title><content type="html">We've been meaning for a while to put up some videos demoing how Copper can be used to implement load-testing environments with minimal setup.  
Well, here they are:&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;object width="640" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/gKNiGlxU3OI?fs=1&amp;amp;hl=en_US"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/gKNiGlxU3OI?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="320" height="192"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;object width="640" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/PXgnd2wd4wA?fs=1&amp;amp;hl=en_US"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/PXgnd2wd4wA?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="320" height="192"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;In retrospect, it's a simple use case that's easy enough to set up and demonstrate, and makes for some cool visuals... so it's about time we got around to it.&lt;br /&gt;&lt;br /&gt;There are two videos, one by me and one by Dave.  Each of them follows the same basic format: we start off with a single master VM running a simple hitcounter webapp, and then spawn clone VMs that run an HTTP client and ping the hitcounter. The hitcounter is observed via the webapp's main page, which updates itself dynamically.  My hitcounter's main page just counts the hits, while Dave's gets all fancy about it ;)&lt;br /&gt;&lt;br /&gt;The way our scripts get clones to do their work is also different. In my case, a script running on the master VM initiates the cloning, then branches into the code that runs the HTTP client.  
In Dave's case, he backgrounds a shell command which runs in an infinite loop pinging the hitcounter, and then just creates clones directly using the 'gc clone' command.  His background command continues executing on each clone.&lt;br /&gt;&lt;br /&gt;Also.. I know.. this demo is not actually a DDOS or a load test. However, throwing a couple of while loops around the http client would get the behaviour pretty close to it.. ;)  It's not particularly hard to topple a Django standalone server running in test mode.  Actually, in Dave's video you can see the server start struggling to fulfill requests towards the end..&lt;br /&gt;&lt;br /&gt;Anyway, if what you see intrigues you, feel free to &lt;a href="http://gridcentric.com/products/copper"&gt;download and play with Copper yourself&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Cheers and Happy Holidays&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-2740583220349946781?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/lNwvN3WbtZc" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/2740583220349946781/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/12/tis-season.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/2740583220349946781?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/2740583220349946781?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/lNwvN3WbtZc/tis-season.html" title="Tis the season..." 
/><author><name>Kannan Vijayan</name><uri>http://www.blogger.com/profile/01218320916390868729</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/12/tis-season.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DkABRHsyfyp7ImA9Wx5aFE8.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-1692942972235493943</id><published>2010-11-08T14:02:00.000-08:00</published><updated>2010-11-10T13:32:35.597-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-11-10T13:32:35.597-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><category scheme="http://www.blogger.com/atom/ns#" term="RiotCloud" /><title>Join The Revolution!</title><content type="html">&lt;span class="Apple-style-span" style="border-collapse: collapse; font-size: 13px;"&gt;&lt;div class="MsoNormal" style="margin: 0px; text-align: justify;"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://riotcloud.com/" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="76" src="http://1.bp.blogspot.com/_V1O61leJWWY/TNsOFpVhTaI/AAAAAAAAADs/W-fdHGCL1BE/s320/riotcloud.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class="Apple-style-span"&gt;Scaling your services can be a pain -- a big one. Using conventional cloud hosting is easier than building and managing your own infrastructure, but service scaling is still painful. At &lt;a href="http://www.gridcentriclabs.com/"&gt;GridCentric&lt;/a&gt;, we think service and infrastructure scaling should be easy and go hand-in-hand.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span class="Apple-style-span"&gt;For example, c&lt;span style="border-collapse: collapse; font-size: 13px;"&gt;urrent automated cloud scaling typically includes learning new and complex APIs along with specifically prepared server images&lt;/span&gt;. There are third party services you can work with – some really good ones – but the increased cost and dependence doesn’t really make things any easier to swallow. For everybody that finds this as frustrating as we do – we’re proud to announce &lt;a href="http://riotcloud.com/"&gt;&lt;b&gt;RiotCloud&lt;/b&gt;&lt;/a&gt;: an easy-to-use, managed hosting service that makes automated scaling trivial.&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;&lt;div class="im"&gt;&lt;span class="Apple-style-span"&gt;RiotCloud is designed for simple deployment – users won’t need to learn new APIs, languages or have to manage/configure images; they just rent a server, change a few settings, and work away. The servers will scale automatically with the same application, run time and configuration states without you having to update or manage the system. In fact, the only thing missing is the administrative complexity. RiotCloud employs the revolutionary &lt;i&gt;live-cloning &lt;/i&gt;technology of &lt;a href="http://www.gridcentriclabs.com/products/copper/"&gt;&lt;b&gt;Copper&lt;/b&gt;&lt;/a&gt; to ensure consistent server state as you scale, exactly as you expect.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span class="Apple-style-span"&gt; The entire team is excited and prepping for launch! How about you? Comment and let us know your thoughts/feedback – we’d love to hear them!&amp;nbsp;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span class="Apple-style-span"&gt; &lt;b&gt;Join the Revolution: &lt;/b&gt;&lt;span class="Apple-style-span"&gt;&lt;a href="http://riotcloud.com/"&gt;http://riotcloud.com/&lt;/a&gt;&lt;/span&gt; - Register now for a free $50 credit!&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-1692942972235493943?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/fUm0M5V24gw" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/1692942972235493943/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/11/join-revolution.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/1692942972235493943?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/1692942972235493943?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/fUm0M5V24gw/join-revolution.html" title="Join The Revolution!" 
/><author><name>Karthik</name><uri>http://www.blogger.com/profile/15685561700677946158</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://1.bp.blogspot.com/_V1O61leJWWY/TNsOFpVhTaI/AAAAAAAAADs/W-fdHGCL1BE/s72-c/riotcloud.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/11/join-revolution.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DkcMSXg_fip7ImA9Wx5VE00.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-6571350489400123042</id><published>2010-10-05T11:34:00.000-07:00</published><updated>2010-10-05T11:41:28.646-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-05T11:41:28.646-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>How do I clone thee? Let me count the ways...</title><content type="html">Dynamically provisioning stateful VMs with &lt;a href="http://gridcentriclabs.com/products/copper"&gt;Copper &lt;/a&gt;is fast and easy.  It occurred to me that we don't often show how fast and easy it is in ways other than using the shell (because it's simple and accessible).  Not everyone's workflow is bash-based, so I thought I would show a few examples of our API used with different language bindings.&lt;br /&gt;
&lt;br /&gt;
Our API is powerful and fully featured, allowing for synchronization, resource reservations, and the equivalent of process control for VMs.  For the purpose of these examples, I've created four extremely simple "VM fork" programs below.  Each one duplicates the running VM (which takes seconds with &lt;a href="http://gridcentriclabs.com/products/copper"&gt;Copper&lt;/a&gt;) and prints the VM's id, à la &lt;a href="http://linux.die.net/man/2/fork"&gt;fork&lt;/a&gt;().&lt;br /&gt;
&lt;br /&gt;
&lt;h2&gt;C/C++&lt;/h2&gt;&lt;br /&gt;
&lt;pre class="brush: c"&gt;/* Must be linked with -lgridcentric */
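#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;gridcentric/gc-guest.h&amp;gt;

int main(int argc, char** argv) {
    gc_uuid_t ticket;
    int vmid;
    /* Request a ticket (a resource reservation), then clone this VM. */
    gc_request_ticket(1, 1, 1, 1000, &amp;ticket);
    vmid = gc_clone(ticket);
    printf("%d\n", vmid);
    return 0;
}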
&lt;/pre&gt;&lt;br /&gt;
&lt;h2&gt;bash&lt;/h2&gt;&lt;br /&gt;
&lt;pre class="brush: bash"&gt;#!/bin/bash
ticket=`gc rt 1 1 1 1000 | awk '{print $2;}'`
gc clone $ticket # Could have pulled the vmid from here.
vmid=`gc vmid`   # But this is much simpler.
echo $vmid
&lt;/pre&gt;&lt;br /&gt;
&lt;h2&gt;python&lt;/h2&gt;&lt;br /&gt;
&lt;pre class="brush: python"&gt;#!/usr/bin/env python
import gridcentric.guest

ticket = gridcentric.guest.request_ticket(1,1,1,1000)
vmid = gridcentric.guest.clone(ticket)
print vmid
&lt;/pre&gt;&lt;br /&gt;
&lt;h2&gt;java&lt;/h2&gt;&lt;br /&gt;
&lt;pre class="brush: java"&gt;// Be sure to include '/usr/share/gridcentric/gridcentric.jar'
// in the classpath when running this program.
import ca.gridcentric.guest.API;

class VMFork {
    public static void main(String[] args) {
        API api = new API(); 
        API.Uuid ticket = api.requestTicket(1,1,1,1000);
        int vmid = api.clone(ticket);
        System.out.println("" + vmid);
    }
}
&lt;/pre&gt;&lt;br /&gt;
Got it?  These programs all implement &lt;a href="http://linux.die.net/man/2/fork"&gt;fork&lt;/a&gt;() for VMs.&lt;br /&gt;
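&lt;br /&gt;
If you haven't used the process-level fork() that these programs mirror, here is a minimal sketch of it in Python. Note that this is ordinary POSIX fork via the standard os module, not the Copper API, and the function name fork_and_report is made up for illustration.&lt;br /&gt;

```python
import os


def fork_and_report():
    """Fork once and return (parent_pid, child_pid, pid_reported_by_child)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: execution resumes here with a copy of the parent's state.
        os.close(read_fd)
        os.write(write_fd, str(os.getpid()).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: fork() returned the child's pid.
    os.close(write_fd)
    reported = int(os.read(read_fd, 64).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return os.getpid(), pid, reported


if __name__ == "__main__":
    print(fork_and_report())
```

Both processes carry on from the fork() call and each one sees a different id, which is the behaviour the VM versions above reproduce at the level of whole machines.&lt;br /&gt;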
&lt;br /&gt;
The result of running any of these programs is the same: two nearly-identical VMs, printing our two different vmid values.  The operation takes seconds, and can easily be incorporated into complex workflows.  It's pretty nuts. Of course, you need not limit yourself to &lt;a href="http://linux.die.net/man/2/fork"&gt;fork &lt;/a&gt;with Copper. Say hello to endless possibilities for distributed and scalable applications!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-6571350489400123042?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/WIK7Qcpy7mg" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/6571350489400123042/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/10/how-do-i-clone-thee-let-me-count-ways.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6571350489400123042?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6571350489400123042?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/WIK7Qcpy7mg/how-do-i-clone-thee-let-me-count-ways.html" title="How do I clone thee? Let me count the ways..." 
/><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/10/how-do-i-clone-thee-let-me-count-ways.html</feedburner:origLink></entry><entry gd:etag="W/&quot;D0ADQHs7eip7ImA9Wx5WEEQ.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-6396125145433916184</id><published>2010-09-21T11:47:00.000-07:00</published><updated>2010-09-21T12:02:51.502-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-09-21T12:02:51.502-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>GridCentric: Testing Mincemeat.py MapReduce Scalability</title><content type="html">&lt;b&gt;Introduction &lt;/b&gt;&lt;br /&gt;
The &lt;a href="http://en.wikipedia.org/wiki/MapReduce"&gt;MapReduce&lt;/a&gt; programming paradigm has been getting a lot of buzz the past couple of years because in general developers are facing larger and larger datasets. Google has shown that MapReduce is very effective at processing extremely large datasets by indexing the entire web using it. As a result of all this buzz there have also been &lt;a href="http://en.wikipedia.org/wiki/MapReduce#Implementations"&gt;numerous different implementations&lt;/a&gt; popping up. The front runner is probably &lt;a href="http://hadoop.apache.org/"&gt;Hadoop,&lt;/a&gt; and we have a post about &lt;a href="http://blog.gridcentriclabs.com/2010/07/howto-build-hadoop-cluster-in-five.html"&gt;how to set up a Hadoop cluster in 5 minutes&lt;/a&gt;. But there are many more, and in this post I am using a lightweight python implementation called &lt;a href="http://remembersaurus.com/mincemeatpy/"&gt;mincemeat.py&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
Essentially the promise of MapReduce is that we can easily throw machines at the dataset and process it faster. What I am interested in is the effect that adding machines has on the processing power of the system. In other words, what does the performance function look like in relation to the number of machines in the system? To find out, I am going to take a fixed dataset and measure how long it takes to process using different numbers of worker machines.&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_V1O61leJWWY/TJjyajd492I/AAAAAAAAADc/Ri-WvMl3jlk/s1600/mapreduce_intro.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_V1O61leJWWY/TJjyajd492I/AAAAAAAAADc/Ri-WvMl3jlk/s320/mapreduce_intro.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;Test Harness&lt;/b&gt;&lt;br /&gt;
Being able to dynamically vary the number of workers that I have is an interesting problem. I am a believer in the &lt;a href="http://en.wikipedia.org/wiki/Don%27t_repeat_yourself"&gt;DRY principle&lt;/a&gt; so I don't want to have to configure 10 machines with mincemeat, or synchronize them so that the correct number of them connect when running my tests. Basically I have a very simple experiment I want to run and I want a very simple solution that allows me to easily vary the number of machines in my tests.&lt;br /&gt;
&lt;br /&gt;
Fortunately the &lt;a href="http://www.gridcentriclabs.com/products/copper/"&gt;GridCentric Copper virtualization platform&lt;/a&gt; allows me to create an arbitrarily sized cluster very easily. Copper provides a very simple but powerful API call that allows a virtual machine to live-clone itself. Within seconds a running virtual machine can create exact replicas of itself that have the same memory loaded, the same disk state and even the same instruction pointer in the CPU. In other words, I can use the Copper API to create clusters with a single node, 5 nodes, 10 nodes, and so on within a matter of seconds without any need for additional configuration, synchronization, or any maintenance on my part. In addition, since this is an API call I can include it directly within my test harness.&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_V1O61leJWWY/TJjqy4ZnxhI/AAAAAAAAADU/Qt8PKUY5qDo/s1600/cloneout.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_V1O61leJWWY/TJjqy4ZnxhI/AAAAAAAAADU/Qt8PKUY5qDo/s320/cloneout.png" /&gt;&lt;/a&gt;&lt;/div&gt;My test harness is composed of two scripts: one that will create my test cluster and execute the MapReduce program, and the other that will vary the size of the cluster and collect the time information. The first script will basically take as a parameter the number of additional workers to add. It will use the Copper API to live-clone the machine to build the cluster with the additional workers, start up the MapReduce server and then connect all the workers together. It will then clean up the cluster to free up the resources for other tests, or other applications we want to run on our cluster.&lt;br /&gt;
&lt;br /&gt;
Here is my test script, which uses Copper to dynamically scale out a test cluster and then executes my tests:&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: bash"&gt;############################################################
# 
# Simple test script that will create a cluster of arbitrary
# size and then use mincemeat.py to run a MapReduce job
# on the cluster.
#
############################################################


# This will synchronize the process on the ticket and act
# as a barrier so that the process won't resume until
# everyone has synchronized on the ticket.
#
function sync_ticket {

 TICKET=$1
 IS_MASTER=$2

 if [ $IS_MASTER -eq 0 ]
 then
   gc sync $TICKET 60000
 else
   gc sync
 fi
}



if [ $# -eq 0 ]
then
  echo "Usage: $0 num_workers"
else

  # The number of additional workers to create
  WORKERS=$1
  # The IP address of the server
  SERVER_IP=`gc my-ip`

  if [ $WORKERS -gt 0 ]
  then
    # If we need additional workers then we will request a 
    # ticket for them. This reserves the resources in Copper
    # for our cluster.
    TICKET=`gc rt $WORKERS $WORKERS 1 60000 | awk '{ print $2 }'`
    if [ ${#TICKET} -eq 0 ]
    then 
      # We failed to get a ticket for that number of workers.
      # Basically the resources in our system are in use, so
      # we just exit with an EBUSY error.
      echo "Failed to acquire the resources"
      exit 16
    else
      # We have the resources so we clone out. This will create
      # all the worker machines, and they will be created as
      # exact replicas of this machine, and each one will
      # be executing this script at this point!
      gc clone $TICKET

      # The gc clone API will create the replica virtual machines.
      # Essentially we will now have $WORKERS new machines all
      # running this script because they are clones of a virtual
      # machine that is running this script. Moreover, they will
      # all be running this script from this point onwards because
      # gc clone was the last command executed by the virtual 
      # machine prior to cloning.
    fi
  else
    # We just look up the ticket that the master was created with.
    # There is no need to create additional workers.
    TICKET=`gc lt | grep ticket | awk '{ print $2 }'`
  fi

  gc ismaster
  IS_MASTER=$?

  if [ $IS_MASTER -eq 0 ]
  then
    # We want to start up the server on the master
    python example.py &amp;gt; results.out &amp;amp;
    # Give the server some time to start up
    sleep 3
  fi
  
  # This is a barrier and ensures all workers have checked
  # in before we start the process.
  sync_ticket $TICKET $IS_MASTER

  # We start up the mincemeat worker and connect to the server
  python mincemeat.py -p changeme $SERVER_IP
  
  # We have finished processing so we want to destroy the 
  # additional workers that were created.
  if [ $IS_MASTER -ne 0 ]
  then
    # We are a worker so we should join back
    gc join noreport
  fi
fi
############################################################

&lt;/pre&gt;&lt;br /&gt;
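The barrier behaviour that sync_ticket relies on can be illustrated without any virtual machines. The sketch below uses Python threads and the standard threading.Barrier as a stand-in for clones synchronizing on a ticket; it is an analogy only, and the name run_with_barrier is made up for illustration.&lt;br /&gt;

```python
import threading


def run_with_barrier(parties):
    """Start `parties` threads that all wait at a barrier before working."""
    barrier = threading.Barrier(parties)
    events = []
    lock = threading.Lock()

    def worker(ident):
        with lock:
            events.append(("arrived", ident))
        barrier.wait()  # Nobody passes until all parties have arrived.
        with lock:
            events.append(("working", ident))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(parties)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events


if __name__ == "__main__":
    for event in run_with_barrier(4):
        print(event)
```

Every "arrived" entry is recorded before any "working" entry, which is the guarantee the script wants before the workers connect to the server.&lt;br /&gt;
&lt;br /&gt;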
&lt;b&gt;Results&lt;/b&gt;&lt;br /&gt;
I executed the above script using 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 hosts. For each host count I ran the script five times and averaged the run times. Here are my results running MapReduce using mincemeat.py on a 28 MB dataset.&lt;br /&gt;
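&lt;br /&gt;
The second script of the harness, the one that varies the cluster size and collects the timings, is not shown in this post. A minimal sketch of that idea in Python could look like the following; the names average_runtime and sweep are hypothetical, and run_job stands for whatever launches the test script above with a given worker count.&lt;br /&gt;

```python
import statistics
import time


def average_runtime(run_job, workers, repeats=5):
    """Call run_job(workers) `repeats` times; return mean wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_job(workers)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


def sweep(run_job, worker_counts, repeats=5):
    """Map each worker count to its average runtime."""
    return {n: average_runtime(run_job, n, repeats) for n in worker_counts}


if __name__ == "__main__":
    # Stand-in job so the sketch runs anywhere; a real run_job would
    # launch the cluster test script instead.
    print(sweep(lambda n: time.sleep(0.01 / n), range(1, 4), repeats=2))
```

In practice run_job would probably wrap a subprocess.run call on the bash script, passing the worker count as its only argument.&lt;br /&gt;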
&lt;br /&gt;
&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_V1O61leJWWY/TJjjGoe4dBI/AAAAAAAAADM/dNuyxDbGBm0/s320/mapreduce.png" style="margin-left: auto; margin-right: auto;" /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;The overall MapReduce time decreases as we add more workers, while at the same time the overhead and cloning of virtual machines remains constant. The time spent synchronizing the virtual machines, and requesting a ticket, is essentially zero.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_V1O61leJWWY/TJjjGoe4dBI/AAAAAAAAADM/dNuyxDbGBm0/s1600/mapreduce.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;/a&gt;&lt;/div&gt;We can clearly see a remarkable improvement when moving from 1 worker to 3 workers, and in fact it looks like 3 nodes is a local minimum, performing faster than either 4 or 5 workers. However, things do continue to improve, and once we are in the 8 to 10 worker territory we start to see the performance improvements leveling off. Obviously this is specific to the 28 MB dataset, and I expect that larger datasets would show significant gains at larger numbers of nodes. &lt;br /&gt;
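&lt;br /&gt;
One way to reason about the leveling off is a simple model in which a fixed cost (cloning, startup) stays constant while the divisible work is shared evenly between workers. The sketch below is only that model, with made-up numbers rather than the measurements from the chart, and the function names are hypothetical.&lt;br /&gt;

```python
def modelled_time(workers, fixed_seconds, parallel_seconds):
    """Total time when only the parallel part shrinks with more workers."""
    return fixed_seconds + parallel_seconds / workers


def marginal_gain(workers, fixed_seconds, parallel_seconds):
    """Seconds saved by going from `workers` to `workers + 1`."""
    return (modelled_time(workers, fixed_seconds, parallel_seconds)
            - modelled_time(workers + 1, fixed_seconds, parallel_seconds))


if __name__ == "__main__":
    # Hypothetical numbers, not the measured results:
    # 10 s of fixed overhead and 120 s of divisible work.
    for n in (1, 3, 8, 10):
        print(n, modelled_time(n, 10.0, 120.0))
```

With these made-up numbers, going from 1 to 3 workers saves 80 seconds while going from 8 to 10 saves only 3. The model is too simple to explain the dip at 3 workers, so treat it as intuition for the overall shape and nothing more.&lt;br /&gt;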
&lt;br /&gt;
&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;
The results presented above highlight that there is a point of diminishing returns beyond which adding new hosts to the MapReduce system does not provide any significant gains. I feel like this is an intuitive property of the MapReduce paradigm, but it is nice to have the data to support it. However, what is interesting is how &lt;i&gt;easy&lt;/i&gt; it was to test this prediction. In fact, any hypothesis on distributed applications can be tested just as easily. This is because &lt;a href="http://www.gridcentriclabs.com/products/copper/"&gt;GridCentric's Copper virtualization platform&lt;/a&gt; makes it extremely simple to build a cluster in which to test the distributed application, and provides the API so that the test harness itself can dynamically create this cluster.&lt;br /&gt;
&lt;br /&gt;
The real result of this post is that Copper can open up a doorway into testing and experimenting with distributed applications that was otherwise closed due to the high cost of configuring, maintaining and synchronizing the distributed application.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-6396125145433916184?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/YnwYiT7FbFs" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/6396125145433916184/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/09/gridcentric-testing-mincemeatpy.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6396125145433916184?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6396125145433916184?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/YnwYiT7FbFs/gridcentric-testing-mincemeatpy.html" title="GridCentric: Testing Mincemeat.py MapReduce Scalability" /><author><name>David Scannell</name><uri>http://www.blogger.com/profile/14713329443907634842</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://1.bp.blogspot.com/_V1O61leJWWY/TJjyajd492I/AAAAAAAAADc/Ri-WvMl3jlk/s72-c/mapreduce_intro.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/09/gridcentric-testing-mincemeatpy.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;C0ANSHo9eyp7ImA9Wx5XFU0.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-946847047463882150</id><published>2010-09-14T15:03:00.000-07:00</published><updated>2010-09-14T15:03:19.463-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-09-14T15:03:19.463-07:00</app:edited><title>Video: installation tutorial</title><content type="html">We think that our &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=Tutorial_1"&gt;installation tutorial&lt;/a&gt; is straight-forward.  Hopefully it addresses most questions that come up when installing &lt;a href="http://gridcentriclabs.com/products/copper"&gt;Copper&lt;/a&gt;. But it's lengthy.  And a picture is worth a thousand words.&lt;br /&gt;
&lt;br /&gt;
In that spirit, we made a video version of the tutorial!  From bare-metal to cloning virtual machines in about ten minutes.  Most of that is the download time!  Enjoy!&lt;br /&gt;
&lt;br /&gt;
&lt;center&gt;&lt;br /&gt;
&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/mlBZR8VrvBo?fs=1&amp;amp;hl=en_US&amp;amp;fmt=22"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/mlBZR8VrvBo?fs=1&amp;amp;hl=en_US&amp;amp;fmt=22" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;
&lt;/center&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-946847047463882150?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/R8mR-XDzaj4" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/946847047463882150/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/09/video-installation-tutorial.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/946847047463882150?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/946847047463882150?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/R8mR-XDzaj4/video-installation-tutorial.html" title="Video: installation tutorial" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/09/video-installation-tutorial.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUcERH49fCp7ImA9Wx5QEk0.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-227018510773950035</id><published>2010-08-30T13:00:00.000-07:00</published><updated>2010-08-30T14:16:45.064-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-08-30T14:16:45.064-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>Nimble Test Clusters</title><content type="html">I have been a big fan of &lt;a href="http://seleniumhq.org/"&gt;Selenium&lt;/a&gt; since I first 
used it about 4 years ago. It is a very flexible web testing framework that has the advantage of actually running within a real browser. This means that it can do complete end-to-end testing starting with client side javascript code down to SQL queries hitting the database server, and it can also do cross browser testing to ensure that everything works between IE, Firefox, Safari, Chrome, etc. The test cases can be written in a wide variety of syntaxes, from a simple HTML table to full-fledged programming languages like Java or Ruby. By using Selenium it is very easy for any team building a web project to create automated end-to-end regression tests that can be kicked off by a &lt;a href="http://en.wikipedia.org/wiki/Continuous_integration"&gt;Continuous Integration&lt;/a&gt; (CI) system whenever a developer commits. As I said before, I am a big fan of it.&lt;br /&gt;
&lt;br /&gt;
Unfortunately, automated tests in general have a downside, especially as a project matures. The number of tests increases, and with it the time it takes to run the test suite. Either the test suite becomes a bottleneck in the development cycle, or people just run it less frequently. Running tests less frequently has a compounding effect because tests start going stale, which increases the maintenance cost of the tests, which in turn causes them to be run even less frequently. Selenium has a particular handicap here because it runs within the actual web browser, which inherently adds time to each test. &lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_V1O61leJWWY/THvjYngPC7I/AAAAAAAAACk/Ne_HPXoqnow/s1600/slow-unittests.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_V1O61leJWWY/THvjYngPC7I/AAAAAAAAACk/Ne_HPXoqnow/s320/slow-unittests.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
Fortunately, automated tests are a good candidate for parallel execution even though they are generally executed serially. This is great news because we can significantly reduce the time a test suite takes to execute by distributing it across a mini-cluster of computers. Both JUnit and TestNG have projects for distributing their execution, and there is the &lt;a href="http://selenium-grid.seleniumhq.org/"&gt;Selenium Grid&lt;/a&gt; project that will distribute Selenium tests. Basically the goal of the project is to build a small test cluster where Selenium tests can be farmed out onto different machines to greatly reduce the time it takes to execute the full test suite. It is definitely a good idea, and it will address the problem of stale tests and their associated maintenance cost. With it taking less time to run the suite, we can easily add it back into a CI system and run the tests more frequently. &lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/THvjmNVwc7I/AAAAAAAAACs/CpBhBj95hgQ/s1600/fast-unittest.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_V1O61leJWWY/THvjmNVwc7I/AAAAAAAAACs/CpBhBj95hgQ/s320/fast-unittest.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
Unfortunately, as I mentioned in my post on &lt;a href="http://blog.gridcentriclabs.com/2010/07/elastic-build-systems.html"&gt;Elastic Build Systems&lt;/a&gt;, once we start to distribute our environments we open up a host of other problems. Namely, we have to ensure that all the machines in the cluster have been updated with the same versions of Selenium and Selenium Grid, that the browser versions being tested are properly synchronised, and in general that the machines are identical, so that a test running on one machine is guaranteed to give the same results when executed on another. And of course the flip side of running tests faster by distributing them across multiple machines is that there are more machines sitting idle for longer. Finally, with Selenium in particular, it might be desirable to rerun the test suite across multiple browsers to regression test everything that is supported. With this in mind, your test cluster could start growing to include specialised Windows machines running IE7 and IE8, Linux machines running Firefox, etc. to ensure that you get good coverage in addition to a speed boost. Of course, when adding more machines, the complexity of managing the cluster also increases, as do the wasted resources of idling machines.&lt;br /&gt;
&lt;br /&gt;
Fortunately, GridCentric's &lt;a href="http://gridcentriclabs.com/products/copper/"&gt;Copper high-performance virtualization platform&lt;/a&gt; makes managing a computer cluster extremely easy and allows a cluster to be re-purposed within seconds.&lt;br /&gt;
&lt;br /&gt;
At the core of the Copper platform is the ability of virtual machines to&lt;i&gt; live-clone&lt;/i&gt; themselves. In other words, Copper allows running virtual machines to &lt;i&gt;instantly&lt;/i&gt; create multiple copies of themselves where each copy is an exact replica of the original machine, from the memory that has been loaded down to the instruction pointer in the CPU. This whole process of taking a single running virtual machine to saturating the physical limitation of your hardware with cloned virtual machines happens within seconds. In addition, the Copper platform also takes care of networking all of these virtual machines so that they can communicate with each other, and if desired they can also communicate with the rest of the network.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_V1O61leJWWY/THgidgyH4nI/AAAAAAAAABs/-BONi9HOhD4/s1600/cloneout.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="57" src="http://4.bp.blogspot.com/_V1O61leJWWY/THgidgyH4nI/AAAAAAAAABs/-BONi9HOhD4/s400/cloneout.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
Let's suppose, for example, that we manage to scrounge together a little test cluster of four machines that we can use for running Selenium Grid to speed up our Selenium test runs. Ideally we would like to ensure that our web platform tests run using our supported configurations: Windows running IE8, OS X running Safari, and Linux running Firefox. Traditionally, we would divide up our cluster for these configurations with a single machine per configuration and one spare machine that we could give to a single configuration. With only one or two machines per configuration, we have just lost the benefits of Selenium Grid &lt;i&gt;and&lt;/i&gt; we have an extra computer that cannot be shared.&lt;br /&gt;
&lt;br /&gt;
However, we can make use of Copper's ability to dynamically and quickly re-purpose an entire cluster. Each testing environment (operating system, browser, tools and Selenium Grid) will be encapsulated in its own virtual machine that will run on top of the little test cluster. When it becomes time to run the tests, the virtual machine will scale itself out using the fast live-cloning provided by Copper and then run Selenium Grid as normal. Once it has finished executing the tests, it will clean up its clone machines and scale back down freeing up the cluster resources. Then another test environment can scale out onto the recently freed resources, run the tests, and again scale back down.&amp;nbsp; This can be done for each environment.&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/THvug5PJyzI/AAAAAAAAADE/Q97aC6peyYY/s1600/test-cycle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_V1O61leJWWY/THvug5PJyzI/AAAAAAAAADE/Q97aC6peyYY/s320/test-cycle.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;Moreover, supporting a new test environment is nothing more than creating a virtual machine for it and configuring the machine by installing the test environment software. This new configuration can now scale up, run tests using Selenium Grid, and scale back down.&lt;br /&gt;
&lt;br /&gt;
In a &lt;a href="http://blog.gridcentriclabs.com/2010/08/elastic-build-system-in-action.html"&gt;previous post&lt;/a&gt; I showed how &lt;a href="http://hudson-ci.org/"&gt;Hudson&lt;/a&gt; can be easily transformed into an Elastic Build System. Basically, it allowed Hudson to dynamically create new slave machines that are identical to the master machine and then farm out build jobs to them. Hudson could also destroy the slaves when they were no longer needed to free up resources. Now let's combine everything together to show how we can build an ultimate continuous integration environment using the Copper platform.&lt;br /&gt;
&lt;br /&gt;
There will be a single virtual machine with Hudson installed that will be our master machine. We will continue to have a virtual machine per test environment that we want to support, and we'll configure Hudson to treat each of these machines as a slave. We'll then configure some Hudson jobs that will run on our test environment slaves that will basically clone out the test environment, start up Selenium Grid, run the tests and then scale back down. We'll also configure Hudson to have a couple of elastic slaves as described in a previous post. So now we have it. A developer commits some change, Hudson farms the build out to a dynamically created slave. Once the build is completed, it can trigger the Selenium Grid testing agents in turn so that they will scale out, run the regression integration tests, scale back down and produce a test report of any possible regression issues.&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/THvt6ytZMqI/AAAAAAAAAC8/I0luaofI_5w/s1600/all_together.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_V1O61leJWWY/THvt6ytZMqI/AAAAAAAAAC8/I0luaofI_5w/s320/all_together.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
Amazing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-227018510773950035?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/0dvG8Jok6Kc" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/227018510773950035/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/08/nimble-test-clusters.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/227018510773950035?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/227018510773950035?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/0dvG8Jok6Kc/nimble-test-clusters.html" title="Nimble Test Clusters" /><author><name>David Scannell</name><uri>http://www.blogger.com/profile/14713329443907634842</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/_V1O61leJWWY/THvjYngPC7I/AAAAAAAAACk/Ne_HPXoqnow/s72-c/slow-unittests.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/08/nimble-test-clusters.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Dk8HRX0ycSp7ImA9Wx5SFUk.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-7898159452672970950</id><published>2010-08-11T08:41:00.000-07:00</published><updated>2010-08-11T09:40:34.399-07:00</updated><app:edited 
xmlns:app="http://www.w3.org/2007/app">2010-08-11T09:40:34.399-07:00</app:edited><title>Slashdotted!</title><content type="html">&lt;span class="Apple-style-span"   style="  border-collapse: collapse; font-family:arial, sans-serif;font-size:13px;"&gt;Well, the &lt;a href="http://tech.slashdot.org/story/10/08/11/0027233/Extreme-Memory-Oversubscription-For-VMs"&gt;Slashdot&lt;/a&gt; referral certainly stirred things up over here.&lt;br /&gt;&lt;br /&gt;I think it's important to clarify some core points about our system and how it works.  Adin's done a good job responding to some of the posts on Slashdot, but we have our hands full cutting the Copper 1.1 release and I figured it would be best to provide a quick overview of what we are up to here, so we can all get back to work :).&lt;br /&gt;&lt;br /&gt;The memory oversubscription is one neat application of our core technology, which basically amounts to &lt;a href="http://en.wikipedia.org/wiki/Copy-on-write"&gt;COW&lt;/a&gt;-based cloning of entire virtual machines. Our goal with Copper is to use this primitive to enable whole-cluster virtualization.  We take a single physical cluster, and transparently run multiple virtual clusters on top of it.  Each of these virtual clusters can grow and shrink independently (by cloning new VMs and killing off clones), to accommodate different demands.&lt;br /&gt;&lt;br /&gt;Our ultimate aim is to take 'Virtual Machine Appliances' and turn them into 'Virtual Cluster Appliances': minimal-configuration disk images that are able to spin up a virtual cluster on demand, serve up a single app, and dynamically adjust resource usage based on whatever internal metrics they want to use.&lt;br /&gt;&lt;br /&gt;To this end, we've exposed the cloning primitive &lt;i&gt;inside&lt;/i&gt; the master virtual machine of Copper virtual clusters.  The master VM, through either an API call or a shell tool (which basically wraps the API call), can invoke a 'clone' operation.  
For example, with the shell, it would look like this:&lt;br /&gt;&lt;br /&gt;'gc clone 3'&lt;br /&gt;&lt;br /&gt;This will request resources for 3 clone VMs from the Copper controller, create those clones, and then exit.  The clones are exact duplicates of the master VM in terms of the state of memory and the root disk. Peripheral devices, external storage resources, and network configurations of the &lt;b&gt;master&lt;/b&gt; can also be transparently replicated on the clones. &lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  border-collapse: collapse; font-family:arial, sans-serif;font-size:13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  border-collapse: collapse; font-family:arial, sans-serif;font-size:13px;"&gt;The semantics of the clone operation closely match those of the familiar UNIX process fork(). &lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  border-collapse: collapse; font-family:arial, sans-serif;font-size:13px;"&gt;There are several additional API calls that can help with managing the resulting virtual cluster - e.g. API calls for listing all clones, killing individual clones or entire generations of clones, etc. Bindings have been created to expose these APIs in several languages so applications can make full use of Copper's capabilities. For instance, David had a very informative post on how to use Copper's APIs to create a dynamic build system (&lt;a href="http://blog.gridcentriclabs.com/2010/08/elastic-build-system-in-action.html" target="_blank" style="color: rgb(42, 93, 176); "&gt;Elastic Build System in Action&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Copper's use of lightweight virtual machine cloning is deceptively powerful for enabling 'virtual cluster appliances'.  
For example, we can create a 'memcached' virtual cluster that just has memcached installed on the root disk, and can be expanded or contracted in a matter of seconds, with no extra configuration steps.  The same can be done for apps like Hadoop and Cassandra. Check out some of the earlier blog posts (&lt;a href="http://blog.gridcentriclabs.com/2010/08/howto-build-and-scale-cassandra-cluster.html" target="_blank" style="color: rgb(42, 93, 176); "&gt;Howto: Build and Scale a Cassandra Cluster&lt;/a&gt;, &lt;a href="http://blog.gridcentriclabs.com/2010/07/howto-build-hadoop-cluster-in-five.html" target="_blank" style="color: rgb(42, 93, 176); "&gt;Howto: Build a Hadoop Cluster in 5 Minutes&lt;/a&gt;, &lt;a href="http://blog.gridcentriclabs.com/2010/07/how-to-build-10-core-memcached-cluster.html" target="_blank" style="color: rgb(42, 93, 176); "&gt;Howto: Build a 10 Node Memcached Cluster&lt;/a&gt;) for examples of enabling cluster appliances with Copper.&lt;br /&gt;&lt;br /&gt;Virtualization is growing up, and it can't be just about single machines anymore.  
Copper's name was chosen to be reminiscent of 'cluster operating system', and that's really what we're trying to build at GridCentric.&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  border-collapse: collapse; font-family:arial, sans-serif;font-size:13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-7898159452672970950?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/21RHVmS2oSY" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/7898159452672970950/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/08/slashdotted.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/7898159452672970950?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/7898159452672970950?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/21RHVmS2oSY/slashdotted.html" title="Slashdotted!" 
/><author><name>Vivek Lakshmanan</name><uri>http://www.blogger.com/profile/13111992610360148210</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/08/slashdotted.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEcDQHozcSp7ImA9Wx5SFEo.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-4182191969793939935</id><published>2010-08-10T12:51:00.000-07:00</published><updated>2010-08-10T13:27:51.489-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-08-10T13:27:51.489-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>Virtualization and over-subscription: breaking the 100% utilization barrier</title><content type="html">&lt;div&gt;Something that we think about a lot at &lt;a href="http://gridcentriclabs.com/"&gt;GridCentric&lt;/a&gt; is how to ensure that resources are used efficiently.  In fact, that's what led us to create &lt;a href="http://gridcentriclabs.com/products/copper"&gt;Copper&lt;/a&gt; in the first place -- seeing how difficult it was to share large clusters amongst multiple users, groups or even software versions.  We think that the ideas behind Infrastructure-as-a-Service have huge potential to revolutionize computing.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;We realize that a lot of enterprises have focused on deriving maximum efficiency from their resources for a long time.  I thought that I'd take a ground-up look at some of the mechanisms used by virtualization to do that, and introduce something that we're developing for our next products in order to take resource multiplexing to the next level.&lt;/div&gt;&lt;br /&gt;
&lt;h2&gt;Example and demo&lt;/h2&gt;&lt;br /&gt;
&lt;div&gt;Our developers are hard at work on our next generation of products, which in addition to allowing instantaneous scaling and solving configuration nightmares, allows you to over-commit your physical resources (both processors &lt;i&gt;and&lt;/i&gt; memory).  This quick demo provides a great example of over-subscription, showing a development version of our platform cramming 16 gigabytes worth of clones onto a single host with 8 gigabytes of memory. Awesome.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;object width="480" height="289"&gt;&lt;param name="movie" value="http://www.youtube.com/v/yxH-tGZ0-F8&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;fmt=22"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/yxH-tGZ0-F8&amp;amp;hl=en_US&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="289"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;
&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;If you're unfamiliar with Copper, each of the clone VMs created in the video is an independent machine with its own memory, disk and network initially identical to the master at the time of cloning.  Each of the 16 VMs could have been allocated to a user or used to perform a specific task independently of the rest.  At the end, if we look at the resources allocated by VMs on this host within our Copper installation, we see the following:&lt;/div&gt;&lt;pre class="brush: bash"&gt;$ gc-host stat node8&lt;/pre&gt;&lt;pre&gt;                   CPU(s) 16 of 4 (400%)
                      RAM 17408 of 5631 (309%)
               Local disk 0 of 192258490368 MB (0%)
  Available Named Storage ['kv-local(10485760000)']
       Used Named Storage []
    Available PCI Devices ['VGA(0000:01:05.0)']
         Used PCI Devices []
       Available Networks ['default', 'gridcentric']
           Available Tags ['development']
&lt;/pre&gt;&lt;br /&gt;
&lt;div&gt;To be fair, the memory shown is without the extra overhead from the management domain (2.5 gigabytes).  Pessimistically, we are achieving an actual memory over-subscription of ~212%, not 309%. Still, not too shabby.&lt;/div&gt;&lt;br /&gt;
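&lt;div&gt;For anyone who wants to check the arithmetic, here is a minimal sketch of where the two percentages come from. The variable names are my own, and it assumes that the 2.5 gigabytes of management-domain overhead is simply added back to the 5631 MB that &lt;tt&gt;gc-host stat&lt;/tt&gt; reports.&lt;/div&gt;

```python
# Memory figures (in MB) taken from the gc-host output above.
allocated_mb = 17408          # memory promised to the clone VMs
reported_host_mb = 5631       # host memory as reported, without the management domain
management_domain_mb = 2560   # assumption: 2.5 gigabytes expressed as 2.5 * 1024 MB


def oversubscription(allocated, physical):
    """Return allocated memory as a percentage of physical memory."""
    return 100.0 * allocated / physical


# Optimistic view: compare against the memory the host reports as usable.
optimistic = oversubscription(allocated_mb, reported_host_mb)

# Pessimistic view: count the management domain's memory as available too.
pessimistic = oversubscription(allocated_mb, reported_host_mb + management_domain_mb)

print("optimistic:  %.1f%%" % optimistic)
print("pessimistic: %.1f%%" % pessimistic)
```

&lt;div&gt;Truncated to whole percentages, these are the 309% and ~212% figures quoted above.&lt;/div&gt;&lt;br /&gt;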
&lt;div&gt;We achieve over-subscription by leveraging our novel cloning mechanism. In this post, I take a look at over-subscription in general and touch on some of the mechanisms used to support squeezing more out of your resources. For simplicity, I will focus on two crucial computing resources which are typically managed by an operating system or hypervisor: the processor (CPU) and memory.  This post assumes a basic familiarity with virtualization concepts, but it will be gentle.&lt;/div&gt;&lt;br /&gt;
&lt;h2&gt;CPU over-subscription with multi-processing&lt;/h2&gt;&lt;br /&gt;
&lt;div&gt;Traditional operating systems have been over-subscribing and multiplexing resources since the days of the mainframe.  Contrary to what it might seem, multiplexing a resource may actually allow you to use it far more efficiently than just giving it to a single process or user (when you consider everything that it is doing). Multiplexing CPUs with virtual machines is a primary motivator for consolidation, since CPUs are often an underutilized component on a typical enterprise server.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Suppose we have four virtual machines (VMs) that perform some work (the dark color) followed by some waiting for an I/O event (the light color).  If you had an &lt;tt&gt;ssh&lt;/tt&gt; connection to a VM, this might consist of processing an incoming TCP packet, the terminal and shell doing a bit of work, generating and sending a response packet (with appropriate ACK number) then waiting for the next packet to arrive.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_Na7dk58Iywg/TGFxZMzvwNI/AAAAAAAAAEU/nk8AftSdMCE/s1600/processes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/TGFxZMzvwNI/AAAAAAAAAEU/nk8AftSdMCE/s200/processes.png" width="162" border="0" height="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Virtual machines are not much different from processes in an operating system. Because most processes inside each VM are I/O-bound, VMs are also generally I/O-bound as a whole (depending, of course, on the workload).  This means that VMs often spend significant time waiting for I/O. If we were to schedule just a single VM on a CPU, the execution would look much like the following.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_Na7dk58Iywg/TGBI0r-b9BI/AAAAAAAAADU/lG28cbvIe3o/s1600/a_alone.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://4.bp.blogspot.com/_Na7dk58Iywg/TGBI0r-b9BI/AAAAAAAAADU/lG28cbvIe3o/s400/a_alone.png" width="400" border="0" height="33" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Instead of maximizing efficiency, the CPU would actually be doing nothing most of the time.  If the VMs B, C, and D were similar and we had four distinct processors (one assigned to each), we would imagine them running as follows.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_Na7dk58Iywg/TGBJF0eU2_I/AAAAAAAAADk/3Wywz-824tM/s1600/ideal_schedule.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://2.bp.blogspot.com/_Na7dk58Iywg/TGBJF0eU2_I/AAAAAAAAADk/3Wywz-824tM/s400/ideal_schedule.png" width="400" border="0" height="141" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;I've staggered them to show that we could actually come up with a very efficient schedule of execution on a single CPU. Since each VM spends most of its time waiting (and the CPU is not required for I/O), we could simply overlay the above schedules.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_Na7dk58Iywg/TGBJLob86-I/AAAAAAAAADs/zJFmYaX1RrQ/s1600/cpu_view.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://1.bp.blogspot.com/_Na7dk58Iywg/TGBJLob86-I/AAAAAAAAADs/zJFmYaX1RrQ/s400/cpu_view.png" width="400" border="0" height="33" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;When a virtual machine doesn't have work to do, we instead run another virtual machine. When an I/O request completes, we reschedule the appropriate VM, which can once again perform useful work. We make this a bit more complex with limits on how long each virtual machine can run consecutively (the &lt;a href="http://en.wikipedia.org/wiki/Preemption_%28computing%29#Time_slice"&gt;quantum&lt;/a&gt;), priorities, interrupt handling, &lt;a href="http://en.wikipedia.org/wiki/Gang_scheduling"&gt;gang scheduling&lt;/a&gt;, etc.  But the above is undeniably the essence of all CPU over-commit: simple time-sharing.&lt;/div&gt;&lt;br /&gt;
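&lt;div&gt;The time-sharing idea can be sketched as a tiny discrete simulation. This is a toy model rather than how any real hypervisor scheduler works: the function name, the tick-based clock and the fixed work/wait pattern are all assumptions made for illustration, and it deliberately leaves out quanta, priorities and interrupts.&lt;/div&gt;

```python
from collections import deque


def simulate(num_vms, work_ticks, io_ticks, total_ticks):
    """Time-share one CPU among VMs that alternate CPU work and I/O waits.

    Returns the fraction of ticks in which the CPU did useful work.
    """
    ready = deque(range(num_vms))   # VMs that have work to do right now
    waiting = {}                    # vm -> tick at which its I/O completes
    running = None                  # VM currently on the CPU
    remaining = 0                   # work ticks left in its current burst
    busy = 0

    for tick in range(total_ticks):
        # I/O completions make VMs runnable again.
        for vm in sorted(v for v, done in waiting.items() if tick >= done):
            del waiting[vm]
            ready.append(vm)

        # If the CPU is free, pick the next runnable VM.
        if running is None and ready:
            running = ready.popleft()
            remaining = work_ticks

        if running is not None:
            busy += 1
            remaining -= 1
            if remaining == 0:
                # Burst finished: the VM blocks on I/O and frees the CPU.
                waiting[running] = tick + 1 + io_ticks
                running = None

    return busy / total_ticks


# Each VM works for 1 tick, then waits 3 ticks for I/O.
alone = simulate(num_vms=1, work_ticks=1, io_ticks=3, total_ticks=400)
shared = simulate(num_vms=4, work_ticks=1, io_ticks=3, total_ticks=400)
print("one VM:   %.0f%% busy" % (100 * alone))
print("four VMs: %.0f%% busy" % (100 * shared))
```

&lt;div&gt;With this particular pattern, a single VM keeps the CPU busy a quarter of the time, while four staggered VMs keep it busy on every tick. Real workloads are nowhere near this regular, of course.&lt;/div&gt;&lt;br /&gt;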
&lt;h2&gt;Memory over-subscription&lt;/h2&gt;&lt;br /&gt;
&lt;div&gt;Memory over-subscription is where things get interesting.  There are several different approaches that we can take in order to over-commit our physical memory.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;In a normal virtualization situation, VMs each have a fixed amount of memory, the sum of which is less than or equal to the total memory available on the physical machine.  Memory of the physical system is divided up into &lt;i&gt;pages&lt;/i&gt; by the hardware, which the hypervisor can arbitrarily remap (therefore the VMs do not need contiguous memory).  In a non-over-committed situation, there is an &lt;a href="http://en.wikipedia.org/wiki/Injective_function"&gt;injective mapping&lt;/a&gt; from the memory of the VMs to the pages of physical memory.  This situation is shown below, with colors corresponding to VMs and the numbers corresponding to the contents of their memory.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_Na7dk58Iywg/TGBUe34pR3I/AAAAAAAAAD0/D_0whmRY1Eo/s1600/memory-normal.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/TGBUe34pR3I/AAAAAAAAAD0/D_0whmRY1Eo/s320/memory-normal.png" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;h3&gt;Paging / swap&lt;/h3&gt;&lt;br /&gt;
&lt;div&gt;Paging is the process in which a page is taken out of memory (possibly to a swap partition or pagefile), the corresponding page table entry is marked as &lt;b&gt;not present&lt;/b&gt;, and the page is returned to memory on-demand. Operating systems have been paging out process memory to disk for a long time.  This allows you to run more processes than you have space for, or a single process that requires more memory than the physical memory you have available. Most optimization and research around paging is focused on selecting the correct pages to remove from physical memory, as the penalty incurred for unexpectedly having to bring a page back in is quite high (disks are &lt;b&gt;extremely&lt;/b&gt; slow compared to memory).&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;In the example below, we want to run VMs that together have more memory than is available on the physical machine.  In this case, several pages from VM C have been marked &lt;b&gt;not present&lt;/b&gt; by the underlying hypervisor.  If VM C is scheduled and attempts to access these pages, it will be blocked while the system fetches those pages from secondary storage.  When that happens, we can assume that another page (belonging to one of A, B or C) will be paged out to disk to make room for the incoming one.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_Na7dk58Iywg/TGBUjBHkGzI/AAAAAAAAAD8/-5DR-ASDiY4/s1600/memory_demand-paging.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/TGBUjBHkGzI/AAAAAAAAAD8/-5DR-ASDiY4/s320/memory_demand-paging.png" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;For virtualization specifically, it's useful to divide hypervisors into two different camps: type-I hypervisors run on the bare metal (e.g. VMWare ESX, Xen) while type-II hypervisors run on top of an existing operating system (e.g. VMWare Workstation, KVM, VirtualBox).&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Type-II hypervisors run on an existing operating system, so they can often leverage its pre-existing infrastructure to provide paging for virtual machines.  For type-I hypervisors, as far as I know, only VMWare's ESX products offer a paging mechanism today.  Several &lt;a href="http://cseweb.ucsd.edu/%7Evahdat/papers/osdi08-de.pdf"&gt;research projects&lt;/a&gt; have explored adding forms of paging to Xen, but to my knowledge it's still a work-in-progress in the main tree.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Paging is a useful fall-back mechanism, and does not require modifications within guest VMs.  Unfortunately, it can be very expensive (in terms of a performance hit) and may suffer from poor interactions with the guest operating systems (double paging).&lt;/div&gt;&lt;br /&gt;
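&lt;div&gt;To make the mechanism concrete, here is a minimal sketch of demand paging. It is a toy model under stated assumptions: the &lt;tt&gt;PagedMemory&lt;/tt&gt; class is hypothetical, eviction is plain least-recently-used (just one of many possible policies), and the first touch of a page is counted as a fault.&lt;/div&gt;

```python
from collections import OrderedDict


class PagedMemory:
    """Toy model of demand paging with least-recently-used eviction."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()   # (vm, page) -> contents, oldest first
        self.swap = {}                  # (vm, page) -> contents paged out
        self.faults = 0

    def access(self, vm, page):
        key = (vm, page)
        if key in self.resident:
            # Present: just note that the page was used recently.
            self.resident.move_to_end(key)
            return self.resident[key]

        # Not present: the VM blocks while the page is brought in.
        self.faults += 1
        if len(self.resident) >= self.num_frames:
            # No free frame: page out the least recently used page.
            victim, contents = self.resident.popitem(last=False)
            self.swap[victim] = contents
        # Fetch from swap, or zero-fill if the page was never touched.
        self.resident[key] = self.swap.pop(key, 0)
        return self.resident[key]


# Three VMs touch two pages each, but there are only four physical frames.
memory = PagedMemory(num_frames=4)
for vm in "ABC":
    for page in range(2):
        memory.access(vm, page)

print("faults so far:", memory.faults)
print("paged out:", sorted(memory.swap))
```

&lt;div&gt;By the time VM C has touched its pages, both of VM A's pages have been pushed out to swap, so the next time A runs it will fault and pay the disk penalty.&lt;/div&gt;&lt;br /&gt;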
&lt;h3&gt;Ballooning&lt;/h3&gt;&lt;br /&gt;
&lt;div&gt;Ballooning is a very common mechanism used to support creating VMs with more memory than the host can support.  In essence, ballooning simply forces VMs to share with each other by requiring that they return some of their memory to the hypervisor.  This is done by dynamically inflating a &lt;b&gt;balloon driver&lt;/b&gt; within the guest operating system, then informing the hypervisor which pages have been allocated so that they may be used by other VMs.  This is a very effective mechanism, and generally does not suffer from poor interactions with guest operating systems (unless memory requirements change rapidly).  However, it is not transparent to the guest operating systems -- they are aware that they are missing memory and require the special balloon driver to be installed.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;An example of ballooning is shown below.  The grey pages have been allocated by the balloon driver within the guest and returned to the hypervisor (we could say that the balloons have been inflated slightly within the VMs).&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_Na7dk58Iywg/TGBUnR6HlfI/AAAAAAAAAEE/I1o5pNcbfqM/s1600/memory_ballooning.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/TGBUnR6HlfI/AAAAAAAAAEE/I1o5pNcbfqM/s320/memory_ballooning.png" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
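&lt;div&gt;A rough sketch of the bookkeeping involved is below. The &lt;tt&gt;Guest&lt;/tt&gt; class and &lt;tt&gt;reclaim&lt;/tt&gt; function are hypothetical names, and the model assumes the balloon only ever takes pages the guest is not using; a real balloon driver can also push the guest into reclaiming memory of its own.&lt;/div&gt;

```python
class Guest:
    """A guest VM with a fixed nominal memory size and a balloon driver."""

    def __init__(self, name, total_pages, used_pages):
        self.name = name
        self.total_pages = total_pages
        self.used_pages = used_pages     # pages the guest workload really needs
        self.balloon_pages = 0           # pages pinned by the balloon driver

    def free_pages(self):
        return self.total_pages - self.used_pages - self.balloon_pages

    def inflate(self, requested):
        """Inflate the balloon; return how many pages were handed back."""
        granted = min(requested, self.free_pages())
        self.balloon_pages += granted
        return granted

    def deflate(self, pages):
        """Deflate the balloon, returning memory to the guest."""
        released = min(pages, self.balloon_pages)
        self.balloon_pages -= released
        return released


def reclaim(guests, needed):
    """Ask each guest in turn to inflate until enough pages are reclaimed."""
    reclaimed = 0
    for guest in guests:
        if reclaimed >= needed:
            break
        reclaimed += guest.inflate(needed - reclaimed)
    return reclaimed


guests = [Guest("A", total_pages=8, used_pages=5),
          Guest("B", total_pages=8, used_pages=2)]
reclaimed = reclaim(guests, needed=6)
print("reclaimed", reclaimed, "pages for other VMs")
for guest in guests:
    print(guest.name, "balloon:", guest.balloon_pages, "free:", guest.free_pages())
```

&lt;div&gt;The guests can see that memory has gone missing (their free page count drops), which is exactly the lack of transparency described above.&lt;/div&gt;&lt;br /&gt;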
&lt;h3&gt;Page sharing&lt;/h3&gt;&lt;br /&gt;
&lt;div&gt;Content-based page sharing, a mechanism supported by some of VMWare's products, is by far the neatest of the space-saving techniques.  Essentially, the hypervisor continually spends a relatively small amount of CPU time crawling memory and hashing pages.  If it identifies two pages with the same contents, it maps those pages to the same underlying physical page and frees one of the copies.  Of course, this single copy is marked read-only and will be copied if one of the VMs needs to change the contents (copy-on-write). Thus, if you are running a lot of similar or identical VMs, there can be significant savings.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;As with paging, research projects (&lt;a href="http://cseweb.ucsd.edu/%7Evahdat/papers/osdi08-de.pdf"&gt;1&lt;/a&gt;, &lt;a href="http://www.xen.org/files/xensummit_oracle09/Satori.pdf"&gt;2&lt;/a&gt;) have explored this capability for Xen, but it has not quite made it into mainline.  In practice, savings for production workloads will likely be very small.  According to the &lt;a href="http://www.vmware.com/pdf/vdi_sizing_vi3.pdf"&gt;few numbers&lt;/a&gt; I've found from VMWare, it seems to be in the 5-10% range for simple workloads. Windows &lt;a href="http://en.wikipedia.org/wiki/Portable_Executable#Relocations"&gt;rewrites&lt;/a&gt; binaries heavily, so one is unlikely to find many similar pages outside of the kernel code pages, some fixed-address DLLs (win32 probably always gets its preferred address) and pages in the buffer cache.  I suspect that most savings (in whitepapers, datasheets, etc.) likely come from identification of unused zero-pages which would also be nicely gobbled up by an automated and co-operative balloon (&lt;a href="http://blogs.vmware.com/virtualreality/2008/03/cheap-hyperviso.html"&gt;an example&lt;/a&gt; -- each VM is only using about 5% of its memory).  An example of page sharing is shown below, where the hypervisor has identified the common pages and remapped appropriately, leading to significant savings.&lt;/div&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_Na7dk58Iywg/TGFxiKrEOBI/AAAAAAAAAEc/KD5imt8Prn8/s1600/memory_tps.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img src="http://4.bp.blogspot.com/_Na7dk58Iywg/TGFxiKrEOBI/AAAAAAAAAEc/KD5imt8Prn8/s320/memory_tps.png" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
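&lt;div&gt;The scan-hash-merge cycle with copy-on-write can be sketched in a few lines. This is an illustration only, not VMWare's implementation: the &lt;tt&gt;SharedMemory&lt;/tt&gt; class, the use of SHA-256 and the full byte comparison after a hash match are all my assumptions.&lt;/div&gt;

```python
import hashlib


class SharedMemory:
    """Toy content-based page sharing with copy-on-write."""

    def __init__(self):
        self.frames = {}     # frame id -> page contents (bytes)
        self.mapping = {}    # (vm, guest page) -> frame id
        self.next_frame = 0

    def _new_frame(self, contents):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = contents
        return frame

    def write(self, vm, page, contents):
        key = (vm, page)
        frame = self.mapping.get(key)
        sharers = sum(1 for f in self.mapping.values() if f == frame)
        if frame is None or sharers > 1:
            # Unmapped, or shared and therefore read-only: use a private copy.
            self.mapping[key] = self._new_frame(contents)
        else:
            self.frames[frame] = contents

    def scan(self):
        """Merge frames with identical contents; return frames freed."""
        seen = {}            # content hash -> frame id that is kept
        freed = 0
        for key, frame in sorted(self.mapping.items()):
            digest = hashlib.sha256(self.frames[frame]).hexdigest()
            keep = seen.setdefault(digest, frame)
            # Compare the bytes as well, in case of a hash collision.
            if keep != frame and self.frames[keep] == self.frames[frame]:
                self.mapping[key] = keep
                if frame not in self.mapping.values():
                    del self.frames[frame]
                    freed += 1
        return freed


memory = SharedMemory()
for vm in "ABC":
    memory.write(vm, 0, b"kernel")       # identical in every VM
    memory.write(vm, 1, vm.encode())     # unique to each VM

freed = memory.scan()
print("frames freed by sharing:", freed)

# A write to a shared page triggers copy-on-write for that VM only.
memory.write("B", 0, b"patched")
print("frames in use:", len(memory.frames))
```

&lt;div&gt;Note that after sharing, the mapping from VM pages to physical frames is no longer injective: several VM pages point at the same frame until one of them is written.&lt;/div&gt;&lt;br /&gt;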
&lt;h2&gt;Summary&lt;/h2&gt;&lt;br /&gt;
&lt;div&gt;For clarity, the three approaches to memory over-subscription that I've touched upon are outlined here.&lt;table padding="0" border="1"&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;th&gt;Technique&lt;/th&gt;&lt;th&gt;Advantage&lt;/th&gt;&lt;th&gt;Disadvantage&lt;/th&gt;&lt;th&gt;Whitepaper symptom&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Paging&lt;/td&gt;&lt;td&gt;(Mostly) transparent.&lt;br /&gt;
As much as you want.&lt;/td&gt;&lt;td&gt;Slow.&lt;br /&gt;
Poor interactions with guest VMs.&lt;/td&gt;&lt;td&gt;They don't show how much swap is active.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ballooning&lt;/td&gt;&lt;td&gt;Co-operative.&lt;br /&gt;
Safe.&lt;/td&gt;&lt;td&gt;Guest VM modifications required.&lt;br /&gt;
Limits guest VM memory.&lt;/td&gt;&lt;td&gt;Big VMs with tiny memory usage (big balloons).&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Content-based sharing&lt;/td&gt;&lt;td&gt;Transparent.&lt;br /&gt;
Negligible overhead.&lt;/td&gt;&lt;td&gt;Probably limited gains.&lt;/td&gt;&lt;td&gt;VMs run the same application with the same data.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Virtualization has come a long way in delivering techniques that allow organizations to get the most out of their resources.  It's a very tricky problem, however, and there is no magic bullet. If you see 5x over-commit with real workloads, it's almost guaranteed that there's either a lot of ballooning or a lot of paging. We'll be talking more about some of these ideas over the next few months, as we put the finishing touches on our own approach to cramming in as much stuff as possible into as little time and space as we can. :)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-4182191969793939935?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/-1QG86kZmCk" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/4182191969793939935/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/08/virtualization-and-over-subscription.html#comment-form" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/4182191969793939935?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/4182191969793939935?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/-1QG86kZmCk/virtualization-and-over-subscription.html" title="Virtualization and over-subscription: breaking the 100% utilization barrier" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="http://3.bp.blogspot.com/_Na7dk58Iywg/TGFxZMzvwNI/AAAAAAAAAEU/nk8AftSdMCE/s72-c/processes.png" height="72" width="72" /><thr:total>4</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/08/virtualization-and-over-subscription.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Dk8MQnoyfyp7ImA9Wx5SEU8.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-4653293152813199889</id><published>2010-08-06T12:50:00.000-07:00</published><updated>2010-08-06T13:01:23.497-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-08-06T13:01:23.497-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><category scheme="http://www.blogger.com/atom/ns#" term="recipes" /><title>Elastic Build System in Action</title><content type="html">&lt;a href="http://gridcentriclabs.com/products/copper/"&gt;GridCentric's Copper platform&lt;/a&gt; enables multiple diverse applications to securely and easily share the same underlying physical resources. Moreover, it allows these applications to &lt;span style="font-style: italic;"&gt;effortlessly&lt;/span&gt; scale up to create &lt;span style="font-style: italic;"&gt;on-demand homogeneous clusters&lt;/span&gt;, and to scale back down again. &lt;a href="http://blog.gridcentriclabs.com/2010/07/elastic-build-systems.html"&gt;My last post&lt;/a&gt; described how this capability is ideally suited to solving the problems with distributed &lt;a href="http://en.wikipedia.org/wiki/Continuous_Integration"&gt;Continuous Integration&lt;/a&gt; (CI) servers. In this post I will show how we can make &lt;a href="http://hudson-ci.org/"&gt;Hudson&lt;/a&gt;, a popular CI server, aware of the Copper platform, so that it can use the platform's API to dynamically create &lt;span style="font-style: italic;"&gt;identical slave machines&lt;/span&gt; and become an &lt;span style="font-style: italic;"&gt;elastic build system&lt;/span&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Requirements&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://wiki.gridcentriclabs.com/wiki/Downloads"&gt;A Copper deployment&lt;/a&gt; (&lt;a href="http://gridcentriclabs.com/trial"&gt;60-day free fully featured license&lt;/a&gt;)&lt;br /&gt;
&lt;br /&gt;
A virtual cluster named &lt;span style="font-style: italic;"&gt;elastic-hudson&lt;/span&gt; created in the Copper platform. If you are unfamiliar with creating virtual clusters in Copper, please take some time to work through the &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=Main_Page"&gt;tutorials&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Configure the Virtual Cluster&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
This section will describe all the configuration we need to do in Copper to boot the virtual machine that will host our Hudson installation. Once the virtual machine is booted, it will be able to use the Copper API to clone itself.&lt;br /&gt;
&lt;br /&gt;
As mentioned earlier, I have a virtual cluster already created called &lt;span style="font-style: italic;"&gt;elastic-hudson&lt;/span&gt;. For the root container I downloaded the &lt;a href="http://downloads.gridcentriclabs.com/guest-images/ubuntu.9-04.x86-64.qcow2.bz2"&gt;pre-configured Ubuntu image&lt;/a&gt;, but &lt;a href="http://downloads.gridcentriclabs.com/guest-images/"&gt;there are other distros available&lt;/a&gt; if you would prefer something else. The only special configuration for this virtual cluster is that I am going to attach a public network to it.&lt;br /&gt;
&lt;br /&gt;
Whenever a virtual machine uses the clone API, the new virtual machine is automatically configured to be part of a private network shared by all of the clones and the original machine. Since Hudson is a build system, it will need access to my &lt;a href="http://mercurial.selenic.com/"&gt;Mercurial&lt;/a&gt; repository, which is hosted on a separate machine outside this private network.&lt;br /&gt;
&lt;br /&gt;
To remedy this problem, I configured my virtual cluster to also have a public network. In this case, all the virtual machines in the cluster now have access to two networks: the original private network and the public network. I will be using the private network for communication between the Hudson master and the Hudson slaves, and then the public network for the machines to communicate with the repository.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_V1O61leJWWY/TFtCYG57uPI/AAAAAAAAABc/ao5waNB5obE/s1600/hudson_cluster_network_setup.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_V1O61leJWWY/TFtCYG57uPI/AAAAAAAAABc/ao5waNB5obE/s320/hudson_cluster_network_setup.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
That's all the configuration needed, and now I can boot the &lt;span style="font-style: italic;"&gt;elastic-hudson&lt;/span&gt; virtual cluster.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Configure Hudson&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Once the &lt;span style="font-style: italic;"&gt;elastic-hudson&lt;/span&gt; virtual cluster is booted we should be able to log into it.  By using the built-in DNS Server that comes with the Copper platform I can access the virtual cluster by simply using the domain name &lt;span style="font-style: italic;"&gt;elastic-hudson.clusters&lt;/span&gt;. Otherwise, I can use the management tools to determine the virtual machine's public IP address.&lt;br /&gt;
&lt;br /&gt;
Once logged into the virtual machine I will set up a vanilla Hudson install, &lt;span style="font-style: italic;"&gt;as I would do in any other environment&lt;/span&gt;. So I follow &lt;a href="http://hudson-ci.org/debian/"&gt;the Hudson install instructions&lt;/a&gt;, install the tools (e.g. Mercurial, Ant, unzip, wget, make, etc.) that are required to build my project, and finally configure Hudson to build my projects. So far we have not done anything different from a normal Hudson setup.&lt;br /&gt;
&lt;br /&gt;
Now comes the interesting part: making Hudson clone slaves of itself to create an elastic build system. Access to the Copper API is controlled using the familiar Unix permission scheme. Basically a user needs to have read/write access to &lt;span style="font-style: italic;"&gt;/proc/xen/xenbus&lt;/span&gt; in order to utilize the API. By default only the root user has access, and since Hudson runs as the special &lt;span style="font-style: italic;"&gt;hudson&lt;/span&gt; user, we need to grant that user access as well so that the application can clone itself. There are many ways to do this, but we are just going to give everyone access to it:&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;root@elastic-hudson# chmod og+rw /proc/xen/xenbus
&lt;/pre&gt;&lt;br /&gt;
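To confirm the change took effect, the same permission check can be made from the hudson account before Hudson ever tries to clone. This is only a minimal sketch: the function name &lt;span style="font-style: italic;"&gt;can_use_api&lt;/span&gt; is made up for illustration and is not part of Copper, and it tests nothing beyond ordinary read/write access to the path.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: shell"&gt;#!/bin/bash
# can_use_api is a hypothetical helper name, not a Copper command.
# It succeeds only if the calling user can both read and write PATH.
can_use_api() {
  path=$1
  if [ -r "$path" ]; then
    if [ -w "$path" ]; then
      return 0
    fi
  fi
  return 1
}

# Example, run as the hudson user:
#   if can_use_api /proc/xen/xenbus; then echo "ok"; else echo "no access"; fi
&lt;/pre&gt;&lt;br /&gt;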
Now that the hudson user has permission to use the Copper API, the next piece of configuration is that we have to allow Hudson passwordless ssh within the private network. Simply &lt;a href="http://linuxproblem.org/art_9.html"&gt;follow this guide&lt;/a&gt; and use localhost (or 10.0.0.1) for the remote machine as the hudson user. This should enable the hudson user to ssh throughout the private network without the need of a password.&lt;br /&gt;
&lt;br /&gt;
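For reference, the key setup described in the guide linked above boils down to something like the sketch below. It assumes OpenSSH's &lt;span style="font-style: italic;"&gt;ssh-keygen&lt;/span&gt; is installed, and the function name &lt;span style="font-style: italic;"&gt;authorize_own_key&lt;/span&gt; is my own invention. Because every clone is a replica of this machine, authorizing the hudson user's own key locally should be enough for the clones to accept it too.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: shell"&gt;#!/bin/bash
# authorize_own_key HOME_DIR (hypothetical helper): make sure an RSA
# key pair exists under HOME_DIR/.ssh and that its public half is
# listed exactly once in authorized_keys.
authorize_own_key() {
  ssh_dir=$1/.ssh
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  if [ ! -f "$ssh_dir/id_rsa" ]; then
    # -N "" means no passphrase; -q keeps ssh-keygen quiet.
    ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa"
  fi
  touch "$ssh_dir/authorized_keys"
  # Append the public key only if it is not already present.
  if ! grep -q -x -F -f "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"; then
    cat "$ssh_dir/id_rsa.pub" &amp;gt;&amp;gt; "$ssh_dir/authorized_keys"
  fi
  chmod 600 "$ssh_dir/authorized_keys"
}

# Example, run as the hudson user:
#   authorize_own_key "$HOME"
&lt;/pre&gt;&lt;br /&gt;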
Finally we are ready to finish off our setup by configuring a slave within Hudson. Log into the Hudson web interface (e.g. http://elastic-hudson-0.clusters:8080) and go to &lt;span style="font-style: italic;"&gt;Manage Hudson -&amp;gt; Manage Nodes -&amp;gt; New Node&lt;/span&gt;. Enter the name of the slave (e.g. clone-slave), select the dumb slave option, and finally click OK.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_V1O61leJWWY/TFsAxzYy0-I/AAAAAAAAABM/bp2pIUJOYT8/s1600/hudson_slave_config.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img alt="" border="0" height="188" id="BLOGGER_PHOTO_ID_5501992225370985442" src="http://3.bp.blogspot.com/_V1O61leJWWY/TFsAxzYy0-I/AAAAAAAAABM/bp2pIUJOYT8/s400/hudson_slave_config.png" style="display: block; height: 151px; margin: 0px auto 10px; text-align: center; width: 320px;" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
In reality the only thing special about this configuration is that I've specified a custom launch script called &lt;span style="font-style: italic;"&gt;/var/lib/hudson/clone_slave_launcher.sh&lt;/span&gt;. This is the script that Hudson will call when establishing a connection to the slave machine. The first thing we do in this script is use the Copper API to clone the Hudson machine, then we determine the clone's private network IP address, and finally proceed as normal to establish a connection to the newly created slave machine.&lt;br /&gt;
&lt;br /&gt;
Before revealing the script, the other interesting piece of configuration is Hudson's slave availability. I have set it so that Hudson will connect to a slave (or in our case create one) if there is a pending job in the queue for more than a minute. It will then disconnect from the slave (or in our case destroy the machine) if the queue has been empty for more than a minute. In other words, Hudson will automatically scale up its footprint when there are pending jobs it needs to complete, and then automatically scale down once it no longer needs that many resources, freeing them up for other users (potentially other Hudson systems for different teams).&lt;br /&gt;
&lt;br /&gt;
Here is the script that makes Hudson aware of the Copper platform, and performs the magic to make it into an Elastic Build System:&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="brush: shell"&gt;#!/bin/bash
###################################################################
# This script can be used with the Continuous Integration server
# Hudson to allow it to use the GridCentric Copper platform to
# create on-demand slaves that are preconfigured to be identical to
# the master, at the time the slave is created.
#
# It will also destroy the slaves once Hudson has finished with it 
# and turns it offline.
###################################################################


# The SSH command. We are not doing strict host checking because 
# these are clone machines on a private network. We are sure there
# will be no man in the middle attacks.
SSH="ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no"

# The amount of time (in seconds) the master should wait before 
# connecting to the slave. By default it is 5 seconds, but can be
# passed as the first argument to the script.
WAIT_BEFORE_CONNECT=5
if [ $# == 1 ]
then
  WAIT_BEFORE_CONNECT=$1
fi

# Acquire a ticket from the GridCentric Copper platform. The ticket
# basically reserves resources (CPUs, Memory, etc.) on the physical
# cluster to host our clone.
echo "Acquiring ticket for resources to clone on to..."
TICKET=`gc rt 1 1 1 60000 | awk '{ print $2 }'`

if [ ${#TICKET} == 0 ]
then
  echo "Failed to acquire ticket in 1 minute."
  # exit with the EBUSY errno value (16)
  exit 16
else
  # We reserved the resources, so now we will clone using the 
  # ticket. This is the API call that allows this virtual machine 
  # to grow its footprint.

  echo "Cloning on ticket $TICKET..."
  gc clone $TICKET

  # The clone operation is kinda like fork(), afterwards there will
  # be 2 copies of this script running. One on the master virtual 
  # machine (this one) and another on the clone virtual machine. 
  # They will both be at the instruction after the clone (i.e. here),
  # so we do a quick check to see if we are the master or clone and
  # then execute differently.

  gc ismaster

  if [ $? == 0 ]
  then

    # This is the master's task and will be executed on the same 
    # machine that the Hudson instance is running. Basically it 
    # will determine the clone's private network's IP address, wait
    # a little bit of time for the clone to take care of its setup, 
    # and then SSH to the clone and run the Hudson slave jar.
    gc ltd $TICKET
    IP_ADDRESS=`gc ltd $TICKET | grep ip | awk '{ print $2 }'`

    echo "Connecting to clone on $IP_ADDRESS after waiting for $WAIT_BEFORE_CONNECT seconds..."
    sleep $WAIT_BEFORE_CONNECT

    # This command is what lets Hudson communicate with the slave. 
    # Basically it uses the stdin and stdout for communication.
    $SSH $IP_ADDRESS java -jar /var/run/hudson/war/WEB-INF/slave.jar

  else

    # This is the slave's task that will be executed on the clone 
    # machine. It first needs to kill the Hudson process to ensure
    # no conflicts, and then it will sit and poll to see if the 
    # slave.jar job is done. Hudson just kills the script on the 
    # master, so we can't assume we'll get a signal from it. So, we
    # just poll and once the slave.jar stops executing, we 
    # 'gc join' back and kill ourselves.

    # Kill any running Hudson instance 
    HUDSON_PID=`ps aux | grep hudson.war | grep java | awk '{ print $2 }'`
    kill $HUDSON_PID

    # Kill any running SSH connection to another slave. Remember, 
    # this is a clone of the master virtual machine, so any process
    # that was running there is also running here.
    SLAVE_PIDS=`ps aux | grep slave.jar | grep java | awk '{ print $2 }'`
    kill $SLAVE_PIDS

    # Just wait a bit for things to happen
    sleep 60

    # Now we periodically check if slave.jar is still running. If 
    # it is, we are good. Otherwise, we are finished.
    while true;
    do
      SLAVE_PID=`ps aux | grep slave.jar | grep java | awk '{ print $2 }'`
      if [ ${#SLAVE_PID} == 0 ]
      then

        # Slave process has gone. We join, but since the master is 
        # not waiting for us to join, we join with 'noreport'. In 
        # other words, this virtual machine will die and the master
        # will get no report.
        gc join noreport
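      else
        # slave.jar was started by sshd, not by this script, so the
        # shell's 'wait' cannot block on it. Sleep between checks.
        sleep 5
      fi
    done

  fi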
fi
&lt;/pre&gt;&lt;br /&gt;
Here is what it looks like in Hudson when we launch our new slave.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://4.bp.blogspot.com/_V1O61leJWWY/TFsCOJw2rVI/AAAAAAAAABU/K7aHypvbJyg/s1600/hudson_clone_slave_log.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5501993811925445970" src="http://4.bp.blogspot.com/_V1O61leJWWY/TFsCOJw2rVI/AAAAAAAAABU/K7aHypvbJyg/s320/hudson_clone_slave_log.png" style="cursor: pointer; display: block; height: 160px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
Finally, we can add as many of these slaves to Hudson as we like, using the &lt;i&gt;exact same configuration&lt;/i&gt;, if we need more machines in the cluster. These machines will never sit idle because Hudson will create them only when it needs them, then destroy them when it is done with them, and they will be guaranteed to be properly configured because they are &lt;i&gt;exact replicas&lt;/i&gt; of the master machine at the instant they are created.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-4653293152813199889?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/SAGGxlmW6RE" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/4653293152813199889/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/08/elastic-build-system-in-action.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/4653293152813199889?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/4653293152813199889?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/SAGGxlmW6RE/elastic-build-system-in-action.html" title="Elastic Build System in Action" /><author><name>David Scannell</name><uri>http://www.blogger.com/profile/14713329443907634842</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="http://1.bp.blogspot.com/_V1O61leJWWY/TFtCYG57uPI/AAAAAAAAABc/ao5waNB5obE/s72-c/hudson_cluster_network_setup.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/08/elastic-build-system-in-action.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUEHSXY_fyp7ImA9Wx5SEEs.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-4304538663445590039</id><published>2010-08-05T16:36:00.000-07:00</published><updated>2010-08-05T20:00:38.847-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-08-05T20:00:38.847-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><category scheme="http://www.blogger.com/atom/ns#" term="recipes" /><title>Howto: Build and scale a Cassandra cluster in five minutes</title><content type="html">&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Introduction&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;In the spirit of the software &lt;a href="http://blog.gridcentriclabs.com/search/label/recipes"&gt;recipes&lt;/a&gt; we've been posting recently, I decided to give &lt;a href="http://cassandra.apache.org/"&gt;Cassandra&lt;/a&gt; a shot.  Cassandra is a distributed key-value store that was open-sourced by Facebook in 2008 and is now under the umbrella of the Apache Software Foundation.  It advertises high performance and robustness to individual node failures while providing eventual consistency.  It's been receiving quite a bit of attention recently, and &lt;a href="http://www.riptano.com/"&gt;Riptano&lt;/a&gt; has appeared to provide commercial support.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Cassandra is a bit of an odd name for a piece of software in my opinion.  &lt;a href="http://en.wikipedia.org/wiki/Cassandra"&gt;Cassandra&lt;/a&gt; is a figure from Greek mythology who was both blessed with the ability to see the future and cursed so that no one would ever believe her predictions.  I struggled for a while to come up with some hilarious software-equivalent joke, but I think we are all better off without one.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;I spent a bit of time and installed Cassandra on a Copper virtual cluster.  I then wrote a few quick scripts so that clones automatically join the Cassandra cluster when they are created.  GridCentric Copper supports persistent local storage through an awesome VFS abstraction.  As with Hadoop, I've omitted this part of the setup for now to simplify this post, but I reserve the right to make that post in the near future.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Cassandra does not have different classes of nodes, so I had no need to run anything special on the master of the virtual cluster.  Cassandra only requires a &lt;b&gt;seed&lt;/b&gt; node, so that a freshly started instance can learn about the others.  I use the master for this purpose, since we can assume that it's always around.  Other than that, this article details an unmodified Cassandra installation running inside a virtual cluster -- with transient nodes that can be created and destroyed in seconds.&lt;/div&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Requirements&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;A GridCentric Copper deployment (free trial available &lt;a href="http://www.gridcentriclabs.com/trial"&gt;here&lt;/a&gt;, software downloadable &lt;a href="http://downloads.gridcentriclabs.com/"&gt;here&lt;/a&gt;).&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Just as in my &lt;a href="http://blog.gridcentriclabs.com/2010/07/howto-build-hadoop-cluster-in-five.html"&gt;Hadoop post&lt;/a&gt;, I'll assume that you have a virtual cluster already. If you don't know how to create a virtual cluster, you can see the &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=Tutorials"&gt;tutorials&lt;/a&gt;. I just used the cluster-in-a-box script to fire up a new Ubuntu cluster.&lt;/div&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Installing Packages&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;Cassandra needs a Java 6 JRE installed and configured.  In Ubuntu, I just did an &lt;tt&gt;apt-get install sun-java6-jre&lt;/tt&gt;, but the exact steps will vary depending on your distribution.  The next step is to grab the latest Cassandra packages.  Choose a mirror from&lt;br /&gt;
&lt;a href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.6.4/apache-cassandra-0.6.4-bin.tar.gz"&gt;http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.6.4/apache-cassandra-0.6.4-bin.tar.gz&lt;/a&gt;.  Once I had downloaded the latest Cassandra, I extracted the contents and moved the directory to &lt;tt&gt;/opt&lt;/tt&gt;, so I could make a nice little system that conforms to the Linux Standard Base (or at least comes closer to it).&lt;/div&gt;&lt;pre class="brush: bash"&gt;wget &amp;lt;url&amp;gt;
tar -zxvf apache-cassandra-0.6.4-bin.tar.gz
mv apache-cassandra-0.6.4 /opt
&lt;/pre&gt;&lt;br /&gt;
&lt;div&gt;Cassandra is actually ready to go with the default configuration and your Java install can be tested at this point.  Simply run &lt;tt&gt;/opt/apache-cassandra-0.6.4/bin/cassandra -f&lt;/tt&gt; and you should see it start up.  You can press &lt;tt&gt;Ctrl-C&lt;/tt&gt; to kill it, because we want to write a few more scripts and make a cluster.&lt;/div&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Tweaking files and writing Scripts&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;Okay, so one annoying thing about Cassandra is that you need to specify an address inside the configuration file for binding.  This doesn't lend itself well to any kind of cluster environment and it actually confused me -- an interface would make much more sense here.  We need to abstract away this detail so we can tune the bind address on the clone.  In order to do this, we copy &lt;tt&gt;/opt/apache-cassandra-0.6.4/conf/storage-conf.xml&lt;/tt&gt; to &lt;tt&gt;/opt/apache-cassandra-0.6.4/conf/storage-conf.xml.in&lt;/tt&gt; and change the following keys in our new &lt;tt&gt;storage-conf.xml.in&lt;/tt&gt;:&lt;/div&gt;&lt;pre class="brush: xml"&gt;...
  &amp;lt;Seeds&amp;gt;
      &amp;lt;Seed&amp;gt;127.0.0.1&amp;lt;/Seed&amp;gt;
  &amp;lt;/Seeds&amp;gt;
  ...
  &amp;lt;ListenAddress&amp;gt;localhost&amp;lt;/ListenAddress&amp;gt;
  ...
&lt;/pre&gt;to:&lt;br /&gt;
&lt;pre class="brush: xml"&gt;...
  &amp;lt;Seeds&amp;gt;
      &amp;lt;Seed&amp;gt;10.0.0.1&amp;lt;/Seed&amp;gt;
  &amp;lt;/Seeds&amp;gt;
  ...
  &amp;lt;ListenAddress&amp;gt;PRIVATE_ADDRESS&amp;lt;/ListenAddress&amp;gt;
  ...
&lt;/pre&gt;&lt;div&gt;I changed the &lt;b&gt;Seed&lt;/b&gt; key because in our virtual cluster I am going to use the master as the seed.  I changed the &lt;b&gt;ListenAddress&lt;/b&gt; to &lt;b&gt;PRIVATE_ADDRESS&lt;/b&gt; because we are going to write a quick script that replaces that token and generates a unique &lt;tt&gt;storage-conf.xml&lt;/tt&gt; on each clone with the correct addresses.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;In order to write that script, I need a quick helper.  First, I create a script named &lt;tt&gt;get-private-ip&lt;/tt&gt; and put it in &lt;tt&gt;/usr/local/bin&lt;/tt&gt;.&lt;/div&gt;&lt;pre class="brush: bash"&gt;#!/bin/bash
# Match only the IPv4 line; a bare 'inet' would also match 'inet6'.
ifconfig eth0 | grep 'inet addr' | cut -d: -f2 | cut -d' ' -f1
&lt;/pre&gt;&lt;br /&gt;
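&lt;div&gt;On systems where &lt;tt&gt;ifconfig&lt;/tt&gt; is missing or prints a different format, the same lookup can be done with the iproute2 &lt;tt&gt;ip&lt;/tt&gt; tool.  The following is only a sketch under that assumption: &lt;tt&gt;first_ipv4&lt;/tt&gt; is a hypothetical helper name, and it expects the one-line-per-address output of &lt;tt&gt;ip -o&lt;/tt&gt; on stdin.&lt;/div&gt;&lt;pre class="brush: bash"&gt;#!/bin/bash
# first_ipv4 (hypothetical helper): read "ip -o" style lines on stdin
# and print the first IPv4 address found, without the /prefix suffix.
first_ipv4() {
  awk '$3 == "inet" { split($4, parts, "/"); print parts[1]; exit }'
}

# On a real machine (eth0 assumed, as in get-private-ip):
#   ip -o -4 addr show dev eth0 | first_ipv4
&lt;/pre&gt;&lt;br /&gt;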
&lt;div&gt;Next, I create &lt;tt&gt;recreate-cassandra-config&lt;/tt&gt; in &lt;tt&gt;/usr/local/bin&lt;/tt&gt; to actually grab the IP we want, and recreate &lt;tt&gt;storage-conf.xml&lt;/tt&gt; from our tokenized &lt;tt&gt;storage-conf.xml.in&lt;/tt&gt;.&lt;/div&gt;&lt;pre class="brush: bash"&gt;#!/bin/bash
cd /opt/apache-cassandra-0.6.4
address=`get-private-ip`
sed -e "s/PRIVATE_ADDRESS/$address/" conf/storage-conf.xml.in &amp;gt; conf/storage-conf.xml
&lt;/pre&gt;&lt;br /&gt;
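&lt;div&gt;One thing to watch for is an empty address: if the IP lookup prints nothing, the substitution still succeeds and Cassandra ends up with a blank &lt;b&gt;ListenAddress&lt;/b&gt;.  A slightly more defensive variant might look like the sketch below.  The name &lt;tt&gt;render_config&lt;/tt&gt; is hypothetical; it prints the rendered file to stdout rather than writing it in place.&lt;/div&gt;&lt;pre class="brush: bash"&gt;#!/bin/bash
# render_config TEMPLATE ADDRESS (hypothetical helper): print TEMPLATE
# with every PRIVATE_ADDRESS token replaced by ADDRESS.
# Returns non-zero and prints nothing if ADDRESS is empty.
render_config() {
  template=$1
  address=$2
  if [ -z "$address" ]; then
    return 1
  fi
  sed -e "s/PRIVATE_ADDRESS/$address/g" "$template"
}

# Example:
#   render_config conf/storage-conf.xml.in "$(get-private-ip)"
&lt;/pre&gt;&lt;div&gt;When using it for real, it is probably safest to redirect the output to a temporary file and move that into place only if the command succeeds, so a failed lookup never truncates the live &lt;tt&gt;storage-conf.xml&lt;/tt&gt;.&lt;/div&gt;&lt;br /&gt;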
&lt;div&gt;Finally, because I am not going to be using persistent local storage (saving that for a future post), I create a script &lt;tt&gt;clear-cassandra-data&lt;/tt&gt; in &lt;tt&gt;/usr/local/bin&lt;/tt&gt; in order to clear the data directory.  We will run this on clones before kicking off Cassandra.&lt;/div&gt;&lt;pre class="brush: bash"&gt;#!/bin/bash
rm -rf /var/lib/cassandra/data/*
&lt;/pre&gt;&lt;br /&gt;
&lt;div&gt;I'm not a fan of hacking ways of starting and stopping services.  Before running my mini-Cassandra cluster, I threw together a script that lets me start and stop nicely (at least somewhat nicely).  I put this in &lt;tt&gt;/etc/init.d/cassandra&lt;/tt&gt; and then run &lt;tt&gt;update-rc.d cassandra defaults&lt;/tt&gt; in order to install the appropriate symlinks in &lt;tt&gt;/etc/rc*.d&lt;/tt&gt;.&lt;/div&gt;&lt;pre class="brush: bash"&gt;#!/bin/bash
#
# Simple Cassandra init script.
#
# chkconfig: 345 99 99
# description: Cassandra

### BEGIN INIT INFO
# Provides: cassandra
# Required-Start: $network
# Required-Stop: 
# Should-Start: 
# Should-Stop: 
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: start and stop cassandra daemon
# Description: Cassandra
### END INIT INFO

PROG="/opt/apache-cassandra-0.6.4/bin/cassandra -f"
PROGNAME="Cassandra"
LOGFILE=/var/log/cassandra.log
PIDFILE=/var/run/cassandra.pid

running(){
    PID=`cat $PIDFILE 2&amp;gt;/dev/null`

    # Check that the pid is sane.
    if [ "x$PID" == "x" ] ; then
        return 1
    fi

    # Check that the process is alive.
    ps $PID &amp;gt;/dev/null 2&amp;gt;&amp;amp;1 || return 1

    # Looks okay.
    return 0
}

start(){
    echo -n $"Starting $PROGNAME: "

    # Try to start the program.
    if running; then
        echo "Failed.  Maybe remove $PIDFILE?"
        return 1
    fi

    mkdir -p `dirname $LOGFILE`
    $PROG &amp;gt; $LOGFILE 2&amp;gt;&amp;amp;1 &amp;amp;
    PID=$!
    mkdir -p `dirname $PIDFILE`
    echo $PID &amp;gt; $PIDFILE

    echo "Success."
    return 0
}

stop(){
    echo -n $"Stopping $PROGNAME: "

    # Check if it's already stopped.
    if ! running ; then
        echo "Failed.  Already stopped."
        return 1
    fi 

    # Find the PID and kill it.
    PID=`cat $PIDFILE 2&amp;gt;/dev/null`
    if [ "x$PID" == "x" ] ; then
        echo "Failed."
        return 1
    fi
    # (Try five times to kill it).
    for i in `seq 1 5`; do
        kill $PID
        sleep 1
        if ! running ; then
            break
        fi
    done

    # Check if it is finished.
    if running ; then
        echo "Failed."
        return 1
    fi

    # Clear out the pidfile.
    rm -f $PIDFILE
    echo "Success."
    return 0
}

restart(){
    stop
    start
}

status(){
    echo -n $"Status of $PROGNAME: "

    if running ; then
        echo "Running."
        return 0
    else
        echo "Not running."
        return 1
    fi
}

# See how we were called.
case "$1" in
    start)
 start
 RETVAL=$?
 ;;
    stop)
 stop
 RETVAL=$?
 ;;
    status)
 status
 RETVAL=$?
 ;;
    restart)
 restart
 RETVAL=$?
 ;;
    *)
 echo $"Usage: $0 {start|stop|status|restart}"
 RETVAL=2
esac

exit $RETVAL
&lt;/pre&gt;&lt;br /&gt;
&lt;div&gt;Tying everything together, I create a simple post-clone script that will stop Cassandra, reconfigure, reset the data directory, and start Cassandra.  I put this in &lt;tt&gt;/etc/gridcentric/post-clone-on-clone.d/cassandra&lt;/tt&gt; so that it runs on every clone immediately after creation.  Recall that we set the Cassandra seed to point to the master, so after cloning, each clone will automatically join the ring and discover the others.&lt;/div&gt;&lt;pre class="brush: bash"&gt;/etc/init.d/cassandra stop
recreate-cassandra-config
clear-cassandra-data
/etc/init.d/cassandra start&lt;/pre&gt;&lt;br /&gt;
&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Creating the cluster&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;We're finally ready to create the cluster!  To start a single instance of Cassandra on the master, we simply run:&lt;/div&gt;&lt;pre class="brush: bash"&gt;/etc/init.d/cassandra start
&lt;/pre&gt;&lt;div&gt;Now I am free to use nodetool to query Cassandra and start storing data, etc.  As an example, you can run &lt;tt&gt;/opt/apache-cassandra-0.6.4/bin/nodetool -h localhost ring&lt;/tt&gt;.  This shows the current nodes that are in the Cassandra cluster (should be just one).  In order to automatically grow this cluster, we simply clone this virtual machine.  Here, I'll create a cluster of size 5:&lt;/div&gt;&lt;pre class="brush: bash"&gt;gc clone 4 # Creates a cluster of size 5.
&lt;/pre&gt;&lt;div&gt;Now check out &lt;tt&gt;/opt/apache-cassandra-0.6.4/bin/nodetool -h localhost ring&lt;/tt&gt;!&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Check out the video below. &amp;nbsp;It begins just before I start Cassandra and create the cluster, and I show the ring continuously once Cassandra is running.  This is all in real-time (except when I pause, so you can read the labels!), including the cloning of virtual machines.&lt;/div&gt;&lt;br /&gt;
&lt;object height="289" width="480"&gt;&lt;param name="movie" value="http://www.youtube.com/v/O-_QS8mGuNw&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;fmt=22"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/O-_QS8mGuNw&amp;amp;hl=en_US&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="289"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div&gt;&lt;b&gt;Update:&lt;/b&gt; Sorry for the somewhat poor quality of the video.  I'll see if I can get a higher quality version uploaded soon so that it's clear what's going on.  Note that you can click on the video and it will take you to YouTube.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-4304538663445590039?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/shhmn5CeJ1g" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/4304538663445590039/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/08/howto-build-and-scale-cassandra-cluster.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/4304538663445590039?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/4304538663445590039?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/shhmn5CeJ1g/howto-build-and-scale-cassandra-cluster.html" title="Howto: Build and scale a Cassandra cluster in five minutes" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/08/howto-build-and-scale-cassandra-cluster.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;DUcCSXwzeSp7ImA9Wx5TGEQ.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-2596912721672082584</id><published>2010-08-04T15:20:00.000-07:00</published><updated>2010-08-03T21:44:28.281-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-08-03T21:44:28.281-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>Elastic Build Systems</title><content type="html">I am a firm supporter of &lt;a href="http://en.wikipedia.org/wiki/Continuous_Integration"&gt;Continuous Integration&lt;/a&gt; (CI). Throughout my development career I have used &lt;a href="http://cruisecontrol.sourceforge.net/"&gt;CruiseControl&lt;/a&gt; and &lt;a href="http://www.jetbrains.com/teamcity/"&gt;TeamCity&lt;/a&gt;, and at GridCentric we now use &lt;a href="http://hudson-ci.org/"&gt;Hudson&lt;/a&gt;. I am really proud of the setup that we have here. In addition to the usual tasks Hudson performs, such as compiling our source, running unit tests and packaging our distribution, we also generate up-to-date documentation for both the Javadocs and the database schema. It is really cool to have your database schema documentation automatically updated whenever you push a change set.&lt;br /&gt;
&lt;br /&gt;
Eventually as the project grows -- more sub-projects being supported by the CI server, more people committing, more tasks being done on each commit -- it will start to become too much for a single machine to handle. TeamCity supports distributing the build over multiple machines using what it calls agents, and Hudson does something similar with its notion of slaves.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://3.bp.blogspot.com/_V1O61leJWWY/TFhnNDws1mI/AAAAAAAAAA8/BKTuRG9pIQw/s1600/build_cluster.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5501260418878592610" src="http://3.bp.blogspot.com/_V1O61leJWWY/TFhnNDws1mI/AAAAAAAAAA8/BKTuRG9pIQw/s320/build_cluster.png" style="cursor: pointer; display: block; height: 240px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;
Great! We distribute the CI server and we can grow indefinitely living in our happy world of Continuous Integration.&lt;br /&gt;
&lt;br /&gt;
Wait, one second...&lt;br /&gt;
&lt;br /&gt;
Instead of having a single machine that is easy to update, we now have a handful of machines. Each one in isolation is easy to manage, but now we have to worry about keeping them in sync. Suppose we realize we need a new version of &lt;a href="http://ant.apache.org/"&gt;Ant&lt;/a&gt;, or our project's &lt;a href="http://maven.apache.org/"&gt;Maven&lt;/a&gt; settings.xml changes. Instead of simply making the change in one spot, we now need to log into each machine in our build cluster and apply it there. That doesn't sound like fun. It goes against my &lt;a href="http://en.wikipedia.org/wiki/Don%27t_repeat_yourself"&gt;DRY&lt;/a&gt; philosophy, and I can foresee some things falling through the cracks every now and then.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://wiki.hudson-ci.org/display/HUDSON/Distributed+builds#Distributedbuilds-OtherRequirements"&gt;Hudson even comes with this warning&lt;/a&gt;:&lt;br /&gt;
&lt;blockquote&gt;Also note that the slaves are a kind of a cluster, and operating a cluster (especially a large one or heterogeneous one) is always a non-trivial task. For example, you need to make sure that all slaves have JDKs, Ant, CVS, and/or any other tools you need for builds. You need to make sure that slaves are up and running, etc. Hudson is not a clustering middleware, and therefore it doesn't make this any easier.&lt;/blockquote&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/TFhnYf-GTDI/AAAAAAAAABE/ytHYzG1ixhI/s1600/build_cluster_real.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5501260615429540914" src="http://2.bp.blogspot.com/_V1O61leJWWY/TFhnYf-GTDI/AAAAAAAAABE/ytHYzG1ixhI/s320/build_cluster_real.png" style="cursor: pointer; display: block; height: 240px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;
Things only start to look worse when you go from a single team to a company with multiple teams, all trying to keep their mini-build clusters synchronized. The probability that something will be overlooked increases, and at any given time probably half of those machines are just idling, waiting for a build job. This is a waste of resources -- both in terms of physical machines consuming power while doing nothing, and in developers' time spent tracking down build issues because one machine wasn't upgraded properly.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://3.bp.blogspot.com/_V1O61leJWWY/TFg9xz-lWJI/AAAAAAAAAAc/Xuc-8ExiJxI/s1600/build_clusters.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5501214870808647826" src="http://3.bp.blogspot.com/_V1O61leJWWY/TFg9xz-lWJI/AAAAAAAAAAc/Xuc-8ExiJxI/s320/build_clusters.png" style="cursor: pointer; display: block; height: 240px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
Fortunately, GridCentric's &lt;a href="http://www.gridcentriclabs.com/products/copper/"&gt;Copper Virtualization Platform&lt;/a&gt; makes managing multiple homogeneous clusters &lt;span style="font-style: italic;"&gt;extremely&lt;/span&gt; easy.&lt;br /&gt;
&lt;br /&gt;
Copper provides a very simple, but powerful, API call within the virtual machine: &lt;span style="font-style: italic;"&gt;clone&lt;/span&gt;. Within seconds the virtual machine can create &lt;span style="font-style: italic;"&gt;multiple running replicas&lt;/span&gt; of itself &lt;span style="font-style: italic;"&gt;at the state just prior to being cloned&lt;/span&gt;. In addition, the clone virtual machines are automatically networked with the original virtual machine on a private network only visible to them. In other words, by using the &lt;span style="font-style: italic;"&gt;clone&lt;/span&gt; API call we can very easily create a homogeneous cluster based on the configuration of a single machine within seconds.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/TFhjPSmmD2I/AAAAAAAAAA0/ZHzIRmr-LDY/s1600/clone.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5501256059175964514" src="http://2.bp.blogspot.com/_V1O61leJWWY/TFhjPSmmD2I/AAAAAAAAAA0/ZHzIRmr-LDY/s320/clone.png" style="cursor: pointer; display: block; height: 240px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
Need to make a configuration change to your build cluster, such as manually installing a library into the Maven repository, or installing an interesting Ant plugin? Simply destroy your existing cluster (a couple of seconds), perform the configuration changes on your original virtual machine, then clone the cluster back into existence (just a couple of seconds). Keeping your build cluster's configuration in sync could not be easier, or faster.&lt;br /&gt;
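&lt;br /&gt;
As a rough sketch, the cycle could look like the following when run from inside the original virtual machine. It assumes the &lt;tt&gt;gc&lt;/tt&gt; command line tool is installed there and that the cluster should have four clones; the jar path and Maven coordinates are hypothetical placeholders, not part of our actual build:&lt;br /&gt;
&lt;pre&gt;# 1. Destroy the existing clones (assumes the gc tool is available in the VM)
gc killall

# 2. Make the change once, on the original VM. Here: installing a
#    hypothetical library into the local Maven repository.
mvn install:install-file -Dfile=/tmp/example-lib-1.0.jar -DgroupId=com.example -DartifactId=example-lib -Dversion=1.0 -Dpackaging=jar

# 3. Clone the cluster back into existence with the new configuration
gc clone 4&lt;/pre&gt;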
&lt;br /&gt;
&lt;a href="http://2.bp.blogspot.com/_V1O61leJWWY/TFhbftY9YPI/AAAAAAAAAAs/sG0FBuXkXp4/s1600/configure_cluster.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5501247545151414514" src="http://2.bp.blogspot.com/_V1O61leJWWY/TFhbftY9YPI/AAAAAAAAAAs/sG0FBuXkXp4/s320/configure_cluster.png" style="cursor: pointer; display: block; height: 112px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
Of course Copper gives this powerful cloning ability to any virtual machine booted into the platform, enabling it to support multiple build clusters. In fact Copper makes consolidating all the mini-build clusters within an organization onto the same physical hardware possible and simple.&lt;br /&gt;
&lt;br /&gt;
If a new CI server is required, simply boot up a virtual machine in Copper and install the server on it. When it becomes time to distribute the server, clone the virtual machine running it, and its footprint will grow &lt;span style="font-style: italic;"&gt;without any additional configuration&lt;/span&gt;.&lt;br /&gt;
&lt;br /&gt;
The next piece of the puzzle is to teach the Continuous Integration servers about this powerful &lt;span style="font-style: italic;"&gt;clone&lt;/span&gt; API call so that they can automatically scale themselves up into a distributed system when needed, and scale back down when idle. Once this piece is solved we'll have a truly remarkable CI server, and a lack of resources will no longer be an excuse for any team within an organization not to be doing Continuous Integration.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-2596912721672082584?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/DSyFyQ23qW0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/2596912721672082584/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/07/elastic-build-systems.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/2596912721672082584?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/2596912721672082584?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/DSyFyQ23qW0/elastic-build-systems.html" title="Elastic Build Systems" /><author><name>David Scannell</name><uri>http://www.blogger.com/profile/14713329443907634842</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/_V1O61leJWWY/TFhnNDws1mI/AAAAAAAAAA8/BKTuRG9pIQw/s72-c/build_cluster.png" 
height="72" width="72" /><thr:total>1</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/07/elastic-build-systems.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A0cHRHw6eCp7ImA9Wx5TFUw.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-1744725858857487812</id><published>2010-07-30T11:38:00.000-07:00</published><updated>2010-07-30T12:43:55.210-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-07-30T12:43:55.210-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><category scheme="http://www.blogger.com/atom/ns#" term="recipes" /><title>Howto: Build a Hadoop cluster in five minutes</title><content type="html">&lt;div&gt;One of the key benefits of Copper is that it allows very different applications to easily and securely share physical resources. We have standard packages that make grid queueing and MPI integration dead simple, but I figured it might be valuable to show off a newer paradigm, like Hadoop.  This continues what Kannan started, demonstrating how Copper makes deploying real-world services simple.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;Hadoop is quickly gaining traction as a powerful tool for data analysis.  A key benefit of Hadoop is that it integrates data management and a map-reduce engine, so that analysis and data processing jobs can be scheduled and performed in a data-aware fashion (e.g. computation goes to where the data is). This post will show a relatively simple setup, without integrating Copper's local storage containers (I'll reserve the right to do this step in a future post! :).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;b&gt;Requirements&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;A GridCentric Copper deployment (free trial available &lt;a href="http://www.gridcentriclabs.com/trial"&gt;here&lt;/a&gt;, software downloadable &lt;a href="http://downloads.gridcentriclabs.com/"&gt;here&lt;/a&gt;).&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;&lt;/div&gt;&lt;div&gt;I'll assume that like me, you have a virtual cluster named 'hadoop'. If you don't know how to create a virtual cluster, you can see the&amp;nbsp;&lt;a href="http://wiki.gridcentriclabs.com/index.php?title=Tutorials"&gt;tutorials&lt;/a&gt;. I just used the&amp;nbsp;&lt;tt&gt;cluster-in-a-box&lt;/tt&gt;&amp;nbsp;script to fire up a new Ubuntu cluster.&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;&lt;b&gt;Install Cloudera Hadoop Packages&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;The first step is to jump into our virtual cluster (using &lt;tt&gt;gc-vm console&lt;/tt&gt; or via &lt;tt&gt;ssh&lt;/tt&gt;) and install the Cloudera Hadoop distribution.  I'm using Ubuntu Jaunty, so I'll be following the Cloudera instructions for Debian-based distributions (the steps differ only slightly for RPM-based ones).  First, we'll edit &lt;tt&gt;/etc/apt/sources.list&lt;/tt&gt;:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Make sure that the Ubuntu repositories include universe and multiverse.&lt;/li&gt;
&lt;li&gt;Add the Cloudera repositories:&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;tt&gt;deb http://archive.cloudera.com/debian jaunty-cdh3 contrib&lt;/tt&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;Next, we'll install all the Hadoop packages (you may have to type '&lt;b&gt;Y&lt;/b&gt;' to accept unsigned packages, or you can go through the &lt;a href="https://docs.cloudera.com/display/DOC/Hadoop+(CDH3)+Quick+Start+Guide"&gt;Cloudera instructions&lt;/a&gt; and install the key for their repository).&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;tt&gt;sudo apt-get update&lt;/tt&gt;&lt;/div&gt;&lt;div&gt;&lt;tt&gt;sudo apt-get install hadoop-0.20-conf-pseudo&lt;/tt&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;That was easy! Hadoop is installed and ready to run in pseudo-distributed mode.  Don't run it just yet though.  Pseudo-distributed mode runs all the services on a single node.  Although that's a good start, I really want to run it across dozens of machines in the next couple of minutes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;b&gt;Tweak Configuration Files&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;Before we edit files to create a fully distributed Hadoop, we need to know a few things.  Each Hadoop cluster requires exactly one &lt;i&gt;namenode&lt;/i&gt; and one &lt;i&gt;jobtracker&lt;/i&gt;.  These services co-ordinate the activities of the &lt;i&gt;datanodes&lt;/i&gt; and &lt;i&gt;tasktrackers,&lt;/i&gt; which host storage and co-ordinate computation respectively.  In our setup, the virtual cluster master will host the &lt;i&gt;namenode&lt;/i&gt; and &lt;i&gt;jobtracker&lt;/i&gt;, and the clones will each run a &lt;i&gt;datanode&lt;/i&gt; and a &lt;i&gt;tasktracker&lt;/i&gt;.  We will automate this so that whenever we create new clones, they will just start the appropriate services and join the Hadoop cluster!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;First I'm going to edit &lt;tt&gt;/etc/hadoop/conf/mapred-site.xml&lt;/tt&gt; to ensure that the &lt;i&gt;tasktracker&lt;/i&gt; that runs on the clones always talks to the &lt;i&gt;jobtracker&lt;/i&gt; on our master.  In our configuration, our master always gets the address &lt;tt&gt;10.0.0.1&lt;/tt&gt; on its virtual cluster private interface.  So, in &lt;tt&gt;mapred-site.xml&lt;/tt&gt;, I simply changed &lt;tt&gt;localhost:8021&lt;/tt&gt;&amp;nbsp;to &lt;tt&gt;10.0.0.1:8021&lt;/tt&gt;.  It now reads:&lt;br /&gt;
&lt;pre&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;pre&gt;&amp;lt;?xml version="1.0"?&amp;gt;
&amp;lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&amp;gt;

&amp;lt;configuration&amp;gt;
&amp;lt;property&amp;gt;
&amp;lt;name&amp;gt;mapred.job.tracker&amp;lt;/name&amp;gt;
&amp;lt;value&amp;gt;&lt;b&gt;10.0.0.1:8021&lt;/b&gt;&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;/configuration&amp;gt;&lt;/pre&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;Second, I will do something similar for the &lt;i&gt;datanodes&lt;/i&gt;. Because we will run the &lt;i&gt;namenode&lt;/i&gt; on the master of the virtual cluster, we need to edit &lt;tt&gt;/etc/hadoop/conf/core-site.xml&lt;/tt&gt; so that the &lt;i&gt;datanodes&lt;/i&gt; on the clones talk back to the master. As seen in my new &lt;tt&gt;core-site.xml&lt;/tt&gt; below, I changed &lt;tt&gt;&amp;lt;value&amp;gt;hdfs://localhost:8020&amp;lt;/value&amp;gt;&lt;/tt&gt;&amp;nbsp;to be &lt;tt&gt;&amp;lt;value&amp;gt;hdfs://10.0.0.1:8020&amp;lt;/value&amp;gt;&lt;/tt&gt;:&lt;/div&gt;&lt;div&gt;&lt;pre&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;pre&gt;&amp;lt;?xml version="1.0"?&amp;gt;
&amp;lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&amp;gt;

&amp;lt;configuration&amp;gt;
&amp;lt;property&amp;gt;
&amp;lt;name&amp;gt;fs.default.name&amp;lt;/name&amp;gt;
&amp;lt;value&amp;gt;&lt;b&gt;hdfs://10.0.0.1:8020&lt;/b&gt;&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;

&amp;lt;property&amp;gt;
&amp;lt;name&amp;gt;hadoop.tmp.dir&amp;lt;/name&amp;gt;
&amp;lt;value&amp;gt;&lt;b&gt;/data/hadoop/${user.name}&lt;/b&gt;&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;/configuration&amp;gt;&lt;/pre&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;In the above file, I also changed &lt;tt&gt;hadoop.tmp.dir&lt;/tt&gt; to &lt;tt&gt;/data/hadoop/${user.name}&lt;/tt&gt;.  This is really groundwork for later, when I hook in Copper local storage containers.  Since the services run as the &lt;tt&gt;hadoop&lt;/tt&gt; user, &lt;tt&gt;${user.name}&lt;/tt&gt; expands to &lt;tt&gt;hadoop&lt;/tt&gt;, so for now I'll just create that directory and make sure that the permissions are correct.&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;tt&gt;root@ubuntu# mkdir -p /data/hadoop/hadoop&lt;/tt&gt;&lt;/div&gt;&lt;div&gt;&lt;tt&gt;root@ubuntu# chown hadoop:hadoop /data/hadoop/hadoop&lt;/tt&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;b&gt;Start Services&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;Finally, before starting the &lt;i&gt;namenode&lt;/i&gt; and &lt;i&gt;jobtracker&lt;/i&gt; on the master, I need to format the storage directory used by the &lt;i&gt;namenode&lt;/i&gt;.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div style="overflow: hidden;"&gt;&lt;blockquote&gt;&lt;pre&gt;root@ubuntu# su - hadoop -c "hadoop namenode -format"
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: &amp;nbsp; host = ubuntu.gridcentric.ca/192.168.1.76
STARTUP_MSG: &amp;nbsp; args = [-format]
STARTUP_MSG: &amp;nbsp; version = 0.20.2+320
STARTUP_MSG: &amp;nbsp; build = &amp;nbsp;-r 9b72d268a0b590b4fd7d13aca17c1c453f8bc957; compiled by 'root' on Mon Jun 28 23:15:26 UTC 2010
************************************************************/
Re-format filesystem in /var/lib/hadoop-0.20/cache/hadoop/dfs/name ? (Y or N) Y
10/07/30 18:09:29 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
10/07/30 18:09:29 INFO namenode.FSNamesystem: supergroup=supergroup
10/07/30 18:09:29 INFO namenode.FSNamesystem: isPermissionEnabled=false
10/07/30 18:09:29 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/07/30 18:09:29 INFO common.Storage: Storage directory /var/lib/hadoop-0.20/cache/hadoop/dfs/name has been successfully formatted.
10/07/30 18:09:29 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu.gridcentric.ca/192.168.1.76
************************************************************/&lt;/pre&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;At last, I start up the &lt;i&gt;namenode&lt;/i&gt; and &lt;i&gt;jobtracker&lt;/i&gt; on the master.&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;root@ubuntu# /etc/init.d/hadoop-*-namenode start
root@ubuntu# /etc/init.d/hadoop-*-jobtracker start&lt;/pre&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;Now I am running the services that will stay on the master.  On each of the clones that I create, I would like to have a &lt;i&gt;tasktracker&lt;/i&gt; and a &lt;i&gt;datanode&lt;/i&gt; running.  In order to automate this from the beginning, I simply install a post-clone hook by creating the file &lt;tt&gt;/etc/gridcentric/post-clone-on-clone.d/hadoop&lt;/tt&gt; with the following contents and making sure that it is executable.  Because each clone starts as a replica of the running master, the hook first stops the &lt;i&gt;namenode&lt;/i&gt; and &lt;i&gt;jobtracker&lt;/i&gt; that the clone inherits, then starts the &lt;i&gt;datanode&lt;/i&gt; and &lt;i&gt;tasktracker&lt;/i&gt;.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;pre&gt;/etc/init.d/hadoop-*-namenode    stop
/etc/init.d/hadoop-*-jobtracker  stop
/etc/init.d/hadoop-*-datanode    start
/etc/init.d/hadoop-*-tasktracker start
&lt;/pre&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;b&gt;Creating Your Cluster and Running Jobs&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;I am ready to create my Hadoop cluster.  To create four clones running &lt;i&gt;datanodes&lt;/i&gt; and &lt;i&gt;tasktrackers&lt;/i&gt;, I simply run:&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;pre&gt;root@ubuntu# gc clone 4
&lt;/pre&gt;&lt;/blockquote&gt;&lt;div&gt;I wait for 30 seconds or so for the &lt;i&gt;datanodes&lt;/i&gt; and &lt;i&gt;tasktrackers&lt;/i&gt; to co-ordinate with the master and then I run sample Hadoop applications!&lt;/div&gt;&lt;div style="overflow: hidden;"&gt;&lt;pre&gt;root@ubuntu:~# hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar pi 2 100000
Number of Maps  = 2
Samples per Map = 100000
Wrote input for Map #0
Wrote input for Map #1
Starting Job
10/07/29 22:31:21 INFO mapred.FileInputFormat: Total input paths to process : 2
10/07/29 22:31:21 INFO mapred.JobClient: Running job: job_201007292045_0008
10/07/29 22:31:22 INFO mapred.JobClient:  map 0% reduce 0%
10/07/29 22:31:51 INFO mapred.JobClient:  map 100% reduce 0%                      
10/07/29 22:32:12 INFO mapred.JobClient:  map 100% reduce 100%
10/07/29 22:32:22 INFO mapred.JobClient: Job complete: job_201007292045_0008
10/07/29 22:32:22 INFO mapred.JobClient: Counters: 18
10/07/29 22:32:22 INFO mapred.JobClient:   Job Counters
10/07/29 22:32:22 INFO mapred.JobClient:     Launched reduce tasks=1
10/07/29 22:32:22 INFO mapred.JobClient:     Launched map tasks=2
10/07/29 22:32:22 INFO mapred.JobClient:     Data-local map tasks=2
10/07/29 22:32:22 INFO mapred.JobClient:   FileSystemCounters
10/07/29 22:32:22 INFO mapred.JobClient:     FILE_BYTES_READ=50
10/07/29 22:32:22 INFO mapred.JobClient:     HDFS_BYTES_READ=236
10/07/29 22:32:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=170
10/07/29 22:32:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
...&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;
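Rather than waiting a fixed 30 seconds, I could also ask the &lt;i&gt;namenode&lt;/i&gt; which &lt;i&gt;datanodes&lt;/i&gt; have registered before submitting a job. A minimal sketch, assuming the stock &lt;tt&gt;hadoop dfsadmin -report&lt;/tt&gt; command that ships with Hadoop 0.20 and the same &lt;tt&gt;hadoop&lt;/tt&gt; user as above (I have left out its output, since the exact format varies between versions):&lt;br /&gt;
&lt;pre&gt;# Run on the master as root. The report lists every datanode that has
# registered with the namenode, so freshly cloned nodes should show up
# here once they have joined the cluster.
su - hadoop -c "hadoop dfsadmin -report"&lt;/pre&gt;&lt;br /&gt;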
To grow my virtual cluster, I can simply clone again. &amp;nbsp;When I am finished, I run &lt;tt&gt;gc killall&lt;/tt&gt; to remove my clones and free up the resources for someone else. &amp;nbsp;That was pretty easy!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-1744725858857487812?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/01WxbHfRzvE" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/1744725858857487812/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/07/howto-build-hadoop-cluster-in-five.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/1744725858857487812?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/1744725858857487812?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/01WxbHfRzvE/howto-build-hadoop-cluster-in-five.html" title="Howto: Build a Hadoop cluster in five minutes" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/07/howto-build-hadoop-cluster-in-five.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEAMQn0_fSp7ImA9Wx5TFUw.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-3215172782468355353</id><published>2010-07-28T13:55:00.000-07:00</published><updated>2010-07-30T12:06:23.345-07:00</updated><app:edited 
xmlns:app="http://www.w3.org/2007/app">2010-07-30T12:06:23.345-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><category scheme="http://www.blogger.com/atom/ns#" term="recipes" /><title>Howto: Build a ten node memcached cluster in five minutes</title><content type="html">&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Introduction&lt;/span&gt;&lt;br /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;a href="http://memcached.org/"&gt;Memcached&lt;/a&gt; is used to cache everything from dynamically generated web pages to database query results.  Multiple memcached instances are used in situations where a single memcached server does not have enough CPU or RAM to fulfill application requirements.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;In a distributed setup, memcached servers run independently of each other, and have no knowledge of other servers.  Memcached's design pushes the logic of dealing with multiple servers to the client, which makes things very simple on the server end.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;a href="http://www.gridcentriclabs.com/products/copper/"&gt;GridCentric Copper&lt;/a&gt; is a bare-metal-up virtualization and management stack built to manage "virtual clusters". Copper's fast virtual machine cloning (which works similarly to &lt;span style="font-family: 'courier new';"&gt;fork()&lt;/span&gt; in UNIX, except it creates clones of entire running virtual machines in just a few seconds) can be used to quickly expand and contract a collection of memcached servers.  We're going to use this to make a single memcached virtual machine, then grow that machine to a pool of 10 virtual machines in a couple of seconds.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Requirements&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;GridCentric Copper (free fully featured, 60-day trial available &lt;a href="http://gridcentriclabs.com/trial"&gt;here&lt;/a&gt;, software downloadable &lt;a href="http://downloads.gridcentriclabs.com/"&gt;here&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;The GridCentric DNS mapper (part of the Copper distribution).&lt;/li&gt;
&lt;li&gt;Enough free hardware resources to host 10 virtual machines.&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Process&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: 130%;"&gt;Create a New Virtual Cluster&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;First, create a new virtual cluster using either the included "cluster-in-a-box" script or via the &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=AdminConsole#Add_New_Virtual_Cluster"&gt;Admin Web Console&lt;/a&gt;. In this walkthrough we will name our virtual cluster "memcached". Make sure to give the virtual cluster a managed public network interface, and make sure the "master only" checkbox is unchecked.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
Let's boot the memcached virtual cluster.&lt;/div&gt;&lt;pre&gt;gc-vc boot memcached
&lt;/pre&gt;&lt;div&gt;This will bring up a single virtual machine - a cluster of size 1.  This virtual machine will be the basis of our memcached cluster, so we need to set it up.  We get a console on the new master with:&lt;/div&gt;&lt;pre&gt;gc-vm console memcached-0
&lt;/pre&gt;This will give us a login prompt on the master.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size: 130%;"&gt;Install memcached&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
We install memcached according to the README and configure it to listen on all addresses (edit the /etc/memcached.conf file and change '-l 127.0.0.1' to '-l 0.0.0.0'), and then start it up.&lt;br /&gt;
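&lt;br /&gt;
On an Ubuntu guest this step might look like the sketch below. The package name and init script are assumptions based on the stock Ubuntu memcached package rather than anything Copper-specific, so adjust them if you built memcached from source:&lt;br /&gt;
&lt;pre&gt;# Install memcached (assumes an Ubuntu/Debian guest with apt)
sudo apt-get install memcached

# Listen on all addresses instead of only loopback
sudo sed -i 's/^-l 127\.0\.0\.1/-l 0.0.0.0/' /etc/memcached.conf

# Restart so the new listen address takes effect
sudo /etc/init.d/memcached restart&lt;/pre&gt;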
&lt;br /&gt;
&lt;div&gt;We now have a Copper virtual cluster, which for now contains a single virtual machine running memcached.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size: 130%;"&gt;Create some clone virtual machines&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
To start up more memcached instances, we invoke the &lt;a href="http://wiki.gridcentriclabs.com/index.php?title=Gc"&gt;gc command line tool&lt;/a&gt; from within the virtual machine to create some clones of itself:&lt;/div&gt;&lt;pre&gt;[ubuntu]$ gc clone 9
&lt;/pre&gt;After a few seconds, the clone command returns, and we now have 10 virtual machines running memcached: the original plus 9 clones.&lt;br /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;Note: we just scaled 1 memcached instance to 10 with zero extra configuration, in just a few seconds!&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: 130%;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;span style="font-size: 100%;"&gt;&lt;span style="font-size: 130%;"&gt;Configure the client address pool&lt;/span&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;&lt;div&gt;Now we have to let the memcached clients know about the servers so that the clients can add the servers to their pool.&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;One easy method is just to use the GridCentric DNS service.  This service runs on any computer with access to the Copper head node, and does dynamic mapping of DNS lookups to virtual clusters within Copper.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;One of its particularly nice features is that it maps the name of a running virtual cluster to the list of all public managed IPs on that virtual cluster.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;On some machine which has access to the GridCentric DNS service (let's say it's running on host gridcentric-headnode), we execute the following:&lt;/div&gt;&lt;pre&gt;$ dig +short memcached @gridcentric-headnode
192.168.1.144
192.168.1.142   
192.168.1.143   
192.168.1.140   
192.168.1.146   
192.168.1.141   
192.168.1.145   
192.168.1.149   
192.168.1.147   
192.168.1.148&lt;/pre&gt;&lt;br /&gt;
&lt;div&gt;&lt;/div&gt;&lt;div&gt;This returns a list of public IPs for all 10 of the memcached servers running within the virtual cluster.  You could even design your web application to automatically query this DNS server and periodically update its view of the set of servers running.&lt;br /&gt;
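&lt;br /&gt;
For example, a client host could turn the DNS answer into a server list with a line of shell. This is only a sketch: it assumes memcached is listening on its default port 11211 on every clone, and the output file is a hypothetical location that your application would read:&lt;br /&gt;
&lt;pre&gt;# Look up the current members of the "memcached" virtual cluster, append the
# (assumed) default memcached port to each address, join them with commas,
# and save the result to a hypothetical file for the application to read
dig +short memcached @gridcentric-headnode | sed 's/$/:11211/' | paste -sd, - | tee /tmp/memcached-servers.txt&lt;/pre&gt;
Running this from cron would refresh the list periodically as the cluster grows or shrinks.&lt;br /&gt;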
&lt;br /&gt;
That's it! You're now ready to start using memcached on your client nodes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;Homework:&lt;br /&gt;
&lt;div&gt;Try to accomplish the above on a traditional cluster setup ;)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-3215172782468355353?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/bmXwcGmu7Ec" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/3215172782468355353/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/07/how-to-build-10-core-memcached-cluster.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/3215172782468355353?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/3215172782468355353?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/bmXwcGmu7Ec/how-to-build-10-core-memcached-cluster.html" title="Howto: Build a ten node memcached cluster in five minutes" /><author><name>Kannan Vijayan</name><uri>http://www.blogger.com/profile/01218320916390868729</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/07/how-to-build-10-core-memcached-cluster.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0YMRXs9fip7ImA9WxFTGEg.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-327435095513843347</id><published>2010-04-09T10:11:00.000-07:00</published><updated>2010-04-09T15:13:04.566-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-04-09T15:13:04.566-07:00</app:edited><category 
scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>A new grid computing paradigm explained</title><content type="html">&lt;div style="text-align: left;"&gt;Copper represents a new paradigm for grid computing. &lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Why? Copper flips some of the standard practices in grid computing on their head.  I'll illustrate this by starting with a typical cluster environment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Na7dk58Iywg/S79_6VD_v0I/AAAAAAAAAC8/jVjWfFNNcW0/s1600/old_computing.png"&gt;&lt;img src="http://2.bp.blogspot.com/_Na7dk58Iywg/S79_6VD_v0I/AAAAAAAAAC8/jVjWfFNNcW0/s400/old_computing.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5458221913459375938" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 226px; " /&gt;&lt;/a&gt;&lt;div&gt;&lt;div style="text-align: center;"&gt;(click for larger version)&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this environment, users log in to a cluster headnode (or login node) and submit jobs (as scripts) to a scheduler that copies those scripts to several pre-configured slave nodes (a.k.a. compute nodes).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The software environment that runs on the headnode must be carefully crafted by an administrator to meet all of the users' needs and accommodate each application that they may want to run.  Note that users do not have any administrative control over the cluster in this context, so they may not freely install new software.  
This leads to users building software in their personal home directories, using large, statically linked binaries and often banging their heads against the wall.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Additionally, administrators must ensure that the slave nodes are kept in sync with the headnode.  If a new software package is installed on the headnode, it must be installed on each of the slaves.  If a configuration or library changes, then it must be changed on each of the slaves.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A Copper computing environment is fundamentally different.&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Na7dk58Iywg/S7-AX7v44YI/AAAAAAAAADM/Vyyh8Nr2O4E/s1600/new_computing.png"&gt;&lt;img src="http://1.bp.blogspot.com/_Na7dk58Iywg/S7-AX7v44YI/AAAAAAAAADM/Vyyh8Nr2O4E/s400/new_computing.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5458222422060228994" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 296px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;(click for larger version)&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this case, users may each be given a virtual headnode on which they have complete administrative control.  This means that they can freely change the configuration and install new software.  They are also free to use the distribution and tools that they are familiar with (and possibly have on their workstations).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When jobs are submitted to a queue in a Copper cluster, slave nodes are created on the fly as &lt;b&gt;exact&lt;/b&gt; clones of the headnode at that moment.  
This means that the user does not need to worry about updating software on any slave nodes, as everything will be exactly the same as it was on their virtual headnode.  If their software ran on the headnode, then it is guaranteed to run on their slave nodes.  When the jobs complete, the slave nodes are automatically destroyed and their associated resources are freed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;How is this possible? Our technology allows virtual machines to clone nearly instantaneously and with almost zero overhead. Using this cloning for grid computing means that you don't have to worry about keeping environments in sync. Ever.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;In fact, the above diagram is not really accurate.  Scripts are not copied to the slave VMs, because &lt;b&gt;they are already there&lt;/b&gt;.  Because the slaves are guaranteed to be up-to-date as of the point when you submitted the job, they can just run a &lt;i&gt;command&lt;/i&gt; that you specify instead of having to copy a script.  
That's fundamentally cool; it's like we copy the whole headnode whenever you run a job.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an example, this will run flawlessly on Copper, but not on traditional clusters:&lt;/div&gt;&lt;div&gt;&lt;pre&gt;apt-get install python     # Install Python.&lt;br /&gt;vim myprogram.py           # Edit my Python program.&lt;br /&gt;gcqsub python myprogram.py # Submit my Python program to the queue.&lt;/pre&gt;&lt;/div&gt;&lt;div&gt;In the above, I install Python, edit a program, and run it through the grid queue in steps that are just as natural as using my workstation.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-327435095513843347?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/YAElK9vwUKc" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/327435095513843347/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/04/new-grid-computing-paradigm-explained.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/327435095513843347?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/327435095513843347?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/YAElK9vwUKc/new-grid-computing-paradigm-explained.html" title="A new grid computing paradigm explained" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail 
xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/_Na7dk58Iywg/S79_6VD_v0I/AAAAAAAAAC8/jVjWfFNNcW0/s72-c/old_computing.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/04/new-grid-computing-paradigm-explained.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUMHQH4yeSp7ImA9WxFQFU8.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-6027774335170001244</id><published>2010-04-08T07:19:00.000-07:00</published><updated>2010-05-10T13:30:31.091-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-05-10T13:30:31.091-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>GridCentric Inc. Announces Copper™ Cluster Management Software</title><content type="html">&lt;div style="border-width: medium medium 1pt; border-style: none none solid; font-weight: bold;"&gt;  &lt;p class="MsoNormal" style="text-align: center; border: medium none; padding: 0in;" align="center"&gt;&lt;span style=";font-family:&amp;quot;;font-size:14pt;"  &gt;Copper™ Combines Cloud and Grid Computing Technologies to Provide Unprecedented Ease-of-Use and Flexibility&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;/div&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Toronto, Canada – April 8th, 2010 – GridCentric Inc., a Toronto-based software company, today announced the availability of Copper™, a cluster management system for high performance computing workloads. 
Copper™ combines virtualization and grid computing technologies to enable simple, efficient and flexible cluster deployment, management, and use.&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;The GridCentric Copper™ platform enables real-time, on-demand sharing of high performance computing resources and makes cluster administration a one-click task. With Copper™, organizations reduce IT expenditures through significantly lower installation and management costs, higher utilization of their physical infrastructure, and improved sharing of their data assets. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Operators of large clusters face enormous challenges due to system complexity – compute clusters are typically composed of hundreds or even thousands of individual computers, each requiring separate software components for hardware provisioning and management, resource allocation, job scheduling, and application support. Initial setup of a compute cluster can take several months because of these challenges.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;By combining technologies from cloud and grid computing, GridCentric’s Copper™ platform provides server provisioning, resource management, job control, and high-level application support in a single integrated software package. With Copper™, the time required to set up and configure a compute cluster goes from months to days. 
Additionally, Copper™ has built-in support for operating hundreds of “virtual clusters” on the same set of physical resources, scaling their compute footprints on-demand in real time – the industry’s first true high-performance cloud computing platform. Copper™ gives cluster users the power of a supercomputer with the ease-of-use of a PC. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;“GridCentric's approach to cluster management will enable many organizations to service users in ways that would have otherwise been impractical or impossible.&lt;span style=""&gt;  &lt;/span&gt;Our experiences with it to date have been very positive.&lt;span style=""&gt;  &lt;/span&gt;Copper™ is the way of the future for many organizations,” said Professor Michael Bauer of the University of Western Ontario, and Associate Director of the SHARCNET academic supercomputing consortium. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Copper™ has been in limited trial on a SHARCNET compute cluster located at York University in Toronto, Canada since late 2009, and is now open to all researchers from SHARCNET’s 17 academic member organizations. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;“GridCentric’s Copper™ product represents a new class of cluster management software,” said Tim Smith, co-founder and CEO of GridCentric Inc. 
“Systems integrators, cluster operators, and end-users will all benefit from Copper’s unprecedented ease of installation, management, and use.”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Availability and Pricing&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Copper is immediately available through authorized systems integrators and resellers at a list price of $1250.00 per compute node. Volume and Academic discounts may apply. For more information, please visit &lt;/span&gt;&lt;a href="http://www.gridcentriclabs.com/copper"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;www.gridcentriclabs.com/copper&lt;/span&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt; or e-mail &lt;/span&gt;&lt;a href="mailto:sales@gridcentriclabs.com"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;sales@gridcentriclabs.com&lt;/span&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;About GridCentric Inc.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;GridCentric Inc. is a technology-leading systems, networking, and virtualization software company. It is the mission of GridCentric to make high performance computing easy, without sacrificing performance and flexibility. GridCentric is a privately held corporation, and is funded in part by Rogers Ventures. GridCentric was recently named as one of the “Top 25 Canadian ICT Up and Comers” by the Branham Group. 
For more information, please visit &lt;/span&gt;&lt;a href="http://www.gridcentriclabs.com/"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;www.gridcentriclabs.com&lt;/span&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;About SHARCNET&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;SHARCNET was established in 2001. It is one of seven world-leading Compute Canada (&lt;/span&gt;&lt;a href="http://www.computecanada.org/#_blank"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;www.computecanada.org&lt;/span&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt;) supercomputing consortia. SHARCNET currently serves 14 universities, 2 colleges, and one research institute across western Ontario. For more information, please visit &lt;/span&gt;&lt;a href="http://www.sharcnet.ca/"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;www.sharcnet.ca&lt;/span&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;a href="mailto:blim@gridcentriclabs.com"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;/span&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-6027774335170001244?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/zKszbrPs5mA" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/6027774335170001244/comments/default" title="Post Comments" /><link rel="replies" type="text/html" 
href="http://blog.gridcentriclabs.com/2010/04/gridcentric-inc-announces-copper.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6027774335170001244?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6027774335170001244?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/zKszbrPs5mA/gridcentric-inc-announces-copper.html" title="GridCentric Inc. Announces Copper™ Cluster Management Software" /><author><name>Tim Smith</name><uri>http://www.blogger.com/profile/14856519127995550658</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/04/gridcentric-inc-announces-copper.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A0QDR3s6eCp7ImA9WxBWE0w.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-9081913929518616342</id><published>2010-02-04T07:10:00.000-08:00</published><updated>2010-02-04T13:36:16.510-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-04T13:36:16.510-08:00</app:edited><title>Grid Queueing, The Copper Way</title><content type="html">So let's talk about grid queueing.  Before I wax poetic about how GridCentric's cluster operating system makes grid queueing better, it's probably a good idea to start with a delineation of what the current state of grid queueing is, and some of the problems we're trying to solve for people who use it.&lt;br /&gt;&lt;br /&gt;Conceptually, grid queueing is simple: you have a cluster, and you want to utilize the free compute nodes in the cluster to execute programs.  
So you install a grid queueing engine, which is aware of all the computers in your cluster, and you submit your jobs to it.  The queueing engine figures out which computers are capable of executing your jobs and what other jobs have been submitted by other people, prioritizes them according to its configuration, and communicates with the hosts on the cluster to run them.  It monitors and stores the execution and state of these jobs and reports them to the user as requested.&lt;br /&gt;&lt;br /&gt;Pretty straightforward, yes?  Well... conceptually yes... but the devil, as always, is in the details. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The Setup&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Let's say Uma the user, an analyst, has three types of heavyweight programs she uses routinely:&lt;br /&gt;&lt;br /&gt;1. A reporting program that analyzes logs and generates XML reports.  This program is written in Python and uses a wide set of Python libraries, including bindings to a C XML library.&lt;br /&gt;&lt;br /&gt;2. A JDBC client that connects to her company's database, retrieves a dataset, and performs a long-running analysis on it, storing the results back into the database.&lt;br /&gt;&lt;br /&gt;3. An internal C++ application that takes in raw input data collected from surveys and stored as flat files on a network filesystem, and prepares them for entry into the database.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The Old Way&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Let's assume we already have a traditional, non-Copper cluster, installed and configured appropriately with operating systems, networking, etc.&lt;br /&gt;&lt;br /&gt;What does the cluster administrator need to do to let Uma run her jobs on the cluster?&lt;br /&gt;&lt;br /&gt;1. He needs to install and configure the three different applications on &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; the machines in the cluster.  
This might be straightforward for one or two machines, but it becomes tedious once you get up to dozens, let alone hundreds or thousands of machines.  They all need Python, the appropriate Python libraries, Java, the appropriate JDBC adaptors, the C++ program, etc.&lt;br /&gt;&lt;br /&gt;2. He needs to be wary of conflicts that Uma's tools might have with some other users' tools.  He needs to ensure that if another user already has a tool on a host that requires, say, an XML library that conflicts with Uma's requirements, then Uma's tools don't get installed on that machine.&lt;br /&gt;&lt;br /&gt;3. He needs to ensure that the network filesystem that Uma's tools access is configured at the same path on &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; the machines in the cluster.&lt;br /&gt;&lt;br /&gt;4. He needs to add a queue to the queueing engine specific to the machines capable of executing Uma's jobs.&lt;br /&gt;&lt;br /&gt;5. Even after all this, when Uma actually submits her jobs, she needs to be careful to send them to the appropriate queue as configured by the administrator.  If she gets that wrong, her jobs fail.&lt;br /&gt;&lt;br /&gt;By no means is this a trivial amount of work.  In fact, once you scale it up to hundreds or thousands of machines, it's extremely difficult to manage, as cluster administrators are already well aware.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The Copper Way&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So how does Copper help with grid queueing?  It helps by eliminating nearly &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; of the redundancies in the above steps.&lt;br /&gt;&lt;br /&gt;Let's assume, as before, that the cluster has been set up, but this time with the Copper platform.  None of Uma's applications have been installed or configured.&lt;br /&gt;&lt;br /&gt;1. 
The administrator allocates a new virtual machine image to Uma with the queueing daemon installed (installing the daemon is trivial: just download the appropriate package and install it.  There is &lt;span style="font-style: italic;"&gt;zero&lt;/span&gt; configuration).  He then configures Uma's VM to have access to the appropriate network filesystem containers.  This can be done quickly and easily through Copper's web interface.&lt;br /&gt;&lt;br /&gt;2. The administrator installs and configures the applications on this virtual machine image.  This setup is done exactly once, and on a virtual machine independent of all other users and types of jobs that might execute on the cluster, obviating any possible issues with software conflicts.  The administrator can even skip this step if Uma is able to install and configure the applications herself.  Since Uma's VM is independent of all other VMs, and exists in its own private virtual cluster, the administrator need not be concerned with any security or conflict issues.&lt;br /&gt;&lt;br /&gt;3. Uma logs into the virtual machine, and submits jobs to the queueing engine running on it.  The jobs she submits can be any command that can be executed on the virtual machine.  Copper will ensure through the magic of VM cloning that wherever the job executes, the host it executes on will look and behave exactly like the host it was invoked on.  As far as the job is concerned, it is for all practical purposes running on Uma's original VM.&lt;br /&gt;&lt;br /&gt;4. There is no step 4.&lt;br /&gt;&lt;br /&gt;With Copper, we've dropped the overhead of cluster setup as close as possible to zero, for both the cluster administrator and the user.  
Uma can even trivially add a new application to execute on the cluster: she simply installs it on her VM, and it implicitly and automatically becomes available as an application that can be queued up to execute on the cluster.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The Lesson&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Grid queueing doesn't have to be hard or tedious, no matter what the size of your cluster.  Using virtualization, cloning, and the GridCentric high-level toolkit, Copper makes implementing grid queueing a snap.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-9081913929518616342?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/Qszt1jsrqJg" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/9081913929518616342/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/02/grid-queueing-copper-way.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/9081913929518616342?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/9081913929518616342?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/Qszt1jsrqJg/grid-queueing-copper-way.html" title="Grid Queueing, The Copper Way" /><author><name>Kannan Vijayan</name><uri>http://www.blogger.com/profile/01218320916390868729</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" 
/></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/02/grid-queueing-copper-way.html</feedburner:origLink></entry><entry gd:etag="W/&quot;D0cNSXo-eip7ImA9WxBXGU0.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-7849363708481222475</id><published>2010-01-30T13:33:00.000-08:00</published><updated>2010-01-30T18:31:38.452-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-30T18:31:38.452-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>Copper 1.0 on its way</title><content type="html">The months are flying by! As always, we've been busy here preparing to rock the world of virtualization and High Performance Computing with Copper.  But as Copper 1.0 nears, I thought that I'd provide a bit of background on what exactly it is.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;GridCentric's platform has been named Copper in light of the fact that it really is the first true "Cluster Operating System".  That means that it lets you do on your cluster all the things that we're used to doing with a modern multi-tasking operating system.  Let me provide a table with a few simple analogies:&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Feature&lt;/th&gt;&lt;th style="text-align: center;" align="center"&gt;Desktop OS&lt;br /&gt;(e.g. 
Windows, Linux)&lt;/th&gt;&lt;th style="text-align: center;" align="center"&gt;Cluster OS&lt;br /&gt;(Copper)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Multi-tasking&lt;/th&gt;&lt;td&gt;Run multiple programs simultaneously.&lt;/td&gt;&lt;td&gt;Run different cluster applications simultaneously (each including its own operating system stack and libraries, across hundreds of machines); these are virtual clusters.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Resource accounting&lt;/th&gt;&lt;td&gt;See memory and CPU usage of different processes.&lt;/td&gt;&lt;td&gt;Account for memory, CPU, disk and network usage of every virtual cluster.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Job control&lt;/th&gt;&lt;td&gt;The ability to kill misbehaving processes (some of the time).&lt;/td&gt;&lt;td&gt;The ability to pause, suspend, limit or kill any virtual cluster.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Developer API&lt;/th&gt;&lt;td&gt;An API for performing system-related functions, such as creating processes and reading and writing files.&lt;/td&gt;&lt;td&gt;An API for growing the size of your virtual cluster instantaneously, killing machines within it, accessing data sets, reserving resources, etc.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Simple, eh?  Now for some Copper eye candy.  Some of those following Copper may know that the fundamental primitive that enables all this amazing stuff is blazingly fast virtual machine cloning.  Just for fun: here's a video of me growing my virtual cluster from one machine to a few, then back to one, then to eleven and back to one, then to over twenty and back to one.  
Each clone is fully-functional and independent -- waiting for ssh is actually slower than creating the machine itself.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/ASbhANnp_Xc&amp;amp;hl=en&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/ASbhANnp_Xc&amp;amp;hl=en&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-7849363708481222475?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/-OCF7fAozFI" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/7849363708481222475/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/01/copper-10-on-its-way.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/7849363708481222475?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/7849363708481222475?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/-OCF7fAozFI/copper-10-on-its-way.html" title="Copper 1.0 on its way" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" 
/></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/01/copper-10-on-its-way.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEYHRX08eSp7ImA9WxBWEEo.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-843757204023557198</id><published>2010-01-30T13:13:00.000-08:00</published><updated>2010-02-01T16:55:34.371-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-01T16:55:34.371-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="copper" /><title>Copper deployment at York open to researchers</title><content type="html">We're proud to open up the deployment at York University to researchers there who need a bit more computing power and want to experiment with the next generation of High Performance Computing.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;You will be able to harness the power of your own personal easy-to-use cluster - without any of the hassles of keeping compute nodes consistent with your binaries, worrying about cross compilers, or working around restrictions on installing packages on the headnode.  Copper gives everyone the simplicity of one machine with the power of hundreds.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The hardware for this trial is generously provided by &lt;a href="http://sharcnet.ca/"&gt;SHARCNET&lt;/a&gt; and we expect that we will be able to open up this deployment to all &lt;a href="http://sharcnet.ca/"&gt;SHARCNET&lt;/a&gt; users in the near future.  
For now, York researchers can &lt;a href="http://gridcentric.ca/dolphin.php"&gt;request a Copper virtual cluster&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-843757204023557198?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/ciD1_6BWagw" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/843757204023557198/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2010/01/copper-deployment-at-york-open-to.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/843757204023557198?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/843757204023557198?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/ciD1_6BWagw/copper-deployment-at-york-open-to.html" title="Copper deployment at York open to researchers" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2010/01/copper-deployment-at-york-open-to.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEcFRHs-cSp7ImA9WxBXGEU.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-3939964587357063483</id><published>2009-12-18T10:28:00.000-08:00</published><updated>2010-01-30T13:13:35.559-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-30T13:13:35.559-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" 
/><title>GridCentric Copper deployment to York University</title><content type="html">&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_bx4J52y3Zos/SyvM3ugyULI/AAAAAAAAAAU/IxWMWYrz_xs/s1600-h/cluster-front.jpg"&gt;&lt;img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 140px; height: 188px;" src="http://2.bp.blogspot.com/_bx4J52y3Zos/SyvM3ugyULI/AAAAAAAAAAU/IxWMWYrz_xs/s320/cluster-front.jpg" alt="" id="BLOGGER_PHOTO_ID_5416648234593570994" border="0" /&gt;&lt;/a&gt;In December GridCentric performed a beta installation of its Copper Cluster Operating System on a &lt;a href="http://sharcnet.ca/"&gt;SHARCNET&lt;/a&gt; cluster at York University in Toronto. The cluster has 32 compute nodes, each with two dual-core Opteron processors and 8GB of RAM, for a total of 128 compute cores and 256GB of memory. The cluster also includes 4TB of RAID5 storage, as well as Myrinet 2G low-latency interconnect between each node.&lt;br /&gt;&lt;br /&gt;Total installation time was around five hours, 4.5 of which were spent installing Fedora on the login node, head node, and storage node. 
After that was completed, we booted the compute nodes into our Copper compute image and the cluster was ready for action.&lt;br /&gt;&lt;br /&gt;Here's a picture of the Copper admin console showing an Ubuntu 9.04 virtual cluster running a single virtual machine (click for larger image):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_bx4J52y3Zos/S0IUY5dzemI/AAAAAAAAAAs/zqh9mWcxJ58/s1600-h/preclone_physical.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 330px; height: 176px;" src="http://3.bp.blogspot.com/_bx4J52y3Zos/S0IUY5dzemI/AAAAAAAAAAs/zqh9mWcxJ58/s320/preclone_physical.png" alt="" id="BLOGGER_PHOTO_ID_5422919319282743906" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Here's the same "Virtual Cluster" after expanding out to 85 VMs (click for larger image):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_bx4J52y3Zos/S0IUuwj2apI/AAAAAAAAAA0/-L46Goaq2Dk/s1600-h/postclone_virtual.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 330px; height: 176px;" src="http://2.bp.blogspot.com/_bx4J52y3Zos/S0IUuwj2apI/AAAAAAAAAA0/-L46Goaq2Dk/s320/postclone_virtual.png" alt="" id="BLOGGER_PHOTO_ID_5422919694849305234" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;This "footprint expansion" takes a few seconds. 
Try that on EC2!&lt;br /&gt;&lt;br /&gt;For a better idea of what's going on, take a look at &lt;a href="http://blog.techscene.ca/2009/12/11/tim-smith-gridcentric-democamp-toronto-24/"&gt;our demo at DCT24&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-3939964587357063483?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/MZZUTMldcqo" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/3939964587357063483/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2009/12/gridcentric-copper-deployment-to-york.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/3939964587357063483?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/3939964587357063483?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/MZZUTMldcqo/gridcentric-copper-deployment-to-york.html" title="GridCentric Copper deployment to York University" /><author><name>Tim Smith</name><uri>http://www.blogger.com/profile/14856519127995550658</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/_bx4J52y3Zos/SyvM3ugyULI/AAAAAAAAAAU/IxWMWYrz_xs/s72-c/cluster-front.jpg" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2009/12/gridcentric-copper-deployment-to-york.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;CEMBQX0ycCp7ImA9WxBRFk8.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-6888762180094745223</id><published>2009-11-24T14:17:00.000-08:00</published><updated>2010-01-04T08:27:30.398-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-04T08:27:30.398-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><title>GridCentric SC09 Wrap-up</title><content type="html">We spent last week in beautiful Portland, Oregon for the SC 2009 International Conference on HPC, Networking, Storage, and Analysis. In between visits to various brewpubs and &lt;a href="http://chaportland.com/taqueria.html"&gt;our gastronomic home base&lt;/a&gt;, we managed to get the following done:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Hung out at the Compute Canada booth with Mike Bauer, John Morton and others from SHARCNET, Susan Baldwin from Compute Canada, as well as Cameron Kiddle from the U of Calgary / WestGrid (who showed us a cool demo of &lt;a href="http://geochronos.org/"&gt;GeoChronos&lt;/a&gt;) at the Monday night Opening Gala.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Attended the IBM Customer Reception, where we had beers with Neil Bunn (IBM), Jill Kowalchuk (Cybera), Florent Parent (Clumeq), and many others (thanks to Neil for the invite!).&lt;/li&gt;&lt;li&gt;Went to the SGI Innovators Breakfast where we met with Paul Lu (U of Alberta) and talked about the GridCentric platform's technological underpinnings (thanks to Daniel St-Germain for the invite!). This was one of the few alcohol-free events we attended.&lt;/li&gt;&lt;li&gt;Spoke with Josh Simons (Sun) about the convergence of HPC and Enterprise technologies. 
You should read his &lt;a href="http://blogs.sun.com/simons"&gt;blog&lt;/a&gt;, especially &lt;a href="http://blogs.sun.com/simons/category/HPC"&gt;his HPC writings&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;span style=";font-family:courier new;font-size:100%;"  &gt;&lt;/span&gt;Showed our demo to Daniel Chavarría from PNNL, who gave us helpful feedback and words of encouragement.&lt;span style="font-family:Verdana,Helvetica,Arial;"&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:11px;"&gt;&lt;span style=";font-family:courier new;font-size:100%;"  &gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Spoke with folks from Mellanox, QLogic, and Myricom about hardware support for virtualization in their Infiniband and 10GigE adapters. Turns out they've supported it for years.&lt;/li&gt;&lt;li&gt;Spoke at length with Gregor von Laszewski about the NSF &lt;a href="http://futuregrid.org/"&gt;FutureGrid&lt;/a&gt; project, which will act as a testbed for HPC grid/cloud platforms and software stacks - perfect for evaluating the GridCentric platform and getting ourselves on the U.S. government's radar.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Had beers with Daniel Gruner (SciNet) at the Thursday Closing Event (which was quite nice!), then went on to have more beers with Danny at another pub whose name escapes me (although I recall it had over 100 beers on tap).&lt;/li&gt;&lt;/ul&gt;Overall, it was an exhausting week (bookended by some atrocious connector flights), but we made a lot of friends and are much the wiser. 
We're really looking forward to SC10!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-6888762180094745223?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/QpwMjFZUxo0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/6888762180094745223/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2009/11/gridcentric-sc09-wrap-up.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6888762180094745223?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6888762180094745223?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/QpwMjFZUxo0/gridcentric-sc09-wrap-up.html" title="GridCentric SC09 Wrap-up" /><author><name>Tim Smith</name><uri>http://www.blogger.com/profile/14856519127995550658</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2009/11/gridcentric-sc09-wrap-up.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkcBRH47fyp7ImA9WxNbFEQ.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-6735434363704711745</id><published>2009-11-13T11:48:00.000-08:00</published><updated>2009-11-17T14:47:35.007-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-17T14:47:35.007-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><title>Office move</title><content type="html">It's been a while since our last 
update, but we've been pretty busy (more on that later). With the investment from RCI also came a move to their newly renovated space for Rogers Ventures companies.  This transformation has been quite significant, so I wanted to share it.&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Those who visited us in our old offices know that they were a bit cramped (especially in the weeks leading up to the move, as we grew to six).  We like to think that it was a stereotypical start-up, and that it gives us a great "it all started in an attic" (literally) story.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First: the developers' desks.  In the main room of our office, four people were in fairly close quarters.  Meanwhile, Tim and I shared two sides of another small Ikea desk.  Great for extreme programming, but not so great on those hot August days! :)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;Workstations 1.0&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4SspnI5UI/AAAAAAAAAA8/wRo7wDiRtjE/s1600-h/photo-small.jpg"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4SspnI5UI/AAAAAAAAAA8/wRo7wDiRtjE/s320/photo-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403777161184732482" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt; &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4VndOVglI/AAAAAAAAACU/3mQKay5ebQI/s1600-h/photo+(2)-small.jpg"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4VndOVglI/AAAAAAAAACU/3mQKay5ebQI/s320/photo+(2)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403780370495013458" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: 
center;"&gt;Workstations 2.0&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Na7dk58Iywg/Sv4Ss1K-wEI/AAAAAAAAABE/jLkltdrJHZs/s1600-h/arrow.png"&gt;&lt;/a&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_Na7dk58Iywg/Sv4SsywYxFI/AAAAAAAAABM/NFd2bEK2-F0/s1600-h/photo+(9)-small.jpg"&gt;&lt;img src="http://4.bp.blogspot.com/_Na7dk58Iywg/Sv4SsywYxFI/AAAAAAAAABM/NFd2bEK2-F0/s320/photo+(9)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403777163639440466" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt; &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Na7dk58Iywg/Sv4VniSb-rI/AAAAAAAAACc/wOhpHXHz7u4/s1600-h/photo+(8)-small.jpg"&gt;&lt;img src="http://2.bp.blogspot.com/_Na7dk58Iywg/Sv4VniSb-rI/AAAAAAAAACc/wOhpHXHz7u4/s320/photo+(8)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403780371854391986" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Our new workspaces are much more spacious -- enough for multiple machines and monitors and a good place to organize and post notes. Unlimited productivity, here I come!&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;In our old office, Tim and I shared a room with three fairly noisy servers and several pedestal servers. Not that the others had it much better; noise wasn't really containable in our space.  
You'll also see our enormous 12,000 BTU air conditioner at the bottom left of this photo.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;Server Room 1.0&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4SspnI5UI/AAAAAAAAAA8/wRo7wDiRtjE/s1600-h/photo-small.jpg"&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_Na7dk58Iywg/Sv4S_O7DtsI/AAAAAAAAABU/YSd1WjO1jeA/s1600-h/photo+(1)-small.jpg"&gt;&lt;img src="http://4.bp.blogspot.com/_Na7dk58Iywg/Sv4S_O7DtsI/AAAAAAAAABU/YSd1WjO1jeA/s320/photo+(1)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403777480438036162" style="cursor: pointer; width: 150px; height: 113px; padding-bottom: 17px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_Na7dk58Iywg/Sv4S_O7DtsI/AAAAAAAAABU/YSd1WjO1jeA/s1600-h/photo+(1)-small.jpg"&gt;&lt;/a&gt;Server Room 2.0&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4S_eBLVHI/AAAAAAAAABc/EhjELfaC30Q/s1600-h/arrow.png"&gt;&lt;/a&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4S_WC3vmI/AAAAAAAAABk/QvU1SPvHa9o/s1600-h/photo+(4)-small.jpg"&gt;&lt;img src="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4S_WC3vmI/AAAAAAAAABk/QvU1SPvHa9o/s320/photo+(4)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403777482349854306" style="cursor: pointer; height: 150px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div 
style="text-align: left;"&gt;I would say that one of the best features of our new office is the rack space.  Now we can hide away our servers and not worry about tripping over the power cord!  (Also we don't have to worry about blowing a fuse!)&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;We nicknamed a portion of our old office "the idea board".  This was the coffee space and the white board.  Shown below is the idea board in the last few days before we moved -- when it also served as storage space for the blue boxes to be packed up with our stuff.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;Idea Board 1.0&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4VbyQ91qI/AAAAAAAAABs/W2zXnKUg86s/s1600-h/photo+(3)-small.jpg"&gt;&lt;img src="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4VbyQ91qI/AAAAAAAAABs/W2zXnKUg86s/s320/photo+(3)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403780169984759458" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;Idea Board 2.0&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Na7dk58Iywg/Sv4YnnOzA_I/AAAAAAAAACk/RjTpSUPoVYw/s1600-h/arrow.png"&gt;&lt;/a&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4VcuEmMhI/AAAAAAAAACM/uN7TN9hG0qI/s1600-h/photo+(7)-small.jpg"&gt;&lt;img src="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4VcuEmMhI/AAAAAAAAACM/uN7TN9hG0qI/s320/photo+(7)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403780186039005714" style="cursor: pointer; width: 150px; 
height: 113px; " /&gt;&lt;/a&gt; &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Na7dk58Iywg/Sv4VcVCkrhI/AAAAAAAAACE/96KhEuD7HqY/s1600-h/photo+(11)-small.jpg"&gt;&lt;img src="http://2.bp.blogspot.com/_Na7dk58Iywg/Sv4VcVCkrhI/AAAAAAAAACE/96KhEuD7HqY/s320/photo+(11)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403780179319631378" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt; &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4VcCEikkI/AAAAAAAAAB8/HKMe0QbHNoY/s1600-h/photo+(10)-small.jpg"&gt;&lt;img src="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4VcCEikkI/AAAAAAAAAB8/HKMe0QbHNoY/s320/photo+(10)-small.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5403780174227608130" style="cursor: pointer; width: 150px; height: 113px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;As you can see from the photo on the right, the size of the white board has at least tripled (it's yet to be seen whether the ideas will triple).  We've also got some great space to relax and meet.  
The only thing left to do is decide what to do with the old idea board.&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Na7dk58Iywg/Sv4VbyQ91qI/AAAAAAAAABs/W2zXnKUg86s/s1600-h/photo+(3)-small.jpg"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-6735434363704711745?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/ndC_YxunFQ4" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/6735434363704711745/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2009/11/office-move.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6735434363704711745?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/6735434363704711745?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/ndC_YxunFQ4/office-move.html" title="Office move" /><author><name>Adin Scannell</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/_Na7dk58Iywg/Sv4SspnI5UI/AAAAAAAAAA8/wRo7wDiRtjE/s72-c/photo-small.jpg" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2009/11/office-move.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;CkENQXw4eSp7ImA9WxNbFEQ.&quot;"><id>tag:blogger.com,1999:blog-4803689960319917716.post-2092839680271857711</id><published>2009-10-18T14:29:00.000-07:00</published><updated>2009-11-17T12:44:50.231-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-17T12:44:50.231-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><title>GridCentric receives funding from the Rogers Venture Group</title><content type="html">On September 2nd &lt;a href="http://gridcentric.ca"&gt;GridCentric Inc&lt;/a&gt;, a Toronto-based startup developing a platform for the provisioning and management of High Performance Compute (HPC) clusters using virtualization, closed a seed round of financing with the Rogers Venture Group (&lt;a href="http://rogersventures.ca"&gt;http://rogersventures.ca&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;GridCentric was founded in early 2009 by Andres Lagar-Cavilla, Adin Scannell, and Tim Smith. GridCentric's platform is based on technology that enables stateful cloning of a running virtual machine to dozens of physical hosts in less than a second.&lt;br /&gt;&lt;br /&gt;Mike Lee, Rogers Communications' Chief Strategy Officer, has joined Adin Scannell and Tim Smith on GridCentric's Board of Directors.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4803689960319917716-2092839680271857711?l=blog.gridcentriclabs.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/gridcentric/~4/wiEc4yAM260" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.gridcentriclabs.com/feeds/2092839680271857711/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.gridcentriclabs.com/2009/10/gridcentric-receives-funding-from.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" 
href="http://www.blogger.com/feeds/4803689960319917716/posts/default/2092839680271857711?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4803689960319917716/posts/default/2092839680271857711?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/gridcentric/~3/wiEc4yAM260/gridcentric-receives-funding-from.html" title="GridCentric receives funding from the Rogers Venture Group" /><author><name>Tim Smith</name><uri>http://www.blogger.com/profile/14856519127995550658</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://blog.gridcentriclabs.com/2009/10/gridcentric-receives-funding-from.html</feedburner:origLink></entry></feed>

