<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>VMware vFabric Blog</title>
	
	<link>http://blogs.vmware.com/vfabric</link>
	<description>VMware vFabric Cloud Application Platform -- Build, Scale and Run Data-Intensive Applications On-Premise and in the Cloud</description>
	<lastBuildDate>Thu, 02 May 2013 18:17:10 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/VmwareVfabricBlog" /><feedburner:info uri="vmwarevfabricblog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:browserFriendly></feedburner:browserFriendly><item>
		<title>Breaking the Mindset: Why Hadoop Can and Should Move Past Bare-Metal Deployments to Virtualization</title>
		<link>http://blogs.vmware.com/vfabric/2013/05/breaking-the-mindset-why-hadoop-can-and-should-move-past-bare-metal-deployments-to-virtualization.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=breaking-the-mindset-why-hadoop-can-and-should-move-past-bare-metal-deployments-to-virtualization</link>
		<comments>http://blogs.vmware.com/vfabric/2013/05/breaking-the-mindset-why-hadoop-can-and-should-move-past-bare-metal-deployments-to-virtualization.html#comments</comments>
		<pubDate>Thu, 02 May 2013 17:59:03 +0000</pubDate>
		<dc:creator>Stacey Schneider</dc:creator>
				<category><![CDATA[Serengeti]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[hadoop]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5976</guid>
		<description><![CDATA[Whenever we’ve dealt with something for a while, our way of thinking about it becomes a habit. Hadoop deals with a lot of data. Currently, the record is 100 petabytes in a Facebook cluster that analyzes log data.  Since it &#8230; <a href="http://blogs.vmware.com/vfabric/2013/05/breaking-the-mindset-why-hadoop-can-and-should-move-past-bare-metal-deployments-to-virtualization.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-5981" title="hadoop-on-virtual" src="http://blogs.vmware.com/vfabric/files/2013/05/hadoop-on-virtual.jpg" alt="" width="340" height="200" />Whenever we’ve dealt with something for a while, our way of thinking about it becomes a habit. Hadoop deals with a lot of data. Currently, the record is <a title="100 petabytes in a Facebook cluster that analyzes log data" href="http://www.infoq.com/presentations/Hadoop-HDFS-Facebook" target="_blank">100 petabytes in a Facebook cluster that analyzes log data</a>.  Since it was built by the likes of Google and Facebook to deal with such large data volumes and performance, it originally was built to run on bare-metal servers. Since it wasn’t an option from the get-go, the notion that you can’t have that much data running on a move-able virtual machine safely has largely gone unchallenged.</p>
<p>However, as time has gone on, and technology has allowed for persistent storage on the cloud, organizations have started to rethink this paradigm. In fact, several <a title="companies are using Hadoop and big data" href="http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html" target="_blank">companies are using Hadoop and big data</a> today to gain competitive advantage. And while they are running it on virtualization, they are not moving the data. There are other advantages.</p>
<p>VMware’s Big Data product line marketing manager, Joe Russell, spoke with Roberto Zicari this week in an <a title="interview on ODBMS.org" href="http://www.odbms.org/blog/2013/04/on-virtualize-hadoop-interview-with-joe-russell/" target="_blank">interview on ODBMS.org</a> that helps articulate not only why Hadoop can run on virtual infrastructure using Project Serengeti, but also why companies should consider doing so to save time and make Hadoop more usable.<span id="more-5976"></span></p>
<h3>Understanding Data Locality with Hadoop Running on Serengeti</h3>
<p>With Hadoop and virtualization, it’s important to see virtualization through the lens of data locality.</p>
<p>In a <a title="distributed system" href="http://en.wikipedia.org/wiki/Distributed_computing" target="_blank">distributed system</a>, data locality is about keeping the processing and data together to achieve greater performance and avoid network bottlenecks. Moving large volumes of data with vMotion is impractical: with petabytes of data, it would take too long and affect data availability. Similarly, separating the processing from the data introduces unacceptable latency.</p>
<div class="promo" style="float: right; margin: 10px 10px 10px 10px;">
<div>
<h3>Try Serengeti Now</h3>
<p><a class="more" title="Click here" target="_blank">Click here</a></p>
</div>
</div>
<p>As Russell explains in the interview, Serengeti provides value by preserving data locality while allowing organizations to deploy enterprise-tested, highly available, and fault-tolerant Hadoop clusters in minutes, something even the most seasoned Hadoop veteran cannot do. It also paves the way for advanced use cases such as mixed payload deployments and multi-tenancy.</p>
<p>Zicari challenges this in the interview: even if you can keep the data together, aren’t there basic functions that depend on the data and processing happening together? The exchange follows:</p>
<blockquote><p><strong><em>Zicari:</em></strong><em> <strong>There are concerns on the approach of decoupling Apache Hadoop nodes from the underlying physical infrastructure. <a href="http://steveloughran.blogspot.de/2012/03/hadoop-in-cloud-infrastructures.html">Quoting Steve Loughran</a> (HP Research): “Hadoop contains lots of assumptions about running in a static infrastructure; it’s scheduling and recovery algorithms assume this.” What is your take on this?</strong></em></p>
<p><strong><em>Joe Russell:</em></strong><em> A common misconception when virtualizing Hadoop clusters is that we decouple the data nodes from the physical infrastructure. This is not necessarily true. When users virtualize a Hadoop cluster using Project Serengeti, they separate data from compute while preserving data locality. By preserving data locality, we ensure that performance isn’t negatively impacted, or essentially making the infrastructure appear as static. Additionally, it creates true multi-tenancy within more layers of the Hadoop stack, not just the name node.</em></p>
<p><em>I think there is some confusion when we say “in the cloud”. Here, Steve is talking about running it on a public cloud like Amazon. Steve is largely introducing the concept of data locality, or the notion that large amounts of data are hard to move. In this scenario, it makes sense to bring compute resources to the data to ensure performance isn’t negatively impacted by networking limitations. VMware advocates that Hadoop should be virtualized, as it introduces a level of flexibility and management that allows companies to easily deploy, manage, and scale internal Hadoop clusters.</em></p></blockquote>
<h3>How Does It Work?</h3>
<p>VMware created Hadoop Virtual Extensions (“HVE”) to make Hadoop distributions virtualization-aware. HVE works by inserting a node group layer between the rack and host layers, making Hadoop distributions topology-aware on virtualized platforms. So, while Serengeti technically separates the compute resources from the data to allow for better management, scaling, and faster deployments, the virtualized cluster still knows to keep the processing and data on the same physical machine.</p>
<p>Russell also outlines how High Availability is added by using vSphere:</p>
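<p>To make the node group idea concrete, here is a minimal Python sketch. It is not HVE’s actual code. It assumes each virtual machine is identified by a hypothetical topology path of the form /rack/nodegroup/vm, where a node group stands for the VMs that share one physical host, and it measures distance as the number of hops from each VM up to their closest shared ancestor. The function names and sample paths are made up for illustration.</p>

```python
# Illustrative sketch only: the path layout and function names are assumptions,
# not part of Hadoop or HVE.

def topology_distance(path_a, path_b):
    """Count hops from each node up to their closest common ancestor."""
    parts_a = path_a.strip("/").split("/")
    parts_b = path_b.strip("/").split("/")
    shared = 0
    for left, right in zip(parts_a, parts_b):
        if left != right:
            break
        shared += 1
    return (len(parts_a) - shared) + (len(parts_b) - shared)


def closest_replica(compute_vm, replica_vms):
    """Pick the data replica that is nearest to the compute VM."""
    return min(replica_vms, key=lambda replica: topology_distance(compute_vm, replica))


compute = "/rack1/host-a/compute-vm1"
replicas = [
    "/rack2/host-c/data-vm3",  # different rack
    "/rack1/host-b/data-vm2",  # same rack, different physical host
    "/rack1/host-a/data-vm1",  # same physical host (same node group)
]
print(closest_replica(compute, replicas))
```

<p>With the extra layer in the path, a replica on the same physical host scores closer than one elsewhere in the rack, which is the distinction a plain rack/host topology cannot express when several VMs share a host.</p>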
<blockquote><p><em>We ensure High Availability (HA) by leveraging vSphere’s tested solution via Project Serengeti’s integration with vCenter (management console of vSphere).</em></p>
<p><em>In the event of physical server failure, affected virtual machines are automatically restarted on other production servers with spare capacity. In the case of operating system failure, vSphere HA restarts the affected virtual machine on the same physical server.</em></p>
<p><em>In Hadoop nomenclature, this means that there is HA on more than just the name node. vSphere’s solution also allows for HA on the jobtracker node, metastores, and on the management server, which are critical pieces of any Hadoop system that require high availability.</em></p>
<p><em>More importantly, as Hadoop is a batch-oriented process, it is important that when a physical host does fail, that you are able to pause and then restart that job from the point in time in which it went down. VMware’s vSphere solution allows for this and has been tested amongst the biggest Enterprises for the better part of the past decade.</em></p></blockquote>
<p>&nbsp;</p>
<p>HVE has been donated back to Apache Hadoop. Similarly, Serengeti is also open source. While it doesn’t make sense for VMware to spend money and engineering effort porting Serengeti to other hypervisors, Russell does state that seeing it work with them is very much the desire.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/05/breaking-the-mindset-why-hadoop-can-and-should-move-past-bare-metal-deployments-to-virtualization.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New RabbitMQ 3.1.0 Release Available</title>
		<link>http://blogs.vmware.com/vfabric/2013/05/new-rabbitmq-3-1-0-release-available.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=new-rabbitmq-3-1-0-release-available</link>
		<comments>http://blogs.vmware.com/vfabric/2013/05/new-rabbitmq-3-1-0-release-available.html#comments</comments>
		<pubDate>Thu, 02 May 2013 13:34:23 +0000</pubDate>
		<dc:creator>Stacey Schneider</dc:creator>
				<category><![CDATA[RabbitMQ]]></category>
		<category><![CDATA[middleware]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5969</guid>
		<description><![CDATA[RabbitMQ 3.1.0 is now available for immediate download. Announced this morning on the new Pivotal blog, where RabbitMQ now resides, this version includes enhancements to garbage collection, consumption, requeuing, memory use, and dead lettering. For those on Mac OS X, &#8230; <a href="http://blogs.vmware.com/vfabric/2013/05/new-rabbitmq-3-1-0-release-available.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.vmware.com/vfabric/files/2013/05/rabbit_header_logo_340x200.jpg"><img class="alignleft size-full wp-image-5971" title="RabbitMQ 3.1.0" src="http://blogs.vmware.com/vfabric/files/2013/05/rabbit_header_logo_340x200.jpg" alt="" width="340" height="200" /></a><a href="http://www.rabbitmq.com/download.html">RabbitMQ 3.1.0</a> is now available for immediate download.</p>
<p>Announced this morning <a title="RabbitMQ 3.1.0 Released" href="http://blog.gopivotal.com/topics/big-data-topics/new-release-rabbitmq-3-1-0" target="_blank">on the new Pivotal blog</a>, where RabbitMQ now resides, this version includes enhancements to garbage collection, consumption, requeuing, memory use, and dead lettering.</p>
<p>For those on Mac OS X, there is a newly packaged, standalone release of RabbitMQ that doesn’t require a separate Erlang install.</p>
<p>Some key new capabilities include eager synchronisation of mirror queue slaves, automatic cluster partition healing, and improved statistics (including charts) in the management plugin. There are also many enhancements and bug fixes to the server, Java client, Erlang client, and a number of plugins, including the federation, old-federation, shovel, Web-STOMP, STOMP, and MQTT plugins, as well as the consistent hash exchange.</p>
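<p>As a rough illustration of the eager synchronisation feature, the Python sketch below builds a mirrored-queue policy definition and renders a matching rabbitmqctl set_policy command. The policy name ha-eager and the queue-name pattern are hypothetical, and the ha-mode and ha-sync-mode keys reflect our reading of the 3.1.0 release notes, so check the notes before relying on them.</p>

```python
import json
import shlex

# Hypothetical policy name and queue-name pattern, chosen for illustration.
POLICY_NAME = "ha-eager"
QUEUE_PATTERN = r"^ha\."


def build_policy_definition(sync_mode="automatic"):
    """Return a mirrored-queue policy that asks for eager synchronisation."""
    if sync_mode not in ("automatic", "manual"):
        raise ValueError("sync_mode must be 'automatic' or 'manual'")
    # Policy keys are assumed from the release notes, not verified here.
    return {"ha-mode": "all", "ha-sync-mode": sync_mode}


def build_set_policy_command(definition):
    """Render the rabbitmqctl invocation as a shell-safe string."""
    args = [
        "rabbitmqctl",
        "set_policy",
        POLICY_NAME,
        QUEUE_PATTERN,
        json.dumps(definition, sort_keys=True),
    ]
    return " ".join(shlex.quote(arg) for arg in args)


print(build_set_policy_command(build_policy_definition()))
```

<p>The script only prints the command; it does not contact a broker, so it is safe to run anywhere.</p>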
<p><a href="https://www.rabbitmq.com/blog/2013/05/01/rabbitmq-3-1-0-in-images/">RabbitMQ’s blog post on the topic</a> shares screenshots of several new features like the ones for new charts and filters below:</p>
<p><a href="http://www.rabbitmq.com/blog/2013/05/01/rabbitmq-3-1-0-in-images"><img src="http://www.rabbitmq.com/wp-uploads/2013/05/chart.png" alt="" width="640" height="150" /></a></p>
<p><a href="http://www.rabbitmq.com/wp-uploads/2013/05/filter.png"><img src="http://www.rabbitmq.com/wp-uploads/2013/05/filter.png" alt="" width="640" /></a></p>
<p>Read More:</p>
<ul>
<li>Download <a title="RabbitMQ 3.1.0 Download" href="http://www.rabbitmq.com/download.html" target="_blank">RabbitMQ 3.1.0</a></li>
<li>Check out the <a href="http://www.rabbitmq.com/release-notes/README-3.1.0.txt">release notes for RabbitMQ 3.1.0</a></li>
<li>Read more from the Rabbit Team about the <a title="RabbitMQ 3.1.0 Release" href="http://blog.gopivotal.com/topics/big-data-topics/new-release-rabbitmq-3-1-0" target="_blank">RabbitMQ 3.1.0 release on the Pivotal blog</a></li>
<li>See <a title="RabbitMQ 3.1.0 in images" href="http://www.rabbitmq.com/blog/2013/05/01/rabbitmq-3-1-0-in-images/" target="_blank">more images from the new release</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/05/new-rabbitmq-3-1-0-release-available.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>5 Steps to Mainframe Modernization with a Big Fast Data Fabric</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/5-steps-to-mainframe-modernization-with-a-big-fast-data-fabric.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=5-steps-to-mainframe-modernization-with-a-big-fast-data-fabric</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/5-steps-to-mainframe-modernization-with-a-big-fast-data-fabric.html#comments</comments>
		<pubDate>Thu, 25 Apr 2013 17:18:29 +0000</pubDate>
		<dc:creator>vFabric Team</dc:creator>
				<category><![CDATA[GemFire]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[SQLFire]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[analytical]]></category>
		<category><![CDATA[Big]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[grid]]></category>
		<category><![CDATA[mainframe]]></category>
		<category><![CDATA[modernization]]></category>
		<category><![CDATA[OLAP]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[transaction]]></category>
		<category><![CDATA[transactional]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5900</guid>
		<description><![CDATA[For growth initiatives, many companies are looking to innovate by ramping analytical, mobile, social, big data, and cloud initiatives. For example, GE is one growth-oriented company and just announced heavy investment in the Industrial Internet with GoPivotal. One area of &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/5-steps-to-mainframe-modernization-with-a-big-fast-data-fabric.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-medium wp-image-5918" title="mainframe_header" src="http://blogs.vmware.com/vfabric/files/2013/04/mainframe_header-300x176.png" alt="" width="300" height="176" />For growth initiatives, many companies are looking to innovate by ramping <a title="analytical" href="http://www.greenplum.com/blog/topics/data-science/chorus-in-action-at-data-science-london" target="_blank">analytical</a>, <a title="mobile" href="http://blogs.vmware.com/vfabric/2013/02/build-your-first-mobile-app-in-the-cloud-in-45-minutes.html" target="_blank">mobile</a>, <a title="social" href="http://blogs.vmware.com/vfabric/2013/04/how-instagram-feeds-work-celery-and-rabbitmq.html" target="_blank">social</a>, <a title="big data" href="http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html" target="_blank">big data</a>, and <a title="cloud initiatives" href="http://blogs.vmware.com/vfabric/2012/09/3-insightful-vfabric-implementations-to-learn-from.html" target="_blank">cloud initiatives</a>. For example, GE is one growth-oriented company and just announced <a title="heavy investment in the Industrial Internet with GoPivotal" href="http://blogs.vmware.com/vfabric/2013/04/webinar-recap-pivotal-opens-for-business-ge-gets-10-stake-and-how-pivotal-plans-to-deliver-next-generation-paas.html" target="_blank">heavy investment in the Industrial Internet with GoPivotal</a>. One area of concern to many well-established businesses is what to do with their mainframe powered applications. Mainframes are expensive to run, but the applications that run off of them are typically very important and the business can not afford to risk downtime or any degradation in service.  So, until now the idea of modernizing a mainframe application has often faced <a title="major roadblocks" href="http://blogs.vmware.com/vfabric/2013/01/the-top-six-reasons-companies-are-afraid-of-mainframe-modernization.html" target="_blank">major roadblocks</a>.</p>
<p>There are ways to preserve the mainframe and improve application performance, reliability and even usability.  <a title="As one of the world’s largest banks sees" href="http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html" target="_blank">As one of the world’s largest banks sees</a>, big, fast data grids can provide an incremental approach to mainframe modernization and reduce risk, lower operational costs, increase data processing performance, and provide innovative analytics capabilities for the business—all based on the same types of cloud computing technologies that power internet powerhouses and financial trading markets.<span id="more-5900"></span></p>
<h3>The Fast Data component</h3>
<p>One customer <a title="used vFabric GemFire to save $75 million dollars on a mainframe modernization project" href="http://blogs.vmware.com/vfabric/2012/05/vmware-partner-viewpoint-5-reasons-why-you-should-care-about-vfabric.html" target="_blank">used vFabric GemFire to save $75 million on a mainframe modernization project</a>, and GemFire has been used for the past ten years as a highly performant, horizontally scalable data transaction layer, or big data grid, for mission-critical applications. Both <a title="GemFire" href="https://www.vmware.com/products/application-platform/vfabric-gemfire/overview.html" target="_blank">GemFire</a> and its sister product, <a title="SQLFire" href="https://www.vmware.com/products/application-platform/vfabric-sqlfire/overview.html" target="_blank">SQLFire</a>, are known to <a title="achieve linear scale" href="http://blogs.vmware.com/vfabric/2013/01/scaling-and-modernizing-net-and-java-sqlfire-performance-test-blows-away-traditional-rdbms.html" target="_blank">achieve linear scale</a>. Key use cases include credit-card transaction systems, stock trading platforms, foreign exchange systems, web-based travel reservation systems, and mainframe batch data offloading.
As an in-memory data grid, its main advantage is the ability to perform in-memory, sub-millisecond transactions while still maintaining the highest standards for <a title="fault-tolerance, high-availability" href="http://blogs.vmware.com/vfabric/2013/04/disaster-recovery-jackpot-activeactive-wan-based-replication-in-gemfire-vs-oracle-and-mysql.html" target="_blank">fault-tolerance, high-availability</a>, and linear <a title="scalability" href="http://blogs.vmware.com/vfabric/2013/04/understanding-speed-and-scale-strategies-for-big-data-grids-and-in-memory-colocation.html" target="_blank">scalability</a> on a <a title="distributed platform" href="http://blogs.vmware.com/vfabric/2013/01/3-game-changing-capabilities-in-sqlfire.html" target="_blank">distributed platform</a>.</p>
<h3>The Big Data component</h3>
<div class="promo" style="float: right; margin: 10px 10px 10px 10px;">
<div>
<h3>For More Information:</h3>
<p><a class="more" title="vFabric GemFire" href="http://www.vmware.com/products/application-platform/vfabric-gemfire/overview.html" target="_blank">vFabric GemFire</a><br />
<a class="more" title="vFabric SQLFire" href="http://www.vmware.com/products/application-platform/vfabric-sqlfire/overview.html" target="_blank">vFabric SQLFire</a><br />
<a class="more" title="Greenplum" href="http://www.greenplum.com/" target="_blank">Greenplum</a></p>
</div>
</div>
<p>For big data analysis, Greenplum has a multitude of customer case studies with companies like <a title="O’Reilly Media" href="http://www.greenplum.com/communities/customer/oreilly-media" target="_blank">O’Reilly Media</a>, <a title="Skype" href="http://www.greenplum.com/communities/customer/customers-by-industry" target="_blank">Skype</a>, and <a title="NYSE Euronext" href="http://www.greenplum.com/communities/customer/nyse-euronext" target="_blank">NYSE Euronext</a>. These solutions have become well known for analysis of multi-terabyte or petabyte data sets, where traditional relational databases begin to break down, <a title="stop scaling" href="http://blogs.vmware.com/vfabric/2012/11/3-signs-your-relational-database-must-go.html" target="_blank">stop scaling,</a> or fail to deal well with <a title="un-structured data" href="http://blogs.vmware.com/vfabric/2013/03/why-every-database-must-be-broken-soon.html" target="_blank">unstructured data</a>. <a title="Greenplum technology" href="http://www.greenplum.com/" target="_blank">Greenplum technology</a> provides a complete big data solution for both structured and unstructured data, based on the <a title="Greenplum Database" href="http://www.greenplum.com/products/greenplum-database" target="_blank">Greenplum Database</a> and <a title="Pivotal HD" href="http://www.greenplum.com/blog/topics/hadoop/introducing-pivotal-hd" target="_blank">Pivotal HD</a>, a commercially supported distribution of <a title="Hadoop" href="http://hadoop.apache.org/" target="_blank">Hadoop</a> that includes HDFS, MapReduce, Hive, Pig, HBase, Zookeeper, Sqoop, and Flume. The recently announced <a title="Pivotal Advanced Database Services powered by HAWQ" href="http://www.greenplum.com/products/pivotal-hd" target="_blank">Pivotal Advanced Database Services powered by HAWQ</a> allow SQL queries to run on what is billed as the fastest Hadoop-based query interface on the market today, with claimed speedups of 100X or more.</p>
<h3>Fast Data + Big Data: Better together</h3>
<p>Big data and fast data solutions make a lot of sense together as we’ve seen on many customer solution blueprints delivered over the past several months. This is because most business owners and administrators aren’t able to fully utilize the data being captured in their transactional systems on a daily basis. From a business value perspective, the fast data layer can bring scalability and reliability to the business while reducing the cost per transaction. Most transactional systems also benefit from predictive analytics on transacted data, and the fast data layer enables this type of real-time transaction analysis that can also incorporate big data result-sets. The big data layer provides insight on mountains of data to help with decision making and support traditional performance metrics or enable <a title="more advanced types of visualization" href="http://www.greenplum.com/blog/topics/data-for-good/effective-data-visualization-techniques-from-business-to-social-advocacy" target="_blank">more advanced types of visualization</a> and <a title="data science" href="http://www.greenplum.com/blog/topics/data-science/chorus-in-action-at-data-science-london" target="_blank">data science</a>.</p>
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-12.57.11-PM.png" target="_blank"><img class="alignnone  wp-image-5919" title="Screen shot 2013-04-25 at 12.57.11 PM" src="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-12.57.11-PM.png" alt="" width="852" height="881" /></a></p>
<h3>From Mainframe to Big Fast Data Architecture</h3>
<p>Moving from mainframe to big, fast data is an evolution. A phased approach—step by step—is certainly the most recommended way of modernizing applications. It makes sense because it minimizes risk and better justifies investments. After working with many customers who face this problem, here is one approach we recommend.</p>
<h4>1. Selecting the Pilot: Pick a Starting Point</h4>
<p>As with most major initiatives, an initial use case or small scope should be used as a pilot to validate the architecture choices and prove a return for the overall project. The ideal project candidate should a) have little or no integration points with other systems on the legacy platform, b) be small but critical to existing business processes, c) consume a considerable amount of operational expenses, and/or d) represent a business risk in its current state. By screening this way, we should be able to deliver something of value to the business,  reduce OpEx, and make the improvement quickly while avoiding bad decisions.</p>
<h4>2. Designing the Modern Data Architecture for Co-Existence</h4>
<p>The goal of this step is to determine what legacy data stays, migrates, or integrates. First, we analyze the pilot’s data model. Then, we begin to design a data architecture that makes sense for a highly scalable, distributed data grid and still supports the existing business model and processes. The analysis should identify which entities are transactional, a mix of transactional and analytical (e.g. part of a real-time analytics model), or purely analytical. During this process, we make decisions regarding data model <a title="partitioning, replication" href="http://blogs.vmware.com/vfabric/2013/01/3-game-changing-capabilities-in-sqlfire.html" target="_blank">partitioning, replication</a>, <a title="colocation" href="http://blogs.vmware.com/vfabric/2013/04/understanding-speed-and-scale-strategies-for-big-data-grids-and-in-memory-colocation.html" target="_blank">colocation</a>, <a title="disaster recovery" href="http://blogs.vmware.com/vfabric/2013/04/disaster-recovery-jackpot-activeactive-wan-based-replication-in-gemfire-vs-oracle-and-mysql.html" target="_blank">disaster recovery</a>, transaction consistency, and more. We also decide which data to leave on the legacy platform, accessing it on the fly as needed using the GemFire integration layer capabilities.</p>
<h4>3. Integrating Mainframe and Big Data Grid</h4>
<p>While the data architecture is being defined, we start building the initial big fast data infrastructure. Then, the pilot migrates the first use case to the modernized architecture. By using the GemFire/SQLFire asynchronous integration layer, we can keep data consistent between the new application and the legacy mainframe application. Transactions done on the modernized system are delivered to both the legacy system and the analytics platform. Integration with the legacy system can be achieved using a mainframe connector, CICS Web Services, a <a title="messaging platform" href="http://blogs.vmware.com/vfabric/2012/11/expert-interview-the-polyglot-rabbit-examples-of-multi-protocol-queues-in-rabbitmq.html" target="_blank">messaging platform</a>, or any other integration protocol.</p>
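<p>The asynchronous delivery described in this step can be sketched in a few lines of Python. This is a generic write-behind illustration and does not use the GemFire or SQLFire APIs: the class name, the dictionary standing in for the data grid, and the two list-based sinks standing in for the mainframe and the analytics platform are all assumptions made for the example.</p>

```python
import queue
import threading


class AsyncIntegrationLayer:
    """Write to the in-memory grid first, then fan out to sinks in the background."""

    def __init__(self, sinks):
        self.grid = {}        # stands in for the in-memory data grid
        self.sinks = sinks    # callables that accept (key, value)
        self.failures = []    # deliveries that raised, kept for later retry
        self._events = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def put(self, key, value):
        self.grid[key] = value           # the caller sees the write immediately
        self._events.put((key, value))   # delivery to the sinks happens later

    def _drain(self):
        while True:
            key, value = self._events.get()
            for sink in self.sinks:
                try:
                    sink(key, value)
                except Exception as error:
                    self.failures.append((key, error))
            self._events.task_done()

    def wait_until_delivered(self):
        self._events.join()


# Hypothetical sinks: plain lists in place of real systems.
legacy_mainframe = []
analytics_platform = []

layer = AsyncIntegrationLayer([
    lambda key, value: legacy_mainframe.append((key, value)),
    lambda key, value: analytics_platform.append((key, value)),
])
layer.put("txn-1", {"amount": 120})
layer.put("txn-2", {"amount": 75})
layer.wait_until_delivered()
print(legacy_mainframe == analytics_platform)
```

<p>The application only waits for the in-memory write; a single background worker preserves event order while feeding every sink the same stream of transactions.</p>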
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-12.59.23-PM.png" target="_blank"><img class="alignnone  wp-image-5920" title="Screen shot 2013-04-25 at 12.59.23 PM" src="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-12.59.23-PM.png" alt="" width="1056" height="834" /></a></p>
<h4>4. First Deployment Risk Mitigation Plans</h4>
<p>When the pilot is complete and has proven to be better performing than the legacy system with much lower maintenance costs, we are ready to partially turn off our first piece of the legacy system. The legacy system, especially if living on a mainframe, should stay there for a period of time to support ongoing business. During this time, new transactions should start happening on the new system and data can be validated against the original system to make sure it is behaving exactly as expected. This will minimize risk, assure a seamless architecture evolution, and avoid headaches from unexpected problems. While the deployment acts as an advanced, operational cache for the mainframe, the mainframe still receives the data it needs while both analytical and real-time or predictive analytics data stores are updated.</p>
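<p>One way to picture the validation described in this step is a side-by-side check that runs the same transactions through both systems and reports any differences. The Python sketch below is only an illustration: the two fee functions are hypothetical stand-ins for the legacy and modernized systems, and a real comparison would read results from each platform instead of calling local functions.</p>

```python
def validate_against_legacy(transactions, legacy_system, new_system):
    """Run each transaction through both systems and collect any differences."""
    mismatches = []
    for txn in transactions:
        expected = legacy_system(txn)
        actual = new_system(txn)
        if actual != expected:
            mismatches.append({"transaction": txn, "legacy": expected, "new": actual})
    return mismatches


# Hypothetical stand-ins for the two systems under comparison.
def legacy_fee(txn):
    return round(txn["amount"] * 0.02, 2)


def new_fee(txn):
    return round(txn["amount"] * 0.02, 2)


sample = [{"id": 1, "amount": 100.0}, {"id": 2, "amount": 59.99}]
report = validate_against_legacy(sample, legacy_fee, new_fee)
print("mismatches:", len(report))
```

<p>An empty report over a representative period is the kind of evidence that supports switching off the corresponding piece of the legacy system.</p>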
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-13.01.02-PM.png" target="_blank"><img class="alignnone  wp-image-5921" title="Screen shot 2013-04-25 at 13.01.02 PM" src="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-13.01.02-PM.png" alt="" width="1006" height="934" /></a></p>
<h4>5. Evolution</h4>
<p>Step by step, other applications or portions of the mainframe can be carefully migrated to the new platform in a similar manner, keeping risk to a minimum. As this happens, we gradually reduce mainframe usage, costs, and time to market for new deployments. We gain a level of scalability proven by data grids that run some of the most rigorous, high-performance data environments on the planet: those that power financial transactions. We also enable new methods of analysis to unleash business insight and value.</p>
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-13.02.28-PM.png" target="_blank"><img class="alignnone  wp-image-5922" title="Screen shot 2013-04-25 at 13.02.28 PM" src="http://blogs.vmware.com/vfabric/files/2013/04/Screen-shot-2013-04-25-at-13.02.28-PM.png" alt="" width="1114" height="933" /></a></p>
<p>Of course, there is an initial capital expense; however, the investment is justified by reduced operational expenses. Companies can also save on CapEx by leveraging existing, partially used infrastructure, since the software runs on commodity hardware.</p>
<table width="100%" cellspacing="10" cellpadding="10">
<tbody>
<tr>
<td width="95"><a href="http://blogs.vmware.com/vfabric/files/2013/04/fred_melo_headshot.png"><img class="alignright size-full wp-image-5736" title="fred_melo_headshot" src="http://blogs.vmware.com/vfabric/files/2013/04/fred_melo_headshot.png" alt="" width="90" height="90" /></a></td>
<td><span style="color: #333333;"><strong>About the Author:</strong> Frederico Melo (a.k.a. Fred Melo) has a degree in Computer Science and has been working in software engineering for the last 14 years. His areas of expertise include Grid Computing, Highly Scalable Architectures, Big Data, Fast Data and Legacy Modernization. He is currently based in Sao Paulo, Brazil, working as a Field Engineer for Pivotal.</span></td>
</tr>
</tbody>
</table>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/5-steps-to-mainframe-modernization-with-a-big-fast-data-fabric.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Webinar Recap: Pivotal Opens For Business, GE Gets 10% Stake and How Pivotal Plans to Deliver Next-Generation PaaS</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/webinar-recap-pivotal-opens-for-business-ge-gets-10-stake-and-how-pivotal-plans-to-deliver-next-generation-paas.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=webinar-recap-pivotal-opens-for-business-ge-gets-10-stake-and-how-pivotal-plans-to-deliver-next-generation-paas</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/webinar-recap-pivotal-opens-for-business-ge-gets-10-stake-and-how-pivotal-plans-to-deliver-next-generation-paas.html#comments</comments>
		<pubDate>Wed, 24 Apr 2013 16:53:06 +0000</pubDate>
		<dc:creator>Stacey Schneider</dc:creator>
				<category><![CDATA[vFabric]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5444</guid>
		<description><![CDATA[Pivotal is now open for business! Pivotal, first announced in December, is a new venture started by VMware and EMC that is focused on Big Data and Cloud Application Platforms. Formally launched as a stand-alone entity today, Pivotal is led &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/webinar-recap-pivotal-opens-for-business-ge-gets-10-stake-and-how-pivotal-plans-to-deliver-next-generation-paas.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-medium wp-image-5887" title="pivotal_open_graphic" src="http://blogs.vmware.com/vfabric/files/2013/04/pivotal_open_graphic-300x177.png" alt="" width="300" height="177" /></p>
<p><a title="Pivotal" href="http://www.gopivotal.com" target="_blank">Pivotal</a> is now open for business!</p>
<p>Pivotal, <a title="first announced in December" href="http://blogs.vmware.com/console/2012/12/the-pivotal-initiative.html" target="_blank">first announced in December</a>, is a new venture started by VMware and EMC that is focused on Big Data and Cloud Application Platforms. Formally launched as a stand-alone entity today, Pivotal is led by former VMware CEO Paul Maritz, who has been working as Chief Strategy Officer at EMC since last August.</p>
<p>In a webinar today, Maritz not only confirmed the new initiative is now a stand-alone business with 1,250 employees from VMware and EMC, but he also surprised listeners with an announcement that General Electric is making a strategic investment of $105 million into Pivotal. GE’s Vice President and Corporate Officer Bill Ruh joined the webinar today and said GE will hold a 10% stake in the new company. GE CEO Jeff Immelt also joined the call. The investment brings the value of the newly launched Pivotal to $1 billion.</p>
<p><a title="Pivotal Announces Planned Strategic Investment from GE" href="http://www.prnewswire.com/news-releases-test/pivotal-announces-planned-strategic-investment-from-ge-204457381.html">GE also announced this morning</a> that their Software Center is standardizing on several of Pivotal’s technologies, essentially being the first public customer to endorse the new company.<span id="more-5444"></span></p>
<p>GE and Pivotal have also entered a commercial agreement to begin a broad research and development project aimed to advance new analytic services and applications that support GE’s <a title="vision of the Industrial Internet" href="http://files.gereports.com/wp-content/uploads/2012/11/ge-industrial-internet-vision-paper.pdf" target="_blank">vision of the Industrial Internet</a>. In a <a title="press announcement from GE" href="http://gopivotal.com/about-pivotal/press-center/04242013-launch01" target="_blank">press announcement from GE</a> earlier today, the company detailed more about what they plan to accomplish:</p>
<p style="padding-left: 30px;"><em>Over the past two decades, technology has connected people globally and created unprecedented opportunities for business and consumers. In the next decade, the Internet will also transform industries like aviation, rail, energy, oil &amp; gas, infrastructure and healthcare, connecting human insight with machine intelligence, advanced analytics and low-cost-sensing to drive new levels of productivity and efficiency for global industries.</em></p>
<p>Pivotal and GE have established an R&amp;D facility in San Ramon, California that is already in operation with over 400 employees developing new ways for industrial and medical products to use big data. Ownership of Pivotal will be split as follows: EMC will own 62 percent, VMware 28 percent, and GE 10 percent.</p>
<h2>How and Why Pivotal Formed</h2>
<p>Last August, Maritz left VMware to pursue a vision: advance the industry so that Platform-as-a-Service becomes widely appealing to the enterprise, and dramatically accelerate its adoption. Basically, Maritz wants to sell you “<a title="Google in a Box" href="http://www.wired.com/wiredenterprise/2013/02/pivotal/" target="_blank">Google in a Box</a>”. He wants to make it easy for you to achieve Google-style engineering, and become the next great software company that much more quickly.</p>
<p>Pivotal is aimed at helping companies get enormous jump-starts on development, and make it easy for applications to scale across cheap commodity servers in public and private clouds. It will make it easy for companies to be more like Google, Facebook, Twitter and Instagram without having to invent the development and production systems themselves. With Pivotal, they will have a highly-productive development platform and ultra-scalable production systems that deal with big, fast data inherently. And it will all be delivered as a service—ready to go when you are.</p>
<p>This type of development environment will lower the barrier to entry for new innovative applications and right-size the economics of investment for data center support based on real usage. This idea is so appealing to companies and analysts that it is expected to be an $8 billion industry this year and to grow to $20 billion within the next 5 years.</p>
<h2>Announcing Pivotal One: A Next Generation Enterprise Platform-as-a-Service (PaaS)</h2>
<p>Armed with products from VMware including Cloud Foundry, Spring, and the vFabric middleware solutions, and EMC’s Greenplum and Pivotal consulting experts, Pivotal already has many solutions in place. However, today Pivotal&#8217;s Scott Yara revealed a plan for Pivotal One, the name of Pivotal&#8217;s next-generation Enterprise PaaS that will integrate new data fabrics, modern programming frameworks, cloud portability and support for legacy systems.</p>
<p>According to Yara, in order to achieve the next level of productivity, this solution will have to be targeted at the developer. Pivotal One will improve developer productivity, simplify and speed how applications deal with large data, and enable true cloud independence. Designed for the enterprise, it will also allow companies to <a title="still use legacy systems and even improve their performance" href="http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html" target="_blank">still use legacy systems and even improve their performance</a>.</p>
<p>Yara also outlined another key requirement: placing data as the top priority. Modern applications will need to build access to data and real-time analytics into every solution. Pivotal One will invest heavily in big, fast data solutions, including Hadoop.</p>
<p>The first release of Pivotal One will be in Q4 of this year. Components of the Pivotal One Platform will include:</p>
<p style="padding-left: 30px;"><strong>Pivotal Data Fabric </strong></p>
<p style="padding-left: 30px;"><a title="Pivotal HD" href="http://www.greenplum.com/products/pivotal-hd" target="_blank">Pivotal HD</a> provides integrated, advanced data services by integrating the industry leading massively parallel processing (MPP) Greenplum Database with enterprise-hardened Apache Hadoop. With enterprise data services like HAWQ and Pivotal in-memory data grid technology that make Hadoop more stable and usable, Pivotal HD reduces implementation costs for Hadoop in all but the most complex environments.</p>
<p style="padding-left: 30px;"><strong>Pivotal Cloud and Application Platform </strong></p>
<p style="padding-left: 30px;">Based on <a title="Cloud Foundry" href="http://www.cloudfoundry.com" target="_blank">Cloud Foundry</a> and <a title="Spring" href="http://www.springsource.org" target="_blank">Spring</a>, building applications with Pivotal builds in portability, scaling, automation, and resiliency. Pivotal Application Fabric bakes in rapid application development for messaging, application services and database services with the Spring developer ecosystem and analytic and visualization instrumentation. This platform will also feature other products acquired from VMware’s portfolio including Groovy, Grails, GemFire, SQLFire, RabbitMQ, tc Server, and Web Server.</p>
<p style="padding-left: 30px;"><strong>Pivotal Expert Services</strong></p>
<p style="padding-left: 30px;">Pivotal Expert Services is the consulting arm of the operation that helps companies achieve agile development and sophisticated data analytics on a project-by-project basis.</p>
<p style="padding-left: 30px;"><strong>Pivotal Labs</strong></p>
<p style="padding-left: 30px;">Pivotal Labs is the for-hire engineering team known for its industry-leading agile development methodology, which has helped hundreds of companies rapidly build and deploy modern mobile, web, and enterprise applications.</p>
<p style="padding-left: 30px;"><strong>Data Science Labs</strong></p>
<p style="padding-left: 30px;">The Pivotal Data Science team is a dedicated team of consulting experts that will partner with businesses to accelerate analytics projects and exploit new value hidden in their data.</p>
<p style="padding-left: 30px;"><strong>Open Source Support</strong></p>
<p style="padding-left: 30px;">Collaborative and customer-driven open source support and co-development.</p>
<p style="padding-left: 30px;"><strong>Pivotal Open Source Software</strong></p>
<p style="padding-left: 30px;">Pivotal supports some of the most vibrant open source software communities in the world, including Spring, Cloud Foundry, RabbitMQ, Redis, OpenChorus, and more.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/webinar-recap-pivotal-opens-for-business-ge-gets-10-stake-and-how-pivotal-plans-to-deliver-next-generation-paas.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>15% Discount for Spring Java Training in May</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/15-discount-for-spring-java-training-in-may.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=15-discount-for-spring-java-training-in-may</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/15-discount-for-spring-java-training-in-may.html#comments</comments>
		<pubDate>Tue, 23 Apr 2013 16:14:50 +0000</pubDate>
		<dc:creator>Stacey Schneider</dc:creator>
				<category><![CDATA[Spring]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[training]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5850</guid>
		<description><![CDATA[Training is a great way to speed up development, learn how to improve performance and usability for your applications and generally build confidence in your skills. This month, SpringSource is offering java developers a 15% discount code on all VMware &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/15-discount-for-spring-java-training-in-may.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/15off.jpg"><img class="alignleft size-full wp-image-5855" title="15% off" src="http://blogs.vmware.com/vfabric/files/2013/04/15off.jpg" alt="" width="340" height="200" /></a></p>
<p>Training is a great way to speed up development, learn how to improve performance and usability for your applications and generally build confidence in your skills. This month, SpringSource is offering Java developers a 15% discount code on all VMware trainings including Core Spring, Spring Web, Enterprise Integration, and Hibernate classes.</p>
<p>To secure your 15% discount, be sure to use the promo code <strong>springcustomerpromo </strong>during your registration process (promo is not available for partners). All qualifying classes for May 2013 are listed below:</p>
<p><strong>Step 1: Core Spring</strong></p>
<p><em>Americas</em></p>
<ul>
<li>May 07 – 10: <a title="Core Spring in Boston, MA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173649" target="_blank">Core Spring in Boston, MA</a></li>
<li>May 07 – 10: <a title="Core Spring in Dallas, TX" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173648" target="_blank">Core Spring in Dallas, TX</a></li>
<li>May 13 – 16: <a title="Core Spring in Phoenix, AZ" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173216" target="_blank">Core Spring in Phoenix, AZ</a></li>
<li>May 13 – 16: <a title="Core Spring in Sacramento, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173219" target="_blank">Core Spring in Sacramento, CA</a></li>
<li>May 13 – 16: <a title="Core Spring in San Francisco, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173220" target="_blank">Core Spring in San Francisco, CA</a></li>
<li>May 13 – 16: <a title="Core Spring in San Jose, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173223" target="_blank">Core Spring in San Jose, CA</a></li>
<li>May 14 – 17: <a title="Core Spring in Washington, DC" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173650" target="_blank">Core Spring in Washington, DC</a></li>
<li>May 14 – 17: <a title="Core Spring in Los Angeles, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173651" target="_blank">Core Spring in Los Angeles, CA</a></li>
<li>May 14 – 17: <a title="Core Spring in Portland, OR" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173652" target="_blank">Core Spring in Portland, OR</a></li>
<li>May 21 – 24: <a title="Core Spring in Salt Lake City, UT" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173654" target="_blank">Core Spring in Salt Lake City, UT</a></li>
<li>May 21 – 24: <a title="Core Spring in Seattle, WA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173653" target="_blank">Core Spring in Seattle, WA</a></li>
<li>May 27 – 30: <a title="Core Spring in Bogota, Colombia" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=174548" target="_blank">Core Spring in Bogota, Colombia</a></li>
<li>May 28 – 31: <a title="Core Spring in Stamford, CT" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173656" target="_blank">Core Spring in Stamford, CT</a></li>
<li>May 28 – 31: <a title="Core Spring in Charlotte, NC" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173655" target="_blank">Core Spring in Charlotte, NC<span id="more-5850"></span></a></li>
</ul>
<p><em>Asia Pacific</em></p>
<ul>
<li>May 06 – 09: <a title="Core Spring in Canberra, Australia" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=160795" target="_blank">Core Spring in Canberra, Australia</a></li>
<li>May 06 – 09: <a title="Core Spring in Hyderabad, India" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=168065" target="_blank">Core Spring in Hyderabad, India</a></li>
<li>May 06 – 09: <a title="Core Spring in Bangalore, India" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=158753" target="_blank">Core Spring in Bangalore, India</a></li>
<li>May 14 – 17: <a title="Core Spring in Singapore, Singapore" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=154197" target="_blank">Core Spring in Singapore, Singapore</a></li>
<li>May 20 – 23: <a title="Core Spring in Sydney, Australia" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=160793" target="_blank">Core Spring in Sydney, Australia</a></li>
</ul>
<p><em>Europe, Middle East &amp; Africa</em></p>
<ul>
<li>May 07 – 10: <a title="Core Spring in Rome, Italy" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=162023" target="_blank">Core Spring in Rome, Italy</a></li>
<li>May 07 – 10: <a title="Core Spring in Lisbon, Portugal" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=155982" target="_blank">Core Spring in Lisbon, Portugal</a></li>
<li>May 07 – 10: <a title="Core Spring in Madrid, Spain" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=167280" target="_blank">Core Spring in Madrid, Spain</a></li>
<li>May 13 – 16: <a title="Core Spring in London, UK" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=171409" target="_blank">Core Spring in London, UK</a></li>
<li>May 14 – 17: <a title="Core Spring in Kontich, Belgium" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=157576" target="_blank">Core Spring in Kontich, Belgium</a></li>
<li>May 14 – 17: <a title="Core Spring in Paris, France" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=170766" target="_blank">Core Spring in Paris, France</a></li>
<li>May 14 – 17: <a title="Core Spring in Stockholm, Sweden" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=150313" target="_blank">Core Spring in Stockholm, Sweden</a></li>
<li>May 21 – 24: <a title="Core Spring in Zagreb, Croatia" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=174438" target="_blank">Core Spring in Zagreb, Croatia</a></li>
<li>May 21 – 24: <a title="Core Spring in Berlin, Germany" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=140833" target="_blank">Core Spring in Berlin, Germany</a></li>
<li>May 28 – 31: <a title="Core Spring in Prague, Czech Republic" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=174429" target="_blank">Core Spring in Prague, Czech Republic</a></li>
<li>May 28 – 31: <a title="Core Spring in Hamburg, Germany" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=140841" target="_blank">Core Spring in Hamburg, Germany</a></li>
</ul>
<p><strong>Step 2: Spring Web / Enterprise Integration with Spring / Hibernate with Spring</strong></p>
<p><em>Americas</em></p>
<ul>
<li>May 06 – 09: <a title="Spring Web in Edmonton, AB" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173504" target="_blank">Spring Web in Edmonton, AB</a></li>
<li>May 06 – 09: <a title="Spring Web in Phoenix, AZ" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173503" target="_blank">Spring Web in Phoenix, AZ</a></li>
<li>May 06 – 09: <a title="Spring Web in Sacramento, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173499" target="_blank">Spring Web in Sacramento, CA</a></li>
<li>May 06 – 09: <a title="Spring Web in San Francisco, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173500" target="_blank">Spring Web in San Francisco, CA</a></li>
<li>May 06 – 09: <a title="Spring Web in San Jose, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173501" target="_blank">Spring Web in San Jose, CA</a></li>
<li>May 21 – 24: <a title="Enterprise Integration with Spring in Ottawa, ON" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173386" target="_blank">Enterprise Integration with Spring in Ottawa, ON</a></li>
<li>May 21 – 24: <a title="Enterprise Integration with Spring in Toronto, ON" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173387" target="_blank">Enterprise Integration with Spring in Toronto, ON</a></li>
<li>May 21 – 24: <a title="Enterprise Integration with Spring in Montreal, QC" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173385" target="_blank">Enterprise Integration with Spring in Montreal, QC</a></li>
<li>May 21 – 24: <a title="Enterprise Integration with Spring in Edison, NJ" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173383" target="_blank">Enterprise Integration with Spring in Edison, NJ</a></li>
<li>May 21 – 24: <a title="Spring Web in Chicago, IL" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173657" target="_blank">Spring Web in Chicago, IL</a></li>
<li>May 28 – 31: <a title="Enterprise Integration with Spring in Los Angeles, CA" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173658" target="_blank">Enterprise Integration with Spring in Los Angeles, CA</a></li>
</ul>
<p><em>Europe, Middle East &amp; Africa</em></p>
<ul>
<li>May 06 – 08: <a title="Hibernate with Spring in Wien, Austria" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=172350" target="_blank">Hibernate with Spring in Wien, Austria</a></li>
<li>May 06 – 08: <a title="Hibernate with Spring in Berlin, Germany" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=140872" target="_blank">Hibernate with Spring in Berlin, Germany</a></li>
<li>May 06 – 08: <a title="Hibernate with Spring in Zurich, Switzerland" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=172363" target="_blank">Hibernate with Spring in Zurich, Switzerland</a></li>
<li>May 07 – 10: <a title="Enterprise Integration with Spring in Amsterdam, Netherlands" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=167938" target="_blank">Enterprise Integration with Spring in Amsterdam, Netherlands</a></li>
<li>May 14 – 17: <a title="Enterprise Integration with Spring in Munich, Germany" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=140863" target="_blank">Enterprise Integration with Spring in Munich, Germany</a></li>
<li>May 20 – 23: <a title="Enterprise Integration with Spring in London, UK" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=171421" target="_blank">Enterprise Integration with Spring in London, UK</a></li>
<li>May 21 – 24: <a title="Enterprise Integration with Spring in Dublin, Ireland" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=167920" target="_blank">Enterprise Integration with Spring in Dublin, Ireland</a></li>
<li>May 27 – 29: <a title="Hibernate with Spring in Munich, Germany" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=140876" target="_blank">Hibernate with Spring in Munich, Germany</a></li>
<li>May 28 – 31: <a title="Enterprise Integration with Spring in Brussels, Belgium" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=167930" target="_blank">Enterprise Integration with Spring in Brussels, Belgium</a></li>
<li>May 21 – 24: <a title="Spring Web in Prague, Czech Republic" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=174431" target="_blank">Spring Web in Prague, Czech Republic</a></li>
<li>May 21 – 24: <a title="Spring Web in Rome, Italy" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=154282" target="_blank">Spring Web in Rome, Italy</a></li>
<li>May 21 – 24: <a title="Spring Web in Amsterdam, Netherlands" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=167936" target="_blank">Spring Web in Amsterdam, Netherlands</a></li>
</ul>
<p><em>Live Online</em></p>
<ul>
<li>May 06 – 09: <a title="Spring Web Online in Americas" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173498" target="_blank">Spring Web Online in Americas</a></li>
<li>May 21 – 23: <a title="Hibernate with Spring Online in Americas" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173396" target="_blank">Hibernate with Spring Online in Americas</a></li>
<li>May 21 – 24: <a title="Enterprise Integration with Spring Online in Americas" href="http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&amp;a=det&amp;id_course=173384" target="_blank">Enterprise Integration with Spring Online in Americas</a></li>
</ul>
<p>Note: If you cannot find a professional training near you, you can always request an <a title="onsite SpringSource training" href="http://mylearn.vmware.com/mgrReg/message.cfm?ui=www&amp;subject=Onsite%20Training" target="_blank">onsite SpringSource training</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/15-discount-for-spring-java-training-in-may.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>7 Myths on Big Data—Avoiding Bad Hadoop and Cloud Analytics Decisions</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/myths-about-running-hadoop-in-a-virtualized-environment.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=myths-about-running-hadoop-in-a-virtualized-environment</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/myths-about-running-hadoop-in-a-virtualized-environment.html#comments</comments>
		<pubDate>Mon, 22 Apr 2013 17:53:03 +0000</pubDate>
		<dc:creator>Adam Bloom</dc:creator>
				<category><![CDATA[GemFire]]></category>
		<category><![CDATA[RabbitMQ]]></category>
		<category><![CDATA[Serengeti]]></category>
		<category><![CDATA[SQLFire]]></category>
		<category><![CDATA[application]]></category>
		<category><![CDATA[Big]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[elastic]]></category>
		<category><![CDATA[grid]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5780</guid>
		<description><![CDATA[Hadoop is an open source legend built by software heroes. Yet, legends can sometimes be surrounded by myths—these myths can lead IT executives down a path with rose-colored glasses. Data and data usage is growing at an alarming rate.  Just look &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/myths-about-running-hadoop-in-a-virtualized-environment.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://en.wikipedia.org/wiki/Blind_men_and_an_elephant"><img class="alignleft size-full wp-image-5832" title="7 Myths of Big Data" src="http://blogs.vmware.com/vfabric/files/2013/04/header-graphic-7-myths-of-big-data_BigM.png" alt="" width="340" height="200" /></a>Hadoop is an open source legend built by software heroes.</p>
<p>Yet, legends can sometimes be surrounded by myths—these myths can lead IT executives down a path with rose-colored glasses.</p>
<p><a title="data and data usage is growing" href="http://blogs.vmware.com/vfabric/2013/03/why-every-database-must-be-broken-soon.html" target="_blank">Data and data usage are growing at an alarming rate</a>.  Just look at all the numbers from analysts: IDC predicts a <a title="IDC on the 53.4% CAGR for storage" href="http://strata.oreilly.com/2013/01/market-forecast-fisa-big-data-daily-life.html" target="_blank">53.4% growth rate for storage</a> this year, AT&amp;T claims <a title="AT&amp;T on the 20,000% growth of their wireless data traffic over the past 5 years" href="http://www.attinnovationspace.com/innovation/story/a7781181" target="_blank">20,000% growth of their wireless data traffic over the past 5 years</a>, and if you take a look at your own communication channels, it’s guaranteed that the volume of internet content, emails, app notifications, social messages, and automated reports you get every day has dramatically increased.  This is why companies ranging from <a title="McKinsey to Facebook to Walmart are doing something about big data" href="http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html" target="_blank">McKinsey to Facebook to Walmart are doing something about big data</a>.</p>
<p style="padding-left: 60px;"><em>Just like we saw in the dot-com boom of the 90s and the web 2.0 boom of the 2000s, the big data trend will also lead companies to make some really bad assumptions and decisions.</em></p>
<p>Hadoop is certainly one major area of investment for companies to use to solve big data needs. Companies like Facebook that have famously dealt well with large data volumes have publicly touted their successes with Hadoop, so it’s natural that companies approaching big data first look to the successes of others.  A really smart MIT computer science grad once told me, “when all you have is a hammer, everything looks like a nail.” This <a title="functional fixedness" href="http://en.wikipedia.org/wiki/Functional_fixedness" target="_blank">functional fixedness</a> is the cognitive bias to avoid with the hype surrounding <a title="Hadoop" href="http://hadoop.apache.org/" target="_blank">Hadoop</a>. Hadoop is a multi-dimensional solution that can be deployed and used in different ways. Let’s look at some of the most common preconceived notions about Hadoop and big data that companies should know before committing to a Hadoop project:<span id="more-5780"></span></p>
<p><strong>1. Big Data is purely about volume—NOT TRUE</strong></p>
<p>Besides volume, several industry leaders have also touted <a title="variety, variability, velocity, and value" href="http://searchdatamanagement.techtarget.com/news/2240036228/Will-your-organization-benefit-from-big-data-processing-technology" target="_blank">variety, variability, velocity, and value</a>. Putting all arguments about <a title="alliteration" href="http://en.wikipedia.org/wiki/Alliteration" target="_blank">alliteration</a> aside, the point is that data is not just growing—it is moving further towards real-time analysis, coming from structured and unstructured sources, and being used to try and make better decisions. With these considerations, analyzing a large volume of data is not the only way to achieve value. For example, storing and analyzing terabytes of data over time might not add nearly as much value as analyzing 1 gigabyte of really important, impactful information in real time. From a tool-set perspective, you might want an in-memory data grid built for real-time pricing calculations instead of a way to slice and dice historical prices into a <a title="dead horse" href="http://en.wikipedia.org/wiki/Flogging_a_dead_horse" target="_blank">dead horse</a>.</p>
<p><strong>2. Traditional SQL doesn’t work with Hadoop—NOT TRUE</strong></p>
<p>When Facebook, Twitter, Yahoo! and others bet big on Hadoop, they also knew that HDFS and MapReduce were limited in their ability to deal with expressive queries through a language like SQL. This is how <a title="Hive" href="http://en.wikipedia.org/wiki/Apache_Hive" target="_blank">Hive</a>, <a title="Pig" href="http://en.wikipedia.org/wiki/Pig_(programming_language)" target="_blank">Pig</a>, and <a title="Sqoop" href="http://en.wikipedia.org/wiki/Sqoop" target="_blank">Sqoop</a> were ultimately hatched. Given that so much data on earth is managed through SQL, many companies and projects are offering ways to address the compatibility of Hadoop and SQL. <a title="Pivotal HD’s HAWQ" href="http://www.greenplum.com/products/pivotal-hd" target="_blank">Pivotal HD’s HAWQ</a> is one example—a parallel SQL-compliant query engine that has been shown to be 10 to 100s of times faster than other Hadoop query engines in the market today—and it was built to support petabyte data sets.</p>
<p><strong>3. Kill the Mainframe! Hadoop is the only new IT data platform—NOT TRUE</strong></p>
<p>There are many longstanding investments in the IT portfolio, and the mainframe is an example of one that probably should evolve along with ERP, CRM, and SCM. While the mainframe isn’t being buried by companies, it definitely <a title="needs some new legs" href="http://blogs.vmware.com/vfabric/2013/01/four-strategies-for-modernizing-mainframe-applications-to-the-cloud.html" target="_blank">needs a new strategy to grow new legs and expand on the value of its existing investment</a>. For many of our customers that run into <a title="issues with mainframe speed, scale, or cost" href="http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html" target="_blank">issues with mainframe speed, scale, or cost</a>, there are incremental ways to evolve the big iron data platform and actually get more use out of it. For example, in-memory, big data grids like <a title="vFabric SQLFire" href="https://blogs.vmware.com/vfabric/2013/03/vmware-vfabric-sqlfire-1-1-0-released.html" target="_blank">vFabric SQLFire</a> can be <a title="embedded or use distributed caching approaches" href="https://blogs.vmware.com/vfabric/2012/11/3-key-stages-to-evolve-from-legacy-dbs-to-a-global-cloud-data-grid.html" target="_blank">embedded or use distributed caching approaches</a> for dealing with problems like high-speed ingest from queues, speeding mainframe batch processes, or real-time analytical reporting.</p>
<p><strong>4. Virtualized Hadoop takes a performance hit—NOT TRUE</strong></p>
<p>Hadoop was originally designed to run on bare-metal servers; however, as adoption has grown, many companies want it as a data center service running in the cloud. <a title="Why virtualize Hadoop?" href="http://cto.vmware.com/project-serengeti-theres-a-virtual-elephant-in-my-datacenter/" target="_blank">Why do companies want to virtualize Hadoop?</a> First, let’s consider the ability to <a title="manage infrastructure elastically" href="http://serengeti.cloudfoundry.com/pdf/Hadoop%20Virtualization%20Extensions%20on%20VMware%20vSphere%205.pdf" target="_blank">manage infrastructure elastically</a>—we quickly realize that scaling compute resources, like virtual Hadoop nodes, helps with performance when data and compute are separated—otherwise, you would take a Hadoop node down and lose the data with it or add a node and have no data with it. Major Hadoop distributions from MapR, Hortonworks, Cloudera, and Greenplum all support <a title="Project Serengeti" href="http://serengeti.cloudfoundry.com/" target="_blank">Project Serengeti</a> and <a title="Hadoop Virtualization Extensions (HVE)" href="http://serengeti.cloudfoundry.com/pdf/Hadoop%20Virtualization%20Extensions%20on%20VMware%20vSphere%205.pdf" target="_blank">Hadoop Virtualization Extensions (HVE)</a> for this reason. In addition, our research with partners has shown that Hadoop works quite well on vSphere and can even perform better under certain conditions—running 2 or 4 smaller VMs per physical machine often resulted in better performance, up to 14% faster, than a native approach according to <a title="benchmarks we’ve done with partners" href="http://www.vmware.com/files/pdf/techpaper/VMW-Hadoop-Performance-vSphere5.pdf" target="_blank">benchmarks we’ve done with partners</a>.</p>
<p><strong>5. Hadoop only works in your data center—NOT TRUE</strong></p>
<p>First of all, there are SaaS-based cloud solutions, <a title="like Cetas" href="http://cetas.net/products.php" target="_blank">like Cetas</a>, that allow you to run Hadoop, SQL, and real-time analytics in the cloud without investing the time and money it takes to build a large project inside your data center. For a public cloud runtime, Java developers can probably benefit from <a title="Spring Data for Apache Hadoop" href="http://www.springsource.org/spring-data/hadoop" target="_blank">Spring Data for Apache Hadoop</a> and the related examples on <a title="GitHub" href="https://github.com/SpringSource" target="_blank">GitHub</a> or the <a title="online video introduction" href="http://www.youtube.com/watch?v=wlTnBzQ6KDU" target="_blank">online video introduction</a>.</p>
<p><strong>6. Hadoop doesn’t make financial sense to virtualize—NOT TRUE</strong></p>
<p>Hadoop is typically explained as running on a bank of commodity servers—so, one might conclude that adding a virtualization layer adds extra cost but no extra value. There is a flaw in this perspective—you are not considering the fact that data and data analysis are both dynamic. To become an organization that leverages the power of Hadoop to grow, innovate, and create efficiencies, you are going to vary the sources of data, the speed of analysis, and more. Virtualized infrastructure still reduces the physical hardware footprint to bring CAPEX in line with pure commodity hardware, and OPEX is reduced through automation and higher utilization of shared infrastructure.</p>
<p><strong>7. Hadoop doesn’t work on SAN or NAS—NOT TRUE</strong></p>
<p>Hadoop runs on local disks, but it can also <a title="run well in a shared SAN environment" href="http://serengeti.cloudfoundry.com/pdf/Virtualizing-Apache-Hadoop.pdf" target="_blank">run well in a shared SAN environment</a> for small to medium sized clusters with <a title="different cost and performance characteristics" href="http://www.vmware.com/files/pdf/techpaper/VMW-Hadoop-Performance-vSphere5.pdf" target="_blank">different cost and performance characteristics</a>. High-bandwidth networks like 10 Gigabit Ethernet, FCoE, and iSCSI can also support effective performance.</p>
<p><strong>Taking Action to Overcome the Myths</strong></p>
<p>While many of us are fans of big data, this list can help you take a step back and look objectively at the right approach to solving your big data problems. Just like some building projects need hammers and others need screwdrivers, hacksaws, or a welding torch, Hadoop is just one tool to help conquer big data problems. High velocity data may push you towards an in-memory, big data grid like <a title="GemFire" href="https://www.vmware.com/products/application-platform/vfabric-gemfire/overview.html" target="_blank">GemFire</a> or <a title="SQLFire" href="https://www.vmware.com/products/application-platform/vfabric-sqlfire/overview.html" target="_blank">SQLFire</a>. A need for massive, consumer-grade web scale may mean you need message-oriented middleware like <a title="RabbitMQ" href="https://www.vmware.com/products/application-platform/vfabric-rabbitmq/overview.html" target="_blank">RabbitMQ</a>. Getting to market faster may mean you need to look at a full SaaS solution like Cetas, and Redis may meet your needs and find a home in your stack much more easily than a full-blown Hadoop environment.</p>
<p><strong>To learn more about the products in this article:</strong></p>
<ul>
<li>Read over 100 articles about <a title="GemFire Articles" href="http://blogs.vmware.com/vfabric/gemfire" target="_blank">GemFire</a> or <a title="Articles on SQLFire" href="http://blogs.vmware.com/vfabric/sqlfire" target="_blank">SQLFire</a></li>
<li>Check out the <a title="RabbitMQ Case Studies" href="http://blogs.vmware.com/vfabric/rabbitmq" target="_blank">case studies on RabbitMQ</a></li>
<li>See the <a title="Pivotal HD" href="http://www.greenplum.com/products/pivotal-hd" target="_blank">Pivotal HD product page</a> or the <a title="VMware Hadoop" href="vmware.com/hadoop" target="_blank">Hadoop Virtualization pages on VMware.com</a></li>
<li>Learn more about Hadoop in the cloud with <a title="Cetas Hadoop in the Cloud - SaaS" href="http://cetas.net/" target="_blank">Cetas</a></li>
<li>Find out more about <a title="Redis" href="http://redis.io/" target="_blank">Redis</a></li>
</ul>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/myths-about-running-hadoop-in-a-virtualized-environment.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How fast is a Rabbit? Basic RabbitMQ Performance Benchmarks</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/how-fast-is-a-rabbit-basic-rabbitmq-performance-benchmarks.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=how-fast-is-a-rabbit-basic-rabbitmq-performance-benchmarks</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/how-fast-is-a-rabbit-basic-rabbitmq-performance-benchmarks.html#comments</comments>
		<pubDate>Thu, 18 Apr 2013 17:15:51 +0000</pubDate>
		<dc:creator>Adam Bloom</dc:creator>
				<category><![CDATA[Cloud Foundry]]></category>
		<category><![CDATA[RabbitMQ]]></category>
		<category><![CDATA[messaging]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scale]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5783</guid>
		<description><![CDATA[One of the greatest things about RabbitMQ is the community that surrounds it. With open source at its roots, people come together to share their code, their knowledge and their stories of how they’ve deployed it in their projects. At &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/how-fast-is-a-rabbit-basic-rabbitmq-performance-benchmarks.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-5795" title="rabbit_header" src="http://blogs.vmware.com/vfabric/files/2013/04/rabbit_header.png" alt="" width="340" height="200" />One of the greatest things about <a title="RabbitMQ" href="http://www.rabbitmq.com/" target="_blank">RabbitMQ</a> is the community that surrounds it. With open source at its roots, people come together to share their code, their knowledge and their stories of how they’ve deployed it in their projects. At a <a title="recent meetup" href="http://rivierarb.fr/2013/04/02/Drinkup/" target="_blank">recent meetup</a> near Nice, France, database engineer Adina Mihailescu shared a <a title="presentation" href="http://rivierarb.fr/presentations/messaging-systems/" target="_blank">presentation</a> on choosing messaging systems. Supported by Muriel Salvan’s <a title="benchmark" href="http://x-aeon.com/wp/2013/04/10/a-quick-message-queue-benchmark-activemq-rabbitmq-hornetq-qpid-apollo/" target="_blank">benchmark</a> comparing <a title="ActiveMQ" href="http://activemq.apache.org/" target="_blank">ActiveMQ</a>, <a title="RabbitMQ" href="http://www.rabbitmq.com/" target="_blank">RabbitMQ</a>, <a title="HornetQ" href="http://www.jboss.org/hornetq" target="_blank">HornetQ</a>, <a title="Apollo" href="http://activemq.apache.org/apollo/" target="_blank">Apollo</a>, <a title="QPID" href="http://qpid.apache.org/" target="_blank">QPID</a>, and <a title="ZeroMQ" href="http://www.zeromq.org/" target="_blank">ZeroMQ</a>, the presentation included some interesting performance comparisons that we’d like to pass along to you.</p>
<p>In a single laptop benchmark, Salvan ran four different scenarios in order to obtain some insight on performance of the default setups for these messaging solutions. Each test had 1 process dedicated to enqueuing and another dedicated to dequeuing. The message volume and size ranged from 200 to 20,000 to 200,000 messages and 32 to 1024 to 32768 bytes. Both persistent and transient queues and messages were used.<span id="more-5783"></span></p>
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/rabbitmq-performance-benchmark-set-up.png" target="_blank"><img class="alignnone  wp-image-5796" title="rabbitmq-performance-benchmark-set-up" src="http://blogs.vmware.com/vfabric/files/2013/04/rabbitmq-performance-benchmark-set-up.png" alt="" width="903" height="618" /></a></p>
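<p>To make the shape of such a test concrete, here is a minimal Python sketch of one enqueuer and one dequeuer exchanging fixed-size messages. It is not Salvan’s code (his source is linked in this post) and it does not talk to any broker: an in-memory <code>queue.Queue</code> stands in for the broker, threads stand in for the two processes, and the pairing of message counts with sizes in the loop is our assumption for illustration.</p>
<pre><code class="language-python">import queue
import threading
import time


def run_benchmark(message_count, message_size):
    """Time one enqueuer and one dequeuer exchanging fixed-size messages."""
    channel = queue.Queue()            # in-memory stand-in for a broker queue
    payload = b"x" * message_size      # every message has the same size
    received = []                      # sizes of the messages that arrived

    def enqueue():
        for _ in range(message_count):
            channel.put(payload)
        channel.put(None)              # sentinel: tells the dequeuer to stop

    def dequeue():
        while True:
            message = channel.get()
            if message is None:
                break
            received.append(len(message))

    producer = threading.Thread(target=enqueue)
    consumer = threading.Thread(target=dequeue)
    start = time.perf_counter()
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    elapsed = time.perf_counter() - start
    return {"messages": len(received), "bytes": sum(received), "seconds": elapsed}


if __name__ == "__main__":
    # Hypothetical pairing of the volumes and sizes mentioned above.
    for count, size in [(200, 32768), (20000, 1024), (200000, 32)]:
        print(count, size, run_benchmark(count, size))
</code></pre>
<p>Swapping the in-memory queue for a real client library and a persistent queue is what turns a harness like this into a broker comparison; the timings from this sketch say nothing about any of the brokers discussed here.</p>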
<p>The main points are below, and you can also check out <a title="Salvan’s blog post" href="http://x-aeon.com/wp/2013/04/10/a-quick-message-queue-benchmark-activemq-rabbitmq-hornetq-qpid-apollo/" target="_blank">Salvan’s blog post</a> to see the <a title="source code" href="https://github.com/Muriel-Salvan/mq-benchmarks" target="_blank">source code</a>, charts, and a data sheet. Here are the highlighted results:</p>
<ul>
<li>Brokers perform much better with fewer, bigger messages.</li>
<li>Persistence drawbacks appear with big messages, while time on small and medium messages is spent on processing, not I/O.</li>
<li>ZeroMQ’s simple feature-set delivers great performance.</li>
<li>QPID seems to perform the best when persistence is not used.</li>
<li>AMQP seems more optimized than STOMP.</li>
<li>RabbitMQ seems to outperform others by a factor of 3 except for the case of big messages.</li>
</ul>
<p>Adina’s related <a title="presentation" href="http://rivierarb.fr/presentations/messaging-systems/" target="_blank">presentation</a> is titled, “Messaging Systems—How to make the right choice?” In it, she covers the following:</p>
<ul>
<li>What are messaging systems?</li>
<li>How messaging helps with performance, complexity, scalability, quality, cost-effectiveness, high-availability, and messaging patterns</li>
<li>Comparison of the Java Message Service API, AMQP, and STOMP with code examples</li>
<li>Trade-offs</li>
<li>Functional and operational requirements</li>
<li>The benchmark results</li>
</ul>
<p>In one slide, Adina highlights the benefits of message oriented middleware in the cloud, a SaaS model, and explains how overhead is reduced and scale is introduced with greater simplicity and lower cost when messaging takes place in the cloud. Her examples include <a title="Amazon Simple Queue Service" href="http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/Welcome.html" target="_blank">Amazon Simple Queue Service</a>,  <a title="CloudAMQP" href="http://www.cloudamqp.com/" target="_blank">CloudAMQP</a>, and <a title="StormMQ" href="http://stormmq.com/" target="_blank">StormMQ</a>. To add commentary to this slide, we’d like to also point out that <a title="RabbitMQ is available in the cloud" href="http://support.cloudfoundry.com/entries/20346977-RabbitMQ-Cloud-Foundry-Cloud-Messaging-that-Just-Works" target="_blank">RabbitMQ is available in the cloud</a> via <a title="Cloud Foundry and Pivotal" href="http://blog.cloudfoundry.com/2013/03/07/cloud-foundry-is-open-and-pivotal/" target="_blank">Cloud Foundry and Pivotal</a> along with <a title="MongoDB" href="http://www.mongodb.org" target="_blank">MongoDB</a>, <a title="Redis" href="http://redis.io/" target="_blank">Redis</a>, and there is also support for <a title="Scala" href="http://blog.cloudfoundry.com/2011/06/02/cloud-foundry-now-supporting-scala/" target="_blank">Scala</a> and the <a title="Play Framework" href="http://docs.cloudfoundry.com/frameworks/play/play.html" target="_blank">Play Framework</a>. With RabbitMQ and <a title="Cloud Foundry" href="http://www.cloudfoundry.com/about" target="_blank">Cloud Foundry</a>, there are some pretty cool things worth pointing out:</p>
<ul>
<li>You can use <a title="Node.js with the Cloud Foundry RabbitMQ service" href="http://docs.cloudfoundry.com/services/rabbitmq/nodejs-rabbitmq.html" target="_blank">Node.js with the Cloud Foundry RabbitMQ service</a></li>
<li>In Java, you can use <a title="Grails with RabbitMQ on Cloud Foundry" href="http://www.littlelostmanuals.com/2011/08/grails-rabbitmq-cloud-foundry-messaging.html" target="_blank">Grails with RabbitMQ on Cloud Foundry</a> or build <a title="Spring apps that hook up to the RabbitMQ service" href="http://docs.cloudfoundry.com/services/rabbitmq/spring-rabbitmq.html" target="_blank">Spring apps that hook up to the RabbitMQ service</a></li>
<li>With Ruby on Rails, you can develop in <a title="Ruby or Sinatra  via the bunny gem and access RabbitMQ services" href="http://docs.cloudfoundry.com/services/rabbitmq/ruby-rabbitmq.html" target="_blank">Ruby or Sinatra  via the bunny gem and access RabbitMQ services</a></li>
<li>There is a new <a title="simulator, video, and open source bits available" href="http://blogs.vmware.com/vfabric/2013/03/introducing-the-rabbitmq-simulator-video-open-source-bits.html" target="_blank">simulator, video, and open source bits available</a> for RabbitMQ and a how-to article for deploying the <a title="RabbitMQ simulator on Cloud Foundry" href="http://blogs.vmware.com/vfabric/2013/03/howto-deploying-rabbitmq-simulator-on-cloud-foundry.html" target="_blank">RabbitMQ simulator on Cloud Foundry</a>.</li>
</ul>
<h3>More About RabbitMQ Performance</h3>
<p>Less than a year ago, we began publishing information about RabbitMQ’s performance. We summarize it here as the start of a collection of RabbitMQ performance information.</p>
<p>In <a title="part one" href="http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/" target="_blank">part one</a>, we measured performance on a single PowerEdge R610 with dual Xeon E5530s and 40GB of RAM, running <a title="RabbitMQ 2.8.1" href="http://blogs.vmware.com/vfabric/2012/06/vmware-vfabric-suite-51-significantly-upgrades-vfabric-rabbitmq.html" target="_blank">RabbitMQ 2.8.1</a> and Erlang R15B with HiPE compilation enabled; the code is available. We showed the send rate, receive rate, and average latency <a title="improvements between RabbitMQ 2.7.1 and 2.8.1 due to a capability called internal flow control" href="http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/" target="_blank">improvements between RabbitMQ 2.7.1 and 2.8.1 due to a capability called internal flow control</a>, along with significant memory improvements.</p>
<p>In <a title="part 2" href="http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/" target="_blank">part 2</a>, we outlined how different features affect performance using smaller messages. As a baseline, we showed 44,824 messages per second with auto-ack, one producer, and one consumer. Looking at aspects of performance independently, we could publish at 53,710 messages per second with no consumption and consume 64,315 messages per second stand-alone. We also shared how the mandatory and immediate flags, acknowledgements, publish confirms, and message persistence affect these rates. The diagram below shows how message send rate and byte rate change with message size: message rates drop as size increases, but the number of bytes sent per second increases. You can read more about message size, horizontal scale, and prefetch counts in this portion of the <a title="post" href="http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/" target="_blank">post</a>.</p>
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/sending-rate-message-sizes.png" target="_blank"><img class="alignnone  wp-image-5793" title="sending-rate-message-sizes" src="http://blogs.vmware.com/vfabric/files/2013/04/sending-rate-message-sizes.png" alt="" width="763" height="364" /></a></p>
<p>In the article, we go on to publish information for scenarios with large queues and paging—we load and drain queues of 500,000 and 10,000,000 messages to show where the bottlenecks occur. There are also other <a title="in-depth performance resources" href="http://www.rabbitmq.com/blog/2011/10/27/performance-of-queues-when-less-is-more/" target="_blank">in-depth performance resources</a> on the RabbitMQ blog, and you can read over 45 case studies, tutorials, and updates on RabbitMQ at the vFabric Blog.</p>
<p>Of course, in the spirit of community, if you’ve run across any great performance articles on messaging and RabbitMQ, we would love to hear from you and share your stories on this blog.  To let us know, leave a comment below and we will contact you.</p>
<p>For more information on RabbitMQ:</p>
<ul>
<li>Check out the <a title="vFabric Product Line and vFabric RabbitMQ" href="http://www.vmware.com/products/application-platform/vfabric-rabbitmq.html" target="_blank">vFabric RabbitMQ product page</a></li>
<li>Learn about the <a title="New vFabric Reference Architecture and how RabbitMQ fits" href="http://blogs.vmware.com/vfabric/2013/02/introducing-a-new-reference-architecture-that-will-speed-knowledge-development-of-modern-cloud-applications.html" target="_blank">New vFabric Reference Architecture and how RabbitMQ fits</a></li>
<li>Download a trial of RabbitMQ</li>
<li>Read how Rabbit is integrated with <a title="over 100 developer tools and platforms" href="http://www.rabbitmq.com/devtools.html" target="_blank">over 100 developer tools and platforms</a></li>
<li>Check out case studies on how <a title="Instagram" href="http://blogs.vmware.com/vfabric/2013/04/how-instagram-feeds-work-celery-and-rabbitmq.html" target="_blank">Instagram</a>, <a title="HuffingtonPost Live" href="http://blogs.vmware.com/vfabric/2013/03/scaling-real-time-comments-huffpost-live-with-rabbitmq.html" target="_blank">HuffingtonPost Live</a>, <a title="Roblox" href="http://blogs.vmware.com/vfabric/2012/10/roblox-rabbitmq-hybrid-clouds-and-1-billion-page-viewsmonth.html" target="_blank">Roblox</a>, and <a title="Indeed.com" href="http://blogs.vmware.com/vfabric/2013/03/how-indeed-com-handles-35-million-job-postings-per-day-using-rabbitmq.html" target="_blank">Indeed.com</a> scale with RabbitMQ.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/how-fast-is-a-rabbit-basic-rabbitmq-performance-benchmarks.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>10 Ways to Make Hadoop Green in the CFO’s Eyes</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=why-virtualize-hadoop</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html#comments</comments>
		<pubDate>Wed, 17 Apr 2013 16:41:11 +0000</pubDate>
		<dc:creator>Adam Bloom</dc:creator>
				<category><![CDATA[Serengeti]]></category>
		<category><![CDATA[CEO]]></category>
		<category><![CDATA[CFO]]></category>
		<category><![CDATA[co-locate]]></category>
		<category><![CDATA[elastic]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[investment]]></category>
		<category><![CDATA[time-share]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5756</guid>
		<description><![CDATA[Hadoop is used by some pretty amazing companies to make use of big, fast data—particularly unstructured data. Huge brands on the web like AOL, eBay, Facebook, Google, Last.fm, LinkedIn, MercadoLibre, Ning, Quantcast, Spotify, Stumbleupon, Twitter, as well as some more &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-5775" title="Hadoop-green-on-black" src="http://blogs.vmware.com/vfabric/files/2013/04/Hadoop-green-on-black.jpg" alt="" width="340" height="200" />Hadoop is <a title="used by some pretty amazing companies" href="http://wiki.apache.org/hadoop/PoweredBy" target="_blank">used by some pretty amazing companies</a> to make use of big, fast data—particularly unstructured data. Huge brands on the web like AOL, eBay, Facebook, Google, Last.fm, LinkedIn, MercadoLibre, Ning, Quantcast, Spotify, Stumbleupon, Twitter, as well as some more brick and mortar giants like <a title="GE" href="http://news.cnet.com/8301-13846_3-20016013-62.html" target="_blank">GE</a>, <a title="Walmart" href="http://gigaom.com/2012/03/23/walmart-labs-is-building-big-data-tools-and-will-then-open-source-them/" target="_blank">Walmart</a>, <a title="Morgan Stanley" href="http://www.forbes.com/sites/tomgroenfeldt/2012/05/30/morgan-stanley-takes-on-big-data-with-hadoop/" target="_blank">Morgan Stanley</a>, <a title="Sears" href="http://www.informationweek.com/global-cio/interviews/why-sears-is-going-all-in-on-hadoop/240009717" target="_blank">Sears</a>, and <a title="Ford" href="http://www.datanami.com/datanami/2013-03-16/how_ford_is_putting_hadoop_pedal_to_the_metal.html" target="_blank">Ford</a> use Hadoop.</p>
<p>Why? In a nutshell, companies like <a title="McKinsey" href="http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation" target="_blank">McKinsey</a> believe the use of big data and technologies like Hadoop will allow companies to better compete and grow in the future.</p>
<p>Hadoop is used to support a variety of valuable business capabilities—analysis, search, machine learning, data aggregation, content generation, reporting, integration, and more. All types of industries use Hadoop—media and advertising, A/V processing, credit and fraud, security, geographic exploration, online travel, financial analysis, mobile phones, sensor networks, e-commerce, retail, energy discovery, video games, social media, and more.<span id="more-5756"></span></p>
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/Figure-1-Industry-Trends_Source-Forrester-survey-of-60-CIOs-September-2011.jpg" target="_blank"><img class="alignnone  wp-image-5763" title="Figure-1-Industry-Trends_Source-Forrester-survey-of-60-CIOs-September-2011" src="http://blogs.vmware.com/vfabric/files/2013/04/Figure-1-Industry-Trends_Source-Forrester-survey-of-60-CIOs-September-2011.jpg" alt="" width="1050" height="512" /></a></p>
<p>At first glance, it sounds like many of the above business needs were already solved by conventional data warehouses, business intelligence, and statistical analysis programs. This is not the case—the conventional systems begin to fail when the data sets become too large, include fast-growing unstructured data formats, or face both of these issues. With size and complexity issues, traditional BI systems can become too expensive. This is why Hadoop was invented.</p>
<p>Simply put, Hadoop follows the <a title="MapReduce model" href="http://en.wikipedia.org/wiki/MapReduce" target="_blank">MapReduce model</a> to slice data into chunks of work, spread the work across a large number of commodity servers, and aggregate the work back into a single output. Its parallel computing approach out-scales the old models and is more cost-effective at doing so.</p>
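<p>As a rough illustration of that slice, spread, and aggregate flow, here is a minimal single-machine word count in Python. It is not Hadoop code and uses no Hadoop API: a thread pool stands in for the commodity servers, and every function name is hypothetical, chosen only for this sketch.</p>
<pre><code class="language-python">from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Slice the input into independent chunks of work."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(lines):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle(mapped_chunks):
    """Group the emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: aggregate each key's values into a single output."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, chunk_size=2, workers=4):
    chunks = split_into_chunks(lines, chunk_size)
    # The worker threads are a stand-in for a cluster of commodity servers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = ["big data big value", "fast data", "big fast data"]
    print(word_count(sample))
</code></pre>
<p>Because each chunk is mapped independently, adding workers (or, in a real cluster, nodes) lets more chunks run at once, which is the property the model relies on to scale.</p>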
<h3>Effectively Managing Hadoop from the CFO’s Eyes</h3>
<p>In the early days of the web and enterprise apps, everyone got so enamored by the potential for growth and productivity that both business and IT teams spent money prematurely—we ended up with a massive number of underutilized servers that cost us an arm and a leg to operate. Then, we spent more to virtualize these resources, get better utilization out of our datacenters and reduce our overhead.</p>
<p>With the big data technology trend, we are facing the same excitement around Hadoop. It’s going to be an investment area for the next decade or two, and <a title="your CFO is going to see this coming" href="http://www.cfoworld.com/technology/59541/idc-big-data-hype-still-here-maturity-beckons" target="_blank">your CFO is going to see this coming</a>. This time around, we can spend IT dollars much more wisely by putting <a title="Hadoop on virtualized infrastructure" href="http://www.vmware.com/hadoop/overview.html" target="_blank">Hadoop on virtualized infrastructure</a> from the beginning. For those of us who have learned the painful TCO lessons from the past and understand the economics of virtualization, here is a list of ten key, financially sound, cloud infrastructure requirements that should be part of any Hadoop project:</p>
<ol>
<li>Initial Hadoop projects should be explored for the most pressing issues in the company and start by aligning with the CEO and CFO’s top needs and goals.</li>
<li>Hadoop investments should run with the same data center efficiency and cost effectiveness as other virtualized platforms that have high server consolidation ratios and require less CapEx and OpEx than non-virtualized environments.</li>
<li>Hadoop pilots should identify a big problem, make the scope concise, and complete quickly to prove the time-to-value and identify future costs and risks thoroughly. We all learn by doing—don’t drag out the time to value by over-engineering.</li>
<li>Hadoop must be able to co-locate with existing applications and run on existing virtualized hosts. This approach should accommodate a Hadoop pilot without new hardware or help manage shared infrastructure budgets in a cost-effective manner.</li>
<li>Hadoop nodes should use the concept of time sharing. For example, when email, database, web, or ERP applications are idle, the compute power available should be transferred to Hadoop nodes that are analyzing improvements in business performance.</li>
<li>The Hadoop infrastructure should be able to scale up or down elastically, on-demand, and across clouds for burst compute needs. This capability would allow you to expedite a big analysis on your company’s performance by temporarily adding new Hadoop nodes on a 3<sup>rd</sup> party cloud service to increase capacity.</li>
<li>Hadoop VMs should not require significant resources to scale, provision, deploy, replicate, or move because a cloud-centric, virtual machine infrastructure can accommodate this.</li>
<li>Hadoop should be available to the company as a shared service. This is one of the most cost-effective ways to provide Hadoop as a service. In this model, it is available to all departments based on chargeback accounting. Even with shared services, virtualization still allows for enough isolation to meet independent business and security needs.</li>
<li>Hadoop should not require expensive, high availability or fault tolerance (i.e. no downtime) frameworks based on hardware. Distributed computing is meant for commodity computing in the cloud.</li>
<li>Hadoop training, at least at a high level, should be provided to every IT person who engages with various business units and departments—Hadoop attracts talent and paves careers.</li>
</ol>
<p>To learn more about how VMware is helping virtualize Hadoop clusters, check out <a title="Project Serengeti" href="http://www.vmware.com/hadoop/overview.html" target="_blank">Project Serengeti</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/why-virtualize-hadoop.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Upcoming Webinar: Paul Maritz on Pivotal and The New Platform for the New Era</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/upcoming-webinar-paul-maritz-on-pivotal-and-the-new-platform-for-the-new-era.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=upcoming-webinar-paul-maritz-on-pivotal-and-the-new-platform-for-the-new-era</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/upcoming-webinar-paul-maritz-on-pivotal-and-the-new-platform-for-the-new-era.html#comments</comments>
		<pubDate>Wed, 17 Apr 2013 15:55:26 +0000</pubDate>
		<dc:creator>Stacey Schneider</dc:creator>
				<category><![CDATA[Cloud Foundry]]></category>
		<category><![CDATA[GemFire]]></category>
		<category><![CDATA[Mobile]]></category>
		<category><![CDATA[RabbitMQ]]></category>
		<category><![CDATA[Spring]]></category>
		<category><![CDATA[SQLFire]]></category>
		<category><![CDATA[tc Server]]></category>
		<category><![CDATA[vFabric]]></category>
		<category><![CDATA[Web Server]]></category>
		<category><![CDATA[Pivotal]]></category>
		<category><![CDATA[webinar]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5750</guid>
		<description><![CDATA[The cloud, mobile applications and big, fast data are fundamentally changing how applications are built and modernized today. To speed this transformation at the enterprise level, Pivotal, the new venture by VMware and EMC, will host a live streaming event &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/upcoming-webinar-paul-maritz-on-pivotal-and-the-new-platform-for-the-new-era.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://gopivotal.com/"><img class="alignright size-full wp-image-5767" title="webinar" src="http://blogs.vmware.com/vfabric/files/2013/04/webinar.jpg" alt="" width="340" height="200" /></a>The cloud, mobile applications and big, fast data are fundamentally changing how applications are built and modernized today. To speed this transformation at the enterprise level, Pivotal, <a title="the new venture by VMware and EMC" href="http://blogs.vmware.com/console/2012/12/the-pivotal-initiative.html" target="_blank">the new venture by VMware and EMC</a>, will host a live streaming event on April 24th at 10:00 am Pacific/1:00 pm Eastern with a special announcement and an unveiling of its plans to build “A New Platform for a New Era”.</p>
<p>The Pivotal platform will unite data, application, and cloud fabrics, helping enterprises to develop faster, understand more, and succeed at an even greater scale. It is a platform that makes <a title="the consumer grade enterprise" href="http://www.greenplum.com/blog/topics/data-science/paul-maritz-calls-for-a-consumer-grade-enterprise-platform" target="_blank">the consumer grade enterprise</a> a reality.</p>
<p>Pivotal brings together a prodigious set of technologies and talent from a number of EMC and VMware entities, which include <a title="Greenplum" href="http://www.greenplum.com/" target="_blank">Greenplum</a>, <a title="Cloud Foundry" href="http://www.cloudfoundry.com/" target="_blank">Cloud Foundry</a>, <a title="Spring" href="http://www.springsource.org/" target="_blank">Spring</a>, <a title="GemFire" href="http://www.vmware.com/products/application-platform/vfabric-gemfire/overview.html" target="_blank">GemFire</a> and other products from the <a title="VMware vFabric Suite" href="http://www.vmware.com/products/application-platform/vfabric/overview.html" target="_blank">VMware vFabric Suite</a>, <a title="Cetas" href="http://www.cetas.net/" target="_blank">Cetas</a>, and <a title="Pivotal Labs" href="http://pivotallabs.com/" target="_blank">Pivotal Labs</a>.</p>
<table width="100%" cellspacing="5" cellpadding="5">
<tbody>
<tr>
<td><span style="color: #339966;"><strong>&gt;&gt; Register for webinar <a title="here" href="http://gopivotal.com/" target="_blank">here</a>!<br />
</strong></span></td>
</tr>
</tbody>
</table>
<p>Paul Maritz, the Pivotal Leadership Team, and special guests will unveil this platform, and make a special announcement during a live streaming event on Wednesday, April 24th at 10:00 am Pacific/1:00 pm Eastern.</p>
<p>Sign up for the event at <a title="gopivotal.com" href="http://gopivotal.com/" target="_blank">gopivotal.com</a> and follow <a title="@gopivotal" href="http://twitter.com/gopivotal" target="_blank">@gopivotal</a> on Twitter for updates.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/upcoming-webinar-paul-maritz-on-pivotal-and-the-new-platform-for-the-new-era.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Banks Are Breaking Away From Mainframes to Big, Fast Data Grids</title>
		<link>http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=banks-are-breaking-away-from-mainframes-to-big-fast-data-grids</link>
		<comments>http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html#comments</comments>
		<pubDate>Tue, 16 Apr 2013 16:46:22 +0000</pubDate>
		<dc:creator>vFabric Team</dc:creator>
				<category><![CDATA[GemFire]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[Big]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[cost]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[financial]]></category>
		<category><![CDATA[mainframe]]></category>
		<category><![CDATA[messages]]></category>
		<category><![CDATA[MIPS]]></category>
		<category><![CDATA[scale]]></category>
		<category><![CDATA[TCO]]></category>

		<guid isPermaLink="false">http://blogs.vmware.com/vfabric/?p=5725</guid>
		<description><![CDATA[The world’s largest banks have historically relied on mainframes to manage all their transactions and the related cash and profit. In mainframe terms, hundreds of thousands of MIPS are used to keep the mainframe running these transactions, and the cost &#8230; <a href="http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-5727" title="bigfastdata" src="http://blogs.vmware.com/vfabric/files/2013/04/bigfastdata.jpg" alt="" width="340" height="200" /></p>
<p>The world’s largest banks have historically relied on mainframes to manage all their transactions and the related cash and profit. In mainframe terms, hundreds of thousands of <a title="MIPS" href="http://en.wikipedia.org/wiki/Instructions_per_second" target="_blank">MIPS</a> are used to keep the mainframe running these transactions, and the cost per MIP can make mainframes extremely expensive to operate. For example, <a title="Sears was seeing the overall cost per MIP at $3000-$7000 per year" href="http://blogs.wsj.com/cio/2012/06/17/to-better-compete-with-amazon-sears-takes-baby-steps-away-from-mainframes/" target="_blank">Sears was seeing the overall cost per MIP at $3000-$7000 per year</a> and didn’t see that as a cost-effective way to compete with Amazon. While the price of MIPS has continued to improve, mainframes can also face pure capacity issues.</p>
<p>In today’s world of financial regulations, risk, and compliance, the entire history of all transactions must be captured, stored, and available to report on or search both immediately and over time. This way, banks can meet audit requirements and allow for scenarios like a customer service call that results in an agent search for the transaction history leading up to a customer’s current account balance. The volume of information created across checking, savings, credit card, insurance, and other financial products is tremendous—it’s large enough to bring a mainframe to its knees.<span id="more-5725"></span></p>
<h3>Mainframe Jam Can Be Sticky</h3>
<p>No, this is not some type of jelly you find in a mason jar at a farmer’s market in <a title="Armonk, NY" href="http://en.wikipedia.org/wiki/Armonk,_New_York" target="_blank">Armonk, NY</a>.</p>
<p>As with other data stores, a spike in resource requests can bring a mainframe to a halt. When mainframes get jammed and go offline for a few minutes, it costs banks millions of dollars per minute. Mainframe jams eventually cause the executive team to drop what they are doing until they know the CIO and IT organization are focused 100% on removing the jam and working to avoid it in the future.</p>
<p>What causes a mainframe jam? In the case of one of Latin America&#8217;s largest banks, it is simply the volume of data. Their mainframe runs transactions and creates logs for compliance, risk, and audit purposes. These logs are sent to a message queue and database that also run on the mainframe hardware. If this database is overloaded, the messages queue up, and the system slows down until it begins to deny transactions.</p>
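<p>The failure mode described above can be pictured as a bounded queue that fills whenever messages arrive faster than the database absorbs them. The following is a minimal sketch of that idea only: the function name, the rates, and the queue capacity are hypothetical values chosen for illustration, not figures from the bank.</p>

```python
# Toy model of a "jam": log messages arrive faster than the database can
# absorb them, the queue fills, and new transactions start being denied.
# All rates and the capacity used below are hypothetical.


def seconds_until_denial(arrival_rate, drain_rate, queue_capacity):
    """Return seconds until a bounded queue is full, or None if it never fills."""
    backlog_growth = arrival_rate - drain_rate  # net messages added per second
    if backlog_growth <= 0:
        return None  # the database keeps up, so the queue never fills
    return queue_capacity / backlog_growth


# A spike of 15,000 msg/s against a database that absorbs 10,000 msg/s,
# with room for 1,000,000 queued messages.
time_to_jam = seconds_until_denial(15_000, 10_000, 1_000_000)
print(f"Queue is full after {time_to_jam:.0f} seconds")
```

<p>In this simplified model, either raising the drain rate or moving the log traffic off the shared hardware keeps the backlog from growing.</p>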
<h3>Making Big, Fast Data Scale for Financial Services Mainframes</h3>
<p>What the bank needed was a way to ingest a large number of transaction log messages without ever denying transactions. The bank is valued at over $60 billion, and its clients’ transactions generate around 200-300 million messages per day, with peaks of up to 15,000 messages per second. At about 4KB per message, this adds up to as much as 1.2 terabytes of financial transaction data per day.</p>
<p>This is big data for sure, but it was not yet fast data.</p>
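<p>The daily volume quoted above follows from simple arithmetic, which the short sketch below reproduces. It assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10<sup>12</sup> bytes), which the post does not specify, and the function names are illustrative.</p>

```python
# Back-of-envelope check of the volumes quoted in the post.
# Assumption (not stated in the post): 1 KB = 1,000 bytes, 1 TB = 10**12 bytes.
MESSAGE_SIZE_BYTES = 4 * 1000  # "about 4KB" per message
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(messages_per_day):
    """Return the daily data volume in terabytes for a given message count."""
    return messages_per_day * MESSAGE_SIZE_BYTES / 10**12


def average_rate_per_second(messages_per_day):
    """Return the average message rate if traffic were spread evenly over a day."""
    return messages_per_day / SECONDS_PER_DAY


low_tb = daily_volume_tb(200_000_000)    # low end of the quoted range
high_tb = daily_volume_tb(300_000_000)   # high end of the quoted range
avg_rate = average_rate_per_second(300_000_000)
peak_mb_per_second = 15_000 * MESSAGE_SIZE_BYTES / 10**6  # at the quoted peak rate

print(f"{low_tb:.1f}-{high_tb:.1f} TB/day, "
      f"average {avg_rate:,.0f} msg/s, peak {peak_mb_per_second:.0f} MB/s")
```

<p>Under these assumptions the 1.2 terabyte figure corresponds to the upper end of the range (300 million messages), and the quoted peak of 15,000 messages per second is several times the daily average rate.</p>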
<h4>Architecture Approaches with Pivotal’s vFabric GemFire and Greenplum</h4>
<p>The VMware vFabric team partnered with the bank to show them how this data jam could be avoided in the future by building out a proof of concept.</p>
<p>Below you’ll find an architecture diagram of the proposed solution. We used vFabric GemFire as the fast ingest layer to consume MQSeries messages right as they hit the server. This prevents messages from queueing up and also allows them to be asynchronously persisted to a Greenplum appliance for analytics. For the customer, this deployment worked like a black box because GemFire is embedded inside the Greenplum hardware appliance, so setup was minimal and easy. vFabric tc Server, Spring Integration, and RabbitMQ could also be used to host various web services.</p>
<p><a href="http://blogs.vmware.com/vfabric/files/2013/04/GemFire-Greenplum-Architecture.png" target="_blank"><img class=" wp-image-5726 alignnone" title="GemFire-Greenplum-Architecture" src="http://blogs.vmware.com/vfabric/files/2013/04/GemFire-Greenplum-Architecture.png" alt="" width="586" height="771" /></a></p>
<p>Basically, this solution allowed the high-volume set of log events and messages to be stored on a persistence layer in GemFire, outside the mainframe, that could scale elastically as needed by simply adding more nodes. Since the data is held in memory by GemFire, it is also available for search or reporting as soon as it leaves the MQSeries queue.</p>
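<p>The ingest pattern described above (accept each message into memory right away, then persist it asynchronously) can be sketched in a few lines. This is a generic write-behind illustration using only the Python standard library. It is not the GemFire or Greenplum API, and the class name, method names, and the list standing in for the analytics store are all hypothetical.</p>

```python
import queue
import threading

# Generic write-behind sketch: NOT the GemFire or Greenplum API.
# IngestGrid, put, flush, and the list used as a store are hypothetical.


class IngestGrid:
    """Accept messages into memory at once and persist them in the background."""

    def __init__(self, analytics_store):
        self.memory = {}               # in-memory copy, searchable immediately
        self.pending = queue.Queue()   # messages waiting to be persisted
        self.analytics_store = analytics_store
        self.writer = threading.Thread(target=self._persist_loop, daemon=True)
        self.writer.start()

    def put(self, message_id, payload):
        """Store the message in memory and return without waiting for persistence."""
        self.memory[message_id] = payload
        self.pending.put((message_id, payload))

    def _persist_loop(self):
        """Drain the pending queue into the slower analytics store."""
        while True:
            item = self.pending.get()
            self.analytics_store.append(item)  # stands in for a database insert
            self.pending.task_done()

    def flush(self):
        """Block until every pending message has been persisted."""
        self.pending.join()


store = []
grid = IngestGrid(store)
for i in range(1000):
    grid.put(i, f"transaction log {i}")

found = grid.memory[42]  # this lookup does not wait for persistence
grid.flush()
persisted_count = len(store)
print(found, persisted_count)
```

<p>The design choice this illustrates is that the producer never blocks on the slower store, so a slow database delays analytics rather than the transactions themselves.</p>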
<p>The bank saw several additional advantages to running a big, fast data grid outside the mainframe:</p>
<ul>
<li>Running the messages outside of the mainframe would save MIPS and money for a set of data that was not considered core to the business operations.</li>
<li>The mainframe would be able to handle a much higher transaction throughput, avoid jams, and have a greater ability to scale on its current hardware.</li>
<li>Business intelligence analyses could be run on the transaction logs. Once this data was placed on a big data grid outside the mainframe, various departments could analyze customer profiles, usage characteristics, fraud, security, customer service, and marketing.</li>
</ul>
<p><strong>For more information on vFabric GemFire and Greenplum, see:</strong></p>
<ul>
<li><a title="GemFire" href="http://www.vmware.com/products/application-platform/vfabric-gemfire/overview.html" target="_blank">GemFire</a></li>
<li><a title="Greenplum" href="http://www.greenplum.com/" target="_blank">Greenplum</a></li>
</ul>
<table width="100%" cellspacing="10" cellpadding="10">
<tbody>
<tr>
<td width="95"><a href="http://blogs.vmware.com/vfabric/files/2013/04/fred_melo_headshot.png"><img class="alignright size-full wp-image-5736" title="fred_melo_headshot" src="http://blogs.vmware.com/vfabric/files/2013/04/fred_melo_headshot.png" alt="" width="90" height="90" /></a></td>
<td><span style="color: #333333;"><strong>About the Author:</strong> Frederico Melo (a.k.a. Fred Melo) has a degree in Computer Science and has been working in software engineering for the last 14 years. His areas of expertise include Grid Computing, Highly Scalable Architectures, Big Data, Fast Data and Legacy Modernization. He is currently based in Sao Paulo, Brazil, working as a Field Engineer for Pivotal.</span></td>
</tr>
</tbody>
</table>
]]></content:encoded>
			<wfw:commentRss>http://blogs.vmware.com/vfabric/2013/04/banks-are-breaking-away-from-mainframes-to-big-fast-data-grids.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
