<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><!-- generator="wordpress/2.2.3" --><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Big Data Blog: Aster Data Blog</title>
	<link>http://www.asterdata.com/blog</link>
	<description>The convergence of Big Data, analytic applications, MPP data warehouses, and MapReduce</description>
	<pubDate>Wed, 11 Nov 2009 00:51:08 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.3</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/AsterData" type="application/rss+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /><item>
		<title>Big Data Blog Metamorphosis</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/-SVdc5AfOZg/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/11/10/big-data-blog-metamorphosis/#comments</comments>
		<pubDate>Wed, 11 Nov 2009 00:51:08 +0000</pubDate>
		<dc:creator>Steve Wooledge</dc:creator>
		
		<category><![CDATA[Statements]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/11/10/big-data-blog-metamorphosis/</guid>
		<description><![CDATA[Aster Data 4.0 is here and for those of you who subscribe to Aster Data’s blog, “Winning with Data”, you may have noticed that we’ve changed things up a bit.  This blog is now called the “Big Data Blog” and will continue to be a mash-up of opinions and news from the team at Aster [...]]]></description>
			<content:encoded><![CDATA[<p>Aster Data 4.0 is here and for those of you who subscribe to Aster Data’s blog, “Winning with Data”, you may have noticed that we’ve changed things up a bit.  This blog is now called the “Big Data Blog” and will continue to be a mash-up of opinions and news from the team at Aster Data. Topics will continue to be a mix of technical deep-dives as well as company announcements and content.</p>
<p>At the same time, our CEO and co-founder Mayank will be sharing his thoughts on a <a href="http://www.asterdata.com/ceo-blog">separate</a> blog called “Winning with Data”, where he will talk about market trends, customer use-cases, technology evolution and company growth. You can find all of his previous posts there, as well as fresh content starting with the announcement of Aster Data’s massively parallel data-application server.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/11/10/big-data-blog-metamorphosis/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/11/10/big-data-blog-metamorphosis/</feedburner:origLink></item>
		<item>
		<title>Mastering MapReduce</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/VpCuNqO6jwc/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/10/15/mastering-mapreduce/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 19:40:14 +0000</pubDate>
		<dc:creator>Steve Wooledge</dc:creator>
		
		<category><![CDATA[nPath]]></category>

		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/10/15/mastering-mapreduce/</guid>
		<description><![CDATA[We just wrapped up the first part of a two-part series on Mastering MapReduce together with Curt Monash. We&#8217;ve spent a lot of time discussing MapReduce with Curt and wanted to help educate the community on exactly what it is and how it applies to data management and analysis.  We&#8217;ve published the recorded webcast and below [...]]]></description>
			<content:encoded><![CDATA[<p>We just wrapped up the first part of a two-part series on Mastering MapReduce together with Curt Monash. We&#8217;ve spent a lot of time discussing MapReduce with Curt and wanted to help educate the community on exactly what it is and how it applies to data management and analysis.  We&#8217;ve published <a href="http://www.asterdata.com/masteringmapreduce">the recorded webcast</a> and below are the slides we presented from an Aster Data perspective, which outline:</p>
<p>- What is Aster Data&#8217;s SQL-MapReduce?<br />
- Example industry applications of SQL-MapReduce<br />
- Walking through the SQL-MapReduce syntax</p>
<div style="width:425px;text-align:left" id="__ss_2233244"><a style="font:14px Helvetica,Arial,Sans-serif;display:block;margin:12px 0 3px 0;text-decoration:underline;" href="http://www.slideshare.net/AsterData/mastering-mapreduce-mapreduce-for-big-data-management-and-analysis" title="Mastering MapReduce: MapReduce for Big Data Management and Analysis">Mastering MapReduce: MapReduce for Big Data Management and Analysis</a><object style="margin:0px" width="425" height="355">
<param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=masteringmapreducewebcastiasterdata10152009forposting-091015132642-phpapp02&#038;stripped_title=mastering-mapreduce-mapreduce-for-big-data-management-and-analysis" />
<param name="allowFullScreen" value="true"/>
<param name="allowScriptAccess" value="always"/><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=masteringmapreducewebcastiasterdata10152009forposting-091015132642-phpapp02&#038;stripped_title=mastering-mapreduce-mapreduce-for-big-data-management-and-analysis" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="font-size:11px;font-family:tahoma,arial;height:26px;padding-top:2px;">View more <a style="text-decoration:underline;" href="http://www.slideshare.net/">presentations</a> from <a style="text-decoration:underline;" href="http://www.slideshare.net/AsterData">AsterData</a>.</div>
</div>
<p>Curt has also <a href="http://www.dbms2.com/2009/10/15/mapreduce-webinar-slides/">posted his slides on DBMS2</a> with a great overview on dispelling the myths around MapReduce, and how MapReduce and SQL play nicely with each other.</p>
<p>We had great turn-out and questions from the sessions.  If you have any questions after reviewing the material, please drop a comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/10/15/mastering-mapreduce/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/10/15/mastering-mapreduce/</feedburner:origLink></item>
		<item>
		<title>Aster Data Seamlessly Connects to Hadoop</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/DBHVq2wveJw/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/10/05/aster-data-seamlessly-connects-to-hadoop/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 01:01:14 +0000</pubDate>
		<dc:creator>Steve Wooledge</dc:creator>
		
		<category><![CDATA[MapReduce]]></category>

		<category><![CDATA[Statements]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/10/05/aster-data-seamlessly-connects-to-hadoop/</guid>
		<description><![CDATA[On Friday we had a series of events and announcements around our new Aster-Hadoop Data Connector, which utilizes key new SQL-MapReduce functions to provide ultra-fast, two-way data loading between HDFS (Hadoop Distributed File System) and Aster Data&#8217;s MPP data warehouse.
In addition to the Big Data Summit we held in New York City (which we&#8217;ll detail in [...]]]></description>
			<content:encoded><![CDATA[<p>On Friday we had a series of events and <a href="http://www.asterdata.com/news/091001-Aster-Hadoop-connector.php">announcements</a> around our new Aster-Hadoop Data Connector, which utilizes key new SQL-MapReduce functions to provide ultra-fast, two-way data loading between HDFS (Hadoop Distributed File System) and Aster Data&#8217;s MPP data warehouse.</p>
<p>In addition to the Big Data Summit we held in New York City (which we&#8217;ll detail in a separate post), Colin White presented on a Webcast with Aster on the various use-cases for Hadoop within data warehouse environments.  Colin does a great job summarizing what Hadoop is, how it&#8217;s different from an RDBMS, the different types of users for each, and how they co-exist nicely in customer environments.</p>
<p>Below are the slides to view if you weren&#8217;t able to attend the event, which will also be available for on-demand viewing soon in our <a href="http://www.asterdata.com/product/resource_library_partial.php">resource library</a>.</p>
<div style="width:425px;text-align:left" id="__ss_2135769"><a style="font:14px Helvetica,Arial,Sans-serif;display:block;margin:12px 0 3px 0;text-decoration:underline;" href="http://www.slideshare.net/AsterData/making-sense-of-hadoop-its-fit-with-data-warehousing-solutions" title="Making Sense of Hadoop - It&#39;s Fit with Data Warehousing Solutions">Making Sense of Hadoop - It&#39;s Fit with Data Warehousing Solutions</a><object style="margin:0px" width="425" height="355">
<param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=hadoopwebinarcombineddeckfinal-091005191748-phpapp02&#038;stripped_title=making-sense-of-hadoop-its-fit-with-data-warehousing-solutions" />
<param name="allowFullScreen" value="true"/>
<param name="allowScriptAccess" value="always"/><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=hadoopwebinarcombineddeckfinal-091005191748-phpapp02&#038;stripped_title=making-sense-of-hadoop-its-fit-with-data-warehousing-solutions" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="font-size:11px;font-family:tahoma,arial;height:26px;padding-top:2px;">View more <a style="text-decoration:underline;" href="http://www.slideshare.net/">presentations</a> from <a style="text-decoration:underline;" href="http://www.slideshare.net/AsterData">AsterData</a>.</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/10/05/aster-data-seamlessly-connects-to-hadoop/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/10/05/aster-data-seamlessly-connects-to-hadoop/</feedburner:origLink></item>
		<item>
		<title>Hadoop Webinar Featuring Colin White</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/M2i_yD9On3g/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/09/30/hadoop-webinar-on-10-1-featuring-colin-white/#comments</comments>
		<pubDate>Thu, 01 Oct 2009 00:02:04 +0000</pubDate>
		<dc:creator>Shawn Kung</dc:creator>
		
		<category><![CDATA[Analytics]]></category>

		<category><![CDATA[Frontline data warehouse]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/09/29/hadoop-webinar-on-10-1-featuring-colin-white/</guid>
		<description><![CDATA[I&#8217;m very excited about the upcoming Big Data Summit in New York City on Thursday evening (October 1st). The event is sponsored by Aster Data, MicroStrategy, and Informatica, and we have an incredible speaker lineup including LinkedIn, comScore, and Colin White from BI Research. Check out the Facebook page for the event here.
To kick off the festivities, we&#8217;re holding [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m very excited about the upcoming <a href="http://www.asterdata.com/bigdatasummit">Big Data Summit</a> in New York City on Thursday evening (October 1st). The event is sponsored by Aster Data, MicroStrategy, and Informatica, and we have an incredible speaker lineup including LinkedIn, comScore, and Colin White from BI Research. Check out the Facebook page for the event <a href="http://www.facebook.com/pages/Big-Data-Summit/143312171156#">here</a>.</p>
<p>To kick off the festivities, we&#8217;re holding a live webinar earlier in the day at 9 a.m. US Pacific time. Colin White and I will be discussing Hadoop and data warehousing - how they&#8217;re similar, how they&#8217;re different, and what they can be used for (both separately and together). In fact, we&#8217;ll be making an important announcement of a new product offering that you won&#8217;t want to miss. If you haven&#8217;t already, I urge you to register by clicking <a href="https://asterdata.webex.com/asterdata/onstage/g.php?t=a&amp;d=332024976">here</a>.</p>
<p>Mark it on your calendar - something &#8220;big&#8221; is coming on October 1st.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/09/30/hadoop-webinar-on-10-1-featuring-colin-white/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/09/30/hadoop-webinar-on-10-1-featuring-colin-white/</feedburner:origLink></item>
		<item>
		<title>New garage days ahead</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/M6YM5H-vwUw/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/09/30/new-garage-days-ahead/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 20:42:22 +0000</pubDate>
		<dc:creator>George Candea</dc:creator>
		
		<category><![CDATA[Statements]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/09/30/new-garage-days-ahead/</guid>
		<description><![CDATA[After a long thinking process, I have decided to step down from my day-to-day operating role as Chief Scientist. I will be checking in on the team every now and then, and I will continue to serve Aster on the board of directors, where we shape the strategic direction of the company and products.
Aster is [...]]]></description>
			<content:encoded><![CDATA[<p>After a long thinking process, I have decided to step down from my day-to-day operating role as Chief Scientist. I will be checking in on the team every now and then, and I will continue to serve Aster on the board of directors, where we shape the strategic direction of the company and products.</p>
<p>Aster is now in a new phase, when maximal focus on product delivery and responding to customers’ every need is taking center stage. We have a fantastic product that is growing quickly and is being deployed at many, many customer sites. The level of product focus in our team is astounding. Yes, my friends, the garage days are now officially over! … And I’m delighted we’ve come this far.</p>
<p>One should always invest their energy in those activities that most benefit from that energy. Aster is a grown-up now and is solidly set on a path with great momentum toward executing our “big data” vision. At the same time, I have a budding team of incredibly bright students back in <a href="http://dslab.epfl.ch/">my lab at EPFL</a>, who are hungry for my help and scientific direction.  For me, it’s time for the next “garage project.”</p>
<p>As you might imagine, this is not an easy decision for a founder. Perhaps this is how parents feel when sending their kid to college away from home… but time does not stand still.</p>
<p>It has been a thrill to start Aster with Mayank and Tasso, to see her emerge from 5 PCs cooled by a portable fan in my living room, and then to work for her all these years. I’m proud of what we’ve achieved so far, both in terms of product and market impact and in our published research papers. At a future stage in Aster’s growth, it may make sense for me to resume my role in leading the company’s long-term research. Until then, I look forward to helping the company from behind the scenes and to watching Aster grow and develop the best data management platform in the market.</p>
<p>Carpe datum!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/09/30/new-garage-days-ahead/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/09/30/new-garage-days-ahead/</feedburner:origLink></item>
		<item>
		<title>Aster Data in Europe</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/LVBfMSQBqg0/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/09/14/aster-data-in-europe/#comments</comments>
		<pubDate>Mon, 14 Sep 2009 15:36:50 +0000</pubDate>
		<dc:creator>Mayank Bawa</dc:creator>
		
		<category><![CDATA[Statements]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/09/14/aster-data-in-europe/</guid>
		<description><![CDATA[Aster Data has seen tremendous growth in North America. We announced today that we have opened a Europe office in West London, England. The office will be headed by Bob Pearson, our newly appointed Europe Area Director. Bob is an entrepreneurial industry leader and had earlier introduced Opsware into Europe, eventually propelling Opsware to be [...]]]></description>
			<content:encoded><![CDATA[<p>Aster Data has seen tremendous growth in North America. We announced today that we have <a href="http://www.asterdata.com/news/090914-Aster-Data-UK.php">opened a Europe office</a> in West London, England. The office will be headed by Bob Pearson, our newly appointed Europe Area Director. Bob is an entrepreneurial industry leader and had earlier introduced Opsware into Europe, eventually propelling Opsware to be #1 in Europe in its market. We had been in conversations with Bob for 12 months - understanding the European market - before we opened our office this summer.</p>
<p>We also <a href="http://www.asterdata.com/news/090914-Aster-Pocket-Kings.php">announced today that our first customer in Europe</a> is the #1 online poker gaming site in the world, Full Tilt Poker. We have been working with <a href="http://www.fulltiltpoker.com/">Full Tilt Poker</a> for 8 months now helping deploy Aster <em>n</em>Cluster to power their fraud prevention systems and provide enhanced customer service to their players.</p>
<p>It is no surprise that data size growth is a world-wide phenomenon, and certainly occurs across &#8220;the pond&#8221; as well. We have noticed that European customers in numerous industries, such as financial services and insurance, online retailing, social networking, communications, and gaming are deploying new (and sometimes custom) applications to leverage big data.</p>
<p>Aster Data is certainly the most application-friendly big-data infrastructure in the market, and we look forward to working with our European customers in the coming years!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/09/14/aster-data-in-europe/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/09/14/aster-data-in-europe/</feedburner:origLink></item>
		<item>
		<title>VLDB 2009: SQL/MapReduce Turns One Year Old</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/r1e_k2FWs8s/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/08/21/vldb-2009-sqlmapreduce-turns-one-year-old/#comments</comments>
		<pubDate>Fri, 21 Aug 2009 14:12:59 +0000</pubDate>
		<dc:creator>Aster</dc:creator>
		
		<category><![CDATA[nPath]]></category>

		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/08/21/vldb-2009-sqlmapreduce-turns-one-year-old/</guid>
		<description><![CDATA[ 
This post was co-authored by John Cieslewicz, Eric Friedman, and Peter Pawlowski of Aster Data Systems
One year ago we introduced SQL/MapReduce for the Aster nCluster database, which integrates MapReduce and SQL to enable deep analytics within the database. Pushing computation inside the database and close to the data is increasingly important as data sizes grow [...]]]></description>
			<content:encoded><![CDATA[<p> <a href="http://www.asterdata.com/blog/wp-content/uploads/2009/08/photo1.jpg" title="John Cieslewicz, Eric Friedman, and Peter Pawlowski of Aster Data Systems"><img src="http://www.asterdata.com/blog/wp-content/uploads/2009/08/photo1.jpg" alt="John Cieslewicz, Eric Friedman, and Peter Pawlowski of Aster Data Systems" width="250" height="188" /></a></p>
<p><em>This post was co-authored by </em><em>John Cieslewicz, Eric Friedman, and Peter Pawlowski of Aster Data Systems</em></p>
<p>One year ago <a href="http://www.asterdata.com/blog/index.php/2008/08/25/announcing-in-database-mapreduce/">we introduced SQL/MapReduce</a> for the Aster <em>n</em>Cluster database, which integrates MapReduce and SQL to enable deep analytics within the database. Pushing computation inside the database and close to the data is increasingly important as data sizes grow exponentially. As SQL/MR turns one year old, we are happy to announce that we will be presenting our SQL/MR innovations next week at the <em>35th International Conference on Very Large Data Bases (<a href="http://vldb2009.org/">VLDB</a>)</em>, the premier international forum for database research.</p>
<p>The title of our conference paper is <em><a href="http://www.asterdata.com/resources/downloads/whitepapers/sqlmr.pdf">SQL/MapReduce: A Practical Approach to Self-describing, Polymorphic, and Parallelizable User-defined Functions</a></em>. The title of the paper may be large, but so are the advanced analytics possibilities created by the invention of SQL/MR!</p>
<p>We developed SQL/MR because we saw a growing gap between the deep analytics and application needs of very large data and the capabilities provided by SQL and traditional relational-only data processing. We call this gap the “SQL Gap.”</p>
<p><a href="http://www.asterdata.com/blog/wp-content/uploads/2009/08/thesqlgap.png" title="The SQL Gap"><img src="http://www.asterdata.com/blog/wp-content/uploads/2009/08/thesqlgap.png" alt="The SQL Gap" width="522" height="158" /></a></p>
<p>SQL and the relational query processing model are well suited for many, but not all, data processing tasks. Some queries are cumbersome, non-intuitive or impossible to express in SQL (note: now that SQL is Turing complete, nothing is strictly impossible, but it can be very painful and perform very badly) - check out our paper for some examples. Moreover, query optimizers have a limited number of algorithms at their disposal to process data, which leads to convoluted data processing in situations where applying a little domain knowledge can yield a much more straightforward algorithm.</p>
<p>We found traditional user-defined functions (UDFs) to fall short in bridging this gap between SQL and the answers to challenging analytic problems that need to be solved. UDFs are often user-unfriendly, inflexible, and not easily parallelized. SQL/MR functions, in contrast, are designed to be easy to develop, easy to install, and easy to use - providing developers and analysts with a powerful tool to tackle the challenges posed by very large data.</p>
<p>To do this, we integrated the MapReduce programming model with SQL. MapReduce is a well known paradigm for parallel, fault-tolerant data processing that allows developers to write procedural code that is then applied to data in parallel. Pure MapReduce, however, misses out on aspects of SQL and relational data processing that are great - such as query optimizations, managed data, and transactions. By integrating SQL and MapReduce on top of Aster <em>n</em>Cluster’s hardware management and fault tolerance, we leverage the strengths of each, resulting in a system that is much more powerful than either in isolation.</p>
<p><strong>SQL/MapReduce At VLDB</strong><br />
SQL/MR is much more than a user-defined function. As our paper title states, a SQL/MR function is <em>self-describing, polymorphic, and parallelizable</em>. Let’s explore each of these characteristics and see why we hope researchers at VLDB will be as excited as we are by SQL/MR.</p>
<p><em>Self-Describing and Polymorphic</em><br />
The behavior and output characteristics of a SQL/MR function are determined dynamically at query-time instead of statically at install-time as is the case with traditional user-defined functions. These characteristics allow SQL/MR functions to behave much more like general purpose library functions than single-use, application-specific user-defined functions. When a SQL/MR function is used in a query, the <em>n</em>Cluster query planner negotiates a contract with the SQL/MR function, providing the function’s input schema and optional user-supplied parameters. In return, the SQL/MR function agrees to a contract that specifies its output schema for the duration of the query. This contract negotiation is what makes SQL/MR functions self-describing, and their ability to be invoked on different input with different optional parameters makes them polymorphic as well. To summarize, when invoked on different input with different optional parameters, a SQL/MR function may export a different output schema and perform different computation - the possibilities are entirely up to the developer!</p>
<p><em>Parallelizable</em><br />
The SQL/MR programming model, like that of MapReduce, is inherently parallelizable. Developers write procedural code in the language of their choice, but at runtime that code will run in parallel across hundreds of nodes within an <em>n</em>Cluster database. Advanced analytics and application code can now be executed in parallel, directly on data stored within <em>n</em>Cluster, making <em>n</em>Cluster an application-friendly, high performance data warehouse and data application system. Advanced analytics capabilities that we have already pushed inside <em>n</em>Cluster using SQL/MR include click-stream sessionization, general purpose time series path matching, dynamic and massively parallel data loading from heterogeneous sources, and genetic sequence analysis.</p>
<p><strong>Bridging the Gap</strong><br />
SQL/MapReduce has matured greatly over the past year and is used by our customers in creative ways we never imagined - proof positive that SQL/MapReduce is an effective way to bridge the “SQL Gap” to deeper analytics on very large data. Check out the other <a href="http://www.asterdata.com/mapreduce/applications.php">application examples</a> and <a href="http://www.asterdata.com/mapreduce/writing.php">sample code videos</a> on www.asterdata.com, as well as the <a href="http://www.asterdata.com/blog/index.php/category/mapreduce/">MapReduce category</a> of this blog.</p>
<p>We&#8217;d love to hear from you about any type of analysis you&#8217;re doing where stand-alone SQL is becoming overly-complex &#8230; and please look us up at the VLDB 2009 show if you&#8217;re making the trip to Lyon, France!</p>
<p><em>(Apart from the authors, Brent Chun, Mohit Aron, Abhishek Marwah, Raghu Venkat, Vinay Bondhugula, and Prasan Roy of Aster Data Systems are notable contributors to the overall SQL/MR effort.)</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/08/21/vldb-2009-sqlmapreduce-turns-one-year-old/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/08/21/vldb-2009-sqlmapreduce-turns-one-year-old/</feedburner:origLink></item>
		<item>
		<title>Netezza’s Change in Architecture - Move towards Commodity</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/8eOg-OquyYs/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 05:49:36 +0000</pubDate>
		<dc:creator>Mayank Bawa</dc:creator>
		
		<category><![CDATA[TCO]]></category>

		<category><![CDATA[Database]]></category>

		<category><![CDATA[Statements]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/</guid>
		<description><![CDATA[Netezza pre-announced last week that they will be moving to a new architecture - one based around IBM blades (Linux + Intel + RAM) with commodity SAS disks, RAID controllers, and NICs. The product will continue to rely on an FPGA, but that would sit much further from the disks &#38; RAID controller, beyond the [...]]]></description>
			<content:encoded><![CDATA[<p>Netezza <a href="http://www.netezzacommunity.com/blogs/nzblog/2009/07/30/catch-a-wave-and-youre-sittin-on-top-of-the-world">pre-announced</a> last week that they will be moving to a new architecture - one based around IBM blades (Linux + Intel + RAM) with commodity SAS disks, RAID controllers, and NICs. The product will continue to rely on an FPGA, but that would sit much further from the disks &amp; RAID controller, beyond the RAM but adjacent to the Intel CPU, in contrast to their previous product line.</p>
<p>In assembling a new hardware stack, Netezza calls this re-architecture <a href="http://www.netezzacommunity.com/blogs/nzblog/2009/07/31/change-but-no-change">a change but not really a change</a> - the FPGA will continue to offload data compression/decompression, selection and projection from the Intel CPU; the Intel CPU will be used to push down joins and group bys; the RAM will be used to enable caching (thus helping improve mixed workload performance).</p>
<p>I think this is a pretty significant change for Netezza.</p>
<p>Clearly, Netezza would not have invested in this change - assembling &amp; shipping a new hardware stack and sharing revenue with IBM rather than a 3rd-party hardware assembler - if Netezza&#8217;s old FPGA-dominant hardware were not being out-priced and out-performed by our Intel-based commodity hardware.</p>
<p>It was a matter of time before the market realized that FPGAs had reached end-of-life status in the data warehousing market. In reading the writing on the wall, and responding to it early, Netezza has made a bold decision to change - and yet has clung to the warm familiarity of an FPGA as a &#8220;side car&#8221;.</p>
<p>Netezza, and the rest of the market, will soon become aware that a change in hardware stack is not a free lunch. The richness of CPU and RAM resources in an IBM commodity blade comes at a cost that a resource-starved FPGA-based architecture never had to account for.</p>
<p>In 2009, after having engineered its software for an FPGA over the last 9 years, Netezza will need to come to terms with commodity hardware in production systems and demonstrate that they can:</p>
<blockquote><p>- Manage processes and memory spawned by a single query across 100s of blade servers</p>
<p>- Maintain consistent caches across 100s of blade servers - after all, it is Oracle&#8217;s Cache Fusion technology that is the bane of scaling Oracle RAC beyond 8 blade servers</p>
<p>- Tolerate the higher frequency of failures that a commodity Linux + RAID Controller/driver + Network driver stack incurs when put under rigorous data movement (e.g., allocation/de-allocation of memory contributing to memory leaks)</p>
<p>- Add a new IBM blade and ensure incremental scaling of their appliance</p>
<p>- Upgrade the software stack in place - unlike an FPGA-based hardware stack that customers are willing to floor-sweep during an upgrade</p>
<p>- Contain run-away queries from allocating the abundant CPU and RAM resources and starving other concurrent queries in the workload</p>
<p>- Reduce network traffic for a blade with 2 NICs that is managing 8 disks vs. a Power-PC/FPGA that had 1 NIC for 1 disk</p>
<p>- …</p></blockquote>
<p>If you take a quick pulse of the market, apart from our known installations of 100+ servers, there is no other vendor - mature or new-age - that has demonstrated that 100s of commodity servers can be made to work together to run a single database.</p>
<p>And I believe that there is a fundamental reason for this lack of a proof point even a decade after Linux has matured and commodity servers have been used for computing - software <strong>not</strong> built from the ground up to leverage the richness and contain the limitations of commodity hardware is <strong>incapable</strong> of scaling. Aster <em>n</em>Cluster has been built from the ground up to have these capabilities on a commodity stack. Netezza’s software, written for proprietary hardware, cannot be retrofitted to work on commodity hardware (else, Netezza would have taken the FPGAs out completely, now that they have powerful CPUs!). Netezza has its work cut out - they have made a dramatic shift that has the ability to bring the company and its production customers to their knees. And therein lies Netezza&#8217;s challenge - they must succeed in supporting their current customers on an FPGA-based platform while moving resources to build out a commodity-based platform.</p>
<p>And we have not even touched upon the extension of SQL with MapReduce to power big data manipulation using arbitrary user-written procedures.</p>
<p>If a system is not fundamentally designed to leverage commodity servers, any retrofit is only going to be a band-aid on seams that are bursting. Overall, we will watch with curiosity how long it takes Netezza to eliminate their FPGAs completely and move to a real commodity stack so that customers can have the freedom to choose their own hardware and not be locked into Netezza-supplied custom hardware.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/</feedburner:origLink></item>
		<item>
		<title>Scaling Your Data Warehouse</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/WyH8Cop656g/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/07/14/upcoming-webinar-scaling-your-data-warehouse/#comments</comments>
		<pubDate>Tue, 14 Jul 2009 15:09:56 +0000</pubDate>
		<dc:creator>Shawn Kung</dc:creator>
		
		<category><![CDATA[Frontline data warehouse]]></category>

		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/07/14/upcoming-webinar-scaling-your-data-warehouse/</guid>
		<description><![CDATA[When you hear the word “warehouse,” you normally think of an oversized building with high ceilings and a ton of storage space. In the data warehousing world, it’s all too easy to fill that space faster than expected. Even companies with predictable data growth trajectories don’t want to pay for storage space they won’t need [...]]]></description>
			<content:encoded><![CDATA[<p>When you hear the word “warehouse,” you normally think of an oversized building with high ceilings and a ton of storage space. In the data warehousing world, it’s all too easy to fill that space faster than expected. Even companies with predictable data growth trajectories don’t want to pay for storage space they won’t need for months or even years out. For either type of company, the ability to scale on-demand, and to the appropriate degree, is critical.</p>
<p>That’s why I’m so excited about a <a href="https://asterdata.webex.com/asterdata/onstage/g.php?t=a&amp;d=335349154">webinar</a> we are hosting next week with James Kobielus, Senior Analyst for Forrester Research. In case you haven’t read it, James recently released his report “<a href="http://marketing.asterdata.com/forms/Forrester">Massive But Agile: Best Practices for Scaling the Next-Generation Data Warehouse</a>.” In the report, James thoroughly addresses several issues around scalability for which Aster is well-suited (parallelism, optimized storage, in-database analytics, etc.).</p>
<p>We’ll get into much more detail on these and other issues over the course of the webinar. If you haven’t had a chance yet, please <a href="https://asterdata.webex.com/asterdata/onstage/g.php?t=a&amp;d=335349154">register for the webinar</a> to hear what James, a leader and visionary in the industry, has to say. And make sure to leave a comment below if there are any facets of data warehouse scalability that you would like us to cover.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/07/14/upcoming-webinar-scaling-your-data-warehouse/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/07/14/upcoming-webinar-scaling-your-data-warehouse/</feedburner:origLink></item>
		<item>
		<title>Enterprise-Ready MapReduce Data Warehouse Appliance</title>
		<link>http://feedproxy.google.com/~r/AsterData/~3/YDdTtmsqdXs/</link>
		<comments>http://www.asterdata.com/blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 15:38:37 +0000</pubDate>
		<dc:creator>Mayank Bawa</dc:creator>
		
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/</guid>
		<description><![CDATA[We are announcing the availability of an Enterprise-Ready MapReduce Data Warehouse Appliance.
The appliance is powered by Dell hardware and Aster&#8217;s nCluster SQL/MR database, with optional BI platform software from MicroStrategy and data modeling software from AquaFold (Aqua Data Studio).
Our product portfolio now allows our customers to get the benefits of our flagship Aster nCluster [...]]]></description>
			<content:encoded><![CDATA[<p>We are announcing the availability of an Enterprise-Ready MapReduce Data Warehouse Appliance.</p>
<p>The appliance is powered by Dell hardware and Aster&#8217;s nCluster SQL/MR database, with optional BI platform software from MicroStrategy and data modeling software from AquaFold (Aqua Data Studio).</p>
<p>Our product portfolio now allows our customers to get the benefits of our flagship Aster nCluster SQL/MR database in the packaging that they are most comfortable with - on-premise software, in-cloud service, or pre-packaged appliance.</p>
<p>The appliance offering packs a lot of punch compared to other data warehousing appliances in the market - it has the highest ratio of compute &amp; memory to data sizes, allowing you to run rich queries on the appliance without breaking a sweat.</p>
<p>We are especially proud of the open nature of our appliance - the hardware is from Dell, built from industry-standard components; the BI server is from MicroStrategy; and the data modeling tool is from AquaFold (Aqua Data Studio). The appliance brings together industry-leading components of a full data warehouse stack - all pre-tested and configured for optimal performance.</p>
<p>Even the programming of our appliance is open - our SQL/MR framework allows applications to push computation into the appliance using industry standard SQL augmented with MapReduce in the language of your choice (Java, C#, Perl, Python, etc.).</p>
<p>We have been approached by a number of customers seeking a get-started-quickly system, especially those groups of users and departments seeking a Hadoop framework to build their solutions upon.</p>
<p>In response to these requests, we are proud to announce an Express Edition of the appliance that is designed to work for up to 1TB of user data. And it comes at an even more attractive price - only $50K - complete with hardware and software!</p>
<p>Give us a call - we&#8217;ll get your warehouse set up on our appliance to ensure that the time-to-first-query is measured in hours, not months!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.asterdata.com/blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/</feedburner:origLink></item>
	</channel>
</rss>
