<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:gr="http://www.google.com/schemas/reader/atom/" xmlns:idx="urn:atom-extension:indexing" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" idx:index="no" gr:dir="ltr"><!--
Content-type: Preventing XSRF in IE.

--><generator uri="http://www.google.com/reader">Google Reader</generator><id>tag:google.com,2005:reader/user/07670894862541843814/label/data-mining</id><title>"data-mining" via gtzi in Google Reader</title><gr:continuation>CIuTw8XVhbAC</gr:continuation><author><name>gtzi</name></author><updated>2012-05-27T05:26:00Z</updated><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/data-miningblogposts" /><feedburner:info uri="data-miningblogposts" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gr:crawl-timestamp-msec="1338096360905"><id gr:original-id="http://agtb.wordpress.com/?p=1685">tag:google.com,2005:reader/item/5bd1d8895e2b71e7</id><category term="Uncategorized" /><title type="html">AGT summer school in Samos island, 14-21 July</title><published>2012-05-27T05:25:57Z</published><updated>2012-05-27T05:25:57Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/n1SRDyVOh_U/" type="text/html" /><content xml:base="http://agtb.wordpress.com/" type="html">&lt;p&gt;A &lt;a href="http://agt2012samos.wordpress.com/"&gt;summer school in Algorithmic Game Theory&lt;/a&gt; will take place in &lt;a href="http://en.wikipedia.org/wiki/Samos"&gt;Samos island&lt;/a&gt; (Greece) from 14 to 21 of July, 2012. The school is organized by Dimitris Fotakis, Paul Spirakis, and Alexis Kaporis and is sponsored by the University of the Aegean and Information, Communication &amp;amp; Systems Department, Greece.  The school is open to under/postgraduate students and researchers in general at very low (early) accommodation costs.&lt;/p&gt;
&lt;br&gt;  &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/agtb.wordpress.com/1685/"&gt;&lt;img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/agtb.wordpress.com/1685/"&gt;&lt;/a&gt; &lt;img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=agtb.wordpress.com&amp;amp;blog=6963698&amp;amp;post=1685&amp;amp;subd=agtb&amp;amp;ref=&amp;amp;feed=1" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/n1SRDyVOh_U" height="1" width="1"/&gt;</content><author><name>Noam Nisan</name></author><source gr:stream-id="feed/http://agtb.wordpress.com/feed/"><id>tag:google.com,2005:reader/feed/http://agtb.wordpress.com/feed/</id><title type="html">Turing&amp;#39;s Invisible 
Hand</title><link rel="alternate" href="http://agtb.wordpress.com" type="text/html" /></source><feedburner:origLink>http://agtb.wordpress.com/2012/05/27/agt-summer-school-in-samos-island-14-12-july/</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1338081882127"><id gr:original-id="http://datamining.typepad.com/data_mining/2012/05/5-hidden-skills-for-big-data-scientists.html">tag:google.com,2005:reader/item/15368fa7834b72fb</id><title type="html">5 Hidden Skills for Big Data Scientists</title><published>2012-05-27T01:24:38Z</published><updated>2012-05-27T01:24:38Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/bNznI0tg6Og/5-hidden-skills-for-big-data-scientists.html" type="text/html" /><content xml:base="http://datamining.typepad.com/data_mining/" type="html">&lt;p&gt;1. Be Clear:  Is Your Problem Really A Big Data Problem?&lt;/p&gt;
&lt;p&gt;There are many big data problems out there requiring huge compute scale, innovations in computation paradigms, vast storage space and so on. But just because your data takes up lots of disc space does not mean that you have a big data problem. Firstly, your data may be encoded in an inefficient format. XML, for example, can be incredibly verbose (all those close tags and human readable text). Secondly, if your data changes over time it may change very slowly, indicating that monitoring the difference between data sets is more important than importing complete data sets. Thirdly, you may be processing your information on a legacy architecture designed for low power CPUs or cores. Architecture should be data driven, meaning that you need to deeply understand the informational aspects of your data and not just the size of the data as it comes to you on disc.&lt;/p&gt;
&lt;p&gt;2. Communicating About Your Data&lt;/p&gt;
&lt;p&gt;Often, in large organizations (I work for Microsoft and have worked at IBM in the past), the product requirements for data deliverables are high level. For example: we need these variables to be 99% accurate. This simplistic view of data - that a level of quality can be delivered in a specified time frame - is ignorant of the highly opportunistic nature of processes that improve the quality of data. Consequently, a data scientist needs to aggressively manage the communication about projects which transform and improve data sets. Do as much research as possible to minimize unknowns, but don&amp;#39;t sign contracts that involve both time and quality metrics!&lt;/p&gt;
&lt;p&gt;3. Invest in Interactive Analytics, not Reporting&lt;/p&gt;
&lt;p&gt;When you construct reports about your data products, you are answering a fixed set of questions. This is useful for monitoring, but it doesn&amp;#39;t provide a way to get at the unknown unknowns. It is only through interactions with data (often called slicing and dicing) that pockets of interest (problems and opportunities) are discovered. Rich, interactive tools may be perceived as a low priority that nobody ever quite gets around to building. Avoid this peril!&lt;/p&gt;
&lt;p&gt;4. Understand the Role and Quality of Human Evaluations of Data&lt;/p&gt;
&lt;p&gt;When trying to determine how good your data product is, it is often the case that we employ an array of human judges to evaluate a sample of the data. The higher up the management chain you go, the more respect you tend to find for human judgement. There are many studies, however, that show that human judgements are not always as good as they are cracked up to be. In many cases, machines can do better than humans; they just tend to make different types of errors. On deeper inspection, human errors can be traced to the structure of incentives around the judgement process. Innovate in methods to compare data sets that help distinguish their relative quality without necessarily incurring the expense of human assessment.&lt;/p&gt;
&lt;p&gt;5. Spend Time on the Plumbing&lt;/p&gt;
&lt;p&gt;How does data get into your system? How does it flow? Are you sure every bit of information got in? With large scale data loading and processing systems, one doesn&amp;#39;t want a small number of failures to tip over the entire run. However, silently failing components can cause big headaches down the line when you are reporting your summary findings. Make sure there are no leaks in your pipeline!&lt;/p&gt;&lt;div&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/DataMining/~4/-sXPggJj_Os" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/bNznI0tg6Og" height="1" width="1"/&gt;</content><author><name>Matthew Hurst</name></author><source gr:stream-id="feed/http://datamining.typepad.com/data_mining/atom.xml"><id>tag:google.com,2005:reader/feed/http://datamining.typepad.com/data_mining/atom.xml</id><title type="html">Data Mining: Text Mining, Visualization and Social Media</title><link rel="alternate" href="http://datamining.typepad.com/data_mining/" type="text/html" /></source><feedburner:origLink>http://feedproxy.google.com/~r/DataMining/~3/-sXPggJj_Os/5-hidden-skills-for-big-data-scientists.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337976378332"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016304ade5ef970d">tag:google.com,2005:reader/item/c290bc032d404ff5</id><category term="random" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">Because it&amp;#39;s a long weekend: Who&amp;#39;s at your party?</title><published>2012-05-25T19:50:44Z</published><updated>2012-05-25T19:50:31Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/NSl9RFun3ac/because-its-friday-whos-at-your-party.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/because-its-friday-whos-at-your-party.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;If you&amp;#39;re planning a house party this long weekend, you might want to take a cue from &lt;a href="http://imgur.com/a/s6dgU/all#52"&gt;Everett Hiller&lt;/a&gt; to liven up your party snaps with a bit of Bill Murray:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766c9737d970b-pi" style="display:inline"&gt;&lt;img alt="Bill-murray-with-a-beer" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766c9737d970b-800wi" title="Bill-murray-with-a-beer"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And what party is complete without Tom Cruise on a piñata?&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168ebcad0ca970c-pi" style="display:inline"&gt;&lt;img alt="Tom cruise on a pinata" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168ebcad0ca970c-800wi" title="Tom cruise on a pinata"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Yes, these are photoshops -- but awesome photoshops. (Everett clearly has some skills with lighting and composition.) You can see more celebrity substitutions at &lt;a href="http://twistedsifter.com/2012/03/photoshopping-celebrities-into-holiday-party/"&gt;Twisted Sifter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We&amp;#39;ll be taking a break at the blog for the Memorial Day holiday; we&amp;#39;ll be back on Tuesday. Enjoy the weekend, and for those of you in the US, have a great long weekend.&lt;/p&gt;
&lt;p&gt;Twisted Sifter: &lt;a href="http://twistedsifter.com/2012/03/photoshopping-celebrities-into-holiday-party/"&gt;This is What Happens When You Photoshop Celebrities Into Your Holiday Party&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/NSl9RFun3ac" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/because-its-friday-whos-at-your-party.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337972327510"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b0168ebca740b970c">tag:google.com,2005:reader/item/02c3398b7b829b68</id><category term="big data" scheme="http://www.sixapart.com/ns/types#category" /><category term="packages" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">Facebook-class social network analysis with R and Hadoop</title><published>2012-05-25T18:58:00Z</published><updated>2012-05-25T18:58:00Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/abPAr9UdhVo/facebook-class-social-network-analysis-with-r-and-hadoop.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/facebook-class-social-network-analysis-with-r-and-hadoop.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;In computing, social networks are traditionally represented as &lt;em&gt;&lt;a href="http://en.wikipedia.org/wiki/Graph_%28mathematics%29"&gt;graphs&lt;/a&gt;&lt;/em&gt;: a 
collection of nodes (people), pairs of which may be connected by edges (friend relationships). Visually, a social network can then be represented like this:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766c8c300970b-pi"&gt;&lt;img alt="Network-graph" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766c8c300970b-800wi" style="display:block;margin-left:auto;margin-right:auto" title="Network-graph"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Social network analysis often amounts to calculating the statistics on a graph like this: the number of edges (friends) connected to a particular node (person), and the distribution of the number of edges connected to nodes across the entire graph. When the graph consists of up to 10 billion elements (nodes and edges), such computations can be done on a single server with dedicated graph software like &lt;a href="http://neo4j.org/"&gt;Neo4j&lt;/a&gt;. But bigger networks — like &lt;a href="http://blog.revolutionanalytics.com/2010/12/facebooks-social-network-graph.html"&gt;Facebook&amp;#39;s social network&lt;/a&gt;, which is a graph with more than 60 billion elements — require a distributed solution.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0147e0ae51b2970b-popup" style="display:inline"&gt;&lt;img alt="Facebook-friendships" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0147e0ae51b2970b-800wi" title="Facebook-friendships"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://markorodriguez.com/"&gt;Marko A. Rodriguez&lt;/a&gt;, a graph consultant with &lt;a href="http://thinkaurelius.com/"&gt;Aurelius&lt;/a&gt;, shows in a blog post how to use &lt;a href="http://www.revolutionanalytics.com/products/r-for-apache-hadoop.php"&gt;R and Hadoop&lt;/a&gt; (integrated with Revolution Analytics&amp;#39; &lt;a href="https://github.com/RevolutionAnalytics/RHadoop/wiki"&gt;RHadoop&lt;/a&gt; packages) to analyze Facebook-scale social networks. He first simulates a social network (shown at the top of this post) using R&amp;#39;s &lt;a href="http://www.inside-r.org/packages/cran/igraph"&gt;igraph package&lt;/a&gt;, and then distributes the network in the Hadoop cluster with the &lt;span style="font-family:&amp;#39;courier new&amp;#39;,courier"&gt;&lt;a href="https://github.com/RevolutionAnalytics/RHadoop/wiki/Getting-data-in-and-out"&gt;to.dfs&lt;/a&gt;&lt;/span&gt; function (from the &lt;a href="https://github.com/RevolutionAnalytics/RHadoop/wiki/rhdfs"&gt;rhdfs package&lt;/a&gt;). He then uses the &lt;span style="font-family:&amp;#39;courier new&amp;#39;,courier"&gt;&lt;a href="https://github.com/RevolutionAnalytics/RHadoop/wiki/Tutorial"&gt;mapreduce&lt;/a&gt;&lt;/span&gt; function (from the &lt;a href="https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr"&gt;rmr package&lt;/a&gt;) to write a simple map-reduce algorithm in R to count the number of edges associated with each node:&lt;/p&gt;
&lt;div style="overflow:auto"&gt;
&lt;div&gt;
&lt;pre style="font-family:monospace"&gt;degree.V &lt;span&gt;&amp;lt;-&lt;/span&gt; &lt;a href="https://github.com/RevolutionAnalytics/RHadoop/wiki/Tutorial"&gt;&lt;span&gt;mapreduce&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;edge.list&lt;span style="color:#339933"&gt;,&lt;/span&gt; 
    map=&lt;a href="http://inside-r.org/r-doc/base/function"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;function&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;k&lt;span style="color:#339933"&gt;,&lt;/span&gt;v&lt;span style="color:#009900"&gt;)&lt;/span&gt; keyval&lt;span style="color:#009900"&gt;(&lt;/span&gt;v&lt;span style="color:#009900"&gt;[&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;2&lt;/span&gt;&lt;span style="color:#009900"&gt;]&lt;/span&gt;&lt;span style="color:#339933"&gt;,&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;1&lt;/span&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;&lt;span style="color:#339933"&gt;,&lt;/span&gt; 
    reduce=&lt;a href="http://inside-r.org/r-doc/base/function"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;function&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;k&lt;span style="color:#339933"&gt;,&lt;/span&gt;v&lt;span style="color:#009900"&gt;)&lt;/span&gt; keyval&lt;span style="color:#009900"&gt;(&lt;/span&gt;k&lt;span style="color:#339933"&gt;,&lt;/span&gt;&lt;a href="http://inside-r.org/r-doc/base/length"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;length&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;v&lt;span style="color:#009900"&gt;)&lt;/span&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;
from.dfs&lt;span style="color:#009900"&gt;(&lt;/span&gt;degree.V&lt;span style="color:#009900"&gt;)&lt;/span&gt;&lt;span style="color:#009900"&gt;[&lt;/span&gt;&lt;span style="color:#009900"&gt;[&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;1&lt;/span&gt;&lt;span style="color:#009900"&gt;]&lt;/span&gt;&lt;span style="color:#009900"&gt;]&lt;/span&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;From there, it&amp;#39;s another simple map-reduce job to calculate the connectivity statistics for the entire network. For more details on how Marko used RHadoop to perform this analysis, see the entire blog post linked below.&lt;/p&gt;
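&lt;p&gt;To make the two jobs concrete, here is a minimal local sketch in plain Python rather than R, so it runs without a Hadoop cluster. It is an illustration under assumptions, not Marko&amp;#39;s code: the toy edge list and the names map_edge and reduce_by_key are hypothetical, and the second element of each edge is taken to be the node being counted, mirroring keyval(v[2], 1) in the R snippet.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical toy edge list: each pair is (first node, second node).
edge_list = [(1, 2), (1, 3), (2, 3), (4, 3), (3, 1)]


def map_edge(edge):
    """Map step: emit (second node, 1), mirroring keyval(v[2], 1)."""
    return (edge[1], 1)


def reduce_by_key(pairs):
    """Reduce step: add up the values emitted for each key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


# Job 1: number of edges counted against each node.
degree = reduce_by_key(map_edge(e) for e in edge_list)

# Job 2: number of nodes that have each degree (the degree distribution).
degree_distribution = reduce_by_key((d, 1) for d in degree.values())

print(degree)
print(degree_distribution)
```

&lt;p&gt;Adding up the emitted ones gives the same result as length(v) in the R reducer. Note that a node which never appears as the second element of an edge (node 4 here) is absent from the result rather than reported with a count of zero.&lt;/p&gt;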
&lt;p&gt;Aurelius blog: &lt;a href="http://thinkaurelius.com/2012/02/05/graph-degree-distributions-using-r-over-hadoop/" rel="prev"&gt;Graph Degree Distributions using R over Hadoop&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/abPAr9UdhVo" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/facebook-class-social-network-analysis-with-r-and-hadoop.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337885737732"><id gr:original-id="http://infosthetics.com/archives/2012/05/americas_presidential_race_in_narrated_data_graphics.html">tag:google.com,2005:reader/item/8003899a262918bb</id><category term="infographic" /><title type="html">The Economist Videographics: Presidential Race in Narrated Data Graphics</title><published>2012-05-24T19:54:03Z</published><updated>2012-05-24T19:54:03Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/igafFMGaL-Q/americas_presidential_race_in_narrated_data_graphics.html" type="text/html" /><summary xml:base="http://infosthetics.com/" type="html">&lt;p&gt;&lt;img alt="economist_videographic.jpg" src="http://infosthetics.com/archives/economist_videographic.jpg" width="600" height="300"&gt;&lt;br&gt;
The discussion about the need to make a distinction between data visualization and data art has recently &lt;a href="http://www.perceptualedge.com/blog/?p=1245"&gt;resurfaced&lt;/a&gt; in &lt;a href="http://adamcrymble.blogspot.co.uk/2012/05/shock-and-awe-graphs-in-digital.html"&gt;various&lt;/a&gt; online &lt;a href="http://www.excelcharts.com/blog/data-visualization-continuum/"&gt;locations&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Now it seems we might have to rehash this discussion for the practice of animated infographics as well. For quite some time, The Economist has semi-regularly featured a new sort of information display, which they call "&lt;a href="http://www.economist.com/search/apachesolr_search/videographic"&gt;videographics&lt;/a&gt;". For instance, in their latest installment, titled &lt;a href="http://www.economist.com/node/21555743"&gt;America's Presidential Race&lt;/a&gt; [economist.com], one can experience relatively sophisticated data graphics, charts and diagrams, instead of the flashy animated typographic and iconographic effects usual for this kind of practice. Here, the presentation is further augmented with animations and a narration.&lt;/p&gt;

&lt;p&gt;Other installments include an explanation of the &lt;a href="http://www.economist.com/blogs/graphicdetail/2012/03/daily-chart-20"&gt;French elections&lt;/a&gt;, or a detailed analysis of the &lt;a href="http://www.economist.com/blogs/graphicdetail/2012/02/daily-chart-1"&gt;state of the nation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can also watch the animation &lt;a href="http://infosthetics.com/archives/2012/05/americas_presidential_race_in_narrated_data_graphics.html#extended"&gt;below&lt;/a&gt;.&lt;/p&gt;&lt;div&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/infosthetics/~4/oASOglmzFd8" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/igafFMGaL-Q" height="1" width="1"/&gt;</summary><author gr:unknown-author="true"><name>(author unknown)</name></author><source gr:stream-id="feed/http://infosthetics.com/index.xml"><id>tag:google.com,2005:reader/feed/http://infosthetics.com/index.xml</id><title type="html">information aesthetics</title><link rel="alternate" href="http://infosthetics.com/" type="text/html" /></source><feedburner:origLink>http://feeds.infosthetics.com/~r/infosthetics/~3/oASOglmzFd8/americas_presidential_race_in_narrated_data_graphics.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337884989603"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016305cc558b970d">tag:google.com,2005:reader/item/65827a9295ba4be7</id><category term="big data" scheme="http://www.sixapart.com/ns/types#category" /><category term="data science" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">Data Science shows maturity at 2012 Summit</title><published>2012-05-24T17:55:25Z</published><updated>2012-05-24T17:55:25Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/eCy00BBRd-0/data-science-shows-maturity-at-2012-summit.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/data-science-shows-maturity-at-2012-summit.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;As a discipline, Data Science is growing up fast. That&amp;#39;s my key takeaway from the 2012 &lt;a href="http://www.greenplum.com/datasciencesummit/"&gt;Data Science Summit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016305cc48d6970d-pi" style="display:inline"&gt;&lt;img alt="Data Science Summit 2012" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016305cc48d6970d-800wi" title="Data Science Summit 2012"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;At the inaugural &lt;a href="http://blog.revolutionanalytics.com/2011/05/reflections-on-data-science-summit-2011.html"&gt;2011 Data Science Summit&lt;/a&gt; (you can see some highlights in this &lt;a href="http://www.greenplum.com/popover/video/206"&gt;recap video&lt;/a&gt;), the focus was on the Big Data part of Data Science: issues with streaming data, how to store big data, technology platforms, that kind of thing. This year&amp;#39;s summit was much more focused on the &amp;quot;Science&amp;quot; part of Data Science: applications of Big Data, and statistical issues related to the analysis of Big Data. A few examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Nate Silver (&lt;a href="http://fivethirtyeight.blogs.nytimes.com/"&gt;political forecaster&lt;/a&gt; for the NYT) talked not just about building models and making predictions, but also about the importance of, in his words, &amp;quot;embracing uncertainty&amp;quot;. A prediction often isn&amp;#39;t useful without an assessment of its uncertainty (or risk). He gave this real-life example: a flood-level prediction of 49 feet doesn&amp;#39;t mean a city can rest easy because the levees are 51 feet high. The weather service failed to mention that there was a plus-or-minus 9 feet margin of error to that prediction, or about a 50-50 chance the city would be flooded. (It was.)&lt;/li&gt;
&lt;li&gt;Michael Chui (author of the &lt;a href="http://blog.revolutionanalytics.com/2011/06/new-mckinsey-report-on-big-data.html"&gt;McKinsey Big Data report&lt;/a&gt;) said that schools should be teaching more Statistics, and less Calculus, so that graduates have a better grasp of issues like sampling and selection bias.&lt;/li&gt;
&lt;li&gt;Michael Brown (CTO of &lt;a href="http://www.comscore.com/"&gt;ComScore&lt;/a&gt;) talked about the need to understand the impact of recall bias and outliers.&lt;/li&gt;
&lt;li&gt;Jeremy Howard (Chief Data Scientist of &lt;a href="http://www.kaggle.com"&gt;Kaggle&lt;/a&gt;) warned of the dangers of observation bias inherent in &amp;quot;data exhaust&amp;quot;, and extolled the benefits of statistical experiments to distinguish between causality and correlation.&lt;/li&gt;
&lt;li&gt;Tony Jebara (co-founder of Sense Networks) expressed the need for the focus of predictive analytics to graduate from mere accuracy to making models interpretable, and predictions actionable.&lt;/li&gt;
&lt;li&gt;Hadley Wickham (R package author and &lt;a href="http://www.revolutionanalytics.com/products/training/public/r-development.php"&gt;educator&lt;/a&gt;) described the variety of application areas for Data Science, from cheesemakers to airport designers, and from sports teams to cruise lines. &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are all important statistical issues, which until recently have had a back-seat to the technological and operational issues of data science. It&amp;#39;s great to see the practice maturing, and this new focus will lead to data applications which are not just more powerful, but more reliable and more impactful as well. Data Science has come of Statistical age.&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/eCy00BBRd-0" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/data-science-shows-maturity-at-2012-summit.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337803317941"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b0168ebbad389970c">tag:google.com,2005:reader/item/4a5a28b805471f0b</id><category term="applications" scheme="http://www.sixapart.com/ns/types#category" /><category term="current events" scheme="http://www.sixapart.com/ns/types#category" /><category term="graphics" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">NYT charts the Facebook IPO with R</title><published>2012-05-23T20:01:14Z</published><updated>2012-05-23T20:01:14Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/aVURiwHzTy4/nyt-charts-the-facebook-ipo-with-r.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/nyt-charts-the-facebook-ipo-with-r.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" 
type="html">&lt;div&gt;&lt;p&gt;In conjunction with Facebook&amp;#39;s record-setting IPO last Thursday, the New York Times created an infographic to put the size of the offer in context with other recent IPOs. A detail of the graphic as it appeared in the print edition appears below:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://chartsnthings.tumblr.com/post/23348191031/amanda-cox-and-countrymen-chart-the-facebook-i-p-o" style="display:inline"&gt;&lt;img alt="NYT-facebook-detail" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016305c52b8a970d-800wi" title="NYT-facebook-detail"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;ChartsNThings gives a &lt;a href="http://chartsnthings.tumblr.com/post/23348191031/amanda-cox-and-countrymen-chart-the-facebook-i-p-o"&gt;fascinating peek&lt;/a&gt; into the weeklong process that went into creating this chart, where about a dozen &amp;quot;sketches&amp;quot; of charts were tried and considered until the final chart was selected when Thursday&amp;#39;s deadline arrived. It&amp;#39;s a great look at the developmental process that goes into creating the quality infographics that are the hallmark of the NYT team led by graphics editor Amanda Cox. It&amp;#39;s also a testament to the value of working in the &lt;a href="http://www.revolutionanalytics.com/what-is-open-source-r/"&gt;R language&lt;/a&gt; (used to create all of the prototype charts), which is expressly designed to encourage experimentation and creativity in data analysis and visualization. Plus, the highly expressive nature of the language allows such prototypes to be created quickly, allowing the NYT team to iterate through many alternatives as the deadline loomed.&lt;/p&gt;
&lt;p&gt;The post also highlights an interesting angle related to the differing requirements of the web version and print version of NYT infographics: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you’ve seen the &lt;a href="http://www.nytimes.com/interactive/2012/05/17/business/dealbook/how-the-facebook-offering-compares.html"&gt;web version&lt;/a&gt;, though, you know it doesn’t look like this. [Amanda thinks print graphics can be smarter than web graphics.] For one, the browser window doesn’t give us this kind of space. But the medium itself plays a part too. Online, if you’re not engaged in 10 seconds, you’re not going to stay on the page, so they needed to keep it fun.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The prototype below was considered for the print version, but was instead ultimately adapted for the web:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766b950d1970b-pi" style="display:inline"&gt;&lt;img alt="Facebook-IPO-NYT-prototype" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766b950d1970b-800wi" title="Facebook-IPO-NYT-prototype"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The fun aspect here is that as you switch between the different views on the web, it&amp;#39;s not just a slideshow; each IPO &amp;quot;bubble&amp;quot; moves into its new position as the view changes. Like a motion chart, it&amp;#39;s a great way to track comparisons of individual data points between data views. (&lt;a href="http://www.nytimes.com/interactive/2012/05/17/business/dealbook/how-the-facebook-offering-compares.html"&gt;Try it out here&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;There are lots more details about the process of creating these charts at the post linked below; the whole thing is a great read.&lt;/p&gt;
&lt;p&gt;ChartsNThings: &lt;a href="http://chartsnthings.tumblr.com/post/23348191031/amanda-cox-and-countrymen-chart-the-facebook-i-p-o"&gt;Amanda Cox and countrymen chart the Facebook I.P.O.&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/aVURiwHzTy4" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/nyt-charts-the-facebook-ipo-with-r.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337800316691"><id gr:original-id="http://infosthetics.com/archives/2012/05/the_power_of_networks_manuel_limas_talk_sketched_animated.html">tag:google.com,2005:reader/item/de4e03a5ef6c7117</id><category term="art" /><title type="html">The Power of Networks: Manuel Lima's Talk... Sketched and Animated</title><published>2012-05-23T20:11:43Z</published><updated>2012-05-23T20:11:43Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/QVDeB_j5WSE/the_power_of_networks_manuel_limas_talk_sketched_animated.html" type="text/html" /><summary xml:base="http://infosthetics.com/" type="html">&lt;p&gt;&lt;img alt="lima_animated.jpg" src="http://infosthetics.com/archives/lima_animated.jpg" width="600" height="300"&gt;&lt;br&gt;
Manuel Lima, currently senior UX design lead at Microsoft Bing but maybe best known for the (now sparsely updated) blog &lt;a href="http://visualcomplexity.com"&gt;Visual Complexity&lt;/a&gt; and his recent &lt;a href="http://www.amazon.com/gp/product/1568989369/ref=as_li_ss_tl?ie=UTF8&amp;amp;tag=informationae-20&amp;amp;linkCode=as2&amp;amp;camp=217145&amp;amp;creative=399373&amp;amp;creativeASIN=1568989369"&gt;book&lt;/a&gt; with the same title, likes to discuss the power of network visualisation in a wide and conceptually rich manner. &lt;/p&gt;

&lt;p&gt;You now have the chance to experience his 11-minute talk online, but instead of seeing Manuel talk in front of his slides, the following presentation consists solely of a large set of sketched illustrations that represent and visually clarify the most important concepts he discussed.&lt;/p&gt;

&lt;p&gt;Watch either the &lt;a href="http://www.youtube.com/watch?v=nJmGrNdJ5Gw"&gt;animated&lt;/a&gt; or &lt;a href="http://www.youtube.com/watch?v=_0LVSIwifpI"&gt;traditional&lt;/a&gt; version at YouTube.&lt;/p&gt;

&lt;p&gt;The animated versions of well-known talks are brought to you by the &lt;a href="http://www.thersa.org/"&gt;Royal Society for the encouragement of Arts, Manufactures and Commerce (RSA)&lt;/a&gt;. &lt;/p&gt;&lt;div&gt;
&lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=HRN8sF9kJkc:B5WbnrkjxsM:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=yIl2AUoC8zA" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=HRN8sF9kJkc:B5WbnrkjxsM:nQ_hWtDbxek"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=nQ_hWtDbxek" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=HRN8sF9kJkc:B5WbnrkjxsM:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=qj6IDK7rITs" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=HRN8sF9kJkc:B5WbnrkjxsM:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=7Q72WNTAKBA" border="0"&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/infosthetics/~4/HRN8sF9kJkc" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/QVDeB_j5WSE" height="1" width="1"/&gt;</summary><author gr:unknown-author="true"><name>(author unknown)</name></author><source gr:stream-id="feed/http://infosthetics.com/index.xml"><id>tag:google.com,2005:reader/feed/http://infosthetics.com/index.xml</id><title type="html">information aesthetics</title><link rel="alternate" href="http://infosthetics.com/" type="text/html" /></source><feedburner:origLink>http://feeds.infosthetics.com/~r/infosthetics/~3/HRN8sF9kJkc/the_power_of_networks_manuel_limas_talk_sketched_animated.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337797934212"><id gr:original-id="http://infosthetics.com/archives/2012/05/threadwatch_tracking_and_visualizing_the_use_of_software_applications.html">tag:google.com,2005:reader/item/b558aa3b76bf3b09</id><category term="lifelogging" /><title type="html">ThreadWatch: Tracking and Visualizing the Use of Software Applications</title><published>2012-05-23T19:31:11Z</published><updated>2012-05-23T19:31:11Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/tqD1k3YRR9U/threadwatch_tracking_and_visualizing_the_use_of_software_applications.html" type="text/html" /><summary xml:base="http://infosthetics.com/" type="html">&lt;p&gt;&lt;img alt="threadwatch.jpg" src="http://infosthetics.com/archives/threadwatch.jpg" width="600" height="300"&gt;&lt;br&gt;
&lt;a href="http://threadwatch.finekost.com/"&gt;ThreadWatch&lt;/a&gt; [finekost.com] by interactive developer Alex Milde visualizes the usage of software programs on the Mac platform over the timeframe of one day. &lt;/p&gt;

&lt;p&gt;First, you need to download a small program that tracks all active applications on your desktop, as well as their impact in terms of memory and CPU usage. This tracked data is stored in a text file, which can be uploaded and then visualized. The data is neither stored nor kept by the visualization tool. Individual software programs are represented by different colors.&lt;/p&gt;

&lt;p&gt;See also:&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2010/10/logtool_revealing_the_hidden_patterns_of_online_surfing_behavior.html"&gt;logTool: Revealing the Hidden Patterns of Online Surfing Behavior&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2009/11/nebulus_visualizing_your_online_activities.html"&gt;Nebul.us: Visualizing (and Sharing) your Online Activity&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2010/05/iograph_tracking_computer_mouse_movements.html"&gt;IOGraph: Tracking Computer Mouse Movements as Art Work&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2009/09/eyebrowse_record_visualize_and_share_your_browser_history.html"&gt;EyeBrowse: Record, Visualize and Share your Browser History&lt;/a&gt;&lt;/p&gt;&lt;div&gt;
&lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=zYNDAun6jyg:s5eIcdrj92g:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=yIl2AUoC8zA" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=zYNDAun6jyg:s5eIcdrj92g:nQ_hWtDbxek"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=nQ_hWtDbxek" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=zYNDAun6jyg:s5eIcdrj92g:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=qj6IDK7rITs" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=zYNDAun6jyg:s5eIcdrj92g:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=7Q72WNTAKBA" border="0"&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/infosthetics/~4/zYNDAun6jyg" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/tqD1k3YRR9U" height="1" width="1"/&gt;</summary><author gr:unknown-author="true"><name>(author unknown)</name></author><source gr:stream-id="feed/http://infosthetics.com/index.xml"><id>tag:google.com,2005:reader/feed/http://infosthetics.com/index.xml</id><title type="html">information aesthetics</title><link rel="alternate" href="http://infosthetics.com/" type="text/html" /></source><feedburner:origLink>http://feeds.infosthetics.com/~r/infosthetics/~3/zYNDAun6jyg/threadwatch_tracking_and_visualizing_the_use_of_software_applications.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337719697404"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016766b0278a970b">tag:google.com,2005:reader/item/b685cb7e0f31e9ac</id><category term="applications" scheme="http://www.sixapart.com/ns/types#category" /><category term="current events" scheme="http://www.sixapart.com/ns/types#category" /><category term="graphics" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">The grade level of Congress speeches, analyzed with R</title><published>2012-05-22T20:47:16Z</published><updated>2012-05-22T20:47:16Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/-krWEkdB7ks/the-grade-level-of-congress-speeches-analyzed-with-r.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/the-grade-level-of-congress-speeches-analyzed-with-r.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;As widely reported by &lt;a href="http://politicalticker.blogs.cnn.com/2012/05/21/study-congressional-speaking-levels-move-back-a-grade/"&gt;CNN&lt;/a&gt;, the 
&lt;a href="http://www.huffingtonpost.com/2012/05/21/members-of-congress-speak_n_1532666.html"&gt;Huffington Post&lt;/a&gt;, &lt;a href="http://tpmdc.talkingpointsmemo.com/2012/05/congress-speaking-grade-level.php"&gt;Talking Points Memo&lt;/a&gt;, the sophistication of speeches by US politicians has declined in recent years, dropping from an 11th-grade level in 2005 to a 10th-grade level today. The reports are based on an &lt;a href="http://sunlightfoundation.com/blog/2012/05/21/congressional-speech/"&gt;analysis by the Sunlight Foundation&lt;/a&gt;, based on textual analysis of congressional speeches given since 1996 provided by the &lt;a href="http://capitolwords.org/"&gt;Capitol Words&lt;/a&gt; API. You can see where your favourite legislator ranks in &lt;a href="https://data.sunlightlabs.com/dataset/Members-And-Grade-Level/iwhe-qaqu"&gt;this table&lt;/a&gt;. Picking out a couple of famous names, Sen. Rand Paul (R-KY) ranks near the bottom with an 8th-grade speaking level; near the top of the ranks is retiring senator Olympia-Snowe (R-ME) who speaks at a Grade 14 level. &lt;/p&gt;
&lt;p&gt;A regression analysis detailed in the article teases out some of the contributing factors. While the speech levels of members of both parties have declined over the past couple of years, the chart below shows that for Republicans, the more ideologically extreme the congressperson, the less sophisticated the speech.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168ebb196af970c-pi" style="display:inline"&gt;&lt;img alt="Ideology-and-grade-level-ggplot2" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168ebb196af970c-800wi" title="Ideology-and-grade-level-ggplot2"&gt;&lt;/a&gt;&lt;br&gt;It&amp;#39;s clear that the data visualizations were created using R&amp;#39;s &lt;a href="http://blog.revolutionanalytics.com/2009/01/create-beautiful-statistical-graphics-with-ggplot2.html"&gt;ggplot2 package&lt;/a&gt;; presumably the statistical analyses were peformed with the &lt;a href="http://www.revolutionanalytics.com/what-is-open-source-r/"&gt;R language&lt;/a&gt; as well. For the &lt;a href="http://assets.sunlightfoundation.com.s3.amazonaws.com/blog/capwords-grade-level/congressReportCard.pdf"&gt;infographic circulated to the media&lt;/a&gt;, the ggplot2 charts have been cleaned up using an editing tool like Illustrator (easy if you &lt;a href="http://www.inside-r.org/r-doc/grDevices/pdf"&gt;export the chart to PDF&lt;/a&gt; from R).&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766b01c19970b-pi" style="display:inline"&gt;&lt;img alt="Ideology vs speaking grade" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016766b01c19970b-800wi" title="Ideology vs speaking grade"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Read the details of the analysis in the full report from the Sunlight Foundation linked below.&lt;/p&gt;
&lt;p&gt;Sunlight Foundation: &lt;a href="http://sunlightfoundation.com/blog/2012/05/21/congressional-speech/"&gt;The changing complexity of congressional speech&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/-krWEkdB7ks" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/the-grade-level-of-congress-speeches-analyzed-with-r.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337693552414"><id gr:original-id="http://infosthetics.com/archives/2012/05/the_voting_patterns_of_the_eurovision_song_contest_in_the_last_10_years.html">tag:google.com,2005:reader/item/96a37e8a0b564ac8</id><category term="collection" /><title type="html">The Voting Patterns of the Eurovision Song Contest in the last 10 Years</title><published>2012-05-22T14:26:56Z</published><updated>2012-05-22T14:26:56Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/0ZXESskwVpQ/the_voting_patterns_of_the_eurovision_song_contest_in_the_last_10_years.html" type="text/html" /><summary xml:base="http://infosthetics.com/" type="html">&lt;p&gt;&lt;img alt="eurovision_overview.jpg" src="http://infosthetics.com/archives/eurovision_overview.jpg" width="600" height="300"&gt;&lt;br&gt;
In less than a week, more than 125 million people will be watching the 2012 edition of the &lt;a href="http://en.wikipedia.org/wiki/Eurovision_Song_Contest"&gt;Eurovision Song Contest&lt;/a&gt;, the annual competition held among active member countries of the European Broadcasting Union (EBU). &lt;/p&gt;

&lt;p&gt;An important part of the televised contest is the whole voting process, during which each participating country makes its favorite votes public in about 3 different languages. &lt;/p&gt;

&lt;p&gt;For those nerdy people who like to know which country voted in what way and when, there is now the following information graphic. &lt;a href="http://lifeindata.site50.net/work/eurovizion/eurovizion.html"&gt;Eurovizion&lt;/a&gt; [site50.net], designed by graphic designer Ben Willers, provides a detailed overview of all voting patterns that occurred during the last 10 years of Eurovision Song Contests. &lt;/p&gt;

&lt;p&gt;The top bar graph shows the votes received by each country. The bottom, dot-plot kind of graph reveals all individual votes, where the horizontal axis denotes the 'giving' countries, and the vertical axis the 'receiving' ones. &lt;/p&gt;

&lt;p&gt;One can detect some remarkable anomalies that show how the perceived quality of musical songs is quite a cultural affair. Or how can one explain the relationships between Greece and Cyprus (or Albania), and Germany and Israel?&lt;/p&gt;

&lt;p&gt;See also:&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2009/07/eurovision_2009_results_visualized.html"&gt;Eurovision 2009 Results&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2005/05/eurosong_visual.html"&gt;Visualising the Eurovision&lt;/a&gt;&lt;/p&gt;&lt;div&gt;
&lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=YQc2Co3KanA:BMJZvp8Y--c:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=yIl2AUoC8zA" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=YQc2Co3KanA:BMJZvp8Y--c:nQ_hWtDbxek"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=nQ_hWtDbxek" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=YQc2Co3KanA:BMJZvp8Y--c:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=qj6IDK7rITs" border="0"&gt;&lt;/a&gt; &lt;a href="http://feeds.infosthetics.com/~ff/infosthetics?a=YQc2Co3KanA:BMJZvp8Y--c:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/infosthetics?d=7Q72WNTAKBA" border="0"&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/infosthetics/~4/YQc2Co3KanA" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/0ZXESskwVpQ" height="1" width="1"/&gt;</summary><author gr:unknown-author="true"><name>(author unknown)</name></author><source gr:stream-id="feed/http://infosthetics.com/index.xml"><id>tag:google.com,2005:reader/feed/http://infosthetics.com/index.xml</id><title type="html">information aesthetics</title><link rel="alternate" href="http://infosthetics.com/" type="text/html" /></source><feedburner:origLink>http://feeds.infosthetics.com/~r/infosthetics/~3/YQc2Co3KanA/the_voting_patterns_of_the_eurovision_song_contest_in_the_last_10_years.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337644779196"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016766a90cd0970b">tag:google.com,2005:reader/item/77df1c350b56a1c7</id><category term="graphics" scheme="http://www.sixapart.com/ns/types#category" /><category term="packages" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">A visual data summary for data frames</title><published>2012-05-21T23:32:21Z</published><updated>2012-05-21T23:32:21Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/B9Mi5-LXdoU/a-visual-data-summary-for-data-frames.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/a-visual-data-summary-for-data-frames.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;If you want to get a quick numerical summary of a data set, the summary function gives a nice overview for data frames:&lt;/p&gt;
&lt;div style="overflow:auto"&gt;
&lt;div&gt;
&lt;pre style="font-family:monospace"&gt;&lt;span&gt;&amp;gt;&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/require"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;require&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;&lt;a href="http://inside-r.org/packages/cran/ggplot2"&gt;&lt;span&gt;ggplot2&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;
Loading required package&lt;span&gt;:&lt;/span&gt; &lt;a href="http://inside-r.org/packages/cran/ggplot2"&gt;&lt;span&gt;ggplot2&lt;/span&gt;&lt;/a&gt;
&lt;span&gt;&amp;gt;&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/utils/data"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;data&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;&lt;a href="http://inside-r.org/packages/cran/diamonds"&gt;&lt;span&gt;diamonds&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;
&lt;span&gt;&amp;gt;&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/summary"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;summary&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;(&lt;/span&gt;&lt;a href="http://inside-r.org/packages/cran/diamonds"&gt;&lt;span&gt;diamonds&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;
     carat               &lt;a href="http://inside-r.org/r-doc/base/cut"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;cut&lt;/span&gt;&lt;/a&gt;        color        clarity          &lt;a href="http://inside-r.org/packages/cran/depth"&gt;&lt;span&gt;depth&lt;/span&gt;&lt;/a&gt;           &lt;a href="http://inside-r.org/r-doc/base/table"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;table&lt;/span&gt;&lt;/a&gt;      
 Min.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;0.2000&lt;/span&gt;   &lt;a href="http://inside-r.org/packages/cran/FAiR"&gt;&lt;span&gt;Fair&lt;/span&gt;&lt;/a&gt;     &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;1610&lt;/span&gt;   &lt;a href="http://inside-r.org/r-doc/stats/D"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;D&lt;/span&gt;&lt;/a&gt;&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;6775&lt;/span&gt;   SI1    &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;13065&lt;/span&gt;   Min.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;43.00&lt;/span&gt;   Min.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;43.00&lt;/span&gt;  
 1st Qu.&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;0.4000&lt;/span&gt;   Good     &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;4906&lt;/span&gt;   E&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;9797&lt;/span&gt;   VS2    &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;12258&lt;/span&gt;   1st Qu.&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;61.00&lt;/span&gt;   1st Qu.&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;56.00&lt;/span&gt;  
 Median &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;0.7000&lt;/span&gt;   Very Good&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;12082&lt;/span&gt;   F&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;9542&lt;/span&gt;   SI2    &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;9194&lt;/span&gt;   Median &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;61.80&lt;/span&gt;   Median &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;57.00&lt;/span&gt;  
 Mean   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;0.7979&lt;/span&gt;   Premium  &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;13791&lt;/span&gt;   G&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;11292&lt;/span&gt;   VS1    &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;8171&lt;/span&gt;   Mean   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;61.75&lt;/span&gt;   Mean   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;57.46&lt;/span&gt;  
 3rd Qu.&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;1.0400&lt;/span&gt;   Ideal    &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;21551&lt;/span&gt;   H&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;8304&lt;/span&gt;   VVS2   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5066&lt;/span&gt;   3rd Qu.&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;62.50&lt;/span&gt;   3rd Qu.&lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;59.00&lt;/span&gt;  
 Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;5.0100&lt;/span&gt;                     &lt;a href="http://inside-r.org/r-doc/base/I"&gt;&lt;span style="color:#003399;font-weight:bold"&gt;I&lt;/span&gt;&lt;/a&gt;&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5422&lt;/span&gt;   VVS1   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;3655&lt;/span&gt;   Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;79.00&lt;/span&gt;   Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;95.00&lt;/span&gt;  
                                    J&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;2808&lt;/span&gt;   &lt;span style="color:#009900"&gt;(&lt;/span&gt;Other&lt;span style="color:#009900"&gt;)&lt;/span&gt;&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;2531&lt;/span&gt;                                  
     price             x                y                z         
 Min.   &lt;span&gt;:&lt;/span&gt;  &lt;span style="color:#cc66cc"&gt;326&lt;/span&gt;   Min.   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;0.000&lt;/span&gt;   Min.   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;0.000&lt;/span&gt;   Min.   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;0.000&lt;/span&gt;  
 1st Qu.&lt;span&gt;:&lt;/span&gt;  &lt;span style="color:#cc66cc"&gt;950&lt;/span&gt;   1st Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;4.710&lt;/span&gt;   1st Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;4.720&lt;/span&gt;   1st Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;2.910&lt;/span&gt;  
 Median &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;2401&lt;/span&gt;   Median &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5.700&lt;/span&gt;   Median &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5.710&lt;/span&gt;   Median &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;3.530&lt;/span&gt;  
 Mean   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;3933&lt;/span&gt;   Mean   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5.731&lt;/span&gt;   Mean   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5.735&lt;/span&gt;   Mean   &lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;3.539&lt;/span&gt;  
 3rd Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;5324&lt;/span&gt;   3rd Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;6.540&lt;/span&gt;   3rd Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;6.540&lt;/span&gt;   3rd Qu.&lt;span&gt;:&lt;/span&gt; &lt;span style="color:#cc66cc"&gt;4.040&lt;/span&gt;  
 Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;18823&lt;/span&gt;   Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;10.740&lt;/span&gt;   Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;58.900&lt;/span&gt;   Max.   &lt;span&gt;:&lt;/span&gt;&lt;span style="color:#cc66cc"&gt;31.800&lt;/span&gt;  &lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;But if you&amp;#39;d prefer a visual overview of your data, &lt;a href="http://www.ancienteco.com/2012/05/quickly-visualize-your-whole-dataset.html"&gt;Andrew Barr suggests the tableplot function&lt;/a&gt; (included in the tabplot package) for a graphical version:&lt;/p&gt;
&lt;div style="overflow:auto"&gt;
&lt;div&gt;
&lt;pre style="font-family:monospace"&gt;tableplot&lt;span style="color:#009900"&gt;(&lt;/span&gt;&lt;a href="http://inside-r.org/packages/cran/diamonds"&gt;&lt;span&gt;diamonds&lt;/span&gt;&lt;/a&gt;&lt;span style="color:#339933"&gt;,&lt;/span&gt; cex = &lt;span style="color:#cc66cc"&gt;1.8&lt;/span&gt;&lt;span style="color:#009900"&gt;)&lt;/span&gt;&lt;/pre&gt;
&lt;pre style="font-family:monospace"&gt;&lt;span style="color:#009900"&gt;
&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016305b500e7970d-popup" style="display:inline"&gt;&lt;img alt="Tableplot-diamonds" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b016305b500e7970d-800wi" title="Tableplot-diamonds"&gt;&lt;/a&gt;&lt;br&gt;&lt;/span&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Andrew explains how to use the tableplot function in the post linked below.&lt;/p&gt;
&lt;p&gt;W. Andrew Barr&amp;#39;s Paleoecology Blog: &lt;a href="http://www.ancienteco.com/2012/05/quickly-visualize-your-whole-dataset.html"&gt;Quickly Visualize Your Whole Dataset&lt;/a&gt; (&lt;a href="https://twitter.com/#!/JacquelynGill/statuses/204599406790578177"&gt;via&lt;/a&gt; @JacquelynGill)&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/B9Mi5-LXdoU" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/a-visual-data-summary-for-data-frames.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337628648685"><id gr:original-id="http://infosthetics.com/archives/2012/05/hans_roslings_shortest_talk_ever.html">tag:google.com,2005:reader/item/70d4b4b7fdb5f914</id><category term="infographic" /><title type="html">Hans Rosling's Shortest Talk Ever</title><published>2012-05-21T20:29:25Z</published><updated>2012-05-21T20:29:25Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/W4qNDwXKd8g/hans_roslings_shortest_talk_ever.html" type="text/html" /><summary xml:base="http://infosthetics.com/" type="html">&lt;p&gt;&lt;img alt="rosling_stones.jpg" src="http://infosthetics.com/archives/rosling_stones.jpg" width="600" height="300"&gt;&lt;br&gt;
One does not need flashy PowerPoint slides or exact statistical numbers to deliver an educational yet compelling presentation. Nor does the presentation need to take that long. &lt;/p&gt;

&lt;p&gt;Hans Rosling, who is already known for using physical props during his presentations, ranging from &lt;a href="http://www.youtube.com/watch?v=fTznEIZRkLg"&gt;IKEA storage boxes&lt;/a&gt; and &lt;a href="http://www.youtube.com/watch?v=BZoKfap4g4w"&gt;washing machines&lt;/a&gt; to a real &lt;a href="http://www.youtube.com/watch?v=YpKbO6O3O3M"&gt;sword&lt;/a&gt;, demonstrates again how physical objects can be used to focus human attention on important statistical trends. He also shows how stones can convincingly form bar graphs. &lt;/p&gt;

&lt;p&gt;The following 'talk', taking less than 50 seconds, 'happened' during a spontaneous interview after the TEDxSummit 2012 in Doha, Qatar and discusses the impact of global population growth.&lt;/p&gt;

&lt;p&gt;Watch the presentation &lt;a href="http://infosthetics.com/archives/2012/05/hans_roslings_shortest_talk_ever.html#extended"&gt;below&lt;/a&gt;. The presentation also reminds me to always show up with clean shoes.&lt;/p&gt;

&lt;p&gt;See also:&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2010/11/the_joy_of_stats_combining_hans_rosling_with_holographic_infographics.html"&gt;The Joy of Stats: Combining Hans Rosling with Holographic Infographics&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2009/05/hans_rosling_video_gapcast_swine_flu_news_versus_death_ratio.html"&gt;Swine Flu News versus Death Ratio&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2010/11/the_joy_of_stats_combining_hans_rosling_with_holographic_infographics.html"&gt;Joy of Stats&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2007/07/new_ted_hans_rosling_talk_data_visualization.html"&gt;Hans Rosling TED Talk 2007&lt;/a&gt;&lt;br&gt;
. &lt;a href="http://infosthetics.com/archives/2006/06/data_visualization_hans_rosling_ingo_gunther.html"&gt;Hans Rosling TED Talk 2006&lt;/a&gt;&lt;/p&gt;&lt;div&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/infosthetics/~4/NVSHZ3oRMk4" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/W4qNDwXKd8g" height="1" width="1"/&gt;</summary><author gr:unknown-author="true"><name>(author unknown)</name></author><source gr:stream-id="feed/http://infosthetics.com/index.xml"><id>tag:google.com,2005:reader/feed/http://infosthetics.com/index.xml</id><title type="html">information aesthetics</title><link rel="alternate" href="http://infosthetics.com/" type="text/html" /></source><feedburner:origLink>http://feeds.infosthetics.com/~r/infosthetics/~3/NVSHZ3oRMk4/hans_roslings_shortest_talk_ever.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337384880396"><id gr:original-id="http://agtb.wordpress.com/?p=1662">tag:google.com,2005:reader/item/0f0033005a695f66</id><category term="Uncategorized" /><title type="html">2012 Gödel Prize awarded to 6 researchers in Algorithmic Game Theory</title><published>2012-05-18T23:47:51Z</published><updated>2012-05-18T23:47:51Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/517_LBxItDg/" type="text/html" /><content xml:base="http://agtb.wordpress.com/" type="html">&lt;p&gt;The 2012 &lt;a href="http://en.wikipedia.org/wiki/G%C3%B6del_Prize"&gt;Gödel Prize&lt;/a&gt;, which recognizes outstanding papers in theoretical computer science published over the past 14 years, has just been announced. This year it recognizes three prominent papers that helped to launch the field of Algorithmic Game Theory, and these papers’ six authors: Elias Koutsoupias and Christos H. Papadimitriou, Tim Roughgarden and Éva Tardos, and Noam Nisan and Amir Ronen. Quoting from the &lt;a href="http://www.acm.org/press-room/news-releases/2012/goedel-prize-2012"&gt;ACM’s official citation&lt;/a&gt; (editing slightly to improve readability):&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;In their paper “Worst-case Equilibria,” Koutsoupias and Papadimitriou introduced the “price of anarchy” concept, a measure of the extent to which competition approximates cooperation. It quantifies how much performance is lost due to selfish behaviors in systems like the Internet, which operates without a system designer or monitor striving to achieve the “social optimum.” Their answer, surprisingly often, is “not that much.”&lt;/p&gt;
&lt;p&gt;Roughgarden and Tardos revealed the power and depth of the “price of anarchy” concept as it applies to routing traffic in large-scale communications networks to optimize the performance of a congested network. Their paper “How Bad Is Selfish Routing?” revisits an old conundrum in transportation science known as “Braess’s paradox,” and provides remarkably complete results on the relationship between centralized optimization and selfish routing in network traffic.&lt;/p&gt;
&lt;p&gt;Nisan and Ronen coined the term “algorithmic mechanism design” in their paper of the same title, presenting a whole new range of applications of the theory of mechanism design within computer science. Combining ideas from economics and game theory with concepts and techniques from computer science, they enriched both mechanism design and the theories of algorithms and complexity.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;It’s great to see this level of recognition of the importance of algorithmic game theory in general, and these influential papers in particular, by the broader TCS community. Congratulations to all the winners! (I should note that half the winners—one from each paper—are contributors to this blog: Elias Koutsoupias, Noam Nisan and Tim Roughgarden. They would surely have been too modest to post this announcement themselves, but congratulations to them anyway. &lt;img src="http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif" alt=":-)"&gt; )&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/517_LBxItDg" height="1" width="1"/&gt;</content><author><name>Kevin Leyton-Brown</name></author><source gr:stream-id="feed/http://agtb.wordpress.com/feed/"><id>tag:google.com,2005:reader/feed/http://agtb.wordpress.com/feed/</id><title type="html">Turing&amp;#39;s 
Invisible Hand</title><link rel="alternate" href="http://agtb.wordpress.com" type="text/html" /></source><feedburner:origLink>http://agtb.wordpress.com/2012/05/19/2012-godel-prize-awarded-to-6-researchers-in-algorithmic-game-theory/</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337380219253"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016304b69efb970d">tag:google.com,2005:reader/item/69645b0bad426346</id><category term="random" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">Because it&amp;#39;s Friday: Game theory</title><published>2012-05-18T21:38:50Z</published><updated>2012-05-18T21:40:32Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/dAVXsXUAq_Y/game-theory.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/game-theory.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Game_theory"&gt;Game Theory&lt;/a&gt; is the mathematical study of how agents in a system choose their actions, in light of the fact that other agents are also making competitive choices of &lt;em&gt;their&lt;/em&gt; actions. As the name suggests, the &amp;quot;system&amp;quot; is often some kind of game and the &amp;quot;agents&amp;quot; are players, but game theory is also used to explain crowd motion, business dealings, foreign relations, and even the evolution of altruism. (There&amp;#39;s an excellent chapter involving game theory in &lt;a href="http://amzn.com/0199291152"&gt;The Selfish Gene&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;The textbook example of game theory is the &lt;a href="http://en.wikipedia.org/wiki/Prisoner%27s_dilemma"&gt;Prisoner&amp;#39;s Dilemma&lt;/a&gt;. The UK game show &amp;quot;Golden Balls&amp;quot; includes a form of the Prisoner&amp;#39;s Dilemma where the stake is prize money instead of freedom. Each player independently chooses to &amp;quot;split&amp;quot; or &amp;quot;steal&amp;quot; the prize, but if &lt;em&gt;both&lt;/em&gt; steal, each goes away empty-handed. You might think there aren&amp;#39;t many strategic options to guarantee a win in a game like this, but one player found a way:&lt;/p&gt;
&lt;p&gt;&lt;iframe frameborder="0" height="344" src="http://www.youtube.com/embed/S0qjK3TWZE8?fs=1&amp;amp;feature=oembed" width="459"&gt;&lt;/iframe&gt; &lt;/p&gt;
&lt;p&gt;It&amp;#39;s an elegant strategy. Nick is apparently altruistic (or at least risk-averse), and wants to share the prize money. But altruistic actions are open to subversion by a selfish opponent. Ibrahim, unable to tell whether Nick&amp;#39;s apparent &amp;quot;mutually assured destruction&amp;quot; strategy is a bluff, has no option but to behave altruistically. I wonder if this strategy has ever been attempted again on the show; now that it is in the &amp;quot;gene pool&amp;quot; of strategic options, it would likely be less effective the next time around. And that&amp;#39;s one of the subtle beauties of Game Theory.&lt;/p&gt;
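&lt;p&gt;As a rough sketch of why the announced &amp;quot;steal&amp;quot; leaves the other player with so little room, the payoffs can be written out in a few lines of Python. The function name and the jackpot figure are made up for illustration, and the rules beyond &amp;quot;both steal, both leave empty-handed&amp;quot; are assumed to follow the usual form of the final round (both split: half each; one steals: the stealer takes everything).&lt;/p&gt;
&lt;pre&gt;
```python
def payoffs(a, b, jackpot):
    """Return (payoff to a, payoff to b) for choices 'split' or 'steal'.

    Assumed rules: split/split shares the jackpot, a lone stealer takes
    everything, and steal/steal leaves both players with nothing.
    """
    if a == "split" and b == "split":
        return (jackpot / 2, jackpot / 2)
    if a == "steal" and b == "steal":
        return (0, 0)
    if a == "steal":
        return (jackpot, 0)
    return (0, jackpot)


JACKPOT = 100  # hypothetical amount, not the figure from the show

# If the opponent is credibly committed to "steal", the responder's own
# payoff from the game itself is the same whichever ball is picked.
for mine in ("split", "steal"):
    print(mine, payoffs(mine, "steal", JACKPOT)[0])
```
&lt;/pre&gt;
&lt;p&gt;Against a committed stealer both rows pay nothing, so any chance that the stealer shares the money afterwards tips the choice toward &amp;quot;split&amp;quot;.&lt;/p&gt;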
&lt;p&gt;via Business Insider: &lt;a href="http://www.businessinsider.com/golden-balls-game-theory-2012-4"&gt;British Gameshow Contestant Puts On Badass Display Of Game Theory&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/dAVXsXUAq_Y" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/game-theory.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337371820042"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016305a2de4c970d">tag:google.com,2005:reader/item/df20fbe5c2315675</id><category term="big data" scheme="http://www.sixapart.com/ns/types#category" /><category term="predictive analytics" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><category term="Revolution" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">R is to SAS as Java is to COBOL</title><published>2012-05-18T19:37:17Z</published><updated>2012-05-18T19:37:17Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/pmO8MIFgyGI/r-is-to-sas-as-java-is-to-cobol.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/r-is-to-sas-as-java-is-to-cobol.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;An &lt;a href="http://www.b-eye-network.com/view/16071"&gt;interview with Revolution Analytics CEO Dave Rich&lt;/a&gt; was published this week by BeyeNetwork. 
During the interview, Dave was asked how statistical modeling platforms have changed over the decades:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;People have been doing statistical modeling and predictive analytics for 50 years now, SAS and SPSS have been around since the early ‘70s. What’s different now -- what’s making this move toward other statistical and “big data” areas?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;&lt;strong&gt;David Rich&lt;/strong&gt;: Well, I think obviously SAS and SPSS have been around, as you pointed out, for decades. We call that sort of the first generation of analytics and insight-driven solutions. In my perspective, having been in the business for more than three decades, it reminds me a bit of what COBOL was back in the day relative to business software. I see R as the more modern language. In this analogy, R would represent Java or C++. What happened in the middle of the nineties when the shift occurred is very similar to where we are now with R. Open source is a worldwide collaboration innovation. It’s a way to tap into that channel for research, and I think the role that Revolution Analytics can play – very similar to what Red Hat did back in the Linux days – is to be the conduit between the community and enterprise deployment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The conversation also touched on the future of big data analytics, the impact of advanced analytics on business, and the benefits of R and Revolution R Enterprise in reducing costs and expanding the scope of what is possible with big data analytics. For the complete interview, follow the link below.&lt;/p&gt;
&lt;p&gt;BeyeNetwork: &lt;a href="http://www.b-eye-network.com/view/16071"&gt;Advanced Analytics, Big Data and the Power of R: A Q&amp;amp;A Spotlight with David Rich, CEO of Revolution Analytics&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/pmO8MIFgyGI" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/r-is-to-sas-as-java-is-to-cobol.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337367823695"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b016305a2c2cf970d">tag:google.com,2005:reader/item/d2acc51bcca37322</id><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><category term="statistics" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">In Mexico, more marriages ending in divorce, and sooner</title><published>2012-05-18T19:03:13Z</published><updated>2012-05-18T19:03:13Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/_MeXXjLFdrU/in-mexico-more-marriages-ending-in-divorce-and-sooner.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/in-mexico-more-marriages-ending-in-divorce-and-sooner.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;R user Diego Valle analyzed divorce rates among Mexican marriages since 1993 (the earliest date for which data are available) and found that not only have more marriages ended in divorce over time, but marriages that do end are ending sooner:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168eb984d64970c-pi" style="display:inline"&gt;&lt;img alt="Mexico-divorces-by-length" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168eb984d64970c-800wi" title="Mexico-divorces-by-length"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This chart is a bit complicated, but it bears close inspection. Each line you see is a cohort of all of the marriages in a given year: 1993, 1994, all the way up to 2009. The vertical height of each line is proportional to the total number of divorces in each subsequent year within each cohort (expressed as a fraction of all marriages in the cohort year). Cleverly, the cohort lines are all arranged not by calendar time, but by years since marriage: the leftmost point represents divorces in the first year (relatively few), then divorces in the second year, and so on. &lt;/p&gt;
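&lt;p&gt;To make the arrangement concrete, here is a minimal Python sketch of the bookkeeping behind such a chart: divorces are re-indexed by years since marriage and divided by the size of the cohort. The counts and the function name are invented purely for illustration; they are not Diego&amp;#39;s data or code (his analysis is in R).&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical counts, invented for illustration only.
marriages = {1993: 10000, 1994: 10000}  # marriages registered per cohort year
divorces = {                            # (marriage year, divorce year): divorces
    (1993, 1994): 120, (1993, 1995): 200,
    (1994, 1995): 150, (1994, 1996): 260,
}


def divorce_fractions(divorces, marriages):
    """Map each cohort to {years since marriage: fraction of the cohort divorcing}."""
    result = {}
    for (married, divorced), count in sorted(divorces.items()):
        years_since = divorced - married
        result.setdefault(married, {})[years_since] = count / marriages[married]
    return result


# One line of the chart per cohort, indexed by years since marriage.
for cohort, line in divorce_fractions(divorces, marriages).items():
    print(cohort, line)
```
&lt;/pre&gt;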
&lt;p&gt;More residents of Mexico married in 1993 saw their 10th wedding anniversary than those married in 1998. Overall, the trend is clear: marriages that begin now are more likely to end in divorce than those from previous years, and they&amp;#39;re likely to end sooner as well. Although there&amp;#39;s not much historical data for recent marriages, the steady progression of divorce rates over time allows Diego to create a forecast (using a &lt;a href="http://inside-r.org/r-doc/nlme/lme"&gt;linear mixed-effects model&lt;/a&gt; in the &lt;a href="http://www.revolutionanalytics.com/what-is-open-source-r/"&gt;R language&lt;/a&gt;) of the outcomes of recent marriages. He predicts, for example, that 11% of marriages registered in 2007 will have ended in divorce by 2022. For comparison, that&amp;#39;s about the same rate as US marriages from the fifties.&lt;/p&gt;
&lt;p&gt;If you want to do a similar analysis, Diego provides R code in his post linked below, and at &lt;a href="https://github.com/diegovalle/Express-Divorce"&gt;his github&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Diego Valle-Jones: &lt;a href="http://blog.diegovalle.net/2012/05/proportion-of-marriages-ending-in.html"&gt;Proportion of marriages ending in divorce&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/_MeXXjLFdrU" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/in-mexico-more-marriages-ending-in-divorce-and-sooner.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337307894266"><id gr:original-id="tag:blogger.com,1999:blog-6569681.post-2911049147956696044">tag:google.com,2005:reader/item/315574d047297cf2</id><title type="html">The game Stick Portal</title><published>2012-05-18T02:09:00Z</published><updated>2012-05-25T19:21:52Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/Osi4p7SBmBs/game-stick-portal.html" type="text/html" /><link rel="replies" href="http://glinden.blogspot.com/feeds/2911049147956696044/comments/default" title="Post Comments" type="application/atom+xml" /><link rel="replies" href="http://www.blogger.com/comment.g?blogID=6569681&amp;postID=2911049147956696044" title="0 Comments" type="text/html" /><content xml:base="http://glinden.blogspot.com/" type="html">I want to share more of the ideas I've been exploring.  First, let me start with this, an early version of a game I'm calling Stick Portal.  Click on the image to play:

&lt;div style="clear:both;text-align:center;margin-top:8px"&gt;
&lt;a href="http://crunchzilla.com/stick-portal" style="margin-left:1em;margin-right:1em"&gt;&lt;img border="0" height="268" width="400" src="http://4.bp.blogspot.com/-R0531hhBVtE/T7WFvQ6P74I/AAAAAAAAXdA/i0r_3-GTsf8/s400/stick-portal-screen-small.png"&gt;&lt;/a&gt;&lt;/div&gt;

It's entirely written in &lt;a href="http://coffeescript.org/"&gt;CoffeeScript&lt;/a&gt; using HTML5 canvas.  You just need a browser to play, and it works pretty well on mobile devices (&lt;a href="http://lifehacker.com/5809338/add-web-site-bookmarks-to-your-iphones-homescreen"&gt;add it to your home screen&lt;/a&gt; and it'll even go full screen and behave like a free app).
&lt;br&gt;&lt;br&gt;
The idea is to create a simplified puzzle game with a level editor where kids could share levels they created.  The current version has ten levels that are the tutorials to teach players how to play the game.  I've just started on the level editor that will, eventually, allow people to create their own levels easily and share them with others.
&lt;br&gt;&lt;br&gt;
The motivation for this came from seeing what Valve did with Portal 2.  Portal 2 had a level editor called Hammer that was amazing but incredibly hard to use.  Kids were using Hammer to create puzzles for each other that they could play in Portal 2 -- which is great exposure to CAD-like modeling tools and a nice spatial reasoning workout -- but it was really painful.  Valve &lt;a href="http://www.thinkwithportals.com/blog.php?id=7926&amp;amp;p=1"&gt;just launched&lt;/a&gt; a much easier-to-use editor for Portal 2 that is truly fantastic, highly recommend it.
&lt;br&gt;&lt;br&gt;
Stick Portal is free to play, open source (MIT license), and the code is available &lt;a href="https://github.com/glinden/stick-portal"&gt;on GitHub&lt;/a&gt;.  The source might be useful to people working on similar games as it contains examples of ways to use the &lt;a href="http://code.google.com/p/box2dweb/"&gt;Box2Djs physics engine&lt;/a&gt;, handling touch and multi-touch (and accelerometer) on mobile devices, how to make your web page look like an app, plenty of examples of working with HTML5 Canvas, crazy things like a way to automatically resize the canvas when the browser window changes or a device rotates, and a lot of other goodies.  Won't claim it's the most beautiful code ever, but it is well commented and was fun to write.  I hope it is useful.
&lt;br&gt;&lt;br&gt;
I plan to keep working on this and extend it to include an editor, but I've been sitting on this long enough so, in the spirit of launch early and often, I'm putting it out now.  Please let me know what you think in the comments, and I'd love it if you'd drop me a note if your kids like the game or if the examples in the source turn out to be useful to you.
&lt;br&gt;&lt;br&gt;
&lt;b&gt;Update&lt;/b&gt;: A couple of people have told me they got stuck because they couldn't guess the controls in the tutorial. It's WASD or arrow keys for movement, and mouse button and mouse movement for the portal gun.  On mobile, hold down your finger to run toward your finger and hold down above you to jump, tap to aim and fire the portal gun, and use a second finger (multi-touch) to move the portal gun without firing (for example, to maneuver a held box).
&lt;br&gt;&lt;br&gt;
I also should have said more explicitly that one very cool thing is that the game doesn't use Flash, it's just HTML5.  So, it works on all modern browsers without a plug-in, which is neat-o.  Also interesting is that it is a fairly complicated HTML5 game running smoothly in the browser on PCs and mobile, almost looking like a native app, but not a native app.
&lt;br&gt;&lt;br&gt;
Finally, let me add that I did this game mostly to learn about making games fun.  That's a surprisingly hard thing to do.  If you're interested in that topic too, nothing like trying to do it yourself, but I'd also recommend the books "&lt;a href="http://www.amazon.com/A-Theory-Fun-Game-Design/dp/1932111972"&gt;A Theory of Fun Game Design&lt;/a&gt;" and "&lt;a href="http://www.amazon.com/The-Art-Game-Design-lenses/dp/0123694965"&gt;The Art of Game Design: A Book of Lenses&lt;/a&gt;".  And, if you find Stick Portal fun or don't find it fun, please let me know!&lt;div&gt;&lt;img width="1" height="1" src="https://blogger.googleusercontent.com/tracker/6569681-2911049147956696044?l=glinden.blogspot.com" alt=""&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/GeekingWithGreg/~4/94nH4XvGwxc" height="1" width="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/Osi4p7SBmBs" height="1" width="1"/&gt;</content><author><name>Greg Linden</name></author><source gr:stream-id="feed/http://glinden.blogspot.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://glinden.blogspot.com/atom.xml</id><title type="html">Geeking with Greg</title><link rel="alternate" href="http://glinden.blogspot.com/" type="text/html" /></source><feedburner:origLink>http://feedproxy.google.com/~r/GeekingWithGreg/~3/94nH4XvGwxc/game-stick-portal.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337289532686"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b0163059dedad970d">tag:google.com,2005:reader/item/e5c3e13321d5805d</id><category term="packages" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">Where&amp;#39;s Waldo? 
Image Analysis in R</title><published>2012-05-17T20:55:06Z</published><updated>2012-05-17T20:55:06Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/zwsKHy-iOEk/wheres-waldo-image-analysis-in-r.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/wheres-waldo-image-analysis-in-r.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;R user Arthur Charpentier attempts to use the &lt;a href="http://blog.revolutionanalytics.com/2011/07/paul-murrell-on-incorporating-images-in-r-charts.html"&gt;raster library&lt;/a&gt; and R functions to find Waldo in a &amp;quot;Where&amp;#39;s Waldo&amp;quot; image:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168eb93863d970c-pi" style="display:inline"&gt;&lt;img alt="Screen Shot 2012-05-17 at 1.47.51 PM" border="0" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b0168eb93863d970c-800wi" title="Screen Shot 2012-05-17 at 1.47.51 PM"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Sadly, it turned out that Waldo was a bit too tricky to spot using these techniques. But Arthur did have more success identifying the US flag in a shot from the Apollo mission, and identifying answers in the form for a multiple-choice test. All of the R code is provided at the link below, so that&amp;#39;s a great place to start if you&amp;#39;re looking to do some image analysis in R yourself.&lt;/p&gt;
&lt;p&gt;Freakonometrics: &lt;a href="http://freakonometrics.blog.free.fr/index.php?post/2012/04/18/foundwaldo"&gt;Finding Waldo, a flag on the moon and multiple choice tests, with R&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/zwsKHy-iOEk" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/wheres-waldo-image-analysis-in-r.html</feedburner:origLink></entry><entry gr:crawl-timestamp-msec="1337287489749"><id gr:original-id="tag:typepad.com,2003:post-6a010534b1db25970b0163059ddacf970d">tag:google.com,2005:reader/item/5e65244388052a45</id><category term="big data" scheme="http://www.sixapart.com/ns/types#category" /><category term="R" scheme="http://www.sixapart.com/ns/types#category" /><category term="Revolution" scheme="http://www.sixapart.com/ns/types#category" /><category term="Rmedia" scheme="http://www.sixapart.com/ns/types#category" /><title type="html">Orbitz: R has become the data-mining tool of choice</title><published>2012-05-17T20:30:17Z</published><updated>2012-05-17T21:30:37Z</updated><link rel="alternate" href="http://feedproxy.google.com/~r/data-miningblogposts/~3/Dk3DrNGBsNI/orbitz-r-has-become-the-data-mining-tool-of-choice.html" type="text/html" /><link rel="replies" href="http://blog.revolutionanalytics.com/2012/05/orbitz-r-has-become-the-data-mining-tool-of-choice.html" type="text/html" /><content xml:base="http://blog.revolutionanalytics.com/" xml:lang="en-US" type="html">&lt;div&gt;&lt;p&gt;Sameer Chopra, vice president of Advanced Analytics at Orbitz Worldwide, &lt;a 
href="http://www.analytics-magazine.org/may-june-2012/572-executive-edge-the-times-they-are-a-changin-for-advanced-analytics"&gt;wrote recently&lt;/a&gt; in &lt;em&gt;Analytics&lt;/em&gt; magazine about the changing landscape of processes, software and systems for statistical modelers. In a section on &amp;quot;Big Data and Open Source Analytics&amp;quot;, Chopra lays out the reasons why the R language &amp;quot;has become the data-mining tool of choice for machine learners&amp;quot;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;R has very good integration with Hadoop, an area where established commercial statistical tools have frankly been playing catch-up over the past year. (Note: At the time of this writing, some established statistical solution providers were announcing an access interface to Hadoop.)&lt;/li&gt;
&lt;li&gt;Many startups and smaller firms do not have deep pockets and are embracing open source tools such as the R programming language and NoSQL database systems such as MongoDB.&lt;/li&gt;
&lt;li&gt;R is a leading language for developing new statistical methods, and it is a platform for statistical innovation and collaboration across both the corporate world and academia. In my opinion, for the first time in years, the stronghold of established commercial players seems to be potentially threatened; open source tools are better suited for Big Data and will slowly but surely continue to take share away from commercialized statistical packages. In fact, traditional statistical vendors have recognized that R is a force to be reckoned with. In response, many of these vendors have developed hooks into R so users can interface with the R language.&lt;/li&gt;
&lt;li&gt;Based on the resumes I’ve been reading, the next generation of data miners is flocking to R as their go-to tool. Professors in general are comfortable with R; they tend to use R and Excel as part of their curriculum.&lt;/li&gt;
&lt;li&gt;In short, open-source analytics tools and platforms have arrived.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Chopra says that the usage of R in the commercial sector is growing &amp;quot;as firms such as Revolution Analytics focus on the enterprise capabilities for R&amp;quot; (for example, Revolution R Enterprise&amp;#39;s &lt;a href="http://www.revolutionanalytics.com/products/r-for-apache-hadoop.php"&gt;Hadoop support&lt;/a&gt; and &lt;a href="http://www.revolutionanalytics.com/products/enterprise-deployment.php"&gt;enterprise deployment&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Chopra also has some interesting perspectives on statistical modeling vs machine learning which you can find in the full article linked below.&lt;/p&gt;
&lt;p&gt;Analytics magazine: &lt;a href="http://www.analytics-magazine.org/may-june-2012/572-executive-edge-the-times-they-are-a-changin-for-advanced-analytics"&gt;The times they are a changin’ for advanced analytics&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/data-miningblogposts/~4/Dk3DrNGBsNI" height="1" width="1"/&gt;</content><author><name>David Smith</name></author><source gr:stream-id="feed/http://blog.revolution-computing.com/atom.xml"><id>tag:google.com,2005:reader/feed/http://blog.revolution-computing.com/atom.xml</id><title type="html">Revolutions</title><link rel="alternate" href="http://blog.revolutionanalytics.com/" type="text/html" /></source><feedburner:origLink>http://blog.revolutionanalytics.com/2012/05/orbitz-r-has-become-the-data-mining-tool-of-choice.html</feedburner:origLink></entry></feed>

