<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Jerome Cukier</title>
	
	<link>http://www.jeromecukier.net</link>
	<description>communicating with data</description>
	<lastBuildDate>Thu, 24 May 2012 13:03:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/JeromeCukier" /><feedburner:info uri="jeromecukier" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Making-of: the map of congress equality</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/MB4ECtiZBHM/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/24/making-of-the-map-of-congress-equality/#comments</comments>
		<pubDate>Thu, 24 May 2012 10:36:37 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[d3]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1297</guid>
		<description><![CDATA[To my datavis readers, sorry for that string of posts in French but what better data to visualize than political data, and what better time to visualize political data than election time, and what better audience for such visualizations than the folks who are asked to vote? Like last time, though, I am writing a [...]]]></description>
			<content:encoded><![CDATA[<p><div id="attachment_1290" class="wp-caption aligncenter" style="width: 620px"><a href="http://www.jeromecukier.net/projects/elections/parite.html"><img class="size-full wp-image-1290 " title="La parité, c'est maintenant" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/parite.png" alt="" width="610" height="572" /></a><p class="wp-caption-text">It&#39;s all about this map (click for interactive version)</p></div></p>
<p>To my datavis readers, sorry for that string of posts in French but what better data to visualize than political data, and what better time to visualize political data than election time, and what better audience for such visualizations than the folks who are asked to vote?</p>
<p><a href="http://www.jeromecukier.net/blog/2012/05/15/making-of-cutting-paris-in-voting-districts/">Like last time</a>, though, I am writing a follow-up technical post about how I dealt with the issues of this visualization.</p>
<p>So anyone who ever tried to make data visualizations knows that you can hardly start without <strong>data</strong>.</p>
<p>My ingredients for the recipe were:</p>
<p>2012 presidential election results <a href="http://www.data.gouv.fr/donnees/view/Pr%C3%A9sidentielle-2012%2C-r%C3%A9sultats-par-circonscriptions-l%C3%A9gislatives%2C-2%C3%A8me-tour-551768?xtmc=presidentielle+circonscription&amp;xtcr=4">by circonscription</a>, plus those <a href="http://www.data.gouv.fr/donnees/view/R%C3%A9sultats-par-circonscription-%C3%A9lection-pr%C3%A9sidentielle-2007---Tours-1-et-2-30379819?xtmc=presidentielle+circonscription&amp;xtcr=1">of 2007</a>.</p>
<p>Results of the <a href="http://www.data.gouv.fr/donnees/view/R%C3%A9sultats-par-circonscription-%C3%A9lections-l%C3%A9gislatives-2007---Tour-1-383529">previous</a> <a href="http://www.data.gouv.fr/donnees/view/R%C3%A9sultats-par-circonscription-%C3%A9lections-l%C3%A9gislatives-2007---Tour-2-383531">congressional</a> election. These came as 2 files, one per round, rather than a flat file of the sitting <em>députés</em> (I didn&#8217;t find one that didn&#8217;t require significant editing to be of use). Most importantly, I needed their political orientation, which required some tweaking.</p>
<p>Matching tables between <a href="http://www.data.gouv.fr/donnees/view/Table-de-correspondance-des-communes-et-des-cantons-avec-les-circonscriptions-l%C3%A9gislat-551418?xtmc=communes+circonscription&amp;xtcr=1">circonscriptions and cities</a>.  From a previous project, presidential election data at the city level. Also, <a href="http://professionnels.ign.fr/ficheProduitCMS.do?idDoc=5323862">geo coordinates of the cities</a>.</p>
<p>The most painful data to extract was the list of candidates. In all fairness, the UMP made it easier than the PS, as they had them all on <a href="http://www.u-m-p.org/actualites/a-la-une/candidats-investis-ou-soutenus-par-le-conseil-national-de-lump-du-4982102">a page</a>. The PS had a Google Fusion Table which had <a href="https://www.google.com/fusiontables/DataSource?docid=14Nu51pqxD4PmZd8FlcSqQNruHR1yfZgg0QJwe_A">this</a> as a data source. That file required a lot of massaging. Eventually, local pages of the PS site listed the candidates missing from the map (or provided alternate names). When it was up, I also used the <a href="http://www.elections-legislatives.fr/">http://www.elections-legislatives.fr/</a> site to check for the missing names.</p>
<p>Finally, I figured out the genders of all the candidates by extracting their first name and looking up all the ones I wasn&#8217;t sure about (there are quite a few unisex first names in French).</p>
<p>Now <strong>calculations</strong>.</p>
<p>There is a pretty strong statistical link between a party&#8217;s score in an election in a given territory and the chances that a congressional candidate from the same party wins the district.</p>
<p>Predicting these chances is a well-known problem called <em>classification</em>, for which the textbook method is <em>logistic regression</em>.</p>
<p>All I needed were the 575 districts for which I had results. For each one, we pair the score of a party in the 2nd round of the presidential election with whether the corresponding congressional candidate got elected (1 or 0). That gives us <a href="http://www.jeromecukier.net/projects/elections/deputes.txt">1150 pairs of values</a> (two per district), which we throw in the mathematical cooking pot.</p>
<p>And what we get is the following formula:</p>
<p><a href="http://www.jeromecukier.net/wp-content/uploads/2012/05/equation.png"><img class="aligncenter size-full wp-image-1305" title="equation" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/equation.png" alt="" width="297" height="66" /></a></p>
<p>where x is the score in the previous election (between 0 and 1). As you can see, when x gets close to 0, the denominator becomes a very large number and the probability quickly drops to virtually nothing; conversely, when x gets close to 1, the denominator becomes very close to 1, so the probability rises to 1 equally fast.</p>
<p><iframe style="border: none;" src="http://www.jeromecukier.net/stuff/littlesigmoid.html" scrolling="no" width="502px" height="270px"></iframe></p>
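<p>As a rough illustration of the fitting step (this is not the code used for the map), here is a minimal Python sketch of logistic regression by gradient descent. The sample pairs, the learning rate and the number of steps are made-up placeholders; the real input would be the 1150 pairs linked above, and the resulting coefficients would be different.</p>

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(pairs, learning_rate=0.5, steps=5000):
    """Fit p(x) = sigmoid(a * x + b) by plain gradient descent on the log-loss."""
    a, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in pairs:
            error = sigmoid(a * x + b) - y  # predicted probability minus outcome
            grad_a += error * x
            grad_b += error
        a -= learning_rate * grad_a / n
        b -= learning_rate * grad_b / n
    return a, b


# Made-up (score, elected) pairs, for illustration only.
SAMPLE_PAIRS = [
    (0.35, 0), (0.42, 0), (0.46, 0), (0.49, 1),
    (0.51, 0), (0.54, 1), (0.58, 1), (0.65, 1),
]

a, b = fit_logistic(SAMPLE_PAIRS)


def win_probability(score):
    """Estimated chance of winning the district for a given score (0 to 1)."""
    return sigmoid(a * score + b)


print(round(win_probability(0.45), 2), round(win_probability(0.55), 2))
```

<p>In practice a statistics package would do the fitting; the point of the sketch is only that the model boils down to two numbers, a slope and an intercept, plugged into the S-shaped curve.</p>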
<p>With this and that in place, it is possible to come up with a <a href="http://www.jeromecukier.net/projects/elections/pariteFM.csv">reasonable estimate</a> of the chances of any candidate based on the recent results. As an aside, the current Prime Minister has renewed the tradition started by his predecessor of asking ministers to run for office and forcing them to step down if they fail to win their district. As a result, 24 out of the 37 ministers are campaigning. Out of those 24, 2 are taking very serious risks according to this model: Marie-Arlette Carlotti and Benoît Hamon.</p>
<p>Finally, <strong>geography</strong>.</p>
<p>In an ideal world, there would be an abundance of GeoJSON files describing France and its many administrative entities. Usable data must exist somewhere, because the maps on <a href="view-source:http://www.elections-legislatives.fr/candidats.asp">www.elections-legislatives.fr</a> have all been generated (by Raphael.js, according to the source code). If I&#8217;m doing another project on these elections I might reverse engineer the shapes of the maps to extract the coordinates.</p>
<p>Without a dataset, drawing the boundaries of 577 districts is a huge amount of work. However, accuracy is not required, as I&#8217;m only putting the districts on a map so people can look up where they live or places they know. <a href="http://www.jeromecukier.net/blog/2012/05/15/making-of-cutt…ting-districts/">In my previous work</a>, in order to let users change the composition of the districts, I wanted to be rigorous in the placement of everything, but here we can live with imperfection.</p>
<p>So I am using the same <a href="http://www.jeromecukier.net/blog/2012/05/15/making-of-cutt…ting-districts/">principle as I did</a> then: Voronoi tessellation.</p>
<p>For each district I am picking the largest city, for which I have the coordinates. But most large cities belong to several districts, so several districts would end up on the exact same point. So I am adding random noise to each point. Then, I am drawing shapes around them.</p>
<p><iframe style="border: none;" src="http://www.jeromecukier.net/stuff/littlevoronoi.html" scrolling="no" width="502px" height="270px"></iframe></p>
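<p>The jittering step can be sketched in a few lines of Python. This is only an illustration under assumptions: the coordinates are approximate, the noise scale and the seed are arbitrary choices of mine, and the actual map was drawn with d3.</p>

```python
import random


def jitter(points, scale=0.05, seed=42):
    """Add a small random offset to every (x, y) point so that no two coincide."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    return [
        (x + rng.uniform(-scale, scale), y + rng.uniform(-scale, scale))
        for x, y in points
    ]


# Approximate (longitude, latitude) of the largest city of four districts.
# Three of them share the same city, so their points start out identical.
CITY_POINTS = [
    (5.37, 43.30),
    (5.37, 43.30),
    (5.37, 43.30),
    (4.84, 45.76),
]

JITTERED = jitter(CITY_POINTS)
print(len(set(CITY_POINTS)), "distinct points before,", len(set(JITTERED)), "after")
```

<p>The scale has to stay small compared to the distance between cities, otherwise a district could drift into its neighbor&#8217;s area.</p>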
<p>That would normally fill a rectangle, so in order to make it look like France, I have drawn a <a href="http://commons.oreilly.com/wiki/index.php/SVG_Essentials/Clipping_and_Masking">clipping mask</a> on top of it (that, I&#8217;ve done by hand, picking coordinates of the outline of France).</p>
<p>That about wraps it up!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/24/making-of-the-map-of-congress-equality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/24/making-of-the-map-of-congress-equality/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=making-of-the-map-of-congress-equality</feedburner:origLink></item>
		<item>
		<title>Gender parity: is it now?</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/vURHzcb8l3k/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/23/la-parite-cest-maintenant/#comments</comments>
		<pubDate>Wed, 23 May 2012 15:56:51 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[d3]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[français]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1289</guid>
		<description><![CDATA[I got the idea for this map when I got hold of the presidential election results by legislative district. In 2007, the second-round results were a very good predictor of the legislative elections that followed: a district where a presidential candidate in the second [...]]]></description>
			<content:encoded><![CDATA[<p><div id="attachment_1290" class="wp-caption aligncenter" style="width: 620px"><a href="http://www.jeromecukier.net/projects/elections/parite.html"><img class="size-full wp-image-1290" title="La parité, c'est maintenant" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/parite.png" alt="" width="610" height="572" /></a><p class="wp-caption-text">Click the image to open the interactive application</p></div></p>
<p>I got the idea for this map when I got hold of the presidential election results by legislative district. In 2007, the second-round results were a very good predictor of the legislative elections that followed: a district where a presidential candidate scored as little as 52% in the second round has more than a 75% chance of being won by the député from the same party.<br />
And even then, everyone agreed that in 2007 the left had run a very good campaign and had <a href="http://www.rfi.fr/actufr/articles/090/article_52924.asp">stemmed the blue wave</a>. The probabilities are surely even higher.</p>
<p>Yet almost 80% of districts were won with a score above 52%, which is therefore a very comfortable margin. In more than a quarter of them, the winner even collected more than 60% of the votes&#8230;</p>
<p>In short, in most districts there won&#8217;t be much suspense. Where I vote, we don&#8217;t often get a second round.</p>
<p>As with<a href="http://www.jeromecukier.net/blog/2012/05/15/le-decoupage-de-paris-en-circonscriptions/"> the redistricting</a>, I find this a little disturbing. An election is not so much the meeting of a person and a population that chooses them as the work of a party placing its pawns, especially once you add the &#8220;electoral agreements&#8221;. I think I am less likely to meet the <em>challenger</em> candidate in my district than to see François Hollande or Nicolas Sarkozy &#8220;in the flesh&#8221;.</p>
<p>So why not take the opportunity to move closer to gender parity in the Assemblée?</p>
<p>Why not, to begin with: a party that does not field as many women as men gets fined. Or rather, as <a href="http://www.lemonde.fr/politique/article/2012/05/22/les-manquements-a-la-parite-coutent-cher-aux-partis_1705246_823448.html">Alexandre Léchenet</a> explains well, it loses funding. But the calculation method is flawed. A party receives a certain sum per vote at the national level, and that sum is reduced if women make up less than half of its candidates. To keep parties from gaming the system, it would have been wiser to base it on the share of votes <em>won</em> by women, not on the number of women fielded at the start.</p>
<p>So women are sent to the slaughter: they are put in unwinnable districts, just to avoid the fine. Almost 100 women candidates are facing the representative of a party that scored more than 55% in the second round.</p>
<p>In Paris, for example, Annie Novelli is challenging Claude Goasguen in the 14th district (which voted 77% for Sarkozy) and Agnès Pannier is up against Bernard Debré in the 4th (75% for Sarkozy). Meanwhile, Roxane Decorte is taking on Daniel Vaillant in the 17th district (which voted 72% for Hollande).</p>
<p>And even then, in theory they could win; but in 238 districts, more than 40% of the total, neither the PS nor the UMP has fielded a woman, so there is no risk at all.</p>
<p>As a result, even though 40% of the candidates are women, the number of women elected should be around 175, or 28%. That would still be almost 70 more than today, despite the cynicism of the current UMP boss. Let&#8217;s hope they will be able to come to the Assemblée<a href="http://www.lemonde.fr/politique/article/2011/06/07/l-assemblee-se-defend-d-etre-un-bastion-du-sexisme_1533148_823448.html"> dressed as they please</a>.</p>
<p><a href="http://www.insee.fr/fr/themes/tableau.asp?reg_id=0&amp;ref_id=NATnon02145">Believe it or not</a>, women reportedly make up 51.5% of the population of France.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/23/la-parite-cest-maintenant/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/23/la-parite-cest-maintenant/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=la-parite-cest-maintenant</feedburner:origLink></item>
		<item>
		<title>Making-of: cutting Paris in voting districts</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/qe3vq50Q2Ao/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/15/making-of-cutting-paris-in-voting-districts/#comments</comments>
		<pubDate>Tue, 15 May 2012 13:33:44 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[d3]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1277</guid>
		<description><![CDATA[Hi, in my previous post I showcased one of my recent projects. I really enjoyed building it and so would like to share how this has been done. First, getting the data. I already scraped the results of both rounds of the presidential election by city. The districts for the congress election are also known, [...]]]></description>
			<content:encoded><![CDATA[<p>Hi, in my previous post I showcased <a href="http://www.jeromecukier.net/projects/elections/circonscriptions.html#">one of my recent projects</a>. I really enjoyed building it, so I would like to share how it was done.</p>
<p>First, getting the data. I had already scraped the results of both rounds of the presidential election by city. The districts for the congress election are also known, but it&#8217;s not possible to match the two, because large cities are almost systematically broken down into several such districts. Paris, for instance, will be represented by no fewer than 18 <em>députés</em>.</p>
<p>So I needed the results by the finest possible unit, that is by individual polling station. During the election night these results are compiled by city and centralized, so you would assume that the raw data of each polling station is available somewhere. That is not the case, unfortunately. Although it seems that they will be made public eventually, that may not be the case before the June 2012 election.</p>
<p>Fortunately, <a href="http://opendata.paris.fr/opendata/jsp/site/Portal.jsp?document_id=133&amp;portlet_id=102">Open Data Paris</a> had the results by polling station. Better still, it had their addresses and a table matching every inhabited building in Paris to its polling station.</p>
<p>To map the polling stations, my first intuition was to create a <a href="http://en.wikipedia.org/wiki/Voronoi_diagram">Voronoi tessellation</a> of their projected, geocoded coordinates (I only had their addresses in the raw data file). In short, Voronoi polygons are generated from a set of control points, and each polygon is the area nearer to its control point than to any other. So it&#8217;s a good approximation of the area that corresponds to a given polling station.</p>
<p><iframe style="border:none;" src="http://www.jeromecukier.net/stuff/littlevoronoi.html" scrolling="no" width="502px" height="270px"></iframe></p>
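<p>The definition above translates directly into code: a location belongs to the cell of whichever control point is closest. The Python sketch below only illustrates that definition (d3 computes the actual polygons); the station names and coordinates are hypothetical.</p>

```python
import math

# Hypothetical control points: polling station name -> projected (x, y) position.
STATIONS = {
    "station A": (0.0, 0.0),
    "station B": (4.0, 0.0),
    "station C": (2.0, 3.0),
}


def nearest_station(point, stations):
    """Return the name of the control point closest to `point`.

    Every location for which this returns the same name lies in the same
    Voronoi polygon.
    """
    return min(stations, key=lambda name: math.dist(point, stations[name]))


print(nearest_station((1.0, 0.5), STATIONS))
print(nearest_station((2.0, 2.0), STATIONS))
```

<p>This also shows why the control points must be distinct: two stations at the exact same position would tie for every location, and neither would get an area of its own.</p>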
<p>Problem: several polling stations could be at the same address, and for the Voronoi algorithm the control points have to be distinct. So I tried jittering them (adding random noise to each one). That produced a tessellation that kind of looked like Paris, but the voting districts looked messy as there were frequent inversions between neighboring districts.</p>
<p>So I had to come up with a better approximation of what part of the city corresponded to what voting district. I used the address-to-polling-station correspondence, and for each polling station I took the first and the last street number of every street that it covered. Then I geocoded the whole lot. That&#8217;s about 16000 points. It took some time.</p>
<p><div id="attachment_1280" class="wp-caption aligncenter" style="width: 616px"><a href="http://www.jeromecukier.net/wp-content/uploads/2012/05/centroids.png"><img class=" wp-image-1280" title="centroids" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/centroids.png" alt="" width="606" height="478" /></a><p class="wp-caption-text">Here&#39;s my polling station as an example.</p></div></p>
<p>Then, for each polling station area, I took the minimum and maximum longitude and latitude, which formed a bounding box, and assigned the polling station to the center of that box. Then, I used tessellation again.</p>
<p>I found a number of oddities in the geocoding that I had to correct manually, because if one address was not accurately coded, chances are it would change the shape of the bounding box drastically and so the position of its corresponding polling station. Sometimes the geocoding service wouldn&#8217;t find the street and/or would use a street of the same name in another city, sometimes it did find the street but the coordinates were way off&#8230; So the dataset required a lot of massaging before it got into shape.</p>
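<p>The bounding-box step, and the reason a single bad address matters so much, can be sketched as follows. The coordinates are invented for the example; they are not taken from the Paris dataset.</p>

```python
def bounding_box_center(points):
    """Center of the smallest axis-aligned box containing all (lon, lat) points."""
    lons = [lon for lon, lat in points]
    lats = [lat for lon, lat in points]
    return ((min(lons) + max(lons)) / 2, (min(lats) + max(lats)) / 2)


# Invented geocoded addresses covered by one polling station.
ADDRESSES = [(2.340, 48.860), (2.344, 48.862), (2.350, 48.858)]

# The same list plus one address that the geocoder placed in another city.
WITH_OUTLIER = ADDRESSES + [(2.900, 48.860)]

print(bounding_box_center(ADDRESSES))
print(bounding_box_center(WITH_OUTLIER))
```

<p>Because the box only depends on the extreme values, one stray point is enough to drag the center far away from the real polling station, which is why the outliers had to be fixed by hand.</p>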
<p>The last geographic errand I had to do for this visualization was to create a perimeter of Paris to use as a clipping mask; otherwise the tessellation would fill a rectangle, with the edge polygons being very large and very skewed. So I collected coordinates of points around Paris to create one polygon. Only what&#8217;s inside of this polygon is shown (.style(&#8220;clip-path&#8221;) in d3).</p>
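<p>In the browser the clipping is done by SVG itself, so no code of this kind was needed for the project. Still, to make the idea of &#8220;only what&#8217;s inside the polygon is shown&#8221; concrete, here is a small Python sketch of a point-in-polygon test using ray casting. The square outline is a stand-in for the hand-picked perimeter.</p>

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon`.

    `polygon` is a list of (x, y) vertices in order. A horizontal ray is cast
    from the point towards +x; an odd number of edge crossings means inside.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside


# Stand-in outline; the real mask would be the list of points picked around Paris.
OUTLINE = [(0, 0), (4, 0), (4, 4), (0, 4)]

print(point_in_polygon((2, 2), OUTLINE))
print(point_in_polygon((5, 2), OUTLINE))
```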
<p>Once the data had been acquired, building the rest of the datavis was nothing special. I made extensive use of mouseover and click events to trigger transformations, as I always do, although this time I did prepare a lot of rules.</p>
<p>Originally I wanted to cover the whole of France like this, but that will be difficult: first to get the data, and then to get it into shape. As of today the location (i.e. street address) of most of the polling stations is not available online, so even if we got the number of votes for each of the polling stations (there should be about 40000 of them) the geographic part of the problem would remain unsolved. Still, it&#8217;s a worthy endeavour. While election results are of little interest at a macro-geographic level (by region or by <em>département</em>), they are very useful at a very fine level, where strategies can be built on them.</p>
<p>For instance, it&#8217;s worthwhile to send heavyweights to conquer districts that are winnable, but it&#8217;s a waste to keep them in their respective fiefdoms if victory in these districts is already certain. Also, when districts have to be redrawn, this kind of information can be invaluable to the political force which gets to draw the new boundaries, or to their opponents.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/15/making-of-cutting-paris-in-voting-districts/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/15/making-of-cutting-paris-in-voting-districts/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=making-of-cutting-paris-in-voting-districts</feedburner:origLink></item>
		<item>
		<title>Carving Paris into electoral districts</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/fQ7tbNmy9hI/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/15/le-decoupage-de-paris-en-circonscriptions/#comments</comments>
		<pubDate>Tue, 15 May 2012 13:33:18 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[data visualization]]></category>
		<category><![CDATA[français]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1273</guid>
		<description><![CDATA[My latest project lets you view the results of the presidential election in Paris by polling station and project them onto the districts that will be used in the June 2012 legislative elections. Above all, it lets you change the composition of these districts, whose current boundaries are fairly arbitrary. There will be 18 districts in [...]]]></description>
			<content:encoded><![CDATA[<p>My <a href="http://www.jeromecukier.net/projects/elections/circonscriptions.html">latest project</a> lets you view the results of the presidential election in Paris by polling station and project them onto the districts that will be used in the June 2012 legislative elections.</p>
<p><a href="http://www.jeromecukier.net/projects/elections/circonscriptions.html"><img class="aligncenter size-full wp-image-1274" title="votes" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/votes.jpg" alt="" width="556" height="385" /></a>Above all, it lets you change the composition of these districts, whose current boundaries are fairly arbitrary. There will be 18 districts in Paris, down from 21 today, and they do not follow the arrondissements.</p>
<p>The way these districts are drawn is decisive for the election results. Today, for example, there are two districts where Nicolas Sarkozy won more than 75% of the votes in the 2nd round of the presidential election; I imagine the left does not hold out much hope of winning them back. Likewise, there are no fewer than 9 districts where François Hollande received more than 60% of the votes. As things stand, 12 districts seem secure for the left and 6 for the right, 3 of which the left might nevertheless win.</p>
<p>The current boundaries are optimal neither for the left nor for the right. By redrawing the districts, the left could win all of them, and the right could win 12 out of 18 (or maybe more, 12 being my personal <em>high score</em>). To favor one side, the idea is to spread the most favorable polling stations across as many districts as possible, rather than keeping them in a few. Generalize that to the whole country and you can imagine the result!</p>
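<p>A toy example makes the mechanism visible. In the Python sketch below, every number is invented: six polling stations of 100 voters each, three of them very favorable to one side, grouped into three districts in two different ways.</p>

```python
def districts_won(districts, voters_per_station=100):
    """Count the districts where the party gets more than half of the votes.

    Each district is a list holding the party's votes in each of its stations.
    """
    won = 0
    for stations in districts:
        votes = sum(stations)
        total = voters_per_station * len(stations)
        if votes * 2 > total:
            won += 1
    return won


# Same six stations (three strongholds at 80 votes, three weak ones at 30).
PACKED = [[80, 80], [80, 30], [30, 30]]  # strongholds concentrated
SPREAD = [[80, 30], [80, 30], [80, 30]]  # one stronghold per district

print(districts_won(PACKED), districts_won(SPREAD))
```

<p>The votes are identical in both cases; only the grouping changes, and with it the number of seats.</p>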
<p>So, whatever the current mood, one redistricting or another can completely reshuffle the deck. It is a disturbing thought, because these redistrictings happen regularly and are relatively opaque. Besides, it is fairly hard to link presidential election data to legislative districts, because results are only rarely available by polling station.</p>
<p>I will give the technical details of the implementation in a future post.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/15/le-decoupage-de-paris-en-circonscriptions/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/15/le-decoupage-de-paris-en-circonscriptions/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=le-decoupage-de-paris-en-circonscriptions</feedburner:origLink></item>
		<item>
		<title>Exploring the presidential election results with parallel coordinates</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/B1xrWHiOEWw/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/07/a-la-decouverte-des-resultats-des-presidentielles-avec-les-coordonnees-paralleles/#comments</comments>
		<pubDate>Mon, 07 May 2012 17:20:39 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[d3]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[français]]></category>
		<category><![CDATA[coordonnées parallèles]]></category>
		<category><![CDATA[élections]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1264</guid>
		<description><![CDATA[A presidential election means results. And results mean visual representations in the media. And that often means a map. That said, a map does not tell us much about what really happened in an election. It is useful for finding your region or your city, but the results of [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://www.jeromecukier.net/projects/elections/dtour2012.html"><img class="aligncenter size-full wp-image-1265" title="front" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/front.jpg" alt="" width="700" height="374" /></a></p>
<p>A presidential election means results.</p>
<p>And results mean visual representations in the media.</p>
<p>And that often means a map.</p>
<p>That said, a map does not tell us much about what really happened in an election. It is useful for finding your region or your city, but the results of two geographically close cities (for example, Boulogne-Billancourt and Issy-les-Moulineaux) can have absolutely nothing in common.</p>
<p>What is more, with a map it is very hard to answer certain questions, such as: where did people vote most for Nicolas Sarkozy? What happened in the cities that voted heavily for Marine Le Pen or François Bayrou? Where does the surge in blank votes in the second round come from?</p>
<p>To answer these questions, we can use <a href="http://www.jeromecukier.net/projects/elections/dtour2012.html">parallel coordinates</a>.</p>
<p>Each vertical axis corresponds to a possible vote: first those of the first round, then those of the second. Each colored line corresponds to a city (or a département, a region, or France as a whole). Each line crosses each axis at a height that corresponds to the share of people who made that choice. For example, cities that voted heavily for Jean-Luc Mélenchon, such as Bagnolet or Gennevilliers, end up near the top of the middle axis.</p>
<p>When it nears an axis, the cursor turns into a crosshair. Just drag it along the axis to draw a rectangle and highlight all the lines that pass through it. For example, here are the cities that gave more than 70% of their second-round votes to François Hollande:</p>
<p><a href="http://jeromecukier.net/projects/elections/dtour2012.html"><img class="aligncenter size-full wp-image-1266" title="hollande70" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/hollande70.jpg" alt="" width="700" height="500" /></a></p>
<p>You can draw as many rectangles as you like. For example, you can keep only the places that had supported Bayrou (say, at more than 10%).</p>
<p><a href="http://jeromecukier.net/projects/elections/dtour2012.html"><img class="aligncenter size-full wp-image-1267" title="hollande70bayrou10" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/hollande70bayrou10.jpg" alt="" width="700" height="500" /></a></p>
<p>And there you have it: only 2 are left.</p>
<p>To remove a selection, just click near the axis without dragging. If you click on a rectangle, you can also slide it along the axis.</p>
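<p>Under the hood, brushing amounts to a filter: a line stays highlighted only if it passes through every rectangle. Here is a minimal Python sketch of that logic. The city names, axis names and shares are all made up for the example, and the real application does this in d3.</p>

```python
def brush(rows, selections):
    """Keep the rows whose value on every brushed axis falls inside its range.

    `selections` maps an axis name to a (low, high) pair, one per rectangle.
    """
    return [
        row for row in rows
        if all(low <= row[axis] <= high for axis, (low, high) in selections.items())
    ]


# Made-up shares of the vote (hypothetical cities and axis names).
CITIES = [
    {"name": "city A", "hollande_round2": 0.74, "bayrou_round1": 0.06},
    {"name": "city B", "hollande_round2": 0.72, "bayrou_round1": 0.11},
    {"name": "city C", "hollande_round2": 0.55, "bayrou_round1": 0.12},
    {"name": "city D", "hollande_round2": 0.71, "bayrou_round1": 0.04},
]

one_brush = brush(CITIES, {"hollande_round2": (0.70, 1.0)})
two_brushes = brush(CITIES, {"hollande_round2": (0.70, 1.0), "bayrou_round1": (0.10, 1.0)})

print([row["name"] for row in one_brush])
print([row["name"] for row in two_brushes])
```

<p>Removing a rectangle is then just a matter of dropping its entry from the selections and filtering again.</p>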
<p>With this technique, you can immediately see the strongholds of any candidate: they are the lines that touch the top of the chart.</p>
<p>More complex questions can be answered too. For example, I mentioned the surge in blank votes in the second round. What happened?</p>
<p>Let&#8217;s select the places where blank votes exceeded 4% in the second round while staying below 1.5% in the first round:</p>
<p><a href="http://www.jeromecukier.net/projects/elections/dtour2012.html"><img class="aligncenter size-full wp-image-1268" title="blanc" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/blanc.jpg" alt="" width="700" height="370" /></a></p>
<p>Three peaks stand out: the cities where Marine Le Pen, Jean-Luc Mélenchon or François Bayrou scored very high. By selecting further, you can check that it is indeed the cities at the tip of one of the three triangles, and not those passing through the bottom, that fall into this case. These voters preferred to cast a blank vote rather than choose.</p>
<p>Happy exploring!</p>
<p><a href="http://www.jeromecukier.net/projects/elections/dtour2012.html">http://www.jeromecukier.net/projects/elections/dtour2012.html</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/07/a-la-decouverte-des-resultats-des-presidentielles-avec-les-coordonnees-paralleles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/07/a-la-decouverte-des-resultats-des-presidentielles-avec-les-coordonnees-paralleles/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=a-la-decouverte-des-resultats-des-presidentielles-avec-les-coordonnees-paralleles</feedburner:origLink></item>
		<item>
		<title>See#7 conference</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/kY1-m3h7vBQ/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/07/see7-conference/#comments</comments>
		<pubDate>Mon, 07 May 2012 16:43:12 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[conferences]]></category>
		<category><![CDATA[datavis]]></category>
		<category><![CDATA[see conference]]></category>
		<category><![CDATA[see+]]></category>
		<category><![CDATA[wiesbaden]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1255</guid>
		<description><![CDATA[Last week, I had the privilege to attend the See#7 conference in Wiesbaden. I wrote a quick post summarizing the immediate feelings I had on my way back so here&#8217;s a more detailed follow-up. Three things to know about the conference. It&#8217;s one of the main events on visualization in Europe. The main event takes [...]]]></description>
			<content:encoded><![CDATA[<p>Last week, I had the privilege of attending the <a href="http://see-conference.com/">See#7 conference</a> in Wiesbaden. I wrote a <a href="http://wp.me/po630-ka">quick post</a> summarizing my immediate impressions on the way back, so here&#8217;s a more detailed follow-up.</p>
<p>Three things to know about the conference. It&#8217;s one of the main events on visualization in Europe: the main event takes place in a church which seats hundreds.</p>
<p><a href="http://farm8.staticflickr.com/7211/7135484013_0d34768b01_z.jpg"><img class="aligncenter" src="http://farm8.staticflickr.com/7211/7135484013_0d34768b01_z.jpg" alt="" width="640" height="427" /></a></p>
<p>Also, the conference is not called <em>die Konferenz zur Visualisierung von Information</em> for no reason. It&#8217;s mostly German-centric.</p>
<p>Finally, the conference&#8217;s approach to talking about visualization is to take a step back by inviting experts who do not work in information visualization proper but bring an interesting perspective to the field from their own point of view.</p>
<p>Videos from all talks will be made available shortly.</p>
<p>The first speaker was Dr Thomas Henningsen from Greenpeace, whose talk was about how the organization uses visual impact to advance its agenda. Greenpeace indeed goes to considerable lengths to take the one picture that shows that what they are fighting is not an abstract possibility but something very tangible. Sometimes they create the picture, as in this example,</p>
<p><a href="http://www.faz.net/polopoly_fs/1.929590!/image/2064909508.jpg_gen/derivatives/default/2064909508.jpg"><img class="aligncenter" src="http://www.faz.net/polopoly_fs/1.929590!/image/2064909508.jpg_gen/derivatives/default/2064909508.jpg" alt="&quot;If the planet were a bank, you would have saved it long ago&quot;" width="552" height="357" /></a></p>
<p>sometimes they capture a certain moment, but they also use charts and data to make their point.</p>
<p>Next was Prof. Dr. Norbert Bolz. According to many of the native German speakers I heard from, this was the highlight of the day. As my German is somewhat rusty and as he didn&#8217;t use visual aids, I admit I missed a lot of it. Prof. Bolz is presented on the See conference site as a media scholar, and many of his books were on display at the conference. His talk was on how one can inform and be memorable through images. So here are the parts I grasped. To be remarkable, a piece of information has to be new: it has to be about something that the reader didn&#8217;t know. This is very different from being important. A lot of the talk was also about the memorable quality of images (Prägnanz), the idea that images can&#8217;t be cancelled.</p>
<p>The next speaker was <a href="http://itsbeenreal.co.uk/">Stefanie Posavec</a>, who describes herself as a data illustrator. Like most people, you may think of data visualization as a supervised but automated process by which a computer generates an image based on data and a set of rules. Enter the data illustrator, who works entirely by hand.</p>
<p>Perhaps Stefanie&#8217;s best-known work is the representation she&#8217;s done of Kerouac&#8217;s <em>On the Road</em>:</p>
<p><a href="http://www.itsbeenreal.co.uk/files/lrg-literary-organism-poste.jpg"><img class="aligncenter" src="http://www.itsbeenreal.co.uk/files/lrg-literary-organism-poste.jpg" alt="" width="600" height="849" /></a></p>
<p>which most people assume is a fine example of generative art. Wrong: every. single. element. is. placed. by. hand.</p>
<p>Stefanie took us through her project and shared how she works, how she collects data and encodes it (she did bring a computer).</p>
<p><a href="http://www.jeromecukier.net/wp-content/uploads/2012/05/stefanie1.jpg"><img class="size-full wp-image-1258 aligncenter" title="stefanie" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/stefanie1.jpg" alt="" width="600" height="450" /></a></p>
<p>Here&#8217;s a picture I took at the workshop the next day along with some of her sketches.</p>
<p>Stefanie always seems to be apologetic that she does not write code, which is ironic considering that this approach is what makes her work unique. During the workshop, she took us through specifications that she had written for a developer to create an interactive visualization, and they were insanely detailed. Everybody who writes code would be really thrilled to be able to rely on such a structured document!</p>
<p>The next speaker was Ben Kreukniet from <a href="http://www.uva.co.uk/">United Visual Artists</a>. Now UVA may not be a household name, but how about Red Hot Chili Peppers or U2? Remember how everyone was talking about the gigantic stage structure on U2&#8217;s last tour, which AFAIK was the highest-grossing concert tour ever? That was UVA&#8217;s work. They are lighting artists who specialize in large installations, and by large they mean friggin&#8217; epic.</p>
<p>Ben&#8217;s talk took us through their work, with a focus on their &#8220;origin&#8221; project:</p>
<p><a href="http://www.uva.co.uk/wp-content/gallery/origin/uva_origin_1625.jpg"><img class="aligncenter" src="http://www.uva.co.uk/wp-content/gallery/origin/uva_origin_1625.jpg" alt="" width="814" height="543" /></a></p>
<p>Origin is a gigantic cube of light and sound which is made to interact with an audience. During the workshop the next day, UVA showed us the tools they use to work, which revolve around a platform they call d3 (though not that <a href="http://d3-js.org">d3</a>). And we had the privilege of previewing their next work, an advertising campaign to be shown in movie theaters.</p>
<p>The next speaker was <a href="http://www.legoman.net/site/index.php/en/">Yannick Jacquet</a> from the <a href="http://www.antivj.com">antiVJ</a> collective. More than a portfolio talk, this was an introduction to what VJing is about for the many of us who only had a vague idea. Basically, VJing is about showing moving pictures. But it doesn&#8217;t have to be ugly psychedelic shapes moving on flat screens in night clubs. antiVJ was created in reaction to this reductive view of the field. Through a technique called video-mapping, VJs can use one projector to cast images on many separate surfaces, and with a couple of projectors which can be controlled from just one computer, they can cover very complex geometries.</p>
<p><a href="http://www.legoman.net/site/images/works/intangible-states/intangible-states1.jpg"><img class="aligncenter" src="http://www.legoman.net/site/images/works/intangible-states/intangible-states1.jpg" alt="" width="800" height="364" /></a></p>
<p>Next was Michael Madsen. I covered his workshop talk in my previous post, but during the conference he presented his movie &#8220;Into Eternity&#8221;:</p>
<p style="text-align: center;"><a href="http://www.jeromecukier.net/wp-content/uploads/2012/05/into-eternity-poster.jpg"><img class="aligncenter" title="into-eternity-poster" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/into-eternity-poster-747x1024.jpg" alt="" width="598" height="819" /></a></p>
<p>Here&#8217;s the story. In Finland, law requires that nuclear waste be disposed of within the country. So a company is building a bunker to bury it deep, deep within the earth &#8211; eventually, canisters of nuclear waste will be stored more than 400 meters underground. In 2100, the facility will reach its capacity, and it will be sealed and expected to remain undisturbed for the next 100 000 years. This is the first human creation designed with such a horizon. Contrary to religious buildings, which are built &#8220;forever&#8221;, everything has been done to give the facility the highest chances of lasting 100 000 years. This led to surprising choices. First, the location of the facility is secret. Once it is sealed, there will be no distinctive mark at all. And yet there is the possibility of another ice age or a similar global disaster, and with it the possibility that security will be breached. This is precisely what the movie is about &#8211; which decisions were made, why, with what perspective.</p>
<p>So documentaries are all about telling stories visually, which could also be said of data visualization. Only for documentaries, the angle &#8211; a combination of the subject, the approach, the questions that need to be answered &#8211; is the result of very elaborate research, which we sometimes do in datavis, and sometimes don&#8217;t. More often than not it&#8217;s tempting to just go ahead with a form that agrees with the dataset though, so this work process is a welcome perspective.</p>
<p>The last talk of the conference was by none other than Manuel Lima, who pioneered visualization blogs with <a href="http://www.visualcomplexity.com">visualcomplexity</a>. Both his talks (with the exception of the description of his experience at Bing) relate to material in his highly-recommended book, <a href="http://www.visualcomplexity.com/vc/book/">Visual Complexity: Mapping Patterns of Information</a>, the See conference talk focusing more on trees and hierarchical displays of information and the workshop one more on networks.</p>
<p><div id="attachment_1259" class="wp-caption aligncenter" style="width: 610px"><a href="http://www.jeromecukier.net/wp-content/uploads/2012/05/manuel.jpg"><img class="size-full wp-image-1259" title="manuel" src="http://www.jeromecukier.net/wp-content/uploads/2012/05/manuel.jpg" alt="" width="600" height="450" /></a><p class="wp-caption-text">Trees can be taken literally</p></div></p>
<p>This was definitely the closest talk to actual datavis practice. The principles he presented really come to life with the examples, so I encourage you to watch <a href="http://www.see-conference.org/video-stream/">his talk in its entirety</a> (it should be available on the 12th of May). In passing, in his workshop talk he mentioned that most of his &#8220;ancient&#8221; examples come from an out-of-print grimoire called <em>The Album of Science</em>. I found one for $4 on Amazon, so you may want to check it out.</p>
<p>That about wraps it up.</p>
<p>Pros of the See conference:</p>
<ul>
<li>really cheap to attend from anywhere in Europe &#8211; transportation, accommodation and conference fees are super reasonable.</li>
<li>big.</li>
<li>not just about datavis, but rather on subjects which are particularly interesting for the datavis practitioner.</li>
<li>in pleasant Wiesbaden.</li>
<li>party-time/conference-time ratio about optimal.</li>
<li>workshop is really, really fantastic (and free to attend btw).</li>
</ul>
<p>Cons:</p>
<ul>
<li>you&#8217;re kind of expected to speak German.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/07/see7-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/07/see7-conference/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=see7-conference</feedburner:origLink></item>
		<item>
		<title>Impressions from Wiesbaden</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/6Ex3ijSHsFY/</link>
		<comments>http://www.jeromecukier.net/blog/2012/05/03/impressions-from-wiesbaden/#comments</comments>
		<pubDate>Wed, 02 May 2012 23:59:15 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[data visualization]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[michael madsen]]></category>
		<category><![CDATA[practice]]></category>
		<category><![CDATA[see conference]]></category>
		<category><![CDATA[see+]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1250</guid>
		<description><![CDATA[I&#8217;m just returning from the 7th See conference on information visualization. I&#8217;ll do a longer, more descriptive post later (a lot happens in just 2 days) but for now I would like to mention one talk which really moved me. After the conference proper which takes place in the impressive Lutherkirche which seats hundreds, the See+ workshop [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m just returning from the 7th <a href="http://see-conference.com/">See conference</a> on information visualization. I&#8217;ll do a longer, more descriptive post later (a lot happens in just 2 days) but for now I would like to mention one talk which really moved me. After the conference proper, which takes place in the impressive <a href="http://www.lutherkirche-wiesbaden.de/">Lutherkirche</a> which seats hundreds, the See+ workshop was held in the offices of the organizing agency, <a href="http://www.s-v.de/">Scholz und Volkmer</a>. Most speakers of the main conference came back for a more laid-back discussion with a much smaller audience.</p>
<p>The theme of See+ was tools. Speakers were invited to tell us how they work.</p>
<p>One characteristic of the See conference is that it offers a broader perspective on data visualization. In addition to well-known datavis specialists, such as Manuel Lima this year, other guests include visual artists and designers who also work with data, as well as experts in communication who apply that skill to data.</p>
<p>Michael Madsen is a Danish film-maker who lives in Berlin. In his See conference talk, he presented <a href="http://www.imdb.com/title/tt1194612/">Into Eternity</a>, a powerful documentary about an incredible facility in <del>Black Mesa</del> Finland which is designed to hold nuclear waste for 100,000 years. While the eerie tone of the narration is fascinating, this was perhaps the topic most remote from &#8220;traditional&#8221; datavis (as in data, graphs and stuff).</p>
<p>Then came his See+ talk which took a turn I didn&#8217;t expect.</p>
<p><div class="wp-caption alignnone" style="width: 610px"><img class=" " src="http://25.media.tumblr.com/tumblr_m3d58aEkJ51rv82n3o1_1280.jpg" alt="" width="600" height="800" /><p class="wp-caption-text">Pic courtesy of Joshua de Haseth. I&#39;m actually the guy on the right.</p></div></p>
<p>Filming a documentary takes at least two years by a conservative count. A filmmaker&#8217;s first worry is therefore to find what will drive them for that long. More than a 9 to 5 job, more than a simple theme, what they are looking for is a vision, one unique way to treat one unique subject. Michael elaborated on the difference between executing what you are told to do, and a fulfilling calling that comes from the self. The flipside is that it is very difficult to go through times of no project. One subject powerful enough to give one a reason to work every day for years does not simply come along; it can take months or even years of doubt before one is found. In Michael&#8217;s words: when I have no project, I have no identity. But when he does, his thrill is to seek and explore as he is doing something that has never been done before.</p>
<p>(I&#8217;m leaving a lot out which is more directly related to film-making.)</p>
<p>As he was discussing that, I could sense people in the room tune in to this, as I did. We datavis practitioners are all makers, tinkerers, inventors. Fortunately, our work cycles are often shorter and it is easier to start a new one. Certainly, there are patterns and recipes and things that need to be done for work. There are also hacks found on stackoverflow (thanks for that) and inspiration from the works of others (and thanks for the debugging console).</p>
<p>But there is also a &#8220;great unknown&#8221; in visualization &#8211; data that has never been collected let alone represented, techniques that have never been used, combinations that have never been tried &#8211; and things that cannot even be put into words. All of this requires curiosity, independence and dedication. And the outcome may not live up to expectations. But it&#8217;s just a reminder that to do what we do we must leave our comfort zones and our ways, set off and explore.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/05/03/impressions-from-wiesbaden/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/05/03/impressions-from-wiesbaden/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=impressions-from-wiesbaden</feedburner:origLink></item>
		<item>
		<title>Designing data visualizations</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/lVpjyJU4-Go/</link>
		<comments>http://www.jeromecukier.net/blog/2012/04/22/designing-data-visualizations/#comments</comments>
		<pubDate>Sun, 22 Apr 2012 15:32:58 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[book review]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[designing data visualizations]]></category>
		<category><![CDATA[Noah Iliinsky]]></category>
		<category><![CDATA[review]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1155</guid>
		<description><![CDATA[Noah Iliinsky and O&#8217;Reilly were kind enough to send me one review copy of Noah&#8217;s book and who says review copy says review, so here goes. We need more introductory books to data visualization. I&#8217;ve had several discussions with data visualization colleagues who feel that there are too many books already. I strongly believe otherwise. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.amazon.com/Designing-Data-Visualizations-Noah-Iliinsky/dp/1449312284/ref=tag_stp_s2f_edpp_data_v18on"><img class="alignnone" title="Designing data visualization book cover" src="http://covers.oreilly.com/images/0636920022060/cat.gif" alt="Designing data visualization book cover" width="180" height="236" /></a></p>
<p>Noah Iliinsky and O&#8217;Reilly were kind enough to send me one review copy of Noah&#8217;s book and who says review copy says review, so here goes.</p>
<h2>We need more introductory books on data visualization.</h2>
<p>I&#8217;ve had several discussions with data visualization colleagues who feel that there are too many books already. I strongly believe otherwise.</p>
<p>As of this writing, there are <a href="http://www.amazon.com/gp/tagging/items-tagged-with?ie=UTF8&amp;flatten=1&amp;tag=data%20visualization&amp;search=1">59 books tagged data visualization</a> on Amazon, versus well over a thousand for Java (for example). And of those 59, I would say about a dozen qualify as introductory. Here are 3 reasons why introductory books are important.</p>
<ul>
<li><strong>You only need to know a little to start making effective visualizations. </strong>A small book won&#8217;t teach you all there is to know about visualization, but you don&#8217;t need that to get off to a good start. A lot of this has to do with asking yourself the right questions. But this is a very unnatural thing to do, especially when you feel you can do stuff. Fortunately, even a short book can help you to pause and think.</li>
<li><strong>An effective visualization is not harder to make than a poor one</strong>. Well, actually it is: really good visualizations are built after many iterations on one promising concept. But the point is, a lot of effort and resources can go into abysmal visualizations. If you are in a position to buy visualization, having even basic knowledge of how data visualization works can prevent you from wasting your money.</li>
<li><strong>There are many approaches to visualization.</strong> The right introductory book will be the one that resonates with you. Some people who are interested in this love to code, some are afraid of programming. Some are accomplished visual artists, some don&#8217;t know how to draw. Some have specific needs (business dashboards, presentations, interactive web applications, etc.).</li>
</ul>
<h2>Where does Designing Data Visualizations fit?</h2>
<p>Designing Data Visualizations is a very short book &#8211; the advantage is that you can read this in a couple of hours. It&#8217;s perfect for a train or plane trip for instance. The format of the book (23 x 17.5 cm, flexible paperback) makes it easy to carry and read anywhere. And it&#8217;s an easy read &#8211; you won&#8217;t need to put down the book every few pages to make sure you understood.</p>
<p>The flipside of this is that you won&#8217;t learn any actionable skills from the book. The book is <strong>never trying to teach you to make things</strong>: this is explicitly outside of its scope. What it does is <strong>make you think about how to do stuff</strong>. It makes you consider the choices you make.</p>
<p>So you&#8217;re making a visualization. Does your choice of representation make sense? How about your colors? Placement? If you&#8217;re not confident that you know the answer to these kinds of questions, <strong>you must read the book right now</strong>; otherwise, you won&#8217;t be able to improve your work. And again that is what successful designers do &#8211; iterate and improve, again and again and again.</p>
<p>As a non-native speaker of English, one reason why I enjoy reading introductory books is for their excellent formulation of things. You know, there are those things you have a vague idea of, and the writer puts the exact words on them. So I&#8217;ll go ahead and quote my favorite paragraph:</p>
<blockquote><p>Consult [your goal] when you are about to be seduced by the siren song of circular layouts, the allure of extra data, the false prophet of &#8220;because I can&#8221;. These are distractions on your journey. As Bruce Lee would say, &#8220;It is like a finger pointing a way to the moon. Don&#8217;t concentrate on the finger or you will miss all that heavenly glory&#8221;.</p></blockquote>
<h3>Who is this book for?</h3>
<p>I think the people who would benefit the most from the book fall into two categories:</p>
<ol>
<li>Those who know absolutely nothing about visualization but have some interest in the subject. And the subset of those who don&#8217;t really have time to find out all about it (think: your client, your n+2 boss). They will appreciate that there is real takeaway value in such a short book.</li>
<li>Those who can create visualizations because, for instance, they are coders, designers, Excel users etc. and who see data visualization as a byproduct of their activity, so they never really asked themselves those questions. And among those, I&#8217;m thinking mostly of coders. Noah and I met at last year&#8217;s Strata conference, which is also attended by the cream of the crop of data scientists. I was surprised to see that some of them, despite being able to harness huge quantities of data, were severely limited in their visualization options because they never had an opportunity to learn. These people, who are already at ease with the tools, will see their activity supercharged thanks to the book.</li>
</ol>
<div>For a data practitioner who already has an interest in theory, I won&#8217;t lie to you &#8211; reading the book will feel like patting yourself on the back and there will be little you will learn. But consider, for instance, giving copies to your customers and think of all the fruitless discussions that this will save you in the course of a project.</div>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/04/22/designing-data-visualizations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/04/22/designing-data-visualizations/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=designing-data-visualizations</feedburner:origLink></item>
		<item>
		<title>Hollywood + data III: our info+beauty awards entry. Bonus: making of.</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/qY7n36kGIcw/</link>
		<comments>http://www.jeromecukier.net/blog/2012/04/22/hollywood-data-iii-our-infobeauty-awards-entry-bonus-making-of/#comments</comments>
		<pubDate>Sun, 22 Apr 2012 15:08:14 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[d3]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[David mcCandless]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[iba]]></category>
		<category><![CDATA[info+beauty awards]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1204</guid>
		<description><![CDATA[So Jen and I released our Info+beauty awards entry. How did we end up with this? it&#8217;s really cool working around movies, because it&#8217;s something we can relate to. At first I wanted to do something out of keywords we could grab on the movies but  Jen came up with another idea I found more [...]]]></description>
			<content:encoded><![CDATA[<p>So Jen and I released our <a href="http://bit.ly/starchallenge">Info+beauty awards entry</a>.</p>
<p><a href="http://bit.ly/starchallenge"><img class="aligncenter size-full wp-image-1205" title="shootingstars" src="http://www.jeromecukier.net/wp-content/uploads/2012/02/shootingstars.png" alt="" width="628" height="636" /></a></p>
<p>How did we end up with this?</p>
<p>It&#8217;s really cool working around movies, because it&#8217;s something we can relate to.</p>
<p><div id="attachment_1206" class="wp-caption aligncenter" style="width: 610px"><a href="http://www.jeromecukier.net/wp-content/uploads/2012/02/20120203_105750.jpg"><img class="size-full wp-image-1206" title="20120203_105750" src="http://www.jeromecukier.net/wp-content/uploads/2012/02/20120203_105750.jpg" alt="" width="600" height="450" /></a><p class="wp-caption-text">A part of my movie ticket stubs stash.</p></div></p>
<p>At first I wanted to do something out of keywords we could grab on the movies, but Jen came up with another idea I found more worth pursuing: working around the story types (which was the most interesting aspect of the curated contest dataset) and seeing if there was not some kind of grand truth we could unravel there. She also requested stars and glitter, because we were not going to work on this glamorous dataset with a tedious dashboard done in Excel.</p>
<p>That truth didn&#8217;t take so much time to find: the most frequently used story types (like comedy or movies with monsters) do not perform well at the box office, while different story types (stories of teens growing up, or when the main character turns into something else), which are used less often, are much more profitable. So why doesn&#8217;t Hollywood make more Junos and Black Swans and fewer College Road Trips or Dylan Dogs?</p>
<p>That&#8217;s the idea. Now the making.</p>
<p><span style="color: #ff0000;">Fair warning &#8211; the rest of this post is fairly technical. </span></p>
<h3>Making stars</h3>
<p>If I was to contribute significantly to the project, it had to be done in d3/svg.</p>
<p>Fortunately, it&#8217;s easy to generate star shapes in d3. Once you have the coordinates of where the points of one unitary star should be, you can easily make stars of any size with a function and a parameter.</p>
<pre class="brush: jscript; title: ; notranslate">
var c1=Math.cos(.2*Math.PI),c2=Math.cos(.4*Math.PI),
    s1=Math.sin(.2*Math.PI),s2=Math.sin(.4*Math.PI),
    r=1,

    // ok the constant after r1 is the thickness of the branches.
    // 1 is a &quot;straight&quot; star, less is narrower, more is thicker.

    r1=1.5*r*c2/c1,
    star=[
        [0,-r],
        [r1*s1,-r1*c1],
        [r*s2,-r*c2],
        [r1*s2,r1*c2],
        [r*s1,r*c1],
        [0,r1],
        [-r*s1,r*c1],
        [-r1*s2,r1*c2],
        [-r*s2,-r*c2],
        [-r1*s1,-r1*c1],
        [0,-r]
        ];
    // this is a list of the pair of coordinates of the points that make a star.
var lineStar=function(k) {
	// scale each point of the unit star by k
	var line=d3.svg.line()
		.x(function(d) {return d[0]*k;})
		.y(function(d) {return d[1]*k;});
	return line(star)+&quot;Z&quot;; // this will stitch everything together.
};</pre>
<p>Now, running lineStar(10) will return the path description of a star with a radius of 10, thusly:</p>
<pre class="brush: jscript; title: ; notranslate">&quot;M0,-10L3.367709824346891,-4.635254915624212L9.510565162951535,-3.0901699437494745L5.449068960040206,
1.770509831248423L5.877852522924732,8.090169943749475L0,5.729490168751577L-5.877852522924732,
8.090169943749475L-5.449068960040206,1.770509831248423L-9.510565162951535,-3.0901699437494745
L-3.367709824346891,-4.635254915624212L0,-10Z&quot;</pre>
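<p>To make it clearer what the line generator does with the points, here is a sketch that assembles the same kind of path string without d3. <strong>starPath</strong> is a hypothetical helper that is not part of the project; it rebuilds the star array from above so that the snippet stands on its own, and it assumes straight segments between points.</p>
<pre class="brush: jscript; title: ; notranslate">
// Sketch only: starPath is a hypothetical, d3-free counterpart of lineStar.
var c1=Math.cos(.2*Math.PI),c2=Math.cos(.4*Math.PI),
    s1=Math.sin(.2*Math.PI),s2=Math.sin(.4*Math.PI),
    r=1,
    r1=1.5*r*c2/c1,
    star=[
        [0,-r],[r1*s1,-r1*c1],[r*s2,-r*c2],[r1*s2,r1*c2],[r*s1,r*c1],
        [0,r1],[-r*s1,r*c1],[-r1*s2,r1*c2],[-r*s2,-r*c2],[-r1*s1,-r1*c1],
        [0,-r]
    ];
var starPath=function(k) {
    // scale every point by k and write it as 'x,y'
    var points=star.map(function(d) {return (d[0]*k)+','+(d[1]*k);});
    // 'M' moves to the first point, 'L' draws a segment to each following one,
    // 'Z' closes the shape.
    return 'M'+points.join('L')+'Z';
};
</pre>
<p>Here starPath(10) starts with M0,-10L and ends with L0,-10Z, like the string above.</p>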
<h3>Placing, moving (and spinning) the stars</h3>
<p>The next idea was placing the stars.</p>
<p>And for this we need two things: being able to position them somewhere, and being able to move them easily from point A to point B, ideally with some cool effect in between.</p>
<p>An SVG path has no x and y attributes of its own, and even for shapes that do have them, each coordinate has to be set with a separate function call. I found it a better approach to rely on the <strong>transform</strong> attribute and <strong>translate</strong>. Each time I want to position a star somewhere, I need to set it at an x and y coordinate, which will always correspond either to the data of the star or to that of a group above it. For instance, a star corresponding to a movie will need to be at the position corresponding to the data of that movie, or that of the story type above it if it&#8217;s still collapsed, or that of the high-level grouping of story types if that&#8217;s collapsed.</p>
<p>Now all of the data structures for that are arrays of objects which all have x and y keys. In other words, for any star-shaped object, I can always expect the underlying datum d to have d.x and d.y values. So, I wrote a function <strong>translate(d)</strong> which works on those 2 properties. And as a result, when I need to position any object all I have to write is:</p>
<pre class="brush: jscript; title: ; notranslate">.attr(&quot;transform&quot;,translate)</pre>
<p>and the object will be positioned according to its underlying data. (This is equivalent to writing .attr(&quot;transform&quot;,function(d) {return translate(d);}).)</p>
<p>If I need them to be elsewhere, say at the position of their parent, I can pass the data of that parent as an argument, for instance:</p>
<pre class="brush: jscript; title: ; notranslate">.attr(&quot;transform&quot;,function(d) {return translate(structs[d.struct]);})</pre>
<p>For a cheap bit of extra action, I&#8217;ve added a spinning effect in the translate function. Since translate(d) returns a value for the transform attribute, nobody said it just had to be instructions for translation! So I&#8217;ve added a rotate after the translate. The arguments for the rotate function depend on the x and y properties of the argument as well, so when stars move across the screen, the rotate angle changes slightly with each increment of either coordinate, giving the impression of spinning.</p>
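<p>The body of translate is not shown in this post, so the following is only a guess at its general shape. The name translateSketch and, in particular, the angle formula (x plus y, modulo 360) are assumptions chosen for the illustration.</p>
<pre class="brush: jscript; title: ; notranslate">// Hypothetical sketch: builds a transform value that both moves and rotates.
// The angle depends on the position, so it changes as the object travels.
function translateSketch(d) {
	var angle = (d.x + d.y) % 360; // assumed formula, not the one from the project
	return "translate(" + d.x + "," + d.y + ") rotate(" + angle + ")";
}

var transformValue = translateSketch({x: 100, y: 50});
// transformValue is "translate(100,50) rotate(150)"</pre>
<p>During a transition d3 interpolates between the start and end transform values, which is what makes the rotation visible while the object moves.</p>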
<h3>Explosions, starlets and other effects</h3>
<p>Most of the cool things happening in the visualization rely on one very simple principle about d3 transitions: <strong>chaining them.</strong><br />
In the code you&#8217;ll often find this pattern:</p>
<pre class="brush: jscript; title: ; notranslate">.selectAll(&quot;someobject&quot;).data(...).enter().append(...) // creates the items
... // sets the initial attributes
...
.transition()
... // change the attributes
...
...
...
.each(&quot;end&quot;, function() { // stuff to be done on each item after the transition is over
... // another transition, or a removal
});
</pre>
<p>and within that function, you&#8217;ll find either:<br />
another transition which starts exactly when the previous one ends, so for instance opacity can decrease (causing a fading effect): d3.select(this).transition()&#8230;</p>
<p>or a command to remove the object: d3.select(this).remove().</p>
<p>When another transition is called, there can be another one after, then another one, then another one, then eventually the object can be removed (or not).</p>
<p>Now you may think of transitions as ways to get one object to change smoothly from state A to state B, like a rectangle moving across the screen. But if you start to think that the <strong>objects can be discarded after the transitions</strong>, you&#8217;ll realize that there is an unbelievable number of things that can be done with them.<br />
For instance, upon clicking on some stars, I am creating another star shape at that same location. Initially it has the same size as the star, but I increase its radius to a large number (1000px) while decreasing its opacity to 0. So it seems that the new star is both exploding and fading. When it&#8217;s become transparent I remove it.</p>
<pre class="brush: jscript; title: ; notranslate">gStructs.append(&quot;svg:path&quot;) // here I'm creating a &quot;path&quot; shape
.style(&quot;stroke&quot;,&quot;none&quot;) // with no outline
.style(&quot;fill&quot;,colorXp)  // with the fill color of the explosion
.style(&quot;opacity&quot;,0.2)  // and a low opacity to start with (translucent)
.attr(&quot;d&quot;,lineStar(d.size[sizeAxis])) // I give it the shape of a star and the size of the
                                      // star that's being clicked
.attr(&quot;transform&quot;,translate(d)) // and I position it on that star

.transition() // action!

.duration(500)	// a 500ms transition. Long enough to see the effect.
.attr(&quot;d&quot;,lineStar(1000)) // the star expands to a radius of 1000.
.style(&quot;opacity&quot;,0) // while fading to transparency.

.each(&quot;end&quot;,function() {d3.select(this).remove();}) // and when it's done - it's removed.
</pre>
<h3>Changing axes</h3>
<p>In this visualization I let the user change what&#8217;s plotted along the axes. It&#8217;s not very difficult to do, but it&#8217;s a hassle to do it late in the project, as was our case, because it requires a lot of housekeeping. This is really about the data structures that support our items. Instead of having just one value for x, y and size, they have an object with several keys, one per axis. Then we maintain one variable per axis type, so everywhere we would write d.x, we write d.x[xAxis] instead.</p>
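<p>As a rough sketch of that structure: the key names (budget, year and so on), the numbers and the helper position below are all invented for the example; only the d.x[xAxis] lookup comes from the project.</p>
<pre class="brush: jscript; title: ; notranslate">// Hypothetical datum: one value per selectable axis instead of a single number.
var movie = {
	x: {budget: 120, year: 300},
	y: {profit: 80, rating: 210},
	size: {gross: 12, audience: 7}
};

// One variable per axis type holds the current choice.
var xAxis = "budget", yAxis = "profit", sizeAxis = "gross";

// Reads the values for whichever axes are currently selected.
function position(d) {
	return {x: d.x[xAxis], y: d.y[yAxis], size: d.size[sizeAxis]};
}

var before = position(movie); // uses budget, profit and gross
xAxis = "year";               // the user picks another x axis
var after = position(movie);  // only x changes: 300 instead of 120</pre>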
<p>So when there is an axis change, of course, we do a transition so that the stars and everything move smoothly to their new position. But what if the objects were already moving? When an unplanned transition interferes with an ongoing one, the results are often ugly, especially if the current transition had chained transitions waiting to be triggered. In other words, this will leave a mess.</p>
<p>The way I&#8217;ve dealt with this is by keeping tabs on the number of transitions going on at any given time. An axis change can only occur if no other transition is taking place; if one is, the change is simply denied. There are other ways to do that, like a queue of actions, but this seemed the simplest adequate way to deal with it.</p>
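<p>A minimal sketch of such a guard, independent of d3. All the names here (running, transitionStarted, transitionEnded, changeAxis) are hypothetical; the actual code may be organized differently.</p>
<pre class="brush: jscript; title: ; notranslate">// Hypothetical guard: counts running transitions and refuses axis changes meanwhile.
var running = 0;
function transitionStarted() { running += 1; } // to call when a transition is launched
function transitionEnded() { running -= 1; }   // to call from the .each("end", ...) callback

var xAxis = "budget";
function changeAxis(newAxis) {
	if (running !== 0) { return false; } // something is still moving: deny the change
	xAxis = newAxis;
	return true;
}

transitionStarted();
var denied = changeAxis("year");   // false, a transition is in progress
transitionEnded();
var accepted = changeAxis("year"); // true, nothing is moving any more</pre>
<p>With chained transitions, the counter should only go down when the last transition of the chain ends, otherwise an axis change could slip in between two links of the chain.</p>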
<h3>Bootstrap and google fonts</h3>
<p>This was the first non-trivial project where I used <a href="http://twitter.github.com/bootstrap/">bootstrap</a> and I&#8217;m just never going back. Bootstrap simply removes all the hassle of arranging all the elements of a visualization on a screen and is very easy to use. Plus, it comes with sensible defaults for buttons, forms, and the like. Since the contest it has evolved faster than a pokémon: for instance, it is now possible to specify custom colors in a form and bootstrap will generate the appropriate css files. <a href="http://www.google.com/webfonts">Google fonts</a> are another great help, as they make it very easy to choose among a relatively large number of fonts without relying on users having them installed on their computer.</p>
<h3>Wrapping it up</h3>
<p>There are a lot of other hacks in the code which you are welcome to explore. I admit I don&#8217;t remember them all because I took too much time to write this blog post after creating the entry (bad). However, if there is one point you would like me to explain, please ask in the comments.<br />
I&#8217;m not entirely sure what happened when I submitted the entry, though. First it wasn&#8217;t listed with the others, then I got a message saying it hadn&#8217;t been reviewed, so it didn&#8217;t win anything, yet some time after the prizes had been handed out it appeared in the &#8220;shortlisted&#8221; visualizations for the contest (which I found by accident). So whether or not it was good, I&#8217;ll let you guys judge; at any rate it was fun to make.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/04/22/hollywood-data-iii-our-infobeauty-awards-entry-bonus-making-of/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/04/22/hollywood-data-iii-our-infobeauty-awards-entry-bonus-making-of/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=hollywood-data-iii-our-infobeauty-awards-entry-bonus-making-of</feedburner:origLink></item>
		<item>
		<title>Treemaps in Tableau? can be done.</title>
		<link>http://feedproxy.google.com/~r/JeromeCukier/~3/565KAupCup0/</link>
		<comments>http://www.jeromecukier.net/blog/2012/04/19/treemaps-in-tableau-can-be-done/#comments</comments>
		<pubDate>Thu, 19 Apr 2012 11:16:13 +0000</pubDate>
		<dc:creator>jerome</dc:creator>
				<category><![CDATA[d3]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[Tableau]]></category>
		<category><![CDATA[treemaps]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.jeromecukier.net/?p=1211</guid>
		<description><![CDATA[Tableau can do many things natively but there are a couple of basic primitives that are not built in because they behave somewhat differently from the overall logic. And treemaps is one of them. Then again treemaps are arguably one of the best way to express complex hierarchical information, i.e. to show the proportions in [...]]]></description>
			<content:encoded><![CDATA[<p>Tableau can do many things natively, but there are a couple of basic primitives that are not built in because they behave somewhat differently from the overall logic. <a href="http://www.tableausoftware.com/about/blog/2011/05/alternative-tree-maps-0">Treemaps are one of them</a>. Then again, treemaps are arguably one of the best ways to express complex hierarchical information, i.e. to show the proportions in a large dataset.</p>

<p>Fortunately, thanks to Tableau's flexibility there are ways to do that. In this tutorial I'm going to cover 2 cases. First, we'll create a somewhat complex treemap from data which will not change at runtime. Then, we'll create mini-treemaps which can change dynamically.</p>
<h2>A complex treemap</h2>
<script type="text/javascript" src="http://public.tableausoftware.com/javascripts/api/viz_v1.js"></script>
<noscript><a href="#"><img alt="ComplexTM " src="http:&#47;&#47;public.tableausoftware.com&#47;static&#47;images&#47;tr&#47;treemaps&#47;ComplexTM&#47;1_rss.png" style="border: none" /></a></noscript><object class="tableauViz" width="604" height="369" style="display:none;"><param name="host_url" value="http%3A%2F%2Fpublic.tableausoftware.com%2F" /><param name="site_root" value="" /><param name="name" value="treemaps&#47;ComplexTM" /><param name="tabs" value="no" /><param name="toolbar" value="yes" /><param name="static_image" value="http:&#47;&#47;public.tableausoftware.com&#47;static&#47;images&#47;tr&#47;treemaps&#47;ComplexTM&#47;1.png" /><param name="animate_transition" value="yes" /><param name="display_static_image" value="yes" /><param name="display_spinner" value="yes" /><param name="display_overlay" value="yes" /><param name="display_count" value="yes" /></object>
<div style="width:604px;height:22px;padding:0px 10px 0px 0px;color:black;font:normal 8pt verdana,helvetica,arial,sans-serif;"><div style="float:right; padding-right:8px;"><a href="http://www.tableausoftware.com/public?ref=http://public.tableausoftware.com/views/treemaps/ComplexTM" target="_blank">Powered by Tableau</a></div></div>
<p>Before we go into the details: the main ideas are deceptively simple.</p>
<ul>
	<li>we use the polygon mark,</li>
	<li>we generate the treemap layout outside of Tableau.</li>
</ul>
<p>What we want (and what we'll get) is a dataset that can be directly imported into Tableau and -boom- makes a treemap in a few clicks.</p>

<p>To make this dataset we can use d3. The treemap I am making is directly inspired by the <a href="http://mbostock.github.com/d3/ex/treemap.html">d3 treemap example</a>. d3 already computes all of the node positions, so what we'll do is modify the program slightly so that it outputs them in a way that can be used directly in Tableau.</p>

<p>Here is the <a href="http://www.jeromecukier.net/wp-content/uploads/2012/04/treemap.html">modified file</a> which you can download and run on your computer. To work, it needs to be in the same folder as a <a href="http://www.jeromecukier.net/wp-content/uploads/2012/04/data.js">data file</a> called data.js, which holds your hierarchical data and has the same structure as the one linked here.</p>

You can just copy/paste the table that's displayed below the treemap and put it in Tableau or save it in a file for good measure. Here is <a href="http://www.jeromecukier.net/wp-content/uploads/2012/04/data.csv">the output</a> of the data file linked above.</p>

<p>Let's take a look at a few rows:</p>
<table>
<tbody>
<tr>
<td>Id</td>
<td>Path</td>
<td>Top-level category</td>
<td>Name</td>
<td>Value</td>
<td>Corner</td>
<td>x</td>
<td>y</td>
</tr>
<tr>
<td>0</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>AgglomerativeCluster</td>
<td>3938</td>
<td>0</td>
<td>89</td>
<td>167</td>
</tr>
<tr>
<td>0</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>AgglomerativeCluster</td>
<td>3938</td>
<td>1</td>
<td>167</td>
<td>167</td>
</tr>
<tr>
<td>0</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>AgglomerativeCluster</td>
<td>3938</td>
<td>2</td>
<td>167</td>
<td>192</td>
</tr>
<tr>
<td>0</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>AgglomerativeCluster</td>
<td>3938</td>
<td>3</td>
<td>89</td>
<td>192</td>
</tr>
<tr>
<td>1</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>CommunityStructure</td>
<td>3812</td>
<td>0</td>
<td>102</td>
<td>138</td>
</tr>
<tr>
<td>1</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>CommunityStructure</td>
<td>3812</td>
<td>1</td>
<td>167</td>
<td>138</td>
</tr>
<tr>
<td>1</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>CommunityStructure</td>
<td>3812</td>
<td>2</td>
<td>167</td>
<td>167</td>
</tr>
<tr>
<td>1</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>CommunityStructure</td>
<td>3812</td>
<td>3</td>
<td>102</td>
<td>167</td>
</tr>
<tr>
<td>2</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>HierarchicalCluster</td>
<td>6714</td>
<td>0</td>
<td>89</td>
<td>192</td>
</tr>
<tr>
<td>2</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>HierarchicalCluster</td>
<td>6714</td>
<td>1</td>
<td>167</td>
<td>192</td>
</tr>
<tr>
<td>2</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>HierarchicalCluster</td>
<td>6714</td>
<td>2</td>
<td>167</td>
<td>236</td>
</tr>
<tr>
<td>2</td>
<td>flare&gt;analytics&gt;cluster</td>
<td>flare</td>
<td>HierarchicalCluster</td>
<td>6714</td>
<td>3</td>
<td>89</td>
<td>236</td>
</tr>
</tbody>
</table>
<p>I'm creating 4 lines per "leaf" node. So in this example, which has 220 such nodes, that amounts to 880 lines. Why 4? Because to draw a rectangle in Tableau you need to define its 4 corners. This is why there is a column "Corner" which takes the values 0, 1, 2 and 3. We will use it to tell Tableau to read our corners in bottom left, bottom right, top right, top left order, which produces a proper rectangle and not a self-intersecting hourglass shape.</p>
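<p>As an illustration of that step, here is a rough sketch of how one laid-out leaf can be expanded into its 4 corner rows. The d3 treemap layout gives each node x, y, dx and dy properties; the helper name cornerRows and the reduced set of columns (no path or top-level category) are simplifications made for the example, and the dx and dy values are derived from the sample rows above.</p>
<pre class="brush: jscript; title: ; notranslate">// Hypothetical helper: turns one leaf (x, y, dx, dy from the treemap layout)
// into 4 comma-separated rows, one per corner, going around the rectangle.
function cornerRows(id, node) {
	var xs = [node.x, node.x + node.dx, node.x + node.dx, node.x];
	var ys = [node.y, node.y, node.y + node.dy, node.y + node.dy];
	return xs.map(function(x, corner) {
		return [id, node.name, node.value, corner, x, ys[corner]].join(",");
	});
}

var rows = cornerRows(0, {name: "AgglomerativeCluster", value: 3938, x: 89, y: 167, dx: 78, dy: 25});
// rows[0] is "0,AgglomerativeCluster,3938,0,89,167"
// rows[2] is "0,AgglomerativeCluster,3938,2,167,192"</pre>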

Now off to Tableau with this data. <a href="http://www.jeromecukier.net/wp-content/uploads/2012/04/complex.png"><img class="aligncenter size-full wp-image-1212" title="complex" src="http://www.jeromecukier.net/wp-content/uploads/2012/04/complex.png" alt="" width="558" height="477" /></a></p>

<p>Now it's just a matter of reproducing the setup in this screenshot. Unsurprisingly, the columns and rows are determined by x and y. You want a polygon mark, and you absolutely must use your corner measure in the path. For color you have a choice: you can use the top-level category column (as I have) or the full path, which will divide your treemap into finer parts. Finally, level of detail: you must use the Id and not the name, in case several of your nodes have the same name. It's quite important at this point to uncheck aggregate measures in Analysis. You do NOT want aggregate measures (though the result is quite pretty). To be able to use the name, you must first make a measure out of it. And finally, you'll want to update your infotip slightly.</p>

All of this you can see if you download the tableau file.</p>

And voilà! Treemaps for your Tableau workbooks.</p>

<p><em>Caveat:</em> the polygon mark doesn't support labels, so you can't write on top of the small rectangles what they are. But that's not the point of the treemap, which is instead to give an immediate first impression of the relative size of large groups of your data, then allow you to explore them. To that end, the infotip function works just fine.</p>
<h2>Simpler but dynamic treemaps</h2>
This is fine and dandy if your data doesn't change but it won't scale if you need to make many treemaps based on selections. What to do? You could use pie charts, but let's not.</p>

To that end I've tried to emulate the <a href="http://www.congressspeaks.com">Congress speaks</a> visualization by <a href="http://periscopic.com">Periscopic</a>. I really like it. When you've selected representatives at the end of the process you are taken to a screen which shows the following mini-treemap:</p>

<a href="http://www.jeromecukier.net/wp-content/uploads/2012/04/votingrecord.png"><img class="aligncenter size-full wp-image-1213" title="votingrecord" src="http://www.jeromecukier.net/wp-content/uploads/2012/04/votingrecord.png" alt="" width="246" height="166" /></a></p>

There are just 5 rectangles. But they will change for any representative that we choose. Can this be done with Tableau? Obviously.</p>

<p>Now the Tableau part of this is slightly trickier than above. The idea is that we are going to use formulas to generate the coordinates of all 20 corners of the rectangles; in other words, we are going to let Tableau calculate the layout. We can do it because the way the rectangles are arranged is quite predictable: there is one on the left, then 4 on the right, stacked one on top of the other. Again, we could compute all of these coordinates outside of Tableau, but that would be a hassle, so for a large number of cases it becomes easier and more reliable to do this inside of Tableau.</p>
<h3>Data</h3>
<p>For this I have used completely random data. I have <a href="http://www.kleimo.com/random/name.cfm">generated 20 names</a>, and for each I have generated values in a likely range: the number of possible votes, the number of votes the representative actually cast, the number of times they voted yes, the number of times they voted yes with their party, and the same two counts for no (or nay, technically).</p>

At the end of the day I need 20 records per representative (5 rectangles of 4 corners each), so I can either replicate the line 20 times, or use linked tables. The idea is to get something like this for all of the representatives that can somehow get into Tableau.</p>
<table width="640" border="0" cellspacing="0" cellpadding="0"><colgroup> <col span="10" width="64" /> </colgroup>
<tbody>
<tr>
<td width="64" height="17">Id</td>
<td width="64">representative</td>
<td width="64">corner</td>
<td width="64">rectangle</td>
<td width="64">possible votes</td>
<td width="64">total votes</td>
<td width="64">voted yes</td>
<td width="64">yes with party</td>
<td width="64">voted no</td>
<td width="64">no with party</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">0</td>
<td>no against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">1</td>
<td>no against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">2</td>
<td>no against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">3</td>
<td>no against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">0</td>
<td>no vote</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">1</td>
<td>no vote</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">2</td>
<td>no vote</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">3</td>
<td>no vote</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">0</td>
<td>no with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">1</td>
<td>no with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">2</td>
<td>no with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">3</td>
<td>no with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">0</td>
<td>yes against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">1</td>
<td>yes against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">2</td>
<td>yes against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">3</td>
<td>yes against party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">0</td>
<td>yes with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">1</td>
<td>yes with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">2</td>
<td>yes with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
<tr>
<td align="right" height="17">16</td>
<td>Nelson Thiede</td>
<td align="right">3</td>
<td>yes with party</td>
<td align="right">888</td>
<td align="right">784</td>
<td align="right">320</td>
<td align="right">274</td>
<td align="right">464</td>
<td align="right">373</td>
</tr>
</tbody>
</table>
<h3>In Tableau</h3>
In Tableau we are going to use the same idea as above: polygon mark, disable aggregate measures, and use x and y for columns and rows.</p>

Only, x and y are going to be much more complex. Sorry about that. Well, not that complex but definitely longer.</p>

Here's x:</p>

<pre class="brush: plain; title: ; notranslate">

case [rectangle]
when &quot;no vote&quot; then
     case [corner]
       when 0 then 0
       when 1 then (([possible votes]-[total votes])/[possible votes])
       when 2 then (([possible votes]-[total votes])/[possible votes])
       when 3 then 0
     end
else
     case [corner]
       when 0 then (([possible votes]-[total votes])/[possible votes])
       when 1 then 1
       when 2 then 1
       when 3 then (([possible votes]-[total votes])/[possible votes])
     end
end

</pre>

Depending on the rectangle we are trying to draw we can find ourselves in one of two cases (hence the use of case).</p>

<p>If we draw "no vote" then we are on the left of our vis. The left corners are on the leftmost side of the vis (hence value: 0) and the right corners correspond to the proportion of possible votes which were not cast by this representative, which we can compute as ([possible votes]-[total votes])/[possible votes].</p>

In the other case, we are drawing one of the 4 stacked rectangles, so the right corners are on the rightmost side of the vis (hence value: 1) and the left corners correspond to the value we just computed.</p>
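<p>To check the logic outside of Tableau, here is the same x rule transcribed as a small JavaScript sketch. The function name xCoord is made up for the example; the numbers are those of the sample representative in the table above.</p>
<pre class="brush: jscript; title: ; notranslate">// Sketch of the x formula: corners 0 and 3 are on the left side of a rectangle,
// corners 1 and 2 on its right side.
function xCoord(rectangle, corner, possibleVotes, totalVotes) {
	var split = (possibleVotes - totalVotes) / possibleVotes; // share of votes not cast
	var isLeft = (corner === 0 || corner === 3);
	if (rectangle === "no vote") {
		return isLeft ? 0 : split;
	}
	return isLeft ? split : 1;
}

var split = xCoord("no vote", 1, 888, 784); // 104 / 888, about 0.117</pre>
<p>The right edge of "no vote" and the left edge of the 4 stacked rectangles share that same value, which is what makes them sit side by side without a gap.</p>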

And now, y:</p>

<pre class="brush: plain; title: ; notranslate">
// rectangles are listed in stacking order, from bottom to top;
// each one starts where the previous one ends
case [rectangle]
when &quot;no vote&quot; then
     case [corner]
       when 0 then 0
       when 1 then 0
       when 2 then 1
       when 3 then 1
     end
when &quot;yes against party&quot; then
     case [corner]
       when 0 then 0
       when 1 then 0
       when 2 then (([voted yes]-[yes with party])/[total votes])
       when 3 then (([voted yes]-[yes with party])/[total votes])
     end
when &quot;yes with party&quot; then
     case [corner]
       when 0 then (([voted yes]-[yes with party])/[total votes])
       when 1 then (([voted yes]-[yes with party])/[total votes])
       when 2 then ([voted yes]/[total votes])
       when 3 then ([voted yes]/[total votes])
     end
when &quot;no with party&quot; then
     case [corner]
       when 0 then ([voted yes]/[total votes])
       when 1 then ([voted yes]/[total votes])
       when 2 then (([voted yes]+[no with party])/[total votes])
       when 3 then (([voted yes]+[no with party])/[total votes])
     end
when &quot;no against party&quot; then
     case [corner]
       when 0 then (([voted yes]+[no with party])/[total votes])
       when 1 then (([voted yes]+[no with party])/[total votes])
       when 2 then 1
       when 3 then 1
     end
end
</pre>

<p>y is longer but this is the same general idea. For the "no vote" rectangle, the corners are either at the top or at the bottom of the vis. For the others, we can predict where each rectangle will start and where it will end, as a proportion of the [total votes] field. The values we want correspond to these proportions, plus those of all the rectangles below, so we can achieve that stacked effect (as opposed to having all rectangles superimposed at the bottom of the vis). This is why I am entering the rectangles in stacking order. Each time, the bottom corners get the value of the top corners of the previous rectangle.</p>
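<p>The stacking idea can be sketched in a few lines of JavaScript. This is an illustration rather than a transcription of the Tableau formula: it assumes that each rectangle's height is its own count divided by the total, and that the total is the sum of yes and no votes (which holds for the sample data). The function name stackY is made up.</p>
<pre class="brush: jscript; title: ; notranslate">// Sketch: stack the four right-hand rectangles, each one starting
// where the previous one ends.
function stackY(votedYes, yesWithParty, votedNo, noWithParty) {
	var total = votedYes + votedNo; // assumption: every vote cast is a yes or a no
	var parts = [
		["yes against party", votedYes - yesWithParty],
		["yes with party", yesWithParty],
		["no with party", noWithParty],
		["no against party", votedNo - noWithParty]
	];
	var bottom = 0;
	return parts.map(function(part) {
		var upper = bottom + part[1];
		var rect = {name: part[0], y0: bottom / total, y1: upper / total};
		bottom = upper;
		return rect;
	});
}

var stack = stackY(320, 274, 464, 373);
// stack[0] spans 0 to 46/784, and stack[3] ends at exactly 1</pre>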

Here is the final result:</p>

<script type="text/javascript" src="http://public.tableausoftware.com/javascripts/api/viz_v1.js"></script>
<noscript><a href="#"><img alt="mini TM " src="http:&#47;&#47;public.tableausoftware.com&#47;static&#47;images&#47;B5&#47;B5PW2XJWX&#47;1_rss.png" style="border: none" /></a></noscript>
<object class="tableauViz" width="404" height="269" style="display:none;">
<param name="host_url" value="http%3A%2F%2Fpublic.tableausoftware.com%2F" />
<param name="path" value="shared&#47;B5PW2XJWX" />
<param name="toolbar" value="yes" />
<param name="static_image" value="http:&#47;&#47;public.tableausoftware.com&#47;static&#47;images&#47;B5&#47;B5PW2XJWX&#47;1.png" />
<param name="animate_transition" value="yes" />
<param name="display_static_image" value="yes" />
<param name="display_spinner" value="yes" />
<param name="display_overlay" value="yes" />
<param name="display_count" value="yes" />
</object>
<div style="width:404px;height:22px;padding:0px 10px 0px 0px;color:black;font:normal 8pt verdana,helvetica,arial,sans-serif;"><div style="float:right; padding-right:8px;"><a href="http://www.tableausoftware.com/public?ref=http://public.tableausoftware.com/shared/B5PW2XJWX" target="_blank">Powered by Tableau</a></div></div>]]></content:encoded>
			<wfw:commentRss>http://www.jeromecukier.net/blog/2012/04/19/treemaps-in-tableau-can-be-done/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.jeromecukier.net/blog/2012/04/19/treemaps-in-tableau-can-be-done/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=treemaps-in-tableau-can-be-done</feedburner:origLink></item>
	</channel>
</rss>

