<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>R-bloggers</title>
	
	<link>http://www.r-bloggers.com</link>
	<description>R news and tutorials contributed by (452) R bloggers</description>
	<lastBuildDate>Sun, 19 May 2013 16:10:38 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/RBloggers" /><feedburner:info uri="rbloggers" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId>RBloggers</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Update to PSID panel builder for R: psidR</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/GemEG4ztYQc/</link>
		<comments>http://www.r-bloggers.com/update-to-psid-panel-builder-for-r-psidr/#comments</comments>
		<pubDate>Sun, 19 May 2013 15:49:00 +0000</pubDate>
		<dc:creator>leisuretronic</dc:creator>
		
		<guid isPermaLink="false">http://www.r-bloggers.com/?guid=b69ed28364d57bd7ec7976c90436245c</guid>
		<description><![CDATA[I just pushed the most recent version of the PSID panel data builder introduced a little while ago. Got some user feedback and made some improvements. The package is hosted on github. News: I added a reproducible example using artificial data which you c...]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://plausibel.blogspot.com/2013/05/update-to-psid-panel-builder-psidr.html"> plausibel</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
I just pushed the most recent version of the <a href="http://plausibel.blogspot.co.uk/2013/04/psid-data-set-builder-for-r.html" ref="nofollow" target="_blank">PSID panel data builder introduced a little while ago</a>. Got some user feedback and made some improvements. The <a href="https://github.com/floswald/psidR" ref="nofollow" target="_blank">package is hosted on github</a>.<br /><div><br /></div><div>News:</div><div><ul><li>I added a reproducible example using artificial data which you can run by calling 'example(build.panel)'. This means you can try out the package before bothering to download anything and it provides a simple test of the main function.</li><li>I've included a suggestion to use the&nbsp;<a href="http://cran.r-project.org/web/packages/survey/index.html" ref="nofollow" target="_blank">R survey package</a>&nbsp;to analyse this dataset and made it explicit in the examples how to obtain the desired weights for each wave. Note that your results are invalid in the majority of cases if you ignore the survey design (i.e. the weights).</li><li>I got some useful comments from Anthony Damico (thanks!)&nbsp;and integrated the&nbsp;<a href="http://cran.r-project.org/web/packages/SAScii/index.html" ref="nofollow" target="_blank">SAScii</a>&nbsp;package. (check out his tutorials at&nbsp;<a href="http://www.asdfree.com/" ref="nofollow" target="_blank">http://www.asdfree.com/</a>).&nbsp; This allows one to download the data directly from the PSID server into R, thereby removing any dependency on Stata or SAS to preprocess the raw data. (As is common with large datasets, the raw data come in ASCII format that needs to be fixed up into rows and columns.) 
The downside is that downloading directly takes a rather long time: downloading FAM1985ER, FAM1986ER and the index IND2009ER took three and a half hours.</li></ul><div>Hopefully I can get another round of feedback (particularly from a Windows user: working on a Unix system, I could not test that all the paths are written correctly on Windows) before submitting to CRAN.</div></div><div><br /></div><div><br /></div><div class="blogger-post-footer">flo.</div>
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://plausibel.blogspot.com/2013/05/update-to-psid-panel-builder-psidr.html"> plausibel</a></strong>.</div>
<hr />
<a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers.com</a> offers <strong><a href="http://feedburner.google.com/fb/a/mailverify?uri=RBloggers" rel="nofollow">daily e-mail updates</a></strong> about <a title="The R Project for Statistical Computing" href="http://www.r-project.org/" rel="nofollow">R</a> news and <a title="R tutorials" href="http://www.r-bloggers.com/?s=tutorial" rel="nofollow">tutorials</a> on topics such as: visualization (<a title="ggplot and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=ggplot2" rel="nofollow">ggplot2</a>, <a title="Boxplots using lattice and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=boxplot" rel="nofollow">Boxplots</a>, <a title="Maps and gis" href="http://www.r-bloggers.com/?s=map" rel="nofollow">maps</a>, <a title="Animation in R" href="http://www.r-bloggers.com/?s=animation" rel="nofollow">animation</a>), programming (<a title="RStudio IDE for R" href="http://www.r-bloggers.com/?s=RStudio" rel="nofollow">RStudio</a>, <a title="Sweave and literate programming" href="http://www.r-bloggers.com/?s=sweave" rel="nofollow">Sweave</a>, <a title="LaTeX in R" href="http://www.r-bloggers.com/?s=LaTeX" rel="nofollow">LaTeX</a>, <a title="SQL and databases" href="http://www.r-bloggers.com/?s=SQL" rel="nofollow">SQL</a>, <a title="Eclipse IDE for R" href="http://www.r-bloggers.com/?s=eclipse" rel="nofollow">Eclipse</a>, <a title="git and github, Version Control System" href="http://www.r-bloggers.com/?s=git" rel="nofollow">git</a>, <a title="Large data in R using Hadoop" href="http://www.r-bloggers.com/?s=hadoop" rel="nofollow">hadoop</a>, <a title="Web Scraping of google, facebook, yahoo, twitter and more using R" href="http://www.r-bloggers.com/?s=Web+Scraping" rel="nofollow">Web Scraping</a>) statistics (<a title="Regressions and ANOVA analysis tutorials" href="http://www.r-bloggers.com/?s=regression" rel="nofollow">regression</a>, <a title="principal component analysis tutorial" 
href="http://www.r-bloggers.com/?s=PCA" rel="nofollow">PCA</a>, <a title="Time series" href="http://www.r-bloggers.com/?s=time+series" rel="nofollow">time series</a>, <a title="finance trading" href="http://www.r-bloggers.com/?s=trading" rel="nofollow">trading</a>) and more...
</div></p><img src="http://feeds.feedburner.com/~r/RBloggers/~4/GemEG4ztYQc" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/update-to-psid-panel-builder-for-r-psidr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.r-bloggers.com/update-to-psid-panel-builder-for-r-psidr/</feedburner:origLink></item>
		<item>
		<title>Sharing my R notes</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/ejJOrHOlMBI/</link>
		<comments>http://www.r-bloggers.com/sharing-my-r-notes/#comments</comments>
		<pubDate>Sun, 19 May 2013 00:56:48 +0000</pubDate>
		<dc:creator>tylerrinker</dc:creator>
		
		<guid isPermaLink="false">http://trinkerrstuff.wordpress.com/?p=989</guid>
		<description><![CDATA[I started working with R 2 1/2 years ago. I remember opening R, closing it, and thinking it was the dumbest thing ever (a command line is not inviting to a non-programmer). Now it&#8217;s my constant friend. From the beginning &#8230; <a href="http://trinkerrstuff.wordpress.com/2013/05/19/sharing-my-r-notes/">Continue reading <span>&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=trinkerrstuff.wordpress.com&#38;blog=35458626&#38;post=989&#38;subd=trinkerrstuff&#38;ref=&#38;feed=1" width="1" height="1">
]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://trinkerrstuff.wordpress.com/2013/05/19/sharing-my-r-notes/"> TRinker&#039;s R Blog » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<p>I started working with R 2 1/2 years ago. I remember opening R, closing it, and thinking it was the dumbest thing ever (a command line is not inviting to a non-programmer). Now it&#8217;s my constant friend. From the beginning I took notes to remind myself of all the things I learned and relearned. They&#8217;ve been invaluable to me in learning. They are not particularly well arranged, nor do they credit sources properly. There are likely bad or outdated practices in there, but I figured they may be helpful to others learning the language, and so I&#8217;m sharing.</p>
<p>Note that:</p>
<p>1) they are poorly arranged<br />
2) they may have mistakes<br />
3) they don&#8217;t credit others&#8217; work properly or at all</p>
<p>They were for me, but now I think others may find them useful, so here they are:</p>
<div style="width:78.75px;margin:auto;">
<p style="text-align:center;"><a href="https://copy.com/mFIxd6cIJac0RqoM/trinker%27s_notes.pdf?download=1" ref="nofollow" target="_blank"><img alt="" src="http://c.dryicons.com/images/icon_sets/coquette_part_4_icons_set/png/128x128/pdf_file.png" width="75" height="100" /></a></p>
<p style="text-align:center;"><a href="https://copy.com/mFIxd6cIJac0RqoM/trinker%27s_notes.pdf?download=1" ref="nofollow" target="_blank">click here</a></p>
</div>
<p><em>*<strong>Note</strong> that the file is large: ~7000KB and 274 pages.</em></p>
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://trinkerrstuff.wordpress.com/2013/05/19/sharing-my-r-notes/"> TRinker&#039;s R Blog » R</a></strong>.</div>
</div></p><img src="http://feeds.feedburner.com/~r/RBloggers/~4/ejJOrHOlMBI" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/sharing-my-r-notes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/489e674a9beb5760a71abf592b462669?s=96&amp;d=identicon&amp;r=G" length="" type="" />
<enclosure url="http://c.dryicons.com/images/icon_sets/coquette_part_4_icons_set/png/128x128/pdf_file.png" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/sharing-my-r-notes/</feedburner:origLink></item>
		<item>
		<title>R (Web Server) Solutions – Amplifying Artichokes</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/9rzGECpJBmA/</link>
		<comments>http://www.r-bloggers.com/r-web-server-solutions-amplifying-artichokes/#comments</comments>
		<pubDate>Sat, 18 May 2013 16:25:00 +0000</pubDate>
		<dc:creator>Pradeep Mavuluri</dc:creator>
		
		<guid isPermaLink="false">http://www.r-bloggers.com/?guid=6aaf4eb598dfe40582769447e5debf6b</guid>
		<description><![CDATA[Every month I see one or more new R-based web server solutions coming onto the market. Having looked at some of them, I thought I would share one of my old architecture maps, presented to a client back in early 2009 (good to see the quick spread of scalable...]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://costaleconomist.blogspot.com/2013/05/r-web-server-solutions-amplifying.html"> Econometrics_Help</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<div style="text-align: justify;"><span style="font-size: large;"><span style="font-family: Times, &quot;Times New Roman&quot;, serif;">Every month I see one or more new R-based web server solutions coming onto the market. Having looked at some of them, I thought I would share one of my old architecture maps, presented to a client back in early 2009 (good to see the quick spread of scalable and customizable open-source statistical computing tools in the market)</span></span>.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-ti7ty13K0uI/UZep5t2wn4I/AAAAAAAAAIc/6nz7A_srSf4/s1600/R_Web_Server_Solutions.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;" ref="nofollow" target="_blank"><img border="0" src="http://1.bp.blogspot.com/-ti7ty13K0uI/UZep5t2wn4I/AAAAAAAAAIc/6nz7A_srSf4/s1600/R_Web_Server_Solutions.PNG" /></a></div></div>
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://costaleconomist.blogspot.com/2013/05/r-web-server-solutions-amplifying.html"> Econometrics_Help</a></strong>.</div>
</div></p><img src="http://feeds.feedburner.com/~r/RBloggers/~4/9rzGECpJBmA" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/r-web-server-solutions-amplifying-artichokes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.r-bloggers.com/r-web-server-solutions-amplifying-artichokes/</feedburner:origLink></item>
		<item>
		<title>What is probabilistic truth?</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/s-DXc2m1x_o/</link>
		<comments>http://www.r-bloggers.com/what-is-probabilistic-truth/#comments</comments>
		<pubDate>Sat, 18 May 2013 15:34:40 +0000</pubDate>
		<dc:creator>Corey Chivers</dc:creator>
		
		<guid isPermaLink="false">http://bayesianbiologist.com/?p=917</guid>
		<description><![CDATA[I am currently working on a validation metric for binary prediction models. That is, models which make predictions about outcomes that can take on either of two possible states (e.g. dead/not dead, heads/tails, cat in picture/no cat in picture, etc.). The most commonly used metric for this class of models is AUC, which assesses the [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bayesianbiologist.com&#38;blog=23855543&#38;post=917&#38;subd=bayesianbiologist&#38;ref=&#38;feed=1" width="1" height="1">
]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://bayesianbiologist.com/2013/05/18/what-is-probabilistic-truth/"> bayesianbiologist » Rstats</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<p>I am currently working on a validation metric for binary prediction models. That is, models which make predictions about outcomes that can take on either of two possible states (e.g. dead/not dead, heads/tails, cat in picture/no cat in picture, etc.). The most commonly used metric for this class of models is <a href="http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve" ref="nofollow" target="_blank">AUC</a>, which assesses the relative error rates (false positive, false negative) across the whole range of possible decision thresholds. The result is a curve that looks something like this:</p>
<p style="text-align:center;"><a href="http://bayesianbiologist.files.wordpress.com/2013/05/auc.png" ref="nofollow" target="_blank"><img class="aligncenter  wp-image-918" alt="auc" src="http://bayesianbiologist.files.wordpress.com/2013/05/auc.png?w=384&#038;h=384" width="384" height="384" /></a></p>
<p>The area under this curve (the curve itself is the Receiver Operating Characteristic, or ROC, curve) is some value between 0 and 1. The higher this value, the better your model is said to perform. The problem with this metric, as many authors have pointed out, is that a model can perform very well in terms of AUC, but be completely miscalibrated in terms of the actual <em>probabilities</em> placed on each outcome.</p>
<p>A model which distinguishes perfectly between positive and negative cases (AUC=1) by placing a probability of 0.01 on positive cases and 0.001 on negative cases may be very far off in terms of the actual probability of a positive case. For instance, positive cases may actually occur with probability 0.6 and negative cases with 0.2. In most real situations, our models will predict a whole range of different probabilities with a unique prediction for each data point, but the general idea remains. If your goal is simply to distinguish between cases, you may not care whether the probabilities are correct. However, if your model is purporting to quantify risk, then you very much want to know if you are placing <em>probabilistically true predictions</em> on cases that are yet to be observed.</p>
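<p>The gap between ranking and calibration can be sketched in a few lines. The following is a minimal illustration in Python, not code from the original analysis: the <code>auc</code> helper, the ten made-up outcomes and the constant predictions of 0.01 and 0.001 are all assumptions chosen to mirror the numbers above.</p>

```python
def auc(pos_scores, neg_scores):
    """Chance that a random positive case outranks a random negative one.

    Ties count as half a win. This pairwise form is equivalent to the
    area under the ROC curve.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical data: 6 observed positives and 4 observed negatives.
outcomes = [1] * 6 + [0] * 4

# Hypothetical model: 0.01 for every positive case, 0.001 for every negative.
predictions = [0.01 if y == 1 else 0.001 for y in outcomes]

pos = [p for p, y in zip(predictions, outcomes) if y == 1]
neg = [p for p, y in zip(predictions, outcomes) if y == 0]

model_auc = auc(pos, neg)                             # ranking quality
mean_predicted = sum(predictions) / len(predictions)  # average stated risk
observed_rate = sum(outcomes) / len(outcomes)         # what actually happened

print("AUC:", model_auc)
print("mean predicted probability:", mean_predicted)
print("observed positive rate:", observed_rate)
```

<p>Under these assumptions the AUC is 1, yet the average predicted probability is far below the observed rate of positive cases. AUC only looks at the ordering of the predictions, so it cannot see this kind of miscalibration.</p>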
<p>Which raises the question:<strong> What is probabilistic truth? </strong></p>
<p>This question appears, at least at first, to be rather simple. A frequentist definition would say that the probability is correct, or <em>true</em>, if the predicted probability is equal to the long-run frequency of outcomes.  Think of a die rolled over and over, counting the number of times a one is rolled. We would compare this frequency to our predicted probability of rolling a one (1/6 for a fair six-sided die) and would say that our predicted probability was true if this frequency matched 1/6.</p>
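<p>The frequentist check can be simulated directly. This is a hedged sketch in Python, assuming a fair die and a fixed random seed (the function name <code>frequency_of_one</code> is made up for illustration): as the number of rolls grows, the observed frequency of ones should settle near 1/6.</p>

```python
import random


def frequency_of_one(n_rolls, seed=0):
    """Roll a simulated fair six-sided die n_rolls times.

    Returns the fraction of rolls that came up one. The seed is fixed
    only so that repeated runs give the same numbers.
    """
    rng = random.Random(seed)
    ones = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 1)
    return ones / n_rolls


predicted = 1 / 6  # predicted probability of rolling a one

for n in (100, 10_000, 100_000):
    observed = frequency_of_one(n)
    print(n, "rolls: observed", observed, "predicted", predicted)
```

<p>With only a handful of rolls the observed frequency can sit well away from 1/6, which is why the definition leans on the long run.</p>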
<p>But what about situations where we can&#8217;t re-run an experiment over and over again? How then would we evaluate the probabilistic truth of our predictions?</p>
<p>I&#8217;ll be working through this problem in a series of posts in the coming weeks. Stay tuned!</p>
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://bayesianbiologist.com/2013/05/18/what-is-probabilistic-truth/"> bayesianbiologist » Rstats</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/what-is-probabilistic-truth/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/4e3c284399ac0ca4375ef4bea2ba4d03?s=96&amp;d=identicon&amp;r=G" length="" type="" />
<enclosure url="http://bayesianbiologist.files.wordpress.com/2013/05/auc.png" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/what-is-probabilistic-truth/</feedburner:origLink></item>
		<item>
		<title>Recent Changes to caret</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/9V8awutOg7k/</link>
		<comments>http://www.r-bloggers.com/recent-changes-to-caret/#comments</comments>
		<pubDate>Sat, 18 May 2013 12:19:35 +0000</pubDate>
		<dc:creator>Max Kuhn</dc:creator>
		
		<guid isPermaLink="false">http://www.r-bloggers.com/?guid=c3272eb6d3dcd94bbd3e1b88d859a53f</guid>
		<description><![CDATA[
<p>Here is a summary of some recent changes to <a href="http://caret.r-forge.r-project.org/">caret</a>. </p>

<p>Feature Updates:</p>

<ul>
<li><p><code>train</code> was updated to use recent changes in the <a href="http://code.google.com/p/gradientboostedmodels/">gbm</a> package that allow for boosting with three or more classes (via the multinomial distribution).</p></li>
<li><p>The Yeo-Johnson power transformation was added. This is very similar to the Box-Cox transformation, but it does not require the data to be greater than zero.</p></li>
</ul>
<p>New models referenced by <code>train</code>:</p>

<ul>
<li><p>Maximum uncertainty linear discriminant analysis (<code>Mlda</code>) and factor-based linear discriminant analysis (<code>RFlda</code>) from the <a href="http://cran.r-project.org/web/packages/HiDimDA/index.html">HiDimDA</a> package were added. </p></li>
<li><p>The <code>kknn.train</code> model in the <a href="http://cran.r-project.org/web/packages/kknn/index.html">kknn</a> package was added. This is basically a more intelligent <em>K</em>-nearest neighbors model that can use distance weighting, non-Euclidean distances (via the Minkowski distance) and a few other features. </p></li>
<li><p>The <code>extraTrees</code> function in the <a href="http://cran.r-project.org/web/packages/extraTrees/index.html">package of the same name</a> was added. This generalizes the random forest model by adding randomness to the predictors <em>and</em> the split values that are evaluated at each split point.</p></li>
</ul>
<p>Numerous bugs were also fixed in the last few releases. </p>

<p>The new version is <a href="http://cran.r-project.org/web/packages/caret/index.html">5.16-04</a>.
Feel free to email me at <a href="mailto:mxkuhn@gmail.com">mxkuhn@gmail.com</a> if you have any feature requests or questions. </p>
]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://appliedpredictivemodeling.com/blog/2013/5/18/recent-changes-to-caret"> Blog - Applied Predictive Modeling</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<p>Here is a summary of some recent changes to <a href="http://caret.r-forge.r-project.org/" ref="nofollow" target="_blank">caret</a>. </p>

<p>Feature Updates:</p>

<ul>
<li><p><code>train</code> was updated to use recent changes in the <a href="http://code.google.com/p/gradientboostedmodels/" ref="nofollow" target="_blank">gbm</a> package that allow for boosting with three or more classes (via the multinomial distribution).</p></li>
<li><p>The Yeo-Johnson power transformation was added. This is very similar to the Box-Cox transformation, but it does not require the data to be greater than zero.</p></li>
</ul>
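<p>To illustrate why the Yeo-Johnson transformation does not need positive data, here is a small base R sketch of the standard Yeo-Johnson formula. The function name <code>yeo_johnson</code> is hypothetical, the code is not taken from caret, and it assumes a numeric input without missing values:</p>

```r
# Sketch of the Yeo-Johnson power transformation in base R.
# `yeo_johnson` is a hypothetical helper, not a caret function.
yeo_johnson <- function(y, lambda, tol = 1e-8) {
  stopifnot(is.numeric(y), !anyNA(y), length(lambda) == 1)
  out <- numeric(length(y))
  pos <- y >= 0

  # Non-negative values: a Box-Cox transformation of (y + 1)
  if (abs(lambda) > tol) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log1p(y[pos])
  }

  # Negative values: a mirrored Box-Cox of (1 - y) with power 2 - lambda
  if (abs(lambda - 2) > tol) {
    out[!pos] <- -(((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda))
  } else {
    out[!pos] <- -log1p(-y[!pos])
  }
  out
}

x <- c(-2, -0.5, 0, 1, 3)
yeo_johnson(x, lambda = 0.5)   # defined for negative, zero and positive values
yeo_johnson(x, lambda = 1)     # lambda = 1 leaves the data unchanged
```

<p>Box-Cox would fail on the negative and zero entries of <code>x</code>, whereas the two-branch definition above handles them directly.</p>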

<p>New models referenced by <code>train</code>:</p>

<ul>
<li><p>Maximum uncertainty linear discriminant analysis (<code>Mlda</code>) and factor-based linear discriminant analysis (<code>RFlda</code>) from the <a href="http://cran.r-project.org/web/packages/HiDimDA/index.html" ref="nofollow" target="_blank">HiDimDA</a> package were added. </p></li>
<li><p>The <code>kknn.train</code> model in the <a href="http://cran.r-project.org/web/packages/kknn/index.html" ref="nofollow" target="_blank">kknn</a> package was added. This is basically a more intelligent <em>K</em>-nearest neighbors model that can use distance weighting, non-Euclidean distances (via the Minkowski distance) and a few other features. </p></li>
<li><p>The <code>extraTrees</code> function in the <a href="http://cran.r-project.org/web/packages/extraTrees/index.html" ref="nofollow" target="_blank">package of the same name</a> was added. This generalizes the random forest model by adding randomness to the predictors <em>and</em> the split values that are evaluated at each split point.</p></li>
</ul>
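<p>As a rough illustration of what a non-Euclidean Minkowski distance means, the base R sketch below computes it by hand and compares it with <code>stats::dist</code>. The helper <code>minkowski</code> and the two example points are assumptions made for this example, and nothing here uses the kknn package itself:</p>

```r
# Minkowski distance of order p between two numeric vectors.
# `minkowski` is a hypothetical helper for illustration only.
minkowski <- function(u, v, p) sum(abs(u - v)^p)^(1 / p)

a <- c(0, 0)
b <- c(3, 4)

d1 <- minkowski(a, b, p = 1)   # p = 1 is the Manhattan distance
d2 <- minkowski(a, b, p = 2)   # p = 2 is the usual Euclidean distance
d3 <- minkowski(a, b, p = 3)   # larger p weights the biggest coordinate gap more

# Base R offers the same metric through dist()
d3_builtin <- as.numeric(dist(rbind(a, b), method = "minkowski", p = 3))

c(manhattan = d1, euclidean = d2, order3 = d3, order3_builtin = d3_builtin)
```

<p>Changing the order <code>p</code> changes which neighbors count as nearest, which is why it is a useful tuning parameter for a nearest-neighbor model.</p>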

<p>Numerous bugs were also fixed in the last few releases. </p>

<p>The new version is <a href="http://cran.r-project.org/web/packages/caret/index.html" ref="nofollow" target="_blank">5.16-04</a>.
Feel free to email me at <a href="mailto:mxkuhn@gmail.com" ref="nofollow" target="_blank">&#x6D;&#x78;&#x6B;&#x75;&#x68;&#x6E;&#x40;&#103;&#109;&#x61;&#105;&#x6C;&#46;c&#111;&#109;</a> if you have any feature requests or questions. </p>
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://appliedpredictivemodeling.com/blog/2013/5/18/recent-changes-to-caret"> Blog - Applied Predictive Modeling</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/recent-changes-to-caret/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.r-bloggers.com/recent-changes-to-caret/</feedburner:origLink></item>
		<item>
		<title>Wiekvoet 2013-05-18 02:26:00</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/DVP4E4jI_CI/</link>
		<comments>http://www.r-bloggers.com/wiekvoet-2013-05-18-022600/#comments</comments>
		<pubDate>Sat, 18 May 2013 08:26:00 +0000</pubDate>
		<dc:creator>Wingfeet</dc:creator>
		
		<guid isPermaLink="false">http://www.r-bloggers.com/?guid=3cc3d808a60f7ed0123921464430dab1</guid>
		<description><![CDATA[I was reading Paul Hiemstra's blogpost on&#160;Much more efficient bubble sort in R using the Rcpp and inline packages, went back to his first post&#160;&#160;Bubble sort implemented in pure R&#160;and thought, surely we can do it better in pure R. So I...]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://wiekvoet.blogspot.com/2013/05/i-was-reading-paul-hiemstas-blogpost-on.html"> Wiekvoet</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
I was reading Paul Hiemstra's blog post on&nbsp;<a href="http://www.numbertheory.nl/2013/05/14/much-more-efficient-bubble-sort-in-r-using-the-rcpp-and-inline-packages/" ref="nofollow" target="_blank">Much more efficient bubble sort in R using the Rcpp and inline packages</a>, went back to his first post,&nbsp;<a href="http://www.numbertheory.nl/2013/05/10/bubble-sort-implemented-in-pure-r/" ref="nofollow" target="_blank">Bubble sort implemented in pure R</a>,&nbsp;and thought: surely we can do better in pure R. So I cleaned up the inner loops, removed a recursion, made some variations, and finally got a big improvement by vectorizing. I don't need to try Rcpp to know that it is and will remain faster, but I surely closed the gap a bit.<br /><h4>Original code</h4>This is the pure R code as it originally appeared in the blog. Any changes are to layout only.<br /><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">larger = function(pair) {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; if(pair[1] &gt; pair[2]) return(TRUE) else return(FALSE)</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">swap_if_larger = function(pair) {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; if(larger(pair)) {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; return(rev(pair))&nbsp;</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; 
font-size: x-small;">&nbsp; } else {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; return(pair)</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">swap_pass = function(vec) {&nbsp;</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for(i in seq(1, length(vec)-1)) {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; vec[i:(i+1)] = swap_if_larger(vec[i:(i+1)])</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; return(vec)</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">bubble_sort = function(vec) {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; new_vec = swap_pass(vec)</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; 
if(isTRUE(all.equal(vec, new_vec))) {&nbsp;</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; return(new_vec)&nbsp;</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; } else {</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; return(bubble_sort(new_vec))</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></span><br /><span style="background-color: #f3f3f3;"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></span><br /><h4>Improve inside of loops</h4><div>This is essentially the same code, but cleaned the first two functions.&nbsp;</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">larger1 = function(pair) pair[1] &gt; pair[2]</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">swap_if_larger1 = function(pair) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; if(larger1(pair)) rev(pair)&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; else pair</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div><div><br /></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; 
font-size: x-small;">swap_pass1 = function(vec) {&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for(i in seq(1, length(vec)-1)) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; vec[i:(i+1)] = swap_if_larger1(vec[i:(i+1)])</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; return(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">bubble_sort1 = function(vec) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; new_vec = swap_pass1(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; if(isTRUE(all.equal(vec, new_vec))) {&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; return(new_vec)&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; } else {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; return(bubble_sort1(new_vec))</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; 
font-size: x-small;">&nbsp; }</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div></div><h4>Improve outside loops</h4><div>I cleaned up the outer loop, then decided that the two inner functions were no longer needed, because the swap reduces to a single statement.</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">swap_pass2 = function(vec) {&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for(i in 1:(length(vec)-1)) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; if (vec[i] &gt; vec[i+1] ) vec[c(i,i+1)] &lt;- vec[c(i+1,i)]</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; return(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">bubble_sort2 = function(vec) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; new_vec = swap_pass2(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; if(identical(vec, new_vec)) new_vec&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, 
Courier, monospace; font-size: x-small;">&nbsp; else bubble_sort2(new_vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div></div><h4>No recursion</h4><div>This was not actually intended as a speed improvement, but R was complaining that the recursion was too deep. The same swap_pass2 as before is used.&nbsp;</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">bubble_sort3 = function(vec) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; new_vec = swap_pass2(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; while (!identical(vec, new_vec)) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; vec &lt;- new_vec</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; new_vec = swap_pass2(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; new_vec&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div></div><h4>A different setup</h4><div>The way I understood bubble sort differs from what has been programmed so far. So far, the sort is complete when a pass makes no changes, which is checked via a vector comparison. I always understood that you don't check: the first bubble goes all the way to the end, at which point the last element is the maximum. The second bubble then stops at end-1, and so on. 
The final bubble covers only the first two elements, after which the algorithm is finished.&nbsp;</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">swap_pass4 = function(vec,iend) {&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for(i in 1:iend) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; if (vec[i] &gt; vec[i+1] ) vec[c(i,i+1)] &lt;- vec[c(i+1,i)]</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; return(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div><div><span style="background-color: #f3f3f3; font-family: 'Courier New', Courier, monospace; font-size: x-small;">bubble_sort4 = function(vec) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for (iend in (length(vec)-1):1) vec &lt;- swap_pass4(vec,iend)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; vec</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div></div><h4>Tuning Paul's original algorithm</h4><div>I was a bit disappointed with the improvements from using the true bubble sort.&nbsp;Apparently there is some gain from checking whether the process is complete, which is logical, since the bubbles also do some sorting of the intermediate elements. 
So, what about bubbles that go up and go down combined with a check on no improvements?</div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">swap_pass3b = function(vec) {&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for(i in length(vec):2) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; if (vec[i] &lt; vec[i-1] ) vec[c(i,i-1)] &lt;- vec[c(i-1,i)]</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; return(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">bubble_sort3b = function(vec) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; new_vec = swap_pass2(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; while (!identical(vec, new_vec)) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; new_vec = swap_pass2(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; vec &lt;- swap_pass3b(new_vec)</span></div><div><span 
style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; vec&nbsp;</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div><h4>Vectorizing the bubbles</h4><div>When you look at the bubblesort without checking, it seems it is only assumed that the first step brings the highest element at the last position. This can be achieved by just pulling the highest element and placing it at the end, without the intermediate swaps.</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">bubble_sort5 = function(vec) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; wm &lt;- which.max(vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; vec &lt;- c(vec[-wm],vec[wm])</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; for (iend in ((length(vec)-1):2)) {</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; wm &lt;- which.max(vec[1:iend])</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; vec &lt;- c(vec[1:iend][-wm],vec[1:iend][wm],vec[(iend+1):length(vec)])</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; }</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; 
vec</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div></div><h2>Results</h2><div>First, a demonstration that they all work:</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">test_vec = round(runif(10, 0, 100))</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">cbind(</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort(test_vec),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort1(test_vec),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort2(test_vec),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort3(test_vec),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort4(test_vec),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort3b(test_vec),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; bubble_sort5(test_vec)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">)</span></div></div><div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp; &nbsp; &nbsp; [,1] [,2] [,3] [,4] [,5] [,6] [,7]</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[1,] &nbsp; 28 
&nbsp; 28 &nbsp; 28 &nbsp; 28 &nbsp; 28 &nbsp; 28 &nbsp; 28</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[2,] &nbsp; 41 &nbsp; 41 &nbsp; 41 &nbsp; 41 &nbsp; 41 &nbsp; 41 &nbsp; 41</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[3,] &nbsp; 42 &nbsp; 42 &nbsp; 42 &nbsp; 42 &nbsp; 42 &nbsp; 42 &nbsp; 42</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[4,] &nbsp; 46 &nbsp; 46 &nbsp; 46 &nbsp; 46 &nbsp; 46 &nbsp; 46 &nbsp; 46</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[5,] &nbsp; 52 &nbsp; 52 &nbsp; 52 &nbsp; 52 &nbsp; 52 &nbsp; 52 &nbsp; 52</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[6,] &nbsp; 68 &nbsp; 68 &nbsp; 68 &nbsp; 68 &nbsp; 68 &nbsp; 68 &nbsp; 68</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[7,] &nbsp; 89 &nbsp; 89 &nbsp; 89 &nbsp; 89 &nbsp; 89 &nbsp; 89 &nbsp; 89</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[8,] &nbsp; 93 &nbsp; 93 &nbsp; 93 &nbsp; 93 &nbsp; 93 &nbsp; 93 &nbsp; 93</span></div><div><span style="font-family: Courier New, Courier, monospace;">&nbsp;[9,] &nbsp; 98 &nbsp; 98 &nbsp; 98 &nbsp; 98 &nbsp; 98 &nbsp; 98 &nbsp; 98</span></div><div><span style="font-family: Courier New, Courier, monospace;">[10,] &nbsp;100 &nbsp;100 &nbsp;100 &nbsp;100 &nbsp;100 &nbsp;100 &nbsp;100</span></div></div><h4>Speed</h4><div>Now the timing. Cleaning up the outer loops saves about a third of the time (25.5 s down to 17.9 s). The bubble as I remembered it and the up-and-down bubble improve a bit further. We are now at about 13% of the original time. 
The final version, which uses the vectorization, blows them all out of the water.&nbsp;</div><div><div><span style="font-family: Courier New, Courier, monospace;">test_vec = runif(1000, 0, 100)</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp;25.48 &nbsp; &nbsp;0.02 &nbsp; 25.69&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort1(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp;17.88 &nbsp; &nbsp;0.00 &nbsp; 17.91&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort2(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; 4.75 &nbsp; &nbsp;0.00 &nbsp; &nbsp;4.75&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort3(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; 4.75 &nbsp; &nbsp;0.00 &nbsp; &nbsp;4.74&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort4(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; 3.45 &nbsp; &nbsp;0.00 &nbsp; 
&nbsp;3.44&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort3b(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; 3.57 &nbsp; &nbsp;0.00 &nbsp; &nbsp;3.59&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;">system.time(bubble_sort5(test_vec))</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; user &nbsp;system elapsed&nbsp;</span></div><div><span style="font-family: Courier New, Courier, monospace;"># &nbsp; 0.06 &nbsp; &nbsp;0.00 &nbsp; &nbsp;0.06&nbsp;</span></div></div><h4>How about sort()?</h4><div>sort() is much faster and scales much better, even more so after subtracting the time for the sample() call. But I don't need to do the exercise to know that a suboptimal algorithm in an interpreted language cannot compete with modern algorithms in a compiled language.</div><div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">library(microbenchmark)</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">vector_size = 100</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">print(microbenchmark(bubble_sort5(sample(vector_size)),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; &nbsp; &nbsp; sort(sample(vector_size)),</span></div><div><span style="background-color: #f3f3f3; font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; &nbsp; &nbsp; sample(vector_size)) &nbsp;)</span></div></div><div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Unit: microseconds</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; expr &nbsp; &nbsp; &nbsp;min &nbsp; &nbsp; &nbsp; lq &nbsp; &nbsp;median &nbsp; &nbsp; &nbsp; uq</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp;bubble_sort5(sample(vector_size)) 1730.041 1776.957 1806.2795 1866.758</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;sort(sample(vector_size)) &nbsp;104.096 &nbsp;114.359 &nbsp;126.4540 &nbsp;180.334</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;sample(vector_size) &nbsp; &nbsp;5.865 &nbsp; &nbsp;8.797 &nbsp; &nbsp;9.8965 &nbsp; 11.729</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp; &nbsp; max neval</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp;9978.521 &nbsp; 100</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; 233.849 &nbsp; 100</span></div><div><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">&nbsp; &nbsp;21.992 &nbsp; 
100</span></div></div><div><br /></div>
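<div>The vectorized version moves the largest element of the unsorted part to its end on every pass, which makes it closer to a selection sort than to a bubble sort. Below is a minimal sketch of the same idea that swaps in place instead of rebuilding the vector. The name <code>max_to_end_sort</code> is made up for illustration, and the input is assumed to be a numeric vector without missing values.</div>
<pre><code># Hypothetical helper, not one of the functions timed above.
# Assumes a numeric vector without NA values.
max_to_end_sort &lt;- function(vec) {
  n &lt;- length(vec)
  if (n &lt; 2) return(vec)            # nothing to sort
  for (iend in n:2) {
    wm &lt;- which.max(vec[1:iend])    # position of the largest unsorted element
    vec[c(wm, iend)] &lt;- vec[c(iend, wm)]  # swap it to the end of the unsorted part
  }
  vec
}

set.seed(1)
x &lt;- runif(1000, 0, 100)
# Compare against the built-in sort() as a correctness check
stopifnot(identical(max_to_end_sort(x), sort(x)))
</code></pre>
<div>The early return also covers vectors of length 0 and 1, which the loop bounds would otherwise not handle.</div>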
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://wiekvoet.blogspot.com/2013/05/i-was-reading-paul-hiemstas-blogpost-on.html"> Wiekvoet</a></strong>.</div>
<hr />
<a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers.com</a> offers <strong><a href="http://feedburner.google.com/fb/a/mailverify?uri=RBloggers" rel="nofollow">daily e-mail updates</a></strong> about <a title="The R Project for Statistical Computing" href="http://www.r-project.org/" rel="nofollow">R</a> news and <a title="R tutorials" href="http://www.r-bloggers.com/?s=tutorial" rel="nofollow">tutorials</a> on topics such as: visualization (<a title="ggplot and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=ggplot2" rel="nofollow">ggplot2</a>, <a title="Boxplots using lattice and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=boxplot" rel="nofollow">Boxplots</a>, <a title="Maps and gis" href="http://www.r-bloggers.com/?s=map" rel="nofollow">maps</a>, <a title="Animation in R" href="http://www.r-bloggers.com/?s=animation" rel="nofollow">animation</a>), programming (<a title="RStudio IDE for R" href="http://www.r-bloggers.com/?s=RStudio" rel="nofollow">RStudio</a>, <a title="Sweave and literate programming" href="http://www.r-bloggers.com/?s=sweave" rel="nofollow">Sweave</a>, <a title="LaTeX in R" href="http://www.r-bloggers.com/?s=LaTeX" rel="nofollow">LaTeX</a>, <a title="SQL and databases" href="http://www.r-bloggers.com/?s=SQL" rel="nofollow">SQL</a>, <a title="Eclipse IDE for R" href="http://www.r-bloggers.com/?s=eclipse" rel="nofollow">Eclipse</a>, <a title="git and github, Version Control System" href="http://www.r-bloggers.com/?s=git" rel="nofollow">git</a>, <a title="Large data in R using Hadoop" href="http://www.r-bloggers.com/?s=hadoop" rel="nofollow">hadoop</a>, <a title="Web Scraping of google, facebook, yahoo, twitter and more using R" href="http://www.r-bloggers.com/?s=Web+Scraping" rel="nofollow">Web Scraping</a>) statistics (<a title="Regressions and ANOVA analysis tutorials" href="http://www.r-bloggers.com/?s=regression" rel="nofollow">regression</a>, <a title="principal component analysis tutorial" 
href="http://www.r-bloggers.com/?s=PCA" rel="nofollow">PCA</a>, <a title="Time series" href="http://www.r-bloggers.com/?s=time+series" rel="nofollow">time series</a>, <a title="finance trading" href="http://www.r-bloggers.com/?s=trading" rel="nofollow">trading</a>) and more...
</div></p><img src="http://feeds.feedburner.com/~r/RBloggers/~4/DVP4E4jI_CI" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/wiekvoet-2013-05-18-022600/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/wiekvoet-2013-05-18-022600/</feedburner:origLink></item>
		<item>
		<title>Interfacing XTide and R</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/a0F72BpbT3o/</link>
		<comments>http://www.r-bloggers.com/interfacing-xtide-and-r/#comments</comments>
		<pubDate>Sat, 18 May 2013 03:02:18 +0000</pubDate>
		<dc:creator>Luke Miller</dc:creator>
		
		<guid isPermaLink="false">http://lukemiller.org/?p=1536</guid>
		<description><![CDATA[XTide is an open-source program that predicts tide heights and current speeds for hundreds of tide and current stations around the United States. It can be used to produce tide predictions in the past and future for a site at your chosen interval (down...]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://lukemiller.org/index.php/2013/05/interfacing-xtide-and-r/"> lukemiller.org » R-project</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
XTide is an open-source program that predicts tide heights and current speeds for hundreds of tide and current stations around the United States. It can be used to produce tide predictions in the past and future for a site at your chosen interval (down to the minute), as well as producing sunrise and sunset times, [...]
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://lukemiller.org/index.php/2013/05/interfacing-xtide-and-r/"> lukemiller.org » R-project</a></strong>.</div>
</div></p><img src="http://feeds.feedburner.com/~r/RBloggers/~4/a0F72BpbT3o" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/interfacing-xtide-and-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/interfacing-xtide-and-r/</feedburner:origLink></item>
		<item>
		<title>Unit conversion in R</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/IgjkSAA5L8g/</link>
		<comments>http://www.r-bloggers.com/unit-conversion-in-r/#comments</comments>
		<pubDate>Fri, 17 May 2013 22:22:00 +0000</pubDate>
		<dc:creator>Karsten W.</dc:creator>
		
		<guid isPermaLink="false">http://www.r-bloggers.com/?guid=f10085c02fb0dec6761c474c9de31f75</guid>
		<description><![CDATA[
<p>Last weekend I submitted an update of my R package <code>datamart</code> to CRAN. It has been more than half a year since the last update, but there are only minor advances. The package is still in its early stages, and very experimental.</p>
<p>One new feature is the function <code>uconv</code>. Think <code>iconv</code>, but instead of converting character vectors between different encodings, this function converts numerical vectors between different units of measurement. Now if you want to know how many centimeters one horse length is, you can write in R:</p>
<pre><code>&#62; #install.packages("datamart")
&#62; library(datamart)
&#62; uconv(1, "horse length", "cm")
</code></pre>
<p>and you will get the answer 240. I had the idea for this function when I had to convert between various energy units, including natural units of energy fuels like cubic metres of natural gas. The <code>uconv</code> function supports this, using common constants for the conversion.</p>
<pre><code>&#62; uconv(1, "Mtoe", "PJ")
[1] 41.88
&#62; uconv(1, "m&#38;sup3; NG", "kWh")
[1] 10.55556
</code></pre>
<p>These conversions may be ambiguous. For instance, the last one combines a volume and an energy dimension. An optional parameter, <code>uset</code>, specifies the context, or unit set:</p>
<pre><code>&#62; uconv(1, "Mtoe", "PJ", uset="Energy")
</code></pre>
<p>The currently available unit sets and units therein can be inspected with</p>
<pre><code>&#62; uconvlist()
</code></pre>
<p>The first argument can be a numerical vector:</p>
<pre><code>&#62; set.seed(13)
&#62; uconv(37+2*rnorm(5), "&#176;C", "&#176;F", uset="Temperature")
[1] 100.59558  97.59102 104.99059  99.27435 102.71309
</code></pre>
]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://factbased.blogspot.com/2013/05/unit-conversion-in-r.html"> factbased</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<p>Last weekend I submitted an update of my R package <code>datamart</code> to CRAN. It has been more than half a year since the last update, but there are only minor advances. The package is still in its early stages, and very experimental.</p><p>One new feature is the function <code>uconv</code>. Think <code>iconv</code>, but instead of converting character vectors between different encodings, this function converts numerical vectors between different units of measurement. Now if you want to know how many centimeters one horse length is, you can write in R:</p><pre><code>&gt; #install.packages(&quot;datamart&quot;)
&gt; library(datamart)
&gt; uconv(1, &quot;horse length&quot;, &quot;cm&quot;)
</code></pre><p>and you will get the answer 240. I had the idea for this function when I had to convert between various energy units, including natural units of energy fuels like cubic metres of natural gas. The <code>uconv</code> function supports this, using common constants for the conversion.</p><pre><code>&gt; uconv(1, &quot;Mtoe&quot;, &quot;PJ&quot;)
[1] 41.88
&gt; uconv(1, &quot;m³ NG&quot;, &quot;kWh&quot;)
[1] 10.55556
</code></pre><p>These conversions may be ambiguous. For instance, the last one combines a volume and an energy dimension. An optional parameter, <code>uset</code>, specifies the context, or unit set:</p><pre><code>&gt; uconv(1, &quot;Mtoe&quot;, &quot;PJ&quot;, uset=&quot;Energy&quot;)
</code></pre><p>The currently available unit sets and units therein can be inspected with</p><pre><code>&gt; uconvlist()
</code></pre><p>The first argument can be a numerical vector:</p><pre><code>&gt; set.seed(13)
&gt; uconv(37+2*rnorm(5), &quot;°C&quot;, &quot;°F&quot;, uset=&quot;Temperature&quot;)
[1] 100.59558  97.59102 104.99059  99.27435 102.71309
</code></pre>
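<p>How <code>uconv</code> works internally is not shown here. One common way to build such a converter is a table of factors to a shared base unit. The following is a minimal sketch of that idea, not the <code>datamart</code> implementation: the function <code>convert_units</code> and the table <code>energy_factors</code> are hypothetical names, and only purely multiplicative units are covered (temperature scales also need an offset).</p>
<pre><code># Hypothetical factor table: number of joules in one unit
energy_factors &lt;- c(J = 1, kWh = 3.6e6, PJ = 1e15)

# Hypothetical converter: go through the base unit (joule)
convert_units &lt;- function(x, from, to, factors) {
  stopifnot(from %in% names(factors), to %in% names(factors))
  x * factors[[from]] / factors[[to]]
}

# Works on numeric vectors, like the first argument of uconv
convert_units(c(1, 2.5), &quot;PJ&quot;, &quot;kWh&quot;, energy_factors)
</code></pre>
<p>Keeping one factor per unit, instead of one per pair of units, means that a new unit needs a single table entry.</p>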
<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://factbased.blogspot.com/2013/05/unit-conversion-in-r.html"> factbased</a></strong>.</div>
</div></p><img src="http://feeds.feedburner.com/~r/RBloggers/~4/IgjkSAA5L8g" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/unit-conversion-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/unit-conversion-in-r/</feedburner:origLink></item>
		<item>
		<title>Chutes &amp; ladders: How long is this going to take?</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/eTHHr_dSOGE/</link>
		<comments>http://www.r-bloggers.com/chutes-ladders-how-long-is-this-going-to-take/#comments</comments>
		<pubDate>Fri, 17 May 2013 21:05:42 +0000</pubDate>
		<dc:creator>Karl Broman</dc:creator>
		
		<guid isPermaLink="false">http://kbroman.wordpress.com/?p=1786</guid>
		<description><![CDATA[I was playing Chutes &#38; Ladders with my four-year-old daughter yesterday, and I thought, &#8220;How long is this going to take?&#8221; I saw an interesting mathematical analysis of the game a few years ago, but it seems to be offline, though you can read it via the wayback machine. But that didn&#8217;t answer my specific [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=kbroman.wordpress.com&#38;blog=26292872&#38;post=1786&#38;subd=kbroman&#38;ref=&#38;feed=1" width="1" height="1">
]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://kbroman.wordpress.com/2013/05/17/chutes-ladders-how-long-is-this-going-to-take/"> The stupidest thing... » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<p>I was playing Chutes &amp; Ladders with my four-year-old daughter yesterday, and I thought, &ldquo;How long is this going to take?&rdquo;</p>
<p>I saw an <a href="http://datagenetics.com/blog/november12011/" ref="nofollow" target="_blank">interesting mathematical analysis of the game</a> a few years ago, but it seems to be offline, though <a href="http://web.archive.org/web/20120819014527/http://www.datagenetics.com/blog/november12011/" ref="nofollow" target="_blank">you can read it</a> via the <a href="http://web.archive.org/" ref="nofollow" target="_blank">wayback machine</a>.</p>
<p>But that didn&#8217;t answer my specific question, namely, &ldquo;How long is this going to take?&rdquo;</p>
<p>So I wrote a bit of <a href="https://gist.github.com/kbroman/5600209" ref="nofollow" target="_blank">R code to simulate the game</a>.</p>
<p>Here&#8217;s the distribution of the number of spins to complete the game, by number of players:</p>
<p><a href="http://kbroman.files.wordpress.com/2013/05/chutes_and_ladders_spins.png" ref="nofollow" target="_blank"><img src="http://kbroman.files.wordpress.com/2013/05/chutes_and_ladders_spins.png?w=450&#038;h=225" alt="No. spins in chutes &amp; ladders" width="450" height="225" class="aligncenter size-large wp-image-1787" /></a></p>
<p>With two players, the average number of spins is 52, with a 90th percentile of 88.</p>
<p>If you add a third player, the average increases to 65, and the 90th percentile increases to 103.  You&#8217;re playing fewer rounds, but each round now takes three spins instead of two. If you add a fourth player, the average is 76 and the 90th percentile is 117.</p>
<p>So, in trying to minimize the agony, it seems best to not encourage my eight-year-old son to join us in the game.  If he plays with us, there&#8217;s a 63% chance that it will take longer.</p>
<p>And that&#8217;s particularly true because then the chance of my daughter winning drops from about 1/2 to about 1/3.  </p>
<p>That raises another question: if I let her go first, what advantage does that give her?  Not much.  The chance that the person who goes first wins is 50.9%, 34.4%, and 25.9%, respectively, when there are 2, 3, and 4 players.  So not a noticeable amount.  Thus I cheat (on her behalf). Really, though, I&#8217;m cheating in order to shorten the game as much as to ensure that she wins.</p>
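<p>For readers who want to reproduce numbers like these, here is a minimal sketch of such a simulation. It is not the code from the gist linked above: the board layout (the standard Milton Bradley ladders and chutes), the 1-to-6 spinner, and the rule that a spin overshooting square 100 is forfeited are all assumptions, and <code>play_game</code> is a name made up for this example.</p>
<pre># Assumed board: standard Milton Bradley layout. Landing on from[i] moves you to to[i].
from = c(1, 4, 9, 21, 28, 36, 51, 71, 80,             # ladder bottoms
         16, 47, 49, 56, 62, 64, 87, 93, 95, 98)      # chute tops
to   = c(38, 14, 31, 42, 84, 44, 67, 91, 100,         # ladder tops
         6, 26, 11, 53, 19, 60, 24, 73, 75, 78)       # chute bottoms

# Play one game; return the total number of spins and the index of the winner
play_game = function(n_players) {
  pos = rep(0, n_players)                 # everyone starts off the board
  spins = 0
  repeat {
    for (p in seq_len(n_players)) {
      spins = spins + 1
      nxt = pos[p] + sample(6, 1)         # spinner shows 1 to 6
      if (nxt > 100) next                 # assumed rule: overshooting means no move
      hit = match(nxt, from)
      if (!is.na(hit)) nxt = to[hit]      # climb a ladder or slide down a chute
      pos[p] = nxt
      if (nxt == 100) return(c(spins=spins, winner=p))
    }
  }
}

set.seed(1)
two   = replicate(10000, play_game(2))    # 2 x 10000 matrix: rows "spins" and "winner"
three = replicate(10000, play_game(3))

mean(two["spins",])                       # average number of spins, two players
quantile(two["spins",], 0.9)              # 90th percentile, two players
mean(two["winner",] == 1)                 # share of games won by whoever goes first
mean(three["spins",] > two["spins",])     # rough chance the three-player game is longer</pre>
<p>The last line pairs independent two- and three-player games, which is one simple way to estimate how often adding a player makes the game longer. The estimates will move with the seed and with the rule assumptions above.</p>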
<p><em>Note</em>: There&#8217;s a close connection between this problem and my work on the multiple-strain recombinant inbred lines. (See <a href="http://www.ncbi.nlm.nih.gov/pubmed/15545647" ref="nofollow" target="_blank">this</a> and <a href="http://www.ncbi.nlm.nih.gov/pubmed/22345609" ref="nofollow" target="_blank">that</a>.)  I&#8217;m tempted to play around with it some more.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/chutes-ladders-how-long-is-this-going-to-take/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/1e38fbb5dcf9455f3738a218433e7670?s=96&amp;d=identicon&amp;r=G" length="" type="" />
<enclosure url="http://kbroman.files.wordpress.com/2013/05/chutes_and_ladders_spins.png?w=450" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/chutes-ladders-how-long-is-this-going-to-take/</feedburner:origLink></item>
		<item>
		<title>Which Torontonians Want a Casino?  Survey Analysis Part 2</title>
		<link>http://feedproxy.google.com/~r/RBloggers/~3/qIrRuJT_aTg/</link>
		<comments>http://www.r-bloggers.com/which-torontonians-want-a-casino-survey-analysis-part-2/#comments</comments>
		<pubDate>Fri, 17 May 2013 19:52:55 +0000</pubDate>
		<dc:creator>inkhorn82</dc:creator>
		
		<guid isPermaLink="false">http://rforwork.info/?p=265</guid>
		<description><![CDATA[In my last post I said that I would try to investigate the question of who actually does want a casino, and whether place of residence is a factor in where they want the casino to be built. &#160;So, here &#8230; <a href="http://rforwork.info/2013/05/17/which-torontonians-want-a-casino-survey-analysis-part-2/">Continue reading <span>&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=rforwork.info&#38;blog=34976952&#38;post=265&#38;subd=rforwork&#38;ref=&#38;feed=1" width="1" height="1">
]]></description>
				<content:encoded><![CDATA[<p class="syndicated-attribution">

<div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://rforwork.info/2013/05/17/which-torontonians-want-a-casino-survey-analysis-part-2/"> Data and Analysis with R, at Work</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/" rel="nofollow">R-bloggers)</a>      
</div></p>
<p>In my <a href="http://rforwork.info/2013/05/02/do-torontonians-want-a-new-casino-survey-analysis-part-1/" ref="nofollow" target="_blank">last post</a> I said that I would try to investigate the question of who actually does want a casino, and whether place of residence is a factor in where they want the casino to be built.  So, here goes something:</p>
<p>The first line of attack in this blog post is to distinguish between people based on their responses to the third question on the survey, the one asking people to rate the importance of a long list of issues.  When I looked at this list originally, I knew that I would want to reduce the dimensionality using PCA.</p>
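<pre>library(psych)  # provides principal()
# Extract 3 varimax-rotated components from the 16 issue ratings (columns 8 to 23)
issues.pca = principal(casino[,8:23], nfactors=3, rotate="varimax", scores=TRUE)</pre>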
<p>The PCA resulted in the 3 components listed in the table below.  The variables loading on the first component seemed to relate to the casino being a big attraction with lots of features, so I named it &#8220;Go Big or Go Home&#8221;.  The variables loading on the second component related to technical details, while those loading on the third dealt with social or environmental issues.</p>
<table border="0" cellspacing="0">
<tbody>
<tr>
<td align="LEFT" height="47"></td>
<td align="LEFT">Go Big or Go Home</td>
<td align="LEFT">Concerned with Technical Details</td>
<td align="LEFT">Concerned with Social/Environmental Issues</td>
<td align="LEFT">Issue/Concern</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_A</td>
<td align="RIGHT">0.181</td>
<td align="RIGHT">0.751</td>
<td align="LEFT"></td>
<td align="LEFT">Design of the facility</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_B</td>
<td align="RIGHT">0.366</td>
<td align="RIGHT">0.738</td>
<td align="LEFT"></td>
<td align="LEFT">Employment Opportunities</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_C</td>
<td align="RIGHT">0.44</td>
<td align="RIGHT">0.659</td>
<td align="LEFT"></td>
<td align="LEFT">Entertainment and cultural activities</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_D</td>
<td align="RIGHT">0.695</td>
<td align="RIGHT">0.361</td>
<td align="LEFT"></td>
<td align="LEFT">Expanded convention facilities</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_E</td>
<td align="LEFT"></td>
<td align="RIGHT">0.701</td>
<td align="RIGHT">0.346</td>
<td align="LEFT">Integration with surrounding areas</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_F</td>
<td align="RIGHT">0.808</td>
<td align="RIGHT">0.266</td>
<td align="LEFT"></td>
<td align="LEFT">New hotel accommodations</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_G</td>
<td align="RIGHT">-0.117</td>
<td align="LEFT"></td>
<td align="RIGHT">0.885</td>
<td align="LEFT">Problem gambling &amp; health concerns</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_H</td>
<td align="LEFT"></td>
<td align="LEFT"></td>
<td align="RIGHT">0.904</td>
<td align="LEFT">Public safety and social concerns</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_I</td>
<td align="LEFT"></td>
<td align="RIGHT">0.254</td>
<td align="RIGHT">0.716</td>
<td align="LEFT">Public space</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_J</td>
<td align="RIGHT">0.864</td>
<td align="RIGHT">0.218</td>
<td align="LEFT"></td>
<td align="LEFT">Restaurants</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_K</td>
<td align="RIGHT">0.877</td>
<td align="RIGHT">0.157</td>
<td align="LEFT"></td>
<td align="LEFT">Retail</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_L</td>
<td align="RIGHT">0.423</td>
<td align="RIGHT">0.676</td>
<td align="RIGHT">-0.1</td>
<td align="LEFT">Revenue for the city</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_M</td>
<td align="RIGHT">0.218</td>
<td align="RIGHT">0.703</td>
<td align="RIGHT">0.227</td>
<td align="LEFT">Support for local businesses</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_N</td>
<td align="RIGHT">0.647</td>
<td align="RIGHT">0.487</td>
<td align="RIGHT">-0.221</td>
<td align="LEFT">Tourist attraction</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_O</td>
<td align="LEFT"></td>
<td align="RIGHT">0.118</td>
<td align="RIGHT">0.731</td>
<td align="LEFT">Traffic concerns</td>
</tr>
<tr>
<td align="LEFT" height="16">Q3_P</td>
<td align="RIGHT">0.497</td>
<td align="RIGHT">0.536</td>
<td align="RIGHT">0.124</td>
<td align="LEFT">Training and career development</td>
</tr>
</tbody>
</table>
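<p>To make the mechanics behind a table like this less of a black box, here is a rough base-R sketch of a varimax-rotated PCA on made-up data. It is a stand-in, not the <code>principal()</code> call itself: the six variables and two latent attitudes are invented, and the loadings are built by hand from <code>eigen()</code> and rotated with <code>varimax()</code> from the stats package.</p>
<pre>set.seed(42)
n = 500
f1 = rnorm(n); f2 = rnorm(n)                    # two hypothetical latent attitudes
X = cbind(a1 = f1 + rnorm(n, sd=0.5),           # a1..a3 are driven by f1
          a2 = f1 + rnorm(n, sd=0.5),
          a3 = f1 + rnorm(n, sd=0.5),
          b1 = f2 + rnorm(n, sd=0.5),           # b1..b3 are driven by f2
          b2 = f2 + rnorm(n, sd=0.5),
          b3 = f2 + rnorm(n, sd=0.5))

e = eigen(cor(X))                               # PCA of the correlation matrix
L = e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))   # unrotated loadings, 2 components
rownames(L) = colnames(X)
rot = varimax(L)                                # rotate towards a simple structure
print(rot$loadings, cutoff=0.1)                 # loadings below the cutoff print as blanks</pre>
<p>After rotation each variable should load heavily on one component and only weakly on the other, which is what makes the components nameable. The <code>cutoff</code> argument hides small loadings, which may be why some cells in the table above are empty.</p>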
<p>Once I was satisfied that I had a decent understanding of what the PCA was telling me, I loaded the component scores into the original dataframe.</p>
<pre>casino[,110:112] = issues.pca$scores
names(casino)[110:112] = c("GoBigorGoHome","TechnicalDetails","Soc.Env.Issues")</pre>
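<p>Assigning by column position works, but it depends on knowing how wide the data frame already is. A small sketch of an alternative that avoids positions altogether, using toy stand-ins (<code>casino.toy</code> and <code>scores.toy</code> are invented for illustration):</p>
<pre># Toy stand-ins for the survey data frame and the matrix of component scores
casino.toy = data.frame(id = 1:4, Q6 = c("Neither", "City of Toronto", "Neither", ""))
scores.toy = matrix(rnorm(12), nrow=4)

# Name the score columns first, then bind them on; no column positions needed
colnames(scores.toy) = c("GoBigorGoHome", "TechnicalDetails", "Soc.Env.Issues")
casino.toy = cbind(casino.toy, scores.toy)
names(casino.toy)</pre>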
<p>In order to investigate the question of who wants a casino and where, I decided to use question 6 as a dependent variable (the one asking where they would want it built, if one were to be built) and the PCA components as independent variables.  This is a good question to use, because the answer options, if you remember, are &#8220;Toronto&#8221;, &#8220;Adjacent Municipality&#8221; and &#8220;Neither&#8221;.  My approach was to model each response individually using logistic regression.</p>
<pre># Treat blank answers as missing, then fix the order of the answer levels
casino$Q6[casino$Q6 == ""] = NA
casino$Q6 = factor(casino$Q6, levels=c("Adjacent Municipality","City of Toronto","Neither"))

# One binary logistic regression per answer option: that option vs. the other two
adj.mun = glm(Q6 == "Adjacent Municipality" ~ GoBigorGoHome + TechnicalDetails + Soc.Env.Issues, data=casino, family=binomial(logit))
toronto = glm(Q6 == "City of Toronto" ~ GoBigorGoHome + TechnicalDetails + Soc.Env.Issues, data=casino, family=binomial(logit))
neither = glm(Q6 == "Neither" ~ GoBigorGoHome + TechnicalDetails + Soc.Env.Issues, data=casino, family=binomial(logit))</pre>
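<p>The same one-model-per-answer idea can be written as a loop, which also makes it easy to line the coefficients up side by side as odds ratios. This is only a sketch on simulated data: the <code>toy</code> data frame, its two score columns and the way the answers are generated are all invented here and say nothing about the real survey.</p>
<pre>set.seed(7)
n = 600
toy = data.frame(GoBig = rnorm(n), SocEnv = rnorm(n))   # made-up component scores
# Made-up answers: higher SocEnv and lower GoBig make "Neither" more likely
p.neither = plogis(-0.2 + 1.2 * toy$SocEnv - 0.8 * toy$GoBig)
other = sample(c("Adjacent Municipality", "City of Toronto"), n, replace=TRUE)
toy$Q6 = ifelse(runif(n) > p.neither, other, "Neither")

# Fit one "this answer vs. the other two" logistic regression per answer option
answers = c("Adjacent Municipality", "City of Toronto", "Neither")
fits = lapply(answers, function(a) {
  glm(Q6 == a ~ GoBig + SocEnv, data=toy, family=binomial(logit))
})
names(fits) = answers

# Odds ratios: values above 1 mean a higher score raises the odds of that answer
round(sapply(fits, function(m) exp(coef(m))), 2)</pre>
<p>Reading across a row of the resulting matrix shows how the same component pushes respondents towards one answer and away from another.</p>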
<p>Following are the summaries of each GLM:<br />
Toronto:</p>
<p><script src="https://gist.github.com/inkhorn/5601179.js"></script></p>
<p>Adjacent municipality:</p>
<p><script src="https://gist.github.com/inkhorn/5601192.js"></script></p>
<p>Neither location:</p>
<p><script src="https://gist.github.com/inkhorn/5601202.js"></script></p>
<p>And here is a quick summary of the above GLM information:<br />
<a href="http://rforwork.files.wordpress.com/2013/05/summary-of-casino-glms1.png" ref="nofollow" target="_blank"><img class="alignleft size-full wp-image-270" alt="Summary of Casino GLMs" src="http://rforwork.files.wordpress.com/2013/05/summary-of-casino-glms1.png?w=584&#038;h=1168" width="584" height="1168" /></a></p>
<p>Judging from these results, it looks like those who want a casino in Toronto don&#8217;t focus on the big social/environmental issues surrounding the casino, but do focus on the flashy and non-flashy details and benefits alike.  Those who want a casino outside of Toronto do care about the social/environmental issues, don&#8217;t care as much about the flashy details, but do have a focus on some of the non-flashy details.  Finally, those not wanting a casino in either location care about the social/environmental issues, but don&#8217;t care about any of the details.</p>
<p>Here&#8217;s where the issue of location comes into play.  When I look at the summary for the GLM that predicts who wants a casino in an adjacent municipality, I get the feeling that it&#8217;s picking up people living in the downtown core who just don&#8217;t think the area can handle a casino.  In other words, I think there might be a &#8220;not in my backyard!&#8221; effect.</p>
<p>The first inkling that this might be the case comes from an article from the Martin Prosperity Institute (MPI), which analyzed the same data set and produced <a href="http://martinprosperity.org/wp-content/uploads/2013/04/Casino-Support_Ex01_500px.jpg" ref="nofollow" target="_blank">a very nice looking heat map</a> of the responses to the first question on the survey, asking people how they feel about having a new casino in Toronto.  From this map, it does look like people in Downtown Toronto are feeling pretty negative about a new casino, whereas those in the far east and west of Toronto are feeling better about it.</p>
<p>My next evidence comes from the cities uncovered by geocoding the responses in the data set.  I decided to create a very simple indicator variable, distinguishing those for whom the &#8220;City&#8221; is Toronto, and those for whom the city is anything else.  I like this better than the MPI analysis, because it looks at people&#8217;s attitudes towards a casino both inside and outside of Toronto (rather than towards the concept of a new casino in Toronto).  If there really is a &#8220;not in my backyard!&#8221; effect, I would expect to see evidence that those in Toronto are more disposed towards a casino in an adjacent municipality, and that those from outside of Toronto are more disposed towards a casino inside Toronto!  Here we go:</p>
<p><script src="https://gist.github.com/inkhorn/5601401.js"></script></p>
<p><a href="http://rforwork.files.wordpress.com/2013/05/where-located-by-city-of-residence.png" ref="nofollow" target="_blank"><img class="alignleft size-full wp-image-272" alt="Where located by city of residence" src="http://rforwork.files.wordpress.com/2013/05/where-located-by-city-of-residence.png?w=584&#038;h=362" width="584" height="362" /></a>As you can see here, those from the outside of Toronto are more likely to suggest building a casino in Toronto compared with those from the inside, and less likely to suggest building a casino in an adjacent municipality (with the reverse being true about those from the inside of Toronto).</p>
<p>That being said, when you do the comparison within city of residence (instead of across it like I just did), those from the inside of Toronto seem equally likely to suggest that the casino be built in or outside of the city, whereas those outside are much more likely to suggest building the casino inside Toronto than outside.  So, depending on how you view this graph, you might only say there&#8217;s evidence for a &#8220;not in my backyard!&#8221; effect for those living outside of Toronto.</p>
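<p>The two ways of reading the graph correspond to two ways of reading one table of row proportions. A minimal sketch, using a handful of made-up responses rather than the survey data (the <code>toy</code> data frame and its city names are purely illustrative):</p>
<pre># Hypothetical toy responses, NOT the survey data
toy = data.frame(
  City = c("Toronto", "Toronto", "Toronto", "Toronto",
           "Mississauga", "Markham", "Toronto", "Vaughan"),
  Q6   = c("Neither", "City of Toronto", "Adjacent Municipality", "Neither",
           "City of Toronto", "City of Toronto", "Adjacent Municipality", "Neither"))

# Simple indicator: is the respondent's city Toronto or anything else?
toy$Residence = ifelse(toy$City == "Toronto", "Inside Toronto", "Outside Toronto")

# Share of each answer within each residence group (every row sums to 1)
props = prop.table(table(toy$Residence, toy$Q6), margin=1)
round(props, 2)</pre>
<p>Comparing down a column contrasts insiders with outsiders on one answer (the &#8220;across&#8221; reading), while comparing along a row contrasts the answers given by one group (the &#8220;within&#8221; reading).</p>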
<p>As a final note, I&#8217;ll remind you that although these analyses point to which Torontonians do want a new casino, the fact from this survey remains that about 71% of respondents are unsupportive of a casino in Toronto, and 53% don&#8217;t want a casino built in either Toronto or an adjacent municipality.  I really have to wonder if they&#8217;re still going to go ahead with it!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/which-torontonians-want-a-casino-survey-analysis-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://0.gravatar.com/avatar/c29907bc7cc44244868554d0f14acd7b?s=96&amp;d=identicon&amp;r=G" length="" type="" />
<enclosure url="http://rforwork.files.wordpress.com/2013/05/summary-of-casino-glms1.png" length="" type="" />
<enclosure url="http://rforwork.files.wordpress.com/2013/05/where-located-by-city-of-residence.png" length="" type="" />
		<feedburner:origLink>http://www.r-bloggers.com/which-torontonians-want-a-casino-survey-analysis-part-2/</feedburner:origLink></item>
	</channel>
</rss>
