<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Statistical Modeling, Causal Inference, and Social Science</title>
	
	<link>http://andrewgelman.com</link>
	<description />
	<lastBuildDate>Wed, 22 Feb 2012 14:35:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/StatisticalModeling" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="statisticalmodeling" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">StatisticalModeling</feedburner:emailServiceId><feedburner:feedburnerHostname xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>I’m officially no longer a “rogue”</title>
		<link>http://andrewgelman.com/2012/02/im-officially-not-a-rogue/</link>
		<comments>http://andrewgelman.com/2012/02/im-officially-not-a-rogue/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 14:35:07 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sociology]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14227</guid>
		<description><![CDATA[In our Freakonomics: What Went Wrong article, Kaiser and I wrote: Levitt’s publishers characterize him as a &#8220;rogue economist,&#8221; yet he received his Ph.D. from MIT, holds the title of Alvin H. Baum Professor at the University of Chicago, and has served as editor of the completely mainstream Journal of Political Economy. Further &#8220;rogue&#8221; credentials [...]]]></description>
			<content:encoded><![CDATA[<p>In our Freakonomics:  What Went Wrong article, Kaiser and I wrote:</p>
<blockquote><p>Levitt’s publishers characterize him as a &#8220;rogue economist,&#8221; yet he received his Ph.D. from MIT, holds the title of Alvin H. Baum Professor at the University of Chicago, and has served as editor of the completely mainstream Journal of Political Economy. Further &#8220;rogue&#8221; credentials revealed by Levitt&#8217;s online C.V. include an undergraduate degree from Harvard, a research fellowship with the American Bar Foundation, membership in the Harvard Society of Fellows, a fellowship at the National Bureau of Economic Research, and a stint as a consultant for &#8220;Corporate Decisions, Inc.&#8221;</p></blockquote>
<p>That&#8217;s all well and good, but, on the other hand, I too have degrees from Harvard and MIT and I also taught at the University of Chicago.  But what really clinches it is that this month I gave a talk for an organization called the Corporate Executive Board.  No kidding.</p>
<p>In my defense, I&#8217;ve never actually called myself a &#8220;rogue.&#8221;  But still . . .</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/im-officially-not-a-rogue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>“Readability” as freedom from the actual sensation of reading</title>
		<link>http://andrewgelman.com/2012/02/readability-as-freedom-from-the-actual-sensation-of-reading/</link>
		<comments>http://andrewgelman.com/2012/02/readability-as-freedom-from-the-actual-sensation-of-reading/#comments</comments>
		<pubDate>Tue, 21 Feb 2012 16:48:44 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Literature]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14010</guid>
		<description><![CDATA[In her essay on Margaret Mitchell and Gone With the Wind, Claudia Roth Pierpoint writes: The much remarked &#8220;readability&#8221; of the book must have played a part in this smooth passage from the page to the screen, since &#8220;readability&#8221; has to do not only with freedom from obscurity but, paradoxically, with freedom from the actual [...]]]></description>
			<content:encoded><![CDATA[<p>In her essay on Margaret Mitchell and Gone With the Wind, Claudia Roth Pierpoint writes:</p>
<blockquote><p>The much remarked &#8220;readability&#8221; of the book must have played a part in this smooth passage from the page to the screen, since &#8220;readability&#8221; has to do not only with freedom from obscurity but, paradoxically, with <em>freedom from the actual sensation of reading</em> [emphasis added]&#8212;of the tug and traction of words as they move thoughts into place in the mind.  Requiring, in fact, the least reading, the most &#8220;readable&#8221; book allows its characters to slip easily through nets of words and into other forms.  Popular art has been well defined by just this effortless movement from medium to medium, which is carried out, as Leslie Fiedler observed in relation to Uncle Tom&#8217;s Cabin, &#8220;without loss of intensity or alteration of meaning.&#8221;  Isabel Archer rises from the page only in the hanging garments of Henry James&#8217;s prose, but Scarlett O&#8217;Hara is a free woman.</p></blockquote>
<p>Well put.  I wish Pierpoint would come out with another book.  But I think this sort of book is out of fashion nowadays.  There are zillions of uncollected book reviews and literary essays that I&#8217;d love to see in book form (the hypothetical collected reviews of Anthony West, Alfred Kazin, and many others) but it seems like it won&#8217;t ever happen.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/readability-as-freedom-from-the-actual-sensation-of-reading/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>How many data points do you really have?</title>
		<link>http://andrewgelman.com/2012/02/how-many-data-points-do-you-really-have/</link>
		<comments>http://andrewgelman.com/2012/02/how-many-data-points-do-you-really-have/#comments</comments>
		<pubDate>Tue, 21 Feb 2012 14:51:59 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Multilevel Modeling]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13213</guid>
		<description><![CDATA[Chris Harrison writes: I have just come across your paper in the 2009 American Scientist. Another problem that I frequently come across is when people do power spectral analyses of signals. If one has 1200 points (fairly modest in this day and age) then there are 600 power spectral estimates. People will then determine the [...]]]></description>
			<content:encoded><![CDATA[<p>Chris Harrison writes:<br />
<span id="more-13213"></span></p>
<blockquote><p>I have just come across your paper in the 2009 American Scientist. Another problem that I frequently come across is when people do power spectral analyses of signals. If one has 1200 points (fairly modest in this day and age) then there are 600 power spectral estimates. People will then determine the 95% confidence limits and pick out any spectral estimate that sticks up above this, claiming that it is significant. But there will be on average 30 estimates that stick up too high or too low. So in general there will be 15 spectral estimates which are higher than the 95% confidence limit which could happen just by chance. I suppose that this means that you have to set a much higher confidence limit, which would depend on the number of data in your signal.</p>
<p>I would also like your opinion about <a href="http://www.pnas.org/cgi/doi/10.1073/pnas.1104268108">a paper</a> in the Proceedings of the National Academy of Sciences, &#8220;The causality analysis of climate change and large-scale human crisis&#8221; by David D. Zhang, Harry F. Lee, Cong Wang, Baosheng Li, Qing Pei, Jane Zhang, and Yulun An.</p>
<p>These authors take whole series of annual data from 1500 to 1800, giving 301 data in all and do linear correlations between pairs of data sets. But some of the data sets only have data at longer intervals, such as 25 years. So the authors linearly interpolate the data to give an annual signal and then assume that they still have 301 data. Is this legitimate?</p></blockquote>
<p>My reply:</p>
<p>1.  For your spectral estimation problem, I think it would be best to fit some sort of hierarchical model for the 600 parameters.</p>
<p>2.  I didn&#8217;t actually read the paper, but from your description I&#8217;d think it might be a good idea for them to bootstrap their data to get standard errors.</p>
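<p>As a rough illustration of the numbers in the two questions above (a sketch added for clarity, not part of the original exchange): the counts come straight from the question, while the threshold calculation assumes Gaussian white noise and a raw periodogram, which is an assumption of this sketch and not something stated in the post. The Bonferroni adjustment is one simple way to raise the limit; it is not the hierarchical model suggested in the reply.</p>

```python
import math

# Numbers taken from the question above.
n_points = 1200                 # length of the signal
n_estimates = n_points // 2     # 600 power spectral estimates
alpha = 0.05                    # two-sided level behind the 95% confidence limits

# Expected number of estimates outside the limits by chance alone.
expected_outside = n_estimates * alpha      # about 30
expected_above = expected_outside / 2       # about 15 on the high side

# Bonferroni: divide the upper-tail probability by the number of estimates.
tail_single = alpha / 2
tail_bonferroni = tail_single / n_estimates

# ASSUMPTION (not in the post): Gaussian white noise and a raw periodogram,
# so each estimate divided by the true spectrum is roughly Exponential(1).
# The upper limit with tail probability p is then -log(p) times the mean.
limit_single = -math.log(tail_single)
limit_bonferroni = -math.log(tail_bonferroni)

# Second question: data every 25 years from 1500 to 1800, interpolated annually.
n_annual = len(range(1500, 1801))        # interpolated annual values
n_observed = len(range(1500, 1801, 25))  # values actually observed

print(f"expected outside limits: {expected_outside:.0f}, above: {expected_above:.0f}")
print(f"upper limit, one estimate: {limit_single:.2f} x mean")
print(f"upper limit, Bonferroni over {n_estimates}: {limit_bonferroni:.2f} x mean")
print(f"annual values: {n_annual}, actual observations: {n_observed}")
```

<p>Under these assumptions the adjusted limit sits well above the single-estimate one, and the interpolated series carries far fewer independent observations than its length suggests.</p>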
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/how-many-data-points-do-you-really-have/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Joshua Clover update</title>
		<link>http://andrewgelman.com/2012/02/joshua-clover-update/</link>
		<comments>http://andrewgelman.com/2012/02/joshua-clover-update/#comments</comments>
		<pubDate>Mon, 20 Feb 2012 14:33:46 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14579</guid>
		<description><![CDATA[Surfing the blogroll, I found myself on Helen DeWitt&#8217;s page and noticed the link to Joshua Clover, alias Jane Dark. I hadn&#8217;t checked out Clover for a while (see my reactions here and here), so I decided to head on over. Here&#8217;s what it looked like: &#8220;The case against the Federal minimum wage,&#8221; huh? That [...]]]></description>
			<content:encoded><![CDATA[<p>Surfing the <a href="http://andrewgelman.com/blogroll/">blogroll</a>, I found myself on Helen DeWitt&#8217;s <a href="http://paperpools.blogspot.com/">page</a> and noticed the link to Joshua Clover, alias Jane Dark.  I hadn&#8217;t checked out Clover for a while (see my reactions <a href="http://andrewgelman.com/2009/11/1989/">here</a> and <a href="http://andrewgelman.com/2008/04/television_dura/">here</a>), so I decided to head on over.</p>
<p>Here&#8217;s what it looked like:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-19-at-9.32.16-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-19-at-9.32.16-PM.png" alt="" title="Screen shot 2012-02-19 at 9.32.16 PM" width="606" height="487" class="alignnone size-full wp-image-14583" /></a></p>
<p>&#8220;The case against the Federal minimum wage,&#8221; huh?  That surprised me, as I had the vague impression that Clover was on the far left of the American political spectrum.  But I guess he could have some sort of wonky thing going on, or maybe there&#8217;s some unexpected twist?  It seemed a bit off of Clover&#8217;s usual cultural-criticism beat, so I clicked through to take a look . . . and it was just a boring set of paragraphs on the minimum wage.</p>
<p>Hmmmm.  I went back to the homepage, looked around more carefully, and realized that the blog is fake, the online equivalent of those fake book spines that are used to simulate rows of books on a bookshelf.</p>
<p>I don&#8217;t know what happened.  My guess is that Clover got tired of blogging and let the domain name lapse, and then some loser entrepreneur noticed it was still getting some hits (from DeWitt&#8217;s blog?) so they put up a fake blog.</p>
<p>I can only assume it was all done automatically?  Somebody has a webcrawler that looks for dead sites with links, then buys them up for something close to $0 and fills &#8217;em with crap?  Yuck.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/joshua-clover-update/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Factual – a new place to find data</title>
		<link>http://andrewgelman.com/2012/02/factual-a-new-place-to-find-data/</link>
		<comments>http://andrewgelman.com/2012/02/factual-a-new-place-to-find-data/#comments</comments>
		<pubDate>Mon, 20 Feb 2012 02:44:54 +0000</pubDate>
		<dc:creator>Aleks Jakulin</dc:creator>
				<category><![CDATA[Statistical computing]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14582</guid>
		<description><![CDATA[Factual collects data on a variety of topics, organizes them, and allows easy access. If you ever wanted to do a histogram of calorie content in Starbucks coffees or plot warnings with a live feed of earthquake data &#8211; your life should be a bit simpler now. Also see DataMarket, InfoChimps, and a few older [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.factual.com/">Factual</a> collects data on a variety of topics, organizes them, and allows easy access. If you ever wanted to do a histogram of calorie content in <a href="http://www.factual.com/t/hrHQV1/Starbucks_Nutrition_Info_for_Grande_Size_with_2_Milk_loaded_Sep_2008">Starbucks coffees</a> or plot warnings with a <a href="http://www.factual.com/t/wFiPkG/Earthquake_Data_Live_Feed">live feed of earthquake data</a> &#8211; your life should be a bit simpler now.</p>
<p>Also see <a href="http://andrewgelman.com/2010/08/datamarket/">DataMarket</a>, <a href="http://andrewgelman.com/2010/03/infochimps_find/">InfoChimps</a>, and a few older links in <a href="http://andrewgelman.com/2008/11/the_future_of_bayes/">The Future of Data Analysis</a>.</p>
<p>If you access the data through the API, you can build live visualizations like this:<br />
<a href="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-19-at-9.41.46-PM.png"><img class="aligncenter size-full wp-image-14588" src="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-19-at-9.41.46-PM.png" alt="" width="483" height="419" /></a></p>
<p>Of course, you could just go to the source. <a href="http://swfsc.noaa.gov/staff.aspx?id=669">Roy Mendelssohn</a> writes (with minor edits):</p>
<blockquote><p>Since you are both interested in data access, please look at our service ERDDAP:</p>
<p><a href="http://coastwatch.pfel.noaa.gov/erddap/index.html">http://coastwatch.pfel.noaa.gov/erddap/index.html</a></p>
<p><a href="http://upwell.pfeg.noaa.gov/erddap/index.html">http://upwell.pfeg.noaa.gov/erddap/index.html</a></p>
<p>Please do not be fooled by the web pages. Everything is a service (including search and graphics) and the URL completely defines the request, and response formats are easily changed just by changing the &#8220;file extension&#8221;. The web pages are just html and javascript that use the services. For example, put this URL in your browser:</p>
<p><a href="http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.png?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]">http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.png?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]</a></p>
<p>Now if you use R:</p>
<p><code><br />
# ncdf4 reads the NetCDF file returned by ERDDAP<br />
library(ncdf4)<br />
library(lattice)<br />
# mode="wb" keeps the binary .nc file intact on Windows<br />
download.file(url="http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.nc?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]", destfile="AGssta.nc", mode="wb")<br />
AGsstaFile&lt;-nc_open(&#039;AGssta.nc&#039;)<br />
# read the whole sst variable, then the longitude and latitude axes<br />
sst&lt;-ncvar_get(AGsstaFile,&#039;sst&#039;,start=c(1,1,1,1),count=c(-1,-1,-1,-1))<br />
lonval&lt;-ncvar_get(AGsstaFile,&#039;longitude&#039;,1,-1)<br />
latval&lt;-ncvar_get(AGsstaFile,&#039;latitude&#039;,1,-1)<br />
nc_close(AGsstaFile)<br />
image(lonval,latval,sst,col=rainbow(30))<br />
</code><br />
Or if you use Matlab:</p>
<p><code>link='http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.mat?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]';<br />
F=urlwrite(link,'cwatch.mat');<br />
load('-MAT',F);<br />
ssta=reshape(erdBAsstamday.sst,201,201);<br />
pcolor(double(ssta));shading flat;colorbar;<br />
</code><br />
The two services above allow access to literally petabytes of data, some observed, some from model output. I realize you guys don&#8217;t usually work in these fields, but this is part of a significant NOAA effort to make as much of its data available as possible. One more thing: if you use &#8220;last&#8221; as the time, you will always get the latest data. This allows people to set up <a href="http://www.pfeg.noaa.gov/~cwilson/bloom/BW_14.html">web pages that track the latest (algal bloom) conditions</a>, such as done by one of my colleagues.</p>
<p>BTW &#8211; for people who want a GUI to help with the extract from within the app, there is a product called the <a href="http://www.pfeg.noaa.gov/products/EDC/">Environmental Data Connector</a> that runs in ArcGIS, Matlab, R and Excel.</p></blockquote>
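<p>To make the URL structure concrete, here is a minimal Python sketch (not part of Roy&#8217;s message) that assembles the same request for several response formats. The helper name <code>griddap_url</code> and its arguments are hypothetical, and the dimension order (time, altitude, latitude, longitude) is my reading of the example URLs, not something the message states. The sketch only builds strings; it does not contact the server.</p>

```python
def griddap_url(base, dataset, extension, variable, constraints):
    """Build an ERDDAP griddap request URL (hypothetical helper).

    constraints is a list of (start, stride, stop) strings, one per
    dimension, in the same order as in the URLs quoted in the post.
    """
    query = "".join(
        f"[({start}):{stride}:({stop})]" for start, stride, stop in constraints
    )
    return f"{base}/griddap/{dataset}.{extension}?{variable}{query}"


BASE = "http://coastwatch.pfeg.noaa.gov/erddap"
DATASET = "erdBAsstamday"

# Same subset as in the post; the dimension order is an assumption.
BOX = [
    ("2010-01-16T12:00:00Z", "1", "2010-01-16T12:00:00Z"),  # time
    ("0.0", "1", "0.0"),                                    # altitude
    ("30", "1", "50.0"),                                    # latitude
    ("220", "1", "240.0"),                                  # longitude
]

# Changing only the file extension changes the response format.
png_url = griddap_url(BASE, DATASET, "png", "sst", BOX)
nc_url = griddap_url(BASE, DATASET, "nc", "sst", BOX)
mat_url = griddap_url(BASE, DATASET, "mat", "sst", BOX)

# Using "last" as the time asks for the most recent data.
latest_url = griddap_url(BASE, DATASET, "png", "sst", [("last", "1", "last")] + BOX[1:])

for url in (png_url, nc_url, mat_url, latest_url):
    print(url)
```

<p>The first three URLs match the ones in the browser, R, and Matlab examples above, and they differ only in the extension.</p>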
<p>Roy&#8217;s links inspired me to write another blog post, which is forthcoming.</p>
<p>This post is by <a href="http://www.stat.columbia.edu/~jakulin/">Aleks Jakulin</a>, follow him at <a href="http://twitter.com/#!/aleksj">@aleksj</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/factual-a-new-place-to-find-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Standardized writing styles and standardized graphing styles</title>
		<link>http://andrewgelman.com/2012/02/at-some-point-the-graph-is-so-bad-that-it-doesnt-convey-the-information/</link>
		<comments>http://andrewgelman.com/2012/02/at-some-point-the-graph-is-so-bad-that-it-doesnt-convey-the-information/#comments</comments>
		<pubDate>Sun, 19 Feb 2012 14:23:13 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Literature]]></category>
		<category><![CDATA[Statistical graphics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14323</guid>
		<description><![CDATA[Back in the 1700s&#8212;JennyD can correct me if I&#8217;m wrong here&#8212;there was no standard style for writing. You could be discursive, you could be descriptive, flowery, or terse. Direct or indirect, serious or funny. You could construct a novel out of letters or write a philosophical treatise in the form of a novel. Nowadays there [...]]]></description>
			<content:encoded><![CDATA[<p>Back in the 1700s&#8212;JennyD can correct me if I&#8217;m wrong here&#8212;there was no standard style for writing.  You could be discursive, you could be descriptive, flowery, or terse.  Direct or indirect, serious or funny.  You could construct a novel out of letters or write a philosophical treatise in the form of a novel.</p>
<p>Nowadays there are rules.  You can break the rules, but then you&#8217;re Breaking. The. Rules.  Which is a distinctive choice all its own.</p>
<p>Consider academic writing.  Serious works of economics or statistics tend to be written in a serious style in some version of plain academic English.  The few exceptions (for example, by Tukey, Tufte, Mandelbrot, and Jaynes) are clearly exceptions, written in styles that are much celebrated but not so commonly followed.</p>
<p>A serious work of statistics, or economics, or political science <em>could</em> be written in a highly unconventional form (consider, for example, Wallace Shawn&#8217;s plays), but academic writers in these fields tend to stick with the standard forms.  The consensus seems to be that straight prose is the clearest way to convey interesting and important ideas.  Serious popular writers such as Oliver Sacks and Malcolm Gladwell follow a slightly different formula, going with the magazine-writing tradition of placing ideas inside human stories.  But they still, by and large, are trying to write clear prose.</p>
<p>When it comes to data graphics, though, we&#8217;re back in the freewheeling 1700s.  Maybe that&#8217;s a good thing, I don&#8217;t know.  But what I do know is there&#8217;s no standard way of displaying quantitative information, nor is there any acceptance of the unique virtues of the graphical equivalent of clear prose.</p>
<p>Serious works of social science nowadays use all sorts of data display, from showing no data at all, to tables, to un-designed Excel-style bar charts, to Cleveland-style dot and line plots, to creative new data displays, to ornamental information visualizations.  The analogy in writing style would be if some journal articles were written in the pattern of Ezra Pound, others like Ernest Hemingway, and others in the style of James Joyce or William Faulkner.</p>
<p>I won&#8217;t try to make the case that everybody should do graphs the way I do.  I accept that some people communicate with tables, others prefer infovis, and others prefer no quantitative information at all.  I just think it&#8217;s interesting that prose style is so standardized&#8212;I&#8217;ve had submissions to journals criticized on the grounds that my writing is too lively!&#8212;but when it comes to display of data and models, it&#8217;s the Wild West.</p>
<p><strong>For example . . .</strong></p>
<p>Kaiser <a href="http://junkcharts.typepad.com/junk_charts/2012/01/little-orange-circles-spell-trouble.html">points</a> to this graph from the book Poor Economics by Abhijit Banerjee and Esther Duflo:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/poor1.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/poor1-300x180.png" alt="" title="poor1" width="300" height="180" class="alignnone size-medium wp-image-14324" /></a></p>
<p>In case you&#8217;re curious what&#8217;s actually going on here, Kaiser helpfully replots the data in a readable form:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/poor2.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/poor2-300x248.png" alt="" title="poor2" width="300" height="248" class="alignnone size-medium wp-image-14325" /></a></p>
<p>I&#8217;d be interested in what my infovis friends would say about this.  The best argument I can think of in favor of the Banerjee and Duflo graph, besides its novelty and (perhaps) attractiveness, is that its very difficulty forces the reader to work, to put in so much effort to figure out what&#8217;s going on that he or she is then committed to learning more.  In contrast, one might argue that Kaiser&#8217;s direct plot is so clear that the reader can feel free to stop right there.  I don&#8217;t really believe this argument&#8212;I&#8217;d rather have the clear graph and convey more information&#8212;but that&#8217;s the best I can do.</p>
<p>That said, if a book has dozens of informative Kaiser-style graphs, I can see the benefit of having a few goofy ones just to mix things up a bit.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/at-some-point-the-graph-is-so-bad-that-it-doesnt-convey-the-information/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Not as ugly as you look</title>
		<link>http://andrewgelman.com/2012/02/not-as-ugly-as-you-look/</link>
		<comments>http://andrewgelman.com/2012/02/not-as-ugly-as-you-look/#comments</comments>
		<pubDate>Sat, 18 Feb 2012 14:30:43 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sociology]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14320</guid>
		<description><![CDATA[Kaiser asks the interesting question: How do you measure what restaurants are &#8220;overrated&#8221;? You can&#8217;t just ask people, right? There&#8217;s some sort of social element here, that &#8220;overrated&#8221; implies that someone&#8217;s out there doing the rating.]]></description>
			<content:encoded><![CDATA[<p>Kaiser <a href="http://junkcharts.typepad.com/numbersruleyourworld/2012/01/does-being-underrated-have-statistical-meaning-i-dont-have-an-answer.html">asks</a> the interesting question:  How do you measure what restaurants are &#8220;overrated&#8221;?  You can&#8217;t just ask people, right?  There&#8217;s some sort of social element here, that &#8220;overrated&#8221; implies that someone&#8217;s out there doing the rating.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/not-as-ugly-as-you-look/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Rare name analysis and wealth convergence</title>
		<link>http://andrewgelman.com/2012/02/generational-mobility-and-wealth-convergence/</link>
		<comments>http://andrewgelman.com/2012/02/generational-mobility-and-wealth-convergence/#comments</comments>
		<pubDate>Sat, 18 Feb 2012 01:32:36 +0000</pubDate>
		<dc:creator>Aleks Jakulin</dc:creator>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14540</guid>
		<description><![CDATA[Steve Hsu summarizes the research of economic historian Greg Clark and Neil Cummins: Using rare surnames we track the socio-economic status of descendants of a sample of English rich and poor in 1800, until 2011. We measure social status through wealth, education, occupation, and age at death. Our method allows unbiased estimates of mobility rates. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://infoproc.blogspot.com/">Steve Hsu</a> <a href="http://infoproc.blogspot.com/2012/02/greg-clark-are-there-ruling-classes.html?utm_source=twitterfeed&amp;utm_medium=twitter">summarizes</a> the <a href="http://www.econ.yale.edu/seminars/echist/eh11/clark-111024.pdf">research</a> of <a title="Economic history" href="http://en.wikipedia.org/wiki/Economic_history" rel="wikipedia">economic historian</a> <a href="http://www.econ.ucdavis.edu/faculty/gclark/">Greg Clark</a> and <a href="http://people.qc.cuny.edu/Faculty/Neil.Cummins/Pages/Default.aspx">Neil Cummins</a>:</p>
<blockquote><p>Using rare surnames we track the <a class="zem_slink" title="Socioeconomic status" href="http://en.wikipedia.org/wiki/Socioeconomic_status" rel="wikipedia">socio-economic status</a> of descendants of a sample of English rich and poor in 1800, until 2011. We measure <a class="zem_slink" title="Social status" href="http://en.wikipedia.org/wiki/Social_status" rel="wikipedia">social status</a> through wealth, education, occupation, and age at death. Our method allows unbiased estimates of mobility rates. Paradoxically, we find two things. Mobility rates are lower than conventionally estimated. There is considerable persistence of status, even after 200 years. But there is convergence with each generation. The 1800 underclass has already attained mediocrity. And the 1800 <a class="zem_slink" title="Upper class" href="http://en.wikipedia.org/wiki/Upper_class" rel="wikipedia">upper class</a> will eventually dissolve into the mass of society, though perhaps not for another 300 years, or longer.</p>
<p><a href="http://3.bp.blogspot.com/-HVZla_gocIw/Tz2wPtHtC4I/AAAAAAAAByA/qVrSzNfVje0/s1600/Screen%2BShot%2B2012-02-16%2Bat%2B5.40.41%2BPM.png"><img src="http://3.bp.blogspot.com/-HVZla_gocIw/Tz2wPtHtC4I/AAAAAAAAByA/qVrSzNfVje0/s400/Screen%2BShot%2B2012-02-16%2Bat%2B5.40.41%2BPM.png" alt="" border="0" /></a></p></blockquote>
<p><a href="http://infoproc.blogspot.com/2012/02/greg-clark-are-there-ruling-classes.html?utm_source=twitterfeed&amp;utm_medium=twitter">Read more</a> at Steve&#8217;s blog. The idea of using rare names to perform this analysis is interesting &#8211; and has been recently applied to the study of <a href="http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0021160">nepotism in Italy</a>.</p>
<p>I haven&#8217;t looked into the details of the methodology, but rare events have their own distributional characteristics, and could benefit from Bayesian modeling in sparse data conditions. Moreover, there seems to be an underlying assumption that rare names are somehow uniformly represented in the population. They might not be. A hypothetical situation: in feudal days, rare names were good at predicting who&#8217;s rich and who&#8217;s not &#8211; wealth was passed through family by name. But then industrialization perturbed the old feudal order stratified by name into one that&#8217;s stratified by skill and no longer identifiable by name.</p>
<p>Let&#8217;s scrutinize this new methodology! With power comes responsibility.</p>
<p>This post is by <a href="http://www.stat.columbia.edu/~jakulin/">Aleks Jakulin</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/generational-mobility-and-wealth-convergence/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Sports examples in class</title>
		<link>http://andrewgelman.com/2012/02/sports-examples-in-class/</link>
		<comments>http://andrewgelman.com/2012/02/sports-examples-in-class/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 14:08:12 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sports]]></category>
		<category><![CDATA[Teaching]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13918</guid>
		<description><![CDATA[Karl Broman writes: I [Karl] personally would avoid sports entirely, as I view the subject to be insufficiently serious. . . . Certainly lots of statisticians are interested in sports. . . . And I’m not completely uninterested in sports: I like to watch football, particularly Nebraska, Green Bay, and Baltimore, and to see Notre [...]]]></description>
			<content:encoded><![CDATA[<p>Karl Broman <a href="http://kbroman.wordpress.com/2011/11/07/sports-statistics/">writes</a>:</p>
<blockquote><p>I [Karl] personally would avoid sports entirely, as I view the subject to be insufficiently serious. . . . Certainly lots of statisticians are interested in sports. . . . And I’m not completely uninterested in sports: I like to watch football, particularly Nebraska, Green Bay, and Baltimore, and to see Notre Dame or any team from Florida or Texas lose.</p>
<p>But statistics about sports? Yawn.</p></blockquote>
<p>As a person who loves sports, statistics, and sports statistics, I have a few thoughts:</p>
<p>1.  Not everyone likes sports, and even fewer are interested in any particular sport.  It&#8217;s ok to use sports examples, but don&#8217;t delude yourself into thinking that everyone in the class cares about it.</p>
<p>2.  Don&#8217;t forget foreign students.  A lot of them don&#8217;t even know the rules of kickball, fer chrissake!</p>
<p>3.  Of the students who care about a sport, there will be a minority who <em>really</em> care.  We had some serious basketball fans in our class last year.</p>
<p>4.  I think the best solution is to cover examples in all sorts of topics, including but not limited to sports.  I&#8217;ve been trying to work in more examples from areas such as cooking, sewing, and shopping.</p>
<p>5.  In my experience, students looove education examples, stories about grades, studying, and so forth.  But maybe that&#8217;s just at the sorts of colleges where I&#8217;ve taught:  Columbia, Harvard, Berkeley, Chicago.  Perhaps students at less elite institutions are less interested in grades.</p>
<p>6.  Getting back to Karl&#8217;s point about sports being unimportant:  Yeah, I pretty much agree with him on that one.  Psychologists and economists who study sports will make the claim that the research has larger value, for example in studying decision making or in isolating some cognitive process (as in the justly-celebrated &#8220;hot-hand&#8221; study), but ultimately I think sports are valuable for their own sake.  Sports are a form of art; unlike medicine or education, they are not a topic that has much interest beyond itself.  That&#8217;s ok, though, as long as we&#8217;re honest about it, and as long as we also include examples that interest other students in the class.</p>
<p>7.  Whenever you teach an applied example well, you induce some subject-matter learning.  When I teach sex ratios of births, I give the probability of a girl birth as 0.485, not 0.5, and students learn a little bit of biology.  When I teach a sports example, students learn a bit about sports and psychology (for example, the hot hand).  The one thing I never never like to do is use complicated gambling examples.  I have no interest in teaching students the rules of craps or the probability of getting three of a kind in a poker hand.  There are lots of probability examples out there that have the same level of complexity but apply to real-world situations.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/sports-examples-in-class/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Believe the statistics, not your lying eyes</title>
		<link>http://andrewgelman.com/2012/02/believe-the-statistics-not-your-lying-eyes/</link>
		<comments>http://andrewgelman.com/2012/02/believe-the-statistics-not-your-lying-eyes/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 12:56:33 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Political Science]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14528</guid>
		<description><![CDATA[Here.]]></description>
			<content:encoded><![CDATA[<p><a href="http://themonkeycage.org/blog/2012/02/17/understanding-the-zombie-confusion-about-class-and-voting/">Here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/believe-the-statistics-not-your-lying-eyes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A previous discussion with Charles Murray about liberals, conservatives, and social class</title>
		<link>http://andrewgelman.com/2012/02/a-previous-discussion-with-charles-murray-about-liberals-conservatives-and-social-class/</link>
		<comments>http://andrewgelman.com/2012/02/a-previous-discussion-with-charles-murray-about-liberals-conservatives-and-social-class/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 03:25:44 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Political Science]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14524</guid>
		<description><![CDATA[From 2.5 years ago. Read all the comments; the discussion is helpful.]]></description>
			<content:encoded><![CDATA[<p><a href="http://andrewgelman.com/2009/08/the_divergence/">From 2.5 years ago</a>.  Read all the comments; the discussion is helpful.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/a-previous-discussion-with-charles-murray-about-liberals-conservatives-and-social-class/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>“False-positive psychology”</title>
		<link>http://andrewgelman.com/2012/02/false-positive-psychology/</link>
		<comments>http://andrewgelman.com/2012/02/false-positive-psychology/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 14:02:54 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Decision Theory]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Sociology]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14476</guid>
		<description><![CDATA[Everybody&#8217;s talkin bout this paper by Joseph Simmons, Leif Nelson and Uri Simonsohn, who write: Despite empirical psychologists&#8217; nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an [...]]]></description>
			<content:encoded><![CDATA[<p>Everybody&#8217;s <a href="http://hardsci.wordpress.com/2012/02/10/does-your-p-curve-weigh-as-much-as-a-duck/">talkin bout</a> this paper by Joseph Simmons, Leif Nelson and Uri Simonsohn, who <a href="http://people.psych.cornell.edu/~jec7/pcd%20pubs/simmonsetal11.pdf">write</a>:</p>
<blockquote><p>Despite empirical psychologists&#8217; nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We [Simmons, Nelson, and Simonsohn] present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process.</p></blockquote>
<p>Whatever you think about these recommendations, I strongly recommend you read the article.  I love its central example:</p>
<blockquote><p>To help illustrate the problem, we [Simmons et al.] conducted two experiments designed to demonstrate something false: that certain songs can change listeners&#8217; age. Everything reported here actually happened.</p></blockquote>
<p>They go on to present some impressive-looking statistical results, then they go behind the curtain to show the fairly innocuous manipulations they performed to attain statistical significance.</p>
<p>A key part of the story is that, although such manipulations could be performed by a cheater, they could also seem like reasonable steps to a sincere researcher who thinks there&#8217;s an effect and wants to analyze the data a bit to understand it further.</p>
<p>We&#8217;ve all known for a long time that a p-value of 0.05 doesn&#8217;t really mean 0.05.  Maybe it really means 0.1 or 0.2.  But, as this paper demonstrates, that p=.05 can often mean nothing at all.  This can be a big problem for studies in psychology and other fields where various data stories are vaguely consistent with theory.  We&#8217;ve all known about these problems but it&#8217;s only recently that we&#8217;ve been aware of how serious they are and how little we should trust a bunch of statistically significant results.</p>
<p>Sanjay Srivastava has some comments <a href="http://hardsci.wordpress.com/2012/01/02/an-editorial-board-discusses-fmri-analysis-and-false-positive-psychology/">here</a>.  My main comment on Simmons et al. is that I&#8217;m not so happy with the framing in terms of &#8220;false positives&#8221;; to me, the problem is not so much with null effects but with uncertainty and variation.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/false-positive-psychology/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>Charles Murray on the new upper class</title>
		<link>http://andrewgelman.com/2012/02/some-reactions-to-charles-murrays-thoughts-on-income-and-politics/</link>
		<comments>http://andrewgelman.com/2012/02/some-reactions-to-charles-murrays-thoughts-on-income-and-politics/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 02:00:13 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Political Science]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14443</guid>
		<description><![CDATA[The other day I posted some comments on the voting patterns of rich and poor in the context of Charles Murray&#8217;s recent book, &#8220;Coming Apart.&#8221; My graphs on income and voting are just fine, but I mischaracterized Murray&#8217;s statements. So I want to fix that right away. After that I have some thoughts on the [...]]]></description>
			<content:encoded><![CDATA[<p>The other day I posted <a href="http://andrewgelman.com/2012/02/charles-murray-does-a-tucker-carlson-provoking-me-to-unleash-the-usual-torrent-of-graphs/">some comments</a> on the voting patterns of rich and poor in the context of Charles Murray&#8217;s recent book, &#8220;Coming Apart.&#8221;  My graphs on income and voting are just fine, but I mischaracterized Murray&#8217;s statements.  So I want to fix that right away.  After that I have some thoughts on the book itself.</p>
<p>In brief:</p>
<p>1. I was unfair to call him a Tucker Carlson.</p>
<p>2. Murray talks a lot about upper-class liberals. That’s fine but I think his discussion would be improved by also considering upper-class conservatives, given that I see the big culture war occurring within the upper class.</p>
<p>3. Using the case of Joe Paterno as an example, I discuss why Murray’s “preach what you practice” advice could be difficult to carry out in practice.<br />
<span id="more-14443"></span><br />
<strong>Murray on the top 5%</strong></p>
<p>David Frum <a href="http://www.thedailybeast.com/articles/2012/02/07/charles-murray-book-review-part-3.html">quoted</a> Murray as writing that the top 5% &#8220;tends to be liberal&#8212;right? There’s no getting around it. Every way of answering this question produces a yes.&#8221;  In response, Frum and I both pointed out that, no, Americans in the top 5% of income are <em>less</em> likely to be liberal, compared to the average American, and are more likely to vote Republican.</p>
<p>Those numbers are correct, but it was unfair to present them as a contradiction of Murray, who when talking in his book about the top 5% is not talking about income.  Murray defines &#8220;the broad elite&#8221; as the &#8220;most successful 5 percent of the people working in the professions and managerial positions,&#8221; including top military officers, government officials, business executives, professionals, and the media, a set of occupations that includes, in Murray&#8217;s words, &#8220;23 percent of all employed persons ages 25 or older.&#8221;  He&#8217;s talking about the top 5% (in &#8220;success,&#8221; as broadly defined, which is related to but not quite the same as income) in these professions.</p>
<p>After his offhand remark about the upper class being liberal (more on that below), Murray takes pains to emphasize that this popular impression is exaggerated, writing, &#8220;the essence of the culture of the new upper class is remarkably consistent across the political spectrum.&#8221;  The concept of upper-class people being liberal is not central to Murray&#8217;s argument; if anything, his point is the opposite, to de-emphasize the liberal tilt of &#8220;famous academics, journalists, Hollywooders, etc.&#8221; and rather make the point that, whatever the political attitudes of the new upper class are, their attitudes and actions isolate them from mainstream America.</p>
<p>Getting back to Murray&#8217;s upper 5%:  as he defines them, I&#8217;d guess they are more conservative than the average American on economic issues and more liberal than the average American on social issues.  But I can&#8217;t really be sure.</p>
<p>Rather than defining the American upper class as including some job categories but not others, I&#8217;d prefer to include all the high-income groups and say that the American upper class is highly divided&#8212;that is, polarized.  Murray does address much of this in his comparison of different sorts of SuperZips (high-income zip codes), so maybe it&#8217;s just a matter of emphasis:  from my analysis of survey data (as in the graphs posted <a href="http://andrewgelman.com/2012/02/charles-murray-does-a-tucker-carlson-provoking-me-to-unleash-the-usual-torrent-of-graphs/">earlier</a>), I see the big culture war occurring <em>within</em> the upper class, whereas Murray focuses on differences in attitudes and lifestyles comparing rich to poor.</p>
<p>As I noted earlier, upper-income liberals, while a minority of upper-income Americans, are still an influential group and worth studying.  But alongside them is an even larger group of upper-income conservatives.</p>
<p>I think Murray and I are basically in agreement about the facts here.  If you take narrow enough slices and focus on the media, academia, and civilian government, you can find groups of elites with liberal attitudes on economic and social issues.  But I&#8217;m also interested in all those elites with conservative attitudes.  Statistically, they outnumber the liberal elites.  The conservative elites tend to live in different places than the liberal elites and they tend to have influence in different ways (consider, for example, decisions about where to build new highways, convention centers, etc., or pick your own examples), and those differences interest me.</p>
<p>In summary, it was unfair of me to lump Murray in with Tucker Carlson as a statistics-mangler.  I think that any focus on upper-class liberals would gain more context by contrasting them with the more numerous upper-class conservatives, but Murray&#8217;s real point has little to do with political attitudes, and if you remove his comments about the purported liberalism of elites, nothing is really taken away from his main arguments.</p>
<p><strong>&#8220;The New American Divide&#8221;</strong></p>
<p>Murray describes his book (see also this <a href='http://andrewgelman.com/wp-content/uploads/2012/02/murraywsj.pdf'>Wall Street Journal article</a>) as &#8220;about an evolution in American society&#8221; in the past half-century, &#8220;leading to the formation of classes that are different in kind and in their degree of separation from anything that the nation has ever known,&#8221; with a new upper class that now lives a life that is qualitatively different from the experiences of most Americans.</p>
<p>I see this argument having the following logical implications, in the context of Murray&#8217;s conservative political attitudes (i.e. that he favors low taxes and low public spending):</p>
<blockquote><p>As I read it, Murray&#8217;s argument plus his political opinions imply the following story:  Rich liberals lead personally admirable and economically productive lives, but they are tied to a false ideology of socialism and social permissiveness.  This left-wing ideology may have its appeal, but in the long term, or even the medium term, it does no favors for most poor and middle-income Americans, as it leads to economic stagnation (the natural result of money spent through the government&#8217;s political process rather than through the decisions of individuals and private businesses) and social disaster (all the problems that arise with families when individuals attempt to live their lives without restraint).</p>
<p>Murray writes about culturally and politically influential elites because they have the ability to influence American attitudes, both through their economic power and through their representation in the news and entertainment media.  Murray writes about <em>politically liberal</em> rich elites because he disagrees with their politics.  From Murray&#8217;s point of view, there&#8217;s no point in writing about rich conservatives (for example, that dude who&#8217;s funding Rick Santorum) because they are already doing what he wants, advocating for lower taxes, lower government spending, and more restrictions on the behavior of lower-income Americans.</p></blockquote>
<p>The above is not a quote; it&#8217;s just my attempt to draw out the implications of Murray&#8217;s thesis that the upper class should &#8220;preach what it practices&#8221; and recommend to ordinary Americans the attitude of long-term responsibility.</p>
<p>Just to be clear, let me emphasize that Murray&#8217;s book does <em>not</em> distinguish between a &#8220;good&#8221; elite that&#8217;s conservative and a &#8220;bad&#8221; elite that’s liberal.  He considers the new upper class as problematic as a class.  My point above is that, given his political views, it makes sense for Murray to be more concerned about the attitudes of the liberal elite, a concern Murray can have without implying any moral criticism on his part.</p>
<p>Again, Murray never writes anything like the bit I have above about economic stagnation; this is just my interpretation of the implications of his concerns in the context of his economic beliefs.</p>
<p>And let me also make clear that Murray does not consider the politics of the new upper class in making his case that it&#8217;s problematic.  Even if the American upper class were 100% conservative, Murray could still be concerned about their disconnect with the masses.  But I think the contrast between liberal and conservative views is relevant given Murray&#8217;s own attitudes.</p>
<p>One way to see this is to consider Murray&#8217;s political quiz, &#8220;How thick is your bubble,&#8221; where he challenges his upper-class readers to assess their points in common with ordinary Americans.  One of Murray&#8217;s questions is, &#8220;Have you ever participated in a parade not involving global warming, a war protest, or gay rights?&#8221;  The bit about gay rights is cute, but it also serves to separate out the liberals in the audience.  After all, lots of non-elites go to gay rights parades.  What if Murray had asked, &#8220;Have you ever participated in a parade not involving the pro-life or Tea Party movements?&#8221;  This might not be the best example; my point is that there are lots of ways to separate the elites from the non-elites.  Elites are more likely to know a business executive, more likely to buy a new SUV, more likely to fly business class, more likely to attend professional sporting events (those tickets are expensive!), less likely to rent rather than own their homes, less likely to ride public transportation, and so on.  Murray&#8217;s quiz is interesting but he chooses to separate elites from non-elites in a particular way that makes me think he&#8217;s sensitive to the attitudes of politically liberal elites in particular.</p>
<p><strong>Difficulties of the recommendation to &#8220;preach what you practice&#8221;</strong></p>
<p>Murray does not consider the case of Joe Paterno, but in many ways the Penn State football coach fits his story well.  Paterno was said to live an exemplary personal and professional life, combining traditional morality with football success&#8212;but, by his actions, he showed little concern about the morality of his players and coaches.  At a professional level, Paterno rose higher and higher, and in his personal life he was a responsible adult.  But he had an increasing disconnect with the real world, to the extent that horrible crimes were occurring nearby (in the physical and social senses) but he was completely insulated from the consequences for many years.  Paterno&#8217;s story is symbolic of upper-income America:  you can live an ordinary life in an ordinary house and still feel like a regular guy but still live in a bubble.</p>
<p>Paterno was a political conservative so he doesn&#8217;t quite match with Murray&#8217;s story, but he&#8217;s otherwise a good fit, a man who lived by a code of personal morality that he did not expect of others.</p>
<p>Joe Paterno is an extreme example, but I think his story is relevant, to explain the difficulty of the &#8220;preach what you practice&#8221; guideline.  My claim is that &#8220;preaching,&#8221; to make a difference, requires actions as well as words.  While Paterno did not espouse a nonjudgmental stance on rape, assault, etc., in his actions he expressed a hands-off policy.  I see no reason to think that Paterno believed these crimes committed by his coach and players were OK; he just didn&#8217;t seem to think it was his role to do anything about them.  I don&#8217;t place myself above Paterno in any moral sense&#8212;I certainly don&#8217;t monitor the after-hours activities of my own students and employees&#8212;I just see it as an example of the social distance that Murray writes about, that an authority figure such as Paterno can feel it&#8217;s acceptable to be isolated in this way.</p>
<p>Murray&#8217;s argument is a step forward in sophistication compared to some other discussions of the culture war.  Old-style conservatives such as Michael Barone have characterized upper-income liberals as being frivolous &#8220;trustfunders&#8221; who are in a position &#8220;not to have to work very hard&#8221; and &#8220;have done nothing to earn their money,&#8221; slackers who &#8220;revel in looking down on&#8221; the common people.</p>
<p>In contrast, Murray tones down the Snidely Whiplash rhetoric and describes upper-class liberals as people who are living admirable lives but who are giving irresponsible advice because of their deluded social theories.  His recommendation is, &#8220;When it comes to marriage and the work ethic, the new upper class must start preaching what it practices.&#8221;</p>
<p>The Paterno example illustrates the difficulty of this recommendation.  What he had to do was not simply to preach against rape and violence, but to act to stop them.  Paterno was acting like the new upper class and simply looking away, allowing crimes to happen under his umbrella of protection.  Unfortunately, this sort of behavior would seem to be characteristic of the <em>old</em> upper class as well, so I&#8217;m not sure how new this all is.</p>
<p>My point is that preaching values in a real way is not so easy; it requires hard work and direct involvement, not just talk.  I don&#8217;t think Murray would disagree with me here.  He writes that conscientious people should &#8220;voice their disapproval of those who defy these norms,&#8221; but it takes more than voicing disapproval.  The kind of disapproval that makes a difference takes work and is risky.  Joe Paterno could have reported the crimes of his coach and his students to the police, but at a possible cost to his reputation.  Or, to choose a more homely example, just try telling an acquaintance that he or she is not conscientiously raising his or her kids.  That won&#8217;t be a costless conversation to you!  Again, it might be a good idea, but it&#8217;s hard to think about Murray&#8217;s suggestions without considering their challenges.</p>
<p><strong>Upper-class liberals and upper-class conservatives</strong></p>
<p>Setting aside the difficulties of implementing his recommendations, I see two limitations of Murray&#8217;s thesis.  The first is a matter of selection.  Let&#8217;s divide Americans into upper and lower income categories.  (Murray just talks about whites, but I think the arguments apply to the general population; my guess is that after the reception of his Bell Curve book, Murray just thought it would be safest to leave race out of his discussions entirely.)  Murray is comparing rich liberals to poor everybodys, but he just as well could be looking at rich conservatives.  By focusing on the cultural contradictions of liberalism, Murray piques the attention of the liberal elite while lulling the conservative elite into a false sense of security.  But I think he&#8217;s telling only part of the story, as I emphasized in graphs such as this:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-07-at-9.27.39-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-07-at-9.27.39-PM.png" alt="" title="Screen shot 2012-02-07 at 9.27.39 PM" width="462" height="590" class="alignnone size-full wp-image-14415" /></a></p>
<p>My second problem with Murray&#8217;s argument is that it has a bit of a self-contradictory nature.  As David Frum has <a href="http://www.thedailybeast.com/articles/2012/02/07/charles-murray-book-review-part-4.html">noted</a>, Murray criticizes upper-income Americans for (a) shunning lower-income cigarette smokers, but also for (b) not shaming lower-income people for poor life choices.  But smoking is a poor life choice, no?</p>
<p>Elsewhere Murray states that upper-income Americans are more likely to go to church, and it seems that he would like these upper-class people to encourage churchgoing among the mass of Americans.  But at another place he says that the elites themselves should try going to church, just like the common people do.  So which is it:  is churchgoing an admirable habit, along the lines of marriage and hard work, that the elites should encourage others to do, or is churchgoing a bit of homespun Americana, like watching football on TV and eating at Applebee&#8217;s, that the top 5% should reconnect with?</p>
<p>The point of these examples is not that Murray is wrong, either in his descriptions or in his recommendations&#8212;much here depends on one&#8217;s economic views about taxation and government spending&#8212;but rather that his argument keeps going in two opposite directions at once.  On one side he argues that the upper class has good habits that they should transmit to ordinary Americans; on the other side he says that the upper class should become more like the rest of the country.  But I can&#8217;t see how you can have it both ways.  This connects to my earlier point that much could be gained by considering the diversity of attitudes among the upper class.</p>
<p><strong>Summary</strong></p>
<p>This whole discussion got started because Murray was writing something about social class and David Frum and I fired back with statistics about income.  But Murray is not writing about income; in fact, he explicitly states,</p>
<blockquote><p>The new-upper-class culture is not the product of great wealth. It is enabled by affluence&#8212;people with common tastes and preferences need enough money to be able to congregate&#8212;but it is not driven by affluence. It is driven by the distinctive tastes and preferences that emerge when large numbers of cognitively talented people are enabled to live together in their own communities. You can whack the top income centile back to where it was in the 1980s, and it will have no effect whatsoever on the new-upper-class culture that had already emerged by that time.</p></blockquote>
<p>I don&#8217;t know how true that is, but to be fair to Murray, he&#8217;s talking about cultural attitudes, not income.  Based on my own interests, I&#8217;d take this the next step and consider the divisions between liberals and conservatives <em>within</em> America&#8217;s elites.  My suggestions along those lines don&#8217;t contradict what Murray&#8217;s saying but rather represent additional things to think about.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/some-reactions-to-charles-murrays-thoughts-on-income-and-politics/feed/</wfw:commentRss>
		<slash:comments>40</slash:comments>
		</item>
		<item>
		<title>The tabloids strike again</title>
		<link>http://andrewgelman.com/2012/02/the-tabloids-strike-again/</link>
		<comments>http://andrewgelman.com/2012/02/the-tabloids-strike-again/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 18:25:31 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sociology]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14225</guid>
		<description><![CDATA[See comments #2,3,4 here. I guess that&#8217;s why Science and Nature are known as &#8220;the tabloids.&#8221; As the commenter writes, &#8220;you can&#8217;t have people look at too many images of maggot-infested wounds.&#8221;]]></description>
			<content:encoded><![CDATA[<p>See comments #2,3,4 <a href="http://andrewgelman.com/2008/09/mellow_liberals/">here</a>.  I guess that&#8217;s why <em>Science</em> and <em>Nature</em> are <a href="http://andrewgelman.com/2012/01/groundbreaking-or-definitive-journals-need-to-pick-one/#comment-71932">known as</a> &#8220;the tabloids.&#8221;  As the commenter writes, &#8220;you can&#8217;t have people look at too many images of maggot-infested wounds.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/the-tabloids-strike-again/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Extra babies on Valentine’s Day, fewer on Halloween?</title>
		<link>http://andrewgelman.com/2012/02/more-babies-on-valentines-day-and-less-babies-on-halloween/</link>
		<comments>http://andrewgelman.com/2012/02/more-babies-on-valentines-day-and-less-babies-on-halloween/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 14:05:01 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Decision Theory]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14504</guid>
		<description><![CDATA[Just in time for the holiday, X pointed me to an article by Becca Levy, Pil Chung, and Martin Slade reporting that, during a recent eleven-year period, more babies were born on Valentine&#8217;s Day and fewer on Halloween compared to neighboring days: What I&#8217;d really like to see is a graph with all 366 days [...]]]></description>
			<content:encoded><![CDATA[<p>Just in time for the holiday, X <a href="http://xianblog.wordpress.com/2012/02/14/more-babies-on-valentines-day-and-less-babies-on-halloween/">pointed</a> me to <a href='http://andrewgelman.com/wp-content/uploads/2012/02/halloween.pdf'>an article</a> by Becca Levy, Pil Chung, and Martin Slade reporting that, during a recent eleven-year period, more babies were born on Valentine&#8217;s Day and fewer on Halloween compared to neighboring days:<br />
<a href="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-14-at-8.06.17-AM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-14-at-8.06.17-AM.png" alt="" title="Screen shot 2012-02-14 at 8.06.17 AM" width="473" height="678" class="alignnone size-full wp-image-14505" /></a></p>
<p>What I&#8217;d really like to see is a graph with all 366 days of the year.  It would be easy enough to make.  That way we could put the Valentine&#8217;s and Halloween data in the context of other possible patterns.  While they&#8217;re at it, they could also graph births by day of the week and show Thanksgiving, Easter, and other holidays that don&#8217;t have fixed dates.  It&#8217;s so frustrating when people only show part of the story.</p>
<p>The data are publicly available, so maybe someone could make those graphs?  If the Valentine&#8217;s/Halloween data are worth publishing, I think more comprehensive graphs should be publishable as well.  I&#8217;d post them here, that&#8217;s for sure.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/more-babies-on-valentines-day-and-less-babies-on-halloween/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Recently in the sister blog</title>
		<link>http://andrewgelman.com/2012/02/recently-in-the-sister-blog/</link>
		<comments>http://andrewgelman.com/2012/02/recently-in-the-sister-blog/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 02:55:12 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Political Science]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14499</guid>
		<description><![CDATA[Lingsanity! What the sophisticates thought in September 2008 Political opinions of U.S. military The origin of essentialist reasoning]]></description>
			<content:encoded><![CDATA[<p><a href="http://themonkeycage.org/blog/2012/02/13/lingsanity/">Lingsanity!</a></p>
<p><a href="http://themonkeycage.org/blog/2012/02/13/what-the-sophisticates-thought-in-september-2008/">What the sophisticates thought in September 2008</a></p>
<p><a href="http://themonkeycage.org/blog/2012/02/10/more-on-political-opinions-of-u-s-military/">Political opinions of U.S. military</a></p>
<p><a href="http://www.cognitionandculture.net/the-study-of-cognition-and-culture-today/2359-the-origin-of-essentialist-reasoning">The origin of essentialist reasoning</a></p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/recently-in-the-sister-blog/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Help with this problem, win valuable prizes</title>
		<link>http://andrewgelman.com/2012/02/help-with-this-problem-win-valuable-prizes/</link>
		<comments>http://andrewgelman.com/2012/02/help-with-this-problem-win-valuable-prizes/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 00:05:26 +0000</pubDate>
		<dc:creator>Phil</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14488</guid>
		<description><![CDATA[This post is by Phil. In the comments to an earlier post, I mentioned a problem I am struggling with right now. Several people mentioned having (and solving!) similar problems in the past, so this seems like a great way for me and a bunch of other [...]]]></description>
			<content:encoded><![CDATA[<p><div id="attachment_14513" class="wp-caption alignleft" style="width: 458px"><a href="http://andrewgelman.com/wp-content/uploads/2012/02/EquationPair.png"><img class="size-full wp-image-14513" src="http://andrewgelman.com/wp-content/uploads/2012/02/EquationPair.png" alt="" width="448" height="138" /></a><p class="wp-caption-text">Corrected equation</p></div></p>
<p>This post is by Phil.</p>
<p>In the comments to <a href="http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/" target="_blank">an earlier post</a>, I mentioned a problem I am struggling with right now. Several people mentioned having (and solving!) similar problems in the past, so this seems like a great way for me and a bunch of other blog readers to learn something. I will describe the problem, one or more of you will tell me how to solve it, and you will win&#8230;wait for it&#8230;.my thanks, and the approval and admiration of your fellow blog readers, and a big thank-you in any publication that includes results from fitting the model.  You can&#8217;t ask fairer than that!</p>
<p>Here&#8217;s the problem.  The goal is to estimate six parameters that characterize the leakiness (or air-tightness) of a house with an attached garage.  We are specifically interested in the parameters that describe the connection between the house and the garage; this is of interest because of the effect on the air quality in the house  if there are toxic chemicals (gasoline, car exhaust, etc.) in the garage, but I won&#8217;t go into the motivation of the experiments, I&#8217;ll just describe them. (See below the fold for the rest)</p>
<p><span id="more-14488"></span></p>
<p>A researcher puts a &#8220;blower door&#8221; &#8212; basically just a big fan &#8212; in the front door of the house. The fan is ramped up in speed, or is sped up in stages, so as to gradually pressurize the house relative to the outdoors.  The flow rate through the fan is measured, so you know how much air is going into the house.  That amount has to leak out of the house, either to the outdoors or into the garage.  The researcher measures the pressure difference between the house and the outdoors, and between the garage and the outdoors; of course, from these s/he can determine the pressure difference between the house and the garage.</p>
<p>All of the air that flows in (Q_{ho}, and don&#8217;t try to tell me it should be Q_{oh} because I know) has to flow out.  The first equation in the graphic is the conservation equation if you consider drawing a boundary around just the house: the air that flows into the house is the amount that flows directly to the outdoors plus the amount that flows into the garage.  P_ho and P_hg are the pressure differences between the house and the outdoors and between the house and the garage, which are measured. n_ho and n_hg are &#8220;flow exponents,&#8221; and C_ho and C_hg are &#8220;flow parameters&#8221;; these are to be estimated.  The flow exponents are expected to be between about 0.5 and 0.7. If a pressure is negative, then P^n is to be interpreted as sign(P)*abs(P)^n.</p>
<p>The second equation shows what happens if you look at the flows through the entire house-garage boundary. Again, everything that flows in has to flow out, either from the house to the outdoors or from the garage to the outdoors. The flow from the garage to the outdoors introduces two additional parameters (C_go and n_go).</p>
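<p>For readers who cannot see the graphic, here is a minimal Python sketch of the two conservation equations as the text describes them: Q_ho = C_ho*P_ho^n_ho + C_hg*P_hg^n_hg for the house boundary, and Q_ho = C_ho*P_ho^n_ho + C_go*P_go^n_go for the outer boundary. The function names and the numbers in the usage lines are illustrative assumptions, not values from the experiments.</p>

```python
import math


def signed_power(p, n):
    """Return sign(p) * |p|**n, the convention stated for negative pressures."""
    return math.copysign(abs(p) ** n, p)


def flow_house_boundary(p_ho, p_hg, c_ho, n_ho, c_hg, n_hg):
    """First equation: blower-door flow = house-to-outdoors + house-to-garage."""
    return c_ho * signed_power(p_ho, n_ho) + c_hg * signed_power(p_hg, n_hg)


def flow_outer_boundary(p_ho, p_go, c_ho, n_ho, c_go, n_go):
    """Second equation: blower-door flow = house-to-outdoors + garage-to-outdoors."""
    return c_ho * signed_power(p_ho, n_ho) + c_go * signed_power(p_go, n_go)


# Illustrative (made-up) values: pressures in Pa, exponents in the expected 0.5-0.7 range.
p_ho, p_go = 50.0, 10.0
p_hg = p_ho - p_go  # house-garage difference follows from the other two
q_house = flow_house_boundary(p_ho, p_hg, c_ho=100.0, n_ho=0.65, c_hg=20.0, n_hg=0.6)
q_outer = flow_outer_boundary(p_ho, p_go, c_ho=100.0, n_ho=0.65, c_go=40.0, n_go=0.6)
```

<p>With perfect data and the right parameters the two functions would return the same flow; with made-up parameters like the ones above they generally will not, and that mismatch is what the fitting has to resolve.</p>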
<p>So, finally, the problem.  I have a bunch of measurements of pressures (P_ij) and flows through the blower door (Q_ho).  If the model and data were perfect, there would be a unique set of C_ij and n_ij values such that Q_ho could be predicted from P_ho, P_hg, and P_go with no error.  But of course the data are not perfect. One of the main problems is that the pressure measurements can be systematically wrong: for example, if the outdoor pressure measurement is made on the lee side of the building, the house-outdoor pressure difference will tend to be overestimated.  Doing something simple like minimizing the RMS difference between predictions and measurements turns out to lead to systematic bias (e.g. n_ho is overestimated), in a sort of nonlinear analog to <a href="http://en.wikipedia.org/wiki/Regression_dilution" target="_blank">regression dilution</a>. The obvious solution is to fit a statistical model that incorporates the error.  I think I could code this up myself, but I decided that this is a great opportunity to learn <a href="http://mcmc-jags.sourceforge.net/" target="_blank">JAGS</a>, which is similar to <a href="http://www.mrc-bsu.cam.ac.uk/bugs/" target="_blank">BUGS</a> (which I have used before).  But I ran into trouble, perhaps from not fully understanding the JAGS or BUGS coding rules.</p>
<p>Here&#8217;s what I want:</p>
<ol>
<li>The actual pressure difference between the house and outdoors P_{ho} is normally distributed about the measured pressure Pmeas_{ho}, with uncorrelated errors with standard deviation of 2 Pascals. Actually the mean error will not be zero as this implies, but for now let&#8217;s start this way. (In practice the blower door operator sets a desired pressure, and the blower door adjusts its flow automatically until the measured pressure matches the desired pressure).</li>
<li>The measured pressure difference between the garage and outdoors Pmeas_{go} is normally distributed about the actual pressure P_{go}, with uncorrelated errors with s.d. of 2 Pa.</li>
<li>The error in the house-garage pressure difference is the difference between the P_{ho} error and the P_{go} error.</li>
<li>The measured flow Qmeas_{ho} is normally distributed about the actual Q_{ho} with error 20 cubic feet per minute.</li>
<li>The actual value of Q_{ho} that is predicted from the right side of either of the equations in the graphic has normal error with standard deviation 20 cubic feet per minute.</li>
</ol>
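<p>As a rough illustration only, the five statements above can be read as terms in a joint log density over the unknown true values. The Python sketch below is not JAGS code and is not a tested solution to the problem; it assumes flat priors on everything, treats the true Q_{ho}[i], P_{ho}[i], and P_{go}[i] as latent quantities, and takes item 5 to mean that both conservation equations act as soft constraints on the same latent Q_{ho}[i]. All function and argument names are hypothetical.</p>

```python
import math


def signed_power(p, n):
    """sign(p) * |p|**n, the stated convention for negative pressures."""
    return math.copysign(abs(p) ** n, p)


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and s.d."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def log_density(params, p_ho, p_go, q_ho, pmeas_ho, pmeas_go, qmeas_ho,
                sd_p=2.0, sd_q=20.0):
    """Unnormalized log density of parameters and latent true values (flat priors assumed)."""
    c_ho, n_ho, c_hg, n_hg, c_go, n_go = params
    total = 0.0
    for i in range(len(qmeas_ho)):
        # Item 3: the house-garage difference is a deterministic function of the
        # other two, so its error is automatically the difference of their errors.
        p_hg = p_ho[i] - p_go[i]
        total += normal_logpdf(p_ho[i], pmeas_ho[i], sd_p)   # item 1
        total += normal_logpdf(pmeas_go[i], p_go[i], sd_p)   # item 2
        total += normal_logpdf(qmeas_ho[i], q_ho[i], sd_q)   # item 4
        # Item 5: the latent flow is tied to the right side of each equation.
        rhs_house = c_ho * signed_power(p_ho[i], n_ho) + c_hg * signed_power(p_hg, n_hg)
        rhs_outer = c_ho * signed_power(p_ho[i], n_ho) + c_go * signed_power(p_go[i], n_go)
        total += normal_logpdf(q_ho[i], rhs_house, sd_q)
        total += normal_logpdf(q_ho[i], rhs_outer, sd_q)
    return total
```

<p>Because Q_{ho}[i] is a latent quantity here rather than data, nothing observed appears on the left side of two equations; whether that is the right way to express the model in JAGS or BUGS is exactly the open question. Informative priors (for example, keeping the flow exponents near the expected 0.5 to 0.7 range) would be added as extra terms.</p>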
<p>I run into some problems when trying to implement this in JAGS. I am tempted to post my JAGS model here &#8212; it&#8217;s only 20 lines &#8212; but I&#8217;m afraid of contaminating the mind of you, the reader, with bad ideas. (The code does not run).  I&#8217;ll post it in the comments tomorrow if people want it.</p>
<p>But here&#8217;s what I&#8217;d love: could one of you JAGS or BUGS experts help me out here?  There are a few issues that I don&#8217;t know how to handle, such as the fact that the error in Pmeas_hg is equal to the difference between the error in Pmeas_ho and Pmeas_go, and the fact that the model seems to need two equations that have Qmeas_{ho} on the left side but I think that that is not allowed.</p>
<p>Inputs are sets of Pmeas_{ho}[i], Pmeas_{go}[i], Qmeas_{ho}[i]; desired outputs are estimates (with uncertainty) of all three of the C_{ij} values and all three of the n_{ij} values, and of course, while we&#8217;re at it, estimates of the actual values P_{ho}[i], P_{go}[i], P_{hg}[i].</p>
<p>C&#8217;mon, twenty lines of JAGS or BUGS code, how hard can it be?</p>
<div> Thanks for any help you can offer!</div>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/help-with-this-problem-win-valuable-prizes/feed/</wfw:commentRss>
		<slash:comments>46</slash:comments>
		</item>
		<item>
		<title>Philosophy of Bayesian statistics:  my reactions to Wasserman</title>
		<link>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-wasserman/</link>
		<comments>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-wasserman/#comments</comments>
		<pubDate>Mon, 13 Feb 2012 14:12:57 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14470</guid>
		<description><![CDATA[Continuing with my discussion of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics: Larry Wasserman, &#8220;Low Assumptions, High Dimensions&#8221;: This article was refreshing to me because it was so different from anything I&#8217;ve seen before. Larry works in a statistics department and I work [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing with <a href="http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-cox-and-mayo/">my discussion of the articles in the special issue</a> of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/larry.jpg"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/larry.jpg" alt="" title="larry" width="92" height="108" class="alignnone size-full wp-image-14471" /></a></p>
<p>Larry Wasserman, &#8220;Low Assumptions, High Dimensions&#8221;:</p>
<p>This article was refreshing to me because it was so different from anything I&#8217;ve seen before.  Larry works in a statistics department and I work in a statistics department but there&#8217;s so little overlap in what we do.  Larry and I both work in high dimensions (maybe his dimensions are higher than mine, but a few thousand dimensions seems like a lot to me!), but there the similarity ends.  His article is all about using few to no assumptions, while I use assumptions all the time.  Here&#8217;s an example.  Larry writes:</p>
<blockquote><p>P. Laurie Davies (and his co-workers) have written several interesting papers where probability models, at least in the sense that we usually use them, are eliminated. Data are treated as deterministic. One then looks for adequate models rather than true models. His basic idea is that a distribution P is an adequate approximation for x1,&#8230;,xn, if typical data sets of size n, generated under P look like x1,&#8230;,xn. In other words, he asks whether we can approximate the deterministic data with a stochastic model.</p></blockquote>
<p>This sounds cool.  And it&#8217;s so different from my world!  I do a lot of work with survey data, where the sample is intended to mimic the population, and a key step comes in the design, which is all about probability sampling.  I agree that Wasserman&#8217;s (or Davies&#8217;s) approach <em>could</em> be applied to surveys&#8212;the key step would be to replace random sampling with quota sampling, and maybe this would be a good idea&#8212;but in the world of surveys we would typically think of quota sampling or other nonprobabilistic approaches as an unfortunate compromise with reality rather than as a desirable goal.  In short, typical statisticians such as myself see probability modeling as a valuable tool that is central to applied statistics, while Wasserman appears to see probability as an example of an assumption to be avoided.</p>
<p>Just to be clear:  I&#8217;m not <em>at all</em> saying Wasserman is wrong in any way here; rather, I&#8217;m just marveling at how different his perspective is from mine.  I can&#8217;t immediately see how his assumption-free approach could possibly be used to estimate public opinion or votes cross-classified by demographics, income, and state.  But, then again, maybe my models wouldn&#8217;t work so well on the applications on which Wasserman works.  Bridges from both directions would probably be good.</p>
<p>With different methods and different problems come different philosophies.  My use of generative modeling motivates, and allows, me to check fit to data using predictive simulation.  Wasserman&#8217;s quite different approach motivates him to understand his methods using other tools.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-wasserman/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Meta-analysis, game theory, and incentives to do replicable research</title>
		<link>http://andrewgelman.com/2012/02/meta-analysis-game-theory-and-incentives-to-do-replicable-research/</link>
		<comments>http://andrewgelman.com/2012/02/meta-analysis-game-theory-and-incentives-to-do-replicable-research/#comments</comments>
		<pubDate>Sun, 12 Feb 2012 14:15:43 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Decision Theory]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Multilevel Modeling]]></category>
		<category><![CDATA[Public Health]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14145</guid>
		<description><![CDATA[One of the key insights of game theory is to solve problems in reverse time order. You first figure out what you would do in the endgame, then decide a middle-game strategy to get you where you want to be at the end, then you choose an opening that will take you on your desired [...]]]></description>
			<content:encoded><![CDATA[<p>One of the key insights of game theory is to solve problems in reverse time order.  You first figure out what you would do in the endgame, then decide a middle-game strategy to get you where you want to be at the end, then you choose an opening that will take you on your desired path.  All conditional on what the other players do in their turn.</p>
<p>In an <a href="http://www.jameslindlibrary.org/illustrating/records/meta-analysis-in-medical-research-strong-encouragement-for-high/whole_articles">article</a> from 1989, &#8220;Meta-analysis in medical research:  Strong encouragement for higher quality in individual research efforts,&#8221; Keith O&#8217;Rourke and Allan Detsky apply this principle to the process of publication of scientific research:<br />
<span id="more-14145"></span></p>
<blockquote><p>From the statistical point of view, there really is no escape from performing a de facto meta-analysis. One can either judge the effectiveness of a therapy based solely on the most recent study and ignore all previous studies, a method which is equivalent to giving the most recent study weight 1.0 and all previous studies weight 0, or try to choose the weights on some scientific basis . . . If important differences in study findings exist they must be identified and explained.</p>
<p>That most researchers realize the need for stating their results in the context of previous trials is evidenced by the literature review section in almost all scientific articles. Meta-analysis is a further development and refinement of this approach offering a more rigorous and coherent treatment of past research work. It is tempting to propose that no experimental results should be published without inclusion of an appropriate meta-analysis. In effect, one might suggest that a literature review section ought to be based on an explicitly described methodology in place of the usual ad hoc approach.</p></blockquote>
<p>So far, nothing exceptional.  But then O&#8217;Rourke and Detsky continue:</p>
<blockquote><p>What is it about meta-analysis that will actually help bring about improvement in individual research efforts? . . . the comprehensive, rigorous, and public peer review that a meta-analysis entails will encourage high quality participation by members of the research community in the resolution of the inadequacies. . . .</p>
<p>With a better understanding of meta-analysis in the context of the full scientific research process, meta-analysis is seen as a key element for improving individual research efforts and their reporting in the literature. This in turn will further enhance the role of meta-analysis in helping clinicians and policy makers answer clinical questions.</p></blockquote>
<p>The idea (if I&#8217;m reading O&#8217;Rourke and Detsky correctly) is that not only is meta-analysis appropriate for summarizing existing data, but the threat or promise of meta-analysis also provides an incentive for researchers to follow better practices in their new projects.  If you know (or think there&#8217;s a high probability) that your work will be processed through a rigorous meta-analysis, this motivates you to be careful, to supply replication materials (otherwise your study will get a low weight), etc.  That&#8217;s where the game theory comes in.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/meta-analysis-game-theory-and-incentives-to-do-replicable-research/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Adding an error model to a deterministic model</title>
		<link>http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/</link>
		<comments>http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/#comments</comments>
		<pubDate>Sat, 11 Feb 2012 14:01:45 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13809</guid>
		<description><![CDATA[Daniel Lakeland asks, &#8220;Where do likelihoods come from?&#8221; He describes a class of problems where you have a deterministic dynamic model that you want to fit to data. The data won&#8217;t fit perfectly so, if you want to do Bayesian inference, you need to introduce an error model. This looks a little bit different from [...]]]></description>
			<content:encoded><![CDATA[<p>Daniel Lakeland <a href="http://models.street-artists.org/?p=1287">asks</a>, &#8220;Where do likelihoods come from?&#8221;  He describes a class of problems where you have a deterministic dynamic model that you want to fit to data.  The data won&#8217;t fit perfectly so, if you want to do Bayesian inference, you need to introduce an error model.  This looks a little bit different from the usual way that models are presented in statistics textbooks, where the focus is typically on the random error process, not on the deterministic part of the model.  A focus on the error process makes sense in some applications that have inherent randomness or variation (for example, genetics, psychology, and survey sampling) but not so much in the physical sciences, where the deterministic model can be complicated and is typically the essence of the study.  Often in these sorts of studies, the starting point (and sometimes the ending point) is what the physicists call &#8220;nonlinear least squares&#8221; or what we would call normally-distributed errors.  That&#8217;s what we did for our <a href="http://www.stat.columbia.edu/~gelman/research/published/bois2.pdf">toxicology</a> and <a href="http://www.stat.columbia.edu/~gelman/research/published/serial.pdf">dilution-assay</a> models.  Sometimes it makes sense to have the error variance scale as a power of the magnitude of the measurement.  The error terms in these models typically include model error as well as measurement variation.  In other settings you might put errors in different places in the model, corresponding to different sources of variation and model error.  For discrete data, Iven Van Mechelen and I <a href="http://www.stat.columbia.edu/~gelman/research/published/determ20.pdf">suggested</a> a generic approach for adding error to a deterministic model, but I don&#8217;t think this really would work with Lakeland&#8217;s examples.</p>
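<p>To make the idea of error variance scaling as a power of the magnitude concrete, here is a minimal Python sketch of a normal log-likelihood in which the standard deviation of each error is sigma times the absolute predicted value raised to a power alpha. Using the model prediction as the &#8220;magnitude&#8221; is an assumption made for illustration, as are the function and argument names; this is not the code used in the toxicology or dilution-assay papers.</p>

```python
import math


def power_variance_loglik(y, mu, sigma, alpha):
    """Normal log-likelihood with sd_i = sigma * |mu_i|**alpha.

    y: measurements; mu: deterministic model predictions (assumed nonzero).
    alpha = 0 gives constant variance, i.e. ordinary nonlinear least squares.
    """
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        sd_i = sigma * abs(mu_i) ** alpha
        total += -0.5 * math.log(2.0 * math.pi * sd_i ** 2)
        total += -0.5 * ((y_i - mu_i) / sd_i) ** 2
    return total
```

<p>With alpha fixed at 0, maximizing this over the parameters of the deterministic model is the same as minimizing the sum of squared residuals; letting alpha vary allows larger predicted values to carry proportionally larger errors.</p>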
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>If an entire article in Computational Statistics and Data Analysis were put together from other, unacknowledged, sources, would that be a work of art?</title>
		<link>http://andrewgelman.com/2012/02/if-an-entire-article-in-computational-statistics-and-data-analysis-were-put-together-from-other-unacknowledged-sources-would-that-be-a-work-of-art/</link>
		<comments>http://andrewgelman.com/2012/02/if-an-entire-article-in-computational-statistics-and-data-analysis-were-put-together-from-other-unacknowledged-sources-would-that-be-a-work-of-art/#comments</comments>
		<pubDate>Fri, 10 Feb 2012 14:50:21 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Literature]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14074</guid>
		<description><![CDATA[Spy novelist Jeremy Duns tells the amazing story of Quentin Rowan, a young writer who based an entire career on patching together stories based on uncredited material from published authors, culminating in a patchwork job that Duns had blurbed as an &#8220;instant classic.&#8221; Rowan did not merely plagiarize to fill in some gaps or cover [...]]]></description>
			<content:encoded><![CDATA[<p>Spy novelist Jeremy Duns <a href="http://jeremyduns.blogspot.com/2011/11/highway-robbery-mask-of-knowing-in.html">tells</a> the amazing story of Quentin Rowan, a young writer who based an entire career on patching together stories based on uncredited material from published authors, culminating in a patchwork job that Duns had blurbed as an &#8220;instant classic.&#8221;</p>
<p>Rowan did not merely plagiarize to fill in some gaps or cover some technical material that he was too lazy to rewrite; rather, he put together an entire novel out of others&#8217; material. Rowan writes (as part of a longer passage that itself appears to be dishonest; see the November 15, 2011 5:36 AM comment later on in the thread):</p>
<blockquote><p>I [Rowan] sat there with the books [by others] on my kitchen table and typed the passages up word for word. I had a plot in mind, initially, and looked for passages that would work within that context. People told me the initial plot was dull (spies being killed all over Europe &#8211; no one knows why), so I changed it to be more like the premise of McCarry&#8217;s &#8220;Second Sight&#8221; which was a whole lot more interesting. I had certain things I wanted to see happen in the initial plot: a double cross, a drive through the South of France, a raid on a snowy satellite base. Eventually I found passages that adhered to these kinds of scenes that only meant changing the plot a little bit here and there. It felt very much like putting an elaborate puzzle together. Every new passage added has its own peculiar set of edges that had to find a way in.</p></blockquote>
<p>The problem is not that he cut and pasted but that he didn&#8217;t acknowledge the sources.  Although if he&#8217;d done that, he might&#8217;ve been up against some copyright infringement problems.</p>
<p>A commenter writes:</p>
<blockquote><p>The whole thing about this that is so sad is that, yes, writing is hard work, and sending your words into the world to be read and judged is hard. But writing also brings me great joy and satisfaction. And that joy is what Quentin has cheated himself out of because he was scared.</p></blockquote>
<p>I don&#8217;t know about that.  Putting together an entire novel out of existing scraps and pieces&#8212;that&#8217;s pretty impressive to me.  Quilting may be less technically impressive than weaving but it&#8217;s a skill all its own.  Similarly, rappers have stolen lots of 70s riffs but they&#8217;ve added something of their own.</p>
<p><strong>Literary vs. academic theft</strong></p>
<p>The commenters also discuss other literary plagiarists such as Jacob Epstein, Patricia Waddell, Richard Condon, and Jerzy Kosinski.</p>
<p>Based on all these examples, literary plagiarism seems a bit different than academic plagiarism.  (And both are different from journalists who <a href="http://themonkeycage.org/blog/2011/10/20/easterbrook-corrects-2-out-of-3-errors-but-does-not-acknowledge-the-source-is-this-just-standard-practice-in-the-world-of-paid-journalism/">take</a> from blogs without giving credit.)</p>
<p>Goodwin, Fischer, Wegman, Tribe, Ayres, Dershowitz, etc etc etc are doing just fine in their careers.  They don&#8217;t <em>need</em> to plagiarize; they seem to do it out of a sense of obligation, or because they&#8217;re too lazy to figure things out themselves.  (Or maybe for one of <a href="http://themonkeycage.org/blog/2011/06/08/top-ten-plagiarism-excuses/">these</a> reasons.)  It&#8217;s less effort to copy than to fully read external material, incorporate it into one&#8217;s worldview, and rewrite it in a way that is coherent with one&#8217;s larger argument.</p>
<p>In contrast, for literary plagiarists there is skill involved in patching together complementary material from others&#8217; published work, seeking passages that are obscure enough or bland enough to escape notice.  I don&#8217;t think that even Ed Wegman&#8217;s strongest defenders would argue that he applied any wit or creativity in his ripoffs of Wikipedia and other published sources.  But I think you can admire the skill (if not the dishonesty) of a literary copyist who can put together a whole novel out of others&#8217; material.</p>
<p><strong>Getting energy from the reader</strong></p>
<p>I also liked this comment, later on in the thread:</p>
<blockquote><p>Books contain energy, and when you purposefully use words found in other books, you pull that energy into your own work.</p>
<p>I [the commenter] think the energy comes from the author, and all her experiences and deliberations, but also it comes from the people working on the book: editors, artists, marketing, etc. I&#8217;d go so far as to say that people reading and responding to books can contribute to their energy as well.</p></blockquote>
<p>Interesting point.  I believe the reader can supply a lot, and some books stimulate this by having lots of hooks, as it were, to connect to the readers&#8217; thoughts and experiences.  Think of all those memoirs that work by reminding readers of their own childhoods.  Or consider Nassim Taleb&#8217;s books.  Many of my correspondents were surprised that I responded so positively to <a href="http://andrewgelman.com/2006/02/fooled_by_rando/">Fooled by Randomness</a> and <a href="http://andrewgelman.com/2007/04/nassim_talebs_t/">The Black Swan</a>, but I really enjoyed the experience of reading them with pen in hand; they were just the right books to bring out lots of thoughts that I had within myself.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/if-an-entire-article-in-computational-statistics-and-data-analysis-were-put-together-from-other-unacknowledged-sources-would-that-be-a-work-of-art/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Familial Linkage between Neuropsychiatric Disorders and Intellectual Interests</title>
		<link>http://andrewgelman.com/2012/02/familial-linkage-between-neuropsychiatric-disorders-and-intellectual-interests/</link>
		<comments>http://andrewgelman.com/2012/02/familial-linkage-between-neuropsychiatric-disorders-and-intellectual-interests/#comments</comments>
		<pubDate>Thu, 09 Feb 2012 14:47:58 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sociology]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14401</guid>
		<description><![CDATA[When I spoke at Princeton last year, I talked with neuroscientist Sam Wang, who told me about a project he did surveying incoming Princeton freshmen about mental illness in their families. He and his coauthor Benjamin Campbell found some interesting results, which they just published: A link between intellect and temperament has long been the [...]]]></description>
			<content:encoded><![CDATA[<p>When I spoke at Princeton last year, I talked with neuroscientist Sam Wang, who told me about a project he did surveying incoming Princeton freshmen about mental illness in their families.  He and his coauthor Benjamin Campbell found some interesting results, which they just <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030405">published</a>:</p>
<blockquote><p>A link between intellect and temperament has long been the subject of speculation. . . . Studies of the artistically inclined report linkage with familial depression, while among eminent and creative scientists, a lower incidence of affective disorders is found. In the case of developmental disorders, a heightened prevalence of autism spectrum disorders (ASDs) has been found in the families of mathematicians, physicists, and engineers. . . .</p>
<p>We surveyed the incoming class of 2014 at Princeton University about their intended academic major, familial incidence of neuropsychiatric disorders, and demographic variables. . . . Consistent with prior findings, we noticed a relation between intended academic majors and ASDs. Looking for relations between other neuropsychiatric disorders and academic interest we also noted a heightened prevalence of bipolar disorder, major depressive disorder and substance abuse in the families of those pursuing the humanities. A composite score based on these four heritable disorders was strongly correlated with a student&#8217;s intended academic major. Thus, familial risk toward a spectrum of psychopathologies can predict propensity toward technical versus humanist interests.</p></blockquote>
<p>When I spoke with Sam last year we discussed various ways to analyze the data as well as various interpretations of the results, but I don&#8217;t actually remember any of our conversation except for the bit where he described to me how they conducted their study.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/familial-linkage-between-neuropsychiatric-disorders-and-intellectual-interests/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Charles Murray [perhaps] does a Tucker Carlson, provoking me to unleash the usual torrent of graphs</title>
		<link>http://andrewgelman.com/2012/02/charles-murray-does-a-tucker-carlson-provoking-me-to-unleash-the-usual-torrent-of-graphs/</link>
		<comments>http://andrewgelman.com/2012/02/charles-murray-does-a-tucker-carlson-provoking-me-to-unleash-the-usual-torrent-of-graphs/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 14:07:54 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Political Science]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14404</guid>
		<description><![CDATA[Charles Murray wrote a much-discussed new book, &#8220;Coming Apart: The State of White America, 1960-2010.&#8221; David Frum quotes Murray as writing, in an echo of now-forgotten TV personality Tucker Carlson, that the top 5% of incomes &#8220;tends to be liberal&#8212;right? There&#8217;s no getting around it. Every way of answering this question produces a yes.&#8221; [I’ve [...]]]></description>
			<content:encoded><![CDATA[<p>Charles Murray wrote a much-discussed new book, &#8220;Coming Apart:  The State of White America, 1960-2010.&#8221;</p>
<p>David Frum <a href="http://www.thedailybeast.com/articles/2012/02/07/charles-murray-book-review-part-3.html">quotes</a> Murray as writing, in an echo of now-forgotten TV personality <a href="http://andrewgelman.com/2007/10/nobody_ever_men/">Tucker Carlson</a>, that the top 5% of incomes &#8220;tends to be liberal&#8212;right? There&#8217;s no getting around it.  Every way of answering this question produces a yes.&#8221;</p>
<p>[I’ve interjected a “perhaps” into the title of this blog post to indicate that I don’t have the exact Murray quote here so I’m relying on David Frum’s interpretation.]</p>
<p>Frum does me the favor of citing Red State Blue State as evidence, and I&#8217;d like to back this up with some graphs.</p>
<p>Frum writes:</p>
<blockquote><p>Say &#8220;top 5%&#8221; to Murray, and his imagination conjures up everything he dislikes: coastal liberals listening to NPR in their Lexus hybrid SUVs. He sees that image so intensely that no mere number can force him to remember that the top 5% also includes the evangelical Christian assistant coach of a state university football team. . . .</p></blockquote>
<p>To put it in graphical terms:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-07-at-9.27.39-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-07-at-9.27.39-PM.png" alt="" title="Screen shot 2012-02-07 at 9.27.39 PM" width="462" height="590" class="alignnone size-full wp-image-14415" /></a><br />
<span id="more-14404"></span><br />
In blue America (where Charles Murray, David Brooks, David Frum, and I live), rich people are a bit more economically conservative than poor people but this is balanced by rich people being more socially liberal.  As a result, rich and poor have similar voting patterns in the blue states.</p>
<p>In contrast, in red America (where Charles Murray and David Brooks locate the forgotten majority), rich people are both economically and socially conservative (at least they were in 2000, which is when the data for this graph came from).</p>
<p>America&#8217;s top 5% of income includes rich households in red, purple, and blue America&#8212;and, as the graph above shows, this represents a large variation in political views.</p>
<p>Another way to put it is that, as we say in our book, the culture war is not a battle between rich liberals and poor (or middle-class) conservatives or even a battle between rich conservatives and lower-income liberals.  Rather, the culture war is between rich liberals and rich conservatives.</p>
<p>It&#8217;s not the Prius vs. the pickup truck, it&#8217;s the Prius vs. the Hummer.</p>
<p>There are more rich conservatives than rich liberals (just as there are more rich Republican voters than rich Democratic voters) but the minority of upper-income Americans who are liberal do play an important role in our society.  It&#8217;s worth spending some time thinking about rich liberals, but it&#8217;s also worth remembering that most rich Americans are not liberal.</p>
<p>I think Charles Murray is interested in religion too, so let me throw in this set of graphs that subsets the population according to religious attendance:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-07-at-10.38.24-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/Screen-shot-2012-02-07-at-10.38.24-PM.png" alt="" title="Screen shot 2012-02-07 at 10.38.24 PM" width="587" height="441" class="alignnone size-full wp-image-14422" /></a></p>
<p>Just to hack at this a little more:  <a href="http://andrewgelman.com/2009/08/who_are_the_lib/">data</a> from 2000, 2004, and 2008 showing the income distribution of voters self-classified by ideology (liberal, moderate, or conservative) and party identification (Democrat, Independent, or Republican):</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/pidideology1.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/pidideology1.png" alt="" title="pidideology" width="600" height="600" class="alignnone size-full wp-image-14420" /></a></p>
<p>There are more rich conservatives than rich liberals.</p>
<p>And here are some maps of how different ethnic groups voted in 2008 (click for the large version):</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/new2008map.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/new2008map-300x229.png" alt="" title="new2008map" width="300" height="229" class="alignnone size-medium wp-image-14424" /></a></p>
<p>The second row of maps gives the answer to your questions about white America, or at least those white Americans who vote.  (The evidence is that, compared to voters, nonvoters have lower income and are more likely to favor income redistribution.  So I don&#8217;t think that moving from &#8220;white voters&#8221; to &#8220;white America&#8221; would change our story.)</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/charles-murray-does-a-tucker-carlson-provoking-me-to-unleash-the-usual-torrent-of-graphs/feed/</wfw:commentRss>
		<slash:comments>39</slash:comments>
		</item>
		<item>
		<title>The more likely it is to be X, the more likely it is to be Not X?</title>
		<link>http://andrewgelman.com/2012/02/the-more-likely-it-is-to-be-x-the-more-likely-it-is-to-be-not-x/</link>
		<comments>http://andrewgelman.com/2012/02/the-more-likely-it-is-to-be-x-the-more-likely-it-is-to-be-not-x/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 00:25:47 +0000</pubDate>
		<dc:creator>Phil</dc:creator>
				<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14406</guid>
		<description><![CDATA[This post is by Phil Price. A paper by Wood, Douglas, and Sutton looks at &#8220;Beliefs in Contradictory Conspiracy Theories.&#8221;  Unfortunately the  subjects were 140 undergraduate psychology students, so one wonders how general the results are.  I found this sort of arresting: In Study 1 (n=137), the more participants believed that Princess Diana faked her [...]]]></description>
			<content:encoded><![CDATA[<p>This post is by Phil Price.</p>
<p>A <a title="Paper at academia.edu" href="http://kent.academia.edu/RobbieSutton/Papers/1275313/Dead_and_alive_Beliefs_in_contradictory_conspiracy_theories" target="_blank">paper by Wood, Douglas, and Sutton</a> looks at &#8220;Beliefs in Contradictory Conspiracy Theories.&#8221;  Unfortunately the  subjects were 140 undergraduate psychology students, so one wonders how general the results are.  I found this sort of arresting:</p>
<blockquote><p>In Study 1 (n=137), the more participants believed that Princess Diana faked her own death, the more they believed she was murdered.  In Study 2 (n=102), the more participants believed that Osama Bin Laden was already dead when U.S. Special Forces raided his compound in Pakistan, the more they believed he is still alive.</p></blockquote>
<p>As the article says, &#8220;conspiracy advocates&#8217; distrust of official narratives may be so strong that many alternative theories are simultaneously endorsed in spite of any contradictions between them.&#8221;  But I think the authors overstate things when they say &#8220;One would think that there ought to be a negative correlation between beliefs in contradictory accounts of events &#8212; the more one believes in a particular theory, the less likely rival theories will seem.&#8221;  Well, one might think that, but actually a positive correlation makes sense to me.  I can see how, if you really think that a lot of what the government says is a lie, you would think &#8220;well, I don&#8217;t know exactly which part of the Bin Laden account is a lie but they are probably lying about something; maybe he was already dead, or maybe he&#8217;s still alive now, but I don&#8217;t know which.&#8221;  The authors realize this is what is going on, they just make too much of how surprising it should be.</p>
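<p>To make that argument concrete, here is a minimal sketch of a toy model. It is my own illustration, not the analysis in the paper. Each simulated person has a general level of distrust in the official account and splits that distrust between the two contradictory theories. The names <code>distrust</code> and <code>share</code>, and the 0.3 to 0.7 range for the split, are assumptions chosen only for illustration. Every individual&#8217;s two credences add up to no more than 1, so nobody in the simulation is being incoherent, yet across people the two beliefs can still be positively correlated because the variation in overall distrust dominates.</p>

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible

already_dead = []  # credence that he was dead before the raid
still_alive = []   # credence that he is still alive now

for _ in range(5000):
    # Hypothetical: overall probability that the official story is false.
    distrust = rng.random()
    # Hypothetical: how that distrust is divided between the two theories.
    share = rng.uniform(0.3, 0.7)
    already_dead.append(distrust * share)
    still_alive.append(distrust * (1 - share))

r = pearson(already_dead, still_alive)
print(round(r, 2))
```

<p>Under these assumptions the correlation comes out positive. If the split between the two theories varies a lot from person to person (say, <code>share</code> uniform on 0 to 1), the correlation can turn negative instead, so the sign depends on how much people differ in overall distrust relative to how much they differ in which alternative they favor.</p>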
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/the-more-likely-it-is-to-be-x-the-more-likely-it-is-to-be-not-x/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Philosophy of Bayesian statistics: my reactions to Hendry</title>
		<link>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-hendry/</link>
		<comments>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-hendry/#comments</comments>
		<pubDate>Tue, 07 Feb 2012 14:45:00 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14353</guid>
		<description><![CDATA[Continuing with my discussion here and here of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics: David Hendry, &#8220;Empirical Economic Model Discovery and Theory Evaluation&#8221;: Hendry presents a wide-ranging overview of scientific learning, with an interesting comparison of physical with social sciences. (For some [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing with my discussion <a href="http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-cox-and-mayo/">here</a> and <a href="http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-senn/">here</a> of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/hendry1.gif"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/hendry1.gif" alt="" title="hendry1" width="91" height="104" class="alignnone size-full wp-image-14354" /></a></p>
<p>David Hendry, &#8220;Empirical Economic Model Discovery and Theory Evaluation&#8221;:</p>
<p>Hendry presents a wide-ranging overview of scientific learning, with an interesting comparison of physical with social sciences.  (For some reason, he discusses many physical sciences but restricts his social-science examples to economics and psychology.)</p>
<p>The only part of Hendry&#8217;s long and interesting article that I will discuss, however, is the part where he decides to take a gratuitous swing at Bayes.  I don&#8217;t know why he did this, but maybe it&#8217;s part of some fraternity initiation thing, like TP-ing the dean&#8217;s house on Halloween.</p>
<p>Here&#8217;s the story.  Hendry writes:</p>
<blockquote><p>&#8216;Prior distributions&#8217; widely used in Bayesian analyses, whether subjective or &#8216;objective&#8217;, cannot be formed in such a setting either, absent a falsely assumed crystal ball. Rather, imposing a prior distribution that is consistent with an assumed model when breaks are not included is a recipe for a bad analysis in macroeconomics. Fortunately, priors are neither necessary nor sufficient in the context of discovery.</p></blockquote>
<p>I could just laugh this off&#8212;but as someone who has published two books and hundreds of articles on applied Bayesian statistics, I think I&#8217;ll take Hendry seriously.</p>
<p>Let me start with the tone.  I generally don&#8217;t like it when people take words or phrases they disagree with and put them in quotes.  If you&#8217;re going to put &#8220;prior distributions&#8221; and &#8220;objective&#8221; in quotes, then please show the same disrespect to your other terms:  &#8220;falsely&#8221; . . . &#8220;crystal ball&#8221; . . . &#8220;breaks&#8221; . . . &#8220;recipe&#8221; . . . &#8220;macroeconomics&#8221; . . . &#8220;discovery.&#8221;</p>
<p>But let me get to the substance.  First, Hendry&#8217;s right.  No statistical method is necessary.  With sufficient effort, I think you can solve all statistical problems with Bayesian methods, or with robust methods, or with bootstrapping, or with any number of alternative approaches.  Fuzzy sets would probably work too.  Different approaches have different advantages, but I&#8217;m sure that if Hendry adopts a self-denying ordinance and decides to never use priors, he can solve all sorts of data analysis problems.  He&#8217;ll just have to work really hard sometimes.  But, to be fair, there are some problems that I have to work really hard on too.  In short:  econometrics methods tend to require more effort in complicated settings, but they often have appealing robustness properties.  It&#8217;s fair enough that Hendry and I place different values on robustness vs. modeling flexibility.</p>
<p>My most serious criticism of Hendry&#8217;s above paragraph is the old, old story:  he&#8217;s singling out Bayesian methods and priors as being particularly bad.  Meanwhile all those likelihood functions and assumptions of additivity, symmetry, etc. all just sneak in.  Hendry&#8217;s standing at the back window with a shotgun, scanning for priors coming over the hill, while a million assumptions just walk right into his house through the front door.</p>
<p>Here&#8217;s Hendry&#8217;s summary:</p>
<blockquote><p>The pre-existing framework of ideas is bound to structure any analysis for better or worse, but being neither necessary nor sufficient, often blocking, and unhelpful in a changing world, prior distributions should play a minimal role in data analyses that seek to discover useful knowledge.</p></blockquote>
<p>I&#8217;m going to have to disagree.  I could give a million examples of useful knowledge that can be discovered with the aid of prior distributions.  For example, where are the houses in the U.S. that have high radon levels?  What are the effects of redistricting?  How much perchloroethylene does the body metabolize?  What is public opinion on gay rights by state?  Or, for a classic from Mosteller and Wallace in 1960, classify the authorship of the Federalist Papers using 1960s technology.</p>
<p>I&#8217;m not saying that Hendry and his colleagues need to be using Bayesian methods in their applied research.  I&#8217;m not even saying that Bayesian methods are needed to solve the problems listed in the above paragraph.  In practice these problems were indeed solved using Bayesian inference, but I think other approaches could get there too.  What I am saying is, why is Hendry so sure that &#8220;prior distributions should play a minimal role&#8221; etc.?  I&#8217;m really bothered when people go beyond the simple and direct, &#8220;I have no personal experience with Bayesian inference solving a useful problem&#8221; to prescriptive (and wrong) statements such as &#8220;prior distributions should play a minimal role.&#8221;  And it&#8217;s just silly to say that priors are &#8220;unhelpful in a changing world.&#8221;  I&#8217;d think an econometrician would know about time series models!</p>
<p>Hendry also pulls the no-true-Scotsman trick:</p>
<blockquote><p>Fortunately, priors are neither necessary nor sufficient in the context of discovery. For example, children learn whatever native tongue is prevalent around them, be it Chinese, Arabic or English, for none of which could they have a &#8216;prior&#8217;. Rather, trial-and-error learning seems a child&#8217;s main approach to language acquisition: see Clark and Clark (1977). Certainly, a general language system seems to be hard wired in the human brain (see Pinker 1994; 2002) but that hardly constitutes a prior. Thus, in one of the most complicated tasks imaginable, which computers still struggle to emulate, priors are not needed.</p></blockquote>
<p>This is a no-true-Scotsman argument because, when confronted with an example in which our brains figure things out using a pre-existing structure (not for Chinese, Arabic, or English, but for human language in general), Hendry simply says that this system that is &#8220;hard wired in the human brain . . . hardly constitutes a prior.&#8221;  Huh?  It&#8217;s definitely a prior.  That&#8217;s the whole point:  our brains are tuned to decode human language.</p>
<p>Why do a few throwaway paragraphs in an otherwise pretty good article bug me so much?  Hendry&#8217;s anti-Bayesian sentiments are no more clueless than those earlier expressed by, say, <a href="http://andrewgelman.com/2010/06/hey_dude_ya_don/">John DiNardo</a>.  The difference is that DiNardo was just venting his opinions and was pretty open about this, whereas Hendry&#8217;s presenting his prejudices with an air of expertise.  If Hendry wants to work on &#8220;replacing unrestricted non-linear functions by an encompassing theory-derived form, such as an ogive,&#8221; then fine.  His theoretical models of model selection seem interesting and could perhaps be useful.  I just wish he&#8217;d cut out the part where he implicitly disparages the work of Mosteller and Wallace, Lax and Phillips, and a few zillion other researchers who&#8217;ve used Bayesian methods to solve problems.</p>
<p>It&#8217;s not too late for Hendry to reform (I hope).  All he needs to do is retreat to presenting the positive virtues of his preferred inferential approach, along with his explanations of why Bayesian methods have not seemed useful to him.  He&#8217;s an econometrician, he doesn&#8217;t work in toxicology, and that&#8217;s fine.  I think both his positive and his negative statements would be stronger if he were more aware of the limits of his own experience.  Just as, in mathematics, a theorem is clearer if you understand the range of its applicability and the areas where there are counterexamples.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-hendry/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Bayesian model-building by pure thought:  Some principles and examples</title>
		<link>http://andrewgelman.com/2012/02/bayesian-model-building-by-pure-thought-some-principles-and-examples/</link>
		<comments>http://andrewgelman.com/2012/02/bayesian-model-building-by-pure-thought-some-principles-and-examples/#comments</comments>
		<pubDate>Mon, 06 Feb 2012 14:52:07 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14393</guid>
		<description><![CDATA[This is one of my favorite papers: In applications, statistical models are often restricted to what produces reasonable estimates based on the data at hand. In many cases, however, the principles that allow a model to be restricted can be derived theoretically, in the absence of any data and with minimal applied context. We illustrate [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.stat.columbia.edu/~gelman/research/published/deep.pdf">This</a> is one of my favorite papers:</p>
<blockquote><p>In applications, statistical models are often restricted to what produces reasonable estimates based on the data at hand. In many cases, however, the principles that allow a model to be restricted can be derived theoretically, in the absence of any data and with minimal applied context. We illustrate this point with three well-known theoretical examples from spatial statistics and time series. First, we show that an autoregressive model for local averages violates a principle of invariance under scaling. Second, we show how the Bayesian estimate of a strictly-increasing time series, using a uniform prior distribution, depends on the scale of estimation. Third, we interpret local smoothing of spatial lattice data as Bayesian estimation and show why uniform local smoothing does not make sense. In various forms, the results presented here have been derived in previous work; our contribution is to draw out some principles that can be derived theoretically, even though in the past they may have been presented in detail in the context of specific examples.</p></blockquote>
<p>I just love this paper.  But it&#8217;s only been cited 17 times (and four of those were by me), so I must have done something wrong.  In retrospect I think it would&#8217;ve made more sense to write it as three separate papers; then each might have had its own impact.  In any case, I hope the article provides some enjoyment and insight to those of you who click through.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/bayesian-model-building-by-pure-thought-some-principles-and-examples/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>What is a prior distribution?</title>
		<link>http://andrewgelman.com/2012/02/what-is-a-prior-distribution/</link>
		<comments>http://andrewgelman.com/2012/02/what-is-a-prior-distribution/#comments</comments>
		<pubDate>Sun, 05 Feb 2012 14:30:57 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14361</guid>
		<description><![CDATA[Some recent blog discussion revealed some confusion that I&#8217;ll try to resolve here. I wrote that I&#8217;m not a big fan of subjective priors. Various commenters had difficulty with this point, and I think the issue was most clearly stated by Bill Jefferys, who wrote: It seems to me that your prior has to reflect [...]]]></description>
			<content:encoded><![CDATA[<p>Some recent blog discussion revealed some confusion that I&#8217;ll try to resolve here.</p>
<p>I <a href="http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-senn/">wrote</a> that I&#8217;m not a big fan of subjective priors.  Various commenters had difficulty with this point, and I think the issue was most clearly stated by Bill Jeff<del datetime="2012-02-07T14:11:27+00:00">re</del>erys, who <a href="http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-senn/#comment-72825">wrote</a>:</p>
<blockquote><p>It seems to me that your prior has to reflect your subjective information before you look at the data. How can it not?</p>
<p>But this does not mean that the (subjective) prior that you choose is irrefutable; Surely a prior that reflects prior information just does not have to be inconsistent with that information. But that still leaves a range of priors that are consistent with it, the sort of priors that one would use in a sensitivity analysis, for example.</p></blockquote>
<p>I think I see what Bill is getting at.  A prior represents your subjective belief, or some approximation to your subjective belief, even if it&#8217;s not perfect.  That sounds reasonable but I don&#8217;t think it works.  Or, at least, it often doesn&#8217;t work.</p>
<p>Let&#8217;s start with a simple example.  You hop on a scale that gives unbiased measurements with errors that have a standard deviation of 0.1 kg.  To do Bayesian analysis, you assign a N(0,10000^2) prior on your true weight.  That doesn&#8217;t represent your subjective belief!  It&#8217;s not even an approximation.  No problem&#8212;it works fine for most purposes&#8212;but it&#8217;s not subjective.</p>
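<p>To see why this prior &#8220;works fine for most purposes,&#8221; here is a minimal sketch of the standard normal-normal update with known measurement error. The function name <code>normal_posterior</code> and the scale reading of 70.0 kg are hypothetical, chosen only for illustration. The prior precision is so tiny compared with the precision of the scale that the posterior is essentially the measurement itself.</p>

```python
def normal_posterior(prior_mean, prior_sd, y, sigma):
    """Posterior mean and sd for a normal mean, given one measurement y
    with known measurement sd sigma and a normal prior."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / sigma ** 2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior mean and data.
    post_mean = (prior_precision * prior_mean + data_precision * y) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# N(0, 10000^2) prior, scale error sd of 0.1 kg,
# and a hypothetical reading of 70.0 kg.
post_mean, post_sd = normal_posterior(0.0, 10000.0, 70.0, 0.1)
print(post_mean, post_sd)
```

<p>The posterior mean lands within a tiny fraction of a gram of the reading and the posterior standard deviation is essentially 0.1 kg, so the inference is driven by the data even though nobody actually believes their weight could plausibly be 10,000 kg, let alone negative.</p>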
<p>More generally, think of all the linear and logistic regressions we use.  Instead of thinking of these as subjective beliefs, I prefer to think of the joint probability distribution as a model, reflecting a set of assumptions.  In some settings these assumptions represent subjective beliefs, in other settings they don&#8217;t.</p>
<p><a href="http://www.stat.columbia.edu/~gelman/research/published/p039-_o.pdf">This article</a> from 2002 might help.  If I could go back and alter it, I&#8217;d add something on weakly informative priors, but I still agree with the general approach discussed there.</p>
<p><strong>P.S.</strong>  Just to give an example of what I mean by prior information:  The analyses in Red State Blue State all use noninformative prior distributions.  But a lot of prior information comes in, in the selection of what questions to study, what models to consider, and what variables to include in the model.  For example, as state-level predictors we include region of the country, Republican vote in the previous presidential election, and average state income.  Prior <em>information</em> goes into the choice and construction of all these predictors.  But the prior <em>distribution</em> is a particular probability distribution that in this case is flat and does not reflect prior knowledge.</p>
<p>One way to think about informative prior distributions is as a form of smoothing:  when setting the parameters of a probability distribution based on prior knowledge, we are imposing some time smoothness on the parameters.  I think that&#8217;s probably a good idea and that the Red State Blue State analyses (among others) would be better for it.  I didn&#8217;t set up this prior structure because I wasn&#8217;t easily equipped to do so and it seemed like too much effort, but perhaps at some future time this sort of structuring will be as commonplace as hierarchical modeling is today.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/what-is-a-prior-distribution/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>“Turn a Boring Bar Graph into a 3D Masterpiece”</title>
		<link>http://andrewgelman.com/2012/02/turn-a-boring-bar-graph-into-a-3d-masterpiece/</link>
		<comments>http://andrewgelman.com/2012/02/turn-a-boring-bar-graph-into-a-3d-masterpiece/#comments</comments>
		<pubDate>Sun, 05 Feb 2012 02:26:18 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Statistical graphics]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14078</guid>
		<description><![CDATA[Jimmy sends in this. Steps include &#8220;Make whimsical sparkles by drawing an ellipse using the Ellipse Tool,&#8221; &#8220;Rotate the sparkles . . . Give some sparkles less Opacity by using the Transparency Palette,&#8221; and &#8220;Add a haze around each sparkle by drawing a white ellipse using the Ellipse Tool.&#8221; The punchline: Now, the next time [...]]]></description>
			<content:encoded><![CDATA[<p>Jimmy sends in <a href="http://vector.tutsplus.com/tutorials/designing/turn-a-boring-bar-graph-into-a-3d-masterpiece/">this</a>.</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-10-at-4.28.19-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-10-at-4.28.19-PM.png" alt="" title="Screen shot 2012-01-10 at 4.28.19 PM" width="561" height="630" class="alignnone size-full wp-image-14079" /></a></p>
<p>Steps include &#8220;Make whimsical sparkles by drawing an ellipse using the Ellipse Tool,&#8221; &#8220;Rotate the sparkles . . . Give some sparkles less Opacity by using the Transparency Palette,&#8221; and &#8220;Add a haze around each sparkle by drawing a white ellipse using the Ellipse Tool.&#8221;</p>
<p>The punchline:</p>
<blockquote><p>Now, the next time you need to include a boring graph in one of your designs you’ll be able to add some extra emphasis and get people to really pay attention to those numbers!</p></blockquote>
<p>P.S. to all the commenters:  Yeah, yeah, do your contrarian best and tell me why chartjunk is actually a good thing, how I&#8217;m just a snob, etc etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/turn-a-boring-bar-graph-into-a-3d-masterpiece/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>More on the economic benefits of universities</title>
		<link>http://andrewgelman.com/2012/02/more-on-the-economic-benefits-of-universities/</link>
		<comments>http://andrewgelman.com/2012/02/more-on-the-economic-benefits-of-universities/#comments</comments>
		<pubDate>Sat, 04 Feb 2012 14:16:17 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[Sociology]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13733</guid>
		<description><![CDATA[Last year my commenters and I discussed Ed Glaeser&#8217;s claim that the way to create a great city is to &#8220;create a great university and wait 200 years.&#8221; I passed this on to urbanist Richard Florida and received the following response: This is a tough one with lots of causality issues. Generally speaking universities make [...]]]></description>
			<content:encoded><![CDATA[<p>Last year my commenters and I <a href="http://andrewgelman.com/2011/03/a_question_abou_10/#comments">discussed</a> Ed Glaeser&#8217;s claim that the way to create a great city is to &#8220;create a great university and wait 200 years.&#8221;</p>
<p>I passed this on to urbanist Richard Florida and received the following response:<br />
<span id="more-13733"></span></p>
<blockquote><p>This is a tough one with lots of causality issues.  Generally speaking universities make places stronger. But this is mainly the case for smaller, college towns. Boulder, Ann Arbor and so on, which also have very high human capital levels and high levels of creative, knowledge and professional workers. </p>
<p>For big cities the issue is mixed. Take Pittsburgh with CMU and Pitt or Baltimore with Hopkins, or St Louis. The list goes on and on.</p>
<p>Kevin Stolarick and I framed this very crudely as a transmitter receiver issue. The university in a city like this can generate a lot of signal, in terms of innovation or even human capital and the city may not receive it or push it away. A long ago paper by Mike Fogarty showed how innovations in Pittsburgh and Cleveland, by universities in these communities, tended to be picked up in Silicon Valley or even Tokyo.</p></blockquote>
<p>I responded:  Another factor in the interaction is:  how good does the university have to be?  Glaeser cited UW and Seattle, but that&#8217;s kind of a funny example, because I don&#8217;t think UW was such a great university 30 years ago.  On the other hand, given the existence of Boeing and Microsoft, UW is <em>good enough</em> to do the job of providing a center for the creative class.  Perhaps Ohio State (another good but not great university) has played a similar role in Columbus.</p>
<p>Florida replied:</p>
<blockquote><p>Better is better.  I think both are over threshold, but having taught at OSU at the very beginning of my career, it brings both plusses and minuses. It was an open admission school. The faculty was very, very mixed. And a huge football factory. Gates and Allen among others have pumped big wads of cash into UW, and it is good in computers and biosciences. </p>
<p>Both strike me as regional talent hubs, which probably trumps university quality.</p>
<p>Portland is another outlier with lots of talent/ human capital attraction and pretty crappy universities.</p></blockquote>
<p>Florida also sent along <a href="http://www.creativeclass.typepad.com/thecreativityexchange/files/university_and_the_creative_economy.pdf">this article</a> and <a href="http://www.theatlantic.com/business/archive/2010/10/where-the-worlds-brains-are/64508/">this blog</a>.</p>
<p>Also, Hal Varian wrote:</p>
<blockquote><p>There is a literature that attempts to assess the impact of university research on the local economy.  One person I know well who works in this area is <a href="http://mgt.gatech.edu/directory/faculty/thursby_m/index.html">Marie Thursby</a>.  Click on her vitae to see the kind of work that has been done.  This is pretty careful research, though of course it is hard to pin down causality&#8230;</p></blockquote>
<p>P.S.  Originally I wrote that UW is not such a great university, which may or may not be true but is sort of beside the point since the real issue is whether UW&#8217;s past greatness contributed to Seattle&#8217;s current prosperity.  So I clarified that I&#8217;m really talking about UW thirty or so years ago.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/more-on-the-economic-benefits-of-universities/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Web equation</title>
		<link>http://andrewgelman.com/2012/02/web-equation/</link>
		<comments>http://andrewgelman.com/2012/02/web-equation/#comments</comments>
		<pubDate>Fri, 03 Feb 2012 22:06:36 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Statistical computing]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14357</guid>
		<description><![CDATA[Aleks sends along this app which, while cute, is not quite &#8220;killer&#8221; for me. I find it more difficult to write the equation using the trackpad than to simply type it in using Latex! But I suppose it could be useful to beginners who want their papers to look more like science.]]></description>
			<content:encoded><![CDATA[<p>Aleks sends along this <a href="http://webdemo.visionobjects.com/equation.html?locale=default">app</a> which, while cute, is not quite &#8220;killer&#8221; for me.  I find it more difficult to write the equation using the trackpad than to simply type it in using Latex!  But I suppose it could be useful to beginners who want their papers to <a href="http://www.stat.columbia.edu/~gelman/research/unpublished/zombies.pdf">look more like science</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/web-equation/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Philosophy of Bayesian statistics:  my reactions to Senn</title>
		<link>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-senn/</link>
		<comments>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-senn/#comments</comments>
		<pubDate>Fri, 03 Feb 2012 14:53:00 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14346</guid>
		<description><![CDATA[Continuing with my discussion of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics: Stephen Senn, &#8220;You May Believe You Are a Bayesian But You Are Probably Wrong&#8221;: I agree with Senn&#8217;s comments on the impossibility of the de Finetti subjective Bayesian approach. As I [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing with <a href="http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-cox-and-mayo/">my discussion of the articles in the special issue</a> of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/02/stephen-senn-photo.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/02/stephen-senn-photo.png" alt="" title="stephen senn photo" width="130" height="132" class="alignnone size-full wp-image-14347" /></a></p>
<p>Stephen Senn, &#8220;You May Believe You Are a Bayesian But You Are Probably Wrong&#8221;:</p>
<p>I agree with Senn&#8217;s comments on the impossibility of the de Finetti subjective Bayesian approach.  As I wrote in 2008, if you could really construct a subjective prior you believe in, why not just look at the data and write down your subjective posterior?  The immense practical difficulties with <em>any</em> serious system of inference render it absurd to think that it would be possible to just write down a probability distribution to represent uncertainty.  I wish, however, that Senn would recognize <em>my</em> Bayesian approach (which is also that of John Carlin, Hal Stern, Don Rubin, and, I believe, others).  De Finetti is no longer around, but we are!</p>
<p>I have to admit that my own Bayesian views and practices have changed.  In particular, I resonate with Senn&#8217;s point that conventional flat priors miss a lot and that Bayesian inference can work better when real prior information is used.  Here I&#8217;m not talking about a subjective prior that is meant to express a personal belief but rather a distribution that represents a summary of prior scientific knowledge.  Such an expression can only be approximate (as, indeed, assumptions such as logistic regressions, additive treatment effects, and all the rest, are only approximations too), and I agree with Senn that it would be rash to let philosophical foundations be a justification for using Bayesian methods.  Rather, my work on the philosophy of statistics is intended to demonstrate how Bayesian inference can fit into a falsificationist philosophy that I am comfortable with on general grounds.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-senn/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
		<item>
		<title>The inevitable problems with statistical significance and 95% intervals</title>
		<link>http://andrewgelman.com/2012/02/the-inevitable-problems-with-statistical-significance-and-95-intervals/</link>
		<comments>http://andrewgelman.com/2012/02/the-inevitable-problems-with-statistical-significance-and-95-intervals/#comments</comments>
		<pubDate>Thu, 02 Feb 2012 14:00:25 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14017</guid>
		<description><![CDATA[I&#8217;m thinking more and more that we have to get rid of statistical significance, 95% intervals, and all the rest, and just come to a more fundamental acceptance of uncertainty. In practice, I think we use confidence intervals and hypothesis tests as a way to avoid acknowledging uncertainty. We set up some rules and then [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m thinking more and more that we have to get rid of statistical significance, 95% intervals, and all the rest, and just come to a more fundamental acceptance of uncertainty.</p>
<p>In practice, I think we use confidence intervals and hypothesis tests as a way to avoid acknowledging uncertainty. We set up some rules and then act as if we know what is real and what is not. Even in my own applied work, I&#8217;ve often enough presented 95% intervals and gone on from there. But maybe that&#8217;s just not right.</p>
<p>I was thinking about this after receiving the following email from a psychology student:<br />
<span id="more-14017"></span></p>
<blockquote><p>I [the student] am trying to conceptualize the lessons in <a href="http://www.stat.columbia.edu/~gelman/research/published/signif4.pdf">your paper with Stern</a> with comparing treatment effects across studies. When trying to understand if a certain intervention works, we must look at what the literature says. However this can be complicated if the literature has divergent results. There are four situations I am thinking of. For each of these situations, assume the studies are randomized control designs with the same treatment and outcome measures, and each situation refers to a different treatment. It is easiest for me to put it into a table. In each of these situations only 1 of 2 published studies is found to be statistically significant.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td>
<div></div>
</td>
<td>
<p align="center"><strong>Effect</strong></p>
</td>
<td>
<p align="center"><strong>se</strong></p>
</td>
<td>
<p align="center"><strong>Sig</strong></p>
</td>
<td>
<p align="center"><strong>Sig in diff</strong></p>
</td>
<td>
<p align="center"><strong>Result</strong></p>
</td>
</tr>
<tr>
<td><strong>Situation 1</strong></td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td>     Study A</td>
<td>
<p align="center">.5</p>
</td>
<td>
<p align="center">.05</p>
</td>
<td>
<p align="center">Y</p>
</td>
<td rowspan="2">
<p align="center">X</p>
</td>
<td rowspan="2">
<p align="center">Treatment is effective</p>
</td>
</tr>
<tr>
<td>     Study B</td>
<td>
<p align="center">.4</p>
</td>
<td>
<p align="center">.2</p>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td><strong>Situation 2</strong></td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td>     Study C</td>
<td>
<p align="center">.5</p>
</td>
<td>
<p align="center">.1</p>
</td>
<td>
<p align="center">Y</p>
</td>
<td rowspan="2">
<p align="center">Y</p>
</td>
<td rowspan="2">
<p align="center">Unclear, needs more replications</p>
</td>
</tr>
<tr>
<td>     Study D</td>
<td>
<p align="center">.1</p>
</td>
<td>
<p align="center">.1</p>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td><strong>Situation 3</strong></td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td>     Study E</td>
<td>
<p align="center">.41</p>
</td>
<td>
<p align="center">.2</p>
</td>
<td>
<p align="center">Y</p>
</td>
<td rowspan="2">
<p align="center">X</p>
</td>
<td rowspan="2">
<p align="center">Unclear, needs more replications</p>
</td>
</tr>
<tr>
<td>     Study F</td>
<td>
<p align="center">.14</p>
</td>
<td>
<p align="center">.2</p>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td><strong>Situation 4</strong></td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
<td>
<div></div>
</td>
</tr>
<tr>
<td>     Study G</td>
<td>
<p align="center">.7</p>
</td>
<td>
<p align="center">.3</p>
</td>
<td>
<p align="center"> Y</p>
</td>
<td rowspan="2">
<p align="center">X</p>
</td>
<td rowspan="2">
<p align="center">Null/needs more replications</p>
</td>
</tr>
<tr>
<td>     Study H</td>
<td>
<p align="center">.19</p>
</td>
<td>
<p align="center">.1</p>
</td>
<td></td>
</tr>
</tbody>
</table>
<p>Here, Situation 1 refers to 2 studies that have similar effects in magnitude, though the larger of the 2 studies (smaller se) is the only sig one. Since the difference between the two effects is itself not statistically significant, we should conclude treatment in situation 1 is effective (this seems to be in line with your paper).<br />
In situation 2 there are 2 equally sized experiments that differ in treatment effect and significance. Since the difference between the estimates is statistically significant, one concludes the paradigm needs more replications.<br />
In situation 3 the 2 studies have 2 effects, one is statistically significant while the other is not. However in this situation study F is neither statistically nor substantively significant. Unlike situation 1 it would seem unwise to conclude Treatment in situation 3 is effective and we need more replications.<br />
Situation 4 is just some result I came across in a research synthesis, where a smaller study (larger se) had a statistically sig effect, but a larger one did not. It would seem in this situation the true effect is null and the stat sig effect is a type 1 error. However the difference between studies is not stat sig, would this matter?</p></blockquote>
<p>I replied that my quick reaction is that it would be better if there were data from more studies.  With only two studies, your inference will necessarily depend on your prior information about effectiveness and variation of the treatments.</p>
<p>The student then wrote:</p>
<blockquote><p>That is my reaction as well. Unfortunately sometimes the only data we have is from a small number of studies, and not enough to necessarily run a meta-analysis on.  In addition, the hypothetical situations I sent you are sometimes all we know about the effectiveness and variation in treatments, because it is all the evidence we have.  What I am trying to better understand is if your paper is addressing situation 1 ONLY, or if it is making inferences or statements about the evidence in the other situations I presented.</p></blockquote>
<p>To which I replied that I don&#8217;t know that our paper gives any real recommendations.  In a decision problem, I think ultimately it&#8217;s necessary to bite the bullet and decide what prior information you have on effectiveness rather than relying on statistical significance.</p>
<p>This is a problem under classical or Bayesian methods.  Either way, it&#8217;s standard practice to summarize uncertainty in a way that encourages deterministic thinking.</p>
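<p>For readers who want to check the arithmetic behind the student&#8217;s table, here is a minimal sketch. It assumes the two studies in each situation are independent and that a normal approximation with a 1.96 cutoff is adequate; the function and variable names are illustrative only and do not come from the paper with Stern.</p>

```python
import math

# Effects and standard errors copied from the student's table.
SITUATIONS = {
    "Situation 1": ((0.5, 0.05), (0.4, 0.2)),    # Studies A, B
    "Situation 2": ((0.5, 0.1), (0.1, 0.1)),     # Studies C, D
    "Situation 3": ((0.41, 0.2), (0.14, 0.2)),   # Studies E, F
    "Situation 4": ((0.7, 0.3), (0.19, 0.1)),    # Studies G, H
}

Z_CRIT = 1.96  # conventional two-sided 5% cutoff (an assumption of this sketch)


def z_score(effect, se):
    """Estimate divided by its standard error."""
    return effect / se


def difference_z(first, second):
    """z-score of the difference between two independent estimates."""
    (e1, se1), (e2, se2) = first, second
    return (e1 - e2) / math.hypot(se1, se2)


# z-score of the between-study difference for each situation.
diff_z = {name: difference_z(a, b) for name, (a, b) in SITUATIONS.items()}

for name, (a, b) in SITUATIONS.items():
    print(name,
          "z1 = %.2f" % z_score(*a),
          "z2 = %.2f" % z_score(*b),
          "z of difference = %.2f" % diff_z[name],
          "difference significant:", abs(diff_z[name]) > Z_CRIT)
```

<p>Estimates that sit close to two standard errors from zero can be labeled either way depending on the exact test used, which is one more reason not to lean too hard on the significant/not-significant split.</p>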
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/the-inevitable-problems-with-statistical-significance-and-95-intervals/feed/</wfw:commentRss>
		<slash:comments>38</slash:comments>
		</item>
		<item>
		<title>Philosophy of Bayesian statistics:  my reactions to Cox and Mayo</title>
		<link>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-cox-and-mayo/</link>
		<comments>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-cox-and-mayo/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 14:45:18 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14333</guid>
		<description><![CDATA[The journal Rationality, Markets and Morals has finally posted all the articles in their special issue on the philosophy of Bayesian statistics. My contribution is called Induction and Deduction in Bayesian Data Analysis. I&#8217;ll also post my reactions to the other articles. I wrote these notes a few weeks ago and could post them all [...]]]></description>
			<content:encoded><![CDATA[<p>The journal Rationality, Markets and Morals has finally posted all the articles in their <a href="http://andrewgelman.com/2011/09/articles-on-the-philosophy-of-bayesian-statistics-by-cox-mayo-senn-and-others/">special issue</a> on the philosophy of Bayesian statistics.</p>
<p><a href="http://www.rmm-journal.de/downloads/Article_Gelman.pdf">My contribution</a> is called Induction and Deduction in Bayesian Data Analysis.  I&#8217;ll also post my reactions to the other articles.  I wrote these notes a few weeks ago and could post them all at once, but I think it will be easier if I post my reactions to each article separately.</p>
<p><img src="http://andrewgelman.com/wp-content/uploads/2012/02/cox2.jpg" alt="" title="cox" width="105" height="130" class="alignnone size-full wp-image-14340" /><img src="http://andrewgelman.com/wp-content/uploads/2012/02/mayo.jpg" alt="" title="mayo" width="99" height="144" class="alignnone size-full wp-image-14338" /></p>
<p>To start with my best material, here&#8217;s my reaction to David Cox and Deborah Mayo, &#8220;A Statistical Scientist Meets a Philosopher of Science.&#8221;  I recommend you read all the way through my long note below; there&#8217;s good stuff throughout:</p>
<p>1.  Cox:  &#8220;[Philosophy] forces us to say what it is that we really want to know when we analyze a situation statistically.&#8221;</p>
<p>This reminds me of a standard question that Don Rubin (who, unlike me, has little use for philosophy in his research) asks in virtually any situation:  &#8220;What would you do if you had all the data?&#8221;  For me, that &#8220;what would you do&#8221; question is one of the universal solvents of statistics.</p>
<p>2.  Mayo defines scientific objectivity as concerning &#8220;the goal of using data to distinguish correct from incorrect claims about the world&#8221; and contrasts this with so-called objective Bayesian statistics.  All I can say here is that the terms &#8220;subjective&#8221; and &#8220;objective&#8221; seem way overloaded at this point.  To me, science is objective in that it aims for reproducible findings that exist independent of the observer, and it&#8217;s subjective in that the process of science involves many individual choices.  And I think the statistics I do (mostly, but not always, using Bayesian methods) is both objective and subjective in that way.</p>
<p>3.  Cox discusses Fisher&#8217;s rule that it&#8217;s ok to use prior information in design of data collection but not in data analysis.  Like a lot of hundred-year-old ideas, this rule makes sense in some contexts but not in others.  Consider the notorious study in which a random sample of a few thousand people was analyzed, and it was found that the most beautiful parents were 8 percentage points more likely to have girls, compared to less attractive parents.  The result was statistically significant (p&lt;.05) and published in a reputable journal.  But in this case we have good prior information suggesting that the difference in sex ratios in the population, comparing beautiful to less-beautiful parents, is less than 1 percentage point.  A classical design analysis reveals that, with this level of true difference, any statistically-significant observed difference in the sample is likely to be noise.  (Even conditional on statistical significance, the observed difference has an over 40% chance of being in the wrong direction and will overestimate the population difference by an order of magnitude.)  At this point, you might well say that the original analysis should never have been done at all---but, given that it has been done, it is essential to use prior information to interpret the data and generalize from sample to population.</p>
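<p>A rough sketch of the kind of design analysis described above, under stated assumptions: the estimate is treated as normally distributed around the true difference, the standard error of 3.3 percentage points and the true difference of 0.3 percentage points are illustrative values chosen for this sketch (the post only says the true difference is under 1 point and that p&lt;.05), and the function names are hypothetical.</p>

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def design_analysis(true_effect, se, z_crit=1.96):
    """For an estimate ~ Normal(true_effect, se^2) with a positive true effect,
    return (power, probability of wrong sign given significance,
    expected overestimation factor given significance)."""
    shift = true_effect / se
    upper = z_crit - shift    # standardized cutoff for a significant positive estimate
    lower = -z_crit - shift   # standardized cutoff for a significant negative estimate
    p_right = 1.0 - normal_cdf(upper)   # significant, correct sign
    p_wrong = normal_cdf(lower)         # significant, wrong sign
    power = p_right + p_wrong
    # Expected absolute estimate over the significant region (truncated-normal means).
    abs_sum = (true_effect * (p_right - p_wrong)
               + se * (normal_pdf(upper) + normal_pdf(lower)))
    exaggeration = (abs_sum / power) / true_effect
    return power, p_wrong / power, exaggeration


# Assumed inputs, in percentage points (not taken from the post).
power, wrong_sign, exaggeration = design_analysis(true_effect=0.3, se=3.3)
print("power: %.3f" % power)
print("P(wrong sign | significant): %.2f" % wrong_sign)
print("expected overestimation factor: %.1f" % exaggeration)
```

<p>Smaller assumed true differences push the wrong-sign probability closer to one half and the overestimation factor higher, so the qualitative conclusion does not hinge on the exact inputs.</p>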
<p>Where did Fisher&#8217;s principle go wrong here?  The answer is simple&#8212;and I think Cox would agree with me here.  We&#8217;re in a setting where the prior information is much stronger than the data.  If one&#8217;s only goal is to summarize the data, then taking the difference of 8% (along with a confidence interval and even a p-value) is fine.  But if you want to generalize to the population&#8212;which was indeed the goal of the researcher in this example&#8212;then it makes no sense to stop there.</p>
<p>Cox illustrates the difficulty in a later quote:  &#8220;[Bayesians'] conceptual theories are trying to do two entirely different things. One is trying to extract information from the data, while the other, personalistic theory, is trying to indicate what you should believe, with regard to information from the data and other, prior, information treated equally seriously. These are two very different things.&#8221;</p>
<p>Yes, but Cox is missing something important!  He defines two goals:<br />
(a)  Extracting information from the data.<br />
(b)  A &#8220;personalistic theory&#8221; of &#8220;what you should believe.&#8221;<br />
I&#8217;m talking about something in between, which is inference for the population.  I think Laplace would understand what I&#8217;m talking about here.  The sample is (typically) of no interest in itself, it&#8217;s just a means to learning about the population.  But my inferences about the population aren&#8217;t &#8220;personalistic&#8221;&#8212;at least, no more than the dudes at CERN are personalistic when they&#8217;re trying to learn about particle theory from cyclotron experiments, and no more than the Census and the Bureau of Labor Statistics are personalistic when they&#8217;re trying to learn about the U.S. economy from sample data.</p>
<p>4.  Cox:  &#8220;There are situations where it is very clear that whatever a scientist or statistician might do privately in looking at data, when they present their information to the public or government department or whatever, they should absolutely not use prior information, because the prior opinions on some of these prickly issues of public policy can often be highly contentious with different people with strong and very conflicting views.&#8221;</p>
<p>Maybe.  But I don&#8217;t think Cox even believes this statement himself if it were taken literally.  For example, right now I&#8217;m working on the politically controversial problem of reconstructing historical climate from tree rings.  We have a lot of prior information on the processes under which tree rings grow and how they are measured.  I don&#8217;t think anyone would want to just take raw numbers from core samples as a climate estimate!  All the tools from Statistical Methods for Research Workers won&#8217;t take you from tree rings to temperature estimates.  You need some scientific knowledge and prior information on where these measurements came from.</p>
<p>So let me interpret what I think Cox was saying.  I take him to be dividing any scientific inference into two parts, inside and outside.  Priors are allowed in the inside work of scientific modeling, which uses lots of external information, from the basic assumptions that the data correspond to your scientific goals, through the mathematical form of the transfer function, down to details such as an assumption of normally-distributed measurement errors, which might be supported based on prior experimental evidence.  But Cox would prefer to avoid priors in the outside problem.  In my example, I assume he&#8217;d allow prior information on the tree-ring measurement process&#8212;I don&#8217;t see how you can get anywhere otherwise&#8212;but he&#8217;d rather not combine with external estimates of the temperature series.  That&#8217;s a tenable position.  It doesn&#8217;t avoid all the controversy&#8212;manipulations of the data model can map in predictable ways to changes in the final inferences&#8212;but it could make sense.</p>
<p>I&#8217;ve followed this approach in much of my own applied work, using noninformative priors and carefully avoiding the use of prior information in the final stages of a statistical analysis.  But that can&#8217;t always be the right choice.  Sometimes (as in the sex ratio example above), the data are just too weak&#8212;and a classical textbook data analysis can be misleading.  Imagine a Venn diagram, where one circle is &#8220;Topics that are so controversial that we want to avoid using prior information in the statistical analysis&#8221; and the other circle is &#8220;Problems where the data are weak compared to prior information.&#8221;  If you&#8217;re in the intersection of these circles, you have to make some tough choices!</p>
<p>More generally, there is a Bayesian solution to the problem of sensitivity to prior assumptions.  That solution is sensitivity analysis:  perform several analyses using different reasonable priors.  Make more explicit the mapping from prior and data to conclusions.  Be open about sensitivity, don&#8217;t try to sweep the problem under the rug, etc etc.  And, if you&#8217;re going that route, I&#8217;d also like to see some analysis of sensitivity to assumptions that are not conventionally classified as &#8220;prior.&#8221;  You know, those assumptions that get thrown in because they&#8217;re what everybody does.  For example, Cox regression is great, but additivity is a prior assumption too!  (One might argue that assumptions such as additivity, logistic links, etc., are exempt from Fisher&#8217;s strictures by virtue of being default assumptions rather than being based on prior information&#8212;but I certainly don&#8217;t think Mayo would take that position, given her strong feelings on Bayesian default priors.)</p>
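<p>As a toy illustration of such a sensitivity analysis, the sketch below reruns a conjugate normal-normal update on the sex-ratio example under several priors. The 8-point estimate is from the example above; the 3.3-point standard error and the list of prior standard deviations are assumptions made up for this sketch, and the normal model itself is a simplification.</p>

```python
import math


def normal_posterior(prior_mean, prior_sd, estimate, se):
    """Posterior mean and sd for a normal prior combined with a normal likelihood."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / se ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * estimate)
    return post_mean, math.sqrt(post_var)


ESTIMATE = 8.0   # observed difference, percentage points (from the example)
SE = 3.3         # assumed standard error, percentage points
PRIOR_SDS = [0.3, 1.0, 3.0, 10.0]   # hypothetical priors, all centered at zero

# Map each prior sd to the resulting (posterior mean, posterior sd).
results = {sd: normal_posterior(0.0, sd, ESTIMATE, SE) for sd in PRIOR_SDS}

for sd in PRIOR_SDS:
    mean, post_sd = results[sd]
    print("prior sd %5.1f -> posterior mean %.2f, posterior sd %.2f" % (sd, mean, post_sd))
```

<p>Laying the results out this way makes the mapping from prior and data to conclusions explicit: a tight prior leaves the estimate near zero, while a very diffuse one essentially reproduces the raw 8 points.</p>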
<p>My point here is that all statistical methods require choices&#8212;assumptions, if you will.  Not all your choices can be determined or even validated from the data at hand.  If you don&#8217;t want your choices to be based on prior information, what other options do you have?  You can rely on convention&#8212;using methods that appear in major textbooks and have stood the test of time&#8212;or maybe on theory.  Both these meta-foundational approaches have their virtues but neither is perfect:  Conventional methods are not necessarily good (as can be seen by noting that for many problems there are multiple conventional methods that give different results), and theory often doesn&#8217;t help (for example classical confidence intervals and hypothesis tests are insufficient in the simple sex-ratio problem noted above).</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/02/philosophy-of-bayesian-statistics-my-reactions-to-cox-and-mayo/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>“the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature”</title>
		<link>http://andrewgelman.com/2012/01/the-forces-of-native-stupidity-reinforced-by-that-blind-hostility-to-criticism-reform-new-ideas-and-superior-ability-which-is-human-as-well-as-academic-nature/</link>
		<comments>http://andrewgelman.com/2012/01/the-forces-of-native-stupidity-reinforced-by-that-blind-hostility-to-criticism-reform-new-ideas-and-superior-ability-which-is-human-as-well-as-academic-nature/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 14:53:07 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Political Science]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14006</guid>
		<description><![CDATA[Q. D. Leavis wrote: The answer does seem to be that the academic world, like other worlds, is run by the politicians, and sensitively scrupulous people tend to leave politics to other people, while people with genuine work to do certainly have no time as well as no taste for committee-rigging and the associated techniques. [...]]]></description>
			<content:encoded><![CDATA[<p>Q. D. Leavis wrote:</p>
<blockquote><p>The answer does seem to be that the academic world, like other worlds, is run by the politicians, and sensitively scrupulous people tend to leave politics to other people, while people with genuine work to do certainly have no time as well as no taste for committee-rigging and the associated techniques.  And then of course there are the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature.</p></blockquote>
<p>Not that I&#8217;ve ever read anything by Mrs. Leavis (or, as the Brits used to write, Mrs Leavis).  The above quote is one of the epigraphs to a book by Richard Kostelanetz.  Whom I&#8217;ve never heard of, except in a footnote in John Rodden&#8217;s classic Orwell study, The Politics of Literary Reputation.</p>
<p>I&#8217;ll have more to say about Orwell in another post, but for now let me return to the above Leavis quote, to which I have three reactions:</p>
<p>1.  On a personal level, I&#8217;m on Leavis&#8217;s side.  I&#8217;d much rather work (or blog, which I feel is related to my work and is also a public service) than spend time on academic politics:  forming coalitions, doing the pre-meeting meetings, trading favors, kissing up and kicking down, and all the rest.</p>
<p>To put it another way, I don&#8217;t like political games because (a) I&#8217;m not good at manipulation and deception, and (b) Much of politics is zero-sum, and I prefer to collaborate in positive-sum activities such as writing Stan.</p>
<p>2.  But on a more practical level, <em>somebody</em> needs to do the dirty work.  Every once in a while.  I&#8217;ve encountered some administrators who are good at &#8220;committee-rigging,&#8221; etc., and others who show less political ability.  I&#8217;ve seen people use political processes in a pointless destructive way&#8212;power for the sake of power&#8212;but others can use their political skills to foster smooth cooperation.</p>
<p>To put it another way, I require the political efforts of others to create the safe space I need to do my work.  And it&#8217;s a special bonus when these political efforts are not &#8220;reinforced by that blind hostility to criticism, reform, new ideas and superior ability.&#8221;</p>
<p>3.  As a political scientist, I recognize that politics is necessary. There&#8217;s no such thing as a non-political process.  Politics is how we fight against entropy.  Whatever non-politicized zones we have in life are often the result of continued political effort.  As the saying goes, the price of liberty is eternal vigilance.</p>
<p>Ultimately I&#8217;ll have to go with #3.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/the-forces-of-native-stupidity-reinforced-by-that-blind-hostility-to-criticism-reform-new-ideas-and-superior-ability-which-is-human-as-well-as-academic-nature/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Statistical Murder</title>
		<link>http://andrewgelman.com/2012/01/statistical-murder/</link>
		<comments>http://andrewgelman.com/2012/01/statistical-murder/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 22:32:31 +0000</pubDate>
		<dc:creator>Aleks Jakulin</dc:creator>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[Public Health]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14277</guid>
		<description><![CDATA[Robert Zubrin writes in &#8220;How Much Is an Astronaut’s Life Worth?&#8221; (Reason, Feb 2012): &#8230;policy analyst John D. Graham and his colleagues at the Harvard Center for Risk Analysis found in 1997 that the median cost for lifesaving expenditures and regulations by the U.S. government in the health care, residential, transportation, and occupational areas ranges [...]]]></description>
			<content:encoded><![CDATA[<div class="mceTemp">
<p><div class="wp-caption alignright" style="width: 160px"><a href="http://commons.wikipedia.org/wiki/File:Robert_Zubrin_by_the_Mars_Society.jpg"><img class="zemanta-img-inserted zemanta-img-configured" src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/Robert_Zubrin_by_the_Mars_Society.jpg/300px-Robert_Zubrin_by_the_Mars_Society.jpg" alt="English: Photo of Robert Zubrin taken by the M..." width="150" /></a><p class="wp-caption-text">Image via Wikipedia</p></div></p>
</div>
<p><a href="http://www.marssociety.org/">Robert Zubrin</a> writes in <a href="http://reason.com/archives/2012/01/26/how-much-is-an-astronauts-life-worth/singlepage">&#8220;How Much Is an Astronaut’s Life Worth?&#8221;</a> (<a href="http://reason.com/">Reason</a>, <a href="http://reason.com/issues/february-2012">Feb 2012</a>):</p>
<blockquote><p>&#8230;policy analyst John D. Graham and his colleagues at the Harvard Center for Risk Analysis found in 1997 that the median cost for lifesaving expenditures and regulations by the U.S. government in the health care, residential, transportation, and occupational areas ranges from about $1 million to $3 million spent per life saved in today’s dollars. The only marked exception to this pattern occurs in the area of environmental health protection (such as the Superfund program) which costs about $200 million per life saved.</p>
<p>Graham and his colleagues call the latter kind of inefficiency “<a class="zem_slink" title="Statistical murder" href="http://en.wikipedia.org/wiki/Statistical_murder" rel="wikipedia">statistical murder</a>,” since thousands of additional lives could be saved each year if the money were used more cost-effectively. To avoid such deadly waste, the Department of Transportation has a policy of rejecting any proposed safety expenditure that costs more than $3 million per life saved. That ceiling therefore may be taken as a high-end estimate for the value of an American’s life as defined by the U.S. government.</p></blockquote>
<p>This reminds me of my old article on <a href="http://andrewgelman.com/2009/02/value_of_life/">Value of Life</a> &#8211; where the hidden cost of the Iraq war for the US comes to 720,000 lives lost (based on the huge cost).</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/statistical-murder/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>A tax on inequality, or a tax to keep inequality at the current level?</title>
		<link>http://andrewgelman.com/2012/01/a-tax-on-inequality-or-a-tax-to-keep-inequality-at-the-current-level/</link>
		<comments>http://andrewgelman.com/2012/01/a-tax-on-inequality-or-a-tax-to-keep-inequality-at-the-current-level/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 15:39:55 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Economics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13924</guid>
		<description><![CDATA[My sometime coauthor Aaron Edlin cowrote (with Ian Ayres) an op-ed recommending a clever approach to taxing the rich. In their article they employ a charming bit of economics jargon, using the word &#8220;earn&#8221; to mean &#8220;how much money you make.&#8221; They &#8220;propose an automatic extra tax on the income of the top 1 percent [...]]]></description>
			<content:encoded><![CDATA[<p>My sometime coauthor Aaron Edlin cowrote (with Ian Ayres) <a href="http://www.nytimes.com/2011/12/19/opinion/dont-tax-the-rich-tax-inequality-itself.html">an op-ed</a> recommending a clever approach to taxing the rich.</p>
<p>In their article they employ a charming bit of economics jargon, using the word &#8220;earn&#8221; to mean &#8220;how much money you make.&#8221;  They &#8220;propose an automatic extra tax on the income of the top 1 percent of earners.&#8221;  I assume their tax would apply to unearned income as well, but they (or their editor at the Times) are just so used to describing income as &#8220;earnings&#8221; that they just threw that in.  Funny.</p>
<p>Also, there&#8217;s a part of the article that doesn&#8217;t make sense to me.<br />
<span id="more-13924"></span><br />
Ayres and Edlin first describe the level of inequality:</p>
<blockquote><p>In 1980 the average 1-percenter made 12.5 times the median income, but in 2006 (the latest year for which data is available) the average income of our richest 1 percent was a whopping 36 times greater than that of the median household.</p></blockquote>
<p>Then they lay out their solution:</p>
<blockquote><p>Enough is enough. . . . we propose an automatic extra tax on the income of the top 1 percent of earners — a tax that would limit the after-tax incomes of this club to 36 times the median household income.</p></blockquote>
<p>This seems fair enough to me, but one thing puzzles me:  my impression is that Ayres and Edlin feel that the rich already have too much.  So why freeze inequality at the current level?  (Yes, inequality could decline, but if it&#8217;s on an inexorable upward trend, my quick guess would be that maxing this ratio at 36 would be nearly equivalent to <em>setting</em> the ratio to 36.)  Given the U.S. budget crisis, why 36?  Why not 30, or 20, or 15?</p>
<p>P.S.  When we last heard from Ayres he was supplying <a href="http://andrewgelman.com/2010/04/advice_to_help/">advice</a> for young people who were rich or expecting to be rich.  So I think it&#8217;s fair to say he&#8217;s no class warrior, that he&#8217;d like to keep income inequality at the current level but no lower.</p>
<p>And please note that I&#8217;m neither endorsing the Ayres/Edlin plan nor criticizing it.  (Given my lack of expertise in macroeconomics, I&#8217;m certainly not the one you&#8217;d go running to, asking for an informed opinion on a proposed tax plan.)  I&#8217;m just asking a question.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/a-tax-on-inequality-or-a-tax-to-keep-inequality-at-the-current-level/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>Convenient page of data sources from the Washington Post</title>
		<link>http://andrewgelman.com/2012/01/convenient-page-of-data-sources-from-the-washington-post/</link>
		<comments>http://andrewgelman.com/2012/01/convenient-page-of-data-sources-from-the-washington-post/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 14:45:40 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Miscellaneous Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13954</guid>
		<description><![CDATA[Wayne Folta points us to this list.]]></description>
			<content:encoded><![CDATA[<p>Wayne Folta points us to <a href="http://www.washingtonpost.com/wp-srv/metro/data/datapost.html">this list</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/convenient-page-of-data-sources-from-the-washington-post/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>G+ &gt; Skype</title>
		<link>http://andrewgelman.com/2012/01/g-skype/</link>
		<comments>http://andrewgelman.com/2012/01/g-skype/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 19:58:41 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Teaching]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14270</guid>
		<description><![CDATA[I spoke at the University of Kansas the other day. Kansas is far away so I gave the talk by video. We did it using a G+ hangout, and it worked really well, much much better than when I gave a talk via Skype. With G+, I could see and hear the audience clearly, and [...]]]></description>
			<content:encoded><![CDATA[<p>I <a href="http://web.ku.edu/~quant/cgi-bin/mw0/index.php?title=Weekly_Colloquium">spoke</a> at the University of Kansas the other day.  Kansas is far away so I gave the talk by video.  We did it using a G+ hangout, and it worked really well, much much better than when I gave a talk via <a href="http://andrewgelman.com/2010/11/i_just_skyped_i/">Skype</a>.  With G+, I could see and hear the audience clearly, and they could hear me just fine while seeing my slides (or my face, I went back and forth).  Not as good as a live presentation but pretty good, considering.</p>
<p>P.S.  And <a href="http://plus.google.com/hangouts/extras">here&#8217;s</a> how to do it!</p>
<p>Conflict of interest disclaimer:  I was paid by Google last year to give a short course.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/g-skype/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>How many parameters are in a multilevel model?</title>
		<link>http://andrewgelman.com/2012/01/how-many-parameters-are-in-a-multilevel-model/</link>
		<comments>http://andrewgelman.com/2012/01/how-many-parameters-are-in-a-multilevel-model/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 14:54:38 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Multilevel Modeling]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13103</guid>
		<description><![CDATA[Stephen Collins writes: I’m reading your Multilevel modeling book and am trying to apply it to my work. I’m concerned with how to estimate a random intercept model if there are hundreds/thousands of levels. In the Gibbs sampling, am I sampling a parameter for each level? Or, just the hyper-parameters? In other words, say I [...]]]></description>
			<content:encoded><![CDATA[<p>Stephen Collins writes:</p>
<blockquote><p>I’m reading your Multilevel modeling book and am trying to apply it to my work.  I’m concerned with how to estimate a random intercept model if there are hundreds/thousands of levels.  In the Gibbs sampling, am I sampling a parameter for each level?  Or, just the hyper-parameters?  In other words, say I had 500 zipcode intercepts modeled as ~ N(m,s).  Would my posterior be two dimensional, sampling for “m” and “s,” or would it have 502 dimensions?</p></blockquote>
<p>My reply:  Indeed you will have hundreds or thousands of parameters&#8212;or, in classical terms, hundreds or thousands of predictive quantities.  But that&#8217;s ok.  Even if none of those predictions is precise, you&#8217;re learning  about the model.</p>
<p>See page 526 of the book for more discussion of the number of parameters in a multilevel model.</p>
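<p>To make the counting concrete, here is a minimal sketch of a Gibbs sampler for the random-intercept setup in the question.  It is an illustration under stated assumptions, not the algorithm from the book:  the data standard deviation is taken as known, m gets a flat prior, s gets a uniform prior, the data are simulated, and the function names are hypothetical.  Each draw holds all 500 intercepts plus m and s, so the sampled vector has 502 entries.</p>

```python
import random
import statistics


def gibbs_random_intercepts(groups, sigma_y=1.0, n_iter=200, seed=0):
    """Gibbs sampler for y_ij ~ N(a_j, sigma_y^2), a_j ~ N(m, s^2).

    Assumptions (for illustration only): sigma_y is known, m has a flat
    prior, and s has a uniform prior.  Each returned draw is a list of
    the J intercepts followed by m and s.
    """
    rng = random.Random(seed)
    J = len(groups)
    n = [len(g) for g in groups]
    ybar = [statistics.fmean(g) for g in groups]
    a = ybar[:]                       # start the intercepts at the group means
    m = statistics.fmean(a)
    s2 = statistics.pvariance(a)
    draws = []
    for _ in range(n_iter):
        # 1. Update every group intercept given the hyperparameters.
        for j in range(J):
            prec = n[j] / sigma_y ** 2 + 1.0 / s2
            mean = (n[j] * ybar[j] / sigma_y ** 2 + m / s2) / prec
            a[j] = rng.gauss(mean, prec ** -0.5)
        # 2. Update m given the intercepts (flat prior on m).
        m = rng.gauss(statistics.fmean(a), (s2 / J) ** 0.5)
        # 3. Update s^2 given the intercepts and m (uniform prior on s):
        #    sum of squares divided by a chi-square draw with J - 1 df.
        ss = sum((aj - m) ** 2 for aj in a)
        s2 = ss / rng.gammavariate((J - 1) / 2.0, 2.0)
        draws.append(a[:] + [m, s2 ** 0.5])
    return draws


def simulate_groups(J=500, n_per_group=5, m=2.0, s=0.5, seed=1):
    """Simulated stand-in for 500 zipcodes with a few observations each."""
    rng = random.Random(seed)
    groups = []
    for _ in range(J):
        a_j = rng.gauss(m, s)
        groups.append([rng.gauss(a_j, 1.0) for _ in range(n_per_group)])
    return groups


groups = simulate_groups()
draws = gibbs_random_intercepts(groups)
print(len(draws[0]))  # 502: 500 intercepts, then m, then s
```

<p>The last two entries of each draw are the hyperparameters; the first 500 are the zipcode intercepts, each of which is sampled at every iteration even though any single one of them is estimated imprecisely.</p>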
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/how-many-parameters-are-in-a-multilevel-model/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using predator-prey models on the Canadian lynx series</title>
		<link>http://andrewgelman.com/2012/01/the-last-word-on-the-canadian-lynx-series/</link>
		<comments>http://andrewgelman.com/2012/01/the-last-word-on-the-canadian-lynx-series/#comments</comments>
		<pubDate>Sat, 28 Jan 2012 14:58:21 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13956</guid>
		<description><![CDATA[The &#8220;Canadian lynx data&#8221; is one of the famous examples used in time series analysis. And the usual models that are fit to these data in the statistics time-series literature don&#8217;t work well. Cavan Reilly and Angelique Zeringue write: Reilly and Zeringue then present their analysis. Their simple little predator-prey model with a weakly informative [...]]]></description>
			<content:encoded><![CDATA[<p>The &#8220;Canadian lynx data&#8221; is one of the famous examples used in time series analysis.  And the usual models that are fit to these data in the statistics time-series literature don&#8217;t work well.  Cavan Reilly and Angelique Zeringue <a href="http://andrewgelman.com/wp-content/uploads/2011/12/ReillyLynx.pdf">write</a>:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2011/12/Screen-shot-2011-12-22-at-6.04.02-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2011/12/Screen-shot-2011-12-22-at-6.04.02-PM.png" alt="" title="Screen shot 2011-12-22 at 6.04.02 PM" width="456" height="241" class="alignnone size-full wp-image-13957" /></a></p>
<p>Reilly and Zeringue then present their analysis.  Their simple little predator-prey model with a weakly informative prior way outperforms the standard big-ass autoregression models.  Check this out:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2011/12/Screen-shot-2011-12-22-at-6.08.39-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2011/12/Screen-shot-2011-12-22-at-6.08.39-PM.png" alt="" title="Screen shot 2011-12-22 at 6.08.39 PM" width="455" height="325" class="alignnone size-full wp-image-13958" /></a></p>
<p>Or, to put it into numbers, when they fit their model to the first 80 years and predict to the next 34, their root mean square out-of-sample error is 1480 (see scale of data above).  In contrast, the standard model fit to these data (the SETAR model of Tong, 1990) has more than twice as many parameters but gets a worse-performing root mean square error of 1600, even when that model is fit to the entire dataset.  (If you fit the SETAR or any similar autoregressive model to the first 80 years and use it to predict the next 34, the predictions are a disaster&#8212;the predicted values quickly go toward the mean and can&#8217;t even attempt to track the curve.)</p>
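<p>For anyone who wants to run this sort of comparison, here is a rough sketch of the out-of-sample calculation:  fit on the first 80 points, forecast the remaining 34, and take the root mean square of the errors.  The series below is a synthetic stand-in (not the lynx data), and the mean-forecast baseline and all function names are hypothetical; the baseline is only there to mimic a forecast that collapses to the mean.</p>

```python
import math


def holdout_rmse(series, n_train, fit, forecast):
    """Fit on the first n_train points, forecast the rest, return the RMSE."""
    train, test = series[:n_train], series[n_train:]
    model = fit(train)
    preds = forecast(model, len(test))
    sq_err = [(p - y) ** 2 for p, y in zip(preds, test)]
    return math.sqrt(sum(sq_err) / len(sq_err))


# Hypothetical baseline: forecast every future year with the training mean.
def fit_mean(train):
    return sum(train) / len(train)


def forecast_mean(model, horizon):
    return [model] * horizon


# Synthetic 114-point series with a 10-step cycle (an assumption, not real data).
series = [1500 + 1000 * math.sin(2 * math.pi * t / 10) for t in range(114)]
rmse = holdout_rmse(series, 80, fit_mean, forecast_mean)
print(round(rmse))
```

<p>Any model can be plugged in through the <code>fit</code> and <code>forecast</code> arguments, so a process-based model and an autoregression can be scored on the same held-out years.</p>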
<p>As Reilly and Zeringue note, the above graph shows potential room for improvement in the model, but even as is, it shows the huge benefits that can be obtained by attempting to model the underlying process rather than simply fitting the data using a conventional family of models.</p>
<p>(It&#8217;s funny for me to emphasize this point, given how often I use conventional models such as linear and logistic regression.)</p>
<p>P.S.  The title and text above have been modified to reflect comments below with reference to models fit to the lynx data in the ecology literature.  There appears to be not enough communication between ecologists and statisticians.  The statistical point above still holds&#8212;a simple model with some reasonable structure can outperform a generic data-fitting model such as an autoregression&#8212;but you should probably check out some of the references given in the comments if you&#8217;re interested in the lynx example or ecology models more generally.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/the-last-word-on-the-canadian-lynx-series/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Educational monoculture</title>
		<link>http://andrewgelman.com/2012/01/educational-monoculture/</link>
		<comments>http://andrewgelman.com/2012/01/educational-monoculture/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 14:29:21 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Teaching]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14260</guid>
		<description><![CDATA[John Cook writes that he&#8217;d like to hear more people talk about &#8220;educational monoculture.&#8221; I don&#8217;t actually know John Cook but I enjoy reading his blog, so I feel like the least I can do is to honor his request. I have to admit that I have a bit of a monocultural temperament myself. I [...]]]></description>
			<content:encoded><![CDATA[<p>John Cook <a href="http://www.johndcook.com/blog/2012/01/22/educational-monoculture/">writes</a> that he&#8217;d like to hear more people talk about &#8220;educational monoculture.&#8221;  I don&#8217;t actually know John Cook but I enjoy reading his blog, so I feel like the least I can do is to honor his request.</p>
<p>I have to admit that I have a bit of a monocultural temperament myself.  I have strong feelings about the right and wrong way to do things, and I don&#8217;t have much patience for what seems to me to be the wrong way.  As a result, I&#8217;ve often disparaged or ignored important statistical developments because some small aspect of the new idea didn&#8217;t fit with my thinking.  (On the plus side, I think I&#8217;ve disparaged or ignored lots more bad ideas that deserve oblivion.)</p>
<p>I&#8217;ve always been suspicious of the hedgehog/fox distinction because my impression is that just about everybody likes to think of him or herself as a fox.  Being a hedgehog is like being &#8220;ideological&#8221;; most of us like to think of ourselves as pragmatic foxes.  And in any case I think <a href="http://andrewgelman.com/2005/10/statisticians_a/">most statisticians</a> are foxes.</p>
<p>One of the many positive outcomes of my mugging at Berkeley was a commitment to pluralism (for example, see <a href="http://andrewgelman.com/2011/04/bayesian_statis_1/">here</a>).</p>
<p>Beyond this, I move away from my natural monocultural instincts by teaching classes that include material I wouldn&#8217;t otherwise cover, by listening carefully to people I respect who do things in a different way than I do, and by thinking hard about why certain methods or attitudes that seem silly to me still remain popular.</p>
<p>Finally, my approach as a political scientist and public opinion researcher is to understand the views of others.  I think I have a pretty good grip on why it can make sense for people to vote for Gingrich or Romney or Obama or Santorum or whatever, and I&#8217;m interested in understanding political ideologies as they manifest themselves in different areas (even in statistics, where political views range from Dennis Lindley to <a href="http://andrewgelman.com/2011/03/more_on_the_cor/">Jacob Wolfowitz</a>).</p>
<p>&#8220;Moving beyond monoculture&#8221; doesn&#8217;t mean that I abandon my <a href="http://andrewgelman.com/2011/01/that_silly_esp/">skepticism</a> but it means that I should at least try to understand other approaches to looking at the world.</p>
<p>P.S.  I thought the above discussion would be more useful than yet another argument about the extent to which modern education is such a scam etc.  </p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/educational-monoculture/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Suggested resolution of the Bem paradox</title>
		<link>http://andrewgelman.com/2012/01/suggested-resolution-of-the-bem-paradox/</link>
		<comments>http://andrewgelman.com/2012/01/suggested-resolution-of-the-bem-paradox/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 14:11:20 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sociology]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14243</guid>
		<description><![CDATA[There has been an increasing discussion about the proliferation of flawed research in psychology and medicine, with some landmark events being John Ioannidis&#8217;s article, &#8220;Why most published research findings are false&#8221; (according to Google Scholar, cited 973 times since its appearance in 2005), the scandals of Marc Hauser and Diederik Stapel, two leading psychology professors [...]]]></description>
			<content:encoded><![CDATA[<p>There has been an increasing discussion about the proliferation of flawed research in psychology and medicine, with some landmark events being John Ioannidis&#8217;s <a href="http://andrewgelman.com/2007/09/most_science_st/">article</a>, &#8220;Why most published research findings are false&#8221; (according to Google Scholar, cited 973 times since its appearance in 2005), the scandals of Marc Hauser and Diederik Stapel, two leading psychology professors who resigned after disclosures of scientific misconduct, and Daryl Bem&#8217;s <a href="http://andrewgelman.com/2011/01/that_silly_esp/">dubious</a> recent paper on ESP, published to much <a href="http://www.freakonomics.com/2010/11/29/are-cornell-students-psychic/">fanfare</a> in Journal of Personality and Social Psychology, one of the top journals in the field.</p>
<p>Alongside all this are the plagiarism scandals, which are uninteresting from a scientific context but are relevant in that, in many cases, neither the institutions housing the plagiarists nor the editors and publishers of the plagiarized material seem to care.  Perhaps these universities and publishers are more worried about bad publicity (and maybe lawsuits, given that many of the plagiarism cases involve law professors) than they are about scholarly misconduct.</p>
<p>Before going on, perhaps it&#8217;s worth briefly reviewing who is hurt by the publication of flawed research.  It&#8217;s not a victimless crime.  Here are some of the malign consequences:</p>
<p>- Wasted time and resources spent by researchers trying to replicate non-findings and chasing down dead ends.</p>
<p>- Fake science news bumping real science news off the front page.</p>
<p>- When the errors and scandals come to light, a decline in the prestige of higher-quality scientific work.</p>
<p>- Slower progress of science, delaying deeper understanding of psychology, medicine, and other topics that we deem important enough to deserve large public research efforts.</p>
<p><strong>This is a hard problem!</strong></p>
<p>There&#8217;s a general sense that the system is broken with no obvious remedies.  I&#8217;m most interested in presumably sincere and honest scientific efforts that are misunderstood and misrepresented into more than they really are (the breakthrough-of-the-week mentality criticized by Ioannidis and exemplified by Bem).  As noted above, the cases of outright fraud have little scientific interest but I brought them up to indicate that, even in extreme cases, the groups whose reputations seem at risk from the unethical behavior often seem more inclined to bury the evidence than to stop the madness.</p>
<p>If universities, publishers, and editors are inclined to look away when confronted with out-and-out fraud and plagiarism, we can hardly be surprised if they&#8217;re not aggressive against merely dubious research claims.</p>
<p>In the last section of this post, I briefly discuss several examples of dubious research that I&#8217;ve encountered, just to give a sense of the difficulties that can arise in evaluating such reports.</p>
<p><strong>What to do (statistics)?</strong></p>
<p>My generic solution to the statistics problems involved in estimating small effects is to replace multiple comparisons by multilevel modeling, that is, to estimate configurations rather than single effects or coefficients.  This tactic won&#8217;t solve every problem but it&#8217;s my overarching conceptual framework.  There&#8217;s lots of room for research on how to do better in particular problem settings.</p>
<p><strong>What to do (scientific publishing)?</strong></p>
<p>I have clearer ideas of resolutions (at least in the short term) of the Bem paradox; in short, what to do with dubious but potentially interesting findings.</p>
<p>So far there seem to be two suggestions out there:  Either journals publish such claims in top venues (as for example Bem&#8217;s in JPSP, or the contagion-of-obesity paper in NEJM), or they reject them (perhaps from some combination of more careful review of methodology, higher standards than classical 5% significance, and Bayesian skepticism).</p>
<p>The problem with the publish-in-top-journals strategy is that it ensures publicity for some mistakes and it creates incentives for researchers to stretch their statistics to get a prestigious publication.</p>
<p>The problem with the reject-&#8217;em-all-and-let-the-Arxiv-sort-&#8217;em-out strategy is that it&#8217;s perhaps too rigorous.  So many papers have potential methodological flaws.  Recall that the Bem paper was published, which means in some sense that its reviewers thought the paper&#8217;s flaws were no worse than what usually gets published in JPSP.  Long-term, sure, we&#8217;d like to improve methodological rigor, but in the meantime a key problem with Bem&#8217;s paper was not <em>just</em> its methodological flaws, it was also the implausibility of the claimed results.</p>
<p>So here&#8217;s my proposed solution.  Instead of publishing speculative results in top journals such as JPSP, Science, Nature, etc., publish them in lower-ranked venues.  For example, Bem could publish his experiments in some specialized journal of psychological measurement.  If the work appears to be solid (as judged by the usual corps of referees), then publish it, get it out there.  I&#8217;m not saying to send the paper to a trash journal; if it&#8217;s good stuff it can go in a good journal, the sort where peer review really means something.  (I assume there&#8217;s also a journal of parapsychology but that&#8217;s probably just for true believers; it&#8217;s fair enough that Bem etc would like to publish somewhere that outsiders would respect.)</p>
<p>Under this system, JPSP could feel free to reject the Bem paper on the grounds that it&#8217;s too speculative to get the journal&#8217;s implicit endorsement.  This is not suppression or censorship or anything like it, it&#8217;s just a recommendation that the paper be sent to a more specialized journal where there will be a chance for criticism and replication.  At some point, if the findings are tested and replicated and seem to hold up, then it could be time for a publication in JPSP, Science, or Nature.</p>
<p>From the other side, this should be acceptable to the Bems and Fowlers who like to work on the edge.  You still get your ideas out there in a respectable publication (and you still might even get a bit of publicity), and then you, the skeptics, and the rest of the scientific community can go at it in public.</p>
<p>There have also been proposals for more interactive publications of individual articles, with bloglike opportunities for discussion and replies.  That&#8217;s fine too, but I think the only way to make real progress here is to accept that no individual article will tell the whole story, especially if the article is a report of new research.  If the Bem finding is real, this can be demonstrated in a series of papers in some specialized journal.<br />
<span id="more-14243"></span><br />
<strong>Appendix:  Individual cases can be tough!</strong></p>
<p>I&#8217;ve encountered a lot of these borderline research findings over the past several years, and my own reaction is typically formed by some mix of my personal scientific knowledge, the statistical work involved, and my general impressions.  Here are a few examples:</p>
<p>&#8220;Beautiful women have more daughters&#8221;:  I was pretty sure this one was empty just based on my background knowledge (the claim was a difference of 8 percentage points, which is much more than I could possibly expect based on the literature).  Careful review of the articles led me to find problems with the statistics.</p>
<p>Dennis the dentist, Laura the lawyer, and the proclivity of Dave Kingman and Vince Koleman to strike out a lot:  I was ready to believe the Dennis/Laura effect on occupations and only slightly skeptical of the K effect on strikeouts, but then the work was later strongly criticized on methodological grounds.  Still, my back-of-the-envelope calculation led me to believe that the hypothesized effects could be there.</p>
<p>Warming increases the risk of civil war in Africa:  This one certainly could be true but something about it rang some bells in my head and I&#8217;m skeptical.  The statistical evidence here is vague enough that I could well take the opposite tack, believing the claim and being skeptical about skepticism of it.  To be honest, if I knew these researchers personally I might very well be more inclined to trust the result.  (And that&#8217;s not so silly:  if I knew them personally I could ask them a bunch of questions and get a sense of where their belief in this finding is coming from.)</p>
<p>&#8220;45% hitting, 25% fielding, and 25% pitching&#8221;:  I was skeptical here because it was presented as a press release with no link to the paper but with enough details to make me suspect that the statistical analysis was pretty bad.</p>
<p>&#8220;Minority rules: scientists discover tipping point for the spread of ideas&#8221;:  I don&#8217;t know if this should be called &#8220;junk science&#8221; or just a silly generalization from a mathematical model.  Here I was suspicious because the claim was logically inconsistent and the study as a whole fit the pattern of physicists dabbling in social science.  (As I wrote at the time,  I&#8217;ll mock what&#8217;s mockable. If you don&#8217;t want to be mocked, don&#8217;t make mockable claims.)</p>
<p>“Discovered: the genetic secret of a happy life”:  There&#8217;s potentially something here but the differences are much smaller than implied by the headlines, the news articles, or even the abstract of the published article.</p>
<p>Whatever medical breakthrough happens to have been reported in the New York Times this week:  I believe all of these.  Even though I know that these findings don&#8217;t always persist, when I see it in the newspaper and I know nothing about the topic, I&#8217;m inclined to just believe.</p>
<p>That&#8217;s one reason the issue of flawed research is important!  I&#8217;m as well prepared as anyone to evaluate research claims, but as a consumer I can be pretty credulous when the research is not close to my expertise.</p>
<p>If there is any coherent message from the above examples, it is that my own rules for how to evaluate research claims are not clear, even to me.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/suggested-resolution-of-the-bem-paradox/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
		<item>
		<title>Chris Schmid on Evidence Based Medicine</title>
		<link>http://andrewgelman.com/2012/01/chris-schmid-on-evidence-based-medicine/</link>
		<comments>http://andrewgelman.com/2012/01/chris-schmid-on-evidence-based-medicine/#comments</comments>
		<pubDate>Wed, 25 Jan 2012 14:15:47 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Decision Theory]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Public Health]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13839</guid>
		<description><![CDATA[Chris Schmid is a statistician at New England Medical Center who is an expert on evidence-based medicine. I invited him to present an introductory overview lecture on the topic at last year&#8217;s Joint Statistical Meetings, and here are his slides. All 123 of them. I don&#8217;t know how he expected to go through all of [...]]]></description>
			<content:encoded><![CDATA[<p>Chris Schmid is a statistician at New England Medical Center who is an expert on evidence-based medicine.  I invited him to present an introductory overview lecture on the topic at last year&#8217;s Joint Statistical Meetings, and <a href="http://andrewgelman.com/wp-content/uploads/2011/12/SchmidJSM2011.pdf">here are his slides</a>.  All 123 of them.  I don&#8217;t know how he expected to go through all of these in an hour.  You could teach a semester-long course based on this material.</p>
<p>Good stuff, I recommend you all read it.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/chris-schmid-on-evidence-based-medicine/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Difficulties in publishing non-replications of implausible findings</title>
		<link>http://andrewgelman.com/2012/01/difficulties-in-publishing-non-replications-of-implausible-findings/</link>
		<comments>http://andrewgelman.com/2012/01/difficulties-in-publishing-non-replications-of-implausible-findings/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 14:57:55 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13837</guid>
		<description><![CDATA[Eric Tassone points me to this news article by Christopher Shea on the challenges of debunking ESP. Shea writes: Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a [...]]]></description>
			<content:encoded><![CDATA[<p>Eric Tassone points me to <a href="http://blogs.wsj.com/ideas-market/2011/12/07/the-challenges-of-debunking-esp/">this</a> news article by Christopher Shea on the challenges of debunking ESP.  Shea <a href="http://blogs.wsj.com/ideas-market/2011/12/07/the-challenges-of-debunking-esp/">writes</a>:</p>
<blockquote><p>Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a team that tried, but failed, to replicate those results. Here, he tells the Chronicle of Higher Education’s Tom Bartlett about the difficulties he’s had getting the results published.</p>
<p>Several journals told the team they wouldn&#8217;t publish a study that did no more than disprove a previous study. . . . An editor at another journal said he&#8217;d &#8220;only accept our paper if we ran a fourth experiment where we got a believer [in ESP] to run all the participants, to control for . . . experimenter effects.&#8221;</p></blockquote>
<p>My reaction is, this isn&#8217;t as easy a question as it might seem.  At first, one&#8217;s reaction might share Ritchie&#8217;s frustration that a shoddy paper by Bem got published while Ritchie&#8217;s careful replication got dinged.  But, as I wrote when the issue <a href="http://themonkeycage.org/blog/2011/12/07/annals-of-interesting-peer-review-decisions/#comment-21691">came up</a> on the sister blog:</p>
<p>Setting aside the whole &#8220;psychic powers&#8221; thing, it makes sense to me not to run the new experiment. After all, it&#8217;s hardly news that ESP doesn’t work. If &#8220;ESP doesn&#8217;t work&#8221; were publishable, you could fill up a journal many times over with such findings. And what would be the point of that? Better to start a new journal with some catchy title such as Replications of Well-Known Findings. In the physics division, you could have articles demonstrating that objects fall down, not up. In the chemistry division, you could publish demonstrations that H2 + O2 yields H2O plus energy. The biology section could have a paper demonstrating that cats and dogs can&#8217;t produce offspring. And so on.</p>
<p>So I don&#8217;t know the answer here.  On one hand, we can hardly require or even expect that journals fill their pages with dog-bites-man nonreplications.  (And, even in a computerized era where there are no page limits, there are still constraints on the time of editors and reviewers.)  On the other hand, this leads to an asymmetry where crap gets on the front page and the refutation doesn&#8217;t even get published on page B16.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/difficulties-in-publishing-non-replications-of-implausible-findings/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>Fight!  (also a bit of reminiscence at the end)</title>
		<link>http://andrewgelman.com/2012/01/fight/</link>
		<comments>http://andrewgelman.com/2012/01/fight/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 14:02:12 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Zombies]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13831</guid>
		<description><![CDATA[Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (&#8220;A response to the comments on our comment&#8221;), this is a topic of some controversy. Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the [...]]]></description>
			<content:encoded><![CDATA[<p>Martin Lindquist and Michael Sobel published a <a href="http://andrewgelman.com/wp-content/uploads/2012/01/CloakandDag.pdf">fun little article</a> in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (&#8220;A response to the comments on our comment&#8221;), this is a topic of some controversy. Lindquist and Sobel write:</p>
<blockquote><p>Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [Clark] Glymour&#8217;s comment is based on three claims that he inappropriately attributes to us. Glymour is also more optimistic than us about the potential of using directed graphical models (DGMs) to discover causal relations in neuroimaging research . . .</p></blockquote>
<p>Lindquist and Sobel&#8217;s arguments make sense to me, except on one point. They consider a causal setting z -&gt; x -&gt; y, where z is the treatment variable, x is the intermediate outcome, and y is the ultimate outcome, and much of their discussion centers on estimating the causal effect of x on y. I have two difficulties with their perspective:</p>
<p>1. If x is an observed variable that is not directly manipulated, I don&#8217;t know if it makes sense to talk about the effect of x on y, unconditional on the intervention that was used to change x. In their example, I&#8217;d talk about &#8220;the effect of x on y, if x is changed through z.&#8221; Different z&#8217;s can induce different effects of x on y.</p>
<p>2. Lindquist and Sobel talk about the effect of z on x. If z=0 or 1, they write x(z), so that the causal effect of z on x is x(1) &#8211; x(0) (or, more generally, x(1) compared to x(0), but we lose nothing by considering simple differences here). So far, so good.</p>
<p>But I get stuck at the next step, where they define the effect of x on y. If x can equal 0 or 1, they write y(z,x), so that the causal effect of x on y, conditional on z, is y(z,1) &#8211; y(z,0). At least, I think that&#8217;s what they&#8217;re saying.</p>
<p>The trouble is, I don&#8217;t see how the two parts of this model fit together. For any given item in the experiment, I think they&#8217;re following the rule that x(z) has a particular (although maybe unknown) value. But then I don&#8217;t see what it means to look at y(z,1) &#8211; y(z,0). For any particular value of z, it seems to me that only one of these two terms is possible. (For example, if x(z)=1, then y(z,1) is defined but y(z,0) seems meaningless.)</p>
<p>I&#8217;m not saying that this framework is wrong, just that I don&#8217;t understand it.</p>
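<p>To make the bookkeeping concrete, here is a minimal Python sketch (not from Lindquist and Sobel; the function names, coefficients, and noise model are all made up for illustration). Each unit carries potential values x(0) and x(1) and a full table y(z,x), but if z is the only thing manipulated, just the cells (z, x(z)) can ever occur, which is the sticking point described above.</p>

```python
import random

random.seed(0)

def make_unit():
    """Build one hypothetical unit; every number here is an illustrative assumption."""
    # Potential values of the intermediate outcome: x(0) and x(1).
    x_pot = {z: random.randint(0, 1) for z in (0, 1)}
    # Full table of potential outcomes y(z, x) for all four (z, x) cells.
    y_pot = {(z, x): 1.0 * z + 2.0 * x + random.gauss(0, 1)
             for z in (0, 1) for x in (0, 1)}
    return x_pot, y_pot

def reachable_cells(x_pot):
    """Cells (z, x) that can occur when z is the only thing manipulated."""
    return {(z, x_pot[z]) for z in (0, 1)}

def effect_of_x_given_z(y_pot, z):
    """y(z,1) - y(z,0): defined on the full table, but one term is never reachable via z."""
    return y_pot[(z, 1)] - y_pot[(z, 0)]

x_pot, y_pot = make_unit()
cells = reachable_cells(x_pot)
for z in (0, 1):
    for x in (0, 1):
        status = "reachable via z" if (z, x) in cells else "needs a separate intervention on x"
        print(f"y({z},{x}) = {y_pot[(z, x)]:.2f}  ({status})")
```

<p>In this toy setup, y(z,1) - y(z,0) can be computed only because the simulation writes down all four cells; giving the unreachable cell a meaning in practice seems to require imagining some way of setting x other than through z.</p>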
<p>That said, Lindquist and Sobel&#8217;s criticisms of Pearl and Glymour seem sound to me.<br />
<span id="more-13831"></span><br />
P.S. I wrote this last month and put it in the queue. Since then I&#8217;ve noticed that Pearl has responded to Lindquist and Sobel; see <a href="http://www.mii.ucla.edu/causality/?p=384">here</a>. I don&#8217;t find Pearl&#8217;s response to be so convincing&#8212;I agree with Lindquist and Sobel&#8217;s statement that the graphical or structural equation modeling expression looks simple and appealing but the underlying assumptions in those expressions are not so clear. But you can judge for yourself; as I wrote in my discussion of the book by Morgan and Winship, it&#8217;s good to have multiple expressions for a model, as different users are looking for different things.</p>
<p>To be specific, Pearl contrasts three expressions of a single model, the causal chain Z&#8212;&gt;X&#8212;&gt;Y. Here&#8217;s Pearl:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.32.44-AM.png"><img class="alignnone size-full wp-image-14234" title="Screen shot 2012-01-23 at 8.32.44 AM" src="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.32.44-AM.png" alt="" width="541" height="185" /></a></p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.33.45-AM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.33.45-AM.png" alt="" title="Screen shot 2012-01-23 at 8.33.45 AM" width="427" height="345" class="alignnone size-full wp-image-14235" /></a></p>
<p>Pearl characterizes the third expression as a more meaningful and clear display.</p>
<p>In contrast, Lindquist and Sobel argue that the above graphical expression appears clear only because it sweeps the model&#8217;s assumptions under the rug.  Lindquist and Sobel write:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.38.34-AM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.38.34-AM.png" alt="" title="Screen shot 2012-01-23 at 8.38.34 AM" width="447" height="147" class="alignnone size-full wp-image-14236" /></a></p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.39.15-AM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-23-at-8.39.15-AM.png" alt="" title="Screen shot 2012-01-23 at 8.39.15 AM" width="448" height="259" class="alignnone size-full wp-image-14237" /></a></p>
<p>None of this seems clear and simple to me!  Speaking of clear and simple, I&#8217;m reminded of a scene, several decades ago, when a bunch of us on the county math team won some competition, and the prize was that we each got to choose one of several math books.  One of the books was called Elementary Linear Algebra, and I remember making a disdainful remark to my friend that I didn&#8217;t want something elementary.  My friend replied, &#8220;Linear algebra is not elementary.&#8221;  Good point.</p>
<p>Which brings back another memory:  our coach for the Mathematical Olympiad program was an unbelievably grumpy old man.  At one point he interrupted one of his lectures to rant about how all the calculus books now are wasting their space with applications.  At some point, he said, they&#8217;re gonna come up with a book called Applied Calculus with Applications.  That all seemed natural to me at the time but in retrospect I&#8217;m amazed by how brainwashed we all were.  There was one kid there who I recall was interested in engineering problems rather than number theory etc., but that was an unusual preference.  (I just <a href="http://www.code.ucsd.edu/zeger/">looked him up</a> and, amazingly, he grew up to be an engineering researcher!)  The other thing I remember about the grumpy coach dude, besides his personality (which, in retrospect, was perhaps necessary to keep a bunch of 15-year-old boys in line; even nerds can make trouble), was that he thought it was cheating to use calculus or analytic geometry.  His favorite sorts of problems used elaborate arguments from classical geometry and he always felt we should be able to solve these without resorting to technical means.</p>
<p>As I&#8217;ve remarked more than once in this space, I feel lucky in retrospect to have been pretty unprepared for the Olympiad program, with the result that I didn&#8217;t do very well there, gradually lost interest in this sort of competitive event, and decided I didn&#8217;t want to be a pure mathematician.  I think it must&#8217;ve been really hard on the kids who were top performers but didn&#8217;t happen to be Noam.  It was easier for those of us in the bottom half of the group.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/fight/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Advice on do-it-yourself stats education?</title>
		<link>http://andrewgelman.com/2012/01/advice-on-do-it-yourself-stats-education/</link>
		<comments>http://andrewgelman.com/2012/01/advice-on-do-it-yourself-stats-education/#comments</comments>
		<pubDate>Sun, 22 Jan 2012 15:19:52 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Teaching]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14218</guid>
		<description><![CDATA[Dustin Palmer writes: I am a recent graduate looking for a bit of advice. While I took intro classes on math and statistics in my undergraduate degree as a political science major, I find myself university-less and seeking to develop my statistics toolkit. I work for an NGO in the international development field. I think [...]]]></description>
			<content:encoded><![CDATA[<p>Dustin Palmer writes:</p>
<blockquote><p>I am a recent graduate looking for a bit of advice. While I took intro classes on math and statistics in my undergraduate degree as a political science major, I find myself university-less and seeking to develop my statistics toolkit.</p>
<p>I work for an NGO in the international development field. I think that a solid statistics foundation would offer me not only more career opportunities, but more importantly, a deeper and more nuanced understanding of the processes and problems that interest me. I&#8217;m talking about field experiments and practical quantitative and qualitative data analysis.</p>
<p>I have plenty of free time, ambition, and enthusiasm to improve this part of my toolbox, but I lack an attachment to an institution and much in the way of financial resources. How would you go about making a concentrated effort at acquiring an understanding of the field and its actual application in something like R or Stata, which I admit to never having used? </p>
<p>Perhaps I am simply asking about web resources or best texts, but any broad advice would be much appreciated too.</p></blockquote>
<p>My gut recommendation is to start with a problem you care about and figure out what you need to get a reasonable solution, then go to the next problem, and so forth.  For books, you could start with The Statistical Sleuth and my book with Jennifer.  If you want to learn R, just try to make some pretty and useful graphs; that will motivate you to do more.</p>
<p>Any other suggestions?</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/advice-on-do-it-yourself-stats-education/feed/</wfw:commentRss>
		<slash:comments>49</slash:comments>
		</item>
		<item>
		<title>Lessons learned from a recent R package submission</title>
		<link>http://andrewgelman.com/2012/01/lessons-learned-from-a-recent-r-package-submission/</link>
		<comments>http://andrewgelman.com/2012/01/lessons-learned-from-a-recent-r-package-submission/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 22:19:58 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Sociology]]></category>
		<category><![CDATA[Statistical computing]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14212</guid>
		<description><![CDATA[R has zillions of packages, and people are submitting new ones each day. The volunteers who keep R going are doing an incredibly useful service to the profession, and they&#8217;re busy. A colleague sends in some suggestions based on a recent experience with a package update: 1. Always use the R dev version to write [...]]]></description>
			<content:encoded><![CDATA[<p>R has zillions of packages, and people are submitting <a href="http://dirk.eddelbuettel.com/blog/2011/11/27/">new ones each day</a>.  The volunteers who keep R going are doing an incredibly useful service to the profession, and they&#8217;re <a href="http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf">busy</a>.</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-21-at-6.17.16-PM.png"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/Screen-shot-2012-01-21-at-6.17.16-PM.png" alt="" title="Screen shot 2012-01-21 at 6.17.16 PM" width="383" height="377" class="alignnone size-full wp-image-14214" /></a></p>
<p>A colleague sends in some suggestions based on a recent experience with a package update:</p>
<blockquote><p><strong>1. Always use the R dev version to write a package.  Not the current stable release.</strong> The R people use the R dev version to check your package anyway.  If you don&#8217;t use the R dev version, there is a chance that your package won&#8217;t pass the check.  In my own experience, every time R has a major change, it tends to bring new standards and to find new errors in your package under those standards.  So it is better to use the dev version to find potential errors in advance.</p>
<p><strong>2. After submission, write an email to claim it.</strong>  I used to submit packages to CRAN without writing an email.  This was standard operating procedure, but it has changed. Writing an email to claim the submission is now a requirement.  There is a good reason.  The R team is afraid that the package was not submitted by a legitimate developer.  So there is a security issue involved here.  Write an email to remind them that you submitted a package, not a virus.</p>
<p><strong>3. The R people are busy.</strong> The number of R packages submitted to CRAN is growing exponentially.  So the R people&#8217;s workloads are heavy.  We should understand their situation and try to work with them to solve the package issues when problems come up.</p>
<p>The first two lessons are the most important.  If you have done the first two, I believe you won&#8217;t need the third one.</p></blockquote>
<p>I&#8217;ve never actually written an R package myself&#8212;my last experience with this sort of thing was several years ago, using dyn.load and dyn.load2 in S&#8212;but I&#8217;ve used many R packages and I&#8217;ve contributed to several widely-used R packages.  So I really appreciate the effort put in by the central R people, and I&#8217;m posting this note as a way to make their lives easier and also help the people who are writing and updating R packages.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/lessons-learned-from-a-recent-r-package-submission/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>A counterfeit data graphic</title>
		<link>http://andrewgelman.com/2012/01/a-counterfeit-data-graphic/</link>
		<comments>http://andrewgelman.com/2012/01/a-counterfeit-data-graphic/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 20:18:47 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Statistical graphics]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14210</guid>
		<description><![CDATA[Kaiser Fung discusses. It&#8217;s a good sign when statistical graphics are so popular that people feel the need to fake them!]]></description>
			<content:encoded><![CDATA[<p>Kaiser Fung <a href="http://junkcharts.typepad.com/junk_charts/2012/01/a-counterfeit-data-graphic.html">discusses</a>.  It&#8217;s a good sign when statistical graphics are so popular that people feel the need to fake them!</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/a-counterfeit-data-graphic/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Judea Pearl on why he is “only a half-Bayesian”</title>
		<link>http://andrewgelman.com/2012/01/judea-pearl-on-why-he-is-only-a-half-bayesian/</link>
		<comments>http://andrewgelman.com/2012/01/judea-pearl-on-why-he-is-only-a-half-bayesian/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 17:32:38 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Causal Inference]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=13237</guid>
		<description><![CDATA[In an article published in 2001, Pearl wrote: I [Pearl] turned Bayesian in 1971, as soon as I began reading Savage’s monograph The Foundations of Statistical Inference [Savage, 1962]. The arguments were unassailable: (i) It is plain silly to ignore what we know, (ii) It is natural and useful to cast what we know in [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf">an article</a> published in 2001, Pearl wrote:</p>
<blockquote><p>I [Pearl] turned Bayesian in 1971, as soon as I began reading Savage’s monograph The Foundations of Statistical Inference [Savage, 1962]. The arguments were unassailable: (i) It is plain silly to ignore what we know, (ii) It is natural and useful to cast what we know in the language of probabilities, and (iii) If our subjective probabilities are erroneous, their impact will get washed out in due time, as the number of observations increases.</p>
<p>Thirty years later, I [Pearl] am still a devout Bayesian in the sense of (i), but I now doubt the wisdom of (ii) and I know that, in general, (iii) is false.</p></blockquote>
<p>He elaborates:</p>
<blockquote><p>The bulk of human knowledge is organized around causal, not probabilistic relationships, and the grammar of probability calculus is insufficient for capturing those relationships. Specifically, the building blocks of our scientific and everyday knowledge are elementary facts such as “mud does not cause rain” and “symptoms do not cause disease” and those facts, strangely enough, cannot be expressed in the vocabulary of probability calculus. It is for this reason that I consider myself only a half-Bayesian.</p></blockquote>
<p>Interesting.  The Neyman-Rubin framework of potential outcomes does allow for causal reasoning within a probabilistic structure, but indeed it does not allow for statements such as &#8220;mud does not cause rain.&#8221;  In the potential outcomes notation, one could define a random variable y=1 for rain or 0 for no rain, and define y^1 to be the outcome under treatment and y^2 to be the outcome under control.  But it would not make sense for &#8220;mud&#8221; to be a treatment:  in the potential-outcomes framework, a treatment is something that you do, not something such as &#8220;mud&#8221; that you observe.</p>
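<p>As a rough illustration of the notation in the paragraph above, here is a small Python sketch with made-up probabilities (the names bernoulli, simulate, p_treat, and p_control are hypothetical and are not taken from Pearl or from any particular paper). Each unit has a y^1 and a y^2, the treatment is assigned by the experimenter, and only one of the two potential outcomes is ever observed.</p>

```python
import random

random.seed(1)

def bernoulli(p):
    """Return 1 with probability p, else 0."""
    return 1 if p > random.random() else 0

def simulate(n=10000, p_treat=0.5, p_control=0.3):
    """Hypothetical experiment; the probabilities are made-up illustrative values."""
    units = []
    for _ in range(n):
        y1 = bernoulli(p_treat)        # potential outcome under treatment (y^1)
        y2 = bernoulli(p_control)      # potential outcome under control (y^2)
        treated = bernoulli(0.5)       # assignment is something the experimenter does
        y_obs = y1 if treated else y2  # only one potential outcome is observed
        units.append((y1, y2, treated, y_obs))
    return units

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

units = simulate()
# Average unit-level effect, computable here only because both outcomes are simulated.
true_effect = mean(y1 - y2 for y1, y2, _, _ in units)
# What an analyst could compute from the observed data alone.
estimate = (mean(y for _, _, t, y in units if t)
            - mean(y for _, _, t, y in units if not t))
print(f"average of y1 - y2 over units: {true_effect:.3f}")
print(f"difference in observed means:  {estimate:.3f}")
```

<p>The sketch works because the code itself assigns the treatment; there is no analogous line one could write for a variable that is merely observed.</p>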
<p>I&#8217;m not saying here that Pearl&#8217;s framework is a good or bad idea; my point here is that I&#8217;m agreeing that he indeed seems to be asking questions that cannot be addressed by probability models.</p>
<p>Some of my earlier discussions with Pearl are <a href="http://andrewgelman.com/2009/07/pearls_and_gelm/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/judea-pearl-on-why-he-is-only-a-half-bayesian/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Stan: A (Bayesian) Directed Graphical Model Compiler</title>
		<link>http://andrewgelman.com/2012/01/stan-a-bayesian-directed-graphical-model-compiler/</link>
		<comments>http://andrewgelman.com/2012/01/stan-a-bayesian-directed-graphical-model-compiler/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 02:23:37 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Statistical computing]]></category>

		<guid isPermaLink="false">http://andrewgelman.com/?p=14199</guid>
		<description><![CDATA[Here&#8217;s Bob&#8217;s talk from the NYC machine learning meetup. And here&#8217;s Stan himself:]]></description>
			<content:encoded><![CDATA[<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/stan-meetup.pdf">Here&#8217;s Bob&#8217;s talk</a> from the NYC machine learning <a href="http://www.meetup.com/NYC-Machine-Learning/events/46533822/">meetup</a>.  And here&#8217;s Stan himself:</p>
<p><a href="http://andrewgelman.com/wp-content/uploads/2012/01/STAN_ULAM_HOLDING_THE_FERMIAC.jpg"><img src="http://andrewgelman.com/wp-content/uploads/2012/01/STAN_ULAM_HOLDING_THE_FERMIAC-262x300.jpg" alt="" title="STAN_ULAM_HOLDING_THE_FERMIAC" width="262" height="300" class="alignnone size-medium wp-image-14200" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://andrewgelman.com/2012/01/stan-a-bayesian-directed-graphical-model-compiler/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

