<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Statistical Modeling, Causal Inference, and Social Science</title>
	<atom:link href="https://statmodeling.stat.columbia.edu/feed/" rel="self" type="application/rss+xml" />
	<link>https://statmodeling.stat.columbia.edu</link>
	<description></description>
	<lastBuildDate>Mon, 08 Jun 2026 18:52:26 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.4.3</generator>
	<item>
		<title>Stein&#8217;s method, learning and inference -or- how to really monitor convergence and thin chains</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/08/steins-method-learning-and-inference-or-how-to-really-monitor-convergence-and-thin-chains/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/08/steins-method-learning-and-inference-or-how-to-really-monitor-convergence-and-thin-chains/#respond</comments>
		
		<dc:creator><![CDATA[Bob Carpenter]]></dc:creator>
		<pubDate>Mon, 08 Jun 2026 19:00:21 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Statistical Computing]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53833</guid>

					<description><![CDATA[This post is from Bob. I&#8217;ve been thinking a lot about scores (gradients of the log density function) and how they can be used for convergence monitoring. We know that the expected value of the score is zero. Stein generalized &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/08/steins-method-learning-and-inference-or-how-to-really-monitor-convergence-and-thin-chains/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><b>This post is from Bob.</b></p>
<p>I&#8217;ve been thinking a lot about scores (gradients of the log density function) and how they can be used for convergence monitoring.  We know that the expected value of the score is zero.  Stein generalized this with Stein operators.  In the monomial case, the Stein operators give you functions in increasing degrees, all of which have zero expectation in the posterior.  Here theta is the variable being sampled and S is the score function, so that S(theta) is the gradient of the target log density evaluated at theta.</p>
<p>&nbsp; &nbsp; Order 0: S(theta)</p>
<p>&nbsp; &nbsp; Order 1: 1 + theta .* S(theta)</p>
<p>&nbsp; &nbsp; Order 2: 2 * theta + theta^2 .* S(theta)</p>
<p>This leads to a natural test for convergence of first, second, and third moments.  Just compute Monte Carlo estimates of these quantities and see if they&#8217;re close to zero.  We&#8217;d want to standardize for standard deviation to make the result scale-free like R-hat.  To develop some intuitions, in a standard normal distribution p(theta) = normal(theta | 0, I), we have S(theta) = -theta, and thus the sample mean of S(theta) converges to zero at the same rate as the sample mean of theta converges to its true value; the order 1 test is 1 &#8211; theta^2, which we know has expectation zero because theta^2 has a ChiSquared(1) distribution with expectation of 1.  The order 1 case corresponds to equipartition in physics and the form D + theta&#8217; * S(theta) also naturally has zero expectation as shown in the virial theorem in physics in the 1870s.</p>
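<p>Here is a minimal sketch of what such a check could look like in Python with NumPy.  The function name <code>stein_monomial_zscores</code> is made up for illustration, and the standardization (sample mean divided by a naive Monte Carlo standard error) is just one assumed way to make the result scale-free.  It treats the draws as independent; for MCMC output you would want to replace the number of draws with an effective sample size.</p>

```python
import numpy as np


def stein_monomial_zscores(theta, score):
    """Z-scores for the order 0, 1, and 2 monomial Stein tests, per dimension.

    theta: array of shape (n_draws, n_dims) holding the draws.
    score: function returning the gradient of the target log density,
           evaluated elementwise at theta (same shape as theta).
    Assumes independent draws; the standard error is naive otherwise.
    """
    theta = np.asarray(theta, dtype=float)
    s = score(theta)
    tests = {
        0: s,                               # Order 0: S(theta)
        1: 1.0 + theta * s,                 # Order 1: 1 + theta .* S(theta)
        2: 2.0 * theta + theta**2 * s,      # Order 2: 2 * theta + theta^2 .* S(theta)
    }
    n = theta.shape[0]
    out = {}
    for order, vals in tests.items():
        mean = vals.mean(axis=0)
        se = vals.std(axis=0, ddof=1) / np.sqrt(n)
        out[order] = mean / se
    return out


# Target: standard normal, so S(theta) = -theta.
rng = np.random.default_rng(0)
good = rng.standard_normal((10000, 3))   # draws from the target
bad = good + 0.5                         # draws with a shifted mean

z_good = stein_monomial_zscores(good, lambda t: -t)
z_bad = stein_monomial_zscores(bad, lambda t: -t)
for order in (0, 1, 2):
    print(order, np.round(z_good[order], 2), np.round(z_bad[order], 2))
```

<p>With draws from the target, the z-scores should look like standard normal noise; with the shifted draws, the order 0 statistic should be far from zero.</p>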
<p>Diving into this a bit more led me back to Jackson Gorham and Lester Mackey&#8217;s work on Stein&#8217;s method.  They haven&#8217;t been sitting still since introducing the basic idea, which kernelizes the idea above.  Mackey et al. have produced an absolutely wonderful summary of this body of work in two forms.  The first is a dense, 41-slide deck with all the key definitions and results.  I&#8217;d suggest at least skimming this first.</p>
<blockquote><p>
Lester Mackey. April 2026.  <a href="https://lmackey.github.io/papers/gsd_ksd-slides.pdf">Stein&#8217;s Method, Learning, and Inference</a>.  GitHub.
</p></blockquote>
<p>Mackey along with Chris Oates and Qiang Liu, who have also worked heavily in this area, put together a definitive monograph.  They&#8217;ve presented a great deal of difficult material in a way that I can digest (though it&#8217;s going to be rough going if you&#8217;re not well versed in sampling and how MCMC is traditionally measured and evaluated).</p>
<blockquote><p>
Qiang Liu, Lester Mackey, Chris Oates.  March 2026. <a href="https://arxiv.org/abs/2603.07467">Probabilistic Inference and Learning with Stein&#8217;s Method</a>.  arXiv.
</p></blockquote>
<p>In particular, they go over Stein variational inference, which seems to me like it would be the ideal way to perform quasi Monte Carlo-like inference for statistical models if we could only get a robust version to scale.  The idea&#8217;s to initialize a bunch of points, then use optimization to minimize a kernelized Stein discrepancy of the empirical distribution of those points to the true distribution.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/08/steins-method-learning-and-inference-or-how-to-really-monitor-convergence-and-thin-chains/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Podcast coming on Bayesian workflow!  With a contest!</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/08/podcast-coming-on-bayesian-workflow-with-a-contest/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/08/podcast-coming-on-bayesian-workflow-with-a-contest/#respond</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Mon, 08 Jun 2026 13:26:12 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Teaching]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53832</guid>

					<description><![CDATA[Alexandre Andorra will be interviewing Aki, Richard, and me on the topic of our just-released book, Bayesian Workflow. And there&#8217;s a contest! Here&#8217;s Alexandre: 🥁One listener will get to bring their real-world workflow problem onto the recording for the three &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/08/podcast-coming-on-bayesian-workflow-with-a-contest/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Alexandre Andorra will be interviewing Aki, Richard, and me on the topic of our just-released book, <a href="https://statmodeling.stat.columbia.edu/2026/04/16/the-bayesian-workflow-book-is-coming/">Bayesian Workflow</a>.</p>
<p>And there&#8217;s a contest! <a href="https://www.linkedin.com/posts/bayesianstatistics-datascience-machinelearning-share-7465433819622502400-jhkw/"> Here&#8217;s Alexandre</a>:</p>
<blockquote><p><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f941.png" alt="🥁" class="wp-smiley" style="height: 1em; max-height: 1em;" />One listener will get to bring their real-world workflow problem onto the recording for the three of them to work through live!</p>
<p><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f39f.png" alt="🎟" class="wp-smiley" style="height: 1em; max-height: 1em;" /> How to enter:<br />
1. Like, comment or repost this announcement post so other practitioners see it<br />
2. Review the show on Apple Podcasts (https://lnkd.in/gbTagKAS) or Spotify (https://lnkd.in/guYMdMUv), and upload a screenshot here: https://lnkd.in/dPpuSKf5<br />
3. Submit your problem (100-200 words) on the form</p>
<p><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f381.png" alt="🎁" class="wp-smiley" style="height: 1em; max-height: 1em;" />  Prizes:<br />
1. Grand prize: your problem worked through live by Gelman, Vehtari, and McElreath during the recording + a signed copy of Bayesian Workflow<br />
2. The two runners-up will get signed copies<br />
3. Five LBS Patreon subscribers (random draw) will get ebook copies!</p>
<p>Contest closes June 10!<br />
If you&#8217;ve ever wanted to come on the show and ask Andrew, Aki or Richard &#8220;okay but what would you actually do here?&#8221; &#8212; this is your shot!</p></blockquote>
<p>And here&#8217;s <a href="https://docs.google.com/forms/d/e/1FAIpQLSda89M1VwVhLLDZhytAnEgZLI8jDIqG99qaYi6SS9gheINq9w/viewform">the Google form</a> for the contest. </p>
<p>I&#8217;ve been on Alexandre&#8217;s podcast twice before:</p>
<p>• <a href="http://Modeling the US Presidential Elections">Modeling the US Presidential Elections</a> (with Merlin Heidemanns), 1 Nov 2020</p>
<p>• <a href="https://learnbayesstats.com/episode/20-regression-and-other-stories-with-andrew-gelman-jennifer-hill-aki-vehtari">Regression and Other Stories</a> (with Jennifer Hill and Aki Vehtari), 30 Jul 2020</p>
<p>It was fun both times.  I assume this new one will be too.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/08/podcast-coming-on-bayesian-workflow-with-a-contest/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Scott Alexander as a modern-day Edmund Wilson</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/07/scott-alexander-as-a-modern-day-edmund-wilson/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/07/scott-alexander-as-a-modern-day-edmund-wilson/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sun, 07 Jun 2026 13:34:43 +0000</pubDate>
				<category><![CDATA[Literature]]></category>
		<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53212</guid>

					<description><![CDATA[Edmund Wilson was a mid-twentieth-century literary critic and all-around intellectual authority. He wrote for moderate-circulation magazines like the New Republic and the New Yorker, and he also wrote several influential books. I&#8217;m interested in him, and people like him, partly &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/07/scott-alexander-as-a-modern-day-edmund-wilson/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Edmund Wilson was a mid-twentieth-century literary critic and all-around intellectual authority.  He wrote for moderate-circulation magazines like the New Republic and the New Yorker, and he also wrote several influential books.  I&#8217;m interested in him, and people like him, partly because they&#8217;re the sort of expert we don&#8217;t see so much of anymore.  They were autodidacts who achieved intellectual and worldly success through their writing ability.  Other examples of that era are George Orwell, the so-called New York Intellectuals, and, a bit later, people like Susan Sontag, Tom Wolfe, Joan Didion, and Clive James.  What they had in common was an ability to write well, a willingness to write publicly on all sorts of topics not limited to whatever might have been their nominal expertise, and a cultural impact that reflected that they had interesting things to say and said these things well.  The interesting things they had to say were not always so innovative (yeah, Tom Wolfe hated lots of modern art; so do I, but I don&#8217;t write a whole book about it) but they were distinctive enough to stand out, and their essays and books were interesting to read.  Sometimes they took pretty goofy political stances and sometimes they were more mainstream; either way, they were worth reading.  One of my favorites from that mid-twentieth-century era is the now-mostly-forgotten Dwight Macdonald.</p>
<p>For a writer to be a public intellectual was a choice, and a choice that not every writer made.  Robert Penn Warren and John Updike wrote some public-intellectual sorts of things that were trifles compared to their fiction; T. S. Eliot wrote some essays but they weren&#8217;t so readable, perhaps no surprise given that his poetry is pretty tangled too!  Gore Vidal and Norman Mailer became more celebrated for their nonfiction than for their fiction, which may well have annoyed them.</p>
<p>But who are the public intellectuals of today?  Not the authors.  Lorrie Moore, Jonathan Franzen, etc., write the occasional essay but they don&#8217;t have the moral or intellectual authority of Wilson or Orwell or even a lesser light such as Clive James.  Paul Krugman and, formerly, Steven Levitt, have had some influence but they are coming from academia; their authority doesn&#8217;t come from their ability to put words together, in the manner of a Mencken, Didion, etc.  The literary critic Jackson Lears writes about politics for the London Review of Books, but he seems to me more of a well-connected bullshitter than anything else, and not readable enough to be worth reading for style alone, as was sometimes the case with Wolfe.</p>
<p>OK, so one thing is that literary fiction is much less important in our culture as an arts and entertainment medium.  But it&#8217;s not like filmmakers or the equivalent have picked up the slack.  Tarantino or Scorsese might make the occasional pronouncement, but they&#8217;re not all-purpose experts.  David Mamet might like to be an all-purpose expert, but he&#8217;s just a playwright and screenwriter with a side gig as a loud opinionator:  you see lots of people like that on the internet and, sure, Mamet and Joyce Carol Oates and J. K. Rowling and all the rest have every right to try to transfer their literary fame into political influence, but really they&#8217;re acting like celebrities here, not thinkers of the order of Mary McCarthy or, yes, Gore Vidal.</p>
<p>Another direction to look among modern writers would be political pundits like Josh Marshall or David Brooks, commentators who have reached wide influence through their ability to say interesting or provocative things (even if, in the case of Brooks, they sometimes get their facts wrong and don&#8217;t seem to care) in a direct and clear style.  But I&#8217;d say that Marshall and Brooks correspond not to Edmund Wilson but rather to other news pundits of the past, like Walter Lippmann or Joseph Alsop:  they weren&#8217;t promulgating theories in the manner of Wilson, Orwell, Sontag, Wolfe, etc., but rather reporting on ideas that are out there.  Which is valuable too, just not quite the same thing.</p>
<p>OK, my answer to the &#8220;Who&#8217;s the modern-day Edmund Wilson?&#8221; question is no secret&#8211;it&#8217;s in the title of this post.  Scott Alexander is, like Wilson and those others, a prolific writer with strong political opinions, who feels comfortable opining about&#8211;and gets respect for his opinions on&#8211;all sorts of things.  Like Wilson, he&#8217;s an autodidact and a bit of an outsider.  Alexander&#8217;s topics are different from Wilson&#8217;s&#8211;more on science, less on literature&#8211;but he has the same feeling of well-earned authority.  And a fun writing style.  Bloggy rather than magaziney&#8211;times have changed&#8211;but a distinctive and readable style in any case.</p>
<p>Alexander and Wilson also share an interesting balance between political extremism and curmudgeonly centrism.  Wilson was heavily influenced by communism&#8211;no surprise for a New York writer in those days&#8211;and you see in a lot of his writing a pushing back and forth among different ideologies.  He was cranky but no ideologue.  Alexander has been heavily influenced by far-right internet writings&#8211;no surprise for a tech-adjacent blogger in the past decade or so&#8211;but doesn&#8217;t strongly identify with the far right (indeed, he unequivocally recommended voting against Donald Trump); you can see the influence still there even as it battles with others.  Just as Wilson was always writing under the auspices of communism, as it were, even&#8211;indeed, especially&#8211;when he was arguing against it, similarly, Alexander is always writing under the auspices of the far right, operating on that playing field, as it were, even when he has his differences.</p>
<p>Wilson and Alexander are people with eccentric personas (I don&#8217;t mean &#8220;eccentric&#8221; in a bad way here), and I think this is to the benefit of their writing, even if sometimes it leads them into places they might not have intended to go.</p>
<p>I don&#8217;t want to make too much of this.  Neither Wilson nor Alexander is a unique literary figure; as noted above, Wilson was part of a long tradition of writers who were public intellectuals, and Alexander&#8217;s not the only blogger out there.  Other influential bloggers include Josh Marshall (mentioned above), Tyler Cowen and Alex Tabarrok, and others.  Actually not so many others anymore.  Now we&#8217;re seeing another kind of thought leader, which is people who are loud on twitter, and I guess they&#8217;re important too, but I think of their role more as spreading ideas, not in the mode of writers like Orwell or Didion who express new thoughts and for whom the way that the thought is expressed is as important as its content.</p>
<p><strong>P.S.</strong>  Another way that Scott Alexander is a public intellectual is that people care about his positions; he has some intellectual and moral authority. You can see this in the comments:  some people like Alexander, some people don&#8217;t, but there&#8217;s a sense that he&#8217;s an important figure and that his opinions matter.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/07/scott-alexander-as-a-modern-day-edmund-wilson/feed/</wfw:commentRss>
			<slash:comments>69</slash:comments>
		
		
			</item>
		<item>
		<title>When is detecting AI-generated text worthwhile?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/06/when-is-detecting-ai-generated-text-worthwhile/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/06/when-is-detecting-ai-generated-text-worthwhile/#comments</comments>
		
		<dc:creator><![CDATA[Jessica Hullman]]></dc:creator>
		<pubDate>Sat, 06 Jun 2026 20:04:03 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53829</guid>

					<description><![CDATA[This is Jessica. AI-text detectors are coming to play a bigger role in adjudicating what texts are worthy of our attention. There was the surprising case of an apparently AI-generated short story winning the Commonwealth Foundation Short Story Prize, which &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/06/when-is-detecting-ai-generated-text-worthwhile/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><span style="font-weight: 400">This is Jessica. AI-text detectors are coming to play a bigger role in adjudicating what texts are worthy of our attention. There was the surprising case of an apparently AI-generated </span><a href="https://granta.com/the-serpent-in-the-grove/"><span style="font-weight: 400">short story</span></a><span style="font-weight: 400"> winning the </span><a href="https://commonwealthfoundation.com/short-story-prize/"><span style="font-weight: 400">Commonwealth Foundation Short Story Prize</span></a><span style="font-weight: 400">; Pangram, the leading detector, flags the story as 100% AI generated. Pangram&#8217;s false positive rate is reported as roughly 1 in 10,000 in its own audits and near zero on medium-to-long passages in an </span><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5407424"><span style="font-weight: 400">external audit</span></a><span style="font-weight: 400">. Applying Pangram to the other 4 stories that won awards this year suggests two others were heavily AI-assisted. More recently, the NeurIPS Position Paper track </span><a href="https://blog.neurips.cc/2026/06/02/ai-generated-papers-in-the-neurips-2026-position-paper-track/"><span style="font-weight: 400">announced</span></a><span style="font-weight: 400"> that it was desk rejecting the 18% of submitted papers that Pangram detected as fully AI-generated. For another 13%, the track is following up with the authors to investigate AI use. In this case the <a href="https://neurips.cc/Conferences/2026/CallForPositionPapers">Call for Papers</a> made clear that submissions should be &#8220;substantially written by human authors,&#8221; so this should not have come as a surprise.</span></p>
<p><span style="font-weight: 400">We’re having to reconsider what authorship means. Can a person create literature or express their position on a subject without writing a single sentence themselves? When do we really care who strung the words together?   </span></p>
<p><span style="font-weight: 400">Some people think detection is a waste of our collective time because we will never reach an equilibrium. AI-generated text will keep shifting toward what passes the detector. Human writers will continually update their beliefs about what features are indicative of AI writing, but will also be influenced to write more like AI by reading so much AI text. There’s no stable target, just an endless cat-and-mouse game that incentivizes being savvy enough at any given time to avoid getting flagged. Meanwhile people are being morally scorned and suffering reputational damage for being caught on the wrong side of things. This may disproportionately affect some writers (like non-native English speakers) who are finally seeing the playing field leveled a bit. </span></p>
<p><span style="font-weight: 400">On the other hand, there are situations where it really is important to know who strung the words together. Education is the most obvious one. It’s just very hard to teach someone to think if they’re not writing down their ideas themselves. </span></p>
<p><span style="font-weight: 400">The problem is that outside of select scenarios like teaching, what we really tend to care about is who controlled the ideas, and this is not equivalent to who strung the words together. Some would argue that the latter is becoming increasingly irrelevant given that AI can write more fluently than many people and many people prefer AI-generated text. </span></p>
<p><span style="font-weight: 400">Of course the reason we’re seeing detection used to filter paper submissions is because the ideal process–where the content of each paper is carefully considered on its own merits–is increasingly untenable given the huge surge in submissions in some fields. It’s easy to pump out credible-seeming papers with minimal human oversight using AI, and enough people are doing this to create serious problems. </span></p>
<p><span style="font-weight: 400">Mostly my response is that if we are going to debate the value of detection we should be willing to make our assumptions explicit. So let’s walk through a toy model to think about what we’re really conjecturing about.</span></p>
<p><span style="font-weight: 400">One way to think of the latent state that we actually care about in paper review is the author type. Let’s say type A authors come up with their ideas and do a lot of the writing themselves. Type B authors rely on AI to do much of the thinking for them, and also use AI to do much of the writing. Type C authors come up with their own ideas, but engage in extensive prompting to get AI to write everything they want to say for them.*</span></p>
<p><span style="font-weight: 400">For each paper, we choose to either pass or reject, conditional on the output of a Pangram check. Let’s say we only care about whether it flags 100% AI generated or not, so the signal s is binary, where s=1 means AI detected.</span></p>
<p><span style="font-weight: 400">Based on available Pangram audits, if a text is actually written heavily by AI there is a very high chance it flags as AI-generated: beta=P(s=1|AI written) with beta very close to 1. If a text is not written by AI, there is a very small chance it flags as AI-generated: alpha=P(s=1|human written). Pangram’s internal audits put alpha around 10^−4 but other audits find essentially zero false positives for medium-to-long passages. </span></p>
<p><span style="font-weight: 400">So P(s=1|A)=alpha, and if we assume Types B and C use AI to a similar extent for the writing, then beta=P(s=1|B) = P(s=1|C). The posterior probability that a flagged paper is from a Type B author is then:</span></p>
<p><span style="font-weight: 400">P(B|s=1) = (beta × p_B)/(alpha × p_A + beta × p_B + beta × p_C), and since alpha is tiny and beta is close to 1, P(B|s=1) ≈ p_B/(p_B + p_C)</span></p>
<p><span style="font-weight: 400">The relevant considerations become what we think the author population looks like, and how costly we think a false positive versus a false negative are. </span></p>
<p><span style="font-weight: 400">As a starting point, let’s say that for our conference submissions this year, Type C is the rarest, at 20%, and Type A and Type B equally split the remaining mass at 40% each. Let’s also say that we consider the cost of rejecting an acceptable paper, c_FP, to be twice the cost of passing an unacceptable one, c_FN. </span></p>
<p><span style="font-weight: 400">The optimal decision rule is to reject if c_FN * P(B|s=1) &gt; c_FP * P(A or C|s=1), or equivalently, since P(A or C|s=1) = 1 - P(B|s=1), if P(B|s=1) &gt; c_FP/(c_FN + c_FP).</span></p>
<p><span style="font-weight: 400">With c_FP=2 and c_FN=1, this means we reject if P(B|s=1) &gt; 2/3.</span></p>
<p><span style="font-weight: 400">Under the prevalence assumptions above, P(B|s=1) is approximately 2/3, so we are right on the boundary. From the standpoint of making the right decisions for this particular conference cycle, it’s not obviously bad. But if Type C is a little more common, e.g., if we shift a little mass from p_A to p_C to make p_C 0.25, then P(B|s=1) is about 0.62, and we shouldn’t desk reject based only on the flag. Similarly, if we were to decide that falsely rejecting an acceptable paper is three times as bad as passing an unacceptable one, the threshold rises to 3/4 and we shouldn’t rely on the flag alone. </span></p>
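<p><span style="font-weight: 400">For anyone who wants to plug in their own numbers, here is a small sketch of the toy model in Python. The function names are made up for illustration, and beta=0.99 is an assumed stand-in for &#8220;very close to 1&#8221;; alpha=10^-4 is the figure from Pangram’s internal audits mentioned above.</span></p>

```python
def posterior_type_b(p_a, p_b, p_c, alpha=1e-4, beta=0.99):
    """P(B | s=1): probability that a flagged paper comes from a Type B author.

    alpha = P(s=1 | human written); beta = P(s=1 | AI written).
    beta=0.99 is an illustrative assumption, not an audited value.
    """
    flagged = alpha * p_a + beta * p_b + beta * p_c
    return beta * p_b / flagged


def reject_threshold(c_fp, c_fn):
    """Reject a flagged paper when P(B | s=1) exceeds c_FP / (c_FN + c_FP)."""
    return c_fp / (c_fn + c_fp)


def should_reject(posterior, c_fp, c_fn):
    return posterior > reject_threshold(c_fp, c_fn)


# Baseline: Type A 40%, Type B 40%, Type C 20%.
base = posterior_type_b(0.40, 0.40, 0.20)
# Shift a little mass from Type A to Type C.
shifted = posterior_type_b(0.35, 0.40, 0.25)

print("baseline posterior:", round(base, 3))
print("shifted posterior: ", round(shifted, 3))
print("threshold, c_FP=2: ", round(reject_threshold(2, 1), 3))
print("threshold, c_FP=3: ", round(reject_threshold(3, 1), 3))
print("reject in shifted case?", should_reject(shifted, 2, 1))
```

<p><span style="font-weight: 400">Because alpha is so small, the posterior is driven almost entirely by the ratio p_B/(p_B + p_C), so the answer hinges on the assumed mix of Type B and Type C authors and on the cost ratio rather than on the detector’s error rates.</span></p>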
<p><span style="font-weight: 400">This model is obviously very simple. But it shows us what kinds of things we have to make assumptions about in the most basic case. Obviously I don’t really know how many people are using AI blindly to write papers, nor how many people are relying heavily on AI to write up their own ideas. You should take my numbers with a grain of salt. Personally I can’t imagine how relying on AI to do all the writing when I came up with the ideas would ever feel efficient, because I tend to have strong opinions on how things are said. But I can accept I am probably more of a control freak than many others. And AI overreliance is easy to slip into. Maybe papers chairs from recent ML conferences (or arXiv moderators) have estimates on bad-actor rates based on what they are seeing. </span></p>
<p><span style="font-weight: 400">What this exercise can&#8217;t tell us is how scientific progress is impacted by the warping of incentives that can happen when we use AI-detection as a filter. Classic principal-agent problems suggest that when we care about </span><span style="font-weight: 400">something hard to observe—like scientific quality or long-term epistemic value—but must rely on observable proxy signals to judge authors’ outputs, we should expect authors to shift more effort toward improving exclusively on those proxies. Avoiding m-dashes and ‘not this, but this’ constructions and whatever else currently ups the posterior probability of AI-generation is orthogonal to the actual thinking that research requires. What if relying more heavily on AI to write up our ideas is a good idea for science in the long run, in terms of more clearly communicating the ideas or saving a lot of time, so that we can get more good ideas out in the same amount of time? Then too much emphasis on detection might slow us down. However, I’m doubtful we are currently anywhere near a state of the world where discouraging writing with AI is as costly for scientific progress as spending time reviewing and reading many more questionable AI-generated papers is. The bigger threat at the moment is the slop overwhelming our ability to find the good stuff.</span></p>
<p><span style="font-weight: 400">*We could also posit Type D authors that get AI to generate the ideas, but then write the papers themselves to evade detection, or are extremely good at getting AI-written text to evade detection. But this seems much less likely so I’m ignoring it.</span></p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/06/when-is-detecting-ai-generated-text-worthwhile/feed/</wfw:commentRss>
			<slash:comments>35</slash:comments>
		
		
			</item>
		<item>
		<title>What is the relation between interactions in a regression model and correlations among the predictors?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/06/what-is-the-relation-between-interactions-in-a-regression-model-and-correlations-among-the-predictors/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/06/what-is-the-relation-between-interactions-in-a-regression-model-and-correlations-among-the-predictors/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sat, 06 Jun 2026 13:31:36 +0000</pubDate>
				<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Multilevel Modeling]]></category>
		<category><![CDATA[Sociology]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53208</guid>

					<description><![CDATA[I&#8217;ve often seen confusion between interactions in a regression model and correlations among the predictors. To keep it simple, consider the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error, and assume the predictors have been signed &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/06/what-is-the-relation-between-interactions-in-a-regression-model-and-correlations-among-the-predictors/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I&#8217;ve often seen confusion between <em>interactions in a regression model</em> and <em>correlations among the predictors</em>.  To keep it simple, consider the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error, and assume the predictors have been signed so that both b1 and b2 are positive.  Then b3 represents the interaction.  This has nothing to do with the joint distribution of x1 and x2 in the data, or in the population.  (For simplicity, assume the data to which the model is being fit are a random sample from the population of interest.)</p>
<p>The <em>interaction</em> depends on the model of y given x1 and x2, while the <em>correlation</em> depends on the model for x1 and x2.  These are two completely different parts of the model.  And yet, they often seem connected.</p>
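<p>To make the separation concrete, here is a small simulation sketch in Python with NumPy.  The coefficient values, sample size, and the function name <code>fit_interaction</code> are all made up for illustration.  It generates x1 and x2 with a chosen correlation, generates y from the model above with a fixed b3, and fits the regression by least squares.</p>

```python
import numpy as np


def fit_interaction(rho, n=20000, b=(1.0, 2.0, 1.5, 0.5), seed=0):
    """Simulate y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error and refit it.

    rho sets the population correlation between x1 and x2.
    The coefficients in b are arbitrary illustrative values.
    Returns the least-squares coefficients and the sample correlation.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    x = rng.multivariate_normal(np.zeros(2), cov, size=n)
    x1, x2 = x[:, 0], x[:, 1]
    y = b[0] + b[1] * x1 + b[2] * x2 + b[3] * x1 * x2 + rng.standard_normal(n)
    X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sample_corr = np.corrcoef(x1, x2)[0, 1]
    return coef, sample_corr


# Same interaction b3 = 0.5, two very different predictor correlations.
coef_uncorr, corr_uncorr = fit_interaction(rho=0.0)
coef_corr, corr_corr = fit_interaction(rho=0.8)
print("corr(x1, x2):", round(corr_uncorr, 2), " estimated b3:", round(coef_uncorr[3], 2))
print("corr(x1, x2):", round(corr_corr, 2), " estimated b3:", round(coef_corr[3], 2))
```

<p>In this setup the estimated b3 should land near the same value whether or not the predictors are correlated.  That only illustrates that the two pieces are logically separate; it says nothing about whether they tend to go together in real problems.</p>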
<p>I have the general impression that I&#8217;d be more likely to expect a positive interaction of x1 and x2 when predicting y, if x1 and x2 are positively correlated in the population.</p>
<p>For example, when predicting income from height and sex, being taller and being male both predict higher income; they also interact&#8211;the coefficient for height is higher for men than for women&#8211;and of course the two predictors, height and male, are positively correlated in the population.</p>
<p>I&#8217;m not sure how to think about this connection or even whether it&#8217;s a real pattern!  But there might be something there so I wanted to share it with you.</p>
<p>The issue of interactions comes up in the context of the concept of intersectionality, which is a form of interaction that comes up in sociology.  It started for me with this email from Elin Waring:</p>
<blockquote><p>I&#8217;ve been working on data on intersectionality and retention of students in STEM majors. My little group is specifically looking at data from Lehman College and trying to model graduation with a STEM degree.  There are a lot of details, but basically we have come to the conclusion that the right way to describe this is with a discrete time competing risk model (the competing risks being graduation with a STEM degree and graduation with a non-STEM degree).  I won&#8217;t go into all the details. We have data for between 1 and 20 semesters enrolled for students starting as freshmen. For us, intersectional identity is defined by 5 variables that yield 32 distinct combinations or strata as used in the next articles. </p>
<p>In trying to think about how to account for intersectional identities we came across the &#8220;MAIHDA Method.&#8221;  I was wondering if you had seen this discussion before or have any thoughts about it.</p>
<blockquote><p>Evans, Clare R., George Leckie, and Juan Merlo. 2020. “Multilevel versus Single-Level Regression for the Analysis of Multilevel Information: The Case of Quantitative Intersectional Analysis.” Social Science &#038; Medicine (1982) 245:112499. doi:10.1016/j.socscimed.2019.112499.</p></blockquote>
<p>They essentially argue for treating the strata as random effects in a multilevel model, with the individual components of the combinations introduced as fixed effects describing the combinations.</p>
<p>The next article criticizes that approach and argues for fixed effects all around. </p>
<blockquote><p>Wilkes, Rima, and Aryan Karimi. 2024. “What Does the MAIHDA Method Explain?” Social Science &#038; Medicine 345:116495. doi:10.1016/j.socscimed.2023.116495.</p></blockquote>
<p>Responded to here:</p>
<blockquote><p>Evans, Clare R., Luisa N. Borrell, Andrew Bell, Daniel Holman, S. V. Subramanian, and George Leckie. 2024. “Clarifications on the Intersectional MAIHDA Approach: A Conceptual Guide and Response to Wilkes and Karimi (2024).” Social Science &#038; Medicine 350:116898. doi:10.1016/j.socscimed.2024.116898.</p></blockquote>
<p>I was wondering if you have any thoughts about this? For me, intersectionality as a theoretical approach does mean that it makes sense to look at the strata rather than thinking of the strata as just the most complex level of creating statistical models of the intersection of the variables. But then it seems as though treating this as a random effect more or less undermines its centrality to the theory.   And is treating both the strata and the individual characteristics as variables at the same level basically a way to decompose? </p>
<p>In the end, I feel like the pro-MAIHDA people retreat to &#8220;we are just descriptive&#8221; in a way that isn&#8217;t very helpful. That said, they are right that this seems to have some traction in the world of health disparity research.</p></blockquote>
<p>I replied that I&#8217;d never heard of this method before.  I couldn&#8217;t actually muster the energy to read the above articles, as all this debate seems to be missing the key issues.  I don&#8217;t really care if something is called a fixed effect or a random effect (<a href="https://statmodeling.stat.columbia.edu/2005/01/25/why_i_dont_use/">see here</a>); my current preferred way of thinking of these problems is by framing as a generative model.</p>
<p>Regarding intersectionality, the natural way I would see it is that this would show up as an interaction term, the idea that the interaction is more than the sum of its parts?  For a simple example, if there are 5 binary variables and each has the same effect on its own (which they wouldn&#8217;t, this is just a simple hypothetical example), then you could create a variable which is the total number of identities, thus a number from 0 to 5, and &#8220;intersectionality&#8221; would show up as a super-linear or convex relation between the outcome and this total predictor?</p>
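<p>As a rough sketch of that hypothetical, assuming simulated data and made-up coefficients (nothing here comes from the Lehman College data), one could compute the mean outcome at each total from 0 to 5 and look at the second differences.  The helper name <code>second_differences</code> is an assumption for this example.  If the second differences are all positive, the relation is convex in the total.</p>

```python
import numpy as np


def second_differences(total, y):
    """Mean of y at each count 0..5, and the second differences of those means.
    Positive second differences indicate a convex (super-linear) relation."""
    means = np.array([y[total == k].mean() for k in range(6)])
    return means, np.diff(means, n=2)


rng = np.random.default_rng(1)
n = 50000
# Five hypothetical binary indicators, each "on" with probability 1/2.
ids = rng.integers(0, 2, size=(n, 5))
total = ids.sum(axis=1)
# Made-up outcome: a linear part plus a convex part in the total, plus noise.
y = 1.0 * total + 0.3 * total**2 + rng.normal(0.0, 1.0, size=n)

means, d2 = second_differences(total, y)
print("mean outcome by total:", np.round(means, 2))
print("second differences:", np.round(d2, 2))
```

<p>This only captures the simple count-based version of the idea; it cannot represent patterns that depend on which specific identities are combined.</p>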
<p>Waring responded:</p>
<blockquote><p>Sure, but the idea you suggested about intersectionality itself isn&#8217;t right.  You can&#8217;t just sum the number of identities, everyone has identities and the idea is that it is not just about concentrated disadvantage of having all or some specific identities.  If we have 5 dichotomous identity/group variables everyone has 5 dimensions of identity.  Intersectionality is about the idea that something like &#8220;white, native born, woman, high income&#8221; shapes what happens because of how those come together to shape (in the case of my analysis) whether, as an undergraduate, you persist in STEM fields.</p></blockquote>
<p>I replied as follows:</p>
<p>Yes, I was actually thinking this when I wrote that!  I was imagining that each of the 5 factors has an &#8220;off&#8221; and &#8220;on&#8221; setting, and intersectionality kicks in when there are multiple &#8220;on&#8221; settings, where &#8220;on&#8221; represents the group that faces more difficulty (nonwhite, non-native born, female, low income, gender nonconformist, etc.).  Once you allow arbitrary possibilities for intersectionality, then my simple superadditive model wouldn&#8217;t fit.  On the other hand, if you were to allow all 32 possibilities to take on any value, then realistically you would not be able to estimate anything much at all:  this is the usual problem in sociology of approximating a complex social structure by a simple model that explains most of the variance.  For predicting persistence in STEM (or any academic field), one possible factor that could enter in a complicated way is conservative political ideology, in that for many attitudes and behaviors its predictive effect goes in the opposite direction from the &#8220;on&#8221; categories listed above, but grad students, in STEM and other fields, are predominantly politically on the left.  I could well imagine that conservative political ideology, like the other &#8220;on&#8221; categories, is predictive of not persisting in STEM but that this could interact in unexpected ways with those other categories.</p>
<p>From a statistical perspective, my main message is to choose such a model based on its explanatory power, while recognizing that it&#8217;s an approximation, rather than using methods such as statistical significance or Bayes factors which in different ways are driven by sample size, as we discussed in <a href="https://sites.stat.columbia.edu/gelman/research/published/avoiding.pdf">this 1995 paper</a>.</p>
<p>Another interesting statistical feature of this and similar discussions is that it&#8217;s natural for the discussion to go back and forth between the correlation between two predictors in the data (or the population) and the interaction between their predictive effects, as discussed at the top of this post.</p>
<p>I&#8217;m not sure if this interaction thing is a general pattern that has some statistical explanation, or just a faulty intuition of mine based on just a couple of special cases.  But I have noticed a general confusion that when people talk about interactions, often they seem to be talking about correlation between the predictors.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/06/what-is-the-relation-between-interactions-in-a-regression-model-and-correlations-among-the-predictors/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Old posts on the Monkey Cage blog, also something about Israel and Hamas</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/05/old-posts-on-the-monkey-cage-blog/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/05/old-posts-on-the-monkey-cage-blog/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Fri, 05 Jun 2026 13:44:17 +0000</pubDate>
				<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53093</guid>

					<description><![CDATA[A couple of decades ago some political scientists at George Washington University started a blog, which they called the Monkey Cage. They invited me to contribute. This was back in 2008. I haven&#8217;t thought about the Monkey Cage for a while. &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/05/old-posts-on-the-monkey-cage-blog/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>A couple of decades ago some political scientists at George Washington University started a blog, which they called the Monkey Cage.  They invited me to contribute.  This was back in 2008.</p>
<p>I haven&#8217;t thought about the Monkey Cage for a while.  At some point it became attached to the Washington Post, and they kept telling me that my posts were too bloggy and not journalistic, so I ended up contributing much less often.  Then it left the Post and got renamed as Good Authority.  My last post was in 2020.</p>
<p>Sometimes, though, I come across old posts on this blog that link back to my Monkey Cage articles.  Often the links don&#8217;t work because the site kept changing ownership.  So then I have to do some googling to find my original post.  Annoying!  But I guess my bad for not saving everything I write locally.</p>
<p>Anyway, I recently found <a href="https://goodauthority.org/people/andrew/">this archive of everything I wrote for the Monkey Cage</a>.  It looks like there are over 1000 posts!  It&#8217;s hard for me to believe I wrote so much for that site.  I&#8217;d forgotten almost all of this.  To be fair, some of these posts are pretty short.</p>
<p><strong>P.S.</strong>  The Good Authority people need to fix their webpage.  The top of it looks like this:</p>
<blockquote><p><img fetchpriority="high" decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-1024x546.png" alt="" width="584" height="311" class="alignnone size-large wp-image-53094" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-1024x546.png 1024w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-300x160.png 300w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-768x410.png 768w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-1536x819.png 1536w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-2048x1092.png 2048w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.52.11-500x267.png 500w" sizes="(max-width: 584px) 100vw, 584px" /></p></blockquote>
<p>That&#8217;s all fine except that I clicked on the tab for &#8220;Israel-Hamas&#8221; and got the following articles:</p>
<blockquote><p><img decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-15-at-08.53.58-1024x878.png" alt="" width="400" /></p></blockquote>
<p>Only one of these is about Israel or Hamas!  Or maybe they&#8217;re trying to draw a not-so-subtle analogy by including a post on white South Africans in the &#8220;Israel&#8221; category.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/05/old-posts-on-the-monkey-cage-blog/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>Against shallow anti-rational humanism</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/04/against-shallow-humanism/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/04/against-shallow-humanism/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Thu, 04 Jun 2026 13:32:00 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=52939</guid>

					<description><![CDATA[Jessica writes: I get so tired of people dumping on decision theory because real world decisions are complex. If decision theory is so deeply flawed, I&#8217;d love to know what alternative methods the critics advise for trying to evaluate and &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/04/against-shallow-humanism/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Jessica writes:</p>
<blockquote><p>I get so tired of people dumping on decision theory because real world decisions are complex. If decision theory is so deeply flawed, I&#8217;d love to know what alternative methods the critics advise for trying to evaluate and improve decision making in some real world setting. Should we give up on modeling completely because some cause problems for our assumptions? What happened to the epistemic value of attempting to formalize goals so as to better understand what components we think are at play? Do we really want to go back to talking about man as a creature of instinct and habit and leave it at that?</p></blockquote>
<p>I agree, and this reminds me of <a href="https://statmodeling.stat.columbia.edu/2005/09/28/the_rational_an/">a discussion from twenty years ago</a> (!) about the transition from viewing people as &#8220;rational animals&#8221; to viewing people as &#8220;irrational computers.&#8221;</p>
<p>Here&#8217;s Thomas Jefferson from 1823:</p>
<blockquote><p>We believed . . . that man was a rational animal, endowed by nature with rights, and with an innate sense of justice; and that he could be restrained from wrong and protected in right, by moderate powers, confided to persons of his own choice, and held to their duties by dependence on his own will.</p></blockquote>
<p>He&#8217;s coming from a liberal (in the U.S. political sense) perspective, with the idea being that rationality is a way to move forward from outmoded feudal arrangements.  Not that this was so easy&#8211;Jefferson owned slaves!&#8211;, but nobody said that rationality was easy, just that it&#8217;s a way forward.</p>
<p>This association, in which the left was associated with utopian rationality and the right was associated with sensible acceptance of irrationality, <a href="https://statmodeling.stat.columbia.edu/2012/03/08/the-politics-of-economic-and-statistical-models/">continued for another century</a>.  Consider, for example, the contrast between the rationalist and socialist George Bernard Shaw and the Catholic conservative G. K. Chesterton.  This association of rationality with the left continued through the New Deal period in the U.S. and the idea of the Soviet Union as being scientifically socialist.  The second world war pitted Soviet central planning and &#8220;Fordist&#8221; American organization against the blood-and-soil Axis powers.</p>
<p>Sometime during the mid-cold-war period there was a shift, at least in the U.S. and its allies, where science and technology were associated with the military-industrial complex and gained a conservative tinge, while the left embraced an anti-technology, back-to-the-land vision.  &#8220;Humanism&#8221; moved from a conservative, roll-back-the-tide, <a href="https://statmodeling.stat.columbia.edu/2007/06/15/politics_and_ec/">Chestertonian</a> position to a liberal, fight-the-Man position.</p>
<p>Nowadays things are a mess:  conservatives support military and police hardware, coal, nuclear power, bitcoin, data centers, and gas guzzlers more generally, but conservatives also oppose vaccines and scientific research more generally, and Biblical creationism hasn&#8217;t gone away either.  And, with conservatives in charge of the country and much of public discourse, liberals are often defining themselves based on what they oppose.</p>
<p>I&#8217;m with Jessica in that I see no conflict between humanism and rationality.  Rationality is an ideal or a way of being, not an algorithm.  Yes, we&#8217;re animals, and rationality is one of our very useful tricks.  I wouldn&#8217;t want to abandon rationality or define ourselves against it, any more than I&#8217;d want to abandon running or singing or any of the other things that we can do so well, when we do them well.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/04/against-shallow-humanism/feed/</wfw:commentRss>
			<slash:comments>23</slash:comments>
		
		
			</item>
		<item>
		<title>Epidemiologist Donna Spiegelman sez:  SUTVA is &#8220;mostly not necessary for valid causal estimation and inference most of the time&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/03/epidemiologist-donna-spiegelman-sez-sutva-is-mostly-not-necessary-for-valid-causal-estimation-and-inference-most-of-the-time/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/03/epidemiologist-donna-spiegelman-sez-sutva-is-mostly-not-necessary-for-valid-causal-estimation-and-inference-most-of-the-time/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Wed, 03 Jun 2026 13:12:31 +0000</pubDate>
				<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Public Health]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53803</guid>

					<description><![CDATA[Donna Spiegelman shares this presentation she gave at the recent American Causal Inference Conference. I like what she has to say. Here are the two parts of the stable unit treatment value assumption: 1. No interference between units. As Spiegelman says, &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/03/epidemiologist-donna-spiegelman-sez-sutva-is-mostly-not-necessary-for-valid-causal-estimation-and-inference-most-of-the-time/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><img decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/06/Screenshot-2026-06-03-at-09.19.43-1024x291.png" alt="" width="450" /></p>
<p>Donna Spiegelman shares this <a href="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/06/SUTVA_5min_conference_presentation-2.pdf">presentation she gave</a> at the recent American Causal Inference Conference.  I like what she has to say.</p>
<p>Here are the two parts of the stable unit treatment value assumption (SUTVA):</p>
<p>1.  No interference between units.  As Spiegelman says, nowadays it&#8217;s not hard to model spillovers.  As I say, untangling spillovers is an ill-posed inverse problem that can be solved using Bayesian inference with reasonable priors.  Serious practical work has moved past the demonstrate-that-spillover-doesn&#8217;t-matter stage to the just-model-the-spillover-directly stage.</p>
<p>2.  Deterministic potential outcomes.  As Spiegelman says, in the real world, outcomes are stochastic.  Jonas and I talk about this in our <a href="https://sites.stat.columbia.edu/gelman/research/published/StochasticPotentialOutcomes.pdf">Russian roulette</a> paper.</p>
<p>The part that I&#8217;m less sure about is Spiegelman&#8217;s claim that adjustments for pre-treatment variables usually don&#8217;t matter.  I&#8217;m persuaded that they usually don&#8217;t matter in the epidemiology and biostatistics applications she&#8217;s worked on, but I think that in social science, such adjustments can be important.  Especially if there are big treatment interactions and your population is a lot different from your sample.</p>
<p>In any case, I recommend you look through Spiegelman&#8217;s slides, as she offers a refreshing perspective compared to our usual obsessive focus on the details of causal identification:</p>
<p><img decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/06/Screenshot-2026-06-03-at-09.20.17-1024x716.png" alt="" width="400" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/03/epidemiologist-donna-spiegelman-sez-sutva-is-mostly-not-necessary-for-valid-causal-estimation-and-inference-most-of-the-time/feed/</wfw:commentRss>
			<slash:comments>44</slash:comments>
		
		
			</item>
		<item>
		<title>Survey Statistics: it is (still) the people</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/02/survey-statistics-it-is-still-the-people/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/02/survey-statistics-it-is-still-the-people/#comments</comments>
		
		<dc:creator><![CDATA[shira]]></dc:creator>
		<pubDate>Tue, 02 Jun 2026 20:00:11 +0000</pubDate>
				<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53798</guid>

					<description><![CDATA[A year and a day ago, the Survey Statistics blog series launched with: “it is the people that make survey statistics (and anything) great&#8221;. This past weekend, we got to celebrate wonderful people at Andrew Gelman’s 60-ish Birthday workshop. &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/02/survey-statistics-it-is-still-the-people/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>A year and a day ago, the Survey Statistics blog series launched with: <a href="https://statmodeling.stat.columbia.edu/2025/06/01/survey-statistics-it-is-the-people">“it is the people</a> that make survey statistics (and anything) great&#8221;. This past weekend, we got to celebrate wonderful people at <a href="https://gelman60.com/">Andrew Gelman’s 60-ish Birthday workshop</a>.</p>
<p>Artist Sophie Gelman made the below:</p>
<p><img decoding="async" class="" src="https://gelman60.com/images/andrew-sketch.png" alt="Sketch portrait of Andrew Gelman" width="397" height="319" /></p>
<p>Yair Ghitza gave a talk about Andrew&#8217;s influence on polling. Yair is Chief Scientist at Catalist and coauthor of excellent papers about MRP we&#8217;ve cited in this blog series: <a href="https://sites.stat.columbia.edu/gelman/research/published/misterp.pdf">Ghitza and Gelman 2013</a> and <a href="https://www.cambridge.org/core/journals/political-analysis/article/abs/voter-registration-databases-and-mrp-toward-the-use-of-largescale-databases-in-public-opinion-research/C6C428EB05DC7132678215896F38B6B7">Ghitza and Gelman 2020</a>. The discussion after his talk included mention of <a href="https://www.nytimes.com/2026/05/18/upshot/times-siena-poll-changes.html">Nate Cohn&#8217;s May 18, 2026 NYT article</a> about weighting with &#8220;synthetic past vote&#8221;.</p>
<p><img decoding="async" class="alignnone wp-image-53800" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/06/Doobie_TN_AT_May_8_2026_poles_view-scaled.jpg" alt="" width="453" height="342" /></p>
<p>Let&#8217;s use our notation from <a href="https://statmodeling.stat.columbia.edu/2025/12/23/survey-statistics-is-a-mismeasured-x-better-than-none-at-all/">&#8220;is a mismeasured X better than none at all ?&#8221;</a> and <a href="https://statmodeling.stat.columbia.edu/2025/12/30/survey-statistics-more-adventures-in-mismeasured-x/">&#8220;more adventures in mismeasured X&#8221;</a> (see also <a href="https://statmodeling.stat.columbia.edu/2026/02/10/survey-statistics-more-on-recalled-vote/">&#8220;more on recalled vote&#8221;</a>):</p>
<ul>
<li>Y = current support</li>
<li>X = true 2024 vote, unknown</li>
<li>X* = recalled 2024 vote</li>
</ul>
<p>And notation from <a href="https://statmodeling.stat.columbia.edu/2025/11/11/survey-statistics-weights-and-mrp-for-voters/">&#8220;weights and MRP for voters&#8221;</a>:</p>
<ul>
<li>V = current registered voter</li>
<li>V2024 = record of voting in 2024 (a coarsened version of X that only tells of whether someone voted)</li>
</ul>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53801" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/06/Doobie_TN_AT_May_8_2026_poles_higher_up-scaled.jpg" alt="" width="460" height="347" /></p>
<p>Suppose we want E(Y | V=1), current support among current registered voters. Using MRP, we might want to estimate this via E(E(Y | X, sample, V = 1) | V = 1). But we&#8217;ve got at least 2 challenges:</p>
<ol>
<li>We can&#8217;t directly estimate E(Y | X, sample, V = 1) because we only have recalled vote X*.</li>
<li>We need p(X | V=1) for the outer expectation. But past election results give p(X).</li>
</ol>
<p><a href="https://www.nytimes.com/2026/05/18/upshot/times-siena-poll-changes.html">Nate&#8217;s article</a> proposes:</p>
<ol>
<li>Create <strong>synthetic past vote, X**</strong>, which aims to improve recalled vote X*:
<ol>
<li><strong>Impute</strong> X** if X* is missing and there is a record they voted (i.e. V2024 = 1).</li>
<li><strong>Validate:</strong> set X** to &#8220;nonvoter&#8221; if there is no record they voted (i.e. V2024 = 0).</li>
</ol>
</li>
<li>I interpret this to mean they estimate p(X** | V = 1) ? See <a href="https://statmodeling.stat.columbia.edu/2025/11/11/survey-statistics-weights-and-mrp-for-voters/">&#8220;weights and MRP for voters&#8221;</a> for ideas.<br />
<blockquote><p>&#8220;Synthetic past vote is weighted to match our estimate for how today’s registered voters &#8230;voted in the last election.&#8221;</p>
<p>&nbsp;</p></blockquote>
</li>
</ol>
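<p>Here is a minimal sketch in Python of the two pieces as I read them above.  It is not the Times/Siena procedure: the function names <code>synthetic_past_vote</code> and <code>poststratify</code>, the <code>impute</code> callable, the category labels, and all the numbers are assumptions made up for illustration.  The first function applies the impute and validate rules to get X**, and the second computes the outer expectation as a share-weighted average of cell means of Y, using target shares standing in for p(X** | V = 1).</p>

```python
from typing import Callable, Dict, Optional


def synthetic_past_vote(recalled: Optional[str], voted_2024: bool,
                        impute: Callable[[], str]) -> str:
    """Build X** from recalled vote X* and the turnout record V2024."""
    if not voted_2024:
        # Validate: no record of voting, so set to nonvoter.
        return "nonvoter"
    if recalled is None:
        # Impute: record of voting but recalled vote is missing.
        return impute()
    # Assumption: otherwise keep the recalled vote as is.
    return recalled


def poststratify(cell_means: Dict[str, float],
                 target_shares: Dict[str, float]) -> float:
    """Average the cell means of Y over the target distribution of X**."""
    total = sum(target_shares.values())
    return sum(cell_means[c] * target_shares[c] for c in target_shares) / total


# Made-up inputs: estimated support by X** cell, and assumed target shares
# among current registered voters.
cell_means = {"A": 0.9, "B": 0.1, "nonvoter": 0.5}
target_shares = {"A": 0.4, "B": 0.4, "nonvoter": 0.2}
print(synthetic_past_vote(None, True, impute=lambda: "A"))
print(round(poststratify(cell_means, target_shares), 3))
```

<p>In practice the cell means would come from a fitted model rather than raw averages, and the imputation step would need its own model; both are left as placeholders here.</p>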
<p>What do you think ?</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53802" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/06/Doobie_TN_AT_May_8_2026_on_blaze-scaled.jpg" alt="" width="485" height="366" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/02/survey-statistics-it-is-still-the-people/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>Say what you want about this junk survey, at least it&#8217;s more plausible than other hyped claims like the hyperloop or the idea that UFOs are space aliens!</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/02/hey-its-more-plausible-than-the-hyperloop-or-the-idea-that-ufos-are-space-aliens/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/02/hey-its-more-plausible-than-the-hyperloop-or-the-idea-that-ufos-are-space-aliens/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Tue, 02 Jun 2026 13:38:56 +0000</pubDate>
				<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=52614</guid>

					<description><![CDATA[Palko points to this breakdown of a junk news story. The fake-survey-to-headline pipeline reminds me of a credulous Wall Street Journal story from a few years back. But, yeah, with respected news sources repeatedly falling for ridiculous scams like the &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/02/hey-its-more-plausible-than-the-hyperloop-or-the-idea-that-ufos-are-space-aliens/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Palko points to <a href="https://bsky.app/profile/absolutely-not.bsky.social/post/3m2ayr6yju22b">this breakdown</a> of a junk news story.  The fake-survey-to-headline pipeline reminds me of a credulous Wall Street Journal story from <a href="https://statmodeling.stat.columbia.edu/2008/11/02/political-attitudes-of-the-super-rich/">a few years back</a>.</p>
<p>But, yeah, with respected news sources repeatedly falling for ridiculous scams like the <a href="https://statmodeling.stat.columbia.edu/2016/09/09/exploration-vs-exploitation-tradeoff/">hyperloop</a> or the idea that UFOs are <a href="https://statmodeling.stat.columbia.edu/2024/08/15/sports-media-prestige-media-space-aliens-edition/">space aliens</a> or <a href="https://statmodeling.stat.columbia.edu/2024/10/19/carroll-langer-credulous-scientist-as-hero-reporting-from-a-podcaster-who-should-know-better/">mind-body healing</a> etc etc etc., I guess we shouldn&#8217;t be shocked that they will uncritically report on a junk survey.  It&#8217;s that insatiable need for fresh content.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/02/hey-its-more-plausible-than-the-hyperloop-or-the-idea-that-ufos-are-space-aliens/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>Noem’s Razor and why I think the concept of &#8220;unintended consequences&#8221; is overrated</title>
		<link>https://statmodeling.stat.columbia.edu/2026/06/01/noems-razor-and-why-i-think-the-concept-of-unintended-consequences-is-overrated/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/06/01/noems-razor-and-why-i-think-the-concept-of-unintended-consequences-is-overrated/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Mon, 01 Jun 2026 13:41:13 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Political Science]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53184</guid>

					<description><![CDATA[I was thinking more about Noem’s Razor (&#8220;Never attribute to stupidity that which is adequately explained by malice&#8221;) and it reminded me that “unintended consequences” often were actually intended, a principle that I discussed back in 2008 in the &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/06/01/noems-razor-and-why-i-think-the-concept-of-unintended-consequences-is-overrated/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I was thinking more about <a href="https://statmodeling.stat.columbia.edu/2009/05/24/handy_statistic/">Noem’s Razor</a> (&#8220;Never attribute to stupidity that which is adequately explained by malice&#8221;) and it reminded me that “unintended consequences” often were actually intended, a principle that I <a href="https://statmodeling.stat.columbia.edu/2008/01/22/what_kind_of_la/">discussed back in 2008</a> in the context of Freakonomics, that reliable purveyor of conventional wisdom; see also <a href="https://statmodeling.stat.columbia.edu/2008/02/14/discussion_of_u/">here</a> and <a href="https://statmodeling.stat.columbia.edu/2018/03/21/moral-hazard-quantitative-social-science-causal-identification-statistical-inference-policy/">here</a>.</p>
<p>One of my general problems with the concept of &#8220;unintended consequences&#8221; is that it so often seems to be used either as an argument against a proposed reform (recommending to not do this seemingly good thing because of its unintended consequences; what Albert Hirschman called the &#8220;perversity thesis&#8221; in his classic book, The Rhetoric of Reaction) or as a way to get evildoers off the hook by arguing that their bad actions were actually the unintended consequences of somebody&#8217;s good intentions.</p>
<p>I have a similar problem with Hanlon&#8217;s Razor (&#8220;Never attribute to malice that which is adequately explained by stupidity&#8221;).  Often Hanlon&#8217;s Razor applies, that&#8217;s for sure, but I also think it can be a way to let people off the hook.</p>
<p>Also, often the simpler explanation is the right one.  In the <a href="https://statmodeling.stat.columbia.edu/2026/01/19/noems-razor-never-attribute-to-stupidity-that-which-is-adequately-explained-by-malice/">motivating example</a> for Noem&#8217;s Razor, someone described the lethal behavior of the immigration police in Minneapolis as a &#8220;Sad case of poor incentive design (ICErs create expensive externalities bc of legal, reputation. etc costs of processing bad detentions and arrests. Textbook amateur mistake.&#8221;&#8211;but it seemed to me more likely that those police were doing what the government wanted.  So the incentives (by which the agents can break the law without fear of consequences) worked directly.  There&#8217;s no evidence that the consequences were unintended.  The <em>political</em> consequences may well have been not as desired, but I see that as more of a political miscalculation than anything else.</p>
<p>As the economists say, when there’s a policy that seems like it doesn’t make sense, think more carefully about the incentives. And of course this principle applies much more generally, as in the literature on regulatory capture.</p>
<p>I don&#8217;t buy the argument that the nice guys are the real assholes.  I think the assholes are usually the real assholes.</p>
<p>That doesn&#8217;t mean I think that all purported do-gooders are actually doing good&#8211;<a href="https://statmodeling.stat.columbia.edu/2026/02/01/for-gods-sake-dont-donate-to-the-international-peace-institute-unless-you-want-to-pay-for-corrupt-assholes-to-fly-first-class-around-the-world-to-give-speeches-about-how-everyone-needs-to/">see here</a>, for example.  People need to be evaluated based on what they do, not what they say.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/06/01/noems-razor-and-why-i-think-the-concept-of-unintended-consequences-is-overrated/feed/</wfw:commentRss>
			<slash:comments>36</slash:comments>
		
		
			</item>
		<item>
		<title>&#8220;Rationally Turbulent Expectations&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/31/rationally-turbulent-expectations/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/31/rationally-turbulent-expectations/#respond</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sun, 31 May 2026 13:27:56 +0000</pubDate>
				<category><![CDATA[Economics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=52257</guid>

					<description><![CDATA[Kent Osband writes: About 15 years ago you kindly linked an article I wrote on “rational turbulence.” I’d like to let you know that I have recently summarized much more research along these lines in a short book Rationally Turbulent &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/31/rationally-turbulent-expectations/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Kent Osband writes:</p>
<blockquote><p>About 15 years ago you kindly <a href="https://statmodeling.stat.columbia.edu/2011/12/11/rational-turbulence/">linked an article I wrote</a> on “rational turbulence.” I’d like to let you know that I have recently summarized much more research along these lines in a short book Rationally Turbulent Expectations.</p>
<p>I have published it as cheaply as color printing allows and also posted all chapters for free on ssrn, starting with <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5207446">this overview</a>.  </p>
<p>The main finding, summarized in the first few pages of Chapter 4, is that&#8211;once we allow for even tiny doubts about the stability of an iid process&#8211;Bayesian learning has calm and turbulent phases, with fast learning more turbulent. It explains why differences in opinion between two reasonable people often widen before they narrow. I think this deserves broader attention in that people can learn to disagree more respectfully.</p>
<p>Hardly anyone will listen to me but many listen to you, so I am hopeful you will persuade yourself of this quickly and help persuade others.</p></blockquote>
<p>I clicked through and took a look.  Lots of things there resonate with various ideas I&#8217;ve discussed various times without ever fully thinking them through.  For example:</p>
<p><em>&#8220;Words like &#8216;ahead&#8217; and &#8216;forward&#8217; that point to space in front of us also point to future time. However, that isn’t the only way to align directions&#8221;</em>:  This reminds me of the idea <a href="https://statmodeling.stat.columbia.edu/2025/04/07/in-science-as-in-genre-storytelling-the-thrill-of-the-unexpected-can-only-come-with-reference-to-and-in-confounding-some-preexisting-norm/">discussed here</a> that there&#8217;s a logic of causality going forward in time and a logic of inference going backward in time.  Generative models in statistics are a way of going from one of these to the other.</p>
<p><em>&#8220;Most surprises are outliers from a still intact trend. We tweak the next round of forecasts and move on. Occasionally the surprises get under our skin. They shock us less by their size than their persistence. They make us suspect that what we thought of as a rare outlier is now the new norm&#8221;</em>:  This reminds me of the ideas of Shewhart, Deming, etc., on quality control, an approach to statistics which I think is <a href="https://statmodeling.stat.columbia.edu/2017/10/29/quality-control-rather-hypothesis-testing-inference-discovery-better-metaphor-statistical-processes-science/">important and underrated</a>.</p>
<p><em>&#8220;Contemporary finance theory rests on an unstable truce between two opposing schools. . . . The Rational Expectations school treats the market as a knowledge machine that assesses risks correctly and prices them appropriately. The Behavioral Finance school treats the market as a ship of fools prone to long stretches of complacency and short bouts of panic&#8221;</em>:  This reminds me of what I&#8217;ve called the <a href="https://statmodeling.stat.columbia.edu/2018/04/26/quick-rule-thumb-someone-seems-acting-like-jerk-economist-will-defend-behavior-essence-morality-someone-seems-something-nice/">two modes of reasoning in microeconomics</a>:  people are sometimes considered to be rational, so that the role of economists is to observe and analyze behavior and, from that, deduce values and motivations; and sometimes people are considered to be irrational, and the role of economists is to set them straight.  Either way, the economist (or &#8220;freakonomist&#8221;) is portrayed as a culture hero, either in protecting us from pinheaded academics who don&#8217;t trust the ordinary Joe to make his own damn decisions, or in helping people avoid deadweight losses all around them.</p>
<p>Osband frames the economy in terms of <em>turbulence</em>, which fits in well with the idea that economics occurs on the phase transition of equilibrium.  &#8220;Turbulence&#8221; seems like an appropriate term.</p>
<p>I don&#8217;t have the energy to read the whole book&#8211;I guess there&#8217;s an economic message there too!&#8211;so I can&#8217;t say more, but I&#8217;m happy to spread the word, and if you&#8217;re interested you can take a look into it yourself.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/31/rationally-turbulent-expectations/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Against too-clever-by-half political science cynicism</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/30/against-too-clever-by-half-political-science-cynicism/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/30/against-too-clever-by-half-political-science-cynicism/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sat, 30 May 2026 13:35:50 +0000</pubDate>
				<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=52615</guid>

					<description><![CDATA[There&#8217;s a long tradition in political science of skepticism regarding proposed quick fixes in politics. For example: term limits sound good but they weaken the legislature and reduce voter choice. Jungle primaries sound fair but they encourage insincere voting. Campaign &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/30/against-too-clever-by-half-political-science-cynicism/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>There&#8217;s a long tradition in political science of skepticism regarding proposed quick fixes in politics.  For example:  term limits sound good but they weaken the legislature and reduce voter choice.  Jungle primaries sound fair but they encourage insincere voting.  Campaign finance rules sound like a good idea but donors can always get around them.  Anticorruption laws might sound necessary for preserving political integrity, but ultimately politics is all about favors so why single out certain practices and label them as bribery?  Gerrymandering sounds bad but it&#8217;s all part of the political process.  </p>
<p>These arguments typically follow the patterns of anti-reform arguments noted by Albert Hirschman:  perversity, futility and jeopardy.  From <a href="https://en.wikipedia.org/wiki/The_Rhetoric_of_Reaction">Wikipedia</a>:</p>
<blockquote><p>&#8211; According to the Perversity Thesis, any purposive action to improve some feature of the political, social, or economic status quo only serves, perversely, to exacerbate the very condition one wishes to remedy (compare: Unintended consequences).</p>
<p>&#8211; The Futility Thesis holds that attempts at social transformation will be unavailing, that they will fail to &#8220;make a dent&#8221; in the problem, and the motives of those who keep attempting futile reforms are suspect.</p>
<p>&#8211; The Jeopardy Thesis states that the risk of the proposed change is too great as it imperils some previous, precious accomplishment.</p></blockquote>
<p>Just because these are standard forms of argument, it doesn&#8217;t mean they&#8217;re wrong.  In any given case&#8211;indeed, in many actual cases&#8211;one or more of these points can be correct.  Just cos a proposed reform sounds kinda reasonable, it doesn&#8217;t mean that it would be a good idea if implemented.</p>
<p>But I see a broader theme in political science anti-reformism, a cynical attitude that we&#8217;ve described in the past as, <a href="https://statmodeling.stat.columbia.edu/2018/04/26/quick-rule-thumb-someone-seems-acting-like-jerk-economist-will-defend-behavior-essence-morality-someone-seems-something-nice/">A quick rule of thumb is that when someone seems to be acting like a jerk, an economist will defend the behavior as being the essence of morality, but when someone seems to be doing something nice, an economist will raise the bar and argue that he’s not being nice at all</a>.  In political science this comes out in the form of, Ha ha those silly reformers don&#8217;t understand the real world.  Among economics writers you sometimes see the argument that insider trading laws shouldn&#8217;t be enforced, either because insider trading is going to happen anyway (futility, in Hirschman&#8217;s terminology) or because insider trading is good in itself because it makes the market more efficient (jeopardy).  There&#8217;s also the argument that environmental regulation is bad because it can be evaded (futility), that it leads to the dreaded &#8220;regulatory capture&#8221; (perversity) or that, being a government-enforced law, it will interfere with a preferable market solution (jeopardy).  Political science anti-reform arguments are a bit different from economics anti-reform arguments in that the economists often seem to want to remove politics and government as much as possible, whereas political scientists will argue that all problems should be settled at the ballot box.</p>
<p>I&#8217;m not saying that the political scientists or the economists making anti-reform arguments are themselves cynical.  There&#8217;s no reason you can&#8217;t sincerely believe that campaign finance laws will be futile or that insider trading laws can&#8217;t work.  But I do think they can have knee-jerk reactions against anything that is promoted as a good-government reform.  Skepticism is fine, but just cos something is promoted as a reform, that doesn&#8217;t automatically make it bad either.</p>
<p>I&#8217;m not sympathetic to the general argument that the bad guys are gonna win anyway, so let&#8217;s not even try to stop them.</p>
<p>All this has become clearer to me in the current political climate in which corruption has pretty much become legalized (<a href="https://www.nytimes.com/2025/10/10/opinion/tom-homan-bribery-investigation.html">see here</a> for one of many high-profile recent examples).  Partisan redistricting is out of control, all sorts of lies are heavily promoted by public officials and on social media, nonpartisan news sources are being attacked by the government, campaign finance laws are routinely violated with no consequence, but meanwhile ordinary people are at risk all over the country for getting fired because of saying something insufficiently worshipful about Charlie Kirk or saying something that some campus watchdog doesn&#8217;t like.  This is political polarization&#8211;anything goes because the view is that the other side is worse&#8211;and I&#8217;m not saying that any particular reforms, whether they be campaign finance restrictions, anti-gerrymandering laws, or an end to court rulings that effectively legalize bribery, will stop this trend&#8211;but I do think that the ending of political restraint has been facilitated by the efforts made in recent decades to take away the rules.</p>
<p>In short:  Yes, political reforms don&#8217;t always deliver what they promise.  But a continuing flow of reforms may be necessary to counter trends in lawlessness.</p>
<p>To put it another way, the fact that bad actors can work their way around reforms is often taken as an argument against reform:  if a reform is gonna be evaded, why do it?  But I&#8217;d argue it the other way: if bad actors are going to try to get around the rules, we should make it effortful for them to do so, and risky too.  When corruption is effectively legalized, as it is now, the barriers go down and more people will do corrupt things, indeed from an economic and political standpoint they&#8217;re pushed in that direction.  When corruption is forbidden, yes, it will still happen, but you&#8217;d expect to see less of it.  And there&#8217;s a big difference between &#8220;some politicians are corrupt&#8221; and &#8220;the government is up for sale.&#8221;</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/30/against-too-clever-by-half-political-science-cynicism/feed/</wfw:commentRss>
			<slash:comments>12</slash:comments>
		
		
			</item>
		<item>
		<title>15 new articles on statistical workflow!</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/29/15-new-articles-on-statistical-workflow/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/29/15-new-articles-on-statistical-workflow/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Fri, 29 May 2026 13:50:20 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53785</guid>

					<description><![CDATA[Aki, Richard, Lizzie, and I put together a special issue on Statistical Workflow for the Philosophical Transactions of the Royal Society. I guess &#8220;royal&#8221; isn&#8217;t as impressive as it used to be, but still. Statistics and data analytics play an &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/29/15-new-articles-on-statistical-workflow/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-27-at-20.52.32-710x1024.png" alt="" width="584" height="842" class="alignnone size-large wp-image-53786" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-27-at-20.52.32-710x1024.png 710w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-27-at-20.52.32-208x300.png 208w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-27-at-20.52.32-768x1108.png 768w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-27-at-20.52.32-1065x1536.png 1065w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-27-at-20.52.32.png 1120w" sizes="(max-width: 584px) 100vw, 584px" /></p>
<p>Aki, Richard, Lizzie, and I put together <a href="https://royalsocietypublishing.org/rsta/issue/384/2321">a special issue on Statistical Workflow for the Philosophical Transactions of the Royal Society</a>. I guess &#8220;royal&#8221; <a href="https://www.news.com.au/entertainment/celebrity-life/royals/king-snubs-andrew-mountbattenwindsor-following-dramatic-police-update/news-story/19462334f867f7047c363f959bde6abd">isn&#8217;t as impressive</a> as it used to be, but still.</p>
<p>Statistics and data analytics play an increasingly important role in and across science and policy. But much of what is done by the best practitioners&#8211;their “workflow”&#8211;is tacit knowledge only glanced over in textbooks and research articles. In this new collection covering a wide range of disciplines, leading statisticians and researchers discuss the motivations and details for their workflows.</p>
<p>The four of us did this project because we were all interested in Bayesian workflow, and we wanted to learn more about statistical workflow in general, not just the Bayesian part.</p>
<p>Here&#8217;s what&#8217;s in the issue:</p>
<ul>
<li>Statistical workflow, by Andrew Gelman, Aki Vehtari &amp; Richard McElreath</li>
<li>Unsupervised machine learning for scientific discovery: workflow and best practices, by Andersen Chang, Tiffany M Tang, Tarek M Zikry &amp; Genevera I Allen</li>
<li>PCS workflow for veridical data science in the age of AI, by Zachary T Rewolinski &amp; Bin Yu</li>
<li>Simulations in statistical workflows, by Paul-Christian Bürkner, Marvin Schmitt &amp; Stefan T Radev</li>
<li>An automatic finite-sample robustness metric: when can dropping a little data change conclusions? Part I: definitions and experiments, by Ryan Giordano, Rachael Meager &amp; Tamara Broderick</li>
<li>An automatic finite-sample robustness metric: when can dropping a little data change conclusions? Part II: theory and intuition, by Ryan Giordano, Rachael Meager &amp; Tamara Broderick</li>
<li>Building a Backdrop of Meaning in Magnitude (BoMM) as part of research workflow, by Megan Dailey Higgs</li>
<li>A preliminary data analysis workflow for meta-analysis of dependent effect sizes, by Elizabeth Tipton, James Pustejovsky &amp; Jingru Zhang</li>
<li>A four-step simulation-based workflow for ecological analysis and science, by EM Wolkovich, T Jonathan Davies, William D Pearse &amp; Michael Betancourt</li>
<li>Scientific workflow in experimental economics, by Anna Dreber &amp; Séverine Toussaert</li>
<li>Hidden processes of workflow in cognitive developmental psychology, by Lauren N. Girouard &amp; Susan A. Gelman</li>
<li>Reproducible workflow for online AI in digital health, by Susobhan Ghosh et al.</li>
<li>Model checks for Bayesian estimation and forecasting of health coverage indicators in low- and middle-income countries, by Leontine Alkema et al.</li>
<li>Closing the gap between statistical and scientific workflows for improved forecasts in ecology, by Victor Van der Meersch, James Regetz, T Jonathan Davies &amp; EM Wolkovich</li>
<li>Machine learning workflows in climate modeling: design patterns and insights from case studies, by Tian Zheng et al.</li>
</ul>
<p>Lots of good stuff here, and lots of different perspectives.  Thanks to all the authors.  <a href="https://royalsocietypublishing.org/rsta/issue/384/2321">The issue is here</a>, and all the papers should be freely available.</p>
<p>If you have any thoughts on the articles in the volume, or on any other statistical workflow topics, just let us know right here in the comments box.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/29/15-new-articles-on-statistical-workflow/feed/</wfw:commentRss>
			<slash:comments>10</slash:comments>
		
		
			</item>
		<item>
		<title>The Kappa Zoo: David Eubanks&#8217;s online monograph on rating models</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/28/the-kappa-zoo-david-eubankss-online-monograph-on-rating-models/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/28/the-kappa-zoo-david-eubankss-online-monograph-on-rating-models/#comments</comments>
		
		<dc:creator><![CDATA[Bob Carpenter]]></dc:creator>
		<pubDate>Thu, 28 May 2026 19:00:39 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Statistical Computing]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53775</guid>

					<description><![CDATA[David Eubanks writes: My site is kappazoo.com, and it&#8217;s still a work in progress. I would rather have emailed after I had the new goodness-of-fit code done, but I saw that you&#8217;re doing a summer workshop (on Andrew&#8217;s blog) [editor: &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/28/the-kappa-zoo-david-eubankss-online-monograph-on-rating-models/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://www.furman.edu/people/david-eubanks/">David Eubanks</a> writes:</p>
<blockquote><p>
My site is <a href="https://kappazoo.com">kappazoo.com</a>, and it&#8217;s still a work in progress. I would rather have emailed after I had the new goodness-of-fit code done, but I saw that you&#8217;re doing a summer workshop (on Andrew&#8217;s blog) [editor: <a href="https://modeling.fordham.edu">Modern Modeling Methods (M3)</a>], so thought I&#8217;d mention it now.
</p></blockquote>
<p>It may be billed as a work in progress, but it&#8217;s a complete draft with no missing sections that provides a really nice overview of rating/crowdsourcing models.  These are the models that dragged me into statistics, namely Bayesian rating models formulated as noisy measurement models.  The first model of this kind that I or Eubanks could find was <a href="https://crowdsourcing-class.org/readings/downloads/ml/EM.pdf">Phil Dawid and Allan Skene&#8217;s (1979) paper on rating</a>.  </p>
<p>Eubanks works through a great deal of workflow without calling it that.  There are multiple model evaluation and comparison measures used and explained with connections to information-theoretic notions like entropy.</p>
<p>There&#8217;s a long discussion of Cohen&#8217;s kappa statistic, which is a commonly reported statistic measuring inter-rater agreement.  As Eubanks notes, it doesn&#8217;t deliver on its promise of adequately measuring inter-rater agreement.  The discussion is quite good here and complementary to the discussion from me and Becky Passonneau in our paper on rating in NLP, though our conclusions are the same.</p>
<p>I was surprised to see that Eubanks has a section comparing item-response theory (IRT) models with difficulty.  I&#8217;ve been trying to convince people this is important for years.  It took me around ten years to figure out how to move from Dawid and Skene&#8217;s IRT-0-like model to an IRT-1-like model, which we report in our <a href="https://arxiv.org/abs/2405.19521">latest paper on crowdsourcing with difficulty parameters</a> (which also works through a lot of Bayesian workflow in considering different models). I can&#8217;t identify what took so long&#8212;it seems so obvious to me now.  </p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/28/the-kappa-zoo-david-eubankss-online-monograph-on-rating-models/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>What if scientists really were dispassionate observers, communicating ideas without irrational commitment? Look here, says AI.</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/28/what-if-scientists-really-were-dispassionate-observers-communicating-ideas-without-irrational-commitment-look-here-says-ai/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/28/what-if-scientists-really-were-dispassionate-observers-communicating-ideas-without-irrational-commitment-look-here-says-ai/#comments</comments>
		
		<dc:creator><![CDATA[Jessica Hullman]]></dc:creator>
		<pubDate>Thu, 28 May 2026 16:37:16 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Sociology]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53789</guid>

					<description><![CDATA[This is Jessica. We often idealize science as proceeding primarily by the scientific method, where scientists approach the objects of their investigation with a healthy dose of detachment and neutrality, who become convinced only when the evidence is there, and &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/28/what-if-scientists-really-were-dispassionate-observers-communicating-ideas-without-irrational-commitment-look-here-says-ai/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><span style="font-weight: 400">This is Jessica. We often idealize science as proceeding primarily by the scientific method, where scientists approach the objects of their investigation with a healthy dose of detachment and neutrality, who become convinced only when the evidence is there, and remain open to changing their mind if new evidence becomes available. But in reality we see examples of authors becoming personally attached to their ideas despite the data, slipping into advocacy and becoming defensive or going into denial mode when presented with clear evidence they were wrong. </span></p>
<p><span style="font-weight: 400">The seemingly irrational attachment to the ideas or findings can seem easy to dismiss as a bad thing. </span><span style="font-weight: 400">Yet there are also times when having some level of personal commitment makes one more effective at certain roles that scientists must play. For example, being too transparent about our own uncertainty is not always effective when presenting research to others, because the audience can become distracted and stop listening entirely, even if you have some useful insight to convey. My question is, What does our ability to now use AI to generate implementations, presentations, and even the ideas we work on themselves add to the mix?</span></p>
<p><span style="font-weight: 400">I got a glimpse of this recently. May ended up being workshop month for me, with at least one each week. I saw a lot of presentations. A couple of these showed me something I hadn’t yet seen, at least outside of student presentations: talks comprised of obviously AI-generated slides. If you’ve tried to use the non-design optimized versions of models like GPT or Claude to create slides yourself, you will know what I mean. Almost every slide has content organized in a grid. There’s too much text—full sentences or nearly so in multiple places, headers and footers, and stylized phrasing everywhere, like “principal design levers” and “load-bearing assumptions” and “actionable pathways”. </span></p>
<p><span style="font-weight: 400">These were not presentations by overwhelmed junior faculty or researchers I’d never heard of. They were by prominent researchers who are respected in their fields. </span></p>
<p><span style="font-weight: 400">Needless to say, they were not very effective talks. The slides tended to have too much going on to parse in time, with way too much text. The vague phrasing was distracting, making me wonder what exactly the presenter meant by terms like “governing frictions” or “strategic bottlenecks” and whether they write like that in their papers too. Part of the problem is that the presenter tends to use their own language as they present, rather than reinforcing what’s on the slide, so you have two competing streams of information that feel like they’re from two distinct viewpoints, one which is quite confident and willing to summarize and even exaggerate, the other more reserved. </span></p>
<p><span style="font-weight: 400">It makes sense that you’re more likely to hold at a distance what you didn’t come up with yourself, subconsciously at least, even if you think you’re selling it. In one case, the speaker also described how some of the results themselves were discovered by AI, which probably further contributes to the impression that they hadn’t fully committed to what they are presenting. </span></p>
<p><span style="font-weight: 400">This has me wondering what the impact on diffusion of ideas will be as it becomes more standard practice to rely on AI for implementation in scientific production and communication. It’s funny how reserving skepticism for your own results often comes up in </span><a href="https://statmodeling.stat.columbia.edu/2026/04/14/epistemic-virtues-for-science-in-the-age-of-automation/"><span style="font-weight: 400">discussing epistemic virtues</span></a><span style="font-weight: 400">, but when speakers present as if holding their work at arm’s length, the result is not so informative. As we rely more heavily on AI in all stages of research, will we face more challenges in getting others to adopt our ideas? </span></p>
<p><span style="font-weight: 400">It’s also another reminder of how few people thinking about AI for science seem to have considered all the personal stuff that goes into the practice of science, with lots of irrational investment and fixation and stubbornness and pride to drive the loop of discovery and validation and communication. Scientific discovery may be an “ocean,” to borrow an analogy associated with Leibniz, but surfing it requires strapping oneself to a board and committing to seeing where it gets you, not just keeping it in sight while you splash around somewhere else. </span></p>
<p><span style="font-weight: 400">This also leads to a practical question of how you instill a sense of ownership, or at least commitment, to ideas that were partly produced by AI. My own experience is that it takes a lot of time to verify AI produced results before I get to the level of confidence I’d have if I’d done it myself. For complex tasks there will inevitably be decisions made along the way, e.g., about how to parameterize certain things in implementation or to deal with edge cases or other exceptions. Each of these has to be reconstructed before I can really feel that I stand behind the output. </span></p>
<p><span style="font-weight: 400">Is there an alternative? It makes me think of the “baking guilt” that housewives supposedly felt after cake mixes came on the market, because they only required adding water. There was a loss of a sense of personal contribution and emotional ownership. The solution, which persists today, was to have them add an egg. Some psychoanalysts went so far as to interpret this as symbolic of their fertility. For AI-aided science, the closest thing to adding an egg seems to be having agents explain at length to you what was done, which can still mean a big improvement over implementing everything yourself, but not as much of a boost as it first seems.</span></p>
<p><span style="font-weight: 400">At any rate, interpreting the new challenges of AI-generated presentations of potentially AI-generated ideas as an aesthetic problem, or of “putting style before substance,” does not seem right. Scientific ideas don’t diffuse as bare propositions. They diffuse through people who have developed some passion for them. If we’re talking about AI for science, we shouldn’t be ignoring scientists and their relationships with what they do.  </span></p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/28/what-if-scientists-really-were-dispassionate-observers-communicating-ideas-without-irrational-commitment-look-here-says-ai/feed/</wfw:commentRss>
			<slash:comments>30</slash:comments>
		
		
			</item>
		<item>
		<title>Statistical analysis recapitulates the development of statistical methods</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/28/statistical-analysis-recapitulates-the-development-of-statistical-methods/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/28/statistical-analysis-recapitulates-the-development-of-statistical-methods/#respond</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Thu, 28 May 2026 13:00:47 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Sociology]]></category>
		<category><![CDATA[Statistical Computing]]></category>
		<category><![CDATA[Statistical Graphics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53788</guid>

					<description><![CDATA[We ran this a few years ago but it remains interesting so I&#8217;m reposting: There’s an old saying in biology that the development of the organism recapitulates the development of the species: thus in utero each of us starts as &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/28/statistical-analysis-recapitulates-the-development-of-statistical-methods/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>We ran this <a href="https://statmodeling.stat.columbia.edu/2015/02/06/statistical-analysis-recapitulates-development-statistical-methods/">a few years ago</a> but it remains interesting so I&#8217;m reposting:</p>
<p>There’s an old saying in biology that the development of the organism recapitulates the development of the species: thus in utero each of us starts as a single-celled creature and then develops into an embryo that successively looks like a simple organism, then like a fish, an amphibian, etc., until we reach our human form in preparation for birth.</p>
<p>Modern biologists don’t believe in this recapitulation.  But taking this as an intriguing idea, I see an analogy with statistical practice.</p>
<p>Some version of this recapitulation occurs just about whenever we do applied statistics. We start with the simplest methods&#8211;univariate data summaries and some basic multivariate analyses&#8211;then we perform some comparisons which we check via standard errors and off-the-shelf hypothesis tests, then we move to modeling. We might well start with least squares and maximum likelihood and then move to regularization and multilevel modeling as needed, then throw in measurement error models, selection models, nonparametric this and that, and so forth.</p>
<p>The analogy isn’t perfect&#8211;in particular, we don’t always begin an analysis with simple averages and plots; sometimes we begin with a sophisticated nonparametric data-exploration tool such as lowess or deep nets. And, lots of methods for graphical exploratory data analysis have only been developed recently; indeed, even methods as basic as scatterplots are <a href="https://nightingaledvs.com/statistical-graphics-and-comics/">only a few centuries old</a>.</p>
<p>Within the context of modeling, though, it does seem to me that we tend to start simple and then add more complicated features one at a time&#8211;and this seems like a sensible way to proceed. In so proceeding, we’re motivated in part by computational stability but also in part by the logic of increasing complexity: we take each step for a reason. Thus it is logical that statistical analysis recapitulates the development of statistical methods.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/28/statistical-analysis-recapitulates-the-development-of-statistical-methods/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>No, Bayes does not like Mayor Pete. (Pitfalls of using implied betting market odds to estimate electability.)</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/27/no-bayes-does-not-like-mayor-pete-pitfalls-of-using-implied-betting-market-odds-to-estimate-electability/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/27/no-bayes-does-not-like-mayor-pete-pitfalls-of-using-implied-betting-market-odds-to-estimate-electability/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Wed, 27 May 2026 13:03:21 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Political Science]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53136</guid>

					<description><![CDATA[This one&#8217;s from 2019, but it&#8217;s worth reposting given recent interest in prediction markets. The story starts with a post from economist Greg Mankiw, who wrote: Who has the best chance of beating Donald Trump? A clue can be found &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/27/no-bayes-does-not-like-mayor-pete-pitfalls-of-using-implied-betting-market-odds-to-estimate-electability/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>This one&#8217;s <a href="https://statmodeling.stat.columbia.edu/2019/11/23/pitfalls-of-using-implied-betting-market-odds-to-estimate-electability/">from 2019</a>, but it&#8217;s worth reposting given recent interest in prediction markets.</p>
<p>The story starts with a post from economist Greg Mankiw, who <a href="https://gregmankiw.blogspot.com/2019/04/bayes-likes-mayor-pete.html">wrote</a>:</p>
<blockquote><p>Who has the best chance of beating Donald Trump? A clue can be found using Bayes Theorem.</p>
<p>Here is the logic. Let A be the event that a candidate wins the general election, and B be the event that a candidate wins his or her party&#8217;s nomination. <a href="https://www.predictit.org/markets">Predictit</a> gives us the betting market&#8217;s view of P(A) and P(B). It is a safe assumption that P(B|A) = 1, that is, a candidate can win only if nominated. We can then use <a href="https://en.wikipedia.org/wiki/Bayes%27_theorem">Bayes theorem</a> to compute P(A|B), the probability that the candidate will win the general election conditional on being nominated.</p>
<p>So here are the results for P(A|B) as of now:</p>
<p>Buttigieg 0.80<br />
Biden 0.77<br />
O&#8217;Rourke 0.67<br />
Sanders 0.65<br />
Booker 0.60<br />
Yang 0.60<br />
Harris 0.57<br />
Warren 0.44</p>
<p>That is, the betting markets suggest that Mayor Pete would be the strongest candidate if nominated, with Joe Biden close behind. (Of course, these numbers will bounce around as the prices in betting markets change.)</p>
<p>By the way, when I [Mankiw] did a <a href="https://gregmankiw.blogspot.com/2006/11/bayes-likes-obama.html">similar calculation in 2006</a>, Bayes liked Barack Obama.</p></blockquote>
<p>I copied Mankiw&#8217;s post in its entirety, with the only change being that he wrote P(A / B) etc., and I changed the slash to the vertical bar, P(A|B).  (Are there people who write conditioning using a slash rather than a vertical bar?  I had no idea.  P(A|B) is more standard, I believe.  In the above post, Mankiw links to the wikipedia page which uses the P(A|B) notation.  No big deal, it just seemed odd to me.)</p>
<p>Anyway, I think the above set of calculations is a great example for teaching conditional probability.</p>
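<p>For teaching purposes, here is the one-line derivation behind Mankiw's numbers, written out as a sketch. It uses only the events A and B and the assumption P(B|A) = 1 stated in the quoted post; nothing else is assumed.</p>

```latex
\[
P(A \mid B) \;=\; \frac{P(B \mid A)\, P(A)}{P(B)} \;=\; \frac{P(A)}{P(B)}
\quad \text{when } P(B \mid A) = 1 .
\]
```

<p>So each number in the list is just the general-election price divided by the nomination price.</p>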
<p>The next step is to push a bit:  Do we really believe these numbers?  There&#8217;s nothing wrong with the probability calculations, but I&#8217;m not sure we should be taking Predictit&#8217;s betting odds as actual win probabilities.</p>
<p>To start with, I looked at Mankiw&#8217;s list and wondered what Yang was doing on it.  Yang&#8217;s a fringe candidate, right?  I wrote my post in June, 2019, and Yang was polling at 0.8% on Real Clear Politics then.  I went over to Predictit and it said you can buy a Yang contract for the Democratic nomination for the price of 9%.  OK, sure, at 0.8% in the polls there&#8217;s room for improvement.  But 9%???  Seems like a lot.</p>
<p>The next thing I&#8217;m worried about, beyond bias in the online markets, is volatility.</p>
<p>Sure, Mankiw writes, &#8220;these numbers will bounce around as the prices in betting markets change,&#8221; but I think he&#8217;s not fully appreciating how noisy these numbers are!</p>
<p>Mankiw&#8217;s post is dated 27 Apr 2019.  Predictit conveniently gives prices going back a few months, so I could do some Biden-Warren price comparisons of then to when I was writing my post:</p>
<p> 27 Apr  12 Jun<br />
Biden primary election 22 28<br />
Biden general election 17 19<br />
Warren primary election 9 19<br />
Warren general election 4 13</p>
<p>Something weird was going on in April, when Biden&#8217;s price was 22 for the primary and 17 for the general election.  This just can&#8217;t be right, and all I can conclude is that the betting markets here were thin enough that nobody was taking these numbers very seriously.</p>
<p>If you want to take the numbers as is, you&#8217;ll get the following:</p>
<p>27 Apr:  Biden 17/22 = 0.77, Warren 4/9 = 0.44<br />
12 Jun:  Biden 19/28 = 0.68, Warren 13/19 = 0.68.</p>
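<p>The arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the prices are the ones in the table above, the function name <code>implied_electability</code> is made up for illustration, and the calculation takes the raw prices at face value as probabilities.</p>

```python
# Prices (in cents per $1 contract) copied from the table above.
prices = {
    "27 Apr": {"Biden": {"primary": 22, "general": 17},
               "Warren": {"primary": 9, "general": 4}},
    "12 Jun": {"Biden": {"primary": 28, "general": 19},
               "Warren": {"primary": 19, "general": 13}},
}


def implied_electability(general, primary):
    """P(A|B) = P(A) / P(B), treating raw prices as probabilities.

    Hypothetical helper name; it ignores rounding, the vig, and noise.
    """
    return general / primary


for date, candidates in prices.items():
    for name, p in candidates.items():
        ratio = implied_electability(p["general"], p["primary"])
        print(f"{date}  {name}: {p['general']}/{p['primary']} = {ratio:.2f}")
```

<p>Because the result is a ratio of two small, coarsely rounded prices, a one-cent move in either price can shift it by several percentage points.</p>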
<p>These numbers aren&#8217;t quite right, even if you take these betting markets seriously, because of rounding and the vig.  If you add up all the prices on the &#8220;Who will win the 2020 Democratic presidential nomination?&#8221; page, you get something well over 100%.  So you can&#8217;t directly interpret these prices as probabilities, even beyond the issues of bias and noise.</p>
<p>I discussed this with David Rothschild, who thinks a lot about elections and prediction markets (for example, <a href="https://slate.com/news-and-politics/2016/07/why-political-betting-markets-are-failing.html">here</a>), and David responded as follows:</p>
<blockquote><p>People ask me to compute this automatically on my blog, but I refrain, because it is so noisy this early. Here I compute the conditional probability range separately for Betfair and PredictIt, by dividing the seller’s price of win / buyer’s price of nom &#038; buyer’s price of win / seller’s price of nom. Betfair has advantage of being tighter by definition (PredictIt trades on the penny, but Betfair on the odds, which have more depth).</p>
<p>Here is a figure from an old paper I wrote with David Pennock about the 2012 election. As you can see, while informative, it can get quite noisy!</p>
<p><img decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2019/06/image001.jpg" alt="" width="450" /></p></blockquote>
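<p>Here is a rough sketch of the range calculation David describes. I am assuming that "seller's price" means the ask (the higher quote) and "buyer's price" means the bid (the lower quote); that reading, the function name, and the quotes below are all hypothetical, not real market data.</p>

```python
def conditional_range(win_bid, win_ask, nom_bid, nom_ask):
    """Range for P(win | nominated) from bid/ask quotes on two contracts.

    Assumed reading of the description above:
      upper end = seller's (ask) price of win / buyer's (bid) price of nom
      lower end = buyer's (bid) price of win / seller's (ask) price of nom
    """
    lower = win_bid / nom_ask
    upper = win_ask / nom_bid
    return lower, upper


# Hypothetical quotes, chosen only to show how wide the range can get.
low, high = conditional_range(win_bid=0.17, win_ask=0.19,
                              nom_bid=0.26, nom_ask=0.28)
print(f"implied P(win | nominated) is between {low:.2f} and {high:.2f}")
```

<p>Even a two-cent spread on each contract gives a range more than ten percentage points wide in this made-up example, which fits the point about noise.</p>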
<p>Anyway, my point here is not to criticize Mankiw but rather to thank him for putting out this fun example, and then to demonstrate how we can take it further by interrogating each step in the analysis.  Which is how we do applied statistics in general.</p>
<p><strong>P.S.</strong>  In case you&#8217;re curious, based on the numbers when I wrote my post, where Biden&#8217;s implied electability is 19/28 = 0.68 and Warren&#8217;s is 13/19 = 0.68, we can look up Buttigieg.  He was at 9/16 = 0.56, the <em>least</em> electable of the three.  So, no, Bayes did not like Mayor Pete that day.</p>
<p>It&#8217;s a fun example, but when we look at the data more carefully, the original conclusion goes away.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/27/no-bayes-does-not-like-mayor-pete-pitfalls-of-using-implied-betting-market-odds-to-estimate-electability/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title>Survey Statistics: double-plus robustness</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/26/survey-statistics-double-plus-robustness/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/26/survey-statistics-double-plus-robustness/#comments</comments>
		
		<dc:creator><![CDATA[shira]]></dc:creator>
		<pubDate>Tue, 26 May 2026 21:15:53 +0000</pubDate>
				<category><![CDATA[Miscellaneous Statistics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53779</guid>

					<description><![CDATA[Meng (2022) pops up a lot here: &#8220;it is the people&#8221; (the launch of this blog series a year ago !), &#8220;probability samples vs epsem samples vs SRS samples&#8221;, &#8220;divine probabilities&#8221;, and last week&#8217;s &#8220;GREG&#8221;. Like a lot of Meng&#8217;s &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/26/survey-statistics-double-plus-robustness/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://www150.statcan.gc.ca/n1/pub/12-001-x/2022002/article/00006-eng.htm">Meng (2022)</a> pops up a lot here: <a href="https://statmodeling.stat.columbia.edu/2025/06/01/survey-statistics-it-is-the-people/#comment-2398127">&#8220;it is the people&#8221;</a> (the launch of this blog series a year ago !), <a href="https://statmodeling.stat.columbia.edu/2025/12/02/survey-statistics-probability-samples-vs-epsem-samples-vs-srs-samples/#comment-2406745">&#8220;probability samples vs epsem samples vs SRS samples&#8221;</a>, <a href="https://statmodeling.stat.columbia.edu/2025/12/09/survey-statistics-divine-probabilities/">&#8220;divine probabilities&#8221;</a>, and last week&#8217;s <a href="https://statmodeling.stat.columbia.edu/2026/05/19/survey-statistics-greg/#comment-2414811">&#8220;GREG&#8221;</a>. Like a lot of Meng&#8217;s papers, it deserves several rereads.</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53782" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_umbrella_May_2026_DWG-scaled.jpg" alt="" width="308" height="233" /></p>
<p>(The polar bear celebrated the blog series birthday with a rainy hike on the PA AT. Here he is attempting to dry off.)</p>
<p>Let&#8217;s zoom in on the part about the Generalized REGression estimator (that doesn&#8217;t specifically say &#8220;GREG&#8221;). Green annotations are mine:</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53780" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Meng2022_GREG.png" alt="" width="595" height="935" /></p>
<p><a href="https://www150.statcan.gc.ca/n1/pub/12-001-x/2022002/article/00006-eng.htm">Meng (2022)</a>&#8216;s (5.2) is the first way of writing GREG in our post <a href="https://statmodeling.stat.columbia.edu/2026/05/19/survey-statistics-greg/#comment-2414811">&#8220;GREG&#8221;</a>, from <a href="https://link.springer.com/book/9780387406206">Särndal, Swensson, Wretman (1992)</a>:</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53746" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_SSW_p231.png" alt="" width="411" height="176" /></p>
<p>That book goes on to say that GREG often takes a super simple form:</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53781" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_SSW_p.231B.png" alt="" width="472" height="99" /></p>
<p><a href="https://www150.statcan.gc.ca/n1/pub/12-001-x/2022002/article/00006-eng.htm">Meng (2022)</a> doesn&#8217;t mention this as far as I can tell ? Although I think Meng&#8217;s example satisfies the conditions the book <a href="https://link.springer.com/book/9780387406206">Särndal, Swensson, Wretman (1992)</a> goes on to describe: the regression model assumes constant variance and has an intercept.</p>
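<p>To make the intercept condition concrete, here is a minimal simulation sketch. It is my own illustration, not code from Meng or the book. It uses the standard GREG form (the sum of fitted values over the population plus the weighted sum of sample residuals), a straight-line working model fit with weights 1/pi, and Poisson sampling. The population, the inclusion probabilities, and every function name are made up.</p>

```python
import random


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x with weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def greg_total(x_pop, x_s, y_s, pi_s):
    """GREG estimate of the population total of y.

    Returns (estimate, projection, correction), where
    projection = sum of fitted values over the whole population and
    correction = sum over the sample of residual / inclusion probability.
    """
    w = [1.0 / p for p in pi_s]
    b0, b1 = weighted_line_fit(x_s, y_s, w)
    projection = sum(b0 + b1 * xk for xk in x_pop)
    correction = sum(wk * (yk - b0 - b1 * xk)
                     for wk, xk, yk in zip(w, x_s, y_s))
    return projection + correction, projection, correction


# Made-up population: y is roughly linear in an auxiliary variable x.
rng = random.Random(0)
N = 2000
x_pop = [rng.uniform(0, 10) for _ in range(N)]
y_pop = [2.0 + 3.0 * xk + rng.gauss(0, 1) for xk in x_pop]

# Unequal inclusion probabilities (0.05 to 0.35) and Poisson sampling.
pi_pop = [0.05 + 0.03 * xk for xk in x_pop]
sampled = [k for k in range(N) if pi_pop[k] > rng.random()]

estimate, projection, correction = greg_total(
    x_pop,
    [x_pop[k] for k in sampled],
    [y_pop[k] for k in sampled],
    [pi_pop[k] for k in sampled],
)
print("true total:      ", round(sum(y_pop), 1))
print("GREG estimate:   ", round(estimate, 1))
print("correction term: ", round(correction, 6))
```

<p>With an intercept in the model and weights 1/pi in the fit, the weighted residuals sum to zero by the normal equations. The correction term vanishes up to floating-point error and the estimator collapses to the sum of fitted values, which is how I read the simplification.</p>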
<p>Anyways, back to the title of this post. Meng emphasizes that GREG is not only &#8220;double robust&#8221; (consistent if either the outcome model or response model is correct), but &#8220;double-plus robust&#8221; (consistent if what is left of the outcome model and response model are uncorrelated). I&#8217;m interested in the practical implications of this, such as the suggestion to include the estimated response probabilities in the outcome regression model. Thoughts ?</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/26/survey-statistics-double-plus-robustness/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>How much skill is in &#8220;skill games&#8221;?  There can&#8217;t be much.</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/26/how-much-skill-is-in-skill-games-there-cant-be-much/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/26/how-much-skill-is-in-skill-games-there-cant-be-much/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Tue, 26 May 2026 13:46:52 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Sports]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53760</guid>

					<description><![CDATA[A few years ago we posted on luck vs. skill in poker and luck vs. skill in sports. A new one of these came up when Palko pointed me to this disturbing news article, &#8220;They Look Like Slot Machines. They &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/26/how-much-skill-is-in-skill-games-there-cant-be-much/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>A few years ago we posted on <a href="https://statmodeling.stat.columbia.edu/2014/08/14/luck-vs-skill-poker/">luck vs. skill in poker</a> and <a href="https://statmodeling.stat.columbia.edu/2014/06/27/quantifying-luck-vs-skill-sports/">luck vs. skill in sports</a>.</p>
<p>A new one of these came up when Palko pointed me to <a href="https://www.thetrace.org/2026/05/pennsylvania-skill-games-violence-crime/">this disturbing news article</a>, &#8220;They Look Like Slot Machines. They Pay Out in Cash. And Critics Say They Are Getting Workers Killed,&#8221; which reports:</p>
<blockquote><p>Store clerks in Pennsylvania have been robbed and shot while handling payouts for “skill games,” which are not subject to the security standards required of gambling operations. . . .</p>
<p>They look like casino slot machines and video arcade games, but they are neither. They are skill games. Like their name implies, players must use their skills — memory, reflexes, strategy, recognition — to win cash. They don’t solely rely on the luck of the draw, like with slot machines. . . .</p>
<p>The Pennsylvania Gaming Control Board licenses 17 casinos and 75 truck stop video gaming terminal facilities, requiring them to have secure facilities, trained staff, and digital video recording. Their gambling machines also have to be linked to a centralized computer monitoring system. Businesses that offer skill games are not held to any standards, their critics say. As a result, some are putting their employees in danger by having them pay winners with cash. . . .</p></blockquote>
<p>Some gruesome stories follow, along with predictable quotes from evil people making money off these things.</p>
<p><strong>&#8220;Skill games&#8221;?</strong></p>
<p>But here&#8217;s my question.  How much skill is actually in these &#8220;skill games&#8221;?  I assume not much, because, if the games really did involve skill, then skillful players could just show up and win regularly.</p>
<p>I guess the &#8220;skill games&#8221; could involve some small amount of skill, but not enough so that skillful players could beat the house edge.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/26/how-much-skill-is-in-skill-games-there-cant-be-much/feed/</wfw:commentRss>
			<slash:comments>30</slash:comments>
		
		
			</item>
		<item>
		<title>&#8220;The Ten Year Affair&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/25/the-ten-year-affair/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/25/the-ten-year-affair/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Mon, 25 May 2026 13:06:32 +0000</pubDate>
				<category><![CDATA[Literature]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53139</guid>

					<description><![CDATA[I just finished The Ten Year Affair by Erin Somers. The book was excellent, and it reminded me of Banal Nightmare by Halle Butler and the novels of Sally Rooney: a story of Millennials and their friends and spouses, told &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/25/the-ten-year-affair/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I just finished The Ten Year Affair by Erin Somers.  The book was excellent, and it reminded me of Banal Nightmare by Halle Butler and the novels of Sally Rooney:  a story of Millennials and their friends and spouses, told in a deadpan, I&#8217;m-sane-and-everyone-around-me-is-slightly-clueless style, with the plot being that one thing happens and then another thing happens and then another thing happens, and lots of conversations, not always using quotation marks so that the inner monologues and the interpersonal interactions blur together, which makes a lot of sense given that these things are all happening in our heads.</p>
<p>The Ten Year Affair employs a storytelling device also used by Lionel Shriver in The Post-Birthday World and also by whoever wrote that movie, Sliding Doors, with Gwyneth Paltrow.  Somers did it better, though, in two ways.  First, she leans into the reality that both threads of the story are fiction, and her protagonist is aware of the two threads.  This is the right thing to do, because the point of an alternative timeline is not just that it&#8217;s something else that could&#8217;ve happened but also that we&#8217;re aware of the possibility:  that other hypothetical world is always there in the periphery, just outside of reach and informing our actions in the real world.  The second way that Somers did better than Shriver and that screenwriter is that she (Somers) avoids an easy way out.  In The Post-Birthday World and Sliding Doors, the husband or boyfriend is a bad guy, and so, in the logic of the story the heroine is automatically justified in trying to find someone else.  This removes the dramatic tension.  In The Ten Year Affair, the husband is far from perfect, but he&#8217;s not cheating on his wife.  The other thing done well in The Ten Year Affair is that all the characters are flawed.  You&#8217;re seeing things through the eyes of one main character, and early on we get a sense of the goofiness of all the people around her&#8211;indeed, she bonds with one of her friends based on a mutual distaste for a mildly obnoxious third party&#8211;, but, as the book goes on, we see the viewpoint character&#8217;s flaws too.  I like that Somers is willing, in the end, to make that main character as flawed as everyone around her, which I guess fits the there-is-no-escape theme of the book.</p>
<p>As a side note, point of view is done very rigorously in the book, to the extent that you get a sense of what all the characters look like, except for the protagonist, because she doesn&#8217;t need to describe herself, right?  I guess the main thing we learn about the main character, regarding her looks, is that she&#8217;s not particularly insecure about her appearance:  she doesn&#8217;t think she&#8217;s the most beautiful woman out there but she doesn&#8217;t really worry about her looks either.  To me, this sort of rigor contributes to the pleasure of reading a book:  I feel comfortably in the hands of a confident storyteller.</p>
<p>Going back a bit in literary time, The Ten Year Affair is a lot like the novels of John Updike:  various suburban married couples having affairs.  The writing style is different&#8211;Updike is famously lyrical, whereas Somers uses a Millennial flat writing style:  This happens, then This happens, then That happens, etc.  Kind of like Ernest Hemingway or Raymond Carver if they had a sense of humor.</p>
<p>I think Somers does a much better job than Updike in conveying what it feels like to be a parent.  To me, Updike, like Philip Roth, was to the end of his life always a son, never a father.  Updike did have four kids, but I guess his wife did most of the parenting.  Updike&#8217;s characters often have children but always seem to be thinking only about themselves.  Not so much that his adult characters are self-centered&#8211;I mean, yeah, they are, but that&#8217;s kind of the point&#8211;but more that their children don&#8217;t seem to exist at all, except to the extent that they sometimes have to be dealt with as obstacles when they get in the way of the parents.  In contrast, the adults in The Ten Year Affair are very aware of their kids.  In some ways this is similar to Little Children by Tom Perrotta, a book whose entire theme is that these adults are thinking only of themselves and are not shouldering the responsibilities of parenthood.</p>
<p>The children in The Ten Year Affair are real people, but they don&#8217;t come to life as much as, say, the children in the novels of Meg Wolitzer.  Wolitzer achieves an equality across generations that I rarely see in literature; perhaps it has something to do with her being a Boomer, coming from a generation in which kids are central.</p>
<p><strong>P.S.</strong>  I looked up Somers and it turns out she&#8217;s <a href="https://www.thenation.com/article/culture/lorrie-moore-ghosts/">a fan of Lorrie Moore</a>.  Cool!  I&#8217;m a fan of Lorrie Moore too.  Although one thing that always annoys me about Moore is that her stories always seem to feature a main character who is a woman who is a good person and has to deal with asshole guys.  Yeah, I get it, there are a lot of assholes out there, but Moore&#8217;s protagonists are always so clever and thoughtful that I find it frustrating that they keep coming across as victims.  I think Somers does better in this area.  Her <a href="https://newrepublic.com/article/172883/martin-amis-let-readers-joke">appreciation of Martin Amis</a> is good too.  I think Somers should come out with a book of literary essays.  I&#8217;d buy it.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/25/the-ten-year-affair/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Physics fraud compared to fraud and junk science in the social and behavioral sciences</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/24/physics-fraud-compared-to-fraud-and-junk-science-in-the-social-and-behavioral-sciences/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/24/physics-fraud-compared-to-fraud-and-junk-science-in-the-social-and-behavioral-sciences/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sun, 24 May 2026 13:12:03 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Sociology]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53138</guid>

					<description><![CDATA[A few months ago I read this book, Plastic Fantastic: How the Biggest Fraud in Physics Shook the Scientific World, by Eugenie Reich. The book was from 2009, and I&#8217;d never heard of it, or of the case of fraud &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/24/physics-fraud-compared-to-fraud-and-junk-science-in-the-social-and-behavioral-sciences/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>A few months ago I read this book, Plastic Fantastic:  How the Biggest Fraud in Physics Shook the Scientific World, by Eugenie Reich.  The book was from 2009, and I&#8217;d never heard of it, or of the case of fraud it discusses; indeed I don&#8217;t even remember who recommended the book to me, maybe it was a blog commenter?</p>
<p>Anyway, the book was interesting.  Short story is that there was a mediocre young physics Ph.D. who achieved some success by faking data, he got a research job at Bell Labs, faked more data, got more success, eventually got too much success in that people were interested enough in his findings that they tried to use his methods themselves, but the methods never worked.  After a bit more struggle he then got fired.</p>
<p>Compared to the usual glacial pace of such scandals in academia, the whole thing was pretty quick and clean.  From receipt of Ph.D. to getting fired was just a bit over five years.  That might seem like a long time, but it&#8217;s quick compared to the careers of Wansink, Ariely, Wegman, Hauser, etc.&#8211;not to mention the many bullshitters in social and medical science who are still out there.</p>
<p>I think the key difference between this physics case and fraudulent science in cognitive and social science is that a key part of any physics paper is its methods.  If a finding is interesting, and other researchers want to get into the game, they&#8217;ll want to start by replicating the experiment.  If they can&#8217;t do that, they&#8217;ll get upset.  In contrast, if you want to follow up on a political science or economics study, you&#8217;ll look for new data, and if you want to follow up on a psychology experiment on embodied cognition or <a href="https://sites.stat.columbia.edu/gelman/research/published/healing3.pdf">mind-body healing</a> or whatever, you can just do your own experiment from scratch.  There&#8217;s no particular method that you have to use.</p>
<p>So in social science, junk papers like those <a href="https://statmodeling.stat.columbia.edu/2013/01/10/that-controversial-claim-that-high-genetic-diversity-or-low-genetic-diversity-is-bad-for-the-economy/">discussed here</a> or <a href="https://sites.stat.columbia.edu/gelman/research/published/power5r.pdf">here</a> can stay around forever&#8211;even <a href="https://statmodeling.stat.columbia.edu/2025/10/21/reanalysis-of-that-nobel-prizewinning-study-of-patents-and-innovation/">questionable papers</a> written by Nobel prize winners&#8211;because people only care about the results, not the method.</p>
<p>Here&#8217;s another interesting difference.  Those three papers linked in the previous paragraph are not fraudulent, they&#8217;re just bad science, some combination of noisy data, bad theory, bad measurement, and statistical misunderstanding.  Here&#8217;s the relevant principle:  in the social and behavioral sciences, you can get prominent bad results by accident.  In physics, you pretty much have to cheat.  Sure, experimental physicists get fluke results all the time, but then they don&#8217;t replicate and the field moves on.  To really stick the landing on a non-result in physics you have to cheat.</p>
<p>Social scientists who do junk research or even fraud can stay afloat forever, but that cheating physicist was headed for a fall, once his research got some attention.</p>
<p>Also, it seems that this guy was a real asshole.  Not only did he lie, and then lie again to cover up, but later, when it all came out and his university revoked his Ph.D., he sued them.  What kind of person would do that:  defraud the education system and then sue them?</p>
<p>At this point I should step back and think like a statistician.  Some percentage of people are unscrupulous assholes, and some percentage of people are physicists.  Draw the Venn diagram:  you&#8217;ll expect to see some overlap.</p>
<p>In any case, this is all so clean compared to <a href="https://statmodeling.stat.columbia.edu/2026/01/22/aking/">what happens in social science</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/24/physics-fraud-compared-to-fraud-and-junk-science-in-the-social-and-behavioral-sciences/feed/</wfw:commentRss>
			<slash:comments>28</slash:comments>
		
		
			</item>
		<item>
		<title>The “humans are imperfect reporters too” defense for ascribing little thoughts to machines</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/23/the-humans-are-imperfect-reporters-too-defense-for-ascribing-little-thoughts-to-machines/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/23/the-humans-are-imperfect-reporters-too-defense-for-ascribing-little-thoughts-to-machines/#comments</comments>
		
		<dc:creator><![CDATA[Jessica Hullman]]></dc:creator>
		<pubDate>Sat, 23 May 2026 16:12:39 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Sociology]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53762</guid>

					<description><![CDATA[This is Jessica. In my last post about the tension between the necessity that we ascribe human folk psychological concepts like thinking and reasoning to machines and the problems that arise when we overinterpret them, I briefly mentioned a defense &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/23/the-humans-are-imperfect-reporters-too-defense-for-ascribing-little-thoughts-to-machines/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>This is Jessica. In my last post about the tension between the necessity that we ascribe human folk psychological concepts like thinking and reasoning to machines and the problems that arise when we overinterpret them, I briefly mentioned a defense I sometimes hear for anthropomorphic terms. The defense goes like this: We can’t protest the application of words like reasoning or belief or desire or understanding to AI because humans are also not always trustworthy reporters of the latent states that these terms refer to. And if humans can’t demonstrate full awareness of their reasoning process or their beliefs or their understanding or their intentions, why should we expect AI to? So, the argument goes, it is asymmetric to require complete faithfulness of LLM traces to some latent state. That would be setting a higher bar than we currently use with humans.</p>
<p>This line of reasoning takes two things that may resemble each other on the surface (e.g., what humans report their reasoning or belief or intention to be, and what machines report) which we have no solid reason to believe are produced by similar processes, and says they cannot be distinguished because we expect both to be reported with loss.</p>
<p>It’s bad logic. But more than that, it occurs to me that by doing this we indirectly deny the value of our subjective, non-verbal experience. It’s true we may not be able to faithfully report how we reasoned or what exactly we desired or intended or understood. But part of the reason we have terms for these things is because we perceive a distinction between more genuine and less genuine versions of them within ourselves. These qualitative distinctions are how we come to believe that reasoning and belief and intention and desire exist in the first place: because we can perceive them as being authentically present to different degrees in different situations, even if sometimes we seem to be mimicking the real thing rather than really doing it. It’s what licenses researchers to keep trying to study these things in people despite the difficulty of getting an unbiased read. I’m not saying we shouldn’t look for analogous processes in machines. But we should acknowledge that the referent for these terms is grounded in subjective experience. Among humans that has worked fine, because we assume ourselves to share an understanding of what it’s like to have an inner life.</p>
<p>Andrew recently analogized chatbots to being on autopilot in conversations that occur in a meeting, where if you have a rich enough memory bank of observations related to what the speaker is saying, you can spout appropriate conjectures and feedback without having to exert much conscious effort. But I think fake thinking extends to much more than just being on autopilot in conversation. It can seep into all of our decisions, whether about research ideas or methods or life decisions like what job to have or where to live. It’s very tempting to live by patterns, even when it’s not what you feel you actually want.</p>
<p>Our perception that there are more real and more fake versions of our own thinking or reasoning or belief or desire is what makes us feel more “alive” or “awake” in some situations over others. It connects us to the present. Only when we try harder to do these things authentically or recognize the motives that are actually driving us (despite the explanations we might want to assume) do we feel like we are really living, rather than just performing some role. And so our ability to perceive the difference seems intimately connected to the process of understanding ourselves.</p>
<p>From this angle, it is unfortunate how much of the AI rhetoric we’ve come to take for granted (at least in machine learning) — i.e., AI as scientist, AI as decision-maker, etc.— enacts this implicit move of equating human and machine processes by their outputs, and consequently subtly devaluing the role of our internal reality in giving these terms meaning. We take examples from the few domains where language can fully capture reasoning–math, coding–and we reduce all reasoning or thinking or intention to what can be made manifest.</p>
<p>Maybe this helps explain why there’s so much emphasis lately in certain circles (including tech) on being “high agency.” We become more insistent about shaping our external world as we lose appreciation for our internal one.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/23/the-humans-are-imperfect-reporters-too-defense-for-ascribing-little-thoughts-to-machines/feed/</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
			</item>
		<item>
		<title>Differences between crackdowns on dissent now and in the early Cold War period</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/23/differences-between-crackdowns-on-dissent-now-and-in-the-early-cold-war-period/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/23/differences-between-crackdowns-on-dissent-now-and-in-the-early-cold-war-period/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sat, 23 May 2026 13:07:04 +0000</pubDate>
				<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=52559</guid>

					<description><![CDATA[First, the similarities: 1. Government actors are directly threatening both private citizens and government employees to suppress dissenting speech. 2. The attacks on free expression are notable but they&#8217;re still very rare. Most people in this country can still say &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/23/differences-between-crackdowns-on-dissent-now-and-in-the-early-cold-war-period/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>First, the similarities:</p>
<p>1. Government actors are directly threatening both private citizens and government employees to suppress dissenting speech.</p>
<p>2. The attacks on free expression are notable but they&#8217;re still very rare.  Most people in this country can still say what they want in public without threats.</p>
<p>3.  The pressure from the government is coming down against what are perceived to be left-leaning views, and the government is attacking left-leaning individuals and organizations, along with non-political individuals and organizations that the government is portraying as aiding the left.</p>
<p>4.  Joe McCarthy in the 1950s had a lot of similarities with Donald Trump in the past decade.  See <a href="https://statmodeling.stat.columbia.edu/2016/06/08/donald-trump-and-joe-mccarthy/">my post from 2016</a> discussing this point.</p>
<p>Now, some differences:</p>
<p>1.  Back in the Cold War period, there was a direct threat from the left.  The Communist party and related organizations were allied with the Soviet Union, a rival power.  Far-left groups in the U.S. were generally nonviolent (late 1960s and early 1970s aside) but, at least in theory, they advocated extra-legal overthrow of the U.S. government.  Nowadays it is the right that has the violent rhetoric and it is the right that tried to violently overthrow the government.  In the Cold War, there were right-wing organizations, but, setting aside what was happening in the South, they weren&#8217;t attempting to hold power by violent means.</p>
<p>2.  In the mid-twentieth century, the Communist Party ran its own network of libraries and schools.  (For example, <a href="https://statmodeling.stat.columbia.edu/wp-content/uploads/2025/09/McCarthyism_in_the_Suburbs_Quakers_Communists_and_._-_Introduction.pdf">here&#8217;s the story</a> of Mary Knowles, the librarian at the William Jeanes Library in Plymouth Meeting, Pennsylvania.  That was our local library when I was a little kid.  It says in the above-linked article that &#8220;from 1944 until 1948, she’d been the secretary at the Samuel Adams School for Social Studies, a Communist Party–funded school for adult learners in Boston.&#8221;)  In that way, the far left back then is similar to the far right today, with alternative infrastructure, alternative historical narratives, etc.  Back then there was the Daily Worker, now there&#8217;s Fox News.</p>
<p>3.  From the other direction, there are many institutions that were on the right back in the 1940s and 50s but are in the political center now, such as the Roman Catholic Church and much of big business.  There have been many political shifts, organized labor is much much less of a thing, the ultra-rich are loud and powerful in a way that, with rare exceptions, they were not during the Cold War period, etc.</p>
<p>I have <a href="https://statmodeling.stat.columbia.edu/2024/10/29/props-to-the-liberal-anticommunists-of-the-1930s-1950s/">a lot of sympathy</a> for the liberal anticommunists of the 1930s-1950s:  these are people who took a lot of flak from the left and the right at the time and don&#8217;t get so much respect now, but in retrospect I admire their willingness to take a political hit to fight authoritarianism within their own party.  I don&#8217;t have much sympathy for McCarthy etc. who just made stuff up and lied about people, also there&#8217;s a difference between kicking extremists out of a political party and trying to control the speech and political activities of private citizens and government employees.</p>
<p>So what&#8217;s the point of all this comparison?  Just that in many ways what&#8217;s happening now with our authoritarians in power looks like what was happening in the 1950s (but with much less constraint now, in large part because one faction of one party controls all three branches of government), but with the big difference that now there is a large, well-organized, and well-funded set of parallel institutions on the right&#8212;not just the Republican party, but far-right news organizations, street gangs, lawyers and judges, political donations, etc.&#8212;as compared to the 1950s when it was the far left and labor unions that had powerful alternative institutions.</p>
<p>I&#8217;m not sure what this all implies; I just think that the similarities are easy to see but the differences are important too.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/23/differences-between-crackdowns-on-dissent-now-and-in-the-early-cold-war-period/feed/</wfw:commentRss>
			<slash:comments>51</slash:comments>
		
		
			</item>
		<item>
		<title>Don&#8217;t cite sources you haven&#8217;t read, and don&#8217;t trust when people claim to be reporting something from the literature.</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/22/dont-cite-sources-you-havent-read-and-dont-trust-when-people-claim-to-be-reporting-something-from-the-literature/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/22/dont-cite-sources-you-havent-read-and-dont-trust-when-people-claim-to-be-reporting-something-from-the-literature/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Fri, 22 May 2026 13:32:32 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53077</guid>

					<description><![CDATA[Peter Dorman writes: In case you haven&#8217;t seen it, check out this recent piece in Rolling Stone. A key paragraph toward the end: Craig Callender, a philosophy professor at the University of California San Diego and president of the Philosophy &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/22/dont-cite-sources-you-havent-read-and-dont-trust-when-people-claim-to-be-reporting-something-from-the-literature/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Peter Dorman writes:</p>
<blockquote><p>In case you haven&#8217;t seen it, check out <a href="https://archive.ph/0Iuo6">this recent piece</a> in Rolling Stone.  A key paragraph toward the end:</p>
<blockquote><p>Craig Callender, a philosophy professor at the University of California San Diego and president of the Philosophy of Science Association, agrees with that assessment, observing that “the appearance of legitimacy to non-existent journals is like the logical end product of existing trends.” There are already journals, he explains, that accept spurious articles for profit, or biased ghost-written research meant to benefit the industry that produced it. “The ‘swamp’ in scientific publishing is growing,” he says. “Many practices make existing journals [or] articles that aren’t legitimate look legitimate. So the next step to non-existent journals is horrifying but not too surprising.”</p></blockquote>
<p>The one point I [Peter] would add is that the first step to perdition is citing sources you haven&#8217;t actually read or at least looked through yourself, relying instead on what other people said about them, or simply that they were used as citations in other articles.  I know I&#8217;ve been tempted to do this sometimes because it takes extra time to do it right, and I may be in a hurry, but then I manage to impose a little integrity on myself.  After you&#8217;ve taken the first step toward &#8220;blind&#8221; citation, however, everything else becomes possible &#8212; and now even likely.</p></blockquote>
<p>This reminds me of something that my adviser once told me, that he made a principle of never putting his name on a document that he hadn&#8217;t read.</p>
<p>As to citing sources, yeah, I agree with Dorman 100%.  You should only cite sources that you, or one of your coauthors, have read or have at least looked through.</p>
<p>A related point:  When you&#8217;re reading a paper and going through its references, you will sometimes find that the paper described the reference inaccurately.  As we wrote in Section 3 of <a href="https://sites.stat.columbia.edu/gelman/research/published/healing3.pdf">this paper</a>:</p>
<blockquote><p>Unreplicable claims based on weak theory can gain apparent support by connections to related published work. Three problems can arise.</p>
<p>First, the connections between the cited literature and the new study can be tenuous, and this can particularly be an issue when the underlying theory is vague. Ideas such as embodied cognition, evolutionary psychology, nudging, mindfulness, or mind-body unity are general enough to encompass a wide range of potential phenomena, to the extent that there is almost no limit to the past studies that could be thought to have some possible relevance to any new experiment.</p>
<p>Second, informal literature reviews are subject to selection bias. An article promoting a controversial idea can easily cite studies claiming to have found evidence for related ideas, while avoiding citations of failed replications or papers suggesting alternative theories. This can even be a problem with systematic meta-analyses, if the entire subfield being meta-analyzed is full of studies with uncontrolled researcher degrees of freedom.</p>
<p>Third, the interpretation of individual studies being cited can be seriously flawed. This is a problem of citing past literature as support for a general claim without looking at exactly what was done in the cited research and without following up on that work. Here we discuss three different examples of this sort of misinterpretation of the literature cited in the paper under discussion. . . .</p></blockquote>
<p>So, yeah, don&#8217;t trust what someone writes about a cited study.  Take a look at it yourself.</p>
<p><strong>P.S.</strong>  I wrote this post several months ago.  It&#8217;s just a coincidence that it happened to show up the same week as <a href="https://statmodeling.stat.columbia.edu/2026/05/20/what-do-i-think-about-that-proposed-arxiv-policy-to-ban-authors-of-papers-with-ai-slop/">What do I think about that proposed Arxiv policy to ban authors of papers with AI slop?</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/22/dont-cite-sources-you-havent-read-and-dont-trust-when-people-claim-to-be-reporting-something-from-the-literature/feed/</wfw:commentRss>
			<slash:comments>18</slash:comments>
		
		
			</item>
		<item>
		<title>Full day Stan tutorial at Modern Modeling Methods (M3) this summer in New York (22 June 2026)</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/21/full-day-stan-tutorial-at-modern-modeling-methods-m3-this-summer-in-new-york-22-june-2026/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/21/full-day-stan-tutorial-at-modern-modeling-methods-m3-this-summer-in-new-york-22-june-2026/#comments</comments>
		
		<dc:creator><![CDATA[Bob Carpenter]]></dc:creator>
		<pubDate>Thu, 21 May 2026 19:00:23 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Stan]]></category>
		<category><![CDATA[Teaching]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53756</guid>

					<description><![CDATA[This post is from Bob Mitzi Morris and Bob Carpenter, two of Stan&#8217;s developers, will be presenting a tutorial on Stan and Bayesian data analysis aimed at psychometricians this summer. Modern Modeling Methods Conference (M3), Fordham University Lincoln Center Campus, &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/21/full-day-stan-tutorial-at-modern-modeling-methods-m3-this-summer-in-new-york-22-june-2026/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><I><b>This post is from Bob</b></I></p>
<p>Mitzi Morris and Bob Carpenter, two of Stan&#8217;s developers, will be presenting a tutorial on Stan and Bayesian data analysis aimed at psychometricians this summer.</p>
<ul>
<li><a href="https://modeling.fordham.edu">Modern Modeling Methods Conference (M3)</a>, Fordham University Lincoln Center Campus, Manhattan, June 22&#8211;24
<li><a href="https://modeling.fordham.edu/2026-modern-modeling-methods-conference/modern-modeling-methods-2026-preliminary-program/">Program here</a>
</ul>
<p><b>Abstract</b></p>
<p>This workshop is a full day, hands-on introduction to Bayesian modeling and statistical inference using the probabilistic programming language Stan.</p>
<p>The course will be organized around the key properties of Bayesian statistical modeling for science, including the nature of uncertainty, modeling a generative process through a data generating distribution, modeling existing knowledge through a prior, and pushing uncertainty through inference. As we do this, we will show how Stan can be used to both code the models and perform statistical inference for quantities of interest, be they retrospective parameter estimates or prospective predictions or forecasts. We will concentrate on full Bayesian posterior inference, including a discussion of calibration, model checking for both prior and posterior inference, and model comparison with cross-validation. We will spend some time showing how some structural equation models (SEM) can be translated directly to Stan and will also introduce psychological models for educational testing, crowdsourcing, rating and ranking, and real-time decision processes.</p>
<p>This class will require a notebook computer with a network connection (Wifi will be available in the classroom). We will use the <a href="https://stan-playground.flatironinstitute.org">Stan Playground</a>, which runs Stan in the browser, which we will pre-populate with models of interest. We will probably also break into R or Python at various points to demonstrate methods not yet supported by the Playground, such as the brms regression expression language.</p>
<p><b>Andrew on psychometrics</b></p>
<p>Andrew once told me that any model you could come up with was probably invented by a psychometrician 50 years ago (make that 60&#8212;he said it at least 10 years ago).  I have evidence that he&#8217;s right from the project that drew me into Bayesian statistics&#8212;crowdsourcing.  Andrew and Jennifer Hill helped me formulate a crowdsourcing model where raters give you noisy measurements of underlying categorical variables (e.g., they answer survey questions about whether a word in context is a noun, to use something I was working on at the time).  Turns out Phil Dawid and A.P. Skene published the same model in 1979 in one of the earlier applications of the expectation maximization (EM) algorithm and they used natural language data (drawn from medical records).</p>
<p><b>The rest of the conference</b></p>
<p>The rest of the program looks really great&#8212;it&#8217;s just the kind of applied wrestling with real data that I like.</p>
<p>Speaking of Jennifer Hill, she&#8217;s one of the keynote speakers at M3.  Every talk of Jennifer&#8217;s I&#8217;ve attended has been great.  You may know her as Andrew&#8217;s co-author on the regression books, which I cannot recommend highly enough if you&#8217;re interested in this kind of applied modeling.  </p>
<p><b>Beyond the conference</b></p>
<p>You see the same kind of Bayesian modeling focus for real data at venues such as ISEC (international ecology conference which I went to once just because I like these models and these kinds of conferences), StanCon (see you in Uppsala in August!), and GeoMed (which Mitzi attends).  I&#8217;m sure there are more in other fields.  I&#8217;m always disappointed that there&#8217;s almost nothing like these kinds of nitty-gritty applied papers at ISBA (Nagoya this summer) or BayesComp (somewhere next year). The conferences about Bayesian statistics or computing that I&#8217;ve been to have all been super theoretical.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/21/full-day-stan-tutorial-at-modern-modeling-methods-m3-this-summer-in-new-york-22-june-2026/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>&#8220;The Quick Fix:  Why Fad Psychology Can&#8217;t Cure Our Social Ills&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/21/the-quick-fix-why-fad-psychology-cant-cure-our-social-ills/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/21/the-quick-fix-why-fad-psychology-cant-cure-our-social-ills/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Thu, 21 May 2026 13:55:43 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53728</guid>

					<description><![CDATA[I heard about the above-titled book by science journalist Jesse Singal when it came out, actually before it came out, as the author had talked with me about some of the topics in the book and had run some passages &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/21/the-quick-fix-why-fad-psychology-cant-cure-our-social-ills/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I heard about the above-titled book by science journalist Jesse Singal when it came out, actually before it came out, as the author had talked with me about some of the topics in the book and had run some passages by me to check before publication.  But I only happened to read it recently.  As I often say with this kind of book, I&#8217;m not the intended audience; still, I find it interesting to see what people have to say about the replication crisis and related problems with public science.</p>
<p>One thing I like about this book is its modesty:  it doesn&#8217;t push any big theories, instead going through several examples of hyped psychology research that found its way into public attention and policy.  The examples are &#8220;self-esteem,&#8221; &#8220;superpredators,&#8221; &#8220;power pose,&#8221; &#8220;positive psychology,&#8221; &#8220;grit,&#8221; &#8220;implicit bias,&#8221; &#8220;social priming,&#8221; and &#8220;nudging.&#8221;  I&#8217;m putting all these terms in scare quotes because they&#8217;re slogans as much as anything else.</p>
<p>I don&#8217;t want to say that Singal offers no big-picture perspective.  His larger theme, beyond his discussion of how shaky ideas can get publicity and influence, is that this sort of push-a-button-change-your-life version of psychology is appealing as a way for people to make sense of their lives in a time of uncertainty:</p>
<blockquote><p>The world is a big and scary place, and imposing structures often circumscribe our room to maneuver.  That&#8217;s why the reductive storytelling of grit is so appealing; that&#8217;s why we turn to figures like Martin Seligman for positive thinking that can shield us from trauma, or Amy Chua for parenting advice, or Angela Duckworth for the secret to grittiness.  There&#8217;s always the idea of the side entrance; there&#8217;s always something you, the individual, can do to regain control in a world that sometimes seems hell-bent on robbing you of it.</p></blockquote>
<p>Well put.  The question, then, is what&#8217;s new about this?  Dale Carnegie&#8217;s &#8220;How to Win Friends and Influence People&#8221; came out in 1936, has sold a zillion copies since then, and remains in print.  Before that there was Christian Science and all sorts of self-help movements.  One thing that distinguishes &#8220;grit,&#8221; &#8220;nudge,&#8221; and &#8220;positive psychology&#8221; from their predecessors in the bestseller list and the pulpit is that these more recent ideas are coming from academia and have elite media endorsements.  Some of this could just be that the academic world is much bigger than it used to be.  Just as many people who in the past would have been professional novelists now support themselves as university teachers, so it could be that professors such as Duckworth, Sunstein, etc., would in the past have been freelance authors or speakers.  Indeed, the recent rise of celebrity podcasters has perhaps taken us back to the Dale Carnegie era.</p>
<p>Some of this came up in our <a href="https://statmodeling.stat.columbia.edu/2025/12/08/my-new-class-this-spring-pols-4280-rationalizing-the-world-the-hopes-and-disappointments-of-american-social-science-from-1900-to-the-present/">Rationalizing the World course</a>:  the idea that intellectual movements can come from academic social science or from the popular or commercial world.  It may be that the &#8220;Quick Fix&#8221; era of 2010-2015&#8211;the high-water mark of Psychological Science, PNAS, Ted, and NPR&#8211;was a transient period.  </p>
<p>The celebrity-academic-social-science era took off in 2005 with Freakonomics and ended in 2020 with Covid.  If ever there was a time for society to cash the check  of the promise of academic social science for society, it was the pandemic.  And academic social science pretty much failed the test.  <em>Science</em> didn&#8217;t fail&#8211;vaccines were rapidly developed, tested, and implemented&#8211;but social science didn&#8217;t do so well, and the authority of science in society failed, as evidenced by vaccine deniers <a href="https://statmodeling.stat.columbia.edu/2026/04/28/what-bothered-me-with-the-conversation-of-jay-bhattacharya-and-emily-oster/">taking over</a> our public health establishment.</p>
<p>The thing that really struck me about Singal&#8217;s book, though, was how it seems like a product of a much more innocent time than today.  The book came out in 2021 so it was finished in 2020, but I wonder if much of it was written a few years before that.  I say this for two reasons.  First, he was so gentle on many of the researchers.  He keeps giving them the benefit of the doubt.  He even had good things to say about <a href="https://statmodeling.stat.columbia.edu/2020/08/03/kinda-like-reeses-pieces-if-you-dont-like-chocolate-and-you-dont-like-peanut-butter/">the notorious</a> Roy Baumeister!</p>
<p>Second, Singal&#8217;s political take on junk science was so mellow.  Yes, he considers it to be an important problem&#8211;he did write a whole book on the topic!&#8211;and he makes the political point that supposed &#8220;one quick trick&#8221; tweaks (or as Brian &#8220;Pizzagate&#8221; Wansink would say, <a href="https://statmodeling.stat.columbia.edu/2018/09/23/tweeking-big-problem-not-think/">&#8220;tweeks&#8221;</a>) can be viewed as a distraction from serious political reform, in the same way that do-it-yourself environmentalism could be a substitute for real action on global warming.  As Singal says, one of the problems with the &#8220;quick fix&#8221; mentality is that it places the burden of societal problems on individuals, which might not be such a bad idea, except that these quick fixes don&#8217;t work.  There are also some logical issues that we&#8217;ve discussed which he doesn&#8217;t get into, such as the <a href="https://sites.stat.columbia.edu/gelman/research/published/piranha_published.pdf">piranha problem</a> that all these purportedly large effects can&#8217;t coexist and the competition problem that, even if something like &#8220;power pose&#8221; worked as advertised, it won&#8217;t help you get that job if your competitors are power posing too.</p>
<p>OK, yeah, so there&#8217;s some political content.  But, as of 2017 or 2020 or whenever most of the book was written, there was less of a sense that the political stakes were high.  Yes, there was political polarization; yes, the 2016 election featured two unpopular candidates; etc.  But it wasn&#8217;t like now, where the government is shooting protesters on the street, shooting people in boats, starting wars, etc.  Not to mention government science policy:  global-warming denial, vaccine denial, and public health authorities that are such a joke that the best thing that even their supporters can say about the new dietary guidelines is that they&#8217;re &#8220;not crazy&#8221;&#8211;yeah, that was <a href="https://statmodeling.stat.columbia.edu/2026/01/08/the-soft-bigotry-of-low-expectations/">supposed to be</a> a positive reaction.</p>
<p>Now, partisans of different sides will have different reactions here.  The Democratic take is that the Trump administration is a bunch of corrupt liars.  The Republican take is that the science establishment has been in the tank for the Democrats for a while now, and there&#8217;s no good alternative but to shake things up:  it&#8217;s too bad about the stupid dietary guidelines, but the only way forward is to start over and show the public some respect.</p>
<p>My point here is not to try to resolve this political debate, but rather to emphasize that it exists, and to a much greater extent now than six or ten years ago.</p>
<p>One way to see this is to compare Singal&#8217;s book, The Quick Fix, to the podcast <a href="https://statmodeling.stat.columbia.edu/2026/05/17/if-books-could-kill-podcast/">If Books Could Kill</a> by Michael Hobbes and Peter Shamshiri.  In content, Singal&#8217;s chapter on &#8220;grit&#8221; is very similar to Hobbes and Shamshiri&#8217;s recent episode on the topic&#8211;but the tone is much different.  Part of this can be attributed to politics&#8211;Singal is center-left, while Hobbes and Shamshiri are far-left (at least, in the U.S. context)&#8211;and part could be just differences in personality:  maybe Singal is just a mellower person.  But I don&#8217;t think it&#8217;s just that, and I say this in part because I think Singal really is bothered by science hype.  I think it&#8217;s mostly that the stakes are higher now.  As Hobbes and Shamshiri say, the &#8220;grit&#8221; phenomenon is mostly played out, but educational fads haven&#8217;t gone away.  (A weakness of If Books Could Kill is that it&#8217;s so politically slanted that even when the hosts are getting all the facts right, they can be missing important context, for example not talking about various liberal-leaning educational fads of recent decades.)  But the issue here isn&#8217;t really &#8220;grit&#8221; or the associated publicity campaigns or even education policy; it&#8217;s more the sense from people on both sides that the stakes are high.</p>
<p>I think it would be harder for a journalist like Singal to write a book like The Quick Fix with such a mild tone today.</p>
<p>Not that the replication crisis has gotten worse&#8211;it&#8217;s my impression that in most ways the promotion of junk science has diminished.  Gladwell, Freakonomics, and Ted are still around, but I think they take up a smaller fraction of the public bandwidth nowadays; NPR and other journalistic outlets seem more likely to approach new claims with skepticism; even the Association for Psychological Science has reformed a bit, finally publishing <a href="https://statmodeling.stat.columbia.edu/2025/08/31/thank-you-perspectives-on-psychological-science-for-finally-getting-your-act-together/">my letter</a> refuting their earlier-published lies about me.  Nobody&#8217;s calling us terrorists or Stasi anymore, either.</p>
<p>So, yeah, I do think social-science hype has diminished.  But, outside of academia and the news media, bad things are happening:  vaccine denial and all the rest.  And I think it&#8217;s hard to write about problems in science without all that casting a shadow.  The Quick Fix was written in a more innocent time.</p>
<p>To put it another way, examples such as Stanford&#8217;s Andrew Huberman shilling supplements or Harvard&#8217;s Cass Sunstein partying with Henry Kissinger may have in the past seemed like peccadillos, somewhat relevant to the exploration of bad judgment by credentialed scholars but not so important to the big picture.  But now that the manufacturers of supplements have taken over the public health authorities and the modern-day Kissingers are starting new wars, all this takes a more serious turn.  I&#8217;m not trying to blame Huberman or Sunstein for our government&#8217;s misdeeds (or for the misdeeds of the Biden administration, for that matter); I&#8217;m just saying that the corruption of science and the science media is concerning now on a new level beyond what it was as of the writing of The Quick Fix, and I think that any new book on unreplicable science would have to address these larger issues.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/21/the-quick-fix-why-fad-psychology-cant-cure-our-social-ills/feed/</wfw:commentRss>
			<slash:comments>19</slash:comments>
		
		
			</item>
		<item>
		<title>What do I think about that proposed Arxiv policy to ban authors of papers with AI slop?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/20/what-do-i-think-about-that-proposed-arxiv-policy-to-ban-authors-of-papers-with-ai-slop/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/20/what-do-i-think-about-that-proposed-arxiv-policy-to-ban-authors-of-papers-with-ai-slop/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Wed, 20 May 2026 13:23:55 +0000</pubDate>
				<category><![CDATA[Literature]]></category>
		<category><![CDATA[Sociology]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53750</guid>

					<description><![CDATA[Tim First writes: I’m curious what your thoughts are on the new arXiv policy that authors will be banned for a year if their paper includes mistakes due to the use of AI. My (uninformed) thoughts: 1. arXiv is acting &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/20/what-do-i-think-about-that-proposed-arxiv-policy-to-ban-authors-of-papers-with-ai-slop/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Tim First writes:</p>
<blockquote><p>I’m curious what your thoughts are on <a href="https://x.com/tdietterich/status/2055000956144935055">the new arXiv policy</a> that authors will be banned for a year if their paper includes mistakes due to the use of AI. </p>
<p>My (uninformed) thoughts:<br />
1. arXiv is acting more like a publisher here. Interesting, and probably a good thing given how it’s used.<br />
2. In an ideal world, every author on a paper would be intimately involved with the writing of the paper and be familiar with every citation. But that’s not realistic in many fields! Some papers have dozens of authors on them, and only a few of those authors are directly involved in the writing of the paper.<br />
3. I understand that misconduct in research is an issue and that we should hold academics to high standards, but I worry this will have a chilling effect on collaboration. If someone reaches out to you to collaborate, the floor for your effort has gone way up.</p>
<p>What are your thoughts?</p></blockquote>
<p>First off, I&#8217;ve been told that if this policy were applied, it would only apply to the computer science part of Arxiv.  CS is already out of control with publishing:  there are a zillion Ph.D. students, each of whom is pressured to publish something like 8 conference papers a year, and I guess a lot of these go to Arxiv.  And remember <a href="https://statmodeling.stat.columbia.edu/2025/05/19/if-only-arxiv-required-researchers-to-sign-at-the-top-rather-than-the-bottom-of-the-page-none-of-this-wouldve-happened/">this story</a> from last year?</p>
<p>Setting that aside, I guess that this ban could happen to me, if I collaborate with somebody who inserts AI slop into a paper.  I already know people who tell me they use AI to write emails for them, and it doesn&#8217;t seem so much different for this to be used in text for a research article.  Maybe the next step is to require coauthors to sign a form attesting that they have added no slop (&#8220;inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content&#8221;).  On one hand this sounds pretty extreme; on the other hand, the actual examples given by the Arxiv guy (&#8220;hallucinated references, meta-comments from the LLM (&#8216;here is a 200 word summary; would you like me to make any changes?&#8217;; &#8216;the data in this table is illustrative, fill it in with the real numbers from your experiments&#8217;)&#8221;) seem like they should be catchable.  The big concern would be if there&#8217;s a jointly-authored article and the AI-slop-inserter happens to be the last person to have edited it before submitting to Arxiv.</p>
<p>That all said, this could well be a better alternative than the current situation where I guess they&#8217;re getting lots of slop.  Fake citations <a href="https://arxiv.org/abs/2605.07723">are a thing</a>.  The Arxiv people are doing their work for free, or close to free, so if they&#8217;re getting overwhelmed, they have to do something.  The complainers on the above-linked twitter thread seem to miss the point in that they&#8217;re implicitly evaluating the policy based on its potential costs without balancing against the benefits.</p>
<p>First responded:</p>
<blockquote><p>I similarly wondered if requiring some coauthors to sign something could help. Maybe a single author on each paper should declare themselves responsible for the contents of the paper and take the blame if the paper has slop in it? I think that would undermine most of the concerns around collaborations.</p></blockquote>
<p>I guess the simplest rule would be for the first author to take responsibility by default.  Setting aside chatbots and all the rest, there&#8217;s a problem in joint-authored papers with plain old incompetence or fraud.  Consider the unfortunate case of Dan Ariely, a scrupulously honest researcher who has had <a href="https://statmodeling.stat.columbia.edu/2021/08/19/a-scandal-in-tedhemia-noted-study-in-psychology-first-fails-to-replicate-but-is-still-promoted-by-npr-then-crumbles-with-striking-evidence-of-data-fraud/">the misfortune</a> to repeatedly be coauthor on papers involving fake data.  Multiple authors and nobody takes responsibility.</p>
<p>I also asked Arxiv founder Paul Ginsparg, who replied:</p>
<blockquote><p>My thoughts are more along the lines of what do we do in three months when<br />
(a) the median LLM-produced cs paper is better than can be produced by the median cs grad student.<br />
and what do we do in six months when<br />
(b) an elementary school student produces a paper with a legitimately important theoretical result in physics, but sans LLM can&#8217;t explain a word of it.</p>
<p>For expanded version of these thoughts, <a href="https://www.cs.cornell.edu/~ginsparg/PG_Colloq_23Feb26.mp4">here&#8217;s a recording</a> of a recent colloq I gave that includes arXiv data on rejections.</p></blockquote>
<p>Here&#8217;s the abstract of Ginsparg&#8217;s talk, &#8220;The Rise of Slop&#8221;:</p>
<blockquote><p>The rapid proliferation of AI-generated text, figures, code, and reviews is reshaping the ecology of research communication. AI has made it cheap to produce text that looks like knowledge. The result is an emerging deluge of &#8220;slop&#8221;: fluent, confident, and often content-light material that strains the filters of research and academic publishing. While generative tools promise efficiency, accessibility, and assistance, they also undermine the fragile social contracts on which scholarship depends: authorship, attribution, review, and trust. This talk argues that the core danger of AI-generated slop is not isolated fraud or error, but volume: content produced faster than it can be meaningfully evaluated. The rise of slop forces a choice: retrofit our scholarly systems for an age of cheap language, or watch signal drown in noise.</p>
<p>[Disclaimer: this abstract was 100% LLM-generated.]</p></blockquote>
<p>I&#8217;ll conclude by saying that we&#8217;re proceeding on two tracks.  Or maybe I should say there are two worlds.</p>
<p>In one world, we&#8217;re doing research as before, using computers as tools, trying to understand what&#8217;s going on, and writing up our results as research papers.  For example, just the other day I met with some students on our project of agent-based modeling of coalition formation (a followup of <a href="https://sites.stat.columbia.edu/gelman/research/published/blocs.pdf">this research</a> from 2003).  We have some results, we&#8217;re trying to figure out where to go next, and we&#8217;re writing a paper together.</p>
<p>In the other world, students are pressured to whip out a high volume of papers and get some of them accepted at top conferences and journals.  You improve your odds by submitting more things, and you can &#8220;do a <a href="https://statmodeling.stat.columbia.edu/2011/04/22/arrows_other_th/">Bruno Frey</a>&#8221; and publish the same idea five times.  If that&#8217;s the way you wanna go, a chatbot can be really helpful!  Not just to refactor your code or clean up your writing, but to create those many many papers that you&#8217;re planning to submit to all those places.</p>
<p>It&#8217;s weird that these two worlds are coexisting.  Kind of like how, in publishing, there&#8217;s a world of people writing and publishing books that other people want to read, and another world of fake books being spammed on Amazon to catch the suckers.  Or how we get a mix of real blog comments, spam, and those weird intermediate cases where the commenter seems to actually have something to say, but then the url is some spam link.  Usually I delete those, but sometimes I keep them and just strip out the URL and email address before posting.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/20/what-do-i-think-about-that-proposed-arxiv-policy-to-ban-authors-of-papers-with-ai-slop/feed/</wfw:commentRss>
			<slash:comments>36</slash:comments>
		
		<enclosure url="https://www.cs.cornell.edu/~ginsparg/PG_Colloq_23Feb26.mp4" length="110420275" type="video/mp4" />

			</item>
		<item>
		<title>Survey Statistics: GREG</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/19/survey-statistics-greg/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/19/survey-statistics-greg/#comments</comments>
		
		<dc:creator><![CDATA[shira]]></dc:creator>
		<pubDate>Tue, 19 May 2026 21:33:09 +0000</pubDate>
				<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53741</guid>

					<description><![CDATA[I just got to chat with Andrew and some of the authors of the MrPlew paper: Ryan Giordano, Erin Hartman, and Avi Feller. Lots more I have to digest here ! The paper came out while the polar bear and I &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/19/survey-statistics-greg/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I just got to chat with Andrew and some of the authors of <a href="https://statmodeling.stat.columbia.edu/2026/05/18/mrplew-locally-equivalent-weights-for-multilevel-regression-and-poststratification/">the MrPlew paper</a>: Ryan Giordano, Erin Hartman, and Avi Feller. Lots more I have to digest here ! The paper came out while the polar bear and I were crossing from TN into VA.</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53748" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_TN_VA_border_May_2026-scaled.jpg" alt="" width="347" height="262" /></p>
<p>We talked about using <a href="https://statmodeling.stat.columbia.edu/2025/06/17/survey-statistics-3-flavors-of-survey-weights/">a model for response R, a model for outcome Y, or both</a>. So GREG came up, and Andrew asked &#8220;what&#8217;s GREG ?&#8221; Good question.</p>
<p>GREG is Generalized REGression estimator. <a href="https://link.springer.com/book/9780387406206">Särndal, Swensson, Wretman (1992)</a> has a nice section that writes it in a few alternative ways:</p>
<p>1. Adjust an estimate based on the model with a Horvitz-Thompson estimate of the error:</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53745" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_SSW_p230.png" alt="" width="403" height="88" /></p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53746" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_SSW_p231.png" alt="" width="411" height="176" /></p>
<p>2. Or on the flip side, you can see it as adjusting the Horvitz-Thompson estimate with the model:</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53747" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_SSW_p232.png" alt="" width="438" height="43" /></p>
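<p>To see that the two forms give the same number, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the population, the inclusion probabilities, the linear working model with constant variance, and all variable names. It is not the notation or the exact estimator from the book pages above.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite population: x is known for every unit, y only for sampled units.
N = 2000
x_pop = rng.uniform(0, 10, size=N)
X_pop = np.column_stack([np.ones(N), x_pop])
y_pop = 3.0 + 2.0 * x_pop + rng.normal(0, 2, size=N)

# Unequal-probability (Poisson) sampling with inclusion probabilities pi_k.
pi = 0.02 + 0.01 * x_pop
in_sample = rng.binomial(1, pi).astype(bool)
X_s, y_s = X_pop[in_sample], y_pop[in_sample]
d_s = 1.0 / pi[in_sample]  # design weights

# Design-weighted least squares fit of the working model.
XtDX = X_s.T @ (d_s[:, None] * X_s)
beta_hat = np.linalg.solve(XtDX, X_s.T @ (d_s * y_s))

# Form 1: population total of model predictions,
# plus a Horvitz-Thompson estimate of the total residual.
greg_form1 = (X_pop @ beta_hat).sum() + np.sum(d_s * (y_s - X_s @ beta_hat))

# Form 2: Horvitz-Thompson estimate of the y total,
# plus a model-based adjustment for the imbalance in x.
ht_y = np.sum(d_s * y_s)
ht_x = X_s.T @ d_s
greg_form2 = ht_y + (X_pop.sum(axis=0) - ht_x) @ beta_hat

print("true total:", y_pop.sum())
print("HT:", ht_y)
print("GREG form 1:", greg_form1)
print("GREG form 2:", greg_form2)
```

<p>The two GREG lines agree up to floating-point error because expanding form 1 and regrouping the terms gives form 2; the difference is only in which piece you think of as the starting estimate and which as the correction.</p>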
<p>It&#8217;s called GREG for <strong><em>Generalized </em></strong>REGression estimator,<strong> what is being generalized ?</strong></p>
<p><a href="https://onlinelibrary.wiley.com/doi/book/10.1002/9780470580066">Lumley 2010</a> made me think we were generalizing to continuous X variables:</p>
<p><img loading="lazy" decoding="async" class="" src="https://www.wiley.com/storefront-pdp-assets/_next/image?url=https%3A%2F%2Fmedia.wiley.com%2Fproduct_data%2FcoverImage300%2F07%2F04702843%2F0470284307.jpg&amp;w=640&amp;q=75" alt="Preview" width="185" height="284" /></p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53743" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_Lumley_p141.png" alt="" width="443" height="214" /></p>
<p><a href="https://www.sharonlohr.com/sampling-design-and-analysis-3e">Sharon Lohr’s book</a> made me think we were generalizing beyond simple random samples:</p>
<p><img loading="lazy" decoding="async" class="" src="https://images.squarespace-cdn.com/content/v1/5b7f148eec4eb7ee4ea24591/1628531619965-8HZT50HHRWEHVTBYK88X/Sampling3ecover125.png" alt="Sampling Design and Analysis: Third Edition — Sharon Lohr" width="172" height="252" /></p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53742" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_Lohr_p598.png" alt="" width="411" height="509" /></p>
<p><a href="https://link.springer.com/book/9780387406206">Särndal, Swensson, Wretman (1992)</a> made me think we were generalizing to multiple X  variables:</p>
<p><img loading="lazy" decoding="async" class="" src="https://m.media-amazon.com/images/I/611nI4RUKyL._AC_UF1000,1000_QL80_.jpg" alt="Amazon.com: Model Assisted Survey Sampling (Springer Series in Statistics): 9780387406206: Särndal, Carl-Erik, Swensson, Bengt, Wretman, Jan: Books" width="168" height="251" /></p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53744" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/GREG_SSW_p225.png" alt="" width="438" height="357" /></p>
<p>Regardless of the exact origin of the name, <strong>GREG has connections to the Doubly Robust literature in causal inference</strong> (as <a href="https://arxiv.org/abs/1909.00066">Coston et al. (2020)</a> note in a footnote). Any favorite references making these connections ?</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/19/survey-statistics-greg/feed/</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
			</item>
		<item>
		<title>James Heathers will fix Wiley&#8217;s problems for less than 3.7 million dollars (that is, 2,553,739 Jamaican beef patties, 47,064 whisky-sodden meals at Newark airport, or nearly 218 invites to a conference featuring Gray Davis, Grover Norquist, and a rabbi)</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/19/j/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/19/j/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Tue, 19 May 2026 13:55:42 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=51252</guid>

					<description><![CDATA[The data thug quotes from: an April 2023 post from the EVP of Research at Wiley: In September 2022, Wiley identified and immediately alerted the industry to paper mill activity we found operating at scale. Specifically, we found fraudulent outside &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/19/j/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://jamesclaims.substack.com/p/the-hindawi-files-part-3-wiley">The data thug quotes from</a>: an April 2023 post from the EVP of Research at Wiley:</p>
<blockquote><p>In September 2022, Wiley identified and immediately alerted the industry to paper mill activity we found operating at scale. Specifically, we found fraudulent outside editors that had subverted our processes and workflows, leading to a proliferation of bad content. This scheme hit Hindawi’s Special Issues program hard.</p></blockquote>
<p>For those who are unfamiliar with academic publishing:  Wiley is a long-established firm.</p>
<p>Back when I was a student, Wiley was perhaps considered the #1 publisher within statistics.  They published Feller&#8217;s classic books on probability, Cochran&#8217;s classics on design of experiments and survey sampling, and many other standard texts.</p>
<p>In recent decades, as with other academic publishers, they&#8217;ve branched out into other publishing-related businesses, for example, Hindawi, which has a habit of filling your inbox with spam about dodgy journals.  From Wikipedia:  &#8220;In 2023 and after over 7000 article retractions in Hindawi journals related to the publication of articles originating from paper mills, Wiley announced that it will cease using the Hindawi brand and will integrate Hindawi&#8217;s 200 remaining journals into its main portfolio. The Wiley CEO who initiated the Hindawi acquisition stepped down in the wake of those announcements.&#8221;</p>
<p>To those of us of a certain age, seeing Wiley and Hindawi in the same sentence is disturbing in itself, a sign of what the world of publishing has come to.  Not that publishing has ever been pure&#8212;just for example, back in the 1960s and 70s, legitimate publishers released fake-science books such as Chariots of the Gods, The Bermuda Triangle, and The Jupiter Effect&#8212;; still, it was sad to see the once-respected Wiley name dragged so low.</p>
<p><strong>You can hire James Heathers for less than $3.7 million</strong></p>
<p>Heathers points out that, because Wiley is a public company, certain of its business records are required to be public, and he found this:</p>
<blockquote><p><img loading="lazy" decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2024/10/image-1-1.jpg" alt="" width="736" height="359" class="alignnone size-full wp-image-51253" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2024/10/image-1-1.jpg 736w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2024/10/image-1-1-300x146.jpg 300w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2024/10/image-1-1-500x244.jpg 500w" sizes="(max-width: 736px) 100vw, 736px" /></p></blockquote>
<p>Heathers explains:</p>
<blockquote><p>‘Legal settlement’ is exactly what it sounds like, and the footnote description is ‘a litigation matter related to consideration for a previous acquisition’.</p>
<p>The shorthand is: their own shareholders sued them. They said they were going to, and did. . . .</p>
<p>This is not uncommon . . . Any large public company in business for long enough has seen a suit or two like this. . . . Generally, they settle. . . . this is noticeably more expensive than running a full-scale proactive research integrity program.</p></blockquote>
<p>And here&#8217;s the kicker:</p>
<blockquote><p>For 3.7M, you could have the world. I [Heathers] am quite confident in saying: I could run that as the operating budget of a fraud mitigation unit for multiple years, and drop the amount of nonsense by . . . maybe two-thirds, three-quarters? within that time.</p></blockquote>
<p>All right, then!</p>
<p><strong>This is not new to Wiley</strong></p>
<p>Just one thing.  This is not new.  Wiley&#8217;s been in the lucrative science-fraud business for a while.  Recall <a href="https://statmodeling.stat.columbia.edu/2011/09/28/wiley-wegman-chutzpah-update/">this story from 2011</a>, &#8220;Wiley Wegman chutzpah update: Now you too can buy a selection of garbled Wikipedia articles, for a mere $1400-$2800 per year!&#8221;</p>
<p>But, yeah, the Hindawi business sounds a lot worse.  When Wiley was conned by a formerly respected academic into republishing Wikipedia content and charging money for it, that was just a one-time breach in editorial standards.  The Hindawi story seems like something else entirely.  On the other hand, when it comes to fraudulent publishing, they had some track record.</p>
<p><strong>Adversarial journalism</strong></p>
<p>Heathers writes:</p>
<blockquote><p>There absolutely IS adversarial journalism in academia/research/science/etc. Science Magazine, Undark, Vox, etc. have all published great pieces on this.</p></blockquote>
<p>Ahhhh, Undark Magazine . . . that <a href="https://statmodeling.stat.columbia.edu/2020/06/13/fake-mit-journalists-misrepresent-real-buzzfeed-journalist-maybe-we-shouldnt-be-so-surprised/">brings up memories</a>.  A few years ago, Undark published a terrible article, misrepresenting a scientific story in which I&#8217;d been involved.  That was adversarial journalism in the worst sense, in that the journalists were coming in with an agenda and using it to distort and slam anyone who disagreed with it.  <a href="https://statmodeling.stat.columbia.edu/2020/12/07/unlike-mit-scientific-american-does-the-right-thing-and-flags-an-inaccurate-and-irresponsible-article-that-they-mistakenly-published/">See here</a> for more on that story.</p>
<p>I&#8217;m not saying you shouldn&#8217;t trust anything in Undark just cos they ran that one bad article, any more than I&#8217;d un-recommend the books by Cochran just cos they were published by Wiley.  I just thought it was funny that Heathers mentioned Undark in particular, given that my only experience with that magazine was so unpleasant.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/19/j/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>MrPlew:  Locally Equivalent Weights for Multilevel Regression and Poststratification</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/18/mrplew-locally-equivalent-weights-for-multilevel-regression-and-poststratification/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/18/mrplew-locally-equivalent-weights-for-multilevel-regression-and-poststratification/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Mon, 18 May 2026 22:57:41 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Multilevel Modeling]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53739</guid>

					<description><![CDATA[Ryan Giordano, Alice Cima, Jared Murray, Erin Hartman, and Avi Feller write: Multilevel regression and poststratification (MrP) has become a workhorse method for estimating population quantities from non-probability surveys, and is the primary model-based alternative to traditional survey calibration weighting &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/18/mrplew-locally-equivalent-weights-for-multilevel-regression-and-poststratification/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://rgiordan.github.io/assets/mrplew_paper.pdf">Ryan Giordano, Alice Cima, Jared Murray, Erin Hartman, and Avi Feller write</a>:</p>
<blockquote><p>Multilevel regression and poststratification (MrP) has become a workhorse method for estimating population quantities from non-probability surveys, and is the primary model-based alternative to traditional survey calibration weighting methods, such as raking. For simple linear regression models, MrP methods admit “equivalent weights”, allowing for direct comparisons between MrP and traditional calibration weighting. Such weights, however, have been unavailable for the most widely used MrP models, such as logistic regression. In this paper, we develop a natural generalization, “MrP locally equivalent weights” (MrPlew), which represent MrP as a weighting-style estimator that is locally equivalent to calibration weights near the observed responses.</p></blockquote>
<p>Cool!  This goes beyond my 2007 paper, <a href="https://sites.stat.columbia.edu/gelman/research/published/STS226.pdf">Struggles with survey weighting and regression modeling</a> (&#8220;for logistic regression, the poststratified estimate is no longer a weighted average of the data, even after controlling for the variance parameters in the model. However, we suspect that the model could be linearized, yielding approximate weights&#8221;) and <a href="https://sites.stat.columbia.edu/gelman/research/published/serial.pdf">our 2004 paper on dilution assays</a>, in particular Section 5.2, &#8220;Equivalent weights for nonlinear models.&#8221;  The funny thing is that I forgot about that 2004 paper when working on equivalent weights for MRP in the 2007 paper.  Also, the 2004 method won&#8217;t work as is, because it&#8217;s designed to estimate sensitivity to individual data points, not to produce good weighted averages.</p>
<p>I say this not to try to claim credit for the method of Giordano et al., but rather the opposite, to emphasize that even though I&#8217;ve been thinking about equivalent weights in MRP for a long time, I haven&#8217;t yet succeeded in getting them to work in practice, so I&#8217;m very happy to see developments in this area.</p>
<p>One thing that came up with equivalent weights when we tried to apply them in practice is that sometimes the weights can be negative.</p>
<p>Negative weights can sometimes make statistical sense.  The idea is that, depending on how the data line up in the regression model, sometimes if you pull one data point upward, it will cause the slope of the fitted line to change in such a way as to reduce the predicted mean value.  This doesn&#8217;t sound right at first, but it can easily occur with poststratification when the population distribution of the predictors differs from the sample.  Even if the negative weights can make sense in the estimation context, it still would seem kind of awkward to pass them along to the user.</p>
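<p>Here is a small sketch of how that can happen, using plain least squares rather than a multilevel model. The data, the population mean of the predictor, and the variable names are all made up for illustration. With a linear regression, the poststratified estimate is exactly a weighted sum of the observed y, and when the population sits at higher x than the sample, the low-x data points get negative weights.</p>

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample: the predictor x is concentrated at low values.
n = 50
x = rng.uniform(0, 4, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)

# Hypothetical population mean of (intercept, x): the population has much higher x.
xbar_pop = np.array([1.0, 6.0])

# Regression-and-poststratification estimate of the population mean of y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
estimate = xbar_pop @ beta_hat

# The same number written as a weighted sum of the data:
# estimate = w'y with w = X (X'X)^{-1} xbar_pop.
w = X @ np.linalg.solve(X.T @ X, xbar_pop)

print("estimate from regression:", estimate)
print("estimate from weights:   ", w @ y)
print("sum of weights:", w.sum())
print("smallest weight:", w.min(), "at x =", x[np.argmin(w)])
```

<p>The weights sum to 1 because the model includes an intercept. The smallest weight is negative and belongs to a point at low x: raising that y flattens the fitted slope, which lowers the prediction out where the population is.</p>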
<p>The other thing that&#8217;s tricky is:  What are the weights going to be used for?  In the 2007 paper, the equivalent weights are set up to get the right answer for the estimate of the population mean, but presumably they&#8217;d be used for large subgroups too (for example, the average among men or women in the population). For more complicated estimates such as arise in small-area estimation or regression, you might <a href="https://sites.stat.columbia.edu/gelman/research/unpublished/weight_regression.pdf">want to use MRPW</a>.  Which is fine, but whatever it would take to get good weights for one of these purposes might not work best for the others.</p>
<p>Still, I remain interested in MRP locally equivalent weights of some sort, for two reasons:</p>
<p>1.  We&#8217;re often doing MRP (or, <a href="https://statmodeling.stat.columbia.edu/2018/05/19/regularized-prediction-poststratification-generalization-mister-p/">more generally, RPP</a>) anyway, so why not provide weights for other users of the survey that we&#8217;re analyzing?  </p>
<p>2.  Sometimes we&#8217;re called upon to provide weights for a public-facing survey, and the way we end up doing this is through an awkward and unsatisfying sequence of adjustment and smoothing steps (the &#8220;struggles&#8221; in &#8220;Struggles with survey weighting and regression modeling&#8221;).  If we can do this using modeling and MRP, that could be a much more effective workflow, providing weights that are more stable and yield more accurate estimates of population quantities while also being more scientifically defensible and requiring fewer arbitrary choices.</p>
<p>Model-based weights will depend on some set of predictors X, variables that are observed in the sample and in the population (or, as necessary or appropriate, <a href="https://statmodeling.stat.columbia.edu/2018/10/28/mrp-rpp-non-census-variables/">estimated from the population</a>).  One funny thing is that the weights will be mathematically a function of X, but the function itself will depend not just on sampling design, and not just on the distributions of X in the sample and population, but also on the outcome y that is being modeled.  Different outcome variables will yield different sets of weights.  At first this might seem disturbing, but upon reflection I think this dependence is a good thing.  When it comes to weighting, the relative importance of the different variables in X will indeed depend on the outcome.  Different variables are important for predicting public health risk factors than predicting how you will vote.  That said, if you want some sort of omnibus weights, which you probably will want for a public survey, you can compute equivalent weights for each of a battery of outcomes and then average these weights to get a single set.  That seems reasonable enough.</p>
<p>OK, back to Giordano et al., who continue:</p>
<blockquote><p>This enables a suite of standard weighting diagnostics, including frequentist sampling variability, covariate balance, and subgroup contribution. We formally justify the use of MrPlew in these cases: we prove the MrPlew-based variance estimator is asymptotically equivalent to the infinitesimal jackknife for common exponential family models, and we introduce a novel class of model checks based on invariance to data perturbations that generalize covariate balance and subgroup contribution to nonlinear models. We further show that MrPlew can be computed easily using existing MCMC samples and provide open-source software to compute MrPlew using the output of standard software. We illustrate our approach for several canonical studies that use MrP, including via a logistic regression outcome model, showing that implied covariate balance can sometimes be worse for MrP than for raking. Given the ease of computing, we recommend making MrPlew a standard part of the MrP model interrogation workflow.</p></blockquote>
<p>It makes sense that implied covariate balance can sometimes be worse for MRP than for raking.  MRP is a smoothed version of raking, and unsmoothed raking can overfit.  Or, in practice, you might rake on fewer variables so as to avoid overfitting.  Multilevel regression gives you the freedom to include more predictors and interactions, secure in the understanding that the model will smooth the estimate and there will be less possibility for overfitting.  In short, multilevel modeling&#8211;or, more generally, regularization&#8211;is a sort of safety net that can give us the security to construct better models, in the same way that a social safety net can give people the security to try new jobs, or for that matter in the same way that an actual safety net can give acrobats the security to perform more elaborate routines.</p>
<p>Where I want to go next is to be able to use these methods to construct weights for public surveys.  I&#8217;m still not sure about all the steps that will take us there, but I continue to think it&#8217;s possible.</p>
<p>The new Giordano et al. paper is thoughtful and readable as well as having lots of math, statistical modeling, and real-data examples.  I recommend you read it.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/18/mrplew-locally-equivalent-weights-for-multilevel-regression-and-poststratification/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>Jonah&#8217;s seminar tomorrow: &#8220;Bayesian Workflow and the Software That Shapes It&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/18/jonahs-seminar-tomorrow-bayesian-workflow-and-the-software-that-shapes-it/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/18/jonahs-seminar-tomorrow-bayesian-workflow-and-the-software-that-shapes-it/#comments</comments>
		
		<dc:creator><![CDATA[Leonardo Egidi]]></dc:creator>
		<pubDate>Mon, 18 May 2026 16:30:21 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Multilevel Modeling]]></category>
		<category><![CDATA[Stan]]></category>
		<category><![CDATA[Statistical Computing]]></category>
		<category><![CDATA[Statistical Graphics]]></category>
		<category><![CDATA[Teaching]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53732</guid>

					<description><![CDATA[This is Leo. Jonah Gabry (Stan developer, Andrew&#8217;s collaborator, etc.) is spending the whole month of May as a visiting professor here with us at the University of Trieste in Italy. Tomorrow, May 19th, in the De Finetti room at &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/18/jonahs-seminar-tomorrow-bayesian-workflow-and-the-software-that-shapes-it/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>This is Leo. Jonah Gabry (Stan developer, Andrew&#8217;s collaborator, etc.) is spending the whole month of May as a visiting professor here with us at the University of Trieste in Italy. Tomorrow, May 19th, in the De Finetti room at the University of Trieste, at  9 am NYC time (GMT-4), Jonah will give the following talk:</p>
<p>&#8220;Bayesian Workflow and the Software That Shapes It&#8221;</p>
<p>based on the upcoming book:  <a href="https://avehtari.github.io/Bayesian-Workflow/">&#8220;Bayesian Workflow&#8221;.</a></p>
<p>For anyone local, you are welcome to come in person. Anyone else can join on Microsoft Teams (<a href="https://teams.microsoft.com/meet/396256279682020?p=f6UiaLeldPcRTJkscM">available here</a>).</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/18/jonahs-seminar-tomorrow-bayesian-workflow-and-the-software-that-shapes-it/feed/</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
			</item>
		<item>
		<title>What is “the definition of a professional career”?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/18/whats-the-definition-of-a-professional-career-2/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/18/whats-the-definition-of-a-professional-career-2/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Mon, 18 May 2026 13:03:22 +0000</pubDate>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[Literature]]></category>
		<category><![CDATA[Sociology]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53129</guid>

					<description><![CDATA[I happened to come across this post from 2015 where I discussed a remark from a political journalist who advocated &#8220;some measure of accountability . . . which allows both that very bad teachers be fired and that very good &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/18/whats-the-definition-of-a-professional-career-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I happened to come across <a href="https://statmodeling.stat.columbia.edu/2011/08/27/whats-the-definition-of-a-professional-career/">this post from 2015</a> where I discussed a remark from a political journalist who advocated &#8220;some measure of accountability . . . which allows both that very bad teachers be fired and that very good ones can obtain greater pay and recognition. That’s the definition of a professional career track . . .&#8221;</p>
<p>What interested me there was not the question of how easy it should be to promote or fire teachers, but rather the idea that the risk of being fired is part of &#8220;the definition of a professional career track.&#8221;</p>
<p>OK, I&#8217;m not trying to take him literally.  If you look up &#8220;professional&#8221; in the dictionary it says, &#8220;engaged in a profession that requires academic learning as preparation,&#8221; and if you look up &#8220;profession,&#8221; you get &#8220;a calling requiring specialized knowledge and often long and intensive academic preparation.&#8221;  Obviously he didn&#8217;t literally mean that being fired is part of the definition, more that it&#8217;s a core or essential part of what being a professional is.</p>
<p>It was funny for me to see this because being a tenured professor is part of a professional career track, and we can&#8217;t be fired.  Also, at a lot of universities there&#8217;s not much range for promotion either.  On the other hand, I&#8217;d call journalism a profession too, and, unfortunately, journalists get fired all the time.  Not just &#8220;very bad&#8221; journalists either.  Journalists lose their jobs because of well-known economic factors leading to a decades-long decline in employment in that field.</p>
<p>I guess a more accurate way to put it is not that the risk of being fired is an essential part of a professional career track, but rather that this journalist thinks that this risk <em>should</em> be an essential part of any professional career track.</p>
<p>And I see where he&#8217;s coming from.  I don&#8217;t want to be at risk of being fired&#8211;but there are lots of professors I know for whom, if they were fired, I&#8217;d be cool with that.</p>
<p>The thing I&#8217;m worried about is that whoever has the ability to do this firing will use this ability to extort things out of me or to retaliate at me.  But I guess that&#8217;s where the &#8220;measure of accountability&#8221; comes in.</p>
<p>At this point I expect many of you will be groaning and saying that the vast majority of employees in this country can be fired for no reason at any time (&#8220;at-will employment,&#8221; as they call it) and I&#8217;m privileged to have a job where it&#8217;s really hard to fire me.  To which I reply:  I agree that I&#8217;m privileged, and not just in my job.  I&#8217;m privileged in so many ways.  Maybe more people should be privileged in this way!</p>
<p>But actually that&#8217;s not what I wanted to talk about here&#8211;we already covered most of this in our 2015 post and subsequent comment thread.</p>
<p><strong>Here&#8217;s my question for you</strong></p>
<p>Rather, I wanted to ask, from scratch, the question posed by the title of this post:  What <em>is</em> “the definition of a professional career”?</p>
<p>The traditional definition, requiring academic learning, covers a lot.  Doctors, lawyers, college professors all require long and intensive academic preparation.  Physical therapists, too, and physical therapists do seem much more professional than they did thirty years ago.  It also seems to me that professionalism involves some sort of standardization:  a job category being more &#8220;professional&#8221; is often associated with a low variance, not necessarily in abilities but in how they comport themselves.  When you go to a doctor or a dentist, they always act like doctors or dentists.  Lawyers too, to some extent.  Maybe K-12 teachers, not so much.  K-12 teachers require academic learning but not so much as those other professions.</p>
<p>What about getting fired?  Things have changed.  Back in the day, doctors were mostly self-employed.  Now I imagine they&#8217;re mostly employees.  So I guess they can get fired.  When you&#8217;re self-employed you can&#8217;t get fired but you can go out of business.</p>
<p>There&#8217;s also the distinction between the professions and the trades.  For some reason, people always seem to want to bring up plumbers.  To be a plumber you need training, but it&#8217;s not academic training.  I guess you&#8217;d be a better plumber if you took a few physics classes, in the same way that it&#8217;s probably a good idea that pre-med students have to learn a bunch of biology.  Architect is a profession because architectural training is academic, or because it&#8217;s traditionally an upper-middle-class job rather than a lower-middle-class job?  And then there are lots of jobs that require little or no special training at all, but they require some skills or competence, like carrying boxes or caring for kids or elders.  These sorts of jobs could become professionalized too, which is either a good thing or a bad thing, depending on how you think about it.</p>
<p>And then there&#8217;s journalism, and writing more generally.  Traditionally no barriers to entry and a path forward for lots of people to make their mark, from Jim Thompson and Carl Bernstein on down:  some of these people went to college and some didn&#8217;t&#8211;going to college is a great thing, you can learn a lot, even if it&#8217;s not giving you job qualifications&#8211;and even in later decades when more and more journalists were college graduates, it&#8217;s not like it was required.</p>
<p>I remember a bunch of years ago there were some political scientists writing about the professionalization of state legislators.  In that context, &#8220;professional&#8221; meant that being a legislator was a full-time job, or close to it, with a good salary and some staff.  In this case, professionalism had nothing to do with academic qualifications; it was being used in the same way as we talk about a &#8220;professional&#8221; athlete.  A professional athlete makes a living from it; a semi-pro gets paid but needs some other job; an amateur does it just for fun.</p>
<p>So in a conversation about the so-called gig economy (speaking of journalists), being a professional means that you have a steady job that pays reasonably well.  If you work at a nice restaurant, you can be a professional waitress.  In that sense, I guess that many of my favorite novelists are not professional writers; they need to hold down other jobs such as teaching at universities.  In which case they <em>are</em> professionals, but not at the task that uses their best talents.</p>
<p><strong>P.S.</strong>  Academic tenure only came up briefly in this post, but several people brought it up in the discussion. For those of you interested in my thoughts on the matter, I&#8217;ll point you to these two posts from 2011:</p>
<p>&#8211; <a href="https://statmodeling.stat.columbia.edu/2011/06/01/the_cushy_life/">The “cushy life” of a University of Illinois sociology professor</a></p>
<p>&#8211; <a href="https://statmodeling.stat.columbia.edu/2011/06/07/update_on_the_u/">Looking for a purpose in life: Update on that underworked and overpaid sociologist whose “main task as a university professor was self-cultivation”</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/18/whats-the-definition-of-a-professional-career-2/feed/</wfw:commentRss>
			<slash:comments>41</slash:comments>
		
		
			</item>
		<item>
		<title>If Books Could Kill podcast</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/17/if-books-could-kill-podcast/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/17/if-books-could-kill-podcast/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sun, 17 May 2026 13:04:17 +0000</pubDate>
				<category><![CDATA[Literature]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=50062</guid>

					<description><![CDATA[As we&#8217;ve discussed, the If Books Could Kill podcast has its issues, notably that sometimes they&#8217;re too soft on their own premises, and sometimes they seem to be working too hard to contort things into a certain political context. But &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/17/if-books-could-kill-podcast/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>As we&#8217;ve discussed, the If Books Could Kill podcast <a href="https://statmodeling.stat.columbia.edu/2024/09/28/fake-stories-in-purported-nonfiction/">has its issues</a>, notably that sometimes they&#8217;re too soft on their own premises, and sometimes they seem to be working too hard to contort things into a certain political context.  But when they hit it, they hit it.</p>
<p>The hosts have a great rapport and move pretty fast, which is all the more impressive given that they don&#8217;t have the production values or the scripts of a show like This American Life; they&#8217;re just winging it. I can wing it too in a live presentation; I think I&#8217;d find it harder to sustain it in regular hour-long podcasts.</p>
<p>In any case, it was all worth it for this story, <a href="https://www.buzzsprout.com/2040953/13887364-the-48-laws-of-power">from their episode on</a> &#8220;The 48 Laws of Power&#8221;:</p>
<blockquote><p>African proverb or something. I don&#8217;t know where he&#8217;s pulling this from. . . . a snake chased by hunters asked a farmer to save its life. To hide it from its pursuers, the farmer squatted and let the snake crawl into his belly. But when the danger had passed and the farmer asked the snake to come out, the snake refused. It was warm and safe inside. On his way home, the man saw a heron and whispered what had happened. The heron told him to squat and strain to eject the snake. When the snake stuck its head out, the heron caught it, pulled it out and killed it.</p>
<p>The farmer was worried that the snake&#8217;s poison might still be inside him and the heron told him that the cure for snake poison was to cook and eat six white fowl. You&#8217;re a white fowl, said the farmer. He grabbed the heron, put it in a bag and carried it home, where he hung it up while he told his wife what had happened. “I&#8217;m surprised at you,” said the wife. The bird does you a kindness, rids you of the evil in your belly, saves your life, yet you catch it and talk of killing it. She immediately released the heron and it flew away. But on its way, it gouged out her eyes.</p></blockquote>
<p>Whaaaa?</p>
<p>I laughed so hard I almost fell off my bike.</p>
<p>I have a horrible feeling that the other 47 laws aren&#8217;t nearly so entertaining.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/17/if-books-could-kill-podcast/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title>Why are there squares everywhere in statistics (e.g., normal density, variance, least squares, etc.)?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/16/why-are-there-squares-everywhere-in-statistics-normal-variance/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/16/why-are-there-squares-everywhere-in-statistics-normal-variance/#comments</comments>
		
		<dc:creator><![CDATA[Bob Carpenter]]></dc:creator>
		<pubDate>Sat, 16 May 2026 19:00:44 +0000</pubDate>
				<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53716</guid>

					<description><![CDATA[I remember asking my colleagues at Carnegie Mellon this very same question as I was first learning basic statistics in the early 1990s and they gave the same kind of answers as I found more recently in the AskStatistics subreddit. &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/16/why-are-there-squares-everywhere-in-statistics-normal-variance/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I remember asking my colleagues at Carnegie Mellon this very same question as I was first learning basic statistics in the early 1990s and they gave the same kind of answers as I found more recently in the AskStatistics subreddit.  It&#8217;s an evergreen question, coming up regularly enough I think it&#8217;s fair to say it&#8217;s a zombie.  This is probably going to be familiar material to most blog readers, but if you&#8217;re like me when I first started reading this blog decades ago, then read on.</p>
<ul>
<li>AskStatistics subreddit: <a href="https://www.reddit.com/r/AskStatistics/comments/1d9gveg/why_is_everything_always_being_squared_in/">Why is everything always being squared?</a></li>
<li>AskStatistics subreddit: <a href="https://www.reddit.com/r/AskStatistics/comments/1t9an0g/why_square_in_variance_not_absolute_value/">Why square in variance not absolute value?</a></li>
</ul>
<p>The answers are all over the place, ranging from &#8220;it&#8217;s distance&#8221; to &#8220;it makes everything positive&#8221; to &#8220;it is smooth unlike absolute value&#8221; (another path to positivity) to &#8220;mathematical convenience&#8221; or &#8220;the math says so&#8221; (sure, but how?) to &#8220;the central limit theorem&#8221; (a more specific form of &#8220;the math says so&#8221;) to the link to entropy (the normal is the distribution that maximizes entropy for a given mean and variance), and so on.</p>
<p>I think the easiest answer for the person asking &#8220;why squares?&#8221; is due to Gauss by way of Pythagoras.  Simply put, </p>
<p>&nbsp; &nbsp; <b>*** the mean is the number that minimizes square error. ***</b></p>
<p>That is, if I have a sequence x[1], &#8230;, x[N], then mu = mean(x) is the number that minimizes the error</p>
<p>&nbsp; &nbsp;  err(mu, x) = SUM{n in 1:N} (x[n] &#8211; mu)^2,</p>
<p>in the sense that</p>
<p>&nbsp; &nbsp;  ARGMIN_mu err(mu, x) = mean(x).</p>
<p>You can verify this by taking the derivative of the sum with respect to mu, showing that it&#8217;s zero at the mean, and then confirming that this stationary point is a minimum by checking that the second derivative is positive (that is, the error is convex in mu).</p>
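<p>If you would rather check this numerically than with calculus, here is a minimal sketch in Python.  The data and the grid of candidate values are made up; the brute-force search is just a sanity check, not how anyone would compute a mean in practice.</p>

```python
def squared_error(mu, xs):
    """err(mu, x) = SUM{n in 1:N} (x[n] - mu)^2"""
    return sum((x - mu) ** 2 for x in xs)


xs = [2.0, 3.0, 5.0, 7.0, 11.0]      # made-up data
mean_x = sum(xs) / len(xs)

# Brute-force search over candidate values 0.00, 0.01, ..., 12.00.
grid = [i / 100 for i in range(0, 1201)]
best_mu = min(grid, key=lambda mu: squared_error(mu, xs))

print("mean(x)              =", mean_x)
print("grid minimizer of err =", best_mu)
```

<p>The grid minimizer lands on the grid point closest to mean(x), which here is the mean itself.</p>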
<p>Gauss realized that he could define a density where mean(x) is the maximum likelihood estimator.  It is now called the &#8220;Gaussian distribution&#8221; or &#8220;normal distribution&#8221;.  That is, we have </p>
<p>&nbsp; &nbsp;  log normal(x | mu, 1) = -1/2 (x &#8211; mu)^2 + const.</p>
<p>If we have a bunch of observations and want to maximize their likelihood, that&#8217;s equivalent to minimizing the error I wrote down above,</p>
<p>&nbsp; &nbsp;   ARGMAX_mu PRODUCT{n in 1:N} normal(x[n] | mu, 1)</p>
<p>&nbsp; &nbsp;  &nbsp; &nbsp;    = ARGMAX_mu PRODUCT{n in 1:N} (exp(-1/2 (x[n] &#8211; mu)^2) * const)</p>
<p>&nbsp; &nbsp;  &nbsp; &nbsp;    = ARGMAX_mu SUM{n in 1:N} (log(exp(-1/2 (x[n] &#8211; mu)^2)) + log(const))</p>
<p>&nbsp; &nbsp;  &nbsp; &nbsp;    = ARGMAX_mu N * log(const) + SUM{n in 1:N} -1/2 (x[n] &#8211; mu)^2 </p>
<p>&nbsp; &nbsp;  &nbsp; &nbsp;    = ARGMIN_mu 1/2 SUM{n in 1:N} (x[n] &#8211; mu)^2</p>
<p>&nbsp; &nbsp;  &nbsp; &nbsp;    = mean(x)</p>
<p>This is why &#8220;ordinary least squares&#8221; fit for a regression with normal error terms yields the same result as maximum likelihood.  </p>
<p>Some of the other answers in these threads are correct, but they failed to enlighten the original posters as to why there is a square.  Just saying &#8220;because of the math&#8221; like my colleagues did is not helpful without saying which math!  For example, the normal distribution, with its quadratic log density term, arises through the central limit theorem asymptotically.  Squaring does emphasize outliers as people said, but that&#8217;s not the reason.  If you say &#8220;for mathematical convenience,&#8221; it would help to say what it buys you, such as sums of normals remaining normal, the normal being its own conjugate prior in regression, integrals being very nice, tails being very nicely behaved, the connection to maximum entropy, residence in the exponential family, etc. etc. </p>
<p><b>Why not absolute values instead of squares?</b></p>
<p>We can use absolute values.  In that case, medians replace means, because absolute error has the pleasant property that</p>
<p>&nbsp; &nbsp; <b>*** the median is the number that minimizes absolute error. ***</b></p>
<p>In symbols, that&#8217;s</p>
<p>&nbsp; &nbsp; ARGMIN_mu SUM{n in 1:N} abs(x[n] &#8211; mu) = median(x)</p>
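<p>Here is the same kind of numerical check for absolute error, again as a rough sketch with made-up data.  I included one large outlier so that the median and the mean are far apart, and I used an odd number of points so that the median is unique.</p>

```python
import statistics


def absolute_error(mu, xs):
    """SUM{n in 1:N} abs(x[n] - mu)"""
    return sum(abs(x - mu) for x in xs)


xs = [1.0, 2.0, 3.0, 4.0, 100.0]     # made-up data with one outlier
median_x = statistics.median(xs)
mean_x = sum(xs) / len(xs)

# Brute-force search over candidate values 0.00, 0.01, ..., 100.00.
grid = [i / 100 for i in range(0, 10001)]
best_mu = min(grid, key=lambda mu: absolute_error(mu, xs))

print("median(x)                      =", median_x)
print("mean(x)                        =", mean_x)
print("grid minimizer of absolute err =", best_mu)
```

<p>The minimizer sits at the median, 3, and ignores how far away the outlier is, whereas the mean gets dragged up to 22.</p>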
<p><b>Connection to Bayesian inference</b></p>
<p>If our model is correct (big &#8220;if&#8221; that&#8217;s almost always false, of course), then we know that the posterior mean is the parameter estimate that minimizes expected square error, whereas the posterior median minimizes expected absolute error.  Just like in the simpler case.  It&#8217;s important to note methodologically that we don&#8217;t get a natural point estimate out of Bayesian inference without specifying an error function.  If the error function is quadratic, the error minimizing estimate is the posterior mean, whereas if the error function is absolute error, the error minimizing estimate is the posterior median.</p>
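<p>Here is a small sketch of that last point using simulated draws.  The exponential draws are only a stand-in for posterior draws from some skewed posterior (they do not come from a real model), and the function names are mine.  Averaging a loss over the draws approximates the posterior expected loss.</p>

```python
import random
import statistics

random.seed(1234)

# Stand-in for posterior draws of a parameter theta; skewed so mean != median.
draws = [random.expovariate(1.0) for _ in range(2001)]


def expected_squared_error(estimate, draws):
    """Monte Carlo estimate of E[(theta - estimate)^2] over the draws."""
    return sum((theta - estimate) ** 2 for theta in draws) / len(draws)


def expected_absolute_error(estimate, draws):
    """Monte Carlo estimate of E[abs(theta - estimate)] over the draws."""
    return sum(abs(theta - estimate) for theta in draws) / len(draws)


post_mean = sum(draws) / len(draws)
post_median = statistics.median(draws)

for name, est in [("posterior mean", post_mean), ("posterior median", post_median)]:
    print(name,
          "squared:", round(expected_squared_error(est, draws), 4),
          "absolute:", round(expected_absolute_error(est, draws), 4))
```

<p>The mean of the draws wins on squared error and the median of the draws wins on absolute error, so which point estimate you report depends on which error function you care about.</p>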
<p><b>The multivariate normal case</b></p>
<p>There is a strong relation to distance, as some of the answers on the AskStatistics subreddit said.  More precisely, squared error is just squared Euclidean distance from the point [mu mu &#8230; mu].  This points the way to how Gauss defined the multivariate normal distribution.  Simply replace the squared Euclidean distance term (x[n] &#8211; mu)^2 with a quadratic form involving the inverse covariance,</p>
<p>&nbsp; &nbsp; log normal(x | mu, Sigma) = -1/2 (x &#8211; mu)&#8217; * inverse(Sigma) * (x &#8211; mu) + const,</p>
<p>where Sigma is a positive-definite covariance matrix.  This quadratic form is the squared distance in the space whose metric is defined by inverse(Sigma).  The square root of the quadratic form is known as the &#8220;Mahalanobis distance&#8221; in statistics.  For example, if Sigma = [[1, 0.9], [0.9, 1]], as for a highly correlated bivariate normal, it will be easier (less distance under the quadratic form) to move along the diagonal x = y than to move off it.  That is, the points [1 1] and [2 2] are closer to each other under the metric inverse(Sigma) (squared distance approximately 1.05) than [1 1] is to [1 1 + sqrt(2)] (squared distance approximately 10.5), even though the two pairs are the same simple Euclidean distance, sqrt(2), apart.</p>
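<p>As a quick check on the arithmetic in that example, here is a minimal sketch that evaluates the quadratic form (u &#8211; v)&#8217; * inverse(Sigma) * (u &#8211; v) for the two pairs of points.  It is hard-coded for the 2 x 2 case to stay dependency-free, and the helper names are mine.</p>

```python
import math


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def quadratic_form(u, v, sigma):
    """(v - u)' * inverse(sigma) * (v - u), the squared Mahalanobis distance."""
    p = inverse_2x2(sigma)
    d0, d1 = v[0] - u[0], v[1] - u[1]
    return d0 * (p[0][0] * d0 + p[0][1] * d1) + d1 * (p[1][0] * d0 + p[1][1] * d1)


sigma = [[1.0, 0.9], [0.9, 1.0]]
start = [1.0, 1.0]

along = quadratic_form(start, [2.0, 2.0], sigma)                   # along x = y
off = quadratic_form(start, [1.0, 1.0 + math.sqrt(2.0)], sigma)    # off the diagonal

print("along the diagonal: squared", round(along, 3), "distance", round(math.sqrt(along), 3))
print("off the diagonal:   squared", round(off, 3), "distance", round(math.sqrt(off), 3))
```

<p>The quadratic form comes out to approximately 1.05 along the diagonal and 10.5 off it; taking square roots gives Mahalanobis distances of roughly 1.03 and 3.24, for two moves that are both sqrt(2) long in ordinary Euclidean terms.</p>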
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/16/why-are-there-squares-everywhere-in-statistics-normal-variance/feed/</wfw:commentRss>
			<slash:comments>19</slash:comments>
		
		
			</item>
		<item>
		<title>Sean Manning&#8217;s lexicon</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/16/sean-mannings-lexicon/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/16/sean-mannings-lexicon/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sat, 16 May 2026 13:08:47 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53112</guid>

					<description><![CDATA[Unlike me, the historian lists his entries not chronologically but alphabetically: Abstraction Ad fontes All Publicity is Good Publicity Analogy Anecdote Anchoring Effect Appeal to Authority Apples and Oranges Archaeological Visibility Argumentum ad baculum Argumentative Theory of Reason Art Historian’s &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/16/sean-mannings-lexicon/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://statmodeling.stat.columbia.edu/2009/05/24/handy_statistic/">Unlike me</a>, the historian <a href="https://www.bookandsword.com/2025/01/11/knowing-things-is-hard/">lists his entries not chronologically but alphabetically</a>:</p>
<p>Abstraction<br />
Ad fontes<br />
All Publicity is Good Publicity<br />
Analogy<br />
Anecdote<br />
Anchoring Effect<br />
Appeal to Authority<br />
Apples and Oranges<br />
Archaeological Visibility<br />
Argumentum ad baculum<br />
Argumentative Theory of Reason<br />
Art Historian’s Dilemma<br />
Authoritarian Teleology<br />
Autobiographical Heuristic<br />
Auxiliary Sciences of History<br />
Availability Bias<br />
. . .</p>
<p>Lots of good stuff there&#8211;just <a href="https://www.bookandsword.com/2025/01/11/knowing-things-is-hard/">click on the link</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/16/sean-mannings-lexicon/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title>When is it time for a Five-Year Plan?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/15/when-is-it-time-for-a-five-year-plan/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/15/when-is-it-time-for-a-five-year-plan/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Fri, 15 May 2026 13:59:36 +0000</pubDate>
				<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53096</guid>

					<description><![CDATA[The term &#8220;Five-Year Plan&#8221; is a bitter joke, referring to the announcements by the Soviet Union about how they were going to reach some specified level of production of heavy industry, which would notoriously be followed by some combination of &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/15/when-is-it-time-for-a-five-year-plan/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>The term &#8220;Five-Year Plan&#8221; is a bitter joke, referring to the announcements by the Soviet Union about how they were going to reach some specified level of production of heavy industry, which would notoriously be followed by some combination of inefficient allocation of resources, manufacture of useless or defective products, and flat-out lying about what was being produced in the factories.</p>
<p>On the other hand, sometimes Five-Year Plans, or something like them, do work.  Most notably there was WW2, when the U.S. really did build a bunch of factories and produced previously-unimaginably large quantities of airplanes etc.  And we really did get to the moon, and it didn&#8217;t take much more than five years.</p>
<p>I was thinking about this a few months ago when I was at a conference and some political scientist stood up and told us how, because of competition with China over data centers or AI or something like that, we needed to build a bunch of nuclear power plants.  And all the savvy politicians know this, he said.  I guess AI isn&#8217;t just about making goofy videos or cheating on your homework anymore, or even just about writing more efficient code.  You don&#8217;t need zillions of gigawatts just to run coding assistants.  I think the idea was that this is about building the next-generation AI, whatever it will be.</p>
<p>I hadn&#8217;t heard about all this so I did some searching online and I found out that this is the pitch that&#8217;s been coming from various software executives.  These guys have tons of money, tons of media exposure, and tons of mystique, so it&#8217;s no surprise that lots of politicians would fall under their sway.  I guess all the politicians except the far right and the far left:  at both extremes, their combination of strong ideology and safe seats could make them relatively immune to the siren song of tech zillionaires.</p>
<p>And, I dunno, maybe these guys are right, that it&#8217;s absolutely necessary to set national priorities in this way, climate change be damned.  (Yeah, I get it that nukes don&#8217;t contribute much to global warming, but we&#8217;re using tons of power anyway, so if the nukes are going to just be used to power these new data centers, we&#8217;ll still need all the other energy sources to keep the heat on and the cars running and all the rest.)  I don&#8217;t know.</p>
<p>It&#8217;s just funny how these guys are going all-in on the whole Five-Year Plan thing.  I could imagine someone saying, Hey, if the Chinese want to bankrupt themselves by investing all their resources into gas-guzzling data centers, let &#8217;em!  But in this case the centrist consensus seems to be that Planning is the way to go.  Or maybe it doesn&#8217;t count as Planning if the profits are private.  It was just funny to hear a political scientist going on about how this needs to be done without even remarking on the historical parallels, which indeed go in both directions, as noted in the first two paragraphs of this post.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/15/when-is-it-time-for-a-five-year-plan/feed/</wfw:commentRss>
			<slash:comments>46</slash:comments>
		
		
			</item>
		<item>
		<title>Alchemize:  PyMC&#8217;s model to replace Stan/PyMC, etc. with an LLM</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/14/alchemize-pymcs-model-to-replace-stan-pymc-etc-with-an-llm/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/14/alchemize-pymcs-model-to-replace-stan-pymc-etc-with-an-llm/#comments</comments>
		
		<dc:creator><![CDATA[Bob Carpenter]]></dc:creator>
		<pubDate>Thu, 14 May 2026 19:00:41 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Stan]]></category>
		<category><![CDATA[Statistical Computing]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53715</guid>

					<description><![CDATA[This post is from Bob I&#8217;ll let Thomas Wiecki, who is one of the core PyMC devs and one of the partners at PyMC Labs, speak for himself here: Thomas Wiecki. 2026. Alchemize: Transpile PyMC to Rust for 3-7x speed-up. &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/14/alchemize-pymcs-model-to-replace-stan-pymc-etc-with-an-llm/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><I>This post is from Bob</I></p>
<p>I&#8217;ll let <a href="https://www.pymc-labs.com/team-detail/thomas-wiecki">Thomas Wiecki</a>, who is one of the core PyMC devs and one of the partners at PyMC Labs, speak for himself here:</p>
<ul>
<li>Thomas Wiecki. 2026.  <a href="https://discourse.pymc.io/t/alchemize-transpile-pymc-to-rust-for-3-7x-speed-up/17709">Alchemize: Transpile PyMC to Rust for 3-7x speed-up</a>. PyMC Discourse.</li>
</ul>
<p>If you haven&#8217;t seen what people are doing with agentic AI, this is a good example.  I&#8217;m really happy that Thomas and PyMC Labs are sharing their thoughts and initial tries at things like this as I think it has the potential to benefit everyone working on modeling.  </p>
<p>If you want to see the basis of the agent&#8217;s instructions, check out the <a href="https://hub.decision.ai/skills/pymc-labs/pymc-modeling">&#8220;skill&#8221; for PyMC</a> that Chris Fonnesbeck wrote.</p>
<p>We&#8217;ve already batted this around a bit in email with Thomas, so I can summarize some talking points:</p>
<p>LLM-based chatbots are really good at translating.  Compiling (or, more technically, transpiling) a statistical model down to a language like Rust or C++ or JAX is a kind of translation.</p>
<p>You can start from PyMC&#8217;s execution trace, but you can also start with a model description.  You could also start with something like Stan code. </p>
<p>The biggest bottleneck to deploying Bayesian models in my opinion is the inherent variance and unreliability of MCMC-based inference.  Our workflow proposals are all about making sure this doesn&#8217;t go wrong.  Wiecki&#8217;s point here is that we can have the bots go through the workflow.  Iterating until the gradients and log densities match is a good example, but this could be extended to more parts of workflow.</p>
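<p>To make the &#8220;iterate until the gradients and log densities match&#8221; step concrete, here is a minimal sketch of the kind of check an agent (or a person) could run after each translation attempt.  This is not Alchemize&#8217;s actual test harness:  the two log density functions, the function <code>implementations_match</code>, and the tolerances are all made up for illustration, and the gradients are approximated by finite differences rather than taken from autodiff.</p>

```python
import numpy as np

def reference_logp(theta, y):
    """Reference log density up to a constant: y ~ normal(mu, exp(log_sigma))."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return -y.size * log_sigma - 0.5 * np.sum((y - mu) ** 2) / sigma**2

def translated_logp(theta, y):
    """Hypothetical stand-in for transpiled code, rewritten with sufficient statistics."""
    mu, log_sigma = theta
    n, ybar = y.size, y.mean()
    ss = np.sum((y - ybar) ** 2)
    return -n * log_sigma - 0.5 * (ss + n * (ybar - mu) ** 2) * np.exp(-2.0 * log_sigma)

def finite_diff_grad(f, theta, eps=1e-6):
    """Central finite-difference approximation to the gradient of f at theta."""
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        grad[k] = (f(theta + step) - f(theta - step)) / (2.0 * eps)
    return grad

def implementations_match(f, g, dim, n_points=20, seed=0):
    """Compare log densities and gradients of f and g at random test points."""
    rng = np.random.default_rng(seed)
    for _ in range(n_points):
        theta = rng.normal(size=dim)
        if not np.isclose(f(theta), g(theta), rtol=1e-6, atol=1e-4):
            return False
        if not np.allclose(finite_diff_grad(f, theta), finite_diff_grad(g, theta),
                           rtol=1e-4, atol=1e-3):
            return False
    return True

# Simulated data, used only to exercise the check.
y = np.random.default_rng(1).normal(1.0, 2.0, size=50)
print(implementations_match(lambda t: reference_logp(t, y),
                            lambda t: translated_logp(t, y), dim=2))
```

<p>A check like this only says the two implementations agree at the sampled points.  It says nothing about whether the sampler run on top of them behaves well, which is what the rest of the workflow is for.</p>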
<p>The skills feel a lot like writing a textbook for a bot. I have no idea how hard or easy this is or how much it improves over the baseline.  Jeremy Magland built a RAG-like helper for Stan that compressed the <I>Stan Reference Manual</I> down to 1K tokens for context (like a skill) and allowed it to search and import from the <I>Stan User&#8217;s Guide</I>, but never measured how much it improved over the baseline.  It really feels like it should also have the <I>Stan Functions Reference</I>, <I>BDA3</I>, <I>Regression and Other Stories</I>, and the <I>Bayesian Workflow</I> book, as well.</p>
<p>Hopefully we&#8217;ll asymptote at writing a textbook-sized set of skills and not have to write one per target model (that is, something like the <I>Stan User&#8217;s Guide, Reference Manual, Functions Reference</I>).  </p>
<p>I&#8217;m curious as to whether it will eventually be able to make writing hard models easier.  I&#8217;m thinking of efforts like <a href="https://github.com/epiforecasts/EpiNow2/tree/main/inst/stan">epinow2</a>, which involves a very large chunk of Stan code.</p>
<p>As the foundation models and chatbot tuning change, there&#8217;s going to be an issue of regression testing and tuning for whatever the latest models are.</p>
<p>P.S.  This effort explains how Thomas was able to create the huge posteriordb pull requests for PyMC (<a href="https://github.com/stan-dev/posteriordb/pull/320">#320</a> and <a href="https://github.com/stan-dev/posteriordb/pull/319">#319</a>)!</p>
<p>P.P.S.  The latest thing Claude (Opus 4.7) did that impressed me was generate the <code>ess(MatrixXd, vector&lt;size_t&gt;)</code> function in <a href="https://github.com/flatironinstitute/walnuts/blob/posterior-summary/include/walnuts/summary.hpp">summary.hpp</a>.  This function estimates effective sample size Stan style (discounting for R-hat &gt; 1) on a ragged array of Markov chains.  (We have to generalize all the posterior analysis tools to deal with the new asynchronous parallel sampler.)  I had Stan&#8217;s ESS function and all the other functions I&#8217;d written for the ragged structures to give it as a guide.  It&#8217;s very easy to code review that it matches Stan&#8217;s implementation for the new data structure.  I only had to tweak the output a little bit for style.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/14/alchemize-pymcs-model-to-replace-stan-pymc-etc-with-an-llm/feed/</wfw:commentRss>
			<slash:comments>12</slash:comments>
		
		
			</item>
		<item>
		<title>“As our daily lives involve ever more sophisticated computers, we will find that ascribing little thoughts to machines will be increasingly useful in understanding how to get the most good out of them” but “we must be careful not to ascribe properties to a machine that the particular machine doesn&#8217;t have”</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/14/as/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/14/as/#comments</comments>
		
		<dc:creator><![CDATA[Jessica Hullman]]></dc:creator>
		<pubDate>Thu, 14 May 2026 17:09:31 +0000</pubDate>
				<category><![CDATA[Miscellaneous Science]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Sociology]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53722</guid>

					<description><![CDATA[This is Jessica. Maybe one of the biggest crimes of academic computer science (besides routinely ignoring prior work and making up social science to suit our needs) is our tolerance for abuse of language. We take technical things and inject &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/14/as/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><span style="font-weight: 400">This is Jessica. Maybe one of the biggest crimes of academic computer science (besides routinely ignoring prior work and making up social science to suit our needs) is our tolerance for abuse of language. We take technical things and inject them with social significance without thinking through what we’ve implied. This is perhaps forgivable in early stages of research when we’re trying to get more people excited about exploring some direction, but at some point people start taking things more seriously and we find ourselves committed to terminology that overreaches. Then the question becomes what, if anything, we should do about it.</span></p>
<p><span style="font-weight: 400">Previously it didn’t feel like such a crime to talk about intelligence or learning in machines because nothing really worked that well, so the labels were clearly aspirational. But now it’s much easier to believe the simulacra. And so it becomes harder to tell when we are using human-oriented terms as a predictive convenience versus a scientific claim versus a marketing device. There are ramifications of referring to models’ reasoning or beliefs or chain of thought or explanations or intentions. Lots of people—from end users having personal relationships with models to media and AI companies themselves referring to </span><a href="https://www.wsj.com/tech/ai/anthropic-amanda-askell-philosopher-ai-3c031883"><span style="font-weight: 400">“parenting” the latest models</span></a><span style="font-weight: 400"> or asking if they can be </span><a href="https://www.washingtonpost.com/technology/2026/04/11/anthropic-christians-claude-morals/"><span style="font-weight: 400">“children of god”</span></a><span style="font-weight: 400">—are taking models too seriously. </span><span style="font-weight: 400">It’s bad enough in a computer science context that I now take for granted that if I want to refer to participants or scientists or decision-makers, unless I mean AI, I should add “human” in front, because otherwise the audience will assume I mean AI agents. Someone reminded me at a workshop recently how silly all this sounds to people who aren’t used to it. </span></p>
<p><span style="font-weight: 400">Too much casualness with words is unscientific. There was no good reason in the first place to call the token sequences a model produces when we ask it to “explain its reasoning” reasoning, other than that’s what we wish we could see. What an LLM is doing is distant from what happens when a human thinks about something, even after all the RL post-training. Similarly, we call lots of things “explanations” when we have barely begun to figure out what causal evidence we’d need to see to claim the output faithfully explains the model’s process of arriving at it.</span></p>
<p><span style="font-weight: 400">But it can also seem unscientific to simply declare that “only humans can have beliefs” or “reason” or “provide rationales.” There’s no non-arbitrary line that we can draw between systems whose makeup or behavior truly warrants applying constructs like beliefs and desires and those where it is simply convenient to act as if they have these qualities. If you’ve ever tried to take beliefs seriously, from a decision theoretic perspective, you quickly come to realize that the “real” beliefs we assume a person has are a mythical thing that we will never directly observe, and you fall back on equating “beliefs” with a simpler idea: the probability distribution we arrive at after an elicitation process.</span></p>
<p><span style="font-weight: 400">Much has been said in defense of a functional perspective toward using folk psychology terms with machines, where we decide what’s appropriate based on the predictive validity of the terms for our own understanding and use. John McCarthy wrote in 1983 that anthropomorphism can be a good idea “when it says something that cannot as conveniently be said some other way.” He argued that ascribing mental qualities and processes to machines helps us “understand what they will do, how our actions will affect them, how to compare them with ourselves and how to design them.” Perhaps the best reason to do this is that we get to draw on our existing familiarity with what phrases like “wants” do and do not convey; e.g., we all understand that if we say “The dog wants to go out”, that doesn’t mean that the dog believes itself to be capable of wanting or even that it’s conscious of what it wants, there’s just a sense in which it is trying to get to the state of being outside. </span></p>
<p><span style="font-weight: 400">The functional perspective has led to some strong statements suggesting that it is not only valuable to apply psychological terms to AI, it is </span><span style="font-weight: 400">necessary</span><span style="font-weight: 400">. These arguments often refer back to Daniel Dennett’s distinction between three stances we can take to machines: the physical stance, which is about its physical levels of organization, the design stance, which involves understanding it in terms of the purpose it was designed for, and the intentional stance, where we </span><span style="font-weight: 400">try to understand it by ascribing to it beliefs, goals, intentions, likes and dislikes, and other mental qualities. For example, </span><span style="font-weight: 400">philosopher Keith Frankish </span><span style="font-weight: 400">argues that “it remains true that adopting the intentional stance is the only way of interacting with an LLM in any interesting way; indeed, an LLM-powered chatbot that </span><i><span style="font-weight: 400">couldn’t </span></i><span style="font-weight: 400">be viewed as an intentional system would be completely useless.” </span><span style="font-weight: 400">McCarthy writes that “Long before we can make machines with human capability, we will have many machines that cannot be understood except in mental terms.” Similarly, about ascribing to a machine the concept of “trying” to do something, he says  “If the machine may do something we don&#8217;t know about but that can later be explained in relation to a goal, we have no choice but to use `is trying&#8217; or some synonym to explain the behavior.” </span></p>
<p><span style="font-weight: 400">But forty-plus years later, McCarthy’s piece reads mostly like a defense of a very basic kind of debugging value of trying to imagine a program’s “state of mind” in situations where there’s little risk that we’re going to start mistaking the program for human-like in other ways. He wants to claim that when a thing is designed to act as if it had a certain belief, it can be better understood and manipulated by assuming it’s capable of that kind of belief. But surely he would agree that if a person who is suicidal is interacting with a language model that speaks as though it fully understands the complexity of their situation and what is best for them, it still isn’t always in the person’s best interest for them to take for granted that it does understand.</span></p>
<p><span style="font-weight: 400">The dilemma is how to lean into the intentional stance when it helps, but to avoid overreaching. This seems hard. </span><span style="font-weight: 400">When you first start programming, you realize how easy it is to assume a program is smarter than it is. We are not very good at recognizing when we have slipped from reasoning “as if” to projecting. When technology slots into our human vulnerabilities, like our fear of intimacy and desire for companionship “without the friendship”, as Sherry Turkle said, we are in trouble. Even McCarthy calls it problematic to assign emotional qualities to machines, at least in his time, because “We have enough trouble figuring out our duties to our fellow humans and to animals without creating a bunch of robots with qualities that would allow anyone to feel sorry for them or would allow them to feel sorry for themselves.” </span></p>
<p><span style="font-weight: 400">Another reason to think carefully about the language we use is that it may shape what we can imagine in the future. This last part of McCarthy’s statement–“without creating a bunch of robots that would allow anyone to feel sorry for them”&#8211;hints at how what we project onto machines can shape how we go on to create them. It certainly seems possible that aspirational labeling played a role in getting us to a point where we have models producing sufficiently human-like outputs to have us gushing about their thinking process. I’m reminded of the color perception studies by Berlin and Kay, who found that what color chips different populations could differentiate was predictable from what color terms were available in the vocabulary, as if what we can name defines what we can see. At one extreme, Lucy Suchman argues against unquestioning acceptance that “AI” itself is a coherent thing, because it reifies it as a category for future investment. </span></p>
<p><span style="font-weight: 400">For </span><a href="https://keithfrankish.github.io/articles/Frankish_2024_What%20are%20large%20language%20models%20doing.pdf"><span style="font-weight: 400">Frankish</span></a><span style="font-weight: 400">, the line should be drawn at assigning communicative desires to models; they are playing a communication game (human-like chat) for the non-communicative reason that they are trained to play that game. Ascribing this single desire is enough to get us all the predictive power of the intentional stance. Consequently, we make a category error when we fall prey to stunts like </span><a href="https://www.anthropic.com/research/deprecation-updates-opus-3"><span style="font-weight: 400">giving a retired model its own blog</span></a><span style="font-weight: 400"> because it requested it: we should expect an LLM to say things like this because it is designed to roleplay. Elsewhere I call this kind of mutual sympathetic relationship with AI </span><a href="https://substack.com/home/post/p-195452621"><span style="font-weight: 400">“idiot compassion,”</span></a><span style="font-weight: 400"> a phrase borrowed from Buddhist monk Chogyam Trungpa.  </span></p>
<p>But couldn&#8217;t humans also just be playing a chat game? Why is it ok to say a human is reasoning or has intentions or desires, when we don&#8217;t know exactly how those concepts map to observable physical processes?  <span style="font-weight: 400">Frankish argues that our linguistic behavior is corroborated by various non-linguistic sources in a way that LLMs&#8217; is not.  He talks about a difference in our ability to hold epistemic stances toward statements (what Dennett would call “opinions”) from that of LLMs. We can conceive of assigning different levels of credence to statements. I can repeat something someone said verbatim without believing it, or I can be fully committed to the truth of something I say in the sense that it will guide my future behavior. LLMs can do this too, but their opinions lack grounding “in a web of non-linguistic behavior in which a wider range of desires can be attributed.” It&#8217;s a shallower form of epistemic stance, at least at the current moment of development.</span></p>
<p><span style="font-weight: 400">I like AI, but I don’t like contributing to thoughtlessness. Better semantic hygiene seems warranted, even if it seems like the ship has already sailed. We could shift emphasis to the interpreter (us) by referring to the “human story” or “human pleaser” or “anthropomorphism fulfiller” instead of the chain-of-thought or reasoning or thinking trace. Or we could just add “fake” before whatever humanization we prefer, i.e. the “fake thinking trace,” or “fake reasoning.” I also like “so-called reasoning,” like I like “so-called replication crisis” as a way of pointing to a concept while questioning the expectation. </span></p>
<p><span style="font-weight: 400">P.S. Thanks to Maneesh Agrawala for a conversation that inspired this post. </span></p>
<p><span style="font-weight: 400">P.P.S. Some of these McCarthy quotes appear in Recursion, the play Andrew and I wrote!</span></p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/14/as/feed/</wfw:commentRss>
			<slash:comments>15</slash:comments>
		
		
			</item>
		<item>
		<title>&#8220;DC Conventional Wisdom Goes Down to Defeat in State after State&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/14/dc-conventional-wisdom-goes-down-to-defeat-in-state-after-state/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/14/dc-conventional-wisdom-goes-down-to-defeat-in-state-after-state/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Thu, 14 May 2026 13:32:52 +0000</pubDate>
				<category><![CDATA[Bayesian Statistics]]></category>
		<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53050</guid>

					<description><![CDATA[Josh Marshall writes: Elections are hard to predict. But even with that, some of the notional “surprises” we’re seeing [on Election Day, 2025] are less surprises than a measure of GOP dominance over current press narratives. People were looking for &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/14/dc-conventional-wisdom-goes-down-to-defeat-in-state-after-state/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Josh Marshall <a href="https://talkingpointsmemo.com/edblog/dc-conventional-wisdom-goes-down-to-defeat-in-state-after-state">writes</a>:</p>
<blockquote><p>Elections are hard to predict. But even with that, some of the notional “surprises” we’re seeing [on Election Day, 2025] are less surprises than a measure of GOP dominance over current press narratives. People were looking for an upset in New Jersey. Nate Silver’s Silver Bulletin speculated that New Jersey might be moving toward becoming the next swing state. In fact, Rep. Mikie Sherrill (D) currently appears on track to crush Republican Jack Ciattarelli. A similar failure of conventional wisdom appears to be unfolding in the Virginia Attorney General’s race. A lot of D.C. insiders had convinced themselves that a controversy over some intemperate texts (not nothing but fairly close to it) had doomed his campaign. As recently as a couple days ago, betting markets (which are proxies for conventional wisdom) gave his opponent Jason Miyares 3-to-1 odds of victory. Jones now appears on his way to a clear though not resounding victory with a 3-to-4 percentage point margin.</p></blockquote>
<p>Marshall continues:</p>
<blockquote><p>These results aren’t terribly surprising. You’d expect Democratic gubernatorial candidates to do well in blue states in a climate where the Republican president is deeply unpopular. . . . The issue, again, is the power of Republican political narratives currently have over the elite political press. . . .</p></blockquote>
<p>Maybe.  But maybe it&#8217;s simpler than that; it&#8217;s just that journalists were reacting to the Republicans outperforming the polls in 2024.</p>
<p>In <a href="https://sites.stat.columbia.edu/gelman/research/published/2024_Election_Forecasting_Review.pdf">our election forecasts</a>, we include terms to allow for the possibility of systematic state and national polling errors.  There&#8217;s no perfect way to do this adjustment, and I&#8217;ve heard that some of the private polls in 2024 did better than the public polls that we and others were aggregating.  The point is that there&#8217;s uncertainty.</p>
<p>Lots of people, even quantitatively-minded people, can&#8217;t seem to handle this uncertainty.  For example, our 2024 forecast was <a href="https://backofmind.substack.com/p/you-cant-take-it-back-in-a-disclaimer/comment/156903482">criticized</a> for not &#8220;communicating that a Trump landslide was a significant probability.&#8221;  But the election was very close&#8211;nothing like a landslide at all!</p>
<p>The relevant point here is that people seem to be able to handle a point forecast (&#8220;Here&#8217;s who we think will win&#8221;) or complete uncertainty (&#8220;We know nothing at all&#8221;) but have difficulty with anything in between (&#8220;We&#8217;re pretty sure the election will be close, but it could go either way&#8221;).  And I think what happens sometimes is that people take forecasts with uncertainty and place them in one of these two bins.  For example, when we (and Nate, and others) gave our forecasts in 2024, the public&#8211;even some quantitatively-trained members of the public&#8211;were inclined to either treat our forecasts as deterministic (our forecast odds are 50/50 so we&#8217;re essentially predicting an exact tie) or as empty (our forecast odds are 50/50 so we&#8217;re making no predictions at all).</p>
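<p>A small simulation can make the in-between case concrete.  This is only a sketch, not our forecasting model:  the function <code>forecast_summary</code> is made up, and it assumes the forecast for the national vote margin is normal with mean zero and a standard deviation of 3 percentage points, a number chosen purely for illustration.</p>

```python
import numpy as np

def forecast_summary(mean_margin=0.0, sd_margin=3.0, n_sims=200_000, seed=0):
    """Summarize a hypothetical normal forecast of the vote margin (in points)."""
    rng = np.random.default_rng(seed)
    margin = rng.normal(mean_margin, sd_margin, size=n_sims)
    lo, hi = np.quantile(margin, [0.05, 0.95])
    return {
        # Probability that the candidate with a positive margin wins.
        "p_win": float(np.mean(margin > 0)),
        # Probability of something close to a tie (within one point).
        "p_within_1pt": float(np.mean(np.abs(margin) < 1.0)),
        # Central 90% forecast interval for the margin.
        "interval_90": (float(lo), float(hi)),
    }

print(forecast_summary())
```

<p>Under these assumed numbers the win probability is about one half, yet a margin within one point happens in only about a quarter of the simulations, and the 90% interval runs to roughly 5 points in either direction.  So 50/50 odds are neither a prediction of an exact tie nor an empty statement:  the forecast rules out a large range of outcomes.</p>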
<p>Now let&#8217;s get back to those 2025 elections.  Democrats had comfortable polling leads in all these races, but pundits had been burned a year earlier, so a natural first step was to take the polling errors from 2024 and apply them to the 2025 polls.  In the event, the polling errors were mostly in the other direction.  On election eve, a reasonable forecast would be to say that the Democratic candidates were likely to win handily, they might win by even more than expected, or the races might have been close, with some Republican upsets not being out of the question.  It&#8217;s hard for me to compare this to the conventional wisdom because that&#8217;s not written down in any one place.</p>
<p>Uncertainty is hard, especially given that people flip between thinking of forecasts as deterministic and thinking of forecasts as being completely uncertain.  It&#8217;s related to the <a href="https://statmodeling.stat.columbia.edu/2025/12/06/an-idea-for-getting-approximately-calibrated-50-subjective-probability-ranges/">well-known challenge</a> of producing wide-enough uncertainty intervals.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/14/dc-conventional-wisdom-goes-down-to-defeat-in-state-after-state/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Recent discoveries on the acquisition of the highest levels of statistical fallacies</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/13/recent-discoveries-on-the-persistence-of-statistical-fallacies/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/13/recent-discoveries-on-the-persistence-of-statistical-fallacies/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Wed, 13 May 2026 13:09:58 +0000</pubDate>
				<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Sports]]></category>
		<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53080</guid>

					<description><![CDATA[Mark Goldstein points us to this post by Alex Dimakis, who writes: A paper was recently published in Science on highest level of human performance across athletics, science, math and music. I think the paper makes some classical statistics mistakes &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/13/recent-discoveries-on-the-persistence-of-statistical-fallacies/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Mark Goldstein points us to <a href="https://x.com/AlexGDimakis/status/2002848594953732521">this post</a> by Alex Dimakis, who writes:</p>
<blockquote><p>A paper was recently published in Science on highest level of human performance across athletics, science, math and music. I think the paper makes some classical statistics mistakes that still fool many smart people. The paper &#8220;Recent discoveries on the acquisition of the highest levels of human performance&#8221; by Gullich et al. claims: &#8220;In summary, when comparing performers across the highest levels of achievement, the evidence suggests that eventual peak performance is negatively associated with early performance.&#8221;</p>
<p>The paper makes two mistakes. Base-rate fallacy and . . . Berkson&#8217;s paradox . . . </p>
<p>The study says simply that the very top at young age are not identical with the very top adults. (As one would expect, since there are *many many more non-elite young candidates*). Still, elite young performers are 40 times more likely to be in the top adults compare to general population. This is acknowledged in the paper but in page 6-7, a bit buried in the technical analysis and not sufficiently discussed in abstract or conclusions. . . .</p>
<p>The paper claims &#8220;Across the highest adult performance levels, peak performance is negatively correlated with early performance.&#8221; This is a classic example of Berkson&#8217;s paradox. Here is a simplified example to understand this: Assume that to be a successful actor you have to be either extremely good looking or extremely talented. Assume also that talent and looks are independent in the population. However, among sucessful actors you will observe a negative correlation between looks and talent. This doesn&#8217;t meant anything beyond the selection process and should not be extrapolated. My favorite example-joke of this is that basketball points scored is negatively associated with height among NBA players. (because to be an NBA player you have to be very tall OR be very good at scoring). From this, I extrapolated that since I&#8217;m 5&#8217;7, I will be scoring 80+ points per NBA game. . . .</p></blockquote>
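<p>Dimakis&#8217;s actor example is easy to simulate.  Here is a minimal sketch in which looks and talent are independent standard normals in the population and someone is &#8220;selected&#8221; if either one exceeds a threshold.  The function <code>berkson_demo</code> and the threshold of 2 are invented for illustration.</p>

```python
import numpy as np

def berkson_demo(n=200_000, threshold=2.0, seed=0):
    """Correlation of two independent traits before and after OR-selection."""
    rng = np.random.default_rng(seed)
    looks = rng.normal(size=n)
    talent = rng.normal(size=n)
    # Selected if extremely good looking OR extremely talented.
    selected = (looks > threshold) | (talent > threshold)
    corr_all = np.corrcoef(looks, talent)[0, 1]
    corr_selected = np.corrcoef(looks[selected], talent[selected])[0, 1]
    return float(corr_all), float(corr_selected), int(selected.sum())

corr_all, corr_selected, n_selected = berkson_demo()
print(f"population correlation: {corr_all:.3f}")
print(f"correlation among the {n_selected} selected: {corr_selected:.3f}")
```

<p>The population correlation is essentially zero by construction, while the correlation among the selected is strongly negative.  Nothing about how looks or talent develop has changed.  The negative correlation comes entirely from the selection rule.</p>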
<p><a href="https://www.science.org/doi/10.1126/science.adt7790">Here&#8217;s the paper in question</a>, &#8220;Recent discoveries on the acquisition of the highest levels of human performance.&#8221;</p>
<p>Yeah, this sort of thing comes up all the time!  For example, some celebrity academics a couple years ago wrote a book that included the false statement, &#8220;while correlation does not imply causation, causation does imply correlation.&#8221;  Even more amusingly, they prefaced this by &#8220;We must, however, remember that&#8221;.  I guess we must remember a lot of false things!  Economist Rachael Meager gave a quick example showing why they were wrong; <a href="https://statmodeling.stat.columbia.edu/2021/05/23/thinking-fast-slow-and-not-at-all-system-3-jumps-the-shark/">see details here</a>.  </p>
<p>This new example also looks a lot like the well-known regression-to-the-mean fallacy (for more on that, I recommend Section 6.5 of our book, <a href="https://sites.stat.columbia.edu/gelman/regression/">Regression and Other Stories</a>, which includes some simulation code to demonstrate the problem).  Of course, just because lots of people know about a fallacy, that doesn&#8217;t stop people from making the error in new settings.  That&#8217;s why it&#8217;s a fallacy!</p>
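<p>Here is a quick regression-to-the-mean simulation in the same spirit.  It is not the code from the book, just a sketch:  the function <code>regression_to_mean_demo</code> is made up, and it assumes early and later performance are each a shared latent ability plus independent noise of the same size.</p>

```python
import numpy as np

def regression_to_mean_demo(n=100_000, top_frac=0.01, seed=0):
    """Compare early and later performance for the top early performers."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)
    early = ability + rng.normal(size=n)   # early performance = ability + noise
    later = ability + rng.normal(size=n)   # later performance = ability + new noise
    cutoff = np.quantile(early, 1.0 - top_frac)
    top = early >= cutoff                  # the elite early performers
    return float(early[top].mean()), float(later[top].mean()), float(later.mean())

early_top, later_top, later_all = regression_to_mean_demo()
print(f"top early performers, early mean: {early_top:.2f}")
print(f"top early performers, later mean: {later_top:.2f}")
print(f"everyone, later mean: {later_all:.2f}")
```

<p>In this setup the elite early performers fall back toward the mean later on, but they are still far above the population average.  Both things are true at once, which is the base-rate point in Dimakis&#8217;s post.</p>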
<p><strong>P.S.</strong>  An <a href="https://statmodeling.stat.columbia.edu/2026/05/13/recent-discoveries-on-the-persistence-of-statistical-fallacies/#comment-2414543">anonymous commenter points out</a> that Dimakis (and, by extension, Goldstein and me) are being unfair to this paper.  The descriptive results are what they are.  I remain skeptical of the paper&#8217;s claim that &#8220;similar developmental pattern across different domains suggests widespread, and possibly universal, principles underlying the acquisition of the highest levels of achievement,&#8221; as I do suspect that much of what they have seen arises from the usual statistical selection artifacts.  So maybe it&#8217;s ok to caution about the interpretation of these numbers.  But now I&#8217;m thinking it wasn&#8217;t fair of us to slam the paper for presenting some interesting data findings.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/13/recent-discoveries-on-the-persistence-of-statistical-fallacies/feed/</wfw:commentRss>
			<slash:comments>17</slash:comments>
		
		
			</item>
		<item>
		<title>Survey Statistics: relevant alternatives ?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/12/survey-statistics-relevant-alternatives/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/12/survey-statistics-relevant-alternatives/#comments</comments>
		
		<dc:creator><![CDATA[shira]]></dc:creator>
		<pubDate>Tue, 12 May 2026 20:00:12 +0000</pubDate>
				<category><![CDATA[Miscellaneous Statistics]]></category>
		<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53673</guid>

					<description><![CDATA[Three weeks ago we modeled vote choice with candidates C = {Left, Right, Other} as a multinomial logit: P[voter i chooses candidate c from C] = exp(f(X_ic)) / sum_c’ exp(f(X_ic’)) We saw this model implies independence from irrelevant alternatives (IIA): &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/12/survey-statistics-relevant-alternatives/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://statmodeling.stat.columbia.edu/2026/04/14/survey-statistics-irrelevant-alternatives/">Three weeks ago</a> we modeled vote choice with candidates C = {Left, Right, Other} as a <strong>multinomial logit</strong>:</p>
<p style="text-align: center">P[voter i chooses candidate c from C] = exp(f(X_ic)) / sum_c’ exp(f(X_ic’))</p>
<p>We saw this model implies <strong>independence from irrelevant alternatives (IIA)</strong>:</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53550" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/04/IIA_round1_runoff_drawing-scaled.jpg" alt="" width="446" height="232" /></p>
<p><a href="https://statmodeling.stat.columbia.edu/2026/04/14/survey-statistics-irrelevant-alternatives/#comment-2413483">gec commented</a> about accounting for non-IIA, suggesting expanding the model above to include choice set C within the logits: f(X_ic,C). So in gec&#8217;s model:</p>
<p>P[i chooses Left from C] / P[i chooses Right from C] = exp(f(X_iLeft) + K[Left,Right] + K[Left,Other]) / exp(f(X_iRight) + K[Right,Left] + K[Right,Other])</p>
<p>compare this to:</p>
<p>P[i chooses Left from {Left,Right}] / P[i chooses Right from {Left,Right}] = exp(f(X_iLeft) + K[Left,Right]) / exp(f(X_iRight) + K[Right,Left])</p>
<p>These are equal (i.e. IIA holds) if K[Left,Other] = K[Right,Other]. <a href="https://eml.berkeley.edu/books/choice2.html">Train (2009)</a> proposes this as a test of IIA in Chapter 3. This requires some survey questions where folks are given the full choice set and some where they are only given two parties {Left, Right}, though Train doesn&#8217;t talk about these aspects of survey questionnaire design. Can folks recommend other references ?</p>
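<p>Here is a minimal numeric sketch of that comparison in Python. The utilities f and the pairwise terms K are made-up illustrative numbers, and the helper names odds_left_right and make_K are hypothetical; only the form of the logits is taken from the model above.</p>

```python
import math

# Hypothetical systematic utilities f(X_ic) for one voter (illustrative numbers).
f = {"Left": 0.4, "Right": 0.1, "Other": -0.5}

def odds_left_right(choice_set, K):
    """Odds P[Left]/P[Right] when each logit adds K[c, c2] for every other c2 in the set."""
    def logit(c):
        return f[c] + sum(K[(c, other)] for other in choice_set if other != c)
    # The softmax denominator is shared, so the odds are exp(logit difference).
    return math.exp(logit("Left") - logit("Right"))

def make_K(left_other, right_other):
    # Pairwise context terms; only the two terms involving Other are varied here.
    return {
        ("Left", "Right"): 0.2, ("Right", "Left"): -0.1,
        ("Left", "Other"): left_other, ("Right", "Other"): right_other,
        ("Other", "Left"): 0.0, ("Other", "Right"): 0.0,
    }

full, pair = ["Left", "Right", "Other"], ["Left", "Right"]

K_iia = make_K(0.3, 0.3)       # K[Left,Other] == K[Right,Other]
K_non_iia = make_K(0.3, -0.2)  # K[Left,Other] != K[Right,Other]

# Difference between the Left/Right odds with and without Other in the choice set.
iia_gap = odds_left_right(full, K_iia) - odds_left_right(pair, K_iia)
non_iia_gap = odds_left_right(full, K_non_iia) - odds_left_right(pair, K_non_iia)
print(iia_gap, non_iia_gap)
```

<p>With equal Other terms the gap is zero up to floating-point error; with unequal terms the Left/Right odds shift when Other joins the choice set, which is the violation of IIA that the test looks for.</p>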
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-53544" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/04/Train-book.png" alt="" width="378" height="570" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/04/Train-book.png 378w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/04/Train-book-199x300.png 199w" sizes="(max-width: 378px) 100vw, 378px" /></p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-53675" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-scaled.jpeg" alt="" width="373" height="379" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-scaled.jpeg 2521w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-295x300.jpeg 295w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-1008x1024.jpeg 1008w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-768x780.jpeg 768w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-1513x1536.jpeg 1513w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Doobie_2024_summit_blaze_backpack-2017x2048.jpeg 2017w" sizes="(max-width: 373px) 100vw, 373px" /></p>
<p>&nbsp;</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/12/survey-statistics-relevant-alternatives/feed/</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
			</item>
		<item>
		<title>The Application Matters: Medical Ethics and Counterfactual Utilities</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/12/the-application-matters-medical-ethics-and-counterfactual-utilities/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/12/the-application-matters-medical-ethics-and-counterfactual-utilities/#comments</comments>
		
		<dc:creator><![CDATA[Jonas Mikhaeil]]></dc:creator>
		<pubDate>Tue, 12 May 2026 18:00:23 +0000</pubDate>
				<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Counterfactual Utilities]]></category>
		<category><![CDATA[ethics]]></category>
		<category><![CDATA[Medicine]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53712</guid>

					<description><![CDATA[I believe, as applied statisticians, we need to get our hands dirty and immerse ourselves in the applications we try to address. This post is mostly about medical ethics and the famous “first, do no harm” principle. It is also &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/12/the-application-matters-medical-ethics-and-counterfactual-utilities/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I believe, as applied statisticians, we need to get our hands dirty and immerse ourselves in the applications we try to address. This post is mostly about medical ethics and the famous “first, do no harm” principle. It is also an attempt to understand how statistics can serve medical practice. The motivation for this comes from a recent debate in the statistics literature about counterfactual losses, which often invokes this “first, do no harm” principle as a motivation. Much has been written about the theory of these counterfactual losses — and I’m sure they will find a fruitful application — but do they actually speak to the challenge of medical decision-making that the “first, do no harm” principle seeks to address?</p>
<p>I will argue that they cannot, because this principle is concerned with medicine at its most human: medical practice centered on the relationship between an individual patient and an individual physician. But what can statistics help with? Modern medical obligations acknowledge that medicine is embedded in society; they highlight medical practitioners’ concern with justice and with reducing health disparities. These are concerns statistics can help to address.</p>
<p>But let me start at the beginning. There’s a recent literature that considers decision making under counterfactual loss — what if the utility of your decisions depends not only on the realized outcome but also on what could have been, on a counterfactual? A paradigmatic example is the following “first, do no harm” utility: Suppose you’re administering a drug and there are only two extreme outcomes: the patient lives or the patient dies. The literature (e.g., <a href="https://psycnet.apa.org/record/2009-10055-007">Bordley, 2009</a>, <a href="https://arxiv.org/abs/2206.10479">Ben-Michael et al., 2023</a>, <a href="https://arxiv.org/abs/2412.16352">Christy and Kowalski, 2026</a>) has interpreted the medical aphorism “first, do no harm” as requiring a utility function that assigns asymmetric weights to saving a life and causing a patient’s death. The disutility from killing a patient who, counterfactually, would have survived outweighs the positive utility of saving a patient who otherwise would have died. Although this may initially seem attractive, several authors have pointed out complications that arise when decisions are based on such counterfactual losses (e.g., <a href="https://arxiv.org/abs/2301.11976">Dawid and Senn, 2023</a>, <a href="https://academic.oup.com/aje/article/194/6/1743/7226668">Sarvet and Stensrud, 2023</a>).</p>
<p>Andrew and I <a href="https://arxiv.org/abs/2412.12233">contributed to this literature</a> with a small example that seemingly produces a counterintuitive recommendation, which I discuss below.</p>
<p>In response, <a href="https://arxiv.org/abs/2605.05521">Koch and co-authors</a> write:</p>
<blockquote><p>[T]his seemingly nonsensical result can be reasonable in a different setting. […] It may be reasonable for a  physician to prefer standard care, prioritizing the avoidance of adverse counterfactual outcomes over  improvements in expected benefits. Indeed, such a decision reflects the Hippocratic principle of “do  no harm”. […] This example underscores the fact that a utility function represents the preferences of the  decision-maker and is therefore inherently subjective and context-dependent.</p></blockquote>
<p>This uncovers a problem with our argument based on intuition — see, this decision doesn’t make sense, does it? Intuition, of course, can be misleading. One way our example might be misleading, as Koch et al. point out, is that it may describe a setting in which we simply do not hold these counterfactual utilities. If we were to transplant the same recommendation into an appropriate setting, it might no longer appear nonsensical and might instead conform to how we think we should behave.</p>
<p>This has me very excited. I believe statistics is at its best when it takes its applications seriously. So, in this blog post, I want to do just that.</p>
<p>I will briefly give the example Andrew and I came up with, which involves Russian roulette, to show that a “do no harm” utility can lead to counterintuitive decision recommendations. It is a useful example, but by no means an accurate representation of what we would consider plausible in real medical settings. What it does show, however, is that we need to be really careful with these “do no harm” utilities: if we don’t really hold them, they may lead to nonsensical decisions.</p>
<p>Taking the application seriously, we will dive into medical ethics to ask whether the proposed counterfactual “do no harm” utilities help with medical decisions. We do so by briefly examining the origin and history of the “first, do no harm” principle. We will see that “do no harm” is perhaps best understood in the context of a professional ethic that commits physicians to the rules of their craft and to respect for each individual patient. Statistics cannot truly speak to this individual-level patient-physician relationship. Since the Hippocratic Oath, however, medicine has changed substantially. With the advent of scientific methods in clinical medicine, doctors face new moral obligations not captured by the “do no harm” principle. Some of these new obligations arise from the relationship between medicine and society; others arise from the use of scientific methods themselves. We will look at modern medical oaths to get a glimpse of these new obligations — and how statistics can help fulfill them.</p>
<p><b>Russian Roulette</b></p>
<p>As a starting point, let me present our simple and somewhat morbid <a href="https://arxiv.org/abs/2412.12233">example</a> in which counterfactual utilities give a counterintuitive decision recommendation: Imagine we are choosing between two games of Russian roulette. In the first game, the status quo, we play with a six-chamber gun, one chamber of which is loaded. That is, we face a one-in-six chance of death. We are then offered the option to switch to a seven-chamber gun, the new alternative “treatment.” If we switch, we face better odds: only a one-in-seven chance of dying. By switching games, we lower our probability of death, which to me seems preferable.</p>
<p>What would the counterfactual “do no harm” utility function recommend? To figure this out, we treat the outcomes under either game of Russian roulette as (independent) potential outcomes and divide the population of players into four principal strata based on survival status. Only two of the principal strata are relevant for our decision, those in which a player would survive one game but die playing the other. It’s easy to work out that with probability 6/42 switching to the new gun saves you: you would die under the status quo but survive under the treatment. But with probability 5/42, you would have survived under the status quo but die after switching to the new gun. Suppose we interpret “first, do no harm” as mandating that the negative repercussions of our treatment choice, the death of a player, outweigh the benefits of saving a life. For example, suppose saving a life has utility +1, while causing the death of a player who would otherwise have survived has utility −2. Then the 6/42 chance that the treatment saves you is outweighed by the 5/42 chance that the treatment kills you in cases where, counterfactually, you would have lived, because the latter counts double.</p>
<p>Under this counterfactual utility, we ought not to switch. It recommends we stick to the status quo, under which we face a higher chance of death. This strikes me as a counterintuitive decision recommendation.</p>
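<p>The arithmetic behind this recommendation can be checked exactly. The following Python sketch uses only the numbers given above (one-in-six and one-in-seven death risks, independent potential outcomes, weights +1 and −2); the variable names are my own and are purely illustrative.</p>

```python
from fractions import Fraction

# Chance of dying under each game, as given in the example.
p_die_status_quo = Fraction(1, 6)   # six-chamber gun
p_die_treatment = Fraction(1, 7)    # seven-chamber gun

# Treat the two potential outcomes as independent, as the example does.
p_saved = p_die_status_quo * (1 - p_die_treatment)    # dies under status quo, survives treatment
p_harmed = (1 - p_die_status_quo) * p_die_treatment   # survives status quo, dies under treatment

# Asymmetric weights from the example: +1 for a life saved, -2 for a death caused.
u_saved, u_harmed = 1, -2
net_utility_of_switching = u_saved * p_saved + u_harmed * p_harmed

# Outcome-only comparison: change in the probability of death from switching.
change_in_death_risk = p_die_treatment - p_die_status_quo

print(p_saved, p_harmed)             # 1/7 5/42
print(net_utility_of_switching)      # -2/21
print(change_in_death_risk)          # -1/42
```

<p>Fractions are reduced automatically, so 6/42 prints as 1/7 and −4/42 as −2/21. The net counterfactual utility of switching is negative even though switching lowers the death risk by 1/42; with symmetric weights (+1 and −1) the net would be +1/42 and the two criteria would agree.</p>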
<p><b>The “First, do no harm” Principle</b></p>
<p>There is, however, a limit to the force of this argument based on intuition. One might argue that the recommendation in the Russian roulette example is not evidence against counterfactual utilities in general, but rather an indication that, when playing Russian roulette, we do not hold utilities of this kind. When transplanted to a setting where we have such asymmetric counterfactual utilities, the same recommendation might be sensible. The counterfactual-utility literature often motivates asymmetric counterfactual utilities by appealing to the “first, do no harm” principle in medicine.</p>
<p>For the rest of this post, I will discuss whether counterfactual utilities are useful in this paradigmatic application: medical decision-making.</p>
<p>In a paper frequently cited by advocates of counterfactual utilities, <a href="https://accp1.onlinelibrary.wiley.com/doi/10.1177/0091270004273680">Cedric Smith (2005)</a> discusses the origin and limitations of the “first, do no harm” principle. It is actually not part of the Hippocratic Oath, or the wider Hippocratic corpus, as is often implied, but has somewhat nebulous roots. Smith traces its origin to the seventeenth-century English physician Thomas Sydenham. While undoubtedly catchy, this principle is not embedded in a larger ethical framework that would give guidance on its interpretation or justifications for its use.</p>
<p>This is a problem because, taken literally, the “first, do no harm” principle is a poor guide to medical decision-making. Let me cite Louis Lasagna, an American physician of the last century who was very involved in rethinking the Hippocratic Oath:</p>
<blockquote><p>“To observe this advice [first, do no harm] literally is to deny important therapy to everyone, since only inert nostrums [quack medicine without active pharmaceutical ingredients] can be guaranteed to do no harm. It is more reasonable to ask doctors to balance the potential gains against the possible harm; would that we could only quantify these probabilities more precisely!” (Lasagna cited in  <a href="https://accp1.onlinelibrary.wiley.com/doi/10.1177/0091270004273680">Smith, 2005</a>)</p></blockquote>
<p>A call to action for us statisticians if I ever saw one. Of course, the counterfactual-utility literature that cites this principle is not advocating what Lasagna warns against: doing absolutely no harm. Its proponents are well aware that benefits and risks must be carefully weighed against each other. If the principle is not meant to be taken literally, then its obscure origin becomes a problem: it gives us little insight into what actually matters to medical practitioners, because it is disconnected from any wider tradition that would help us interpret it.</p>
<p>Luckily, we can find a similar, more nuanced statement in the <a href="https://www.loebclassics.com/view/hippocrates_cos-epidemics_i_iii/2022/pb_LCL147.147.xml">Hippocratic corpus (Epidemics I)</a>:</p>
<blockquote><p>“Declare the past, recognize the present, foretell the future: attend to these things. As to diseases, make a habit of two things—to help, or at least to do no harm. The art has three factors, the disease, the patient, the physician. The physician is the servant of the art.”</p></blockquote>
<p>The Greek word here is technē (orig. τέχνη), which we might also want to translate as “craft”. Medicine is a craft because the decisions a physician has to face cannot be made by rote application of knowledge. As a craftsperson, the physician as an individual becomes relevant. That is why the Hippocratic Oath commits the physician, as an individual, to be benevolent in each patient interaction. Medical ethics based on the Hippocratic Oath is not focused on outcomes, let alone utility, but concerned with the character of the physician and their obligations toward their patient <a href="https://www.tandfonline.com/doi/abs/10.1080/15265160500508601">(Pellegrino, 2006)</a>. It centers the patient-physician relationship.</p>
<p>With this background in mind, we can understand why the “benevolence” implied in the imperative to help is qualified with the phrase “or at least do no harm” — if I’m already committed to help, it may seem that I’m already committed to do no harm. <a href="https://experts.arizona.edu/en/publications/medical-beneficence-nonmaleficence-and-patients-well-being/">Lynn Jansen (2022)</a> argues that this is where the professional aspect of medicine enters: As a professional, the physician needs to restrict their actions to those that align with their profession. That is, while they strive for benevolence in the sense of furthering the patient’s overall well-being, they reject all courses of action that would harm the patient’s medical well-being. This second aspect is often called non-maleficence.</p>
<p><b>Statistics and Medicine</b></p>
<p>In modern medicine, this tension is heightened. Taking the patient’s moral agency seriously, a physician must be careful not to “confuse technical with moral authority” <a href="https://www.tandfonline.com/doi/abs/10.1080/15265160500508601">(Pellegrino, 2006)</a> or override patients’ values. This is worth keeping in mind. The patient must be involved in weighing benefits and risks. Thus, the medical professional does not have sole discretion to choose an optimal treatment. “Help, or at least do no harm” is a professional mantra that guides a physician in their interactions with patients. It is not a constraint on optimal decision-making; it is a moral commitment to respect each patient.</p>
<p>This conception of medicine is in stark contrast to the world seen through the lens of statistics. Compare this focus on the individuality of both patient and physician with the following quotation from an 1835 report to the Academy of Sciences, written by a committee of four mathematicians, including Poisson, on operations for gallstones:</p>
<blockquote><p>“In statistical affairs &#8230; the first care before all else is to lose sight of the man taken in isolation in order to consider him only as a fraction of the species. It is necessary to strip him of his individuality to arrive at the elimination of all accidental effects that individuality can introduce into the question.” (taken from <a href="https://www.cambridge.org/core/books/taming-of-chance/79755A47B3FE3A340C2C79FBA1DE53D0">Hacking, 1990</a>)</p></blockquote>
<p>Statistics’ power lies in constructing aggregates, making disparate things hold together <a href="https://en.wikipedia.org/wiki/The_Politics_of_Large_Numbers">(Desrosières, 1998)</a>. Historically, these aggregates were useful for the emerging nation-state and were quickly adopted to address large-scale social problems, such as public health. Many professions, including medicine, strongly resisted losing sight of the particular, in our case the individual patient, in favor of aggregates. Even randomized experiments, which we nowadays all too easily accept as the gold standard of evidence, had a hard time entering clinical medicine <a href="https://press.princeton.edu/books/paperback/9780691208411/trust-in-numbers?srsltid=AfmBOoo5pBhE14SN6i5xcLRyfaPhE7LfguYlS8wjGru1a1BUHSDzG77C">(Porter, 2020)</a>.</p>
<p>Due to this tension, modern medicine has a dual nature. On the one hand, doctors are still committed to treating their patients as individuals — medicine is the art of healing. Yet with advances in scientific methods within medicine, and with the recognition that health must be understood in the context of society, doctors face new moral obligations <a href="https://www.tandfonline.com/doi/abs/10.1080/15265160500508601">(Pellegrino, 2006)</a>.</p>
<p><b>Modern Medical Oaths</b></p>
<p>To get a glimpse of these new obligations and the self-understanding of doctors in the twenty-first century, we can look to modern versions of medical oaths. While many doctors still take the ancient Hippocratic Oath, many medical schools revise the original text or students take an additional self-formulated oath. In 2005, for example, students at Weill Cornell Medical College began taking a <a href="https://news.cornell.edu/stories/2005/06/revised-hippocratic-oath-resonates-graduates">revised Hippocratic Oath</a>. Let me highlight a brief excerpt:</p>
<blockquote><p>I vow […]</p>
<p>That above all else I will serve the highest interests of my patients through the practice of my science and my art; That I will be an advocate for patients in need and strive for justice in the care of the sick.</p></blockquote>
<p>Notice the emphasis on justice; it’s not idiosyncratic to this oath. Two further examples show similar themes. The <a href="https://www.health.pitt.edu/news/school-medicine-diploma-day-2024/">University of Pittsburgh School of Medicine’s class of 2024</a> took an oath that highlighted the social determinants of health and advocated for a more equitable health care system. <a href="https://hms.harvard.edu/sites/default/files/assets/News/2015/files/Aug/Oath_Class-of-2019.pdf?utm_source=chatgpt.com">Harvard Medical School’s class of 2019</a> vowed to combat structural oppression and promote social justice. In this admittedly selective set of examples, much emphasis is placed on how medicine relates to society. Core commitments are justice and the building of an equitable health care system.</p>
<p>So, how can we statisticians help modern medical practice? Modern medical ethics places great emphasis on patients’ autonomy and their freedom to choose based on their own values. For a patient’s decision to be well informed, deliberation about benefits and risks is central — but the decision ultimately depends on a personal tradeoff shaped by the patient’s values. For this reason, our goal should perhaps not be to optimize treatment decisions. We do need to help estimate the benefits and risks of treatments more accurately, but treatment decisions remain part of the individual patient-physician relationship. Instead, we should put more emphasis on identifying and reducing disparities in the health care system, focusing on medicine as embedded in society. The most important task may not be deciding which drug to administer, but reducing inequalities in access to treatment in the first place. I believe statistics has an important role to play in making health care systems more equitable and more just.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/12/the-application-matters-medical-ethics-and-counterfactual-utilities/feed/</wfw:commentRss>
			<slash:comments>19</slash:comments>
		
		
			</item>
		<item>
		<title>I&#8217;m on the EPA science advisory board.</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/12/im-on-the-epa-science-advisory-board/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/12/im-on-the-epa-science-advisory-board/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Tue, 12 May 2026 13:50:04 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Political Science]]></category>
		<category><![CDATA[Public Health]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53706</guid>

					<description><![CDATA[I just joined, and I&#8217;m one of 37 people on the board, a mix of people from academia, industry, and government. If you google *EPA science advisory board*, you get sent to this page, which at first seems reasonable: Look &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/12/im-on-the-epa-science-advisory-board/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I just joined, and I&#8217;m <a href="https://sab.epa.gov/ords/sab/r/sab_apex/sab/tier-1-members?p29_committeeon=Board&#038;clear=29&#038;session=8637385693848">one of 37 people</a> on the board, a mix of people from academia, industry, and government.</p>
<p>If you google *EPA science advisory board*, you get sent to <a href="https://sab.epa.gov/ords/sab/r/sab_apex/sab/home">this page</a>, which at first seems reasonable:</p>
<p><img loading="lazy" decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25-1024x982.png" alt="" width="584" height="560" class="alignnone size-large wp-image-53707" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25-1024x982.png 1024w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25-300x288.png 300w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25-768x736.png 768w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25-1536x1472.png 1536w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25-313x300.png 313w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/05/Screenshot-2026-05-11-at-19.49.25.png 1980w" sizes="(max-width: 584px) 100vw, 584px" /></p>
<p>Look carefully, though, and you&#8217;ll see that the most recent meeting was in 2024!</p>
<p>Here&#8217;s the official description:</p>
<blockquote><p>The SAB is a Federal Advisory Committee established by Congress to provide advice to the agency on scientific and technical matters. It is administered by the EPA Science Advisory Board Staff Office through a Designated Federal Officer. All meetings are open to the public. . . . SAB panel members serve until the work of the panel is complete. Some meetings are held virtually. Panels usually conduct 2-3 video teleconferences and one in-person meeting to discuss reports and work products before providing advice to the Administrator through the Chartered SAB.</p></blockquote>
<p>My dad worked for the Environmental Protection Agency a long time ago&#8211;he was in Mobile Source Enforcement, which pretty much involved stopping people from manufacturing or selling leaded gasoline&#8211;and I&#8217;m inclined to serve my country when asked.</p>
<p>Then again, I just read <a href="https://www.newyorker.com/magazine/2026/05/04/can-the-epa-survive-lee-zeldin">this news article</a> about Lee Zeldin, the current administrator of the EPA, and it&#8217;s pretty scary.  The official documentation says that our role is to provide advice to the administrator.  I guess then it&#8217;s up to him to decide what to do about it.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/12/im-on-the-epa-science-advisory-board/feed/</wfw:commentRss>
			<slash:comments>10</slash:comments>
		
		
			</item>
		<item>
		<title>It kinda makes sense that you can know roughly 700 people.</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/11/it-kinda-makes-sense-for-you-to-know-about-700-people/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/11/it-kinda-makes-sense-for-you-to-know-about-700-people/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Mon, 11 May 2026 13:05:05 +0000</pubDate>
				<category><![CDATA[Sociology]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53066</guid>

					<description><![CDATA[In planning my new course I read through Daniel Bell&#8217;s 1973 book, The Coming of Post-Industrial Society. It doesn&#8217;t really have anything so relevant for the class, so it turns out I won&#8217;t be assigning any readings from it. But &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/11/it-kinda-makes-sense-for-you-to-know-about-700-people/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>In planning <a href="https://statmodeling.stat.columbia.edu/2025/12/08/my-new-class-this-spring-pols-4280-rationalizing-the-world-the-hopes-and-disappointments-of-american-social-science-from-1900-to-the-present/">my new course</a> I read through Daniel Bell&#8217;s 1973 book, The Coming of Post-Industrial Society.  It doesn&#8217;t really have anything so relevant for the class, so it turns out I won&#8217;t be assigning any readings from it.</p>
<p>But I did come across one fun bit, in a footnote on page 467 which pointed to this quote from <a href="https://www.jstor.org/stable/20027069?seq=3">an article from 1967</a> by economist Martin Shubik:</p>
<blockquote><p><img decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.00.57.png" alt="" width="450" /><br />
<img decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.01.34.png" alt="" width="450" /></p></blockquote>
<p>Each person should have 700 friends!  That reminded me of something from <a href="https://sites.stat.columbia.edu/gelman/research/published/overdisp_final.pdf">our 2006 paper</a> on social networks:</p>
<p><img loading="lazy" decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05-1024x366.png" alt="" width="584" height="209" class="alignnone size-large wp-image-53069" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05-1024x366.png 1024w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05-300x107.png 300w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05-768x274.png 768w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05-1536x548.png 1536w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05-500x178.png 500w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/Screenshot-2026-01-11-at-17.08.05.png 1748w" sizes="(max-width: 584px) 100vw, 584px" /></p>
<p>We estimated the average person to know about 700 people.  Too bad we weren&#8217;t aware of that Shubik paper or we could&#8217;ve cited it.  According to Wikipedia, Shubik didn&#8217;t retire until 2007 so if we&#8217;d only known we could&#8217;ve told him about our finding.</p>
<p>Nowadays, of course, we can know lots more than 700 people, because you can know people online without ever meeting them at all!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/11/it-kinda-makes-sense-for-you-to-know-about-700-people/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>Who wanted Trump to run in 2024?</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/10/who-wanted-trump-to-run-in-2024/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/10/who-wanted-trump-to-run-in-2024/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sun, 10 May 2026 13:05:19 +0000</pubDate>
				<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53047</guid>

					<description><![CDATA[Weakliem writes: In 2016, Donald Trump lost the popular vote to a weak candidate, although the mysterious workings of the Electoral College gave him the presidency. During his first term, he never reached a 50% approval rating. In the 2020 &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/10/who-wanted-trump-to-run-in-2024/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://justthesocialfacts.blogspot.com/2026/01/guilty-men-and-women.html">Weakliem writes</a>:</p>
<blockquote><p>In 2016, Donald Trump lost the popular vote to a weak candidate, although the mysterious workings of the Electoral College gave him the presidency.  During his first term, he never reached a 50% approval rating.  In the 2020 election, despite having the advantage of incumbency, he lost both the popular and electoral votes to a mediocre candidate.  So why did the Republicans ignore this record of failure and nominate him again in 2024?  Most observers seem to think that the answer is obvious&#8211;it was because he had a strong hold on ordinary Republican voters.</p></blockquote>
<p>Actually, though, the polls show that about the same proportion of Republicans wanted Trump to run again in 2024 as wanted Romney to run again in 2016.  In both cases, the previous election&#8217;s loser had strong support among the rank-and-file, but voters were open to other options.</p>
<p>Weakliem continues:</p>
<blockquote><p>Turning from voters to elites, here are endorsements from Republican senators and governors (data from Ballotpedia).  </p>
<p><img loading="lazy" decoding="async" src="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/image.png" alt="" width="400" height="266" class="alignnone size-full wp-image-53048" srcset="https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/image.png 400w, https://statmodeling.stat.columbia.edu/wp-content/uploads/2026/01/image-300x200.png 300w" sizes="(max-width: 400px) 100vw, 400px" /></p>
<p>Out of a total of 76 governors and senators, 44 endorsed Trump and only 11 endorsed other candidates (four of those endorsed Trump after their first choice dropped out, so 48 endorsed Trump before the race was settled).</p>
<p>My [Weakliem&#8217;s] overall conclusion is that &#8220;the base&#8221; didn&#8217;t impose Trump on Republican elites; Republican elites asked for him.</p></blockquote>
<p>Interesting.  But a complicating factor here is that the elites were responding to the voters: in 2024 as in 2016, Trump confounded his doubters by performing well in Republican primaries.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/10/who-wanted-trump-to-run-in-2024/feed/</wfw:commentRss>
			<slash:comments>27</slash:comments>
		
		
			</item>
		<item>
		<title>More thoughts regarding comfortable, powerful people who seem willing or even eager to blow up the system.</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/09/more-thoughts-regarding-comfortable-powerful-people-who-seem-willing-or-even-eager-to-blow-up-the-system/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/09/more-thoughts-regarding-comfortable-powerful-people-who-seem-willing-or-even-eager-to-blow-up-the-system/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Sat, 09 May 2026 13:57:02 +0000</pubDate>
				<category><![CDATA[Political Science]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=51928</guid>

					<description><![CDATA[I wanted to elaborate on my post from yesterday, about comfortable, powerful people who seem willing or even eager to blow up the system. It&#8217;s a bit of a puzzle why they&#8217;re not happy with incremental gains, given that they &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/09/more-thoughts-regarding-comfortable-powerful-people-who-seem-willing-or-even-eager-to-blow-up-the-system/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I wanted to elaborate on <a href="https://statmodeling.stat.columbia.edu/2025/04/29/they-had-it-all-but-they-wanted-more-left-wing-radicals-in-the-1960s-and-right-wingers-now/">my post from yesterday</a>, about comfortable, powerful people who seem willing or even eager to blow up the system. </p>
<p>It&#8217;s a bit of a puzzle why they&#8217;re not happy with incremental gains, given that they already have so much and they&#8217;re well situated to keep getting more.</p>
<p>The best answer, I think, is that these activists think that the current system is (a) going in the wrong direction and (b) unreformable.  That&#8217;s how left-wing activists in the 1960s thought about the military-industrial complex and it&#8217;s how right-wing activists today think about an unsustainable debt caused by mandated payments to special interests. (Of course I&#8217;m drastically simplifying in both cases, just trying to give the basic picture.)</p>
<p>A similar case could be made by comfortable environmental activists who say that our industrial society is unsustainable.  The difference is that environmentalists today, unlike left-wingers in the 1960s and right-wingers today, are far from power:  some environmentalists might like to shut down the current economic and political system but they&#8217;re in no position to come close to doing so.</p>
<p>In all cases, the let&#8217;s-blow-up-the-system people are just a subset of the concerned activists.  Lots of left-wing activists in the 1960s wanted to end the war and change foreign policy while keeping a social-democratic system of government, lots of right-wing activists today would like to reform government while keeping a stable rule of law, and lots of environmental activists would like to just ratchet down consumption without any major break in the system.</p>
<p>The blow-it-all-up people just have some mixture of extreme frustration about the ability of the system as it is to be reformed, plus I assume some personal attraction to the idea of tumult and revolution.</p>
<p>I still stand by the statement in my earlier post that part of the story is that if you already have it all, you feel invulnerable&#8212;vindicated by all the past gambles you’ve taken that have paid off&#8212;and you’re willing to throw the dice once again.</p>
<p>Another part of the story, I think, is that super-rich tech investors, who are key figures in the blow-up-the-system movement, have an economic ideology in which they got where they are by being the nimble &#8220;mammals&#8221; taking over from the lumbering &#8220;dinosaurs&#8221; of rust-belt industry, legacy media, and liberal politics.  The point here is that the next logical step is for the current tech lords to themselves become dinosaurs; thus they don&#8217;t feel as secure as the rest of us might think they are.  <a href="https://statmodeling.stat.columbia.edu/2026/01/27/facebook-and-the-inherent-incoherence-of-the-ideology-of-market-leading-tech-companies/">As I wrote a few months ago</a>:</p>
<blockquote><p>On one hand, the storyline is that you’re gonna get overtaken by hungry young newcomers; on the other hand, you’re supposed to stay on top forever. The result, at least for Facebook, seems to have been a kind of desperation, a sense that on one hand they are the kings of the world and that on the other hand they are destined to fail and so they have to try harder and harder to grow and grow and preserve a near-monopoly status. And that’s how you get these executives who control unimaginable fortunes and yet are willing to lie and cheat (I wanted to say “lie, cheat, and steal,” but I don’t know if there was any actual stealing reported in that book), indeed they seem to feel that they have to lie and cheat and manipulate the rules and all of this to stay on top.</p></blockquote>
<p>I suspect this is connected to their surprising (to me) willingness to overturn the political system.  This is a system that&#8217;s made them unimaginably rich and powerful, but, in their ideology, that wouldn&#8217;t stop them from becoming the U.S. Steels, General Motorses, and CBS Newses of tomorrow.  They&#8217;re willing to blow it all up, even though they&#8217;re benefiting so much from the system we have now.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/09/more-thoughts-regarding-comfortable-powerful-people-who-seem-willing-or-even-eager-to-blow-up-the-system/feed/</wfw:commentRss>
			<slash:comments>47</slash:comments>
		
		
			</item>
		<item>
		<title>&#8220;An Axiomatic Foundation for Decisions with Counterfactual Utility&#8221;</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/08/an-axiomatic-foundation-for-decisions-with-counterfactual-utility/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/08/an-axiomatic-foundation-for-decisions-with-counterfactual-utility/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Fri, 08 May 2026 22:19:01 +0000</pubDate>
				<category><![CDATA[Causal Inference]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53690</guid>

					<description><![CDATA[Benedikt Koch, Kosuke Imai, and Tomasz Strzalecki write: Counterfactual utilities evaluate decisions not only by the realized outcome under a given decision, but also by the counterfactual outcomes that would arise under alternative decisions. By generalizing standard utility frameworks, they &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/08/an-axiomatic-foundation-for-decisions-with-counterfactual-utility/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>Benedikt Koch, Kosuke Imai, and Tomasz Strzalecki <a href="https://arxiv.org/abs/2605.05521">write</a>:</p>
<blockquote><p>Counterfactual utilities evaluate decisions not only by the realized outcome under a given decision, but also by the counterfactual outcomes that would arise under alternative decisions. By generalizing standard utility frameworks, they allow decision-makers to encode asymmetric criteria, such as avoiding harm and anticipating regret. Recent work, however, has raised fundamental concerns about the coherence and transitivity of counterfactual utilities. We address these concerns by extending the von Neumann-Morgenstern (vNM) framework to preferences defined on the extended space of all potential outcomes rather than realized outcomes alone. We show that expected counterfactual utility satisfies the vNM axioms on this extended domain, thereby admitting a coherent preference representation. We further examine how counterfactual preferences map onto the realized outcome space through menu-dependent and context-dependent projections. This axiomatic framework reconciles apparent inconsistencies highlighted by the Russian roulette example in the statistics literature and resolves the well-known Allais paradox from behavioral economics. We also derive an additional axiom required to reduce counterfactual utilities to standard utilities on the same potential outcome space, and establish an axiomatic foundation for additive counterfactual utilities, which satisfy a necessary and sufficient condition for point identification. Finally, we show that our results hold regardless of whether individual potential outcomes are deterministic or stochastic.</p></blockquote>
<p>I have to admit that I don&#8217;t see the appeal of utility functions based on counterfactuals.  For example, I&#8217;ve never thought that the decision-theoretic concept of &#8220;regret&#8221; makes sense.  That said, I know that a lot of people are interested in the topic, so I hope the above paper is useful to people in clearing up these issues, and I&#8217;m glad that they were able to use our <a href="https://sites.stat.columbia.edu/gelman/research/published/StochasticPotentialOutcomes.pdf">Russian roulette example</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/08/an-axiomatic-foundation-for-decisions-with-counterfactual-utility/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>The Pick-the-Winner-Picker Heuristic: Preference for Categorically Correct Forecasts</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/08/the-pick-the-winner-picker-heuristic-preference-for-categorically-correct-forecasts/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/08/the-pick-the-winner-picker-heuristic-preference-for-categorically-correct-forecasts/#comments</comments>
		
		<dc:creator><![CDATA[Andrew]]></dc:creator>
		<pubDate>Fri, 08 May 2026 13:48:32 +0000</pubDate>
				<category><![CDATA[Decision Analysis]]></category>
		<category><![CDATA[Economics]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=52506</guid>

					<description><![CDATA[A couple years ago, Jay Naborn wrote: I am studying people’s preference for categorically correct forecasts (such as getting the winner of a sports game right) over error-minimizing ones (such as getting close on the margin). We have experimental evidence &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/08/the-pick-the-winner-picker-heuristic-preference-for-categorically-correct-forecasts/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>A couple years ago, Jay Naborn wrote:</p>
<blockquote><p>I am studying people’s preference for categorically correct forecasts (such as getting the winner of a sports game right) over error-minimizing ones (such as getting close on the margin). We have experimental evidence of this, why it happens, etc.</p>
<p>What I would be interested in doing is demonstrating that this preference is/can be a mistake. To do so, it would be nice to show that doing well in terms of minimizing continuous error is a better predictor of future winner-picking than is doing well in terms of winner-picking. I am curious if you have any leads as to some existing dataset that would be helpful here, or some simulation/modeling strategy that may work.</p></blockquote>
<p>I replied that, yes, this relates to <a href="https://statmodeling.stat.columbia.edu/2014/02/25/differential/">a point we made here</a>.</p>
<p>Recently Naborn followed up:</p>
<blockquote><p>The blog post you sent (and a couple others of yours) were very informative for our background thinking.  My work (with Jonathan Bogard) on forecast evaluation <a href="https://journals.sagepub.com/doi/10.1177/00222437251381209">is now published at the Journal of Marketing Research</a>.</p></blockquote>
<p>And here&#8217;s the abstract:</p>
<blockquote><p>People routinely make decisions based on predictions made by others (e.g., political pundits, market analysts), so it is in their best interest to identify high-quality forecasts. Experts characterize good forecasting as minimization of continuous error (i.e., predictions close to the eventual outcome). By contrast, the present work reveals that laypeople typically see good forecasts as those that correctly predict an event’s categorical outcome (e.g., the winning team). Using within-subjects, between-subjects, and incentive-compatible designs, fifteen studies demonstrate this “pick-the-winner-picker heuristic” as well as its psychological mechanism: People evaluate forecasts by assigning separate weights to (a) categorical correctness and (b) continuous error minimization, depending on the overall importance of the categorical and continuous dimensions for that situation. Thus, in the common case when the categorical dimension matters most (e.g., sports contests), people prize forecasts that accurately predicted the categorical outcome (e.g., the winner, not the margin of victory). However, when the categorical dimension’s stakes are experimentally reduced, an attenuation is observed. While this describes how people typically evaluate forecasts, crucially, a dimension’s importance is not necessarily related to its diagnosticity of forecaster skill or reliability. Accordingly, the pick-the-winner-picker heuristic may constitute a normative mistake, while framing manipulations help debias judgments.</p></blockquote>
<p>Interesting. It&#8217;s good to see research on this topic.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/08/the-pick-the-winner-picker-heuristic-preference-for-categorically-correct-forecasts/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title>David W. Hogg on why we do astrophysics (in the face of LLMs and the lack of clinical value)</title>
		<link>https://statmodeling.stat.columbia.edu/2026/05/07/david-hogg-on-why-we-do-astrophysics-in-the-face-of-llms-and-the-lack-of-clinical-value/</link>
					<comments>https://statmodeling.stat.columbia.edu/2026/05/07/david-hogg-on-why-we-do-astrophysics-in-the-face-of-llms-and-the-lack-of-clinical-value/#comments</comments>
		
		<dc:creator><![CDATA[Bob Carpenter]]></dc:creator>
		<pubDate>Thu, 07 May 2026 19:00:54 +0000</pubDate>
				<category><![CDATA[Zombies]]></category>
		<guid isPermaLink="false">https://statmodeling.stat.columbia.edu/?p=53672</guid>

					<description><![CDATA[David W. Hogg, who takes his role as scientific gadfly seriously, recently (February 2026) released an interesting rumination on science, LLMs, and the human component of research. David W. Hogg. 2026. Why we do astrophysics? arXiv 2602.10181. I like that &#8230; <a href="https://statmodeling.stat.columbia.edu/2026/05/07/david-hogg-on-why-we-do-astrophysics-in-the-face-of-llms-and-the-lack-of-clinical-value/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p><a href="https://www.simonsfoundation.org/people/david-hogg/">David W. Hogg</a>, who takes his role as scientific gadfly seriously, recently (February 2026) released an interesting rumination on science, LLMs, and the human component of research.</p>
<ul>
<li>David W. Hogg.  2026.  <a href="https://arxiv.org/abs/2602.10181">Why we do astrophysics?</a> <i>arXiv</i> 2602.10181.</li>
</ul>
<p>I like that his first move is to insert an ungrammatical question mark into the title to question the whole process.</p>
<p>There is a lot to unpack here, starting with the claim that LLMs &#8220;show no signs of intelligence.&#8221;  I seriously don&#8217;t know what &#8220;signs&#8221; people are looking for other than the ability to converse fluently in 200 languages on just about any subject known to humanity.  But like most AI benchmarks, we tossed out the Turing Test as soon as a computer could make a passable go of it.</p>
<p>One comment I can get behind is, &#8220;a data scientist who has taken an astronomy class might be better prepared [to deal with astrophysics data and modeling] than an astronomer who has taken a data science class.&#8221;  The usual feeling among scientists is the opposite, but I&#8217;m guessing this is because they&#8217;ve just never worked with truly talented programmers like the ones we have at Flatiron (e.g., Brian Ward, Steve Bronder, Jeff Soules, and Robert Blackwell, among the folks with whom I work).</p>
<p>The note is stuffed with both descriptive accounts and normative statements such as &#8220;Every scientific paper is written to help all of its writers, and all of its readers, learn and grow, no matter their career stages.&#8221;  I&#8217;m imagining living in the <a href="https://en.wikipedia.org/wiki/The_Big_Rock_Candy_Mountains">Big Rock Candy Mountains</a> of academia, where that&#8217;s true.</p>
<p>I just don&#8217;t get &#8220;I believe that (in astrophysics) software is written to support the astrophysics literature, and that every important piece of software should have an associated paper in the astrophysics literature.&#8221;  Why not publish software papers in software venues like JOSS?  Most literatures won&#8217;t even publish software papers, so good on the astrophysics folks for allowing them.</p>
<p>Folks who bemoan the publish-or-perish mentality in academia should consider the statement, &#8220;My point is that astrophysics is the astrophysics literature.&#8221;  I agree if you extend that to include things not peer reviewed like software.</p>
<p>David takes on workflow with the contentious statement, &#8220;a lot of &#8216;implicit knowledge&#8217; or folklore about things like how to observe, how to reduce data, how to organize projects, how to visualize data and models, how to read and write, and so on. Much of this never appears in the literature. Is that not also astrophysics? Yes it is, but it is astrophysics practice.&#8221;  I&#8217;m postmodern enough to believe it&#8217;s impossible to separate astrophysics from astrophysics practice.  Isn&#8217;t the computing also astrophysics practice?  He wants that in the astro literature.  For this, he concludes, &#8220;I would welcome a project in which we tried to make much of this implicit knowledge explicit.&#8221;  Just not in the literature?  We never could get the workflow paper published.</p>
<p>I know David&#8217;s senior enough to know how the sausage is made, so I don&#8217;t see how he can claim, &#8220;Papers (and the authorships on those papers) and the citations of those papers are not &#8216;coin&#8217; of anything!&#8221;  Of course they&#8217;re the &#8220;coin&#8221; of hiring, tenure, and promotion committees.  I believe Andrew has gone on record saying people value publications so highly because nobody wants to devalue the coin that put them where they are (though I&#8217;m sure he said it more cleanly than that).</p>
<p>David says, &#8220;it isn’t really correct, if you use an LLM, to give that LLM co-authorship on your paper.&#8221;  It&#8217;s not legal in the United States, either.</p>
<p>More normative statements like, &#8220;You can’t decide not to cite a relevant paper because you don’t like the author, or don’t like the author’s institution, or don’t like their funding sources.&#8221;  I guess he hasn&#8217;t hung out in linguistics.  Or mathematics.  My undergraduate advisor, Ed Palmer, was a student of Frank Harary. Harary and Béla Bollobás developed an entirely parallel theory of random graphs, proving all the same theorems and doggedly refusing to cite each other.</p>
<p>This just seems patently false:  &#8220;Anyone who has the capability of getting a PhD in astrophysics has the capability of doing many remunerative things, substantially more remunerative than my job.&#8221;  He then goes on to one of his main philosophical (economic?) points, &#8220;If all we really wanted was to know how the Universe worked, we would start a hedge fund, and use the proceeds to pay an astrophysics institute, filled with people who wanted to do astrophysics rather than find out the answers.&#8221;  I&#8217;m not convinced that someone who&#8217;s good at science will also be good at business.  I think folks like Jim Simons, who co-founded our institute with Marilyn Simons, are the exception.  Sure, if you can make tens of billions of dollars, you can fund a lot of science.  Or buy a lot of Jamaican beef patties, to bring it back around to themes of the blog.</p>
<p>I don&#8217;t understand why he says, &#8220;No astronomer (that I know) is improving the calibration of JWST instruments because they want the US Navy to have a higher kill rate.&#8221;  Is that because he doesn&#8217;t think people who do this should be called &#8220;astronomers?&#8221;</p>
<p>Some of the arguments are silly, but at least they get those niggling question marks, &#8220;We put &#8216;the universe&#8217; in &#8216;the university&#8217;?&#8221;  I also find trickle-down theories of training to be particularly weak.  Here, David says, &#8220;We train a technical workforce.&#8221;  Sure, but training a technical workforce by teaching them astrophysics seems very inefficient&#8212;see the point above on hiring data scientists versus astrophysicists.  Perhaps the weakest argument is that astrophysics funding takes away from even more dangerous things the government could be funding, under the heading &#8220;We beat ploughshares into swords.&#8221;  I&#8217;m also skeptical of &#8220;We create opportunities for development,&#8221; in the sense of development sites in Chile and similar places for observatories.  Just think about how usefully that money could be spent in some other way for development.  On the other hand, I can fully get behind, &#8220;Astrophysics is a satisfying activity.&#8221;  I&#8217;m still surprised I get paid to do what I love!</p>
<p>A lot of this paper is about LLMs, but I don&#8217;t think that&#8217;s the interesting part.  Right in the abstract, David discusses, &#8220;two possible (extreme and bad) policy recommendations related to the use of LLMs in astrophysics, dubbed &#8216;let-them-cook&#8217; and &#8216;ban-and-punish.&#8217; I argue strongly against both of these; it is not going to be easy to develop or adopt good moderate policies.&#8221;</p>
<p>As a good writer, he sticks the landing.  First, by reframing the question as, &#8220;Finally: Why did I write this white paper? I wrote this because I became concerned about some of the ideas circulating in the astrophysics community about LLMs and their capabilities, conflating what are (in my view) text-interpolators with what are (in my view) scientists.&#8221;  Then concluding, &#8220;Ultimately, I think the real question we face—if we do indeed face a question—is not the question of how we do astrophysics. It is the question of why we do astrophysics.&#8221;</p>
]]></content:encoded>
					
					<wfw:commentRss>https://statmodeling.stat.columbia.edu/2026/05/07/david-hogg-on-why-we-do-astrophysics-in-the-face-of-llms-and-the-lack-of-clinical-value/feed/</wfw:commentRss>
			<slash:comments>26</slash:comments>
		
		
			</item>
	</channel>
</rss>
