This week, however, I wanted to move away from my usual focus on the identification of causal effects to look at the modeling of DGPs.

Let us take an example from the first article I ever published (and which, to this day, remains my most-cited article). In that article, my coauthor and I were interested in the marketing behavior of the households in our sample. In some time periods, some households happened to be net sellers (i.e., their sales exceeded their purchases), some households happened to be net buyers (i.e., their purchases exceeded their sales), and some households happened to be autarkic (i.e., they neither bought nor sold).

The issue as I saw it was that the same variables (e.g., price, distance from market, etc.) would affect households in different regimes (i.e., the amount of sales of net sellers vs. the amount of purchases of net buyers) in different ways. Moreover, I was interested in the factors that drove households to be either net sellers, autarkic, or net buyers.

After thinking about the decision sequence of the households in our data for a little while, I realized that it could be drawn as follows (with apologies for the lo-fi graph):

That is, in the first instance, a household decides whether it’ll be a net buyer, autarkic, or a net seller. Then, after having chosen whether to be a net buyer, autarkic, or a net seller, the household decides how much it buys (if it has chosen to be a net buyer) or how much it sells (if it has chosen to be a net seller). If it has chosen to remain autarkic, there is no further behavior to study.

Looking at the first stage, note that it is determined by whether a household’s net sales N, which can in theory be any number on the real line, are such that N < 0, N = 0, or N > 0. Because this is an ordered decision when the real line is partitioned in those three regimes, I thought that the first-stage decision lent itself well to an ordered categorical estimator like the ordered probit.
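To make that first-stage idea concrete, here is a minimal sketch (simulated data; all parameter values and variable names are made up, not those of our article) of an ordered probit fit by hand-coded maximum likelihood in Python, with the three regimes defined by where latent net sales fall on the real line:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(42)

# Latent net sales N* = x*beta + e; the real line is partitioned into
# net buyers (N* < cut1), autarkic households, and net sellers (N* > cut2).
n = 5000
x = rng.normal(size=n)
beta_true, cut1, cut2 = 1.0, -0.5, 0.5
nstar = x * beta_true + rng.normal(size=n)
regime = np.digitize(nstar, [cut1, cut2])  # 0 = net buyer, 1 = autarkic, 2 = net seller

def neg_loglik(theta):
    b, c1, c2 = theta
    xb = x * b
    p0 = norm.cdf(c1 - xb)                      # P(net buyer)
    p1 = norm.cdf(c2 - xb) - norm.cdf(c1 - xb)  # P(autarkic)
    p2 = 1.0 - norm.cdf(c2 - xb)                # P(net seller)
    probs = np.choose(regime, [p0, p1, p2])     # pick each household's regime probability
    return -np.sum(np.log(np.clip(probs, 1e-12, None)))

res = minimize(neg_loglik, x0=[0.0, -1.0, 1.0], method="Nelder-Mead",
               options={"maxiter": 2000})
print(res.x)  # estimates of (beta, cut1, cut2)
```

The estimated coefficient and cutpoints should land close to the simulated values; in the actual ordered tobit, this first-stage piece gets combined with the two extent-of-participation pieces into a single likelihood.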

Looking at either side (i.e., net purchases or net sales) of the second stage, it struck me that they were both continuous decisions censored at zero (you can in theory buy or sell any strictly positive amount, but zero is the lower bound on both purchases and sales). Thus, those two decisions lent themselves to some kind of tobit estimator.

I had been thinking a lot about Heckman selection estimators at that time, so one thing that came to mind was that I could write a single likelihood function that would capture the household’s decision problem. The idea was to have three possible participation regimes in the first stage (N < 0, N = 0, or N > 0 for net buyers, autarkic households, and net sellers, respectively), and then to have an extent-of-participation decision in cases where N < 0 or N > 0. And obviously, because there was some selection into each of net purchases and net sales, there had to be a selection term in each of those extent-of-participation equations.

I was working on all that when I arrived in Madagascar in 2004 to do fieldwork for my dissertation, and I spent many an afternoon writing out likelihood functions in the bar of the Hôtel Colbert in Antananarivo.* Eventually, this is what I ended up with:

Going through each line after the equality sign:

- The first three lines are the part of the likelihood function that deals with net buyers, both what drives a household into being a net buyer and then how much it purchases conditional on being a net buyer.
- The fourth line is the part of the likelihood function that deals with autarkic households. That is, what drives a household into remaining autarkic, and choosing to neither buy nor sell.
- The last three lines are the part of the likelihood function that deals with net sellers, both what drives a household into being a net seller and then how much it sells conditional on being a net seller.

My coauthor and I called this an “ordered tobit,” and the name seems to have stuck. Since then, a Stata command (oheckman) was developed by Chiburis and Lokshin (2007) to estimate likelihood functions like the one above. Moreover, in a recent article, Burke et al. (2015) take the above setup a step further by adding a third stage of selection wherein households first decide whether to be producers or not (strictly speaking, this is a “zeroth” stage of selection, since that decision occurs before the household decides to be a net buyer, autarkic, or a net seller).

The bottom line is that when faced with complex decision sequences, it is possible to break those sequences down into their components in order to study what drives each of them. The way to do this is by combining bits and pieces of likelihood functions, as my coauthor and I did above.

Again, this says nothing about identification, and if you are interested in the causal effect of some variable of interest on complex decision sequences, it is best to be dealing with experimental data so as to not have to worry about identification. But this goes to show that with a little bit of (econometric) structure, it is possible to study more complex decision sequences than what a basic linear projection allows.

* The likelihood function was not the most difficult part of the problem. Given selection issues, the standard errors had to be corrected in a manner similar to Heckman’s original contribution, but accounting for the ordered selection procedure. And then everything needed to be coded by hand using Stata’s “ml” set of commands.

Anyway, at some point the authors make the following argument:

- Our random effects findings are almost identical to our fixed effects findings;
- Random effects should be used with a random sample from a population of interest and fixed effects in the absence of such a random sample;
- This means our (small, highly selected) sample is representative of the population of interest;
- Thus, we can use findings from our (small, highly selected) sample to make inferences about the population as a whole.

The problem is that this entire chain of reasoning is mistaken: it stems from an old-school understanding of the difference between fixed and random effects (FE and RE, respectively).

When I was a master’s student at Montreal, we covered FE and RE estimators in the core econometrics class we all took and in the microeconometrics elective I chose to take (and which remains, to this day, the most useful class I have ever taken). In those classes, we were told: “You should use RE when you have a random sample from a broader population, and FE when you have a nonrandom sample, like when you have data on all ten provinces.”

That’s the old-school conception. Nowadays, in the wake of the Credibility Revolution, what we teach students is: “You should use RE when your variable of interest is orthogonal to the error term; if there is any doubt and you think your variable of interest is not orthogonal to the error term, use FE.”

And since the variable of interest can plausibly be argued to be orthogonal only in cases where it is randomly assigned in the context of an experiment, experimental work is pretty much the only time the RE estimator should be used.

“But Marc,” you say, “if I use FE then my variable of interest collapses into the fixed effect because it does not vary within unit.” That’s too bad, and in this case, you should either interact your variable of interest with something that does vary within unit and which makes sense in the context of your application, or ditch this research project altogether.

That is the first point I wanted to make: That RE should really only be used when the variable of interest is (as good as) randomly assigned.
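To see the logic, here is a small simulation (all numbers illustrative) in which the variable of interest is correlated with the unit effect, so the RE orthogonality assumption fails. The within (FE) estimator recovers the true coefficient, while a textbook quasi-demeaned RE estimator (fed the true variance components, for simplicity) does not:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 5
sigma_mu, sigma_e = 1.0, 1.0
beta = 2.0

mu = rng.normal(scale=sigma_mu, size=N)        # unit effects
# x is correlated with the unit effect, so RE's orthogonality assumption fails
x = 0.8 * mu[:, None] + rng.normal(size=(N, T))
e = rng.normal(scale=sigma_e, size=(N, T))
y = beta * x + mu[:, None] + e

# Within (FE) estimator: demean within units, then regress
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_fe = (xd * yd).sum() / (xd ** 2).sum()

# RE (GLS) estimator via quasi-demeaning, using the true variance components
theta = 1 - np.sqrt(sigma_e**2 / (sigma_e**2 + T * sigma_mu**2))
xq = x - theta * x.mean(axis=1, keepdims=True)
yq = y - theta * y.mean(axis=1, keepdims=True)
X = np.column_stack([np.ones(N * T), xq.ravel()])
b_re = np.linalg.lstsq(X, yq.ravel(), rcond=None)[0][1]

print(b_fe, b_re)  # FE is close to 2.0; RE is biased upward because Cov(x, mu) > 0
```

Flip the 0.8 to 0 and the two estimators agree, which is exactly the similarity a Hausman test looks for.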

The second point I wanted to make is a corollary of the first: the fact that the FE and RE results look a lot alike (which really should be ascertained with a Hausman test instead of merely eyeballed) is confirmation that the variables on the RHS are orthogonal to the error term, and nothing more. It says absolutely nothing about external validity, so the claim that this similarity makes it possible to make inferences about the whole population is also wrong.

How these statements get by reviewers and editors is a bit puzzling. This goes to show that peer review is not a panacea, and that the body of published, peer-reviewed research is not some kind of unquestionable Volume of Sacred Law.

That op-ed was based on the findings of a similarly titled working paper of mine, which one of the *New York Times* editors had gotten wind of after I first discussed it on this blog during the summer of 2015.

In my op-ed, however, I mentioned that I would soon post an updated version of our paper. But things got busy, and though I worked quite a bit on it here and there, I did not get to finish it until a few weeks ago.

(And by “finish,” I mean “stop working on it until it is returned to us with reviewer comments about how to improve it before it can get published.”)

Here is the new version. The major innovation is that we now exploit both the longitudinal nature of the data as well as a source of plausibly exogenous variation for the number of farmers markets in a given state in a given year. This obviously makes for much stronger results than we used to have. Here is the abstract of this latest version:

Using administrative longitudinal data on all US states and the District of Columbia for the years 2004, 2006, and 2008-2013, we study the relationship between farmers markets and food-borne illness. We find a positive relationship between the number of farmers markets per million individuals and the number of reported (i) total outbreaks and cases of food-borne illness, (ii) outbreaks and cases of norovirus, and (iii) outbreaks of campylobacter per million in a given state-year. When we exploit weather shocks as a source of plausibly exogenous variation for the number of farmers markets per million, the majority of the aforementioned positive relationships persist. Allowing for small departures from the assumption of strict exogeneity of weather shocks, the relationship between farmers markets per million and the number of reported (i) total cases of food-borne illness as well as (ii) cases of norovirus per million turn out to be robust. Our estimates indicate that for every additional farmers market per million, there are six additional cases of food-borne illness per million, and that a doubling of the number of farmers markets in the average state-year would be associated with an economic cost of at least $220,000. Our core results are robust to different specifications and estimators as well as to deleting outliers and leverage points, and falsification and placebo tests indicate that they are unlikely to be spurious.

It sometimes happens that in the general regression equation

(1) $y = \alpha + \beta D + \epsilon$,

your outcome of interest will be a length of time, or duration. Classic examples from labor economics are the duration of individual unemployment spells, or the duration of a strike.

The problem with duration data is that they do not look like the continuous outcome variable ranging from minus to plus infinity (ideally normally distributed) found in most introductory textbooks. In the unemployment spell example, we typically know when someone loses their job, and we know when they find another one. Sometimes, however, the duration is censored; that is, we know when someone loses their job, but they remain unemployed when we record the data.

In both cases, the data look nothing like the textbook outcome variable, and so special care might be required in how we deal with a duration on the left-hand side of equation (1). Typically, this is done with duration analysis, as it is known in economics. Those models are also known as survival models–a term that comes from biostatistics, where researchers are often interested in how long someone survives after some event of interest happens–but that is only one of the many names given to duration analysis.*

The most basic type of duration analysis is entirely nonparametric, and it is referred to as the Kaplan-Meier estimator. More than an “estimator,” it really is a graph which plots length of time on the *x*-axis and the proportion of the sample that remains in a given state on the *y*-axis. Predictably, a Kaplan-Meier plot looks like a descending staircase.
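Because the Kaplan-Meier estimator is entirely nonparametric, it can be computed in a few lines. Here is a sketch on made-up duration data, where censored observations count as being at risk but not as exits:

```python
import numpy as np

# Toy durations (say, in months); event = 1 if the exit is observed, 0 if censored
time  = np.array([2, 3, 3, 4, 5, 5, 5, 6, 7, 8])
event = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])

def kaplan_meier(time, event):
    """Return the event times and the Kaplan-Meier survival estimate S(t) at each."""
    survival, s = [], 1.0
    times = np.unique(time[event == 1])
    for t in times:
        at_risk = np.sum(time >= t)              # still in the initial state just before t
        exits = np.sum((time == t) & (event == 1))
        s *= 1 - exits / at_risk                 # multiply by the conditional survival prob.
        survival.append(s)
    return times, np.array(survival)

t, s = kaplan_meier(time, event)
print(dict(zip(t.tolist(), np.round(s, 3))))
```

Plotting `s` against `t` as a step function yields exactly the descending staircase described above.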

Here is an example from Bellemare and Novak (forthcoming),** in which we look at whether participating in contract farming (CF) reduces the duration of the hungry season (i.e., the length of time household members go without eating three meals a day) experienced by the households in the data.

Obviously, this fails to account for any confounding factor. For that, you need specific estimators. The two we use in Bellemare and Novak are the Cox proportional hazards model*** and the survival-time regression. The Cox proportional hazards model is such that

(2) $h(t|D) = h_0(t)\exp(D\beta)$,

where $D$ is defined as in equation (1), but where $h(t|D)$ denotes the “hazard” at time *t*, i.e., the likelihood at time *t* that an observation will exit the condition studied (in Bellemare and Novak, the likelihood at time *t* that a household will exit the hungry season), and $h_0(t)$ denotes the same hazard at baseline. (The Stata help file for the Cox proportional hazards command notes that the hazard at baseline is not directly estimated, but it is possible to recover it.)

The survival-time regression, in its proportional hazards version, is such that

(3) $h(t|D) = h_0(t)f(D)$,

where $f(D)$ is a nonnegative function of the covariates (typically $\exp(D\beta)$). With $f(D) = \exp(D\beta)$ and the baseline hazard $h_0(t)$ left unspecified, equation (3) reverts to equation (2), so the Cox proportional hazards model is nested within the survival-time estimator.

A survival-time regression involves a survival function $S(t)$, which is inversely related to the hazard function; i.e., it measures the probability of survival until time *t* (in Bellemare and Novak, the probability of having a hungry season of length *t*). The catch is that you have to make a specific distributional assumption if you want to estimate a (parametric) survival-time regression. The most popular choice appears to be the Weibull distribution, and this is what we adopted in our paper.
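As a rough illustration of what such a parametric survival-time regression involves, here is a hand-coded Weibull proportional hazards MLE on simulated, fully observed data (the treatment dummy and all parameter values are made up for the sketch, not taken from Bellemare and Novak):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 4000
d = rng.integers(0, 2, size=n)           # hypothetical treatment dummy
a_true, b_true, p_true = 0.0, 0.5, 1.5   # b > 0: treatment raises the exit hazard

# Weibull PH: S(t) = exp(-t^p * exp(a + b*d)), so t = (E / exp(a + b*d))^(1/p), E ~ Exp(1)
t = (rng.exponential(size=n) / np.exp(a_true + b_true * d)) ** (1 / p_true)

def neg_loglik(theta):
    a, b, logp = theta
    p = np.exp(logp)                     # keep the shape parameter positive
    xb = a + b * d
    # log-likelihood per obs: log h(t) - H(t), with H(t) = t^p * exp(xb)
    ll = np.log(p) + (p - 1) * np.log(t) + xb - t ** p * np.exp(xb)
    return -ll.sum()

res = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000})
a_hat, b_hat, p_hat = res.x[0], res.x[1], np.exp(res.x[2])
print(b_hat, p_hat)  # exp(b_hat) is the estimated hazard ratio
```

With censored observations, the contribution of a censored spell would be $-H(t)$ alone; Stata’s streg handles that bookkeeping for you.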

Now, the really cool thing about using these estimators along with the usual linear regression for robustness is this: Whereas the linear regression will tell you the effect of an increase of $D$ by one unit on the duration of interest for the average observation, both the Cox proportional hazards and survival-time regressions will tell you how much more likely the average observation is to exit the condition you are studying in response to an increase of $D$ by one unit.

In Bellemare and Novak, combining duration analysis with linear regressions tells us that participation in contract farming is associated with a hungry season that is 0.29 months (about eight days) shorter on average (as per the OLS results above), but also that it is associated with a likelihood of exiting the hungry season at any given time that is 15 or 17.1 percent (depending on whether you look at the Cox proportional hazards or survival time results) higher for participating households than it is for nonparticipating households. Both those results answer related but very different questions.

As always, a word of caution. The estimators just discussed work really well with an experimental design or with a selection-on-observables design, but I wouldn’t want to have to use them with an obviously endogenous variable of interest. In Bellemare and Novak, we could make the case that we had selection on observables, but if we had had to deal with an instrumental variable, for example, we would have stuck to trusty old OLS specifications.

* From the Wikipedia entry on survival analysis: “This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology.”

** I must confess that one of my initial reasons for wanting to write that paper was that I had never written anything involving the use of duration data. Little did I know it was to evolve into a piece from which we learn quite a bit about the welfare impacts of contract farming on outcomes other than income.

*** One day I will write a post about how much the use of the term “model” to designate a regression equation or an estimator grinds my gears. One day.


(1) $y_{it} = \alpha + \beta x_{it} + \mu_i + \epsilon_{it}$.

This is a pretty standard equation when dealing with panel data: $i$ denotes an individual in the set $\{1, ..., N\}$, $t$ denotes the time period in the set $\{1, ..., T\}$, $y_{it}$ is an outcome of interest (say, wage), $x_{it}$ is a variable of interest (say, an indicator variable for whether someone has a college degree), $\mu_i$ is an individual fixed effect, and $\epsilon_{it}$ is an error term with mean zero. Normally with longitudinal data, it is the case that $N > T$, so that there are more individuals in the data than there are time periods. (If $T > N$, you are likely dealing more with a time-series problem than with a typical applied micro problem.)

Though we are normally interested in estimating and identifying the relationship between the variable of interest $x_{it}$ and the outcome variable $y_{it}$, I wanted to focus today on heteroskedasticity.*

Under ideal circumstances, the variance of the error term is constant, i.e., $Var(\epsilon_{it}) = \sigma^2$. This is what we mean when we say that the errors are “spherical,” which alludes to the shape of the scatter plot around the regression line. In this case, the variance of the outcome variable around the regression line is said to be constant across the range of values taken by $x_{it}$.

It is most often the case, however, that $Var(\epsilon_{it}) = \sigma_{it}^2$, which varies across observations. That is, it often happens that the errors are not spherical, and that the variance of the outcome variable around the regression line is not constant across the range of values taken by $x_{it}$.

As I said, we are almost always interested in estimating equation (1) above and calling it a day. In such cases, in the presence of heteroskedasticity, the standard errors around $\hat{\alpha}$ and $\hat{\beta}$ are off, and our inferences are mistaken.

Luckily, this is easily corrected by using the Huber-White sandwich standard error correction. (The names Huber and White refer to a statistician and an econometrician, respectively; the name “sandwich” refers to how the variance-covariance matrix is “sandwiched” in the middle of the relevant estimator. For applied econometricians, the classic read is White, 1980.)

But every so often, it happens that heteroskedasticity has empirical content–that is, studying the variance of the error term $\epsilon_{it}$, and how it varies as $x_{it}$ varies, can tell us something useful about the world.

In the wage-education example above, the variance of the error term has useful empirical content. Indeed, the regression

(2) $\hat{\epsilon}_{it}^2 = \gamma + \delta x_{it} + \nu_{it}$,

which linearly projects the variance of the error term in equation (1) on the regressors of equation (1), can be useful in studying how variable an individual’s wage is depending on whether that person has a college degree. Again, if the outcome variable in equation (1) is an individual’s wage and the variable of interest is a dummy variable for whether that person has a college degree, rejecting the null hypothesis $\delta = 0$ in favor of the alternative hypothesis $\delta < 0$ is useful, in that it tells us that having a college degree is associated with a less variable wage. (The intuition here is to picture the scatter plot and regression line for equation (1), in which case it is easy to see that the variance of the error term–how the outcome’s distance from the regression line varies across the domain of the variable of interest–is the variance of the outcome variable in the same dimension.) For someone who is risk-averse and who prefers a stable income to an unstable one, this is useful information when deciding whether to go to college.
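A quick simulation (purely illustrative numbers, not real wage data) shows how the second-moment regression in equation (2) picks up the variance effect of a college degree:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
college = rng.integers(0, 2, size=n)  # hypothetical degree dummy
# Wages: the degree raises the mean and lowers the standard deviation (made-up numbers)
wage = 30 + 10 * college + rng.normal(size=n) * np.where(college == 1, 2.0, 6.0)

# Equation (1): regress wage on the degree dummy, keep the residuals
X = np.column_stack([np.ones(n), college])
beta = np.linalg.lstsq(X, wage, rcond=None)[0]
resid = wage - X @ beta

# Equation (2): project the squared residuals on the same regressors
gamma = np.linalg.lstsq(X, resid ** 2, rcond=None)[0]
print(beta[1], gamma[1])  # mean effect near 10; variance effect near 2^2 - 6^2 = -32
```

The negative coefficient in the second regression is the “less variable wage” finding: the degree shifts not just the conditional mean but the conditional variance.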

The general idea is that although in most cases we are interested in the first moment $E(y_{it}|x_{it})$, it is sometimes the case that the second moment $Var(y_{it}|x_{it})$ is useful in and of itself. There are many possible applications in which heteroskedasticity can be exploited to generate useful empirical content. I am currently working on one such application with two of my doctoral students, wherein we show that smallholder farmers who participate in modern agricultural value chains as growers not only have higher incomes, they also have less variable incomes. This is surprising: if you believe in efficient markets, there shouldn’t be any asset (here, the contract is the asset) that has both a higher mean *and* a lower variance.

* It can also be spelled “heteroscedasticity.” My understanding is that, much like for “skeptic” vs. “sceptic,” this is primarily a difference between American and British English, which respectively refer to the concept as heteroskedasticity and heteroscedasticity. But what does a guy named Marc whose name is routinely misspelled “Mark” know?

How to deal with an imperfect instrument was an idea whose time apparently had come in 2012: In the same volume of the same journal, Nevo and Rosen (2012) develop an alternative method for dealing with imperfect IVs, which is what I wanted to discuss this week.

Again, imagine you are interested in the effect of a treatment $D$ on an outcome $y$, with or without controls $X$. You are interested in estimating

(1) $y = \beta D + \epsilon$,

from which I am omitting the constant and the controls for brevity. Specifically, you are interested in the causal effect of the endogenous treatment $D$ on $y$, and you have a plausibly exogenous instrument $Z$.

Intuitively, Nevo and Rosen’s method relaxes the assumption that $Cov(Z, \epsilon) = 0$–the exogeneity assumption–to allow for the possibility that $Cov(Z, \epsilon) \neq 0$, at the cost of making a stronger assumption on the relationship between the endogenous variable and the instrument–the relevance assumption. In their own words, the method makes a weaker assumption on the unobservables, but a stronger assumption on the observables.

Specifically, Nevo and Rosen’s method assumes that (i) $Cov(D, \epsilon) \cdot Cov(Z, \epsilon) \geq 0$, i.e., the treatment and the instrument are correlated with the error term in the same direction, and (ii) $|Corr(Z, \epsilon)| \leq |Corr(D, \epsilon)|$, i.e., the instrument is less correlated with the error term than the treatment is.

Again, in Nevo and Rosen’s own words, assumption (ii) is “an intuitive assumption for those applications where … $Z$ is not necessarily exogenous but is ‘better’ or ‘less endogenous’ than the endogenous regressor.”
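A quick simulation (all parameter values made up) illustrates what “less endogenous” buys you: an instrument that is slightly correlated with the error term still yields a far less biased estimate than OLS does:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 200_000
eps = rng.normal(size=n)
z = rng.normal(size=n) + 0.1 * eps      # imperfect IV: slightly correlated with eps
d = z + 0.5 * eps + rng.normal(size=n)  # treatment: much more endogenous than z
y = 1.0 * d + eps                       # true beta = 1

b_ols = np.cov(d, y)[0, 1] / np.var(d)
b_iv = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]
print(b_ols, b_iv)  # both biased upward, but the IV estimate much less so
```

Here both correlations with the error term are positive and $Cov(D, Z) > 0$, so both estimates sit above the true parameter, which is the flavor of one-sided bound the paper formalizes.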

Assumptions (i) and (ii) above are assumptions 3 and 4 in Nevo and Rosen’s article; they make a few more regularity assumptions around these, but those really are the central ones.

With just assumption (i) and the regularity assumptions, you get Lemma 1 in the paper: If $Cov(D, Z) < 0$, then the true parameter $\beta$ lies between the OLS and 2SLS estimates, i.e., $\beta \in [\min\{\beta_{OLS}, \beta_{2SLS}\}, \max\{\beta_{OLS}, \beta_{2SLS}\}]$. If, however, $Cov(D, Z) > 0$, then $\beta \leq \min\{\beta_{OLS}, \beta_{2SLS}\}$ in cases where $Cov(D, \epsilon) > 0$ and $Cov(Z, \epsilon) > 0$, and $\beta \geq \max\{\beta_{OLS}, \beta_{2SLS}\}$ in cases where $Cov(D, \epsilon) < 0$ and $Cov(Z, \epsilon) < 0$.

With both assumptions (i) and (ii) (and the regularity assumptions), you get Proposition 1, which generates even better (as in sharper) bounds. This requires a bit more math than I can go into in this post.

Nevo and Rosen then generalize their findings to the case where there are additional regressors (i.e., controls), to the case where there are multiple imperfect instruments, and to the case where there are multiple treatment (i.e., endogenous) variables, and they have a section on inference, since beyond knowing the bounds on $\beta$, it’s also nice to know the confidence interval around them. Finally, they provide an application from the empirical IO literature.

To summarize both this post and last week’s: When you have an instrument that is plausibly but not strictly exogenous (in Conley et al.’s terminology) or imperfect (in Nevo and Rosen’s terminology), all is not lost. This is encouraging for those of us who rarely have the luxury of experimental data and must often rely on observational data. For example, I am currently putting the finishing touches on a paper where I aim to incorporate one or both of the Conley et al. and Nevo and Rosen approaches in order to show that my 2SLS results are robust to departures from the strict exogeneity assumption.

Still, this does not mean that anything goes and that crappy instruments get a pass. The way I like to think of this (at least in the context of the Conley et al., 2012 approach) is that these methods are useful for cases where there is one possible but unlikely channel in which the exclusion restriction is violated, but not for cases where there are several likely such channels. When the latter happens, it is perhaps best to adopt the view according to which “whereof one cannot speak, thereof one must be silent.”

(This is the 50th post in the Metrics Monday series. At the end of 2010, when I started blogging, I never thought I would end up blogging for so long, let alone writing so much about econometrics. Thank you for reading; here is to 50 more of those posts!)

In short, the article’s angle is that, *contra* a popular theory that holds that FGC persistence is due to community-level factors, the persistence of FGC seems to come from individual- and household-level factors:

Some economists say it’s time for a new approach. Their work, itself controversial, questions long-held views on FGC – that communities either all follow the practice, or all give it up – and thereby challenges the very underpinnings of many interventions.

Interventions should stop trying, as most do, to sway entire villages, these scientists say. They should instead target cracks in support for the practice: the influential community leader who has decided his daughters will not be cut, or the husband and wife who are divided on the fate of their daughters.

The article also talks about some of the research that my PhD student Lindsey Novak* has done on FGC in her job-market paper:

Despite this, economists are managing to glean valuable patterns by applying statistical filters to large datasets. In one such study, Marc Bellemare of the University of Minnesota in St Paul and his doctoral student Lindsey Novak waded through health surveys of more than 300,000 women in 13 countries, taken between 1995 and 2013. They found that attitudes on FGC, and women’s own cutting status, varied within single households.

Women who reported having undergone FGC were 16 percentage points more likely to support the practice. And the largest source of variation in attitudes toward FGC—87%—was at the household or individual level, across nearly all countries and years.

“It means that women living in two different households in the same village are likely to have different opinions,” Novak says. And that, she says, suggests that decisions on cutting are made by households, not villages or regions. Other research supports this contention, and flies in the face of a prominent hypothesis about why FGC persists.

The article also mentions work by several other economists working on FGC.

* Here is a Development Impact blog post summarizing Lindsey’s job-market paper.

Here is how the NYT article began:

What do households on food stamps buy at the grocery store?

The answer was largely a mystery until now. The United States Department of Agriculture, which oversees the $74 billion food stamp program called SNAP, has published a detailed report that provides a glimpse into the shopping cart of the typical household that receives food stamps.

The findings show that the No. 1 purchases by SNAP households are soft drinks, which accounted for 5 percent of the dollars they spent on food. The category of “sweetened beverages,” which includes fruit juices, energy drinks and sweetened teas, accounted for almost 10 percent of the dollars they spent on food.

Had the NYT’s intention been to provide arguments to those who wish to dismantle SNAP–a program which, in 2014, provided an average of $125 to spend on food to 46.5 million Americans with low or no income; that’s one in seven Americans–it could not have done a better job. This is especially true given the article’s title: “In the Shopping Cart of a Food Stamp Household: Lots of Soda.”

My Twitter feed came alive with (justified) criticism of the article. My University of Minnesota colleague Joe Soss wrote a long response on his Facebook page which should be read in full to appreciate just how bad the NYT reporting was, part of which reads as follows:

The story hammers away at the idea that “the No.1 purchases by SNAP households are soft drinks, which account for about 10 percent of the dollars they spend on food.” Milk is No. 1 among non-SNAP households, we’re told, not soft drinks. At the start of the article, [NYT reporter Anahad] O’Connor frames these and other alleged facts with a quote that tells readers what SNAP really is: “SNAP is a multibillion-dollar taxpayer subsidy of the soda industry.” The story doubles down on this misleading image of the program by ending with a discussion of how the big soda companies lobby to keep the SNAP funds flowing — and with a quote asserting, “This is the first time we’ve had confirmation that this massive taxpayer program is promoting all the wrong kinds of foods.”

I want to be clear here: This is bullshit. It’s a political hack job on a program that helps millions of Americans feed themselves, and we should all be outraged that the NYT has disguised it as a piece of factual news reporting on its front page …

But what does the USDA report actually say? … Spoiler Alert: The report does not say that SNAP changes what people buy at the grocery–and that includes encouraging them to buy soda–and the report’s findings differ considerably from the portrayal Anahad O’Connor presents in the NYT … Here are the top three items in the report’s own summary of its major findings, reported in an attention-grabbing, color-shaded box:

1. There were no major differences in the expenditure patterns of SNAP and non-SNAP households, no matter how the data were categorized. Similar to most American households:

– About 40 cents of every dollar of food expenditures by SNAP households was spent on basic items such as meat, fruits, vegetables, milk, eggs, and bread.

– Another 20 cents out of every dollar was spent on sweetened beverages, desserts, salty snacks, candy and sugar.

– The remaining 40 cents were spent on a variety of items such as cereal, prepared foods, dairy products, rice, and beans.

2. The top 10 summary categories and the top 7 commodities by expenditure were the same for SNAP and non-SNAP households, although ranked in slightly different orders.

3. Less healthy food items were common purchases for both SNAP and non-SNAP households. Sweetened beverages, prepared desserts and salty snacks were among the top 10 summary categories for both groups. Expenditures were greater for sweetened beverages compared to all milk for both groups, as well.

Later, the report adds these bullet points to its summary, in a separate “pay attention” box:

4. Overall, there were few differences between SNAP and non-SNAP household expenditures by USDA Food Pattern categories. Expenditure shares for each of the USDA Food Pattern categories (dairy, fruits, grains, oils, protein foods, solid fats and added sugars (SoFAS), and vegetables) varied by no more than 3 cents per dollar when comparing SNAP and non-SNAP households.

5. Protein foods represented the largest expenditure share for both household types, while proportionally more was spent on fruits and vegetables than on solid fats and added sugars, grains or dairy.

Let me be clear: The report does a fine job documenting what people buy; it’s the interpretation of the report’s results by the NYT that leaves much to be desired.

Most of the other comments I read regarding the NYT article were to the effect of: “Who is the NYT to tell poor people what they can and cannot spend their money on?,” as though being poor necessarily implies that one is morally inferior, and so one needs to be told by Wealthy, Educated White Liberals (WEWLs) what one can and cannot buy.

And WEWLs actually wonder why the other half flipped them the bird on November 8?

Look, I worship at the altar of Gary Taubes. I believe sugar is at the root of many of our so-called “diseases of civilization,” along with an excess consumption of refined carbohydrates by a largely sedentary population, and except to replenish my electrolytes during one particularly awful episode of food-borne illness in 2010 in West Africa, I have not had soda since December 28, 2004.

But I also find paternalism appalling,* no matter which side it comes from. It is especially appalling when said paternalism strongly hints at the idea that the poor are somehow morally deficient. The left gets up in arms when the right talks about mandatory drug tests for welfare recipients; this is no different.

The policy solution to Americans of every income level buying too much soda and too many sweetened drinks is not paternalism of the thou-shalt-not variety, it’s better health and nutrition education for everyone in elementary, middle, and high school.

And here is another thing about that NYT article: There are many, many agricultural economists who have done high-quality work on SNAP that steers clear from cheap advocacy. In no particular order: Parke Wilde at Tufts; Shelly Ver Ploeg at USDA’s Economic Research Service; my grad-school colleague Chad Meyerhoefer at Lehigh; Minnesota alum Travis Smith, now at UGA; my erstwhile colleague Tim Beatty; Craig Gundersen at Illinois; and so on, and so forth.

Were any of them interviewed in the article? Of course not. I mean, why would you talk to anyone over in icky flyover country? Why would you slum it at state schools? Instead, the reporter chose to go full TED Talk on the reader, remain comfortably ensconced in area code 212, and go with… Marion Nestle who, *faisant flèche de tout bois* (i.e., making use of whatever comes to hand), chose to use the report’s findings to attack her favorite *bête noire*: Big Bad Ag. Quoth Nestle:

“… SNAP is a multibillion-dollar taxpayer subsidy of the soda industry,” said Marion Nestle, a professor of nutrition, food studies and public health at New York University. “It’s pretty shocking.”

No. What is shocking is that an article which I would not have published when I was editor of my college’s newspaper not only gets published in but makes the front page of the *New York Times*, supposedly one of the last bastions of Real Journalism in this era of fake news and filter bubbles.

Update: After I wrote this post on Sunday morning, *Jacobin* decided to run a lightly edited and slightly modified version of Joe Soss’ Facebook post as an article on Monday.

* I don’t consider behavioral nudges to be paternalism, even though they were introduced to us in the behavioral economics class I took in grad school as “cautious paternalism.” At any rate, I don’t consider a nudge to be paternalism if it leaves people free to choose while pushing them in one direction. For example, automatically enrolling people in their 401(k) leaves people free to un-enroll if they feel like it.

Happy New Year! After running out of easy, off-the-top-of-my-head topics for this series, I have decided to go with a friend’s suggestion of blogging about econometrics papers whose results are useful for applied work.

Given that I am working on a paper in which I am dealing with an instrumental variable that is only plausibly exogenous–that is, the exclusion restriction is likely to hold, but there is a small chance that it does not–I thought I should begin the year with two posts on how to deal with imperfect instruments.

This does not mean that these posts will discuss what to do with plain-old bad instrumental variables (IVs), i.e., instruments for which the exclusion restriction clearly does not hold. Again, this post and the next will discuss situations where your IV most likely meets the exclusion restriction, but wherein there is a small chance that it does not.

Let’s start with the results in Conley et al. (REStat, 2012; see here for a non-gated version). The core idea is as follows: You are interested in the effect of a treatment D on an outcome Y, with or without controls X. You are interested in estimating

(1) Y = βD + ε,

from which I am omitting the constant and the controls for brevity. Specifically, you are interested in the causal effect β of the endogenous treatment D on Y, and you have a plausibly exogenous instrument Z.

In the equation

(2) Y = βD + γZ + ε,

parameters β and γ are not jointly identified because D is endogenous. For a strictly exogenous IV–one whose exclusion restriction is met–we have that γ = 0.

The problem with an IV that is only plausibly exogenous is that γ, though it is likely to be small, is unlikely to be exactly zero. So how do you go about solving the problem? One way is to incorporate prior information about what γ looks like, and in their paper, Conley et al. present four different ways to do that, including (i) specifying only a range of possible values for γ, (ii) imposing a distribution on γ, and (iii) adopting a full Bayesian approach, which requires imposing priors on all the parameters (in my example, you’d need to have a prior for both β and γ).

Then, depending on the method chosen, it is possible to obtain either a point estimate or a confidence interval for β, the estimand of interest. If the point estimate is different from zero, or if the confidence interval excludes zero, then this is a sign that the 2SLS estimate is robust to a small departure from the strict exogeneity assumption–one wherein the IV is only plausibly but not strictly exogenous.
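For intuition, here is a minimal sketch in Python (my own illustration, not the authors’ code) of the first of those approaches: for each candidate value of γ on a grid, net γZ out of the outcome, run 2SLS, and take the union of the resulting confidence intervals. The data-generating process, the grid over [0, 0.1], and all variable names are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data: D is endogenous, and Z has a small direct effect on Y
# (true beta = 1, true gamma = 0.05), so Z is only plausibly exogenous.
n = 5000
z = rng.standard_normal(n)
v = rng.standard_normal(n)
d = 0.8 * z + v                                   # strong first stage
y = 1.0 * d + 0.05 * z + 0.5 * v + rng.standard_normal(n)

def iv_ci(y, d, z):
    """95% CI from just-identified 2SLS of y on d with instrument z
    (all variables are mean zero, so the constant is dropped)."""
    beta = (z @ y) / (z @ d)
    resid = y - beta * d
    se = np.sqrt(np.sum((z * resid) ** 2)) / abs(z @ d)  # robust s.e.
    return beta - 1.96 * se, beta + 1.96 * se

def union_ci(y, d, z, gamma_grid):
    """Union-of-confidence-intervals bounds: for each candidate gamma,
    subtract gamma * z from y, run 2SLS, and take the union of the CIs."""
    cis = [iv_ci(y - g * z, d, z) for g in gamma_grid]
    return min(lo for lo, _ in cis), max(hi for _, hi in cis)

lo, hi = union_ci(y, d, z, np.linspace(0.0, 0.1, 21))
print(f"bounds on beta allowing gamma in [0, 0.1]: [{lo:.3f}, {hi:.3f}]")
```

In this simulated example the bounds remain informative (they exclude zero) even after allowing the exclusion restriction to fail by as much as γ = 0.1, which is the sense in which a 2SLS result can be called robust to an IV being only plausibly exogenous.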

These posts are meant to be short, so I cannot possibly go into the details of Conley et al.’s four methods. If you are interested in the method, read the paper. There is also a Stata add-on called –plausexog– that can be used to implement some of the methods delineated above (from what I have seen playing with it, –plausexog– does not implement the full Bayesian approach).

Again, a word of caution: This is not a cure for a bad IV, and no amount of using this method will turn a bad IV into a good one. Moreover, for all its benefits, this method can involve a certain amount of arbitrary decisions when it comes to incorporating prior information.

Another useful insight in Conley et al.’s paper is the following:

… The sensitivity of the 2SLS estimator to violations of the exclusion restriction depends on the strength of the instrument … The desire to use instruments that are strong but may violate the exclusion restriction provides a direct motivation for the methods in this paper.

In other words, because a weak IV biases the 2SLS estimate of β away from the true value of β, it is sometimes preferable to use a strong IV that is only plausibly exogenous rather than a weak IV that is strictly exogenous. See Bound et al. (1995; click here for an ungated version).
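That insight is easy to verify by simulation. In the just-identified case with a single instrument, if the instrument has a small direct effect γ on the outcome, the 2SLS estimand is β + γ/π, where π is the first-stage coefficient, so the same small violation does far more damage when π is small. Below is a quick check in Python; the parameter values and function names are my own assumptions for the illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

def median_2sls(pi, gamma=0.05, beta=1.0, n=500, reps=2000):
    """Median just-identified 2SLS estimate of beta across simulations
    in which the instrument violates the exclusion restriction by gamma."""
    estimates = []
    for _ in range(reps):
        z = rng.standard_normal(n)
        d = pi * z + rng.standard_normal(n)           # first stage
        y = beta * d + gamma * z + rng.standard_normal(n)
        estimates.append((z @ y) / (z @ d))           # 2SLS estimate
    return float(np.median(estimates))

strong = median_2sls(pi=0.8)   # bias roughly gamma / pi = 0.0625
weak = median_2sls(pi=0.1)     # bias roughly gamma / pi = 0.5
print(f"strong first stage: {strong:.3f}, weak first stage: {weak:.3f}")
```

With the same small γ = 0.05, the estimate obtained with the strong instrument stays close to the true β = 1, while the weak-instrument estimate is pulled far away from it–exactly the trade-off Conley et al. and Bound et al. have in mind.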

HT: Ag econ wunderkind Nate Hendricks, whose seminar here last fall introduced me to the method.

I am preparing a longer review that should see the light of day in the near future, but in the meantime, the book features a series of conversations with prominent economists–from Abhijit Banerjee and Esther Duflo to Angus Deaton, and from Jonathan Morduch to David McKenzie–not only about the advantages and disadvantages of RCTs, but also about the past, present, and future of RCTs.

If you are not already familiar with his work, Timothy Ogden is the managing director of the Financial Access Initiative at NYU, and he is the editor in chief of *Philanthropy Action*.