The post Improvisation in the Mathematics Classroom appeared first on All About Statistics.

The following is a guest post by Andrea Young, requested by Dr Nic Petty.

Improvisation comedy, or improv for short, is theater that is unscripted. Performers create characters, stories, and jokes on the spot, much to the delight of audience members. Surprisingly, the goal of improv is *not* to be funny! (Or maybe this isn’t surprising–people trying hard to be funny rarely succeed.) Rather, improv comedians are encouraged to be “in the moment,” to support their fellow players, and to take risks–the humor follows as a natural consequence.

What does this have to do with mathematics and mathematics education? If you are a math teacher or professor, you might want to have a classroom where students are deeply engaged with the lesson (i.e. are “in the moment”), actively collaborating with peers (i.e. supporting their fellow players), and willing to make mistakes (i.e. taking risks). In other words, you want them to develop the skills that improvisers are trained in from their very first improv class.

I started taking improv classes in 2002 at the Hideout Theatre in Austin, TX right around the same time I started a Ph.D. program in mathematics at the University of Texas at Austin. I realized that the dynamics being developed in my improv classes and troupes were exactly the ones I wanted to develop among the students in my math classes. So I started using improv games and exercises in my courses. And I haven’t stopped. I have now taught mathematics to hundreds of college students, and in every course, I have incorporated some amount of improv. I have given workshops and presentations to mathematicians, high school teachers, and students about how to use improv to improve group dynamics or to foster communication. It is powerful to see joy and play cultivated in a college-level mathematics course. Anecdotally, these techniques work–not for every student, every time–but for enough students enough of the time that I keep using my old favorites and finding new ones to try.

Here are three exercises that you might try in your own math classes. I use these in college classes, but they are easily (and some might argue, more readily) adaptable to younger ages.

**Scream circle:** Have the students stand in a circle and put their heads down. On the count of three, they should all raise their heads and look directly at another student. If two students find themselves looking at each other, they scream, and that pair leaves the circle to work together. Everyone else puts their heads back down, and the process repeats until the whole class is paired up.

This exercise is a great way to pair up students to work together. It also develops the idea of risk-taking because students are encouraged to scream as loud as they can. It is also quick–depending on the size of the class, this can take fewer than 2 minutes.

**Five-headed expert:** Have five students come to the front of the room and stand in a line. This can be played a few ways. Here are two:

- The students respond to questions one word at a time, as though they are five heads on the same body. Introduce the visiting “expert” and ask them questions related to course content. Time permitting, have the class ask questions.
- The students respond to questions all in one voice. Otherwise, the game is the same.

This game is a fun way to review concepts and definitions. (For example, what is the limit definition of the derivative?) It also works on the skills of collaboration and being “in the moment.” Students must listen to each other and work together to say things that make sense.

For an example of how this game works in an improv performance, watch this video from the improv group Stranger Things Have Happened.

**I am a tree:** Have the students stand in a circle. One student walks to the center and makes an “I am” statement while striking a pose. The next student enters the circle and adds to the tableau with another “I am” statement. A third (and probably final) student enters the tableau like the second. The professor then clears the tableau, either leaving one of the students to repeat their “I am” statement or not.

This game really highlights the need for collaboration, especially when used in a math context. I use this as a review or as a way to synthesize concepts. For example, this could be used to review different sets of numbers. Student one might start with “I am the set of real numbers” and hold his or her arms in a big circle to indicate a set. Student two could enter the “set” and say, “I am the rationals.” Another student might encircle the reals with their arms and say, “I am the complex numbers,” since the reals contain the reals as a subset.

For an introduction to I am a tree, check out this demonstration video from my former improv teacher and troupe mate, Shana Merlin of Merlin Works.

I use a lot of active learning techniques in my classes, and I have found improv exercises to be a quick and fun way to develop some of the non-mathematical skills that my students need to be successful in my classroom. It takes some courage to engage with your students in this way, but I think it is well worth it.

As a final thought, improvisational comedy techniques are not just for students. They can help professional mathematicians become better communicators and more effective teachers. They can even stimulate creativity and problem-solving skills. I encourage you to visit your local comedy theater and to sign up for an improv class.

*Andrea Young is the Special Assistant to the President and Liaison to the Board of Trustees and an Associate Professor of Mathematical Sciences at Ripon College. For many years, she performed improv all around the country with **Girls, Girls, Girls Improvised Musicals** and a variety of other Austin improv troupes. These days she mostly does community theater, although she regularly improvises silly songs and dances for her toddler. For more about using improv in math courses, check out **mathprov.wordpress.com**.*

Thanks Andrea – it was so great to find someone who was already doing what I was thinking about doing. I would love to hear from other people who have used improv games and techniques in maths and statistics classes. I am learning improv at present, and like the idea of “Yes and…” I will write some more about this in time.


**Please comment on the article here:** **Learn and Teach Mathematics and Statistics**



The post What are the odds of Trump’s winning in 2020? appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post What are the odds of Trump’s winning in 2020? appeared first on All About Statistics.

Kevin Lewis asks:

What are the odds of Trump’s winning in 2020, given that the last three presidents were comfortably re-elected despite one being a serial adulterer, one losing the popular vote, and one bringing race to the forefront?

My reply:

Serial adulterer, poor vote in previous election, ethnicity . . . I don’t think these are so important. It does seem that parties do better when running for a second term (i.e., reelection) than when running for a third term (i.e., a new candidate), but given our sparse data it’s hard to distinguish these three stories:

1. Incumbency advantage: some percentage of voters support the president.

2. Latent variable: given that a candidate wins once, that’s evidence that he’s a strong candidate, hence it’s likely he’ll win again.

3. Pendulum or exhaustion: after a while, voters want a change.

My guess is that the chances in 2020 of the Republican candidate (be it Trump or someone else) will depend a lot on how the economy is growing at the time. This is all within the approximately 50/50 national division associated with political polarization. If the Republican party abandons Trump, that could hurt him a lot. But the party stuck with Trump in 2016, so they very well might in 2020 as well.

I guess I should blog this. Not because I’m telling you anything interesting but because it can provide readers a clue as to how little I really know.

Also, by the time the post appears in March, who knows what will be happening.


**Please comment on the article here:** **Statistical Modeling, Causal Inference, and Social Science**



The post What is not but could be if appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post What is not but could be if appeared first on All About Statistics.

*And if I can remain there I will say – Baby Dee*

Obviously this is a blog that loves the tabloids. But as we all know, the best stories are the ones that confirm your own prior beliefs (because those must be true). So I’m focussing on this article in Science that talks about how STEM undergraduate programmes in the US lose gay and bisexual students. This *leaky pipeline* narrative (that diversity is smaller the further you go in a field because minorities drop out earlier) is pretty common when you talk about diversity in STEM. But this article says that there are now numbers! So let’s have a look…

From the article:

The new study looked at a 2015 survey of 4162 college seniors at 78 U.S. institutions, roughly 8% of whom identified as LGBQ (the study focused on sexual identity and did not consider transgender status). All of the students had declared an intention to major in STEM 4 years earlier. Overall, 71% of heterosexual students and 64% of LGBQ students stayed in STEM. But looking at men and women separately uncovered more complexity. After controlling for things like high school grades and participation in undergraduate research, the study revealed that heterosexual men were 17% more likely to stay in STEM than their LGBQ male counterparts. The reverse was true for women: LGBQ women were 18% more likely than heterosexual women to stay in STEM.

Ok. There’s a lot going on here. First things first, let’s say a big hello to Simpson’s paradox! Although LGBQ people have a lower attainment rate in STEM, it’s driven by men going down and women going up. I think the thing that we can read straight off this is that there are “base rate” problems happening all over the place. (Note that the effect is similar across the two groups and in opposite directions, yet the combined total is fairly strongly aligned with the male effect.) We are also talking about a drop out of around 120 of the 333 LGBQ students in the survey. So the estimate will be noisy.
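As a sanity check on those base rates, the quoted percentages do reproduce the numbers above. This is my own back-of-envelope arithmetic, not from the article:

```python
# back-of-envelope check of the survey numbers quoted above
n_seniors = 4162        # college seniors surveyed
lgbq_share = 0.08       # "roughly 8%" identified as LGBQ
n_lgbq = round(n_seniors * lgbq_share)

stayed_share = 0.64     # 64% of LGBQ students stayed in STEM
n_left = round(n_lgbq * (1 - stayed_share))

print(n_lgbq, n_left)   # → 333 120
```

About 120 drop-outs out of 333 LGBQ respondents, which is why the subgroup estimates are noisy.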

I’m less worried about forking paths–I don’t think it’s unreasonable to expect the experience to differ across gender. Why? Well there is a well known problem with gender diversity in STEM. Given that gay women are potentially affected by two different leaky pipelines, it sort of makes sense that the interaction between gender and LGBQ status would be important.

The actual article does better–it’s all done with multilevel logistic regression, which seems like an appropriate tool. There are p-values everywhere, but that’s just life. I struggled to work out from the paper exactly what the model was (sometimes my eyes just glaze over…), but it seems to have been done fairly well.

As with anything however (see also Gayface), the study is only as generalizable as the data set. The survey seems fairly large, but I’d worry about non-response. And, if I’m honest with you, me at 18 would’ve filled out that survey as straight, so there are also some problems there.

So a very shallow read of the paper makes it seem like the stats is good enough. But what if it’s not? Does that really matter?

This is one of those effects that’s anecdotally expected to be true. But more importantly, a lot of the proposed fixes are the types of low-cost interventions that don’t really need to work very well to be “value for money”.

For instance, it’s suggested that STEM departments work to make LGBT+ visibility more prominent (have visible, active inclusion policies). They suggest that people teaching pay attention to diversity in their teaching material.

The common suggestion for the last point is to pay special attention to work by women and under-represented groups in your teaching. This is never a bad thing, but if you’re teaching something very old (like the central limit theorem or differentiation), there’s only so much you can do. The thing that we all have a lot more control over is our examples and exercises. It is a no-cost activity to replace, for example, “Bob and Alice” with “Barbra and Alice” or “Bob and Alex”.

This type of low-impact diversity work signals to students that they are in a welcoming environment. Sometimes this is enough.

A similar example (but further up the pipeline) is that when you’re interviewing PhD students, postdocs, researchers, or faculty, don’t ask the men if they have a wife. Swapping to a gender neutral catch-all (partner) is super-easy. Moreover, it doesn’t force a person who is not in an opposite gender relationship to throw themselves a little pride parade (or, worse, to let the assumption fly because they’re uncertain if the mini-pride parade is a good idea in this context). *Partner* is a gender-neutral term. *They* is a gender-neutral pronoun. They’re not hard to use.

These environmental changes are important. In the end, if you value science you need to value diversity. Losing women, racial and ethnic minorities, LGBT+ people, disabled people, and other minorities really means that you are making your talent pool more shallow. A deeper pool leads to better science and creating a welcoming, positive environment is a serious step towards deepening the pool.

Making a welcoming environment doesn’t fix STEM’s diversity problem. There is a lot more work to be done. Moreover, the ideas in the paragraph above may do very little to improve the problem. They are also fairly quiet solutions–no one knows you’re doing these things on purpose. That is, they are half-arsed activism.

The thing is, as much as it’s lovely to have someone loudly on my side when I need it, I mostly just want to feel welcome where I am. So this type of work is actually really important. No one will ever give you a medal, but that doesn’t make it less appreciated.

The other thing to remember is that sometimes half-arsed activism is all that’s left to you. If you’re a student, or a TA, or a colleague, you can’t singlehandedly change your work environment. More than that, if a well-intentioned-but-loud intervention isn’t carefully thought through it may well make things worse. (For example, a proposal at a previous workplace to ensure that all female students (about 400 of them) have a female faculty mentor (about 7 of them) would’ve put a completely infeasible burden on the female faculty members.)

So don’t discount low-key, low-cost, potentially high-value interventions. They may not make things perfect, but they can make things better and maybe even “good enough”.





The post What We Talk About When We Talk About Bias appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post What We Talk About When We Talk About Bias appeared first on All About Statistics.

Shira Mitchell wrote:

I gave a talk today at Mathematica about NHST in low power settings (Type M/S errors). It was fun and the discussion was great.

One thing that came up is bias from doing some kind of regularization/shrinkage/partial-pooling versus selection bias (confounding, nonrandom samples, etc). One difference (I think?) is that the first kind of bias decreases with sample size, but the latter won’t. Though I’m not sure how comforting that is in small-sample settings. I’ve read this post which emphasizes that unbiased estimates don’t actually exist, but I’m not sure how relevant this is.

I replied that the error is to think that an “unbiased” estimate is a good thing. See p.94 of BDA.

And then Shira shot back:

I think what is confusing to folks is when you use unbiasedness as a principle here, for example here:

Ahhhh, good point! I was being sloppy. One difficulty is that in classical statistics, there are two similar-sounding but different concepts, unbiased *estimation* and unbiased *prediction*. For Bayesian inference we talk about calibration, which is yet another way that an estimate can be correct on average.

The point of my above-linked BDA excerpt is that, in some settings, unbiased estimation is not just a nice idea that can’t be done in practice or can be improved in some ways; rather it’s an actively bad idea that leads to terrible estimates. The key is that classical unbiased estimation requires E(theta.hat|theta) = theta *for any theta*, and, given that some outlying regions of theta are highly unlikely, the unbiased estimate has to be a contortionist in order to get things right for those values.
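A minimal simulation of this point (my own sketch, not from BDA): when the plausible values of theta concentrate near zero, a deliberately biased shrinkage estimate beats the classical unbiased estimate on mean squared error. The noise level sigma and the N(0, 1) distribution for theta are assumptions for illustration:

```python
import random

random.seed(0)
sigma = 2.0                   # assumed measurement noise
shrink = 1 / (1 + sigma**2)   # posterior-mean factor when theta ~ N(0, 1)

n_sims = 50_000
sse_unbiased = sse_shrunk = 0.0
for _ in range(n_sims):
    theta = random.gauss(0, 1)       # a "reasonable" parameter value
    y = random.gauss(theta, sigma)   # y is the classical unbiased estimate
    sse_unbiased += (y - theta) ** 2
    sse_shrunk += (shrink * y - theta) ** 2  # biased toward zero

mse_unbiased = sse_unbiased / n_sims  # ≈ sigma^2 = 4.0
mse_shrunk = sse_shrunk / n_sims      # ≈ sigma^2 / (1 + sigma^2) = 0.8
```

The unbiased estimate is correct on average for every fixed theta, yet averaged over the thetas we actually care about, its error is several times larger than the shrunken estimate’s.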

But in certain settings the idea of unbiasedness is relevant, as in the linked post above where we discuss the problems of selection bias. And, indeed, type M and type S errors are defined with respect to the true parameter values. The key difference is that we’re estimating these errors—these biases—conditional on reasonable values of the underlying parameters. We’re not interested in these biases conditional on unreasonable values of theta.

Subtle point, worth thinking about carefully. Bias is important, but only conditional on reasonable values of theta.

**P.S.** Thanks to Jaime Ashander for the above picture.





The post Bob’s talk at Berkeley, Thursday 22 March, 3 pm appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post Bob’s talk at Berkeley, Thursday 22 March, 3 pm appeared first on All About Statistics.

It’s at the Institute for Data Science at Berkeley.

- Hierarchical Modeling in Stan for Pooling, Prediction, and Multiple Comparisons
- 22 March 2018, 3 pm
- 190 Doe Library, UC Berkeley

And here’s the abstract:

I’ll provide an end-to-end example of using R and Stan to carry out full Bayesian inference for a simple set of repeated binary trial data: Efron and Morris’s classic baseball batting data, with multiple players observed for many at bats; clinical trial, educational testing, and manufacturing quality control problems have the same flavor.

We will consider three models that provide complete pooling (every player is the same), no pooling (every player is independent), and partial pooling (every player is to some degree like every other player). Hierarchical models allow the degree of similarity to be jointly modeled with individual effects, tightening estimates and sharpening predictions compared to the no pooling and complete pooling models. They also outperform empirical Bayes and max marginal likelihood predictively, both of which rely on point estimates of hierarchical parameters (aka “mixed effects”). I’ll show how to fit observed data to make predictions for future observations, estimate event probabilities, and carry out (multiple) comparisons such as ranking. I’ll explain how hierarchical modeling mitigates the multiple comparison problem by partial pooling (and I’ll tie it into rookie of the year effects and sophomore slumps). Along the way, I will show how to evaluate models predictively, preferring those that are well calibrated and make sharp predictions. I’ll also show how to evaluate model fit to data with posterior predictive checks and Bayesian p-values.
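The three pooling regimes in the abstract can be sketched without Stan. Below is a toy Python version with made-up hits/at-bats data; the fixed prior strength `kappa` is my assumption (a real hierarchical model estimates it jointly with the player effects):

```python
# toy repeated binary-trial data: player -> (hits, at_bats); values are made up
data = {"A": (18, 45), "B": (10, 45), "C": (16, 45)}

hits = sum(h for h, _ in data.values())
at_bats = sum(n for _, n in data.values())

complete_pool = hits / at_bats                      # every player the same
no_pool = {p: h / n for p, (h, n) in data.items()}  # every player independent

kappa = 50  # assumed prior strength; a hierarchical model would learn this
partial_pool = {p: (h + kappa * complete_pool) / (n + kappa)
                for p, (h, n) in data.items()}      # shrunk toward pooled rate
```

Each partial-pooling estimate lands between the player’s raw rate and the overall rate, which is the “tightening estimates” behavior the abstract describes.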





The post Gaydar and the fallacy of objective measurement appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post Gaydar and the fallacy of objective measurement appeared first on All About Statistics.

Greggor Mattson, Dan Simpson, and I wrote this paper, which begins:

Recent media coverage of studies about “gaydar,” the supposed ability to detect another’s sexual orientation through visual cues, reveal problems in which the ideals of scientific precision strip the context from intrinsically social phenomena. This fallacy of objective measurement, as we term it, leads to nonsensical claims based on the predictive accuracy of statistical significance. We interrogate these gaydar studies’ assumption that there is some sort of pure biological measure of perception of sexual orientation. Instead, we argue that the concept of gaydar inherently exists within a social context and that this should be recognized when studying it. We use this case as an example of a more general concern about illusory precision in the measurement of social phenomena, and suggest statistical strategies to address common problems.

There’s a funny backstory to this one.

I was going through my files a few months ago and came across an unpublished paper of mine from 2012, “The fallacy of objective measurement: The case of gaydar,” which I didn’t even remember ever writing! A completed article, never submitted anywhere, just sitting in my files.

How can that happen? I must be getting old.

Anyway, I liked the paper—it addresses some issues of measurement that we’ve been talking about a lot lately. In particular, “the fallacy of objective measurement”: researchers took a rich real-world phenomenon and abstracted it so much that they removed its most interesting content. “Gaydar” existed within a social context—a world in which gays were an invisible minority, hiding in plain sight and seeking to be inconspicuous to the general population while communicating with others of their subgroup. How can it make sense to boil this down to the shapes of faces?

Stripping a phenomenon of its social context, normalizing a base rate to 50%, and seeking an on-off decision: all of these can give the feel of scientific objectivity—but the very steps taken to ensure objectivity can remove social context and relevance.

We had some gaydar discussion (also here) on the blog recently and this motivated me to freshen up the gaydar paper, with the collaboration of Mattson and Simpson. I also recently met Michal Kosinski, the coauthor of one of the articles under discussion, and that was helpful too.




The post "Partitioning a Large Simulation as It Runs" (Next Week at the Statistics Seminar) appeared first on All About Statistics.

*Attention conservation notice:* Only of interest if you (1) care about running large simulations which are actually good for something, and (2) will be in Pittsburgh on Tuesday.

- Kary Myers, "Partitioning a Large Simulation as It Runs" (Technometrics forthcoming)
*Abstract:* As computer simulations continue to grow in size and complexity, they present a particularly challenging class of big data problems. Many application areas are moving toward exascale computing systems, systems that perform $10^{18}$ FLOPS (FLoating-point Operations Per Second) --- a billion billion calculations per second. Simulations at this scale can generate output that exceeds both the storage capacity and the bandwidth available for transfer to storage, making post-processing and analysis challenging. One approach is to embed some analyses in the simulation while the simulation is running --- a strategy often called in situ analysis --- to reduce the need for transfer to storage. Another strategy is to save only a reduced set of time steps rather than the full simulation. Typically the selected time steps are evenly spaced, where the spacing can be defined by the budget for storage and transfer. Our work combines both of these ideas to introduce an online in situ method for identifying a reduced set of time steps of the simulation to save. Our approach significantly reduces the data transfer and storage requirements, and it provides improved fidelity to the simulation to facilitate post-processing and reconstruction. We illustrate the method using a computer simulation that supported NASA's 2009 Lunar Crater Observation and Sensing Satellite mission.

*Time and place:* 4--5 pm on Tuesday, 10 May 2016, in Baker Hall 235B

As always, the talk is free and open to the public.

**Please comment on the article here:** **Three-Toed Sloth **



The post You need 16 times the sample size to estimate an interaction than to estimate a main effect appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post You need 16 times the sample size to estimate an interaction than to estimate a main effect appeared first on All About Statistics.

Yesterday I shared the following exam question:

In causal inference, it is often important to study varying treatment effects: for example, a treatment could be more effective for men than for women, or for healthy than for unhealthy patients. Suppose a study is designed to have 80% power to detect a main effect at a 95% confidence level. Further suppose that interactions of interest are half the size of main effects. What is its power for detecting an interaction, comparing men to women (say) in a study that is half men and half women? Suppose 1000 studies of this size are performed. How many of the studies would you expect to report a statistically significant interaction? Of these, what is the expectation of the ratio of estimated effect size to actual effect size?

Here’s the solution:

If you have 80% power, then the underlying effect size for the main effect is 2.8 standard errors from zero. That is, the z-score has a mean of 2.8 and standard deviation of 1, and there’s an 80% chance that the z-score exceeds 1.96 (in R, pnorm(2.8, 1.96, 1) = 0.8).

Now to the interaction. The standard error of an interaction is roughly twice the standard error of the main effect, as we can see from some simple algebra:

– The estimate of the main effect is ybar_1 – ybar_2, which has standard error sqrt(sigma^2/(N/2) + sigma^2/(N/2)) = 2*sigma/sqrt(N); for simplicity I’m assuming a constant variance within groups, which will typically be a good approximation for binary data, for example.

– The estimate of the interaction is (ybar_1 – ybar_2) – (ybar_3 – ybar_4), which has standard error sqrt(sigma^2/(N/4) + sigma^2/(N/4) + sigma^2/(N/4) + sigma^2/(N/4)) = 4*sigma/sqrt(N). [algebra fixed]

And, from the statement of the problem, we’ve assumed the interaction is half the size of the main effect. So if the main effect is 2.8 on some scale with a se of 1, then the interaction is 1.4 with an se of 2, thus the z-score of the interaction has a mean of 0.7 and a sd of 1, and the probability of seeing a statistically significant effect difference is pnorm(0.7, 1.96, 1) = 0.10. That’s right: if you have 80% power to estimate the main effect, you have 10% power to estimate the interaction.
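Both power numbers can be checked with nothing but the normal CDF. The post’s calculations use R’s `pnorm`; this Python sketch mirrors them via `math.erf`:

```python
from math import erf, sqrt

def pnorm(x, mean=0.0, sd=1.0):
    """Normal CDF, mirroring R's pnorm(x, mean, sd)."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

power_main = 1 - pnorm(1.96, mean=2.8)   # z ~ N(2.8, 1): P(z > 1.96)
power_inter = 1 - pnorm(1.96, mean=0.7)  # z ~ N(0.7, 1): P(z > 1.96)
print(round(power_main, 2), round(power_inter, 2))  # → 0.8 0.1
```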

And 10% power is really bad. It’s worse than it looks. 10% power kinda looks like it might be OK; after all, it still represents a 10% chance of a win. But that’s not right at all: if you do get “statistical significance” in that case, your estimate is a huge overestimate:

> raw <- rnorm(1e6, .7, 1)
> significant <- raw > 1.96
> mean(raw[significant])
[1] 2.4

So, the 10% of results which do appear to be statistically significant give an estimate of 2.4, on average, which is over 3 times higher than the true effect.
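The 2.4 from the simulation can also be computed exactly: conditional on crossing the significance threshold, the z-score follows a truncated normal. A small check using the standard truncated-normal mean formula (my arithmetic, not from the post):

```python
from math import erf, exp, pi, sqrt

def phi(x):   # standard normal density
    return exp(-x * x / 2) / sqrt(2 * pi)

def Phi(x):   # standard normal CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

mu, cutoff = 0.7, 1.96
a = cutoff - mu
exaggeration = mu + phi(a) / (1 - Phi(a))  # E[z | z > 1.96], z ~ N(0.7, 1)
print(round(exaggeration, 2))  # ≈ 2.44, over 3x the true effect of 0.7
```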

**Take-home point**

The most important point here, though, has nothing to do with statistical significance. It’s just this: Based on some reasonable assumptions regarding main effects and interactions, *you need 16 times the sample size to estimate an interaction than to estimate a main effect*.
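The factor of 16 is just the earlier algebra compressed: the interaction has twice the standard error and (by the problem’s assumption) half the effect size, and the required sample size scales with the square of se/effect. A one-line sanity check:

```python
se_ratio = 2.0       # interaction se is twice the main-effect se (4σ/√N vs 2σ/√N)
effect_ratio = 0.5   # interaction assumed half the size of the main effect
# z = effect/se grows like sqrt(N), so matching the main effect's z needs:
sample_size_factor = (se_ratio / effect_ratio) ** 2
print(sample_size_factor)  # → 16.0
```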

And this implies a major, major problem with the usual plan of designing a study with a focus on the main effect, maybe even preregistering, and then looking to see what shows up in the interactions. Or, even worse, designing a study, not finding the anticipated main effect, and then using the interactions to bail you out. The problem is not just that this sort of analysis is “exploratory”; it’s that these data are a lot noisier than you realize, so what you think of as interesting exploratory findings could be just a bunch of noise.

I don’t know if all this is in the textbooks, but it should be.

**Some regression simulations in R**

In response to a comment I did some simulations which I thought were worth including in the main post.

I played around in R to get a sense of how the standard errors depend on the parameterization.

For simplicity, all my simulations assume that the true (underlying) coefficients are 0; that’s no big deal, as my point here is to work out the standard error.

I started with the basic model in which I simulate 1000 data points with two predictors, each taking on the value (-0.5, 0.5). This is the same as the model in the above post: the estimated main effects are simple differences, and the estimated interaction is a difference in differences. I’ve also assumed the two predictors are independent, which is how I’d envisioned the problem:

library("arm")
N <- 1000
sigma <- 10
y <- rnorm(N, 0, sigma)
x1 <- sample(c(-0.5,0.5), N, replace=TRUE)
x2 <- sample(c(-0.5,0.5), N, replace=TRUE)
display(lm(y ~ x1))
display(lm(y ~ x1 + x2 + x1:x2))

And here's the result:

lm(formula = y ~ x1 + x2 + x1:x2)
            coef.est coef.se
(Intercept) -0.09     0.32
x1          -0.41     0.63
x2          -0.13     0.63
x1:x2       -0.91     1.26

Ignore the estimates; they're pure noise. Just look at the standard errors. They go just as in the above formulas: 2*sigma/sqrt(N) = 2*10/sqrt(1000) = 0.63, and 4*sigma/sqrt(N) = 1.26.

Now let's do the exact same thing but make the predictors take on the value (0, 1) rather than (-0.5, 0.5):

x1 <- sample(c(0,1), N, replace=TRUE)
x2 <- sample(c(0,1), N, replace=TRUE)
display(lm(y ~ x1))
display(lm(y ~ x1 + x2 + x1:x2))

Here's what we see:

lm(formula = y ~ x1 + x2 + x1:x2)
            coef.est coef.se
(Intercept) -0.44     0.64
x1           0.03     0.89
x2           0.95     0.89
x1:x2       -0.54     1.26

Again, just look at the standard errors. The s.e. for the interaction is still 1.26, but the standard errors for the main effects went up to 0.89. What happened?

What happened was that the main effects are now estimated at the edge of the data: the estimated coefficient of x1 is now the difference in y, comparing the two values of x1, just at x2=0. So its s.e. is sqrt(sigma^2/(N/4) + sigma^2/(N/4)) = 2*sqrt(2)*sigma/sqrt(N). Under this parameterization, the coefficient of x1 is estimated just from the data with x2=0, which is only half the data, so the s.e. is sqrt(2) times as big as before. Similarly for x2.
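A quick numeric check of these formulas, using the simulation's sigma = 10 and N = 1000:

```r
sigma <- 10
N <- 1000
2 * sigma / sqrt(N)            # main-effect s.e. under (-0.5, 0.5) coding: 0.63
2 * sqrt(2) * sigma / sqrt(N)  # main-effect s.e. under (0, 1) coding: 0.89
4 * sigma / sqrt(N)            # interaction s.e.: 1.26
```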

But these aren't really "main effects"; in the context of the above problem, the main effect of the treatment is the average over men and women, hence if we put the problem in a regression framework, we should be coding the predictors as (-0.5, 0.5), not (0, 1).

But here's another possibility: what about coding each predictor as (-1, 1)? What happens then? Let's take a look:

x1 <- sample(c(-1,1), N, replace=TRUE)
x2 <- sample(c(-1,1), N, replace=TRUE)
display(lm(y ~ x1))
display(lm(y ~ x1 + x2 + x1:x2))

This yields:

            coef.est coef.se
(Intercept) -0.23     0.31
x1           0.28     0.31
x2          -0.60     0.31
x1:x2        0.05     0.31

(Again, ignore the coefficient estimates and look at the standard errors.)

Hey: the standard errors are all smaller (the main effects' by a factor of 2), and they're all equal! What happened?

The factor of 2 is clear enough: If you multiply x by 2, and x*beta doesn't change, then you have to divide beta by 2 to compensate, and its standard error gets divided by 2 as well. But what happened to the interaction? Well, that's clear too: we've multiplied x1 and x2 each by 2, so x1*x2 is multiplied by 4.
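The rescaling argument can be checked empirically. This is a minimal sketch with a single predictor; the exact factor of 2 is an algebraic identity, not a simulation artifact:

```r
set.seed(123)
N <- 1000
sigma <- 10
y  <- rnorm(N, 0, sigma)
x1 <- sample(c(-0.5, 0.5), N, replace = TRUE)
b_half <- coef(lm(y ~ x1))[2]         # (-0.5, 0.5) coding
b_one  <- coef(lm(y ~ I(2 * x1)))[2]  # same predictor rescaled to (-1, 1)
b_half / b_one  # exactly 2: doubling x halves beta, and halves its s.e. too
```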

So to make sense of all these standard errors, you have to have a feel for the appropriate scale for the coefficients. In my post above, when I talked about interactions being half the size of main effects, I was thinking of differences between the two groups, which corresponds to parameterizing the predictors as (-0.5, 0.5).

As you can see from the above simulations, the exact answer will depend on your modeling assumptions.

**It all depends on what you mean by "interaction half the size of the main effect"**

This came up in comments.

What do I mean by "interactions of interest are half the size of main effects"? Suppose the main effect of the treatment is, say, 0.6 and the interaction with sex is 0.3; then the treatment effect is 0.45 for women and 0.75 for men. That's what I meant.

If, by "interactions of interest are half the size of main effects," you have in mind something like a main effect of 0.6 that is 0.3 for women and 0.9 for men, then you'll only need 4x the sample size, not 16x.
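To spell out the two readings numerically (a small sketch; with sex coded (-0.5, 0.5), the per-group treatment effect is main + interaction * sex):

```r
main <- 0.6
# First reading: interaction half the main effect (coefficient 0.3
# with sex coded -0.5/0.5) gives per-group treatment effects:
main + 0.3 * c(-0.5, 0.5)  # 0.45 (women), 0.75 (men)
# Second reading: effects of 0.3 and 0.9 imply an interaction
# coefficient of 0.6, i.e., the same size as the main effect:
main + 0.6 * c(-0.5, 0.5)  # 0.30 (women), 0.90 (men)
```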

**P.S.** Further discussion here. Thanks, everyone, for all the thoughtful comments.

The post You need 16 times the sample size to estimate an interaction than to estimate a main effect appeared first on Statistical Modeling, Causal Inference, and Social Science.

**Please comment on the article here:** **Statistical Modeling, Causal Inference, and Social Science**


The post Math Notation for R Plot Titles: expression, bquote, & Greek Letters appeared first on All About Statistics.

In this post you will learn:

- **How to create expressions that have mixed (1) strings, (2) expressions, (3) variables & (4) Greek letters**
- **How to pass in values as variables to an expression**

I wanted to name this post "Ahhhhhhhhhhh #$@%&!!!!" but SEO isn't terrific for that title, so I tried to make the actual title as Googleable as possible. I'm writing this post for future me and past me. If some of the rest of you find it useful, even better.

**Problem:** Math Expressions

**Specifically:** Plotting Them

Seems every time I need to plot a title with math notation I wind up wasting half an hour on what ought to be an easy task. It's probably because I don't need to do this task often. It's also because R has its own way to write maths (not LaTeX or something I'm familiar with). It's also because there are several ways to accomplish this task in R. And it's also because I've never spent the time defining how to do the process. I can only control the latter two of these four. Today I define how to write a plot with a title that has a math expression.

**Success is if I can easily plot the following title:**

This is a layperson's guide written for and by a layperson. I'm sure there's a precise reason why plotting math notation is quirky. I don't have the cognitive space or care to know why. I'm learning and sharing enough to reliably get the job done. In the future maybe I'll care more about the why.

This was my first aha. I don't know exactly what an `expression` or `call` is. I also don't know if math notation can be done without these, but since I'm focused on a single way to get this done, let's just go with it for now. I just know it's not a string, for sure. Hadley has a whole chapter on expressions if you really want the full treatment; read about it here: http://adv-r.had.co.nz/Expressions.html I've read it a few times but my brain hasn't retained the distinctions long-term yet. This Stack Overflow question also explains a good bit of the difference between an expression and a call: https://stackoverflow.com/q/20355547/1000343 But let's get back to the task at hand: a single way to plot the above title with mixed strings, notation, & numbers.

I've seen mixed string, number, and math expression annotation done with `expression`, `parse(text=`, `bquote`, `substitute`, & clever use of backticks. For me, the most straightforward way is `bquote`. It seems to be pretty flexible for most tasks. Here are the four rules to overcome math notation title blues when using `bquote`:

- **Strings** – Require quotes wrapped with a tilde separator (e.g., `"my text" ~`).
- **Math Expressions** – Unquoted & follow `?plotmath`.
- **Numbers** – Unquoted when part of math notation.
- **Variables** – Use `.()` (pass in a string or numeric).

Got that? Great!

Now you can build whatever. For example, say we want to (1) pass a variable name to a plot title, (2) followed by math notation (correlation), (3) being equal to a correlation value, (4) followed by a string, and lastly, (5) one more math notation. Well, that's:

Use the rules. Here's a visual representation of the rules. Notice that only a string gets quotes around it? Notice the tilde separators around quoted strings? Notice the *cor* value is passed in (more on this in a moment)? If you struggle with the math notation, see `?plotmath`.

Note that if the correlation being passed in as a variable were just a number manually placed in the expression, the value would simply be part of the math notation.

I'm going to plot this two times: once where the variable `cor` being passed in is a double, and once where it's a string (`cor2`). Notice that a leading zero appears for the double?

## A variable to pass in
cor <- -.321
cor2 <- '-.321'

par(mfrow = c(1, 2))
plot(1:10, 1:10, main = bquote("Hello" ~ r[xy] == .(cor) ~ "and" ~ B^2))
plot(1:10, 1:10, main = bquote("Hello" ~ r[xy] == .(cor2) ~ "and" ~ B^2))

Works for **ggplot2** as well.

library(ggplot2)
ggplot() +
    labs(title = bquote("Hello" ~ r[xy] == .(cor2) ~ "and" ~ B^2))

Alas, there is more than one way to accomplish math notation in titles in R. If you want just one way, then `bquote` and the 4 rules will likely always get it done; skip this brief section.

If you're still reading…you can also use a `parse(text =` and `paste` method. The two approaches are similar to a `sprintf` vs. `paste` approach to string manipulation (see paste, paste0, and sprintf). This approach requires a bit more reasoning, and `cor2` is coerced to a double with a leading zero when it's evaluated.
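For completeness, here is a hedged sketch of the `parse(text =` + `paste` approach (the pasted string must itself be valid plotmath; the `cor2` variable follows the example above):

```r
cor2 <- '-.321'
# Build the plotmath string, then parse it into an expression
title_expr <- parse(text = paste0('"Hello" ~ r[xy] == ', cor2, ' ~ "and" ~ B^2'))
plot(1:10, 1:10, main = title_expr)
```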

There are more ways too, but I'll leave those for the curious reader.

**Please comment on the article here:** **TRinker's R Blog**


The post Doing my duty on Pi Day #onelesspie appeared first on All About Statistics.

Xan Gregg and I started a #onelesspie campaign a few years ago. On Pi Day each year, we find a pie chart and remake it. On Wikipedia, you can find all manner of pie charts. Try this search, and see for yourself.

Here's one found on the Wiki page about the city of Ogema, in Canada:

This chart has 20 age groups, each given a different color. That's way too much!

I was able to find data on 10-year age groups, not five-year ones. But the "shape" of the distribution is much more easily seen on a column chart (a histogram).

Only a single color is needed.

The reason I gravitated to this chart was the highly unusual age distribution: this town has an almost uniform distribution of age groups, with each of the 10-year ranges accounting for about 11% of the population. Given that there are 9 groups, a perfectly even distribution would be about 11% per column. (Well, the last group of 80+ is cheating a bit, as it covers more than 10 years.)

I don't know about Ogema. Maybe a reader can explain this unusual age distribution!

**Please comment on the article here:** **Junk Charts**

