
The post Stan Weekly Roundup, 23 June 2017 appeared first on Statistical Modeling, Causal Inference, and Social Science.

The post Stan Weekly Roundup, 23 June 2017 appeared first on All About Statistics.

Lots of activity this week, as usual.

* Lots of people got involved in pushing Stan 2.16 and interfaces out the door; Sean Talts got the math library, Stan library (that’s the language, inference algorithms, and interface infrastructure), and CmdStan out, while Allen Riddell got PyStan 2.16 out and Ben Goodrich and Jonah Gabry are tackling RStan 2.16

* Stan 2.16 is the last series of releases that will not require C++11; let the coding fun begin!

* Ari Hartikainen (of Aalto University) joined the Stan dev team—he’s working with Allen Riddell on PyStan, where judging from the pull request traffic, he put in a lot of work on the 2.16 release. Welcome!

* Imad Ali’s working on adding more cool features to RStanArm, including time series and spatial models; yesterday he and Mitzi were scheming to get intrinsic conditional autoregressive models in, and I heard all those time series names flying around (like ARIMA)

* Michael Betancourt rearranged the Stan web site with some input from me and Andrew; Michael added more descriptive text, and Sean Talts managed to get the redirects in so that our links aren’t broken; let us know what you think

* Markus Ojala of Smartly wrote a case study on their blog, Tutorial: How We Productized Bayesian Revenue Estimation with Stan

* Mitzi Morris got in the pull request for adding compound assignment and arithmetic; this adds statements such as `n += 1`.

* lots of chatter about characterization tests and a pull request from Daniel Lee to update some of our existing performance tests

* Roger Grosse from U. Toronto visited to tell us about the 2016 NIPS paper he wrote with Siddharth Ancha and Daniel Roy on testing MCMC using bidirectional Monte Carlo sampling; we talked about how he modified Stan’s sampler to do annealed importance sampling

* GPU integration continues apace

* I got to listen in as Michael Betancourt and Maggie Lieu of the European Space Institute spent a couple of days hashing out astrophysics models; Maggie would really like us to add integrals.

* Speaking of integration, Marco Inacio has updated his pull request; Michael’s worried there may be numerical instabilities, because trying to calculate arbitrary bounded integrals is not so easy in a lot of cases

* Andrew continues to lobby for being able to write priors directly into parameter declarations; for example, here’s what a hierarchical prior for `beta` might look like:

```
parameters {
  real mu ~ normal(0, 2);
  real sigma ~ student_t(4, 0, 2);
  vector[N] beta ~ normal(mu, sigma);
}
```

* I got the go-ahead on adding foreach loops; Mitzi Morris will probably be coding them. We’re talking about something like this:

```
real ys[N];
...
for (y in ys)
  target += log_mix(lambda,
                    normal_lpdf(y | mu[1], sigma[1]),
                    normal_lpdf(y | mu[2], sigma[2]));
```
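For readers who haven’t seen it, Stan’s `log_mix(theta, lp1, lp2)` returns the log of a two-component mixture density evaluated from the components’ log densities. A minimal Python sketch of the same quantity, computed stably via log-sum-exp:

```python
import math

def log_mix(theta, lp1, lp2):
    """log(theta * exp(lp1) + (1 - theta) * exp(lp2)), computed stably
    on the log scale -- the quantity Stan's log_mix returns."""
    a = math.log(theta) + lp1
    b = math.log1p(-theta) + lp2
    m = max(a, b)  # shift by the max before exponentiating to avoid underflow
    return m + math.log(math.exp(a - m) + math.exp(b - m))
```

For example, `log_mix(0.5, log(0.2), log(0.4))` equals `log(0.3)`, the log density of an equal-weight mixture of the two components.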

* Kalman filter case study by Jouni Helske was discussed on Discourse

* This is a very ad hoc list. I’m sure I missed lots of good stuff, so feel free to either send updates to me directly for next week’s letter or add things to comments. This project’s now way too big for me to track all the activity!



The post Question about the secret weapon appeared first on Statistical Modeling, Causal Inference, and Social Science.


Micah Wright writes:

I first encountered your explanation of secret weapon plots while I was browsing your blog in grad school, and later in your 2007 book with Jennifer Hill. I found them immediately compelling and intuitive, but I have been met with a lot of confusion and some skepticism when I’ve tried to use them. I’m uncertain as to whether it’s me that’s confused, or whether my audience doesn’t get it. I should note that my formal statistical training is somewhat limited—while I was able to take a couple of stats courses during my masters, I’ve had to learn quite a bit on the side, which makes me skeptical as to whether or not I actually understand what I’m doing.

My main question is this: when using the secret weapon, does it make sense to subset the data across any arbitrary variable of interest, as long as you want to see if the effects of other variables vary across its range? My specific case concerns tree growth (ring widths). I’m interested to see how the effect of competition (crowding and other indices) on growth varies at different temperatures, and if these patterns change in different locations (there are two locations). To do this, I subset the growth data in two steps: first by location, then by each degree of temperature, which I rounded to the nearest integer. I then ran the same linear model on each subset. The model had growth as the response, and competition variables as predictors, which were standardized. I’ve attached the resulting figure [see above], which plots the change in effect for each predictor over the range of temperature.

My reply: I like these graphs! In the future you might try a 6 x K grid, where K is the number of different things you’re plotting. That is, right now you’re wasting one of your directions, because your 2 x 3 grid doesn’t mean anything. These plots are fine, but if you have more information for each of these predictors, you could plot the existing information as six little graphs stacked vertically and then you’d have room for additional columns. In addition, you should make the tick marks much smaller, put the labels closer to the axes, and reduce the number of axis labels, especially on the vertical axes. For example, (0.0, 0.3, 0.6, 0.9) can be replaced by labels at 0, 0.5, 1.

Regarding the larger issue of, what is the secret weapon, as always I see it as an approximation to a full model that bridges the different analyses. It’s a sort of nonparametric analysis. You should be able to get better estimates by using some modeling, but a lot of that smoothing can be done visually anyway, so the secret weapon gets you most of the way there, and in my view it’s much much better than the usual alternative of fitting a single model to all the data *without* letting all the coefficients vary.
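To make the method concrete, here is a minimal Python sketch of the secret weapon on simulated tree-growth-style data. All variable names and the simulated effect are made up for illustration; the point is the workflow: fit the same model separately within each subset, then collect the coefficients and standard errors for plotting side by side.

```python
import numpy as np

rng = np.random.default_rng(0)

def secret_weapon(x, y, groups):
    """Fit the same regression y ~ x separately within each subset and
    return {group: (slope, standard error)}, ready to plot side by side."""
    fits = {}
    for g in np.unique(groups):
        m = groups == g
        X = np.column_stack([np.ones(m.sum()), x[m]])   # intercept + predictor
        beta, *_ = np.linalg.lstsq(X, y[m], rcond=None)
        resid = y[m] - X @ beta
        sigma2 = resid @ resid / (len(resid) - X.shape[1])
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
        fits[int(g)] = (beta[1], se)
    return fits

# hypothetical data: the effect of a standardized competition index on
# growth varies with (rounded) temperature, the grouping variable
temp = rng.integers(10, 15, size=500)
comp = rng.normal(size=500)
growth = 1.0 - 0.1 * (temp - 10) * comp + rng.normal(scale=0.5, size=500)

fits = secret_weapon(comp, growth, temp)
# plotting slope +/- s.e. against temperature is the secret-weapon display
```

The resulting dictionary maps each temperature to an estimated competition effect and its standard error, which is exactly what gets plotted in a secret-weapon figure.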



The post “Developers Who Use Spaces Make More Money Than Those Who Use Tabs” appeared first on Statistical Modeling, Causal Inference, and Social Science.


Rudy Malka writes:

I think you’ll enjoy this nice piece of pop regression by David Robinson: developers who use spaces make more money than those who use tabs. I’d like to know your opinion about it.

At the above link, Robinson discusses a survey that allows him to compare salaries of software developers who use tabs to those who use spaces. The key graph is above. Robinson found similar results after breaking down the data by country, job title, or computer language used, and it also showed up in a linear regression controlling in a simple way for a bunch of factors.

As Robinson put it in terms reminiscent of our Why Ask Why? paper:

This is certainly a surprising result, one that I didn’t expect to find when I started exploring the data. . . . I tried controlling for many other confounding factors within the survey data beyond those mentioned here, but it was difficult to make the effect shrink and basically impossible to make it disappear.

Speaking with the benefit of hindsight—that is, seeing Robinson’s results and assuming they are a correct representation of real survey data—it all makes sense to me. Tabs seem so amateurish, I much prefer spaces—2 spaces, not 4, please!!!—so from that perspective it makes sense to me that the kind of programmers who use tabs tend to be programmers with poor taste and thus, on average, of lower quality.

I just want to say one thing. Robinson writes, “Correlation is not causation, and we can never be sure that we’ve controlled for all the confounding factors present in a dataset.” But this isn’t quite the point. Or, to put it another way, I think he has the right instinct here but isn’t quite presenting the issue precisely. To see why, suppose the survey had only 2 questions: How much money do you make? and Do you use spaces or tabs? And suppose we had no other information on the respondents. And, for that matter, suppose there was no nonresponse and that we had a simple random sample of all programmers from some specified set of countries. In that case, we’d know for sure that there are no other confounding factors in the dataset, as the dataset is nothing but those two columns of numbers. But we’d still be able to come up with a zillion potential explanations.

To put it another way, the descriptive comparison is interesting in its own right, and we just should be careful about misusing causal language. Instead of saying, “using spaces instead of tabs leads to an 8.6% higher salary,” we could say, “comparing two otherwise similar programmers, the one who uses spaces has, on average, an 8.6% higher salary than the one who uses tabs.” That’s a bit of a mouthful—but such a mouthful is necessary to accurately describe the comparison that’s being made.



The post Time-sharing Experiments for the Social Sciences appeared first on Statistical Modeling, Causal Inference, and Social Science.


Jamie Druckman writes:

Time-sharing Experiments for the Social Sciences (TESS) is an NSF-funded initiative. Investigators propose survey experiments to be fielded using a nationally representative Internet platform via NORC’s AmeriSpeak® Panel (see http://tessexperiments.org for more information).

In an effort to enable younger scholars to field larger-scale studies than what TESS normally conducts, we are pleased to announce a Special Competition for Young Investigators. While anyone can submit at any time through TESS’s regular proposal mechanism, this Special Competition is limited to graduate students and individuals who are no more than 3 years post-PhD. Winning projects will be allowed to be fielded at up to twice the budget of a regular TESS study.

We will begin accepting proposals for the Special Competition on August 1, 2017, and the deadline is October 1, 2017. Full details about the competition are available at http://www.tessexperiments.org/yic.html. This page includes information about what is required of proposals and how to submit, and should be reviewed by anyone entering the competition.


The post scala-glm: Regression modelling in Scala appeared first on All About Statistics.

As discussed in the previous post, I’ve recently constructed and delivered a short course on statistical computing with Scala. Much of the course is concerned with writing statistical algorithms in Scala, typically making use of the scientific and numerical computing library, Breeze. Breeze has all of the essential tools necessary for building statistical algorithms, but doesn’t contain any higher level modelling functionality. As part of the course, I walked through how to build a small library for regression modelling on top of Breeze, including all of the usual regression diagnostics (such as standard errors, t-statistics, p-values, F-statistics, etc.). While preparing the course materials it occurred to me that it would be useful to package and document this code properly for general use. In advance of the course I packaged the code up into a bare-bones library, but since then I’ve fleshed it out, tidied it up and documented it properly, so it’s now ready for people to use.

The library covers PCA, linear regression modelling and simple one-parameter GLMs (including logistic and Poisson regression). The underlying algorithms are fairly efficient and numerically stable (eg. linear regression uses the QR decomposition of the model matrix, and the GLM fitting uses QR within each IRLS step), though they are optimised more for clarity than speed. The library also includes a few utility functions and procedures, including a pairs plot (scatter-plot matrix).
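To give a sense of what “linear regression via the QR decomposition of the model matrix” means numerically, here is a short sketch in Python with made-up data (this illustrates the standard technique, not scala-glm’s actual code):

```python
import numpy as np

def lm_qr(X, y):
    """Least-squares coefficients via the thin QR decomposition of the
    model matrix: X = QR, then solve the triangular system R b = Q'y.
    This avoids forming X'X, which squares the condition number."""
    Q, R = np.linalg.qr(X)
    return np.linalg.solve(R, Q.T @ y)

# hypothetical data: an intercept plus two standardized predictors
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = X @ np.array([1.0, 2.0, -3.0]) + rng.normal(scale=0.1, size=100)
beta = lm_qr(X, y)   # close to the true coefficients [1, 2, -3]
```

The same decomposition also yields the residual sum of squares and coefficient standard errors cheaply, which is why QR is the usual workhorse inside regression libraries.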

Plenty of documentation is available from the scala-glm github repo which I won’t repeat here. But to give a rough idea of how things work, I’ll run through an interactive session for the linear regression example.

First, download a dataset from the UCI ML Repository to disk for subsequent analysis (caching the file on disk is good practice, as it avoids unnecessary load on the UCI server, and allows running the code off-line):

```scala
import scalaglm._
import breeze.linalg._

val url = "http://archive.ics.uci.edu/ml/machine-learning-databases/00291/airfoil_self_noise.dat"
val fileName = "self-noise.csv"

// download the file to disk if it hasn't been already
val file = new java.io.File(fileName)
if (!file.exists) {
  val s = new java.io.PrintWriter(file)
  val data = scala.io.Source.fromURL(url).getLines
  data.foreach(l => s.write(l.trim.
    split('\t').filter(_ != "").
    mkString("", ",", "\n")))
  s.close
}
```

Once we have a CSV file on disk, we can load it up and look at it.

```scala
val mat = csvread(new java.io.File(fileName))
// mat: breeze.linalg.DenseMatrix[Double] =
// 800.0    0.0  0.3048  71.3  0.00266337  126.201
// 1000.0   0.0  0.3048  71.3  0.00266337  125.201
// 1250.0   0.0  0.3048  71.3  0.00266337  125.951
// ...

println("Dim: " + mat.rows + " " + mat.cols)
// Dim: 1503 6

val figp = Utils.pairs(mat, List("Freq", "Angle", "Chord", "Velo", "Thick", "Sound"))
// figp: breeze.plot.Figure = breeze.plot.Figure@37718125
```

We can then regress the response in the final column on the other variables.

```scala
val y = mat(::, 5)      // response is the final column
// y: DenseVector[Double] = DenseVector(126.201, 125.201, ...

val X = mat(::, 0 to 4)
// X: breeze.linalg.DenseMatrix[Double] =
// 800.0    0.0  0.3048  71.3  0.00266337
// 1000.0   0.0  0.3048  71.3  0.00266337
// 1250.0   0.0  0.3048  71.3  0.00266337
// ...

val mod = Lm(y, X, List("Freq", "Angle", "Chord", "Velo", "Thick"))
// mod: scalaglm.Lm =
// Lm(DenseVector(126.201, 125.201, ...

mod.summary
// Estimate     S.E.    t-stat    p-value    Variable
// ---------------------------------------------------------
// 132.8338     0.545   243.866   0.0000 *   (Intercept)
// -0.0013      0.000   -30.452   0.0000 *   Freq
// -0.4219      0.039   -10.847   0.0000 *   Angle
// -35.6880     1.630   -21.889   0.0000 *   Chord
// 0.0999       0.008    12.279   0.0000 *   Velo
// -147.3005   15.015    -9.810   0.0000 *   Thick
// Residual standard error: 4.8089 on 1497 degrees of freedom
// Multiple R-squared: 0.5157, Adjusted R-squared: 0.5141
// F-statistic: 318.8243 on 5 and 1497 DF, p-value: 0.00000

val fig = mod.plots
// fig: breeze.plot.Figure = breeze.plot.Figure@60d7ebb0
```

There is a `.predict` method for generating point predictions (and standard errors) given a new model matrix, and fitting GLMs is very similar; these things are covered in the quickstart guide for the library.

scala-glm is a small Scala library built on top of the Breeze numerical library which enables simple and convenient regression modelling in Scala. It is reasonably well documented and usable in its current form, but I intend to gradually add additional features according to demand as time permits.

**Please comment on the article here:** **Darren Wilkinson's research blog**



The post After Peptidegate, a proposed new slogan for PPNAS. And, as a bonus, a fun little graphics project. appeared first on Statistical Modeling, Causal Inference, and Social Science.


Someone pointed me to this post by “Neuroskeptic”:

A new paper in the prestigious journal PNAS contains a rather glaring blooper. . . . right there in the abstract, which states that “three neuropeptides (β-endorphin, oxytocin, and dopamine) play particularly important roles” in human sociality. But dopamine is not a neuropeptide. Neither are serotonin or testosterone, but throughout the paper, Pearce et al. refer to dopamine, serotonin and testosterone as ‘neuropeptides’. That’s just wrong. A neuropeptide is a peptide active in the brain, and a peptide in turn is the term for a molecule composed of a short chain of amino acids. Neuropeptides include oxytocin, vasopressin, and endorphins – which do feature in the paper. But dopamine and serotonin aren’t peptides, they’re monoamines, and testosterone isn’t either, it’s a steroid. This isn’t a matter of opinion, it’s basic chemistry.

The error isn’t just an isolated typo: ‘neuropeptide’ occurs 27 times in the paper, while the correct terms for the non-peptides are never used.

Neuroskeptic speculates on how this error got in:

It’s a simple mistake; presumably whoever wrote the paper saw oxytocin and vasopressin referred to as “neuropeptides” and thought that the term was a generic one meaning “signalling molecule.” That kind of mistake could happen to anyone, so we shouldn’t be too harsh on the authors . . .

The authors of the paper work in a psychology department, so I guess they’re rusty on their organic chemistry.

Fair enough; I haven’t completed a chemistry class since 11th grade, and I didn’t know what a peptide is, either. Then again, I’m not writing articles on peptides for the National Academy of Sciences.

But how did this get through the review process? Let’s take a look at the published article:

Ahhhh, now I understand. The editor is Susan Fiske, notorious as the person who opened the gates of PPNAS for the articles on himmicanes, air rage, and ages ending in 9. I wonder who were the reviewers of this new paper. Nobody who knows what a peptide is, I guess. Or maybe they just read it very quickly, flipped through to the graphs and the conclusions, and didn’t read a lot of the words.

Did you catch that? Neuroskeptic refers to “the prestigious journal PNAS.” That’s PPNAS for short. This is fine, I guess. Maybe the science is ok. Based on a quick scan of the paper, I don’t think we should take a lot of the specific claims seriously, as they seem to be based on the difference between “significant” and “non-significant.”

In particular, I’m not quite sure what is their support for the statement from the abstract that “each neuropeptide is quite specific in its domain of influence.” They’re rejecting various null hypotheses but I don’t know that this is supporting their substantive claims in the way that they’re saying.

I might be missing something here—I might be missing a lot—but in any case there seem to be some quality control problems at PPNAS. This should be no surprise: PPNAS is a huge journal, publishing over 3000 papers each year.

On their website they say, “PNAS publishes only the highest quality scientific research,” but this statement is simply false. I can’t really comment on this particular paper—it doesn’t seem like “the highest quality scientific research” to me, but, again, maybe I’m missing something big here. But I can assure you that the papers on himmicanes, air rage, and ages ending in 9 are *not* “the highest quality scientific research.” They’re not high quality research at all! What they are, is low-quality research that happens to be high-quality clickbait.

OK, let’s be fair. This is not a problem unique to PPNAS. The Lancet publishes crap papers, Psychological Science publishes crap papers, even JASA and APSR have their share of duds. Statistical Science, to its eternal shame, published that Bible Code paper in 1994. That’s fine, it’s how the system operates. Editors are only human.

But, really, do we have to make statements that we know are false? **Platitudes are fine but let’s avoid intentional untruths.**

**So, instead of “PNAS publishes only the highest quality scientific research,” how about this: “PNAS aims to publish only the highest quality scientific research.”** That’s fair, no?

**P.S.** Here’s a fun little graphics project: Redo Figure 1 as a lineplot. You’ll be able to show a lot more comparisons much more directly using lines rather than bars. The current grid of barplots is not the worst thing in the world—it’s much better than a table—but it could be much improved.

**P.P.S.** Just to be clear: (a) I don’t know anything about peptides so I’m offering no independent judgment of the paper in question; (b) whatever the quality is of this particular paper, does not affect my larger point that PPNAS publishes some really bad papers and so they should change their slogan to something more accurate.

**P.P.P.S.** The relevant Pubpeer page pointed to the following correction note that was posted on the PPNAS site after I wrote the above post but before it was posted:

The authors wish to note, “We used the term ‘neuropeptide’ in referring to the set of diverse neurochemicals that we examined in this study, some of which are not peptides; dopamine and serotonin are neurotransmitters and should be listed as such, and testosterone should be listed as a steroid. Our usage arose from our primary focus on the neuropeptides endorphin and oxytocin. Notwithstanding the biochemical differences between these neurochemicals, we note that these terminological issues have no implications for the significance of the findings reported in this paper.”



The post The jackknife method to estimate standard errors in SAS appeared first on The DO Loop.

One way to assess the precision of a statistic (a point estimate) is to compute the *standard error*, which is the standard deviation of the statistic's sampling distribution. A relatively large standard error indicates that the point estimate should be viewed with skepticism, either because the sample size is small or because the data themselves have a large variance.
The *jackknife method* is one way to estimate the standard error of a statistic.

Some simple statistics have explicit formulas for the standard error, but the formulas often assume normality of the data or a very large sample size. When your data do not satisfy the assumptions or when no formula exists, you can use resampling techniques to estimate the standard error. Bootstrap resampling is one choice, and the jackknife method is another. Unlike the bootstrap, which uses random samples, the jackknife is a deterministic method.

This article explains the jackknife method and describes how to compute jackknife estimates in SAS/IML software. This is best when the statistic that you need is also implemented in SAS/IML. If the statistic is computed by a SAS procedure, you might prefer to download and use the %JACK macro, which does not require SAS/IML.

The jackknife method estimates the standard error (and bias) of statistics without making any parametric assumptions about the population that generated the data. It uses only the sample data.

The jackknife method manufactures *jackknife samples* from the data. A jackknife sample is a "leave-one-out" resample of the data. If there are *n* observations, then there are *n* jackknife samples, each of size *n* - 1. If the original data are {x_1, x_2, ..., x_n}, then the *i*th jackknife sample is {x_1, ..., x_{i-1}, x_{i+1}, ..., x_n}.

You then compute *n* *jackknife replicates*. A jackknife replicate is the statistic of interest computed on a jackknife sample. You can obtain an estimate of the standard error from the variance of the jackknife replicates. The jackknife method is summarized by the following:

1. Compute a statistic, T, on the original sample of size *n*.
2. For *i* = 1 to *n*, repeat the following:
   - Leave out the *i*th observation to form the *i*th jackknife sample.
   - Compute the *i*th jackknife replicate statistic, T_i, by computing the statistic on the *i*th jackknife sample.
3. Compute the average (mean) of the jackknife replicates: T_avg = Σ_i T_i / *n*.
4. (Optional) Estimate the bias as Bias_jack = (*n* - 1)(T_avg - T).
5. Estimate the standard error as SE_jack = sqrt( (*n* - 1)/*n* Σ_i (T_i - T_avg)^2 ).

Resampling methods are not hard, but the notation in some books can be confusing. To clarify the method, let's choose a particular statistic and look at example data. The following example is from Martinez and Martinez (2001, 1st Ed, p. 241), which is also the source for this article. The data are the LSAT scores and grade-point averages (GPAs) for 15 randomly chosen students who applied to law school.

```
data law;
input lsat gpa @@;
datalines;
653 3.12  576 3.39  635 3.30  661 3.43  605 3.13
578 3.03  572 2.88  545 2.76  651 3.36  555 3.00
580 3.07  594 2.96  666 3.44  558 2.81  575 2.74
;
```

The statistic of interest (T) will be the correlation coefficient between the LSAT and the GPA variables for the *n*=15 observations. The observed correlation is T_{Data} = 0.776. The standard error of T helps us understand how much T would change if we took a different random sample of 15 students.
The next sections show how to implement the jackknife analysis in the SAS/IML language.

The SAS/IML matrix language is the simplest way to compute general jackknife estimates. If X is an *n* x *p* data matrix, you can obtain the *i*th jackknife sample by excluding the *i*th row of X. The following two helper functions encapsulate some of the computations. The SeqExclude function returns the index vector {1, 2, ..., i-1, i+1, ..., n}. The JackSamp function returns the data matrix without the *i*th row:

```
proc iml;
/* return the vector {1,2,...,i-1, i+1,...,n}, which excludes the scalar value i */
start SeqExclude(n,i);
   if i=1 then return 2:n;
   if i=n then return 1:n-1;
   return (1:i-1) || (i+1:n);
finish;

/* return the i_th jackknife sample for (n x p) matrix X */
start JackSamp(X,i);
   n = nrow(X);
   return X[ SeqExclude(n, i), ];   /* return data without i_th row */
finish;
```

By using the helper functions, you can carry out each step of the jackknife method. To make the method easy to modify for other statistics, I've written a function called EvalStat which computes the correlation coefficient. This function is called on the original data and on each jackknife sample.

```
/* compute the statistic in this function */
start EvalStat(X);
   return corr(X)[2,1];   /* <== Example: return correlation between two variables */
finish;

/* read the data into a (n x 2) data matrix */
use law;  read all var {"gpa" "lsat"} into X;  close;

/* 1. compute statistic on observed data */
T = EvalStat(X);

/* 2. compute same statistic on each jackknife sample */
n = nrow(X);
T_LOO = j(n,1,.);   /* LOO = "Leave One Out" */
do i = 1 to n;
   Y = JackSamp(X,i);
   T_LOO[i] = EvalStat(Y);
end;

/* 3. compute mean of the LOO statistics */
T_Avg = mean( T_LOO );

/* 4 & 5. compute jackknife estimates of bias and standard error */
biasJack = (n-1)*(T_Avg - T);
stdErrJack = sqrt( (n-1)/n * ssq(T_LOO - T_Avg) );
result = T || T_Avg || biasJack || stdErrJack;
print result[c={"Estimate" "Mean Jackknife Estimate" "Bias" "Std Error"}];
```

The output shows that the estimate of bias for the correlation coefficient is very small. The standard error of the correlation coefficient is estimated as 0.14, which is about 18% of the estimate.
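As an independent cross-check of these numbers, the same leave-one-out computation can be sketched in Python with numpy (this mirrors the algorithm, not the SAS/IML code):

```python
import numpy as np

def jackknife(data, stat):
    """Leave-one-out jackknife estimates of bias and standard error
    for the statistic `stat`, following the steps listed earlier."""
    n = len(data)
    T = stat(data)                                         # observed statistic
    T_loo = np.array([stat(np.delete(data, i, axis=0))     # n replicates
                      for i in range(n)])
    T_avg = T_loo.mean()
    bias = (n - 1) * (T_avg - T)
    se = np.sqrt((n - 1) / n * np.sum((T_loo - T_avg) ** 2))
    return T, bias, se

# the (LSAT, GPA) pairs from the law data set above
law = np.array([[653, 3.12], [576, 3.39], [635, 3.30], [661, 3.43],
                [605, 3.13], [578, 3.03], [572, 2.88], [545, 2.76],
                [651, 3.36], [555, 3.00], [580, 3.07], [594, 2.96],
                [666, 3.44], [558, 2.81], [575, 2.74]])
corr = lambda M: np.corrcoef(M[:, 0], M[:, 1])[0, 1]
T, bias, se = jackknife(law, corr)   # T close to 0.776, se close to 0.14
```

To jackknife a different statistic, just pass a different `stat` function, exactly as you would modify the EvalStat module in the SAS/IML version.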

To use this code yourself, simply modify the EvalStat function. The remainder of the program does not need to change.

When the data are univariate, you can sometimes eliminate the loop that computes jackknife samples and evaluates the jackknife replicates. If X is a column vector, you can compute the (*n*-1) x *n* matrix whose *i*th column represents the *i*th jackknife sample. (To prevent huge matrices, this method is best for n < 20000.) Because many statistical functions in SAS/IML operate on the columns of a matrix, you can often compute the jackknife replicates in a vectorized manner.

In the following program, the JackSampMat function returns the matrix of jackknife samples for univariate data. The function calls the REMOVE function in SAS/IML, which deletes specified elements of a matrix and returns the results in a row vector. The EvalStatMat function takes the matrix of jackknife samples and returns a row vector of statistics, one for each column. In this example, the function returns the sample standard deviation.

```
/* If x is univariate, you can construct a matrix where each column
   contains a jackknife sample.
   Use for a univariate column vector x when n < 20000 */
start JackSampMat(x);
   n = nrow(x);
   B = j(n-1, n, 0);
   do i = 1 to n;
      B[,i] = remove(x, i)`;   /* transpose to column vector */
   end;
   return B;
finish;

/* Input: matrix where each column of X is a jackknife sample.
   Return a row vector of statistics, one for each column. */
start EvalStatMat(x);
   return std(x);   /* <== Example: return std dev of each sample */
finish;
```

Let's use these functions to get a jackknife estimate of the standard error for the statistic (the standard deviation). The data (from Martinez and Martinez, p. 246) have been studied by many researchers and represent the weight gain in grams for 10 rats who were fed a low-protein diet of cereal:

```
x = {58,67,74,74,80,89,95,97,98,107};   /* Weight gain (g) for 10 rats */

/* optional: visualize the matrix of jackknife samples */
*M = JackSampMat(x);
*print M[c=("S1":"S10") r=("1":"9")];

/* Jackknife method for univariate data */
/* 1. compute observed statistic */
T = EvalStatMat(x);

/* 2. compute same statistic on each jackknife sample */
T_LOO = EvalStatMat( JackSampMat(x) );   /* LOO = "Leave One Out" */

/* 3. compute mean of the LOO statistics */
T_Avg = mean( T_LOO` );                  /* transpose T_LOO */

/* 4 & 5. compute jackknife estimates of bias and standard error */
n = nrow(x);
biasJack = (n-1)*(T_Avg - T);
stdErrJack = sqrt( (n-1)/n * ssq(T_LOO - T_Avg) );
result = T || T_Avg || biasJack || stdErrJack;
print result[c={"Estimate" "Mean Jackknife Estimate" "Bias" "Std Error"}];
```

The output shows that the standard deviation of these data is about 15.7 grams. The jackknife method estimates the standard error of this statistic as about 2.9 grams, which is about 18% of the estimate.
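As a cross-check outside SAS, a NumPy analogue of the JackSampMat/EvalStatMat approach reproduces these numbers. This is a sketch under the assumption that the variable names mirror the IML program; it is not the article's own code.

```python
import numpy as np

x = np.array([58, 67, 74, 74, 80, 89, 95, 97, 98, 107], dtype=float)
n = len(x)

# Analogue of JackSampMat: (n-1) x n matrix whose i-th column is x with x[i] removed
B = np.column_stack([np.delete(x, i) for i in range(n)])

# Analogue of EvalStatMat: sample standard deviation of each column (vectorized)
T = np.std(x, ddof=1)                 # observed statistic: about 15.7 g
T_loo = np.std(B, axis=0, ddof=1)     # one LOO replicate per column
T_avg = T_loo.mean()

bias = (n - 1) * (T_avg - T)
std_err = np.sqrt((n - 1) / n * np.sum((T_loo - T_avg) ** 2))  # ~2.9 g per the article
```

The column-wise evaluation (`axis=0`) plays the same role as the column-oriented SAS/IML functions: one call computes all the jackknife replicates.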

In summary, jackknife estimates are straightforward to implement in SAS/IML. This article shows a general implementation that works for all data and a specialized implementation that works for univariate data. In both cases, you can adapt the code for your use by modifying the function that computes the statistic on a data set. This approach is useful and efficient when the statistic is implemented in SAS/IML.

The post The jackknife method to estimate standard errors in SAS appeared first on The DO Loop.

**Please comment on the article here:** **The DO Loop**


The post Why I’m not celebrating the 2016 impact factors appeared first on All About Statistics.

Once every year, the Journal Citation Reports are released, including journal impact factors. This year, the International Journal of Forecasting 2-year impact factor has increased to 2.642, which is the highest it has been in the journal’s history, and puts the journal higher than such notable titles as the Journal of the American Statistical Association and just below Management Science. The 2-year impact factor is the average number of citations received in a year by articles published in the previous 2 years.

**Please comment on the article here:** **Hyndsights on Rob J Hyndman**


The post Picky people appeared first on All About Statistics.

Our book on Bayesian cost-effectiveness analysis using BCEA is out (I think as of last week). This has been a long process (I've talked about this here, here and here).

Today I've come back to the office and opened the package with my copies. The book looks nice - I am only a bit disappointed about a couple of formatting things, specifically the way in which the computer code got badly formatted in chapter 4.

We had originally used a specific font, but for some reason in that chapter all the computer code is formatted in Times New Roman. I think we did check the proofs and I don't recall seeing this (which, to be fair, isn't to swear that we didn't miss it while checking...).

Not a biggie. But it bothers me, a bit. Well, OK: a lot. But then again, I

**Please comment on the article here:** **Gianluca Baio's blog**


The post On deck through the rest of the year (and a few to begin 2018) appeared first on Statistical Modeling, Causal Inference, and Social Science.

Here they are. I love seeing all the titles lined up in one place; it’s like a big beautiful poem about statistics:

- After Peptidegate, a proposed new slogan for PPNAS. And, as a bonus, a fun little graphics project.
- “Developers Who Use Spaces Make More Money Than Those Who Use Tabs”
- Question about the secret weapon
- Incentives Matter (Congress and Wall Street edition)
- Analyze all your comparisons. That’s better than looking at the max difference and trying to do a multiple comparisons correction.
- Problems with the jargon “statistically significant” and “clinically significant”
- Capitalist science: The solution to the replication crisis?
- Bayesian, but not Bayesian enough
- Let’s stop talking about published research findings being true or false
- Plan 9 from PPNAS
- No, I’m not blocking you or deleting your comments!
- “Furthermore, there are forms of research that have reached such a degree of complexity in their experimental methodology that replicative repetition can be difficult.”
- “The Null Hypothesis Screening Fallacy”?
- What is a pull request?
- Turks need money after expensive weddings
- Statisticians and economists agree: We should learn from data by “generating and revising models, hypotheses, and data analyzed in response to surprising findings.”
- My unpublished papers
- Bigshot psychologist, unhappy when his famous finding doesn’t replicate, won’t consider that he might have been wrong; instead he scrambles furiously to preserve his theories
- Night Hawk
- Why they aren’t behavioral economists: Three sociologists give their take on “mental accounting”
- Further criticism of social scientists and journalists jumping to conclusions based on mortality trends
- Daryl Bem and Arthur Conan Doyle
- Classical statisticians as Unitarians
- Slaying Song
- What is “overfitting,” exactly?
- Graphs as comparisons: A case study
- Should we continue not to trust the Turk? Another reminder of the importance of measurement
- “The ‘Will & Grace’ Conjecture That Won’t Die” and other stories from the blogroll
- His concern is that the authors don’t control for the position of games within a season.
- How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?
- “Bayes factor”: where the term came from, and some references to why I generally hate it
- A stunned Dyson
- Applying human factors research to statistical graphics
- Recently in the sister blog
- Adding a predictor can increase the residual variance!
- Died in the Wool
- “Statistics textbooks (including mine) are part of the problem, I think, in that we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.”
- An improved ending for The Martian
- Delegate at Large
- Iceland education gene trend kangaroo
- Reproducing biological research is harder than you’d think
- The fractal zealots
- Giving feedback indirectly by invoking a hypothetical reviewer
- It’s hard to know what to say about an observational comparison that doesn’t control for key differences between treatment and control groups, chili pepper edition
- PPNAS again: If it hadn’t been for the jet lag, would Junior have banged out 756 HRs in his career?
- Look. At. The. Data. (Hollywood action movies example)
- “This finding did not reach statistical significance, but it indicates a 94.6% probability that statins were responsible for the symptoms.”
- Wolfram on Golomb
- Irwin Shaw, John Updike, and Donald Trump
- What explains my lack of openness toward this research claim? Maybe my cortex is just too damn thick and wrinkled
- I love when I get these emails!
- Consider seniority of authors when criticizing published work?
- Does declawing cause harm?
- Bird fight! (Kroodsma vs. Podos)
- The Westlake Review
- “Social Media and Fake News in the 2016 Election”
- Also holding back progress are those who make mistakes and then label correct arguments as “nonsensical.”
- Just google “Despite limited statistical power”
- It is somewhat paradoxical that good stories tend to be anomalous, given that when it comes to statistical data, we generally want what is typical, not what is surprising. Our resolution of this paradox is . . .
- “Babbage was out to show that not only was the system closed, with a small group controlling access to the purse strings and the same individuals being selected over and again for the few scientific honours or paid positions that existed, but also that one of the chief beneficiaries . . . was undeserving.”
- Irish immigrants in the Civil War
- Mixture models in Stan: you can use log_mix()
- Don’t always give ’em what they want: Practicing scientists want certainty, but I don’t want to offer it to them!
- Cumulative residual plots seem like they could be useful
- Sucker MC’s keep falling for patterns in noise
- Nice interface, poor content
- “From that perspective, power pose lies outside science entirely, and to criticize power pose would be a sort of category error, like criticizing The Lord of the Rings on the grounds that there’s no such thing as an invisibility ring, or criticizing The Rotter’s Club on the grounds that Jonathan Coe was just making it all up.”
- Chris Moore, Guy Molyneux, Etan Green, and David Daniels on Bayesian umpires
- Using statistical prediction (also called “machine learning”) to potentially save lots of resources in criminal justice
- “Mainstream medicine has its own share of unnecessary and unhelpful treatments”
- What are best practices for observational studies?
- The Groseclose endgame: Getting from here to there.
- Causal identification + observational study + multilevel model
- All cause and breast cancer specific mortality, by assignment to mammography or control
- Iterative importance sampling
- Rosenbaum (1999): Choice as an Alternative to Control in Observational Studies
- Gigo update (“electoral integrity project”)
- How to design and conduct a subgroup analysis?
- Local data, centralized data analysis, and local decision making
- Too much backscratching and happy talk: Junk science gets to share in the reputation of respected universities
- Selection bias in the reporting of shaky research: An example
- Self-study resources for Bayes and Stan?
- Looking for the bottom line
- “How conditioning on post-treatment variables can ruin your experiment and what to do about it”
- Trial by combat, law school style
- Causal inference using data from a non-representative sample
- Type M errors studied in the wild
- Type M errors in the wild—really the wild!
- Where does the discussion go?
- Maybe this paper is a parody, maybe it’s a semibluff
- As if the 2010s never happened
- Using black-box machine learning predictions as inputs to a Bayesian analysis
- It’s not enough to be a good person and to be conscientious. You also need good measurement. Cargo-cult science done very conscientiously doesn’t become good science, it just falls apart from its own contradictions.
- Air rage update
- Getting the right uncertainties when fitting multilevel models
- Chess records page
- Weisburd’s paradox in criminology: it can be explained using type M errors
- “Cheerleading with an agenda: how the press covers science”
- Automated Inference on Criminality Using High-tech GIGO Analysis
- Some ideas on using virtual reality for data visualization: I don’t really agree with the details here but it’s all worth discussing
- Contribute to this pubpeer discussion!
- For mortality rate junkies
- The “fish MRI” of international relations studies.
- “5 minutes? Really?”
- 2 quick calls
- Should we worry about rigged priors? A long discussion.
- I’m not on twitter
- I disagree with Tyler Cowen regarding a so-called lack of Bayesianism in religious belief
- “Why bioRxiv can’t be the Central Service”
- Sudden Money
- The house is stronger than the foundations
- Please contribute to this list of the top 10 do’s and don’ts for doing better science
- Partial pooling with informative priors on the hierarchical variance parameters: The next frontier in multilevel modeling
- Does racquetball save lives?
- When do we want evidence-based change? Not “after peer review”
- “I agree entirely that the way to go is to build some model of attitudes and how they’re affected by recent weather and to fit such a model to “thick” data—rather than to zip in and try to grab statistically significant stylized facts about people’s cognitive illusions in this area.”
- “Bayesian evidence synthesis”
- Freelance orphans: “33 comparisons, 4 are statistically significant: much more than the 1.65 that would be expected by chance alone, so what’s the problem??”
- Beyond forking paths: using multilevel modeling to figure out what can be learned from this survey experiment
- From perpetual motion machines to embodied cognition: The boundaries of pseudoscience are being pushed back into the trivial.
- Why I think the top batting average will be higher than .311: Over-pooling of point predictions in Bayesian inference
- “La critique est la vie de la science”: I kinda get annoyed when people set themselves up as the voice of reason but don’t ever get around to explaining what’s the unreasonable thing they dislike.
- How to discuss your research findings without getting into “hypothesis testing”?
- Does traffic congestion make men beat up their wives?
- The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and reporting
- I think it’s great to have your work criticized by strangers online.
- In the open-source software world, bug reports are welcome. In the science publication world, bug reports are resisted, opposed, buried.
- If you want to know about basketball, who ya gonna trust, the Irene Blecker Rosenfeld Professor of Psychology at Cornell University and author of “The Wisest One in the Room: How You Can Benefit from Social Psychology’s Most Powerful Insights,” . . . or that poseur Phil Jackson??
- Quick Money
- An alternative to the superplot
- Where the money from Wiley Interdisciplinary Reviews went . . .
- Retract or correct, don’t delete or throw into the memory hole
- Using Mister P to get population estimates from respondent driven sampling
- “Americans Greatly Overestimate Percent Gay, Lesbian in U.S.”
- “It all reads like a classic case of faulty reasoning where the reasoner confuses the desirability of an outcome with the likelihood of that outcome.”
- Pseudoscience and the left/right whiplash
- The time reversal heuristic (priming and voting edition)
- The Night Riders
- Why you can’t simply estimate the hot hand using regression
- Stan to improve rice yields
- When people proudly take ridiculous positions
- “A mixed economy is not an economic abomination or even a regrettably unavoidable political necessity but a natural absorbing state,” and other notes on “Whither Science?” by Danko Antolovic
- Noisy, heterogeneous data scoured from diverse sources make his meta-analyses stronger.
- What should this student do? His bosses want him to p-hack and they don’t even know it!
- Fitting multilevel models when predictors and group effects correlate
- I hate that “Iron Law” thing
- High five: “Now if it is from 2010, I think we can make all sorts of assumptions about the statistical methods without even looking.”
- “What is a sandpit?”
- No no no no no on “The oldest human lived to 122. Why no person will likely break her record.”
- Tips when conveying your research to policymakers and the news media
- Graphics software is not a tool that makes your graphs for you. Graphics software is a tool that allows you to make your graphs.
- Spatial models for demographic trends?
- A pivotal episode in the unfolding of the replication crisis
- We start by talking reproducible research, then we drift to a discussion of voter turnout
- Wine + Stan + Climate change = ?
- Stan is a probabilistic programming language
- Using output from a fitted machine learning algorithm as a predictor in a statistical model
- Poisoning the well with a within-person design? What’s the risk?
- “Dear Professor Gelman, I thought you would be interested in these awful graphs I found in the paper today.”
- I know less about this topic than I do about Freud.
- Driving a stake through that ages-ending-in-9 paper
- What’s the point of a robustness check?
- Oooh, I hate all talk of false positive, false negative, false discovery, etc.
- Trouble Ahead
- A new definition of the nerd?
- Orphan drugs and forking paths: I’d prefer a multilevel model but to be honest I’ve never fit such a model for this sort of problem
- Popular expert explains why communists can’t win chess championships!
- The four missing books of Lawrence Otis Graham
- “There was this prevalent, incestuous, backslapping research culture. The idea that their work should be criticized at all was anathema to them. Let alone that some punk should do it.”
- Loss of confidence
- “How to Assess Internet Cures Without Falling for Dangerous Pseudoscience”
- Ed Jaynes outta control!
- A reporter sent me a Jama paper and asked me what I thought . . .
- Workflow, baby, workflow
- Two steps forward, one step back
- Yes, you can do statistical inference from nonrandom samples. Which is a good thing, considering that nonrandom samples are pretty much all we’ve got.
- The Night Riders
- The piranha problem in social psychology / behavioral economics: The “take a pill” model of science eats itself
- Ready Money
- Stranger than fiction
- “The Billy Beane of murder”?
- Red doc, blue doc, rich doc, rich doc
- Working Class Postdoc
- “We wanted to reanalyze the dataset of Nelson et al. However, when we asked them for the data, they said they would only share the data if we were willing to include them as coauthors.”
- UNDER EMBARGO: the world’s most unexciting research finding
- Setting up a prior distribution in an experimental analysis
- Walk a Crooked Mile
- It’s . . . spam-tastic!
- The failure of null hypothesis significance testing when studying incremental changes, and what to do about it
- Robust standard errors aren’t for me
- Stupid-ass statisticians don’t know what a goddam confidence interval is
- Forking paths plus lack of theory = No reason to believe any of this.
- Turn your scatterplots into elegant apparel and accessories!
- Your (Canadian) tax dollars at work

And a few to begin 2018:

- The Ponzi threshold and the Armstrong principle
- I’m with Errol: On flypaper, photography, science, and storytelling
- Politically extreme yet vital to the nation
- How does probabilistic computation differ in physics and statistics?
- “Each computer run would last 1,000-2,000 hours, and, because we didn’t really trust a program that ran so long, we ran it twice, and it verified that the results matched. I’m not sure I ever was present when a run finished.”

Enjoy.

We’ll also intersperse topical items as appropriate.

**Please comment on the article here:** **Statistical Modeling, Causal Inference, and Social Science**

