dasonk2017-03-24T17:16:55+00:00http://dasonk.comDason Kurkiewiczdocstring release coming soon2017-03-24T00:00:00+00:00http://dasonk.com/r/2017/03/24/docstring
<p><em>This was originally posted at <a href="http://dasonk.github.com">Point Mass Prior</a> and features MathML. If you’re viewing the blog elsewhere the math probably won’t show up properly and it would be beneficial to view the post <a href="/r/2017/03/24/docstring">here</a></em></p>
<p>I official submitted version 1.0.0 of <a href="https://github.com/Dasonk/docstring">docstring</a> to CRAN. Hopefully a release will be on CRAN in the near future. Once that goes live I’ll post an update with more information about the package.</p>
Turn Down (For What)2014-07-18T00:00:00+00:00http://dasonk.com/2014/07/18/Lil-Jon-Analysis
<p>The following is an analysis of Lil’ Jon’s magnum opus “Turn Down (For What)”.</p>
<p>The buildup features a beat with a Shepard tone in the background building up the tension. One gets the impression that this is going to be a fun time. Lil’ Jon doesn’t take too long to come in and at the 16 second mark we hear him proclaim “Fire up your loud - another round of shots”. The implication of course being that it’s time to party and clearly they’ve had at least one round of shots already. This seems like it’s going to be a good night. After the music drops Lil’ Jon grabs your attention with the question “Turn down for what?”. It’s a question he wants you to ponder. What is there that would make you want to ‘turn down’? Now turning up is taken to mean having a crazy good time partying typically by means of large amounts of alcohol and/or illegal drug use. Lil’ Jon can’t find a reason one wouldn’t want to turn up. He asks the question five times in total. He is demanding an answer and you have failed to provide one. He is searching for meaning in this life and failing to find one he turns other means to numb the pain he feels. Notice that he is not alone during his inquisition. There are clearly others having a good time in the background calling out “eh” and making some fun sounding noises! He has his posse there but is still searching. He crew isn’t quite enough for him. He wants to know why somebody would turn down but can’t think of a reason.</p>
<p>Failing to get an answer he proclaims once again “Fire up your loud - another round of shots”. This is at least the third round of shots in the song. He isn’t slowing down either and his search for meaning hasn’t slowed down either. He once again asks “turn down for what” five more times. What seemed like a party anthem is quickly turning into a cry for help. One really connects with the pain and the longing that Lil’ Jon is expressing through the repeated asking of this simple question.</p>
<p>Just then it seems like he has given up hope in finding an answer because he starts again with “fire up your loud - another round of shots”. But he doesn’t stop. Just like Miley he can’t stop. Again he yells “fire up your loud - another round of shots”. And once more “fire up your loud - another round of shots”. Six shots at least so far in this short period of time. Just when you start to think he <em>has</em> to slow down eventually he picks up the pace and does “fire up your loud - another round of shots shots shots …” repeating shots at least thirty times. He is downing these shots and one has to hope they are either watered down or tiny, tiny shots because he is going to end up in the hospital if he really did consume that much hard alcohol in such a short period of time. That very well might be his goal. This is quite clearly a cry for help. He is desperate and seeking an answer by any means necessary even if that means death via alcohol poisoning. How much pain has this young man felt? What else has he tried in his life that he found to not be a suitable replacement for turning up? Was it a lover that hurt him? A rough childhood that he wants to forget? Does his place his hope, like the philosopher 50 cent, in the goal to “get rich or die trying” but hasn’t been able to satisfy this lust for money? Whatever is causing his pain you can feel it in this song.</p>
<p>After sinking into this depression and trying to kill all feeling you would think that he just wouldn’t care anymore. Lil’ Jon is full of surprises though because just when you think he’s down and out he turns philosopher once more. He asks the question “turn down for what” five more times. However, the last “turn down for what” sounds different. It is pained. It sounds like death. Could this be an attempt to symbolize the death of hope that we thought we saw so many times in this song already? There are no more lyrics to shed any light on this. All we are left with is a simple beat. And like all things in life - that too must end.</p>
Finally bought my own domain name2014-04-26T00:00:00+00:00http://dasonk.com/2014/04/26/finally-bought-my-own-domain-name
<p>I did it. I finally bought my own domain name. You’ll notice that
the site is now dasonk.com instead of dasonk.github.io. Previously
I didn’t want to have to pay a monthly fee and that’s what I thought
I would have to do for a domain name. Turns out I’m an idiot and since
github is already hosting my site all I needed to do was purchase
the domain name. I used Hover and the site cost $12 a year (I found
an online coupon to reduce the price by $3). I could have
obtained the domain name for cheaper but I’ve read good things about
Hover so I decided to go through them. Then it was just a matter
of configuring a few things so that the github site would show up
when you visit dasonk.com. I’m sure most of you know all of this
but I never really cared enough to learn it and finally decided
to take that step.</p>
<p>Now that I updated the domain name over the next few days I’m
thinking I’m going to give this entire place an overhaul: new theme,
new layout, make it more than just a blog…</p>
Using nls in place of the delta method2014-03-22T00:00:00+00:00http://dasonk.com/r/2014/03/22/Using-nls-in-place-of-the-delta-method
<p><em>This was originally posted at <a href="http://dasonk.github.com">Point Mass Prior</a> and features MathML. If you’re viewing from StatsBlogs the math probably won’t show up properly and it would be beneficial to view the post <a href="/r/2014/03/22/Using-nls-in-place-of-the-delta-method">here</a></em>.</p>
<p>It’s been a while since my last post which was on <a href="http://dasonk.github.io/r/2013/02/09/Using-the-delta-method/">using the delta method</a> in R with a specific application to finding the ‘x’ value that corresponds to the maximum/minimum value in a quadratic regression. This post will be about how to do the same thing in a slightly different way. Quadratic regression can be fit using a linear model of the form</p>
<script type="math/tex; mode=display">y_i = \beta_0 + \beta_1x_i + \beta_2x_i^2 + \epsilon_i</script>
<p>where <script type="math/tex">\epsilon_i</script> are independent and identically distributed normal random variables with mean 0 and a variance of <script type="math/tex">\sigma^2</script>. However, if our concern is on the ‘x’ value that provides the minimum/maximum value and possibly the value of the response at the minimum/maximum we can reformulate the model as</p>
<script type="math/tex; mode=display">y_i = \theta_1(x_i - \theta_2)^2 + \theta_3 + \epsilon_i</script>
<p>So that the x value that corresponds to the minimum/maximum is represented directly through the parameter <script type="math/tex">\theta_2</script>. The actual minimum/maximum value is also represented as a parameter in the model as <script type="math/tex">\theta_3</script>. In this case <script type="math/tex">\theta_1</script> can be interpreted as half of the second derivative with respect to x but that isn’t as much of an interest here. Note that we can expand this model out to get the same form as the linear model so it really is representing the same model but notice that we don’t actually have a <em>linear</em> model anymore.</p>
<p>To fit this we would need to use something other than <code class="highlighter-rouge">lm</code>. The natural choice in R is to use <code class="highlighter-rouge">nls</code>. We’ll look at an example of how to fit this model and get confidence intervals for the quantities of interest. We’ll use the same simulated data as my previous post so we can compare how the delta method and nls compare for this problem.</p>
<p>As a reminder the exact model we fit previously was where <script type="math/tex">y = - x*(x-10) + \epsilon</script> so to write it using the same form as our nonlinear model we have <script type="math/tex">y = -1*(x - 5)^2 + 25 + \epsilon</script>. So the maximum occurs at <script type="math/tex">x=5</script> and produces an output of 25 at that location.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">set.seed</span><span class="p">(</span><span class="m">500</span><span class="p">)</span><span class="w">
</span><span class="n">n</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">30</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">runif</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="o">-</span><span class="n">x</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">rnorm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">8</span><span class="p">)</span><span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="m">+10</span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">x</span><span class="o">^</span><span class="m">2</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">error</span></code></pre></figure>
<p>Now we fit our model using <code class="highlighter-rouge">nls</code>. We need to provide starting values for the parameters since it fits using an iterative procedure. I provide some pretty bad starting values here but it still fits its just fine.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">o</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">nls</span><span class="p">(</span><span class="n">y</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">t</span><span class="m">1</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">t</span><span class="m">2</span><span class="p">)</span><span class="o">^</span><span class="m">2</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">t</span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="n">start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">t</span><span class="m">1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="m">2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="m">3</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">))</span></code></pre></figure>
<p>Now we can look at the output we get</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">summary</span><span class="p">(</span><span class="n">o</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">##
## Formula: y ~ t1 * (x - t2)^2 + t3
##
## Parameters:
## Estimate Std. Error t value Pr(>|t|)
## t1 -1.040 0.239 -4.35 0.00017 ***
## t2 5.190 0.265 19.57 < 2e-16 ***
## t3 25.089 2.661 9.43 4.9e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.71 on 27 degrees of freedom
##
## Number of iterations to convergence: 8
## Achieved convergence tolerance: 2.22e-07</code></pre></figure>
<p>We see that the estimated value at which the maximum occurs is 5.1903. If we go back to the delta method post we see that we obtained the same estimate. Another interesting point is the the standard error for this term is the same as we obtained using the delta method. In both cases we get a standard error of 0.2652</p>
<p>We can easily obtain a confidence interval for this using <code class="highlighter-rouge">confint</code></p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">confint</span><span class="p">(</span><span class="n">o</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## Waiting for profiling to be done...</code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## 2.5% 97.5%
## t1 -1.530 -0.5499
## t2 4.639 5.8815
## t3 19.650 30.5620</code></pre></figure>
<p>Now recall that we used the asymptotic normality of the transformation applied when we used the delta method to obtain a confidence interval so that previous interval which went from 4.671 to 5.710 was based on a normal distribution assumption. When using confint with a nls object it uses t-based methods to get a confidence interval so it will be a little bit wider. Recall that we have the same estimate and the same standard error as when we used the delta method so if we want we could get the same interval based on asymptotic normality as well. Alternatively if you use <code class="highlighter-rouge">confint.default</code> it will use a normal distribution to create your confidence intervals</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">confint.default</span><span class="p">(</span><span class="n">o</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## 2.5 % 97.5 %
## t1 -1.508 -0.5719
## t2 4.671 5.7101
## t3 19.874 30.3054</code></pre></figure>
<p>And here we see that we get the same confidence interval as when we used the asymptotic normality argument to get the confidence intervals for the delta method approach.</p>
Using the delta method2013-02-09T00:00:00+00:00http://dasonk.com/r/2013/02/09/Using-the-delta-method
<p><em>This was originally posted at <a href="http://dasonk.github.com">Point Mass Prior</a> and features MathML. If you’re viewing from StatsBlogs the math probably won’t show up properly and it would be beneficial to view the post <a href="/r/2013/02/09/Using-the-delta-method">here</a></em></p>
<p>Somebody recently asked me about the delta method and specifically the <code class="highlighter-rouge">deltamethod</code> function in the msm package. I thought I would write about that and so to motivate this we’ll look at an example. The example we’ll consider is a simple case where we fit a quadratic regression to some data. This means our model has the form</p>
<script type="math/tex; mode=display">y_i = \beta_0 + \beta_1x_i + \beta_2x_i^2 + \epsilon_i</script>
<p>where <script type="math/tex">\epsilon_i</script> are independent and identically distributed normal random variables with mean 0 and a variance of <script type="math/tex">\sigma^2</script>.</p>
<p>To start we’ll generate some data such that we have roots at x=0 and x=10 and the quadratic is such that we have a maximum instead of a minimum.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">set.seed</span><span class="p">(</span><span class="m">500</span><span class="p">)</span><span class="w">
</span><span class="n">n</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">30</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">runif</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="o">-</span><span class="n">x</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">rnorm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">8</span><span class="p">)</span><span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="m">+10</span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">x</span><span class="o">^</span><span class="m">2</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">error</span></code></pre></figure>
<p>We can plot the data to get a feel for it:</p>
<p><img src="/../figs/2013-02-09-Using-the-delta-method/plotdat.bmp" alt="center" /></p>
<p>Now it might be that what we’re really interested in is the input value that gives us the maximum value for the response (on average). Let’s call that value <script type="math/tex">x_\text{max}</script>. Now if we knew the true parameters for this data we could figure out exactly where that maximum occurs. We know that for a quadratic function <script type="math/tex">y = \beta_0 + \beta_1x + \beta_2x^2</script> the maximum value occurs at <script type="math/tex">x = \frac{-\beta_1}{2\beta_2}</script>. In our specific case we have <script type="math/tex">y = 0 + 10x -x^2</script> so the maximum occurs at <script type="math/tex">x = 5</script>. Just eyeballing our plot it doesn’t look like the quadratic that will be fit will give us a maximum that occurs exactly at <script type="math/tex">x=5</script>. Let’s actually fit the quadratic regression and see what we get for the estimated value of <script type="math/tex">x_\text{max}</script> which I will call <script type="math/tex">\hat{x}_\text{max}</script>.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># Estimate quadratic regression
</span><span class="n">o</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">lm</span><span class="p">(</span><span class="n">y</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="n">x</span><span class="o">^</span><span class="m">2</span><span class="p">))</span><span class="w">
</span><span class="c1"># View the output
</span><span class="n">summary</span><span class="p">(</span><span class="n">o</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">##
## Call:
## lm(formula = y ~ x + I(x^2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.26 -6.13 -1.49 7.62 13.87
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.923 4.983 -0.59 0.56233
## x 10.794 2.422 4.46 0.00013 ***
## I(x^2) -1.040 0.239 -4.35 0.00017 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.71 on 27 degrees of freedom
## Multiple R-squared: 0.424, Adjusted R-squared: 0.381
## F-statistic: 9.93 on 2 and 27 DF, p-value: 0.000585</code></pre></figure>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># Make scatterplot and add the estimated curve
</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">main</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Scatterplot with estimated regression curve"</span><span class="p">,</span><span class="w"> </span><span class="n">xlim</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w">
</span><span class="m">10</span><span class="p">))</span><span class="w">
</span><span class="n">curve</span><span class="p">(</span><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">)[</span><span class="m">1</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">)[</span><span class="m">2</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">)[</span><span class="m">3</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">x</span><span class="o">^</span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">add</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">
</span><span class="c1"># Add a line at the theoretical maximum
</span><span class="n">abline</span><span class="p">(</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">5</span><span class="p">,</span><span class="w"> </span><span class="n">col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"black"</span><span class="p">)</span><span class="w">
</span><span class="c1"># Estimate the xmax value
</span><span class="n">beta2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">)[</span><span class="s2">"I(x^2)"</span><span class="p">]</span><span class="w">
</span><span class="n">beta1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">)[</span><span class="s2">"x"</span><span class="p">]</span><span class="w">
</span><span class="n">estmax</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">unname</span><span class="p">(</span><span class="o">-</span><span class="n">beta1</span><span class="o">/</span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">beta2</span><span class="p">))</span><span class="w">
</span><span class="c1"># Add a line at estimated maximum
</span><span class="n">abline</span><span class="p">(</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">estmax</span><span class="p">,</span><span class="w"> </span><span class="n">col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="n">lty</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">)</span><span class="w">
</span><span class="n">legend</span><span class="p">(</span><span class="s2">"topleft"</span><span class="p">,</span><span class="w"> </span><span class="n">legend</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"True max"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Estimated max"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Estimated curve"</span><span class="p">),</span><span class="w">
</span><span class="n">col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"black"</span><span class="p">,</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="s2">"red"</span><span class="p">),</span><span class="w"> </span><span class="n">lty</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">))</span></code></pre></figure>
<p><img src="/../figs/2013-02-09-Using-the-delta-method/estimatequadratic.bmp" alt="center" /></p>
<p>So our estimate of the value where the maximum occurs is <script type="math/tex">\hat{x}_\text{max} =</script> 5.1903. This is pretty close but it would still be nice to have some sort of interval to go along with our estimate. This is where <a href="http://en.wikipedia.org/wiki/Delta_method">the delta method</a> can help us out. The delta method can be thought of as a way to get an estimated standard error for a transformation of estimated parameter values. In our case we’re interested in applying the function</p>
<script type="math/tex; mode=display">g(x, y) = \frac{-x}{2y}</script>
<p>to our estimated parameters.</p>
<script type="math/tex; mode=display">\hat{x}_\text{max} = g(\hat{\beta_1}, \hat{\beta_2}) = \frac{-\hat{\beta_1}}{2\hat{\beta_2}}</script>
<p>To perform the delta method we need to know a little bit of calculus. The method requires taking derivatives of our function of interest. Now this isn’t too bad to do in practice but not everybody that wants to perform an analysis will know how to take derivatives (or at least it might have been a long time since they’ve taken a derivative).</p>
<p>Luckily for us we don’t have to do the delta method by hand though as long as we know the transformation of interest. The <code class="highlighter-rouge">deltamethod</code> function in the msm package provides a convenient way to get the estimated standard error of the transformation as long as we can provide</p>
<ol>
<li>The transformation of interest</li>
<li>The estimated parameter values</li>
<li>The covariance matrix of the estimated parameters</li>
</ol>
<p>We already know (1) but we have to make sure it write it in the proper syntax for <code class="highlighter-rouge">deltamethod</code>. We can easily obtain (2) by using <code class="highlighter-rouge">coef</code> on our fitted model and we can also easily obtain (3) by using <code class="highlighter-rouge">vcov</code> on the estimated model.</p>
<p>When writing the syntax for the transformation when using the <code class="highlighter-rouge">deltamethod</code> function you need to refer to the first parameter as <code class="highlighter-rouge">x1</code>, the second parameter as <code class="highlighter-rouge">x2</code>, and so on. So if I wanted to find the standard error for the sum of two parameters I would write that as <code class="highlighter-rouge">~ x1 + x2</code>. In our case our estimated parameters are from the output of <code class="highlighter-rouge">coef(o)</code> so let’s sneak a peak at them to remind ourselves the output order.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## (Intercept) x I(x^2)
## -2.923 10.794 -1.040</code></pre></figure>
<p>So in this case when writing our transformation we would refer to <script type="math/tex">\hat{\beta_0}</script> as <code class="highlighter-rouge">x1</code>, <script type="math/tex">\hat{\beta_1}</script> as <code class="highlighter-rouge">x2</code>, and <script type="math/tex">\hat{\beta_2}</script> as <code class="highlighter-rouge">x3</code>. As a reminder the transformation we applied was <script type="math/tex">\frac{-\hat{\beta_1}}{2\hat{\beta_2}}</script> so the formula we want is <code class="highlighter-rouge">~ -x2 / (2 * x3)</code>.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">library</span><span class="p">(</span><span class="n">msm</span><span class="p">)</span><span class="w">
</span><span class="n">standerr</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">deltamethod</span><span class="p">(</span><span class="o">~-</span><span class="n">x</span><span class="m">2</span><span class="o">/</span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">x</span><span class="m">3</span><span class="p">),</span><span class="w"> </span><span class="n">coef</span><span class="p">(</span><span class="n">o</span><span class="p">),</span><span class="w"> </span><span class="n">vcov</span><span class="p">(</span><span class="n">o</span><span class="p">))</span><span class="w">
</span><span class="n">standerr</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## [1] 0.2652</code></pre></figure>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># Make a confidence interval
</span><span class="p">(</span><span class="n">ci</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">estmax</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">-1</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">qnorm</span><span class="p">(</span><span class="m">0.975</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">standerr</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## [1] 4.671 5.710</code></pre></figure>
<p>So we see that our confidence interval does contain the true value. We could also do a hypothesis test if we wanted to test against a certain value. Here we’ll test using a null hypothesis that the true <script type="math/tex">x_\text{max} = 5</script>.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># If we want to do a hypothesis test of Ho: xmax = 5 Ha: xmax != 5
</span><span class="n">z</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="n">estmax</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">5</span><span class="p">)</span><span class="o">/</span><span class="n">standerr</span><span class="w">
</span><span class="c1"># Calculate p-value
</span><span class="n">pval</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">pnorm</span><span class="p">(</span><span class="o">-</span><span class="nf">abs</span><span class="p">(</span><span class="n">z</span><span class="p">))</span><span class="w">
</span><span class="n">pval</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">## [1] 0.4729</code></pre></figure>
<p>Our p-value of 0.473 doesn’t allow us to reject the null hypothesis in this situation.</p>
<p>So we can see that it’s fairly easy to implement the delta method in R. Now this isn’t necessarily my favorite way to get intervals for transformations of parameters but if you’re a frequentist then it can be quite useful.</p>
An election puzzle2012-11-12T00:00:00+00:00http://dasonk.com/puzzle/2012/11/12/An-election-puzzle
<p>John Jackson and Jack Johnson were going head to head in a mayoral race. Ultimately John Jackson obtained 300 votes and Jack Johnson obtained 200 votes. While counting the votes they announced after each vote what the current results were. For example after they counted the first vote they announced something like:</p>
<p>“John Jackson: 1; Jack Johnson: 0”</p>
<p>and then after the second vote</p>
<p>“John Jackson: 1; Jack Johnson: 1”</p>
<p>and so on until they finally counted the last vote and announced</p>
<p>“John Jackson: 300; Jack Johnson: 200”</p>
<p>Not counting the very beginning when the results were “John: 0; Jack: 0” - What is the probability that at some point during the counting the result was a tie?</p>
<p>There is a nice solution to this that I will give in my next post. Try to figure it out in the meantime!</p>
Using Jekyll and Mathjax2012-10-09T00:00:00+00:00http://dasonk.com/blog/2012/10/09/Using-Jekyll-and-Mathjax
<p>To get Mathjax support the easiest thing to do is to add the following to _include/themes/yourtheme/default.html where “yourtheme” is whichever theme you’re currently using.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><script </span><span class="na">type=</span><span class="s">"text/javascript"</span>
<span class="na">src=</span><span class="s">"http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"</span><span class="nt">></span>
<span class="nt"></script></span></code></pre></figure>
<p>I’m sure there is a better way to do this so that you don’t need to add this whenever you change themes but this is working for me for the time being.</p>
<p>However using Maruku (the default markdown rendering engine with Jekyll) actually using math can be a pain since it likes to replace <code class="highlighter-rouge">_</code> with <code class="highlighter-rouge"><em></code> no matter where it is in the post. So <code class="highlighter-rouge">$$x_i$$</code> gets converted to <code class="highlighter-rouge">$$x<em>i$$</code> which is not what we want.</p>
<p>To have a nice setup for using math in my posts I switched my markdown rendering engine from the default with Jekyll (Maruku) to kramdown. I just installed the kramdown gem</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">gem install kramdown
<span class="c"># Although you probably need to do...</span>
sudo gem install kramdown</code></pre></figure>
<p>and then I needed to add</p>
<div class="highlighter-rouge"><pre class="highlight"><code>markdown: kramdown
</code></pre>
</div>
<p>to the _config.yml file at the top directory of my repository.</p>
<p>After that using Mathjax is as simple as surrounding any math in double dollar signs <code class="highlighter-rouge">$$ insert_math_here $$</code> so that <code class="highlighter-rouge">$$x^2$$</code> gets rendered as <script type="math/tex">x^2</script>. If your dollar signs are standalone not inline with other text like this:</p>
<figure class="highlight"><pre><code class="language-latex" data-lang="latex"><span class="p">$$</span><span class="nb">a</span><span class="p">^</span><span class="m">2</span><span class="nb"> </span><span class="o">+</span><span class="nb"> b</span><span class="p">^</span><span class="m">2</span><span class="nb"> </span><span class="o">=</span><span class="nb"> c</span><span class="p">^</span><span class="m">2</span><span class="p">$$</span></code></pre></figure>
<p>then it will get centered and displayed like “display math” like:</p>
<script type="math/tex; mode=display">a^2 + b^2 = c^2</script>
<p>Combining this with the fact that I write my posts using R Markdown so I can easily insert code and output into my posts makes the whole process a lot nicer.</p>
<p>I have heard good things about rdiscount as a markdown rendering engine but I don’t think I’ll use it. It doesn’t appear to support the syntax I described in this post and I rather like this syntax. kramdown appears to work well enough for my needs so I think I’ll stick with that for now.</p>
printR2012-10-07T00:00:00+00:00http://dasonk.com/r/2012/10/07/printR
<p>As my first actual post on this new blog I thought I would do something very silly and pointless. What better way to start off a new adventure than to write some <em>very</em> obfuscated code. The following is a mess and it is meant to be that way. To be honest I wrote this a while ago and I have a slightly decrypted copy that still doesn’t even make sense. Enjoy!</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">h</span><span class="o">=</span><span class="n">character</span><span class="p">;</span><span class="n">r</span><span class="o">=</span><span class="n">rep</span><span class="p">;</span><span class="n">a</span><span class="o">=</span><span class="n">b</span><span class="o">=</span><span class="n">h</span><span class="p">(</span><span class="m">0</span><span class="p">);</span><span class="n">p</span><span class="o">=</span><span class="n">options</span><span class="p">()</span><span class="o">$</span><span class="n">width</span><span class="o">%/%</span><span class="m">2-5</span><span class="p">;</span><span class="n">n</span><span class="o">=</span><span class="s2">"
"</span><span class="p">;</span><span class="n">j</span><span class="o">=</span><span class="n">r</span><span class="p">(</span><span class="n">toupper</span><span class="p">(</span><span class="n">substring</span><span class="p">(</span><span class="n">mode</span><span class="p">(</span><span class="n">a</span><span class="p">),</span><span class="m">4</span><span class="p">,</span><span class="m">4</span><span class="p">)),</span><span class="nf">sum</span><span class="p">(</span><span class="n">r</span><span class="p">(</span><span class="m">5</span><span class="o">:</span><span class="m">9</span><span class="p">,</span><span class="m">2</span><span class="p">)</span><span class="m">+1</span><span class="p">)</span><span class="m">-3</span><span class="p">)</span><span class="w">
</span><span class="n">k</span><span class="o">=</span><span class="n">r</span><span class="p">(</span><span class="m">5</span><span class="o">:</span><span class="m">9</span><span class="p">,</span><span class="m">2</span><span class="p">);</span><span class="n">k</span><span class="p">[</span><span class="m">4</span><span class="o">:</span><span class="m">5</span><span class="p">]</span><span class="o">=</span><span class="m">7</span><span class="p">;</span><span class="n">k</span><span class="o">=</span><span class="nf">cumsum</span><span class="p">(</span><span class="n">k</span><span class="m">+1</span><span class="p">);</span><span class="n">j</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="o">=</span><span class="n">n</span><span class="p">;</span><span class="n">m</span><span class="o">=</span><span class="n">paste</span><span class="p">(</span><span class="n">h</span><span class="p">(</span><span class="m">1</span><span class="p">),</span><span class="w">
</span><span class="n">h</span><span class="p">(</span><span class="m">1</span><span class="p">));</span><span class="n">s</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="n">k</span><span class="p">[</span><span class="m">-10</span><span class="p">])</span><span class="m">+1</span><span class="p">;</span><span class="n">j</span><span class="p">[</span><span class="nf">c</span><span class="p">(</span><span class="m">16</span><span class="o">:</span><span class="m">17</span><span class="p">,</span><span class="m">24</span><span class="o">:</span><span class="m">26</span><span class="p">,</span><span class="m">32</span><span class="o">:</span><span class="m">33</span><span class="p">,</span><span class="m">46</span><span class="o">:</span><span class="m">47</span><span class="p">,</span><span class="m">53</span><span class="o">:</span><span class="m">55</span><span class="p">,</span><span class="w">
</span><span class="m">61</span><span class="o">:</span><span class="m">64</span><span class="p">,</span><span class="m">70</span><span class="o">:</span><span class="m">74</span><span class="p">)]</span><span class="o">=</span><span class="n">m</span><span class="p">;</span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="m">10</span><span class="p">)</span><span class="n">a</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="n">a</span><span class="p">,</span><span class="n">r</span><span class="p">(</span><span class="n">m</span><span class="p">,</span><span class="n">p</span><span class="p">),</span><span class="n">j</span><span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">:</span><span class="n">k</span><span class="p">[</span><span class="n">i</span><span class="p">]])</span><span class="w">
</span><span class="n">cat</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="n">a</span><span class="p">),</span><span class="n">sep</span><span class="o">=</span><span class="n">b</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">##
## RRRRR
## RRRRRR
## RR RRR
## RR RR
## RR RRR
## RRRRR
## RR RR
## RR RR
## RR RR
## RR RR</code></pre></figure>