<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>John D. Cook</title>
	<atom:link href="http://www.johndcook.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.johndcook.com/blog</link>
	<description>Applied Mathematics Consulting</description>
	<lastBuildDate>Tue, 12 May 2026 00:49:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://www.johndcook.com/wp-content/uploads/2020/01/cropped-favicon_512-32x32.png</url>
	<title>John D. Cook</title>
	<link>https://www.johndcook.com/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Euler function</title>
		<link>https://www.johndcook.com/blog/2026/05/11/euler-function/</link>
					<comments>https://www.johndcook.com/blog/2026/05/11/euler-function/#respond</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Tue, 12 May 2026 00:49:41 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247019</guid>

					<description><![CDATA[<p>This morning I wrote a post about the probability that a random matrix over a finite field is invertible. If the field has q elements and the matrix has dimensions n × n then the probability is In that post I made the observation that p(q, n) converges very quickly as a function of n [1]. [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/11/euler-function/">Euler function</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>This morning I wrote a <a href="https://www.johndcook.com/blog/2026/05/11/random-binary-matrices/">post</a> about the probability that a random matrix over a finite field is invertible. If the field has <em>q</em> elements and the matrix has dimensions <em>n</em> × <em>n</em> then the probability is</p>
<p><img class='aligncenter' src='https://www.johndcook.com/euler_fun1.svg' alt='p(q, n) = \prod_{i=1}^n \left(1 - \frac{1}{q^i}\right)' style='background-color:white' height='54' width='175' /></p>
<p>In that post I made the observation that <em>p</em>(<em>q</em>, <em>n</em>) converges very quickly as a function of <em>n</em> [1]. One way to see that the convergence is quick is to note that</p>
<p><img class='aligncenter' src='https://www.johndcook.com/euler_fun2.svg' alt='\prod_{i=1}^\infty \left(1 - \frac{1}{q^i}\right) = \prod_{i=1}^n \left(1 - \frac{1}{q^i}\right) \, \prod_{i=n+1}^\infty \left(1 - \frac{1}{q^i}\right)' style='background-color:white' height='54' width='336' /></p>
<p>and</p>
<p><img class='aligncenter' src='https://www.johndcook.com/euler_fun3.svg' alt='\prod_{i=n+1}^\infty \left(1 - \frac{1}{q^i}\right) = 1 - {\cal O}\left(\frac{1}{q^{n+1}}\right)' style='background-color:white' height='54' width='237' /></p>
<p>John Baez pointed out in the comments that <em>p</em>(<em>q</em>, ∞) = φ(1/<em>q</em>) where φ is the Euler function.</p>
<p>Euler was extremely prolific, and many things are named after him. Several functions are known as Euler&#8217;s function, the most common being his totient function in number theory. The Euler function we&#8217;re interested in here is</p>
<p><img class='aligncenter' src='https://www.johndcook.com/euler_fun4.svg' alt='\phi(x) = \prod_{i=1}^\infty \left( 1 - x^i \right)' style='background-color:white' height='54' width='154' /></p>
<p>for −1 &lt; x &lt; 1. Usually the argument of φ is denoted &#8220;<em>q</em>&#8221; but that would be confusing in our context because our <em>q</em>, the number of elements in a field, is the reciprocal of Euler&#8217;s <em>q</em>, i.e. <em>x</em> = 1/<em>q</em>.</p>
<p>Euler&#8217;s identity [2] (in this context, not to be confused with other Euler identities!) says</p>
<p><img class='aligncenter' src='https://www.johndcook.com/euler_fun5.svg' alt='\phi(x) = \sum_{n=-\infty}^\infty (-1)^n x^{(3n^2 - n)/2}' style='background-color:white' height='53' width='222' /></p>
<p>This function is easy to calculate because the series converges very quickly. From the alternating series theorem we have</p>
<p><img class='aligncenter' src='https://www.johndcook.com/euler_fun6.svg' alt='\phi(x) = \sum_{n=-N}^N (-1)^n x^{(3n^2 - n)/2} + {\cal O}\left( x^{(3(N+1)^2 - (N+1))/2} \right)' style='background-color:white' height='57' width='464' /></p>
<p>When <em>q</em> = 2 and so <em>x</em> = 1/2, <em>N</em> = 6 is enough to compute φ(<em>x</em>) with an error less than 2<sup>−70</sup>, beyond the precision of a floating point number. When <em>q</em> is larger, even fewer terms are needed.</p>
<p>To illustrate this, we have the following Python script.</p>
<pre>
def phi(x, N):
    # Truncated pentagonal number series for the Euler function.
    # The exponent (3n^2 - n)/2 is always an integer since n(3n - 1) is even.
    s = 0
    for n in range(-N, N+1):
        s += (-1)**n * x**((3*n**2 - n)//2)
    return s

print(phi(0.5, 6))
</pre>
<p>Every digit in the output is correct.</p>
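<p>As a sanity check, the series can be compared against a truncated form of the defining product. A quick sketch, where the 200-factor cutoff is an arbitrary choice, far more than needed for <em>x</em> = 1/2:</p>

```python
from math import prod

def phi_series(x, N):
    # Truncated pentagonal-number series for the Euler function
    return sum((-1)**n * x**((3*n**2 - n)//2) for n in range(-N, N + 1))

def phi_product(x, terms=200):
    # Truncated form of the defining product
    return prod(1 - x**i for i in range(1, terms + 1))

# The two agree to within floating point error
print(abs(phi_series(0.5, 6) - phi_product(0.5)))
```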
<h2>Related posts</h2>
<ul>
<li class='link'><a href='https://www.johndcook.com/blog/2024/07/23/q-pochhammer/'>q-analogs of rising powers</a></li>
<li class='link'><a href='https://www.johndcook.com/blog/2021/11/11/another-pentagonal-number-theorem/'>Another pentagonal number theorem</a></li>
</ul>
<p>[1] I didn&#8217;t say that explicitly, but I pointed out that <em>p</em>(2, 8) was close to <em>p</em>(2, ∞).</p>
<p>[2] This identity is also known as the pentagonal number theorem because of its connection to <a href="https://www.johndcook.com/blog/2021/11/10/partitions-and-pentagons/">pentagonal numbers</a>.</p>The post <a href="https://www.johndcook.com/blog/2026/05/11/euler-function/">Euler function</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/11/euler-function/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Inverse shift</title>
		<link>https://www.johndcook.com/blog/2026/05/11/inverse-shift/</link>
					<comments>https://www.johndcook.com/blog/2026/05/11/inverse-shift/#respond</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Mon, 11 May 2026 15:30:30 +0000</pubDate>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Linear algebra]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247017</guid>

					<description><![CDATA[<p>What is the inverse of shifting a sequence to the right? Shifting it to the left, obviously. But wait a minute. Suppose you have a sequence of eight bits abcdefgh and you shift it to the right. You get 0abcdefg. If you shift this sequence to the left you get abcdefg0 You can&#8217;t recover the [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/11/inverse-shift/">Inverse shift</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>What is the inverse of shifting a sequence to the right? Shifting it to the left, obviously.</p>
<p>But wait a minute. Suppose you have a sequence of eight bits</p>
<p style="padding-left: 40px;"><em>abcdefgh</em></p>
<p>and you shift it to the right. You get</p>
<p style="padding-left: 40px;">0<em>abcdefg.</em></p>
<p>If you shift this sequence to the left you get</p>
<p style="padding-left: 40px;"><em>abcdefg</em>0</p>
<p>You can&#8217;t recover the last element <em>h</em> because the right-shift destroyed information about <em>h</em>.</p>
<p>A left shift doesn&#8217;t fully undo a right shift, and yet surely the two operations are in some sense inverses.</p>
<p><a href="https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/">Yesterday</a> I wrote a post about representing bit manipulations, including shifts, as matrix operators. The matrix corresponding to shifting right by <em>k</em> bits has 1s on the <em>k</em>th diagonal above the main diagonal and 0s everywhere else. For example, here is the matrix for shifting an 8-bit number right two bits. A black square represents a 1 and a white square represents a 0.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" src="https://www.johndcook.com/right_shift_2_matrix.png" width="360" height="360" /></p>
<p>This matrix isn&#8217;t invertible. When you&#8217;d like to take the inverse of a non-invertible matrix, your kneejerk response should be to compute the pseudoinverse. (Technically the <a href="https://www.johndcook.com/blog/2018/05/05/svd/">Moore-Penrose pseudoinverse</a>. There are <a href="https://www.johndcook.com/blog/2025/05/21/drazin-pseudoinverse/">other</a> pseudoinverses, but Moore-Penrose is the most common.)</p>
<p>As you might hope/expect, the pseudoinverse of a right-shift matrix is a left-shift matrix. In this case the pseudoinverse is simply the transpose, though of course that isn&#8217;t always the case.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" src="https://www.johndcook.com/left_shift_2_matrix.png" width="360" height="360" /></p>
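<p>We can check this numerically. Here is a quick sketch using NumPy; the 8 × 8 size and shift of 2 are taken from the matrices pictured above.</p>

```python
import numpy as np

n, k = 8, 2
# Right shift by k: 1s on the kth diagonal above the main diagonal
R = np.eye(n, k=k)

# NumPy computes the Moore-Penrose pseudoinverse via the SVD
P = np.linalg.pinv(R)

# The pseudoinverse is the transpose, i.e. the left-shift-by-k matrix
print(np.allclose(P, R.T))
```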
<p>If you&#8217;d like to prove that the pseudoinverse of a matrix that shifts right by <em>k</em> places is a matrix that shifts left by <em>k</em> places, you don&#8217;t have to compute the pseudoinverse per se: you can verify your guess. <a href="https://www.johndcook.com/blog/2018/04/24/moore-penrose-pseudoinverse-is-not-an-adjoint/">This post</a> gives four requirements for a pseudoinverse. You can prove that left shift is the pseudoinverse of right shift by showing that it satisfies the four equations.</p>The post <a href="https://www.johndcook.com/blog/2026/05/11/inverse-shift/">Inverse shift</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/11/inverse-shift/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Probability that a random binary matrix is invertible</title>
		<link>https://www.johndcook.com/blog/2026/05/11/random-binary-matrices/</link>
					<comments>https://www.johndcook.com/blog/2026/05/11/random-binary-matrices/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Mon, 11 May 2026 13:25:13 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Linear algebra]]></category>
		<category><![CDATA[Probability]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247016</guid>

					<description><![CDATA[<p>The two latest posts have involved invertible matrices with 0 and 1 entries. If you fill an n × n matrix with 0s and 1s randomly, how likely is it to be invertible? What kind of inverse? There are a couple ways to find the probability that a binary matrix is invertible, depending on what [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/11/random-binary-matrices/">Probability that a random binary matrix is invertible</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The two latest posts have involved invertible matrices with 0 and 1 entries. If you fill an <em>n</em> × <em>n</em> matrix with 0s and 1s randomly, how likely is it to be invertible?</p>
<h2>What kind of inverse?</h2>
<p>There are a couple ways to find the probability that a binary matrix is invertible, depending on what you mean by the inverse.</p>
<p>Suppose you have a matrix <em>M</em> filled with 0s and 1s and you&#8217;re looking for a matrix <em>N</em> such that <em>MN</em> is the identity matrix. Do you want the entries of <em>N</em> to also be 0s and 1s? And when you multiply the matrices, are you doing ordinary integer arithmetic or are you working mod 2?</p>
<p>In the previous posts we were working over GF(2), the field with two elements, 0 and 1. All the elements of a matrix are either 0 or 1, and arithmetic is carried out mod 2. In that context there&#8217;s a nice expression for the probability a square matrix is invertible.</p>
<p>If you&#8217;re working over the real numbers, the probability of a binary matrix being invertible is higher. One way to see this is that the inverse of a binary matrix is allowed to be binary but it isn&#8217;t required to be.</p>
<p>Another way to see this is to look at determinants. If you think of a matrix <em>M</em> as a real matrix whose entries happen to only be 0 or 1, <em>M</em> is invertible if and only if its determinant is non-zero. But if you think of <em>M</em> as a matrix over GF(2), the entries are either 0 or 1 out of necessity, and <em>M</em> is invertible if and only if its determinant, <em>computed in</em> GF(2), is non-zero. If the determinant of <em>M</em> as a real matrix is a non-zero even number, then <em>M</em> is invertible as a real matrix but not as a matrix over GF(2).</p>
<h2>Probability of invertibility in GF(2)</h2>
<p>Working over GF(2), what is the probability that a random matrix is invertible? Turns out it&#8217;s just as easy to answer a more general question: what is the probability that a random <em>n</em> × <em>n</em> matrix over GF(<em>q</em>), a finite field with <em>q</em> elements, is invertible? This is</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/prob_finite_field_matrix_invertible.svg" alt="\prod_{i=1}^n\left( 1 - \frac{1}{q^i} \right)" width="98" height="54" /></p>
<p>When <em>q</em> = 2 and <em>n</em> = 8 this probability is 0.289919. The probability is roughly the same for all larger values of <em>n</em>, converging to approximately 0.288788 as <em>n</em> → ∞.</p>
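<p>These numbers are easy to reproduce directly from the product. A short sketch:</p>

```python
from math import prod

def p_invertible(q, n):
    # Probability that a random n-by-n matrix over GF(q) is invertible
    return prod(1 - q**-i for i in range(1, n + 1))

print(round(p_invertible(2, 8), 6))    # 0.289919
print(round(p_invertible(2, 100), 6))  # essentially the limit as n grows
```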
<h2>Probability of invertibility in ℝ</h2>
<p>What is the probability that an 8 × 8 matrix with random 0 and 1 entries is invertible as a real matrix? We can estimate this by simulation.</p>
<pre>import numpy as np

def simulate_prob_invertible_real(n, numreps=10000):
    # Estimate the probability that a random n-by-n 0/1 matrix
    # is invertible over the reals.
    s = 0
    for _ in range(numreps):
        M = np.random.randint(0, 2, size=(n, n))
        det = np.linalg.det(M)
        # The determinant of an integer matrix is an integer, so
        # comparing against a small tolerance tests for nonzero.
        if abs(det) &gt; 1e-9:
            s += 1
    return s/numreps
</pre>
<p>When <em>n</em> = 8, I got 0.5477 when running the code with 10,000 reps.</p>
<p>When <em>n</em> = 32, I got a probability of 1. Obviously it is possible for a 32 × 32 binary matrix to be singular, but it&#8217;s very unlikely: it didn&#8217;t happen in 10,000 random draws.</p>The post <a href="https://www.johndcook.com/blog/2026/05/11/random-binary-matrices/">Probability that a random binary matrix is invertible</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/11/random-binary-matrices/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>The linear algebra of bit twiddling</title>
		<link>https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/</link>
					<comments>https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Sun, 10 May 2026 18:51:40 +0000</pubDate>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Linear algebra]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247014</guid>

					<description><![CDATA[<p>The previous post looked at the tempering step of the Mersenne Twister, formulating a sequence of bit operations as multiplication by a matrix mod 2. This post will look at the components more closely. The theorems of linear algebra generally hold independent of the field of scalars. Typically the field is ℝ or ℂ, but [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/">The linear algebra of bit twiddling</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The <a href="https://www.johndcook.com/blog/2026/05/10/reverse-mersenne-twister/">previous post</a> looked at the tempering step of the Mersenne Twister, formulating a sequence of bit operations as multiplication by a matrix mod 2. This post will look at the components more closely.</p>
<p>The theorems of linear algebra generally hold independent of the field of scalars. Typically the field is ℝ or ℂ, but most of basic linear algebra works the same over every field [1]. In particular, we can do linear algebra over a finite field, and we&#8217;re interested in the most finite of finite fields, GF(2), the field with just two elements, 0 and 1.</p>
<p>In GF(2), addition corresponds to XOR. We will denote this by ⊕ to remind us that although it&#8217;s addition, it&#8217;s not the usual addition, i.e. 1 ⊕ 1 = 0. Similarly, multiplication corresponds to AND. We&#8217;ll work with 8-bit numbers to make the visuals easier to see.</p>
<p>Shifting a number left one bit corresponds to multiplication by a matrix with 1s below the main diagonal. Shifting left by <em>k</em> bits is the same as shifting left by 1 bit <em>k</em> times, so the matrix representation for <em>x</em> &lt;&lt; <em>k</em> is the <em>k</em>th power of the matrix representation of shifting left once. This matrix has 1s on the <em>k</em>th diagonal below the main diagonal. Below is the matrix for shifting left two bits, <em>x</em> &lt;&lt; 2.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/left_shift_2_matrix.png" width="360" height="360" /></p>
<p>Right shifts are the mirror image of left shifts. Here&#8217;s the matrix for shifting right two bits, <em>x</em> &gt;&gt; 2.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/right_shift_2_matrix.png" width="360" height="360" /></p>
<p>Shifts are not fully invertible because bits either fall off the left or the right end. The steps in the Mersenne Twister are invertible because shifts are always XOR&#8217;d with the original argument. For example, although the function that takes <em>x</em> to <em>x</em> &gt;&gt; 2 is not invertible, the function that takes <em>x</em> to <em>x</em> ⊕ (<em>x</em> &gt;&gt; 2) is invertible. This operation corresponds to the matrix below.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/right_shift_xor.png" width="360" height="360" /></p>
<p>This is an upper triangular matrix, so its determinant is the product of the diagonal elements. These are all 1s, so the determinant is 1, and the matrix is invertible.</p>
<p>Bitwise AND multiplies each bit of the input by the corresponding bit in another number known as the mask. The bits aligned with a 1 are kept and the bits aligned with a 0 are cleared. This corresponds to multiplying by a diagonal matrix whose diagonal elements correspond to the bits in the mask. For example, here is the matrix that corresponds to taking the bitwise AND with 10100100.</p>
<p><img loading="lazy" decoding="async" src="https://www.johndcook.com/mask_matrix.png" width="360" height="360" class="aligncenter size-medium" /></p>
<p>Each of the steps in the Mersenne Twister tempering process is invertible because they all correspond to triangular matrices with all 1s on the diagonal. For example, the line</p>
<pre>y ^= (y &lt;&lt;  7) &amp; 0x9d2c5680</pre>
<p>says to shift the bits of <code>y</code> left 7 places, then zero out the elements corresponding to 0s in the mask, then XOR the result with <em>y</em>. In matrix terms, we multiply by a lower triangular matrix with zeros on the main diagonal, then multiply by a diagonal matrix that zeros out some of the terms, then add the identity matrix. So the matrix corresponding to the line of code above is lower triangular, with all 1s on the diagonal, so it is invertible. </p>
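<p>To make this concrete, here is a small sketch that builds the matrix for the line of code above, storing bit <em>i</em> of <code>y</code> in component <em>i</em> of the vector, and checks the triangularity claim.</p>

```python
import numpy as np

n = 32
mask = 0x9d2c5680

# Shifting left by 7 sends bit i to bit i + 7, so with the least significant
# bit first the matrix has 1s on the 7th diagonal below the main diagonal.
S = np.eye(n, k=-7, dtype=int)

# ANDing with the mask is a diagonal matrix of the mask's bits;
# (mask // 2**i) % 2 extracts bit i of the mask.
D = np.diag([(mask // 2**i) % 2 for i in range(n)])

# Shift, then mask, then XOR with the original, i.e. add the identity mod 2
T = (np.eye(n, dtype=int) + D @ S) % 2

# Lower triangular with 1s on the diagonal, so the determinant is 1 mod 2
print(bool(np.all(np.diag(T) == 1) and np.all(np.triu(T, 1) == 0)))
```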
<p>[1] Until you get to eigenvalues. Then it matters whether the field is algebraically closed, which no finite field is.</p>The post <a href="https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/">The linear algebra of bit twiddling</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>Reverse engineering Mersenne Twister with Linear Algebra</title>
		<link>https://www.johndcook.com/blog/2026/05/10/reverse-mersenne-twister/</link>
					<comments>https://www.johndcook.com/blog/2026/05/10/reverse-mersenne-twister/#respond</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Sun, 10 May 2026 17:32:02 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Linear algebra]]></category>
		<category><![CDATA[RNG]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247013</guid>

					<description><![CDATA[<p>The Mersenne Twister (MT) is a random number generator with good statistical properties but bad cryptographic properties. In buzzwords, it&#8217;s a PRNG but not a CSPRNG. This post will show how the internal state of a MT generator can be recovered from its output. We&#8217;ll do this using linear algebra. The bit twiddling approach is [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/10/reverse-mersenne-twister/">Reverse engineering Mersenne Twister with Linear Algebra</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The Mersenne Twister (MT) is a random number generator with good statistical properties but bad cryptographic properties. In buzzwords, it&#8217;s a PRNG but not a CSPRNG.</p>
<p>This post will show how the internal state of a MT generator can be recovered from its output. We&#8217;ll do this using linear algebra. The bit twiddling approach is more common and more efficient, but the linear algebra approach is conceptually simpler.</p>
<h2>How MT works</h2>
<p>There are numerous variations on the Mersenne Twister. We&#8217;ll focus on the original version, whose internal state consists of a vector <code>x</code> of 624 32-bit numbers. The ideas in this post apply equally to all MT versions.</p>
<p>The first call to MT returns a &#8220;tempered&#8221; version of <code>x[0]</code>. The next call returns a tempered version of <code>x[1]</code>, and so on. After every 624 calls, the state is scrambled. This scrambling is where the &#8220;twist&#8221; in the name Mersenne Twister comes from. (The Mersenne part comes from the fact that the period of an MT generator is a <a href="https://www.johndcook.com/blog/2011/09/09/five-interesting-things-about-mersenne-primes/">Mersenne prime</a>.)</p>
<h2>Tempering</h2>
<p>Here is Python code for performing the tempering step.</p>
<pre>def temper(y):
    y ^= (y &gt;&gt; 11) 
    y ^= (y &lt;&lt;  7) &amp; 0x9d2c5680 
    y ^= (y &lt;&lt; 15) &amp; 0xefc60000 
    y ^= (y &gt;&gt; 18)  
    return y
</pre>
<p>Each step is reversible, and so the <code>temper</code> function is reversible.</p>
<p>Because the tempering step is reversible, the first output can be used to infer the first element of the internal state, the second output to infer the second element, and so on. After 624 calls one knows the entire internal state and can predict the rest of the generator&#8217;s output from then on.</p>
<h2>Linear algebra</h2>
<p>The bitwise operations above all correspond to linear operations over GF(2), the field with just two elements, 0 and 1. Addition in this field is XOR and multiplication is AND.</p>
<p>Each step corresponds to multiplying a vector of 32 bits on the left by a 32 × 32 matrix whose entries are 0s and 1s, with the understanding that the sum of two bits is their XOR and the product of two bits is their AND. Equivalently, arithmetic is carried out mod 2. So you can compute the matrix-vector product with ordinary integer arithmetic, provided you then reduce everything mod 2.</p>
<p>We will find the matrix <em>M</em> that corresponds to the temper operation and prove that it is invertible by finding its inverse. This proves that tempering is invertible, and one could undo tempering by multiplying by <em>M</em><sup>−1</sup>, though it would be more efficient to undo it by bit twiddling.</p>
<p>One way to recover a matrix is to multiply it by unit vectors <em>e</em><sub><em>i</em></sub>, where the <em>i</em>th component of <em>e</em><sub><em>i</em></sub> is 1 and the rest of the components are zero. Then</p>
<p style="padding-left: 40px;"><em>M</em> <em>e</em><sub><em>i</em></sub></p>
<p>is the <em>i</em>th column of <em>M</em>.</p>
<p>So we can find the columns of <em>M</em> by tempering powers of 2. In the code below, component <em>i</em> of a vector holds bit 31 − <em>i</em>, the convention that the most significant bit comes first, so column <em>i</em> of <em>M</em> comes from tempering 1 &lt;&lt; (31 − <em>i</em>).</p>
<pre>import numpy as np

M = np.zeros((32, 32), dtype=int)
for i in range(32):
    t = temper(1 &lt;&lt; (31-i))
    s = f"{t:032b}"
    for j in range(32):
        M[j, i] = int(s[j])
</pre>
<p>Let&#8217;s generate a random number and compute its tempered form two ways: directly and by matrix multiplication.</p>
<pre>import random

x = random.getrandbits(32)
y = temper(x)
print(f"{y:032b}")

vx = np.array([int(b) for b in f"{x:032b}"]) # vector form of x
vy = M @ vx % 2 # vector form of y
print("".join(str(b) for b in vy))
</pre>
<p>Both produce the same bits:</p>
<pre>10100101100101101100110101000110
10100101100101101100110101000110
</pre>
<p>We can find the matrix representation of the untemper function by inverting the matrix <em>M</em>. However, we need to invert it over the field GF(2), not over the integers or reals.</p>
<pre>import galois
GF2 = galois.GF(2)
Minv = np.linalg.inv(GF2(M))
</pre>
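<p>If the <code>galois</code> package isn&#8217;t available, the inverse over GF(2) can also be computed directly by Gauss&#8211;Jordan elimination mod 2. Here is a minimal sketch, with a helper <code>gf2_inv</code> of my own devising, checked on a small 8-bit shift-and-XOR matrix rather than on <em>M</em> itself:</p>

```python
import numpy as np

def gf2_inv(M):
    # Invert a 0/1 matrix over GF(2) by Gauss-Jordan elimination mod 2.
    # Raises StopIteration if M is singular over GF(2).
    n = M.shape[0]
    A = np.hstack([M % 2, np.eye(n, dtype=int)])  # augment with the identity
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r, col] == 1)
        A[[col, pivot]] = A[[pivot, col]]         # swap pivot row into place
        for r in range(n):
            if r != col and A[r, col] == 1:
                A[r] = (A[r] + A[col]) % 2        # eliminate, i.e. XOR rows
    return A[:, n:]

# Check on the 8-bit matrix for x ^= x >> 2
T = (np.eye(8, dtype=int) + np.eye(8, k=2, dtype=int)) % 2
print(np.array_equal((gf2_inv(T) @ T) % 2, np.eye(8, dtype=int)))
```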
<p>Here are visualizations of <em>M</em> and its inverse using a black square for a 1 and a white square for a 0.</p>
<p><em>M</em>:</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/MT_M.png" width="400" height="400" /></p>
<p><em>M</em><sup>−1</sup>:</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/MT_U.png" width="400" height="400" /></p>
<p>The <a href="https://www.johndcook.com/blog/2026/05/10/the-linear-algebra-of-bit-twiddling/">next post</a> will back up and look at the linear algebra of the components that comprise the Mersenne Twister.</p>The post <a href="https://www.johndcook.com/blog/2026/05/10/reverse-mersenne-twister/">Reverse engineering Mersenne Twister with Linear Algebra</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/10/reverse-mersenne-twister/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Calculating curvature</title>
		<link>https://www.johndcook.com/blog/2026/05/08/calculating-curvature/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Fri, 08 May 2026 13:54:18 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Differential geometry]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247012</guid>

					<description><![CDATA[<p>Curvature is conceptually simple but usually difficult to calculate. For a level set curve f(x, y) = c, such as in the previous couple posts, the equation for curvature is Even when f has a fairly simple expression, the expression for κ can be complicated. If we define then the level set of f(x, y) = c is [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/08/calculating-curvature/">Calculating curvature</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Curvature is conceptually simple but usually difficult to calculate. For a level set curve <em>f</em>(<em>x</em>, <em>y</em>) = c, such as in the previous couple posts, the equation for curvature is</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/curvature_implicit.svg" alt="\kappa = \frac{\left|f_y^2f_{xx}-2f_xf_yf_{xy}+f_x^2f_{yy}\right|}{\bigl(f_x^2+f_y^2\bigr)^{3/2}}" width="238" height="72" /></p>
<p>Even when <em>f</em> has a fairly simple expression, the expression for κ can be complicated.</p>
<p>If we define</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/equilateral_triangle_contours.svg" alt="f(x, y) = y^3 - 3 x^2 - 3 y^2 - 3 x^2 y" width="255" height="22" /></p>
<p>then the level set of <em>f</em>(<em>x</em>, <em>y</em>) = c is an equilateral triangle when <em>c</em> = −4. The level sets are smoothed triangles for −4 &lt; <em>c</em> &lt; 0.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/triangle_contours.png" width="400" height="394" /></p>
<p>The curvature of these level sets at any point is given by</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/equilateral_triangle_contours2.svg" alt="\frac{\left|2 (y+1) \left((y-2)^2-3 x^2\right) \left(x^2+y^2\right)\right|}{\left(x^4+2 x^2 (y (y+6)+2)+(y-2)^2 y^2\right)^{3/2}}" width="322" height="59" /></p>
<h2>Simplification</h2>
<p>But there is one instance in which curvature is easy to calculate. For the graph of a function <em>g</em>(<em>x</em>) = <em>y</em>, the curvature is approximately the absolute value of the second derivative of <em>g</em>, provided the first derivative is small.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/curvature_approx.svg" alt=" \kappa = \frac{|g^{\prime\prime}(x)|}{(1 + g^\prime(x)^2)^{3/2}} \approx |g^{\prime\prime}(x)|" width="228" height="47" /></p>
<p>At a local maximum or local minimum of <em>g</em>(<em>x</em>) the approximation is exact because the first derivative is zero.</p>
<h2>Max and min curvature of smoothed triangles</h2>
<p>This means that in the example above, we can calculate the maximum and minimum curvature of the level sets. The maximum curvature occurs in the corners and the minimum occurs in the middle of the sides. By symmetry there are three maxima and three minima, but we can take the ones corresponding to <em>x</em> = 0 for convenience. There we find the curvature is simply</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/curvature_approx2.svg" alt="|6 + 6y|" width="59" height="18" /></p>
<p>When <em>x</em> = 0, we have</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/curvature_approx3.svg" alt="f(x,y) = c = y^3 - 3y^2" width="175" height="22" /></p>
<p>and so the maximum and minimum curvature occur at the two roots of the cubic equation for <em>y</em> that lie in the interval [−1, 2]. (There&#8217;s another root greater than 2.)</p>
<p>For example, when <em>c</em> = −3, the roots are −0.8794, 1.3473, and 2.5321. The first root corresponds to the minimum curvature, the second to the maximum, and the third is outside our region of interest. It follows that the minimum curvature is 0.7237 and the maximum is 14.0838.</p>
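<p>These values can be reproduced in a few lines of Python, using the expression |6 + 6<em>y</em>| above for the curvature at <em>x</em> = 0:</p>

```python
import numpy as np

# Roots of y^3 - 3y^2 = c with c = -3, i.e. of y^3 - 3y^2 + 3 = 0
ys = sorted(np.roots([1.0, -3.0, 0.0, 3.0]).real)

# The two smallest roots lie in [-1, 2]; the curvature there is |6 + 6y|
kappa_min, kappa_max = (abs(6 + 6*y) for y in ys[:2])
print(round(kappa_min, 4), round(kappa_max, 4))  # 0.7237 14.0838
```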
<p>When <em>c</em> = −1 the minimum and maximum curvature are 2.80747 and 9.91622 respectively.</p>
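<p>Here is a short Python sketch that reproduces the numbers above by solving the cubic with bisection and evaluating |6 + 6<em>y</em>| at the roots in [−1, 2]. The bracketing intervals and tolerance are my choices; they work for −4 &lt; <em>c</em> &lt; 0.</p>

```python
def bisect(f, a, b, tol=1e-12):
    # Simple bisection; assumes f(a) and f(b) have opposite signs
    fa = f(a)
    while b - a > tol:
        m = (a + b) / 2
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return (a + b) / 2

def extreme_curvatures(c):
    # Roots of y^3 - 3y^2 = c that lie in [-1, 2]; note g(0) = -c > 0
    # while g(-1) and g(2) are negative for -4 < c < 0
    g = lambda y: y**3 - 3*y**2 - c
    y_min = bisect(g, -1, 0)   # root giving the minimum curvature
    y_max = bisect(g, 0, 2)    # root giving the maximum curvature
    # Curvature at x = 0 is |6 + 6y|
    return abs(6 + 6*y_min), abs(6 + 6*y_max)

print(extreme_curvatures(-3))  # approximately (0.7237, 14.0838)
print(extreme_curvatures(-1))  # approximately (2.8075, 9.9162)
```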
<p>Since <em>c</em> = −4 corresponds to the triangle, the minimum curvature is 0 and the maximum is infinite. As <em>c</em> increases, the minimum and maximum curvature come together because the level set is becoming more round.</p>The post <a href="https://www.johndcook.com/blog/2026/05/08/calculating-curvature/">Calculating curvature</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Smoothed polygons</title>
		<link>https://www.johndcook.com/blog/2026/05/07/smoothed-polygons/</link>
					<comments>https://www.johndcook.com/blog/2026/05/07/smoothed-polygons/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Thu, 07 May 2026 18:19:24 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247009</guid>

					<description><![CDATA[<p>The previous post constructed a triangular analog of the squircle, the unit circle in the p-norm where p is typically around 4. The case p = 2 is a Euclidean circle and the limit as p → ∞ is a Euclidean square. The previous post introduced three functions Li(x, y) such that the level set of each function forms [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/07/smoothed-polygons/">Smoothed polygons</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The previous post constructed a triangular analog of the squircle, the unit circle in the <em>p</em>-norm where <em>p</em> is typically around 4. The case <em>p</em> = 2 is a Euclidean circle and the limit as <em>p</em> → ∞ is a Euclidean square.</p>
<p>The previous post introduced three functions <em>L</em><sub><em>i</em></sub>(<em>x</em>, <em>y</em>) such that the level set of each function</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/trisqr4.svg" alt="\{(x,y) \mid L_i(x,y) = 1\}" width="170" height="18" /></p>
<p>forms a side of a triangle. Then it introduced a soft penalty for each <em>L</em> being away from 1, and the level sets of that penalty function formed the rounded triangles we were looking for.</p>
<p>Another approach would be to change the <em>L</em>&#8217;s slightly so that the sides are the level sets <em>L</em><sub><em>i</em></sub>(<em>x</em>, <em>y</em>) = 0. The advantage to this formulation is that the product of three numbers is 0 if and only if one of the numbers is zero. That means if we define</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/trisqr5.svg" alt="f(x, y) = L_1(x, y) L_2(x, y) L_3(x, y)" width="284" height="18" /><br />
then the set of points</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/trisqr6.svg" alt="\{(x, y) \mid f(x,y) = c\}" width="165" height="18" /></p>
<p>corresponds to the three lines when <em>c</em> = 0 and the level sets for small <em>c</em> &gt; 0 are nearly triangles. The level sets will be smooth if the gradient is non-zero, i.e. <em>c</em> is not zero.</p>
<p><strong>This approach is not unique to triangles</strong>. You could create smooth approximations to any polygon by multiplying linear functions that define the sides. Or you could do something similar with curved arcs.</p>
<p>If we define our <em>L&#8217;</em>s by</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/trisqr7.svg" alt="\begin{align*} L_1(x,y) &amp;= y\\ L_2(x,y) &amp;= y - 2x - 2 \\ L_3(x,y) &amp;= 3y + x -3\\ \end{align*}" width="169" height="77" /></p>
<p>then our curves will be the level sets of</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/trisqr8.svg" alt="f(x, y) = y(y - 2 x - 2)  (3 y + x - 3)" width="290" height="18" /></p>
<p>A few level sets are shown below. The level set for <em>c</em> = 0 consists of the three straight lines.</p>
<p>Note the level sets are not connected. If you&#8217;re interested in approximate triangles, you want the components that are inside the triangle.</p>
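<p>As a quick sanity check, here is a small Python sketch (the test points are arbitrary choices of mine) confirming that the product vanishes exactly on each of the three lines and is positive at a point inside the triangle:</p>

```python
def f(x, y):
    # Product of the three linear factors, one per side
    return y * (y - 2*x - 2) * (3*y + x - 3)

# On each line one factor is zero, so f = 0 there
print(f(5, 0), f(1, 4), f(0, 1))  # all 0
# Inside the triangle f is positive, so small c > 0 gives near-triangles
print(f(0, 0.5))                  # 1.125
```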
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/trisqr3.png" width="600" height="320" /></p>
<p>&nbsp;</p>The post <a href="https://www.johndcook.com/blog/2026/05/07/smoothed-polygons/">Smoothed polygons</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/07/smoothed-polygons/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>Triangular analog of the squircle</title>
		<link>https://www.johndcook.com/blog/2026/05/06/triangular-analog-of-the-squircle/</link>
					<comments>https://www.johndcook.com/blog/2026/05/06/triangular-analog-of-the-squircle/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Wed, 06 May 2026 19:53:29 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247007</guid>

					<description><![CDATA[<p>TimF left a comment on my guitar pick post saying the image was a &#8220;squircle-ish analog for an isosceles triangle.&#8221; That made me wonder what a more direct analog of the squircle might be for a triangle. A squircle is not exactly a square with rounded corners. The sides are continuously curved, but curved most [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/06/triangular-analog-of-the-squircle/">Triangular analog of the squircle</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>TimF left a comment on my <a href="https://www.johndcook.com/blog/2026/05/03/guitar-pick/">guitar pick post</a> saying the image was a &#8220;squircle-ish analog for an isosceles triangle.&#8221; That made me wonder what a more direct analog of the squircle might be for a triangle.</p>
<p>A squircle is not exactly a square with rounded corners. The sides are continuously curved, but curved most at the corners. See, for example, <a href="https://www.johndcook.com/blog/2018/02/13/squircle-curvature/">this post</a>.</p>
<p>Suppose the sides of our triangle are given by <em>L</em><sub><em>i</em></sub>(<em>x</em>, <em>y</em>) = 1 for <em>i</em> = 1, 2, 3. For example, </p>
<p><img class='aligncenter' src='https://www.johndcook.com/trisqr1.svg' alt='\begin{align*}
L_1(x, y) &#038;= y \\
L_2(x, y) &#038;= \phantom{-}\frac{\sqrt{3}}{2}x - \frac{1}{2}y \\
L_3(x, y) &#038;= -\frac{\sqrt{3}}{2}x - \frac{1}{2}y \\
\end{align*}
' style='background-color:white' height='127' width='182' /></p>
<p>We design a function <em>f</em>(<em>x</em>, <em>y</em>) as a soft penalty for points not being on one of the sides and look at the set of points <em>f</em>(<em>x</em>, <em>y</em>) = 1.</p>
<p><img class='aligncenter' src='https://www.johndcook.com/trisqr2.svg' alt='f(x, y) = \left( \sum_{i=1}^3 \exp(p (L_i(x, y) - 1)) \right)^{1/p}' style='background-color:white' height='65' width='310' /></p>
<p>You might recognize this as a Lebesgue norm, analogous to the one used to define a squircle.</p>
<p>The larger <em>p</em> is, the heavier the penalty for being far from a side and the closer the level set <em>f</em>(<em>x</em>, <em>y</em>) = 1 comes to being a triangle.</p>
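<p>Here is a minimal Python sketch of the penalty function (the value of <em>p</em> and the test point are my choices). At the midpoint of a side, one term of the sum is exp(0) = 1 and the other two are exponentially small, so <em>f</em> is very nearly 1 there, and the approximation sharpens as <em>p</em> grows.</p>

```python
import math

def f(x, y, p):
    # The three linear functions whose level sets L_i = 1 are the sides
    L = [y, math.sqrt(3)/2 * x - y/2, -math.sqrt(3)/2 * x - y/2]
    # Soft penalty: terms with L_i near 1 dominate the sum
    return sum(math.exp(p * (Li - 1)) for Li in L) ** (1/p)

# (0, 1) is the midpoint of the side L_1 = 1, so f should be close to 1
print(f(0, 1, 4), f(0, 1, 10))
```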
<p><img loading="lazy" decoding="async" src="https://www.johndcook.com/triangular_squircle.png" width="600" height="600" class="aligncenter size-medium" /></p>The post <a href="https://www.johndcook.com/blog/2026/05/06/triangular-analog-of-the-squircle/">Triangular analog of the squircle</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/06/triangular-analog-of-the-squircle/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>Unified config files</title>
		<link>https://www.johndcook.com/blog/2026/05/06/unified-config-files/</link>
					<comments>https://www.johndcook.com/blog/2026/05/06/unified-config-files/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Wed, 06 May 2026 16:32:20 +0000</pubDate>
				<category><![CDATA[Computing]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247006</guid>

					<description><![CDATA[<p>I try to maintain a consistent work environment across computers that I use. The computers differ for important reasons, but I&#8217;d rather they not differ for unimportant reasons. Unified keys One thing I do is remap keys so that the same key does the same thing everywhere, to the extent that&#8217;s practical. This requires remapping [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/06/unified-config-files/">Unified config files</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>I try to maintain a consistent work environment across computers that I use. The computers differ for important reasons, but I&#8217;d rather they not differ for unimportant reasons.</p>
<h2>Unified keys</h2>
<p>One thing I do is remap keys so that the same key does the same thing everywhere, to the extent that&#8217;s practical. This requires <a href="https://www.johndcook.com/blog/2022/08/17/cross-platform-muscle-memory/">remapping keys</a>. In particular, I want the key <b>functionality</b>, not the key <b>name</b>, to be the same across operating systems. For example, the Command key on a Mac does what the Control key does on Windows and Linux. I have my machines set up so the key in the lower left corner, whatever you call it, does things like copy, paste, and cut.</p>
<h2>Unified config files</h2>
<p>I use bash everywhere even though zsh is the default on macOS. But Linux and macOS are sufficiently different that I have two <code>.bashrc</code> files, one for each OS. However, both <code>.bashrc</code> files source a common file <code>.bash_aliases</code> for aliases. You can set that up by putting the following code in your <code>.bashrc</code> file.</p>
<pre>if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi
</pre>
<p>Sometimes you need some kind of branching logic even for two machines running the same OS. For example, if you have two computers, named Castor and Pollux, you might have code like the following in your <code>.bashrc</code>.</p>
<pre>HOSTNAME_SHORT=$(hostname -s)
if [ "$HOSTNAME_SHORT" = "Castor" ]; then
...
elif [ "$HOSTNAME_SHORT" = "Pollux" ]; then
...
fi
</pre>
<p>One problem with using hostname is that the OS can change the name on you. At least macOS will do this; I don&#8217;t know whether other operating systems will. If you give your machine an uncommon name then this is unlikely to happen.</p>
<p>I have one <code>init.el</code> file for Emacs. It contains some branching logic for OS:</p>
<pre>(cond
    ((string-equal system-type "windows-nt") ; Microsoft Windows
     ...)
    ((string-equal system-type "gnu/linux") ; linux
     ...)
    ((string-equal system-type "darwin") ; Mac
     ...)
)
</pre>
<p>You can also branch on hostname.</p>
<pre>(if (string-equal (system-name) "Castor")
...)
(if (string-equal (system-name) "Pollux")
...)
</pre>
<p>This isn&#8217;t difficult, but in the short run it&#8217;s easier to just make one ad hoc edit to a config file at a time, letting the files drift apart. Along those lines, see <a href="https://www.johndcook.com/blog/2012/08/01/bicycle-skills/">bicycle skills</a>. </p>The post <a href="https://www.johndcook.com/blog/2026/05/06/unified-config-files/">Unified config files</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/06/unified-config-files/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>The mythology of category theory</title>
		<link>https://www.johndcook.com/blog/2026/05/06/category-mythology/</link>
					<comments>https://www.johndcook.com/blog/2026/05/06/category-mythology/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Wed, 06 May 2026 11:37:42 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Category theory]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247005</guid>

					<description><![CDATA[<p>Yesterday a friend and I had a conversation about category theory, how it can be a useful pattern description language, but also about how people have unrealistic expectations for it, believing category theory can deliver something for nothing. Later I ran across the following post from Qiaochu Yuan. It felt as if he had overheard [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/06/category-mythology/">The mythology of category theory</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Yesterday a friend and I had a conversation about category theory, how it can be a useful pattern description language, but also about how people have unrealistic expectations for it, believing category theory can deliver something for nothing.</p>
<p>Later I ran across the following <a href="https://x.com/qiaochuyuan/status/2051833309962088537?s=46">post</a> from Qiaochu Yuan. It felt as if he had overheard my conversation and summarized it in a tweet:</p>
<blockquote><p>category theory is just some straightforwardly useful stuff for some purposes in some fields! you can elegantly simplify and streamline some proofs. then there is the mythology of category theory, which is some other thing entirely, mostly wishful thinking and projection afaict</p></blockquote>
<p>His phrase &#8220;the mythology of category theory&#8221; gives a name to this idea that category theory can deliver specific outputs without specific inputs. It helps to distinguish CT as a scrapbook of patterns from CT as sorcery.</p>The post <a href="https://www.johndcook.com/blog/2026/05/06/category-mythology/">The mythology of category theory</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/06/category-mythology/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>Changing one character in a PDF</title>
		<link>https://www.johndcook.com/blog/2026/05/05/changing-one-character-in-a-pdf/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Tue, 05 May 2026 22:45:26 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=247003</guid>

					<description><![CDATA[<p>I saw a post on X saying Changing a hyphen to an en-dash increases your PDF file size by ~10 bytes. My first thought was that it had something to do with hyphen being an ASCII character and an en-dash not. Changing a hyphen to an en-dash would make a UTF-8 encoded text file a [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/05/changing-one-character-in-a-pdf/">Changing one character in a PDF</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>I saw a <a href="https://x.com/kjoshanssen/status/2051778800934150397?s=20">post</a> on X saying</p>
<p style="padding-left: 40px;">Changing a hyphen to an en-dash increases your PDF file size by ~10 bytes.</p>
<p>My first thought was that it had something to do with hyphen being an ASCII character and an en-dash not. Changing a hyphen to an en-dash would make a UTF-8 encoded text file a couple bytes longer. (See why <a href="https://www.johndcook.com/blog/2019/09/09/how-utf-8-works/">here</a>.) Maybe adding one non-ASCII character could cause the file to include a glyph it didn&#8217;t before.</p>
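<p>The byte counts are easy to check in Python: a hyphen is one byte in UTF-8 and an en-dash (U+2013) is three, so the swap makes a plain UTF-8 text file exactly two bytes longer.</p>

```python
hyphen, en_dash = "-", "\u2013"
print(len(hyphen.encode("utf-8")))   # 1
print(len(en_dash.encode("utf-8")))  # 3
```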
<p>I did a couple experiments. I made a minimal LaTeX file with only the text</p>
<pre>    See pages 9-10.</pre>
<p>and another with</p>
<pre>    See pages 9--10.</pre>
<p>(In LaTeX, a hyphen compiles to a hyphen and two hyphens compile to an en-dash.)</p>
<p>I compiled both files using <code>pdflatex</code>. The PDF with the hyphen was 13172 bytes and the one with the en-dash was 13099 bytes. Replacing the hyphen with the en-dash made the file 73 bytes <b>smaller</b>.</p>
<p>I repeated the experiment with a Libre Office ODT document. Changing the hyphen to an en-dash reduced the file size from 13548 bytes to 13514 bytes.</p>
<p>Then I went back to LaTeX and pasted a paragraph of lorem ipsum text into both files. Now the file with the hyphen produced an 18131-byte PDF and the file with the en-dash produced an 18203-byte PDF. So in this instance changing the hyphen to an en-dash <b>increased</b> the size of the file by 72 bytes.</p>
<p>After adding the lorem ipsum text to the ODT files, the PDF resulting from the file with the hyphen was 23 bytes larger, 17846 bytes versus 17823 bytes.</p>
<p>PDFs are complicated. Changing one character can make the file bigger or smaller, by an unpredictable amount. It depends, among other things, on what software was used to create the PDF. Even changing one letter in an all-ASCII text can change the size of the PDF, I suppose due to some internal text compression and aesthetic control characters. I don&#8217;t pretend to understand what&#8217;s going on inside a PDF.</p>
<h2>Related posts</h2>
<ul>
<li class="link"><a href="https://www.johndcook.com/blog/2025/10/31/smaller-qr-codes/">How text case can change QR code size</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2011/12/30/handyman-lorem-ipsum/">Handyman lorem ipsum</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2024/02/08/pdf-forensics/">PDF metadata</a></li>
</ul>The post <a href="https://www.johndcook.com/blog/2026/05/05/changing-one-character-in-a-pdf/">Changing one character in a PDF</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>The shape of a guitar pick</title>
		<link>https://www.johndcook.com/blog/2026/05/03/guitar-pick/</link>
					<comments>https://www.johndcook.com/blog/2026/05/03/guitar-pick/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Sun, 03 May 2026 20:49:41 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246998</guid>

					<description><![CDATA[<p>I saw a post on X that plotted the function (log x)² + (log y)² = 1. Of course the plot of x² + y² = 1 is a circle, but I never thought what taking logs would do to the shape. Here&#8217;s what the contours look like setting the right hand side equal to 1, 2, [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/05/03/guitar-pick/">The shape of a guitar pick</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>I saw a <a href="https://x.com/sculd_88/status/2050841316435841231">post</a> on X that plotted the function</p>
<p style="padding-left: 40px;">(log <em>x</em>)² + (log <em>y</em>)² = 1.</p>
<p>Of course the plot of</p>
<p style="padding-left: 40px;"><em>x</em>² + <em>y</em>² = 1</p>
<p>is a circle, but I never thought what taking logs would do to the shape.</p>
<p>Here&#8217;s what the contours look like setting the right hand side equal to 1, 2, 3, …, 10.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/logcircle.png" width="360" height="363" /></p>
<pre>ContourPlot[Log[x]^2 + Log[y]^2, {x, 0, 10}, {y, 0, 10}, 
    Contours -&gt; Range[10]]</pre>
<p>The dark blue contour near the origin reminded me of a guitar pick, so I decided to take a stab at creating an equation for the shape of a guitar pick.</p>
<p>I wanted to rotate the image so the axis of symmetry for the pick is vertical, so I replaced <em>x</em> and <em>y</em> with <em>x</em> + <em>y</em> and <em>x</em> − <em>y</em>.</p>
<p>The aspect ratio was too wide, so I experimented with</p>
<p style="padding-left: 40px;">(log(<em>y</em> + <em>kx</em>))² + (log(<em>y</em> − <em>kx</em>))² = <em>r</em>²</p>
<p>where increasing <em>k</em> increases the height-to-width ratio. After a little experimentation I settled on <em>k</em> = 1.5 and <em>r</em> = 1.</p>
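<p>A quick Python check of this equation (the test values are mine): the curve is symmetric in <em>x</em>, and on the axis of symmetry it reduces to 2(log <em>y</em>)² = <em>r</em>², so for <em>r</em> = 1 the curve crosses the <em>y</em>-axis at <em>y</em> = exp(±1/√2).</p>

```python
import math

def g(x, y, k=1.5):
    # Left side of (log(y + kx))^2 + (log(y - kx))^2 = r^2
    return math.log(y + k*x)**2 + math.log(y - k*x)**2

print(g(0.3, 1.2), g(-0.3, 1.2))       # equal, by symmetry in x
print(g(0, math.exp(1/math.sqrt(2))))  # 1.0: a point on the r = 1 curve
```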
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/logcircle2.png" width="360" height="360" /></p>
<p>This has an aspect ratio of roughly 5:4, which is about what I measured from a photo of a guitar pick.</p>
<h2>Update: refining the fit</h2>
<p>After posting this article on X, Paul Graham <a href="https://x.com/paulg/status/2051047606114005425?s=20">replied</a> with a photo of a Fender guitar pick with the image above overlaid. The fit was fairly good, but the aspect ratio wasn&#8217;t quite right.</p>
<p>So then I did a little research. The shape referred to in this post is known as the &#8220;351,&#8221; but even for the 351 shape the aspect ratio varies slightly between picks.</p>
<p>Setting <em>k</em> = 1.6 gives a better fit to Paul Graham&#8217;s pick.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/pgpick.png" width="360" height="422" /></p>
<p>The blue line represents my fit using <em>k</em> = 1.5 and the red line represents my fit using <em>k</em> = 1.6.</p>The post <a href="https://www.johndcook.com/blog/2026/05/03/guitar-pick/">The shape of a guitar pick</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/05/03/guitar-pick/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Approximating even functions by powers of cosine</title>
		<link>https://www.johndcook.com/blog/2026/04/30/burmanns-theorem/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Fri, 01 May 2026 00:06:12 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Special functions]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246995</guid>

					<description><![CDATA[<p>A couple days ago I wrote a post about turning a trick into a technique, finding another use for a clever way to construct simple, accurate approximations. I used as my example approximating the Bessel function J(x) with (1 + cos(x))/2. I learned via a helpful comment on Mathstodon that my approximation was the first-order [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/30/burmanns-theorem/">Approximating even functions by powers of cosine</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>A couple days ago I wrote a post about <a href="https://www.johndcook.com/blog/2026/04/28/even-series-trick/">turning a trick into a technique</a>, finding another use for a clever way to construct simple, accurate approximations. I used as my example approximating the Bessel function <em>J</em><sub>0</sub>(<em>x</em>) with (1 + cos(<em>x</em>))/2. I learned via a helpful <a href="https://mathstodon.xyz/@tpfto/116484963022062150">comment</a> on Mathstodon that my approximation was the first-order part of a more general series</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/burmann1.svg" alt="J_0(x) = 1 + \frac{\cos(x) - 1}{2} - \frac{(\cos(x) - 1)^2}{48} + \frac{7(\cos(x) - 1)^3}{1440} + \cdots" width="504" height="44" /></p>
<p>The first-order approximation has error <em>O</em>(<em>x</em><sup>4</sup>), as shown in the earlier post. Adding the second-order term makes the error <em>O</em>(<em>x</em><sup>6</sup>), and adding the third-order term makes it <em>O</em>(<em>x</em><sup>8</sup>).</p>
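<p>Here is a numerical check in pure Python (the truncation point and test value are my choices), computing <em>J</em><sub>0</sub> from its power series and comparing it with the third-order cosine series above:</p>

```python
import math

def J0(x, terms=20):
    # Power series: J_0(x) = sum (-1)^m (x/2)^(2m) / (m!)^2
    return sum((-1)**m * (x/2)**(2*m) / math.factorial(m)**2
               for m in range(terms))

def cos_series(x):
    # Third-order truncation of the series in powers of cos(x) - 1
    u = math.cos(x) - 1
    return 1 + u/2 - u**2/48 + 7*u**3/1440

print(abs(J0(0.5) - cos_series(0.5)))  # about 3e-7
```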
<p>I&#8217;ve written a few times about cosine approximations to the normal probability density. For example, see <a href="https://www.johndcook.com/blog/2023/07/05/normal-approximation/">this post</a>. We could use the same idea as the series above to approximate the normal density with a series of powers of cosine. This gives us</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/burmann2.svg" alt="\exp(-x^2/2) = 1 + (\cos(x) - 1) + \frac{(\cos(x) - 1)^2}{3} + \frac{2(\cos(x) - 1)^3}{45} + \cdots" width="569" height="44" /></p>
<p>and as before, the first, second, and third order truncated series have error <em>O</em>(<em>x</em><sup>4</sup>), <em>O</em>(<em>x</em><sup>6</sup>), and <em>O</em>(<em>x</em><sup>8</sup>).</p>
<p>The general theory behind what&#8217;s going on here is an extension of <a href="https://mathworld.wolfram.com/BuermannsTheorem.html">Bürmann&#8217;s theorem</a>. The original version of the theorem relies on a series inversion theorem that in turn relies on the approximating function, in our case cos(<em>x</em>) − 1, not having zero derivative at the center of the series. But there is a more general form of Bürmann&#8217;s theorem based on a <a href="https://www.johndcook.com/blog/2025/09/21/mollweide-for-high-latitudes/">more general form of series inversion</a>. We will always need a more general version of the theorem when working with even functions because even functions have zero derivative at zero.</p>
<p>Here&#8217;s another example, this time using the Bessel function <em>J</em><sub>1</sub>, an odd function, which does use the original version of Bürmann&#8217;s theorem to approximate <em>J</em><sub>1</sub> by powers of sine.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/burmann3.svg" alt="J_1(x) = \frac{1}{2} \sin(x) + \frac{1}{48} \sin^3(x) + \frac{17}{1920} \sin^5(x) + \cdots" width="406" height="41" /></p>
<p>In this case truncating the series after sin<sup><em>k</em></sup>(<em>x</em>) gives an error <em>O</em>(<em>x</em><sup><em>k</em> + 2</sup>).</p>
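<p>The same kind of numerical check works here (a pure Python sketch; the test values are mine):</p>

```python
import math

def J1(x, terms=20):
    # Power series: J_1(x) = sum (-1)^m (x/2)^(2m+1) / (m! (m+1)!)
    return sum((-1)**m * (x/2)**(2*m + 1)
               / (math.factorial(m) * math.factorial(m + 1))
               for m in range(terms))

def sin_series(x):
    # Truncation of the series in powers of sin(x) after the fifth power
    s = math.sin(x)
    return s/2 + s**3/48 + 17*s**5/1920

print(abs(J1(0.5) - sin_series(0.5)))
print(abs(J1(0.2) - sin_series(0.2)))  # much smaller, consistent with O(x^7)
```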
<p>You can find more on Bürmann&#8217;s theorem in <a href="https://www.johndcook.com/blog/2022/07/20/whittaker-and-watson/">Whittaker and Watson</a>.</p>The post <a href="https://www.johndcook.com/blog/2026/04/30/burmanns-theorem/">Approximating even functions by powers of cosine</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Three ways to differentiate ReLU</title>
		<link>https://www.johndcook.com/blog/2026/04/30/derivative-of-relu/</link>
					<comments>https://www.johndcook.com/blog/2026/04/30/derivative-of-relu/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Thu, 30 Apr 2026 14:51:08 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246993</guid>

					<description><![CDATA[<p>When a function is not differentiable in the classical sense there are multiple ways to compute a generalized derivative. This post will look at three generalizations of the classical derivative, each applied to the ReLU (rectified linear unit) function. The ReLU function is a commonly used activation function for neural networks. It&#8217;s also called the [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/30/derivative-of-relu/">Three ways to differentiate ReLU</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>When a function is not differentiable in the classical sense there are multiple ways to compute a generalized derivative. This post will look at three generalizations of the classical derivative, each applied to the ReLU (rectified linear unit) function. The ReLU function is a commonly used <a href="https://www.johndcook.com/blog/2023/07/01/activation-functions/">activation function</a> for neural networks. It&#8217;s also called the ramp function for obvious reasons.</p>
<p><img decoding="async" class="aligncenter" src="https://www.johndcook.com/ReLU_plot.png" /></p>
<p>The function is simply <em>r</em>(<em>x</em>) = max(0, <em>x</em>).</p>
<h2>Pointwise derivative</h2>
<p>The <strong>pointwise</strong> derivative would be 0 for <em>x</em> &lt; 0, 1 for <em>x</em> &gt; 0, and undefined at <em>x</em> = 0. So except at 0, the pointwise derivative of the ramp function is the <a href="https://www.johndcook.com/blog/2010/01/06/heaviside/">Heaviside</a> function.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/heaviside3.svg" alt="H(x) = \left\{ \begin{array}{ll} 1 &amp; \mbox{if } x \geq 0 \\ 0 &amp; \mbox{if } x &lt; 0 \end{array} \right." width="173" height="48" /><br />
In a real analysis course, you&#8217;d simply say <em>r</em>′(<em>x</em>) = <em>H</em>(<em>x</em>) because functions are only defined up to equivalence modulo sets of measure zero, i.e. the definition at <em>x</em> = 0 doesn&#8217;t matter.</p>
<h2>Distributional derivative</h2>
<p>In <strong>distribution theory</strong> you&#8217;d identify the function <em>r</em>(<em>x</em>) with the distribution whose action on a test function φ is</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/ramp_action.svg" alt="\langle r, \varphi \rangle = \int_{-\infty}^\infty r(x)\, \varphi(x) \, dx" width="207" height="45" /></p>
<p>Then the derivative of <em>r</em> would be the distribution <em>r</em>′ satisfying</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/ramp_derivative_action.svg" alt="\langle r^{\prime}, \varphi\rangle = -\langle r, \varphi^{\prime} \rangle" width="132" height="19" /></p>
<p>for all smooth functions φ with compact support. You can prove using integration by parts that the above equals the integral of φ from 0 to ∞, which is the same as the action of <em>H</em>(<em>x</em>) on φ.</p>
<p>In this case the distributional derivative of <em>r</em> is the same as the pointwise derivative of <em>r</em> interpreted as a distribution. This does not happen in general [1]. For example, the pointwise derivative of <em>H</em> is zero but the distributional derivative of <em>H</em> is δ, the Dirac delta distribution.</p>
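<p>The integration-by-parts identity can be verified numerically. In the Python sketch below, the test function φ(<em>x</em>) = exp(−<em>x</em>²) and the grid are my choices; strictly speaking φ should have compact support, but it decays fast enough for a numerical check. Both sides come out near √π/2.</p>

```python
import math

def relu(x):
    return max(0.0, x)

def phi(x):
    # Test function (rapidly decaying; stands in for compact support)
    return math.exp(-x*x)

def dphi(x):
    return -2*x*math.exp(-x*x)

# Crude Riemann sum on a wide grid
xs = [-10 + 0.001*i for i in range(20001)]
def integrate(f):
    return sum(0.001 * f(x) for x in xs)

lhs = -integrate(lambda x: relu(x) * dphi(x))             # -<r, phi'>
rhs = integrate(lambda x: (1 if x >= 0 else 0) * phi(x))  # <H, phi>
print(lhs, rhs)  # both near sqrt(pi)/2 = 0.8862...
```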
<p>For more on distributional derivatives, see <a href="https://www.johndcook.com/blog/2009/10/25/how-to-differentiate-a-non-differentiable-function/">How to differentiate a non-differentiable function</a>.</p>
<h2>Subgradient</h2>
<p>The subgradient of a function <em>f</em> at a point <em>x</em>, written ∂<em>f</em>(<em>x</em>), is the set of slopes of tangent lines to the graph of <em>f</em> at <em>x</em>. If <em>f</em> is differentiable at <em>x</em>, then there is only one slope, namely <em>f</em>′(<em>x</em>), and we typically say the subgradient of <em>f</em> at <em>x</em> is simply <em>f</em>′(<em>x</em>) when strictly speaking we should say it is the one-element set {<em>f</em>′(<em>x</em>)}.</p>
<p>A line tangent to the graph of the ReLU function at a negative value of <em>x</em> has slope 0, and a tangent line at a positive <em>x</em> has slope 1. But because there&#8217;s a sharp corner at <em>x</em> = 0, a tangent at this point could have any slope between 0 and 1.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/subgradient.svg" alt="\partial f(x) = \left\{ \begin{array}{cl} 1 &amp; \text{if } x &gt; 0 \\ \left[0,1\right] &amp; \text{if } x = 0 \\ 0 &amp; \text{if } x &lt; 0 \end{array} \right." width="214" height="73" /></p>
<p>My dissertation was full of subgradients of convex functions. This made me uneasy because subgradients are not real-valued functions; they&#8217;re set-valued functions. Most of the time you can blithely ignore this distinction, but there&#8217;s always a nagging suspicion that it&#8217;s going to bite you unexpectedly.</p>
<p>&nbsp;</p>
<p>[1] When <em>is</em> the pointwise derivative of <em>f</em> as a function equal to the derivative of <em>f</em> as a distribution? It&#8217;s not enough for <em>f</em> to be continuous, but it is sufficient for <em>f</em> to be <em>absolutely</em> continuous.</p>The post <a href="https://www.johndcook.com/blog/2026/04/30/derivative-of-relu/">Three ways to differentiate ReLU</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/30/derivative-of-relu/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>Turning a trick into a technique</title>
		<link>https://www.johndcook.com/blog/2026/04/28/even-series-trick/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Tue, 28 Apr 2026 21:44:58 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Special functions]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246992</guid>

					<description><![CDATA[<p>Someone said a technique is a trick that works twice. I wanted to see if I could get anything interesting by turning the trick in the previous post into a technique. The trick created a high-order approximation by subtracting a multiple of one even function from another. Even functions only have even-order terms, and by using [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/28/even-series-trick/">Turning a trick into a technique</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Someone said a technique is a trick that works twice.</p>
<p>I wanted to see if I could get anything interesting by turning the trick in the <a href="https://www.johndcook.com/blog/2026/04/28/circular-arc-approximation/">previous post</a> into a technique. The trick created a high-order approximation by subtracting a multiple of one even function from another. Even functions only have even-order terms, and by using the right multiple you can cancel out the second-order term as well.</p>
<p>For an example, I&#8217;d like to approximate the Bessel function <em>J</em><sub>0</sub>(<em>x</em>) by the better known cosine function. Both are even functions.</p>
<p style="padding-left: 40px;"><em>J</em><sub>0</sub>(<em>x</em>) = 1 − <em>x</em><sup>2</sup>/4 + <em>x</em><sup>4</sup>/64 + …<br />
cos(<em>x</em>) = 1 − <em>x</em><sup>2</sup>/2 + <em>x</em><sup>4</sup>/24 + …</p>
<p>and so</p>
<p style="padding-left: 40px;">2 <em>J</em><sub>0</sub>(<em>x</em>) − cos(<em>x</em>) = 1 − <em>x</em><sup>4</sup>/96 + …</p>
<p>which means</p>
<p style="padding-left: 40px;"><em>J</em><sub>0</sub>(<em>x</em>) ≈ (1 + cos(<em>x</em>))/2</p>
<p>is an excellent approximation for small <em>x</em>.</p>
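<p>This is easy to check numerically. Here&#8217;s a short script (a sketch, using SciPy&#8217;s <code>j0</code> for the Bessel function) comparing the two sides. The leading error term is &#8722;<em>x</em><sup>4</sup>/192, half of the &#8722;<em>x</em><sup>4</sup>/96 above because of the division by 2.</p>
<pre>from numpy import cos
from scipy.special import j0

for x in [0.4, 0.2, 0.1]:
    approx = (1 + cos(x))/2
    # error closely tracks the leading term -x**4/192
    print(x, j0(x) - approx, -x**4/192)
</pre>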
<p>Let&#8217;s try this for a couple examples.</p>
<p><em>J</em><sub>0</sub>(0.2) = 0.990025 and (1 + cos(0.2))/2 = 0.990033.</p>
<p><em>J</em><sub>0</sub>(0.05) = 0.99937510 and (1 + cos(0.05))/2 = 0.99937513.</p>The post <a href="https://www.johndcook.com/blog/2026/04/28/even-series-trick/">Turning a trick into a technique</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Circular arc approximation</title>
		<link>https://www.johndcook.com/blog/2026/04/28/circular-arc-approximation/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Tue, 28 Apr 2026 13:09:42 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Geometry]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246991</guid>

					<description><![CDATA[<p>Suppose you have an arc a, a portion of a circle of radius r, and you know two things: the length c of the chord of the arc, and the length b of the chord of half the arc, illustrated below. Here θ is the central angle of the arc. Then the length of the arc, rθ, [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/28/circular-arc-approximation/">Circular arc approximation</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Suppose you have an arc <em>a</em>, a portion of a circle of radius <em>r</em>, and you know two things: the length <em>c</em> of the chord of the arc, and the length <em>b</em> of the chord of half the arc, illustrated below.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/circular_arc_approx1.png" width="360" height="360" /></p>
<p>Here θ is the central angle of the arc. Then the length of the arc, <em>r</em>θ, is approximately</p>
<p style="padding-left: 40px;"><span style="background-color: #dee;"><em>a</em> = <em>r</em>θ ≈ 12 <em>b</em>²/(<em>c</em> + 4<em>b</em>).</span></p>
<p>If the arc is moderately small, the approximation is very accurate.</p>
<p>This approximation is simple, accurate, and not obvious, much like the one in <a href="https://www.johndcook.com/blog/2026/04/23/solve-a-right-triangle/">this post</a>.</p>
<h2>Derivation</h2>
<p>Let φ = θ/4. Then the angle between the chords <em>b</em> and <em>c</em> is φ. This follows from the inscribed angle theorem, illustrated below.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/circular_arc_approx2.png" width="360" height="360" /></p>
<p>There are two right triangles in the diagram above that have an angle φ: a smaller triangle with hypotenuse <em>b</em> and a larger triangle with hypotenuse 2<em>r</em>. From the smaller triangle we learn</p>
<p style="padding-left: 40px;">cos(φ) = <em>c</em> / 2<em>b</em></p>
<p>and from the larger triangle we learn</p>
<p style="padding-left: 40px;">sin(φ) = <em>b</em> / 2<em>r</em>.</p>
<p>Now expand in power series.</p>
<p style="padding-left: 40px;"><em>c</em> / 2<em>b</em> = cos(φ) = 1 − φ<sup>2</sup>/2! + φ<sup>4</sup>/4! − …<br />
2<em>b</em> / <em>a</em> = sin(φ) / φ = 1 − φ<sup>2</sup>/3! + φ<sup>4</sup>/5! − …</p>
<p>If we multiply 2<em>b</em> / <em>a</em> by 3 and subtract <em>c</em> / 2<em>b</em> then the φ<sup>2</sup> terms cancel out and we get</p>
<p style="padding-left: 40px;">6<em>b</em> / <em>a</em> − <em>c</em> / 2<em>b</em> = 2 − φ<sup>4</sup>/60 + …</p>
<p>and so</p>
<p style="padding-left: 40px;">6<em>b</em> / <em>a</em> − <em>c</em> / 2<em>b</em> ≈ 2</p>
<p>to a very high degree of accuracy when φ is small. The approximation follows by solving for <em>a</em>.</p>
<h2>Example</h2>
<p>Let θ = π/3 and so φ = 0.26…, not a particularly small value of φ, but small enough for the approximation to work well.</p>
<p>Set <em>r</em> = 1 so <em>a</em> = θ. Then</p>
<p style="padding-left: 40px;"><em>b</em> = 2 sin(π/12) = 0.51764</p>
<p>and</p>
<p style="padding-left: 40px;"><em>c</em> = 2<em>b</em> cos(π/12) = 1.</p>
<p>Now in application, we know <em>b</em> and <em>c</em>, not θ, and so pretend we <strong>measured</strong> <em>b</em> = 0.51764 and <em>c</em> = 1. Then we would approximate <em>a</em> by</p>
<p style="padding-left: 40px;">12<em>b</em>²/(<em>c</em> + 4<em>b</em>) = 1.04718</p>
<p>while the exact value is 1.04720. Unless you can measure lengths to more than four significant figures, the approximation might as well be exact because the approximation error would be less than the measurement error.</p>
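<p>Here&#8217;s the example above reproduced in a few lines of Python, a sketch using the half-angle relations from the derivation to generate <em>b</em> and <em>c</em>.</p>
<pre>from numpy import pi, sin, cos

theta = pi/3
r = 1
b = 2*r*sin(theta/4)       # chord of half the arc
c = 2*b*cos(theta/4)       # chord of the full arc
approx = 12*b**2/(c + 4*b)
print(approx, r*theta)     # both approximately 1.0472
</pre>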
<p>&nbsp;</p>
<p>[1] J. M. Bruce. Approximation to a Circular Arc. The American Mathematical Monthly. Vol. 49, No. 3 (March 1942), pp. 184–185</p>The post <a href="https://www.johndcook.com/blog/2026/04/28/circular-arc-approximation/">Circular arc approximation</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Closed-form solution to the nonlinear pendulum equation</title>
		<link>https://www.johndcook.com/blog/2026/04/25/exact-solution-nonlinear-pendulum/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Sat, 25 Apr 2026 16:16:12 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Differential equations]]></category>
		<category><![CDATA[Special functions]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246989</guid>

					<description><![CDATA[<p>The previous post looks at the nonlinear pendulum equation and what difference it makes to the solutions if you linearize the equation. If the initial displacement is small enough, you can simply replace sin θ with θ. If the initial displacement is larger, you can improve the accuracy quite a bit by solving the linearized [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/25/exact-solution-nonlinear-pendulum/">Closed-form solution to the nonlinear pendulum equation</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The <a href="https://www.johndcook.com/blog/2026/04/24/nonlinear-pendulum/">previous post</a> looks at the nonlinear pendulum equation and what difference it makes to the solutions if you linearize the equation.</p>
<p>If the initial displacement is small enough, you can simply replace sin θ with θ. If the initial displacement is larger, you can improve the accuracy quite a bit by solving the linearized equation and then adjusting the period.</p>
<p>You can also find an exact solution, but not in terms of elementary functions; you have to use Jacobi elliptic functions. These are functions somewhat analogous to trig functions, though it&#8217;s not helpful to try to pin down the analogies. For example, the Jacobi function sn is like the sine function in some ways but very different in others, depending on the range of arguments.</p>
<p>We start with the differential equation</p>
<p style="padding-left: 40px;">θ″(<em>t</em>) + <em>c</em>² sin( θ(<em>t</em>) ) = 0</p>
<p>where <em>c</em>² = <em>g</em>/<em>L</em>, i.e. the gravitational constant divided by pendulum length, and initial conditions θ(0) = θ<sub>0</sub> and θ′(0) = 0. We assume −π &lt; θ<sub>0</sub> &lt; π.</p>
<p>Then the solution is</p>
<p style="padding-left: 40px;">θ(<em>t</em>) = 2 arcsin( <em>a</em> cd(<em>c</em><em>t</em> | <em>m</em> ) )</p>
<p>where <em>a</em> = sin(θ<sub>0</sub>/2), <em>m</em> = <em>a</em>², and cd is one of the 12 Jacobi elliptic functions. Note that cd, like all the Jacobi functions, has an argument and a parameter. In the equation above the argument is <em>ct</em> and the parameter is <em>m</em>.</p>
<p>The last plot in the previous post was misleading, showing roughly equal parts genuine difference and error from solving the differential equation numerically. Here&#8217;s the code that was used to solve the nonlinear equation.</p>
<pre>from scipy.special import ellipj, ellipk
from numpy import sin, cos, pi, linspace, arcsin
from scipy.integrate import solve_ivp

def exact_period(θ):
    return 2*ellipk(sin(θ/2)**2)/pi

def nonlinear_ode(t, z):
    x, y = z
    return [y, -sin(x)]    

theta0 = pi/3
b = 2*pi*exact_period(theta0)
t = linspace(0, 2*b, 2000)

sol = solve_ivp(nonlinear_ode, [0, 2*b], [theta0, 0], t_eval=t)
</pre>
<p>The solution is contained in <code>sol.y[0]</code>.</p>
<p>Let&#8217;s compare the numerical solution to the exact solution.</p>
<pre>def f(t, c, theta0):
    a = sin(theta0/2)
    m = a**2
    sn, cn, dn, ph = ellipj(c*t, m)
    return 2*arcsin(a*cn/dn)
</pre>
<p>There are a couple things to note about the code. First, SciPy doesn&#8217;t implement the cd function, but it can be computed as cn/dn. Second, the function <code>ellipj</code> returns four functions at once because it takes about as much time to calculate all four as it does to compute one of them.</p>
<p>Here is a plot of the error in solving the differential equation.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinpend4.png" width="480" height="360" /></p>
<p>And here is the difference between the exact solution to the nonlinear pendulum equation and the stretched solution to the linear equation.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinpend5.png" width="480" height="360" /></p>The post <a href="https://www.johndcook.com/blog/2026/04/25/exact-solution-nonlinear-pendulum/">Closed-form solution to the nonlinear pendulum equation</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>nth derivative of a quotient</title>
		<link>https://www.johndcook.com/blog/2026/04/25/nth-derivative-of-a-quotient/</link>
					<comments>https://www.johndcook.com/blog/2026/04/25/nth-derivative-of-a-quotient/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Sat, 25 Apr 2026 13:12:39 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246988</guid>

					<description><![CDATA[<p>There&#8217;s a nice formula for the nth derivative of a product. It looks a lot like the binomial theorem. There is also a formula for the nth derivative of a quotient, but it&#8217;s more complicated and less known. We start by writing the quotient rule in an unusual way. Applying the quotient rule twice gives the following. [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/25/nth-derivative-of-a-quotient/">nth derivative of a quotient</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>There&#8217;s a nice formula for the <em>n</em>th derivative of a product. It looks a lot like the binomial theorem.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/leibniz.svg" alt="(gh)^{(n)} = \sum_{k=0}^n \binom{n}{k} g^{(k)} h^{(n-k)}" width="212" height="54" /></p>
<p>There is also a formula for the <em>n</em>th derivative of a quotient, but it&#8217;s more complicated and less known.</p>
<p>We start by writing the quotient rule in an unusual way.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/quotient1.svg" alt="\left(\frac{g}{h}\right)^{(1)} = \frac{1}{h^2} \left| \begin{array}{cc} h &amp; g \\ h^\prime &amp; g^\prime \\ \end{array} \right|" width="171" height="48" /></p>
<p>Applying the quotient rule twice gives the following.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/quotient2.svg" alt="\left(\frac{g}{h}\right)^{(2)} = \frac{1}{h^3} \left| \begin{array}{ccc} h &amp; 0 &amp; g \\ h^\prime &amp; h &amp; g^\prime \\ h^{\prime\prime} &amp; 2h^\prime &amp; g^{\prime\prime} \\ \end{array} \right|" width="222" height="72" /></p>
<p>And here&#8217;s the general rule in all its glory.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/quotientn.svg" alt="\left(\frac{g}{h}\right)^{(n)} = \frac{1}{h^{\,n+1}} \left| \begin{array}{cccccc} h &amp; 0 &amp; 0 &amp; \cdots &amp; 0 &amp; g \\[3pt] h^\prime &amp; h &amp; 0 &amp; \cdots &amp; 0 &amp; g^\prime \\[3pt] h^{\prime\prime} &amp; 2h^\prime &amp; h &amp; \cdots &amp; 0 &amp; g^{\prime\prime} \\[3pt] \cdots &amp; \cdots &amp; \cdots &amp; \cdots &amp; \cdots &amp; \cdots \\[3pt] h^{(n)} &amp; \binom{n}{1}h^{(n-1)} &amp; \binom{n}{2}h^{(n-2)} &amp; \cdots &amp; \binom{n}{1}h^\prime &amp; g^{(n)} \end{array} \right|" width="492" height="144" /></p>
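<p>The formula can be checked symbolically. Here&#8217;s a sketch using SymPy that verifies the <em>n</em> = 2 case against direct differentiation.</p>
<pre>from sympy import symbols, Function, Matrix, simplify

x = symbols('x')
g = Function('g')(x)
h = Function('h')(x)

# determinant from the n = 2 formula above
det = Matrix([
    [h,            0,           g          ],
    [h.diff(x),    h,           g.diff(x)  ],
    [h.diff(x, 2), 2*h.diff(x), g.diff(x, 2)]
]).det()

# difference between direct differentiation and det/h^3
print(simplify((g/h).diff(x, 2) - det/h**3))  # 0
</pre>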
<p>&nbsp;</p>
<p>Source: V. F. Ivanoff. The <em>n</em>th Derivative of a Fractional Function. The American Mathematical Monthly, Vol. 55, No. 8 (Oct., 1948), p. 491</p>The post <a href="https://www.johndcook.com/blog/2026/04/25/nth-derivative-of-a-quotient/">nth derivative of a quotient</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/25/nth-derivative-of-a-quotient/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>How nonlinearity affects a pendulum</title>
		<link>https://www.johndcook.com/blog/2026/04/24/nonlinear-pendulum/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Fri, 24 Apr 2026 23:53:29 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Differential equations]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246986</guid>

					<description><![CDATA[<p>The equation of motion for a pendulum is the differential equation where g is the acceleration due to gravity and ℓ is the length of the pendulum. When this is presented in an introductory physics class, the instructor will immediately say something like &#8220;we&#8217;re only interested in the case where θ is small, so we can [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/24/nonlinear-pendulum/">How nonlinearity affects a pendulum</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The equation of motion for a pendulum is the differential equation</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" style="background-color: white;" src="https://www.johndcook.com/nonlinear_pendulum.svg" alt="\theta'' + \frac{g}{\ell}\sin \theta = 0" width="119" height="33" /></p>
<p>where <em>g</em> is the acceleration due to gravity and ℓ is the length of the pendulum. When this is presented in an introductory physics class, the instructor will immediately say something like &#8220;we&#8217;re only interested in the case where θ is small, so we can rewrite the equation as</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" style="background-color: white;" src="https://www.johndcook.com/linear_pendulum.svg" alt="\theta'' + \frac{g}{\ell} \theta = 0" width="91" height="33" /></p>
<h2>Questions</h2>
<p>This raises a lot of questions, or at least it should.</p>
<ol>
<li>Why not leave sin θ alone?</li>
<li>What justifies replacing sin θ with just θ?</li>
<li>How small does θ have to be for this to be OK?</li>
<li>How do the solutions to the exact and approximate equations differ?</li>
</ol>
<p>First, sine is a nonlinear function, making the differential equation nonlinear. The nonlinear pendulum equation cannot be solved using mathematics that students in an introductory physics class have seen. There is a <a href="https://www.johndcook.com/blog/2026/04/25/exact-solution-nonlinear-pendulum/">closed-form solution</a>, but only if you extend &#8220;closed-form&#8221; to mean more than the elementary functions a student would see in a calculus class.</p>
<p>Second, the approximation is justified because sin θ ≈ θ when θ is small. That&#8217;s true, but it&#8217;s kinda subtle. Here&#8217;s a <a href="https://www.johndcook.com/blog/2010/07/27/sine-approximation-for-small-x/">post</a> unpacking that.</p>
<p>The third question doesn&#8217;t have a simple answer, though simple answers are often given. An instructor could make up an answer on the spot and say &#8220;less than 10 degrees&#8221; or something like that. A more thorough answer requires answering the fourth question.</p>
<p>I addressed how nonlinearity affects the solutions in a <a href="https://www.johndcook.com/blog/2023/11/20/nonlinear-pendulum-period/">post</a> a couple of years ago. This post will expand a bit on that post.</p>
<h2>Longer period</h2>
<p>The primary difference between the nonlinear and linear pendulum equations is that the solutions to the nonlinear equation have longer periods. The solution to the linear equation is a cosine. Solving the equation determines the frequency, amplitude, and phase shift of the cosine, but qualitatively it&#8217;s just a cosine. The solution to the corresponding nonlinear equation, with sin θ rather than θ, is not exactly a cosine, but it looks a lot like a cosine, only the period is a little longer [1].</p>
<p>OK, the nonlinear pendulum has a longer period, but how much longer? The period is increased by a factor <em>f</em>(θ<sub>0</sub>) where θ<sub>0</sub> is the initial displacement.</p>
<p>You can find the exact answer in my <a href="https://www.johndcook.com/blog/2023/11/20/nonlinear-pendulum-period/">earlier post</a>. The exact answer depends on a special function called the “complete elliptic integral of the first kind,” but to a good approximation</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinear_pendulum_period3.svg" alt="f(\theta) \approx \frac{1}{\sqrt{\cos(\theta/2)}}" width="141" height="45" /></p>
<p>The earlier post compares this approximation to the exact function.</p>
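<p>Here&#8217;s how to compute both factors in Python, a sketch using SciPy. Note that <code>ellipk</code> takes the parameter <em>m</em> = sin²(θ<sub>0</sub>/2), not the modulus.</p>
<pre>from numpy import pi, sin, cos, sqrt
from scipy.special import ellipk

def period_factor(theta0):
    # exact ratio of nonlinear to linear period: (2/pi) K(m)
    return 2*ellipk(sin(theta0/2)**2)/pi

def period_factor_approx(theta0):
    return 1/sqrt(cos(theta0/2))

print(period_factor(pi/3))         # about 1.0732
print(period_factor_approx(pi/3))  # about 1.0746
</pre>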
<h2>Linear solution with adjusted period</h2>
<p>Since the nonlinear pendulum equation is roughly the same as the linear equation with a longer period, you can approximate the solution to the nonlinear equation by solving the linear equation but increasing the period. How good is that approximation?</p>
<p>Let&#8217;s do an example with θ<sub>0</sub> = 60° = π/3 radians. Then sin θ<sub>0</sub> = 0.866 but θ<sub>0</sub>, in radians, is 1.047, so we definitely can&#8217;t say sin θ<sub>0</sub> is approximately θ<sub>0</sub>. To make things simple, let&#8217;s set ℓ = <em>g</em>. Also, assume the pendulum starts from rest, i.e. θ'(0) = 0.</p>
<p>Here&#8217;s a plot of the solutions to the nonlinear and linear equations.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinpend1.png" width="480" height="360" /></p>
<p>Obviously the solution to the nonlinear equation has a longer period. In fact it&#8217;s 7.32% longer. (The approximation above would have estimated 7.46%.)</p>
<p>Here&#8217;s a plot comparing the solution of the nonlinear equation and the solution to the linear equations with period stretched by 7.32%.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinpend2.png" width="480" height="360" /></p>
<p>The solutions differ by less than the width of the plotting line, so it&#8217;s too small to see. But we can see there&#8217;s a difference when we subtract the two solutions.</p>
<p>Here&#8217;s a plot of the difference between the two solutions.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinpend3.png" width="480" height="360" /></p>
<p><b>Update</b>: The plot above is misleading. Part of what it shows is numerical error from solving the pendulum equation. When we redo the plot using the exact solution the error is about half as large. And the error is periodic, as we&#8217;d expect. See <a href="https://www.johndcook.com/blog/2026/04/25/exact-solution-nonlinear-pendulum/">this post</a> for more on the exact solution using Jacobi functions and the differential equation solver that was used to make the original plot.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/nonlinpend5.png" width="480" height="360" /></p>
<h2>Related posts</h2>
<ul>
<li class="link"><a href="https://www.johndcook.com/blog/2013/02/19/mechanical-vibrations/">Mechanical vibrations</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2025/12/23/bowie-integrator-and-the-nonlinear-pendulum/">Bowie integrator and the nonlinear pendulum</a></li>
</ul>
<p>[1] The period of a pendulum depends on its length ℓ, and so we can think of the nonlinear term effectively replacing ℓ by a longer effective length ℓ<sub>eff</sub>.</p>
<p>&nbsp;</p>The post <a href="https://www.johndcook.com/blog/2026/04/24/nonlinear-pendulum/">How nonlinearity affects a pendulum</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Approximation to solve an oblique triangle</title>
		<link>https://www.johndcook.com/blog/2026/04/23/solve-an-oblique-triangle/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Thu, 23 Apr 2026 15:56:02 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246984</guid>

					<description><![CDATA[<p>The previous post gave a simple and accurate approximation for the smaller angle of a right triangle. Given a right triangle with sides a, b, and c, where a is the shortest side and c is the hypotenuse, the angle opposite side a is approximately in radians. The previous post worked in degrees, but here we&#8217;ll use radians. If the [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/23/solve-an-oblique-triangle/">Approximation to solve an oblique triangle</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The <a href="https://www.johndcook.com/blog/2026/04/23/solve-a-right-triangle/">previous post</a> gave a simple and accurate approximation for the smaller angle of a right triangle. Given a right triangle with sides <em>a</em>, <em>b</em>, and <em>c</em>, where <em>a</em> is the shortest side and <em>c</em> is the hypotenuse, the angle opposite side <em>a</em> is approximately</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/oblique1.svg" alt="A \approx \frac{3a}{b + 2c}" width="87" height="40" /></p>
<p>in radians. The previous post worked in degrees, but here we&#8217;ll use radians.</p>
<p>If the triangle is oblique rather than right, there is an approximation for the angle <em>A</em> that doesn&#8217;t require inverse trig functions, though it does require square roots. The approximation is derived in [1] using the same series that is the basis of the approximation in the earlier post, the power series for 2 csc(<em>x</em>) + cot(<em>x</em>).</p>
<p>For an oblique triangle, the approximation is<img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/oblique2.svg" alt="A \approx \frac{6 \sqrt{(s - b)(s - c)}}{2\sqrt{bc} + \sqrt{s(s-a)}}" width="179" height="57" /></p>
<p>where <em>s</em> is the semiperimeter.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/oblique3.svg" alt="s = \frac{a + b + c}{2}" width="101" height="41" /></p>
<p>For comparison, we can find the exact value of <em>A</em> using the law of cosines.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/oblique4.svg" alt="a^2 = b^2 + c^2 - 2 bc \cos A" width="191" height="18" /></p>
<p>and so</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/oblique5.svg" alt="A = \cos^{-1}\left(\frac{b^2 + c^2 - a^2}{2bc}\right)" width="199" height="49" /></p>
<p>Here&#8217;s a little Python script to see how accurate the approximation is.</p>
<pre>from math import sqrt, acos

def approx(a, b, c):
    "approximate the angle opposite a"
    s = (a + b + c)/2
    return 6*sqrt((s - b)*(s - c)) / (2*sqrt(b*c) + sqrt(s*(s - a)))

def exact(a, b, c):
    "exact value of the angle opposite a"    
    return acos((b**2 + c**2 - a**2)/(2*b*c))

a, b, c = 6, 7, 12
print( approx(a, b, c) )
print( exact(a, b, c) )
</pre>
<p>This prints</p>
<pre>0.36387538476776243
0.36387760856668505
</pre>
<p>showing that in our example the approximation is good to five decimal places.</p>
<p>[1] H. E. Stelson. Note on the approximate solution of an oblique triangle without tables. American Mathematical Monthly. Vol 56, No. 2 (February, 1949), pp. 84–95.</p>The post <a href="https://www.johndcook.com/blog/2026/04/23/solve-an-oblique-triangle/">Approximation to solve an oblique triangle</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Simple approximation for solving a right triangle</title>
		<link>https://www.johndcook.com/blog/2026/04/23/solve-a-right-triangle/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Thu, 23 Apr 2026 12:13:35 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246983</guid>

					<description><![CDATA[<p>Suppose you have a right triangle with sides a, b, and c, where a is the shortest side and c is the hypotenuse. Then the following approximation from [1] for the angle A opposite side a seems too simple and too accurate to be true. In degrees, A ≈ a 172° / (b + 2c). The approximation above only involves simple [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/23/solve-a-right-triangle/">Simple approximation for solving a right triangle</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Suppose you have a right triangle with sides <em>a</em>, <em>b</em>, and <em>c</em>, where <em>a</em> is the shortest side and <em>c</em> is the hypotenuse. Then the following approximation from [1] for the angle <em>A</em> opposite side <em>a</em> seems too simple and too accurate to be true. In degrees,</p>
<p style="padding-left: 40px;"><em>A</em> ≈ <em>a</em> 172° / (<em>b</em> + 2<em>c</em>).</p>
<p>The approximation above only involves simple arithmetic. No trig functions. Not even a square root. It could be carried out with pencil and paper or even mentally. And yet it is surprisingly accurate.</p>
<p>If we use the 3, 4, 5 triangle as an example, the exact value of the smallest angle is</p>
<p style="padding-left: 40px;"><em>A</em> = arctan(3/4) × 180°/π ≈ 36.8699°</p>
<p>and the approximate value is</p>
<p style="padding-left: 40px;"><em>A</em> ≈  3 × 172° / (4 + 2×5) = 258°/7 ≈ 36.8571°,</p>
<p>a difference of 0.0128°. When the angle is more acute the approximation is even better.</p>
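<p>Here&#8217;s the comparison above in a few lines of Python, a sketch where <code>atan</code> supplies the exact angle for reference.</p>
<pre>from math import atan, degrees

def approx_angle(a, b, c):
    # no trig functions, not even a square root
    return a*172/(b + 2*c)

a, b, c = 3, 4, 5
print(approx_angle(a, b, c))   # 36.857...
print(degrees(atan(a/b)))      # 36.870...
</pre>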
<h2>Derivation</h2>
<p>Where does this magical approximation come from? It boils down to the series</p>
<p style="padding-left: 40px;">2 csc(<em>x</em>) + cot(<em>x</em>) = 3/<em>x</em> + <em>x</em>³/60 + <em>O</em>(<em>x</em><sup>5</sup>)</p>
<p>where <em>x</em> is in radians. When <em>x</em> is small, <em>x</em>³/60 is extremely small and so we have</p>
<p style="padding-left: 40px;">2 csc(<em>x</em>) + cot(<em>x</em>) ≈ 3/<em>x.</em></p>
<p>Apply this approximation with csc(<em>x</em>) = <em>c</em>/<em>a</em> and cot(<em>x</em>) = <em>b</em>/<em>a</em>, and you have</p>
<p style="padding-left: 40px;"><em>x</em> ≈ 3<em>a</em>/(<em>b</em> + 2<em>c</em>)</p>
<p>in radians. Multiply by 180°/π to convert to degrees, and note that 540/π ≈ 172.</p>
<h2>Discovery</h2>
<p>It&#8217;s unmotivated to say &#8220;just expand 2 csc(<em>x</em>) + cot(<em>x</em>) in a series.&#8221; Where did <em>that</em> come from?</p>
<p>There&#8217;s a line in [1] that says &#8220;It can be seen, either from tables or from a consideration of power series that the radian measure of a small angle lies approximately one-third of the way from the sine to the tangent.&#8221; In other words</p>
<p style="padding-left: 40px;">3<em>x</em> ≈ 2 sin(<em>x</em>) + tan(<em>x</em>)<em>.</em></p>
<p>You can verify that by adding the power series and noting that the cubic terms cancel out.</p>
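<p>Here is a quick check with SymPy (my sketch, not from the original article) confirming that the cubic terms cancel:</p>

```python
from sympy import symbols, sin, tan, series

x = symbols('x')

# The x^3 terms of 2 sin(x) and tan(x) cancel, leaving 3x + O(x^5)
expansion = series(2*sin(x) + tan(x), x, 0, 5)
print(expansion)  # 3*x + O(x**5)
```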
<p>But that&#8217;s just the beginning. The author then makes the leap to conjecturing that if the weighted <em>arithmetic</em> mean gives a good approximation, maybe the weighted <em>harmonic</em> mean gives an even better approximation, and that leads to considering</p>
<p style="padding-left: 40px;">2 csc(<em>x</em>) + cot(<em>x</em>) ≈ 3/<em>x.</em></p>
<h2>Extension</h2>
<p>See the <a href="https://www.johndcook.com/blog/2026/04/23/solve-an-oblique-triangle/">next post</a> for an extension to oblique triangles. Not as simple, but based on the same trick.</p>
<p>[1] J. S. Frame. Solving a right triangle without tables. The American Mathematical Monthly, Vol. 50, No. 10 (Dec., 1943), pp. 622-626</p>The post <a href="https://www.johndcook.com/blog/2026/04/23/solve-a-right-triangle/">Simple approximation for solving a right triangle</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>An AI Odyssey, Part 4: Astounding Coding Agents</title>
		<link>https://www.johndcook.com/blog/2026/04/21/an-ai-odyssey-part-4-astounding-coding-agents/</link>
					<comments>https://www.johndcook.com/blog/2026/04/21/an-ai-odyssey-part-4-astounding-coding-agents/#comments</comments>
		
		<dc:creator><![CDATA[Wayne Joubert]]></dc:creator>
		<pubDate>Tue, 21 Apr 2026 20:14:40 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Productivity]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Codex]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Software tools]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246930</guid>

					<description><![CDATA[<p>AI coding agents improved greatly last summer, and again last December-January. Here are my experiences since my last post on the subject. The models feel subjectively much smarter. They can accomplish a much broader range of tasks. They seem to have a larger, more comprehensive in-depth view of the code base and what you are [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/21/an-ai-odyssey-part-4-astounding-coding-agents/">An AI Odyssey, Part 4: Astounding Coding Agents</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>AI coding agents improved greatly last summer, and again last December-January. Here are my experiences since my <a href="https://www.johndcook.com/blog/2025/10/16/experiences-with-gpt-5-codex/">last post</a> on the subject.</p>
<p>The models feel subjectively much smarter. They can accomplish a much broader range of tasks. They seem to have a larger, more comprehensive view of the code base and of what you are trying to accomplish, and they can locate details in obscure parts of the code related to the specific task at hand. By rough personal estimate, I would say that they helped me with 20% of my coding effort last August and about 60% now. Both of these figures may be below what is possible, since I may not be using the tools to their fullest capability.</p>
<p>This being said, they are not a panacea. Sometimes they need help and direction on where to look for a problem. Sometimes they myopically look at the trees and miss the forest, failing to step back and see that something is wrong at a high level. They can also overoptimize for the test harness, and they can generate code that is not conceptually consistent with the existing code.</p>
<p>They can also generate much larger amounts of code than necessary. Some warn that this will lead to an explosion of coding debt. On the other hand, you can also guide the coding agent to refactor and improve its own code&#8212;quickly. In one case I&#8217;ve gotten it to reduce a piece of code to less than half its original size with no change in behavior.</p>
<p>I use OpenAI Codex rather than Claude Code. I&#8217;m glad to hear <a href="https://steipete.me/posts/just-talk-to-it">some technically credible people</a> think this is a good choice, though maybe I should try both.</p>
<p>My work is a research project for which the code itself is the research product, so I can&#8217;t give specifications of everything in advance; writing the code is a process of discovery. I also want a code base that remains human-readable. So I stay deeply involved in discussion with the coding agent, which will sometimes go off to do a task for some period of time. It is not my desire to treat the agent as what some have called a <a href="https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/">dark software factory</a>.</p>
<p>Some say they fear that using a coding agent will result in forgetting how to write code without one. I have felt this, but having exercised this muscle for such a long time, I don&#8217;t think it is a skill I would easily forget. The flip side of this argument is that you might even learn new coding idioms from observing what the coding agent writes, and that is a good thing.</p>
<p>Some say they haven&#8217;t written a line of code in weeks because the coding agent does it for them. I don&#8217;t think I&#8217;ll ever stop writing code, any more than I will stop scribbling random ideas on the back of an envelope, or typing out a few lines of code and discovering some new idea in real time as I type. Learning is multisensory.</p>
<p>My hat’s off to the developers who are able to keep many different agents in flight at the same time. Personally I have difficulty handling the cognitive load of thinking deeply about each one and doing a mental context switch across many agents. Though I am experimenting with running more than one agent so I can be doing something while another agent is working.</p>
<p>I continue to be astounded by productivity gains in some situations. I recently added a new capability that required learning a whole new algorithmic method and library, which I think would normally have taken me two months. With the coding agent it took four days, on the order of a 10X productivity increase. Admittedly, things are not always that rosy.</p>
<p>In short, the tools are getting better and better. I&#8217;m looking forward to what they will be like a few months or a year from now.</p>The post <a href="https://www.johndcook.com/blog/2026/04/21/an-ai-odyssey-part-4-astounding-coding-agents/">An AI Odyssey, Part 4: Astounding Coding Agents</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/21/an-ai-odyssey-part-4-astounding-coding-agents/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>More on Newton&#8217;s diameter theorem</title>
		<link>https://www.johndcook.com/blog/2026/04/20/newton-diameter-quintic/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Mon, 20 Apr 2026 13:50:33 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246980</guid>

					<description><![CDATA[<p>A few days ago I wrote a post on Newton&#8217;s diameter theorem. The theorem says to plot the curve formed by the solutions to f(x, y) = 0 where f is a polynomial in x and y of degree n. Next plot several parallel lines that cross the curve at n points and find the [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/20/newton-diameter-quintic/">More on Newton’s diameter theorem</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>A few days ago I wrote a post on <a href="https://www.johndcook.com/blog/2026/04/16/newton-diameters/">Newton&#8217;s diameter theorem</a>. The theorem says to plot the curve formed by the solutions to <em>f</em>(<em>x</em>, <em>y</em>) = 0 where <em>f</em> is a polynomial in <em>x</em> and <em>y</em> of degree <em>n</em>. Next plot several parallel lines that cross the curve at <em>n</em> points and find the centroid of the intersections on each line. Then the centroids will fall on a line.</p>
<p>The previous post contained an illustration using a cubic polynomial and three evenly spaced parallel lines. This post uses a fifth degree polynomial, and shows that the parallel lines need not be evenly spaced.</p>
<p>In this post</p>
<p style="padding-left: 40px;"><em>f</em>(<em>x</em>, <em>y</em>) = <em>y</em>³ + <em>y</em> − <em>x</em> (<em>x</em> + 1) (<em>x</em> + 2) (<em>x</em> − 3) (<em>x</em> − 2).</p>
<p>Here&#8217;s an example of three lines that each cross the curve five times.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/hyperelliptic_newton1.png" width="575" height="578" /></p>
<p>The lines are <em>y</em> = <em>x</em> + <em>k</em> where <em>k</em> = 0.5, −0.5, and −3. The coordinates of the centroids are (0.4, 0.9), (0.4, −0.1), and (0.4, −2.6).</p>
<p>And to show that the requirement that the lines cross five times is necessary, here&#8217;s a plot where one of the parallel lines only crosses three times. The top line is now <em>y</em> = <em>x</em> + 2 and the centroid on the top line moved to (0.0550019, 2.055).</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/hyperelliptic_newton2.png" width="575" height="578" /></p>The post <a href="https://www.johndcook.com/blog/2026/04/20/newton-diameter-quintic/">More on Newton’s diameter theorem</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Gaussian distributed weights for LLMs</title>
		<link>https://www.johndcook.com/blog/2026/04/18/qlora/</link>
					<comments>https://www.johndcook.com/blog/2026/04/18/qlora/#comments</comments>
		
		<dc:creator><![CDATA[John Cook]]></dc:creator>
		<pubDate>Sat, 18 Apr 2026 14:49:27 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Number systems]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246976</guid>

					<description><![CDATA[<p>The previous post looked at the FP4 4-bit floating point format. This post will look at another 4-bit floating point format, NF4, and higher precision analogs. NF4 and FP4 are common bitsandbytes 4-bit data types. If you download LLM weights from Hugging Face quantized to four bits, the weights might be in NF4 or FP4 [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/18/qlora/">Gaussian distributed weights for LLMs</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The <a href="https://www.johndcook.com/blog/2026/04/17/fp4/">previous post</a> looked at the <strong>FP4</strong> 4-bit floating point format. This post will look at another 4-bit floating point format, <strong>NF4</strong>, and higher precision analogs. NF4 and FP4 are common bitsandbytes 4-bit data types. If you download LLM weights from Hugging Face quantized to four bits, the weights might be in NF4 or FP4 format. Or maybe some other format: there&#8217;s a surprising amount of variety in how 4-bit numbers are implemented.</p>
<h2>Why NF4</h2>
<p>LLM parameters have a roughly Gaussian distribution, and so evenly spaced numeric values are not ideal for parameters. Instead, you&#8217;d like numbers that are closer together near 0. </p>
<p>The FP4 floating point numbers, described in the previous post, are spaced 0.5 apart for small values, and the larger values are spaced 1 or 2 apart. That&#8217;s hardly a Gaussian distribution, but it&#8217;s closer to Gaussian than a uniform distribution would be. NF4 deliberately follows more of a Gaussian distribution.</p>
<h2>QLoRA</h2>
<p>The QLoRA formats [1], unlike FP4, are not analogs of IEEE numbers. The bits are not interpreted as sign, exponent, and mantissa, but rather as integers to be used as indexes. An NF<em>n</em> number is an index into a list of 2<sup><em>n</em></sup> real numbers with Gaussian spacing. To put it another way, the numbers represented by NF<em>n</em> have uniformly distributed <em>z</em>-scores. </p>
<p>That makes sense at a high level, but the paper [1] is hard to follow in detail. It says</p>
<blockquote><p>More formally, we estimate the 2<sup><em>k</em></sup> values <em>q</em><sub><em>i</em></sub> of the data type as follows:</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/qlora.svg" alt="q_i = \frac{1}{2}\left(Q_X\left(\frac{i}{2^k + 1}\right) + Q_X\left(\frac{i+1}{2^k + 1} \right) \right)" width="293" height="48" /></p>
<p>where <em>Q</em><sub><em>X</em></sub>(·) is the quantile function of the standard normal distribution <em>N</em>(0, 1).</p></blockquote>
<p>The paper doesn&#8217;t give the range of <em>i</em>, but it says there are 2<sup><em>k</em></sup> values, implying that <em>i</em> runs from 0 to 2<sup><em>k</em></sup> − 1 or from 1 to 2<sup><em>k</em></sup>. Either way we run into infinite values, since <em>Q</em>(0) = −∞ and <em>Q</em>(1) = ∞. We could avoid the infinities by letting <em>i</em> run from 1 to 2<sup><em>k</em></sup> − 1.</p>
<p>The next sentence is puzzling.</p>
<blockquote><p>A problem for a symmetric k-bit quantization is that this approach does not have an exact representation of zero, which is an important property to quantize padding and other zero-valued elements with no error.</p></blockquote>
<p>I understand the desire to represent 0 exactly, but the equation above has an exact representation of 0 when <em>i</em> = 2<sup><em>k</em> − 1</sup>. Perhaps the authors had in mind that <em>i</em> takes on the values ½, 1 + ½, 2 + ½, …, 2<sup><em>k</em></sup> − ½. This would be reasonable, but a highly unusual use of notation. It seems that the real problem is not the lack of a representation of 0 but an unused index, with <em>i</em> running from 1 to 2<sup><em>k</em></sup> − 1.</p>
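<p>Here is a small numerical sketch (mine, not from the paper) evaluating the displayed formula with <em>k</em> = 4 and <em>i</em> running from 1 to 2<sup><em>k</em></sup> − 1. The endpoints stay finite, and the middle value, at <em>i</em> = 2<sup><em>k</em> − 1</sup>, is zero to floating point accuracy:</p>

```python
from scipy.stats import norm

k = 4
Q = norm.ppf  # quantile function of the standard normal

# Evaluate the paper's formula for i = 1, ..., 2^k - 1, skipping the
# endpoints i = 0 and i = 2^k where Q would be infinite.
q = [(Q(i/(2**k + 1)) + Q((i + 1)/(2**k + 1)))/2 for i in range(1, 2**k)]

print(len(q))           # 15 values, one short of 2^k = 16
print(q[2**(k-1) - 1])  # the value at i = 2^(k-1), essentially zero
```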
<p>To be fair, the first sentence quoted above says &#8220;we <em>estimate</em> the 2<sup><em>k</em></sup> values &hellip;&#8221; and so the equation above may not be intended as a definition but as motivation for the actual definition.</p>
<h2>Reproducing NF4</h2>
<p>The authors give a procedure for using 2<sup><em>n</em></sup> values of <em>i</em> and obtaining an exact representation of 0, and they give a list of NF4 values in Appendix E. I was not able to get the two to match. I implemented a few possible interpretations of the procedure described in the paper, and each approximates the list of values in the appendix, but not closely.</p>
<p>The following code, written with the help of ChatGPT, reverse engineers the NF4 values to 8 decimal places, i.e. to the precision of a 32-bit floating point number.</p>
<pre>
from scipy.stats import norm

Q = norm.ppf

α  = 0.9677083
Z  = Q(α)
δ1 = (α - 0.5)/7
δ2 = (α - 0.5)/8

q = [0]*16
# q[0] through q[6]: seven negative values; q[7] remains 0
for i in range(7):
    q[i] = -Q(α - i*δ1)/Z
# q[8] through q[15]: eight positive values
for i in range(8):
    q[i+8] = Q(0.5 + (i+1)*δ2)/Z
    
# Values given in Appendix E
NF4 = [
    -1.0,
    -0.6961928009986877,
    -0.5250730514526367,
    -0.39491748809814453,
    -0.28444138169288635,
    -0.18477343022823334,
    -0.09105003625154495,
    0.0,
    0.07958029955625534,
    0.16093020141124725,
    0.24611230194568634,
    0.33791524171829224,
    0.44070982933044434,
    0.5626170039176941,
    0.7229568362236023,
    1.0
]

# Compare 
for i in range(16):
    print(i, NF4[i] - q[i])
</pre>
<p>The magic number α = 0.9677083 is a mystery. I asked ChatGPT to look into this further, and it said that bitsandbytes uses α = 929/960 = 0.9677083333333333. When I use this value for α the precision is about the same, which is fine. However, the values in the paper were given to 16 decimal places, so I had hoped to match them to more precision.</p>
<p>Quibbles over the exact values of NF4 aside, the format works well in practice. Models quantized to 4 bits using NF4 perform better on some benchmarks than models quantized to other 4-bit formats.</p>
<h2>Related posts</h2>
<ul>
<li class='link'><a href='https://www.johndcook.com/blog/2009/04/06/anatomy-of-a-floating-point-number/'>Anatomy of a floating point number</a></li>
<li class='link'><a href='https://www.johndcook.com/blog/2018/04/11/anatomy-of-a-posit-number/'>Anatomy of a posit number</a></li>
<li class='link'><a href='https://www.johndcook.com/blog/2018/11/15/bfloat16/'>Half precision numbers</a></li>
</ul>
<p>[1] QLoRA: Efficient Finetuning of Quantized LLMs by Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. <a href="https://arxiv.org/abs/2305.14314">https://arxiv.org/abs/2305.14314</a>.</p>The post <a href="https://www.johndcook.com/blog/2026/04/18/qlora/">Gaussian distributed weights for LLMs</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/18/qlora/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>4-bit floating point FP4</title>
		<link>https://www.johndcook.com/blog/2026/04/17/fp4/</link>
					<comments>https://www.johndcook.com/blog/2026/04/17/fp4/#comments</comments>
		
		<dc:creator><![CDATA[John Cook]]></dc:creator>
		<pubDate>Sat, 18 Apr 2026 01:08:54 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Number systems]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246975</guid>

					<description><![CDATA[<p>In ancient times, floating point numbers were stored in 32 bits. Then somewhere along the way 64 bits became standard. The C programming language retains the ancient lore, using float to refer to a 32-bit floating point number and double to refer to a floating point number with double the number of bits. Python simply [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/17/fp4/">4-bit floating point FP4</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>In ancient times, floating point numbers were stored in 32 bits. Then somewhere along the way 64 bits became standard. The C programming language retains the ancient lore, using <code>float</code> to refer to a 32-bit floating point number and <code>double</code> to refer to a floating point number with double the number of bits. Python simply uses <code>float</code> to refer to the most common floating point format, which C calls <code>double</code>.</p>
<p>Programmers were grateful for the move from 32-bit floats to 64-bit floats. It doesn&#8217;t hurt to have more precision, and some numerical problems go away when you go from 32-bits to 64-bits. (Though not all. Something I&#8217;ve written about numerous times.)</p>
<p>Neural networks brought about something extraordinary: demand for floating point numbers with <em>less</em> precision. These networks have an enormous number of parameters, and it&#8217;s more important to fit more parameters into memory than it is to have higher precision parameters. Instead of double precision (64 bits) developers wanted <em>half</em> precision (16 bits), or even less, such as FP8 (8 bits) or FP4 (4 bits). This post will look at 4-bit floating point numbers.</p>
<p>Why even bother with floating point numbers when you don&#8217;t need much precision? Why not use integers? For example, with four bits you could represent the integers 0, 1, 2, …, 15. You could introduce a bias, subtracting say 7 from each value, so your four bits represent −7 through 8. It turns out to be useful to have a wider dynamic range.</p>
<h2>FP4</h2>
<p>Signed 4-bit floating point numbers in FP4 format use the first bit to represent the sign. The question is what to do with the remaining three bits. The notation E<em>x</em>M<i>y</i> denotes a format with <em>x</em> exponent bits and <em>y</em> mantissa bits. In the context of signed 4-bit numbers,</p>
<p style="padding-left: 40px;"><em>x</em> + <em>y</em> = 3</p>
<p>but in other contexts the sum could be larger. For example, for an 8-bit signed float, <em>x</em> + <em>y</em> = 7.</p>
<p>For 4-bit signed floats we have four possibilities: E3M0, E2M1, E1M2, and E0M3. All are used somewhere, but E2M1 is the most common and is supported in Nvidia hardware.</p>
<p>In the E2M1 format, a number with sign bit <em>s</em>, exponent field <em>e</em>, and mantissa bit <em>m</em> has the value</p>
<p style="padding-left: 40px;">(−1)<sup><em>s</em></sup> 2<sup><em>e</em>−<em>b</em></sup> (1 + <em>m</em>/2)</p>
<p>where <em>b</em> is the bias. The purpose of the bias is to allow positive and negative exponents without using signed numbers for <em>e</em>. So, for example, if <em>b</em> = 1 and <em>e</em> = 1, 2, or 3 then the exponent part 2<sup><em>e</em>−<em>b</em></sup> can represent 1, 2, or 4.</p>
<p>The bias impacts the range of possible numbers but not their relative spacing. For any value of bias <em>b</em>, the E3M0 format is all exponent, no mantissa, and so its possible values are uniformly distributed on a log scale. The E0M3 format is all mantissa, so its values are uniformly distributed on a linear scale. The E1M2 and E2M1 formats are unevenly spaced on both log and linear scales.</p>
<p>There is an exception to the expression above for converting (<em>s</em>, <em>e</em>, <em>m</em>) into a real number when <em>e</em> = 0. In that case, <em>m</em> = 0 represents 0 and <em>m</em> = 1 represents ½.</p>
<h2>Table of values</h2>
<p>Since there are only 16 possible FP4 numbers, it&#8217;s possible to list them all. Here is a table for the E2M1 format.</p>
<pre>Bits s exp m  Value
-------------------
0000 0  00 0     +0
0001 0  00 1   +0.5
0010 0  01 0     +1
0011 0  01 1   +1.5
0100 0  10 0     +2
0101 0  10 1     +3
0110 0  11 0     +4
0111 0  11 1     +6
1000 1  00 0     -0
1001 1  00 1   -0.5
1010 1  01 0     -1
1011 1  01 1   -1.5
1100 1  10 0     -2
1101 1  10 1     -3
1110 1  11 0     -4
1111 1  11 1     -6
</pre>
<p>Note that even in this tiny floating point format, there are two zeros, +0 and −0, just like full precision floats. More on that <a href="https://www.johndcook.com/blog/2010/06/15/why-computers-have-signed-zero/">here</a>.</p>
<h2>Pychop library</h2>
<p>The Python library <a href="https://github.com/inexascale/pychop">Pychop</a> emulates a wide variety of reduced-precision floating point formats. Here is the code that uses Pychop to create the table above.</p>
<pre>import pychop

# Pull the format metadata from Pychop.
spec = pychop.MX_FORMATS["mxfp4_e2m1"]
assert (spec.exp_bits, spec.sig_bits) == (2, 1)

def e2m1_value(s: int, e: int, m: int) -&gt; float:
    sign = -1.0 if s else 1.0

    # Subnormal / zero
    if e == 0:
        return sign * (m / 2.0)

    # Normal
    return sign * (2.0 ** (e - 1)) * (1.0 + m / 2.0)

def display_value(bits: int, x: float) -&gt; str:
    if bits == 0b0000:
        return "+0"
    if bits == 0b1000:
        return "-0"
    return f"{x:+g}"

rows = []
for bits in range(16):
    s = (bits &gt;&gt; 3) &amp; 0b1
    e = (bits &gt;&gt; 1) &amp; 0b11
    m = bits &amp; 0b1
    x = e2m1_value(s, e, m)

    rows.append(
        {
            "Bits": f"{bits:04b}",
            "s": s,
            "exp_bits": f"{e:02b}",
            "m": m,
            "Value": display_value(bits, x),
        }
    )

# Pretty-print the table.
header = f"{'Bits':&lt;4} {'s':&gt;1} {'exp':&gt;3} {'m':&gt;1} {'Value':&gt;6}"
print(header)
print("-" * len(header))
for row in rows:
    print(
        f"{row['Bits']:&lt;4} " f"{row['s']:&gt;1} "
        f"{row['exp_bits']:&gt;3} "
        f"{row['m']:&gt;1} "
        f"{row['Value']:&gt;6}"
    )
</pre>
<h2>Other formats</h2>
<p>FP4 isn&#8217;t the only 4-bit floating point format. There&#8217;s a surprisingly large number of formats in use. I intend to address another format in my next post.</p>
<p><b>Update</b>: See the <a href="https://www.johndcook.com/blog/2026/04/18/qlora/">next post</a> for a discussion of NF4, a format whose representable numbers more closely matches the distribution of LLM weights.</p>The post <a href="https://www.johndcook.com/blog/2026/04/17/fp4/">4-bit floating point FP4</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/17/fp4/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>Newton diameters</title>
		<link>https://www.johndcook.com/blog/2026/04/16/newton-diameters/</link>
		
		<dc:creator><![CDATA[John Cook]]></dc:creator>
		<pubDate>Thu, 16 Apr 2026 12:02:13 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246973</guid>

					<description><![CDATA[<p>Let f(x, y) be an nth degree polynomial in x and y. In general, a straight line will cross the zero set of f in n locations [1]. Newton defined a diameter to be any line that crosses the zero set of f exactly n times. If f(x, y) = x² + y² − 1 then the zero set of f is a circle and diameters of the [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/16/newton-diameters/">Newton diameters</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Let <em>f</em>(<em>x</em>, <em>y</em>) be an <em>n</em>th degree polynomial in <em>x</em> and <em>y</em>. In general, a straight line will cross the zero set of <em>f</em> in <em>n</em> locations [1].</p>
<p>Newton defined a <strong>diameter</strong> to be any line that crosses the zero set of <em>f</em> exactly <em>n</em> times. If</p>
<p style="padding-left: 40px;"><em>f</em>(<em>x</em>, <em>y</em>) = <em>x</em>² + <em>y</em>² − 1</p>
<p>then the zero set of <em>f</em> is a circle and diameters of the circle in the usual sense are diameters in Newton&#8217;s sense. But Newton&#8217;s notion of diameter is more general, including lines that cross the circle without going through the center.</p>
<p>Newton&#8217;s theorem of diameters says that if you take several parallel diameters (in his sense of the word), the centroids of the intersections of each diameter with the curve <em>f</em>(<em>x</em>, <em>y</em>) = 0 all lie on a line.</p>
<p>To illustrate this theorem, let&#8217;s look at the elliptic curve</p>
<p style="padding-left: 40px;"><em>y</em>² = <em>x</em>³ − 2<em>x</em> + 1,</p>
<p>i.e. the zeros of <em>f</em>(<em>x</em>, <em>y</em>) = <em>y</em>² − (<em>x</em>³ − 2<em>x</em> + 1). This is a third degree curve, and so in general a straight line will cross the curve three times [2].</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/newton_diameters2.png" width="640" height="480" /></p>
<p>The orange, green, and red lines are parallel, each intersecting the blue elliptic curve three times. The dot on each line is the centroid of the intersection points, the center of mass if you imagine each intersection to be a unit point mass. The centroids all lie on a line, a vertical line in this example though in general the line could have any slope.</p>
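<p>Here is a quick numerical check in Python (my own sketch; I&#8217;m using lines of the form 3<em>y</em> = 2<em>x</em> + <em>k</em>, as in footnote [2] below). Substituting <em>y</em> = (2<em>x</em> + <em>k</em>)/3 into the equation of the elliptic curve gives a cubic in <em>x</em>, and the mean of its roots does not depend on <em>k</em>:</p>

```python
import numpy as np

def centroid(k):
    # Intersect the line 3y = 2x + k with y^2 = x^3 - 2x + 1.
    # Substituting y = (2x + k)/3 and clearing denominators gives
    # 9x^3 - 4x^2 - (18 + 4k)x + (9 - k^2) = 0.
    roots = np.roots([9, -4, -(18 + 4*k), 9 - k**2])
    xs = roots[np.isclose(roots.imag, 0)].real  # real intersections
    assert len(xs) == 3  # the line must be a diameter
    x = xs.mean()
    return x, (2*x + k)/3  # centroid of the three intersection points

# The centroid's x-coordinate is 4/27 for every valid k,
# so the centroids lie on a vertical line.
for k in [0, 1, -1]:
    print(centroid(k))
```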
<p>I hadn&#8217;t seen this theorem until I ran across it recently when skimming [3]. Search results suggest the theorem isn&#8217;t widely known, which is surprising for a result that goes back to Newton.</p>
<p><strong>Update</strong>: See <a href="https://www.johndcook.com/blog/2026/04/20/newton-diameter-quintic/">this post</a> for another illustration using a fifth degree polynomial.</p>
<h2>Related posts</h2>
<ul>
<li class="link"><a href="https://www.johndcook.com/blog/2019/02/21/what-is-an-elliptic-curve/">What is an elliptic curve?</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2020/05/27/fundamental-theorem-of-algebra/">The fundamental theorem of algebra</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2022/10/28/where-quadratic-curves-intersect/">Finding where two quadratic curves intersect</a></li>
</ul>
<p>[1] Bézout&#8217;s theorem says a curve of degree <em>m</em> and a curve of degree <em>n</em> will always intersect in <em>mn</em> points. But that includes complex roots, adds a line at infinity, and counts intersections with multiplicity. So a line, a curve of degree 1, will intersect a curve of degree <em>n</em> at <em>n</em> points in this extended sense.</p>
<p>[2] See the description of Bézout&#8217;s theorem in the previous footnote. In the elliptic curve example, the parallel lines meet at a point at infinity. A line that misses the closed component of the elliptic curve and only passes through the second component has 1 real point of intersection but there would be 2 more if we were working in ℂ² rather than ℝ².</p>
<p>In algebraic terms, the system of equations</p>
<p style="padding-left: 40px;"><em>y</em>² = <em>x</em>³ − 2<em>x</em> + 1<br />
3<em>y</em> = 2<em>x</em> + <em>k<br />
</em></p>
<p>has three real solutions for small values of <em>k</em>, but for sufficiently large values of |<em>k</em>| two of the solutions will be complex.</p>
<p>[3] Mathematics: Its Content, Methods, and Meaning. Edited by A. D. Aleksandrov, A. N. Kolmogorov, and M. A. Lavrent&#8217;ev. Volume 1.</p>The post <a href="https://www.johndcook.com/blog/2026/04/16/newton-diameters/">Newton diameters</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Intersecting spheres and GPS</title>
		<link>https://www.johndcook.com/blog/2026/04/14/intersecting-spheres-and-gps/</link>
					<comments>https://www.johndcook.com/blog/2026/04/14/intersecting-spheres-and-gps/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Tue, 14 Apr 2026 14:08:36 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Geometry]]></category>
		<category><![CDATA[Navigation]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246970</guid>

					<description><![CDATA[<p>If you know the distance d to a satellite, you can compute a circle of points that passes through your location. That&#8217;s because you&#8217;re at the intersection of two spheres—the earth&#8217;s surface and a sphere of radius d centered on the satellite—and the intersection of two spheres is a circle. Said another way, one observation [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/14/intersecting-spheres-and-gps/">Intersecting spheres and GPS</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>If you know the distance <em>d</em> to a satellite, you can compute a circle of points that passes through your location. That&#8217;s because you&#8217;re at the intersection of two spheres—the earth&#8217;s surface and a sphere of radius <em>d</em> centered on the satellite—and the intersection of two spheres is a circle. Said another way, one observation of a satellite determines a circle of possible locations.</p>
<p>If you know the distance to a second satellite as well, then you can find two circles that contain your location. The two circles intersect at two points, and you know that you&#8217;re at one of two possible positions. If you know your approximate position, you may be able to rule out one of the intersection points.</p>
<p>If you know the distance to three different satellites, now you know three circles that you&#8217;re standing on, and the third circle will only pass through one of the two points determined by the first two satellites. Now you know exactly where you are.</p>
<p>Knowing the distance to more satellites is even better. In theory additional observations are redundant but harmless. In practice, they let you partially cancel out inevitable measurement errors.</p>
<p>If you&#8217;re not on the earth&#8217;s surface, you&#8217;re still at the intersection of <em>n</em> spheres if you know the distance to <em>n</em> satellites. If you&#8217;re in an airplane, or en route to the moon, the same principles apply.</p>
<h2>Errors and corrections</h2>
<p>How do you know the distance to a satellite? The satellite can announce what time it is by its clock, then when you receive the announcement you compare it to the time by your clock. The difference between the two times tells you how long the radio signal traveled. Multiply by the speed of light and you have the distance.</p>
<p>However, your clock will probably not be exactly synchronized with the satellite clock. Observing a fourth satellite can fix the problem of your clock not being synchronized with the satellite clocks. But it doesn&#8217;t fix the more subtle problems of special relativity and general relativity. See <a href="https://perthirtysix.com/how-does-gps-work">this post</a> by Shri Khalpada for an accessible discussion of the physics.</p>
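<p>The fourth-satellite fix can be sketched numerically. Treat the unknown clock error, converted to a distance <em>b</em>, as a fourth unknown alongside the three position coordinates, and fit the measured pseudoranges by Gauss&#8211;Newton iteration. This is only an illustration of the idea, with hypothetical function names and synthetic data, not production GPS code.</p>
<pre>import numpy as np

def fix(sats, rho, iters=20):
    # Solve for position x and clock-bias distance b from pseudoranges
    # rho_i = ||x - s_i|| + b, by Gauss-Newton iteration.
    x, b = np.zeros(3), 0.0
    for _ in range(iters):
        r = np.linalg.norm(sats - x, axis=1)   # ranges at current guess
        # Jacobian of the model r_i + b: rows [(x - s_i)/r_i, 1]
        J = np.column_stack([(x - sats)/r[:, None], np.ones(len(sats))])
        step, *_ = np.linalg.lstsq(J, rho - r - b, rcond=None)
        x, b = x + step[:3], b + step[3]
    return x, b
</pre>
<p>With four satellites the system is exactly determined; with more, least squares averages out measurement noise as well as solving for the clock error.</p>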
<h2>Numerical computation</h2>
<p>Each distance measurement gives you an equation:</p>
<p style="padding-left: 40px;">|| <em>x</em> &#8211; <em>s</em><sub><em>i</em></sub> || = <em>d</em><sub><em>i</em></sub></p>
<p>where <em>s</em><sub><em>i</em></sub> is the location of the <em>i</em>th satellite and <em>d</em><sub><em>i</em></sub> is your distance to that satellite. Squaring both sides gives a quadratic equation in the coordinates of <em>x</em>, so several satellites give you a system of nonlinear equations. And yet there is a way to transform the problem into solving linear equations, i.e. using linear algebra. See <a href="https://www.cambridge.org/core/journals/anziam-journal/article/note-on-computing-the-intersection-of-spheres-in-mathbbrn/D15FB22917024962409980AC7D3C086D">this article</a> for details.</p>
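<p>One way to see how the linearization can work (a minimal sketch, not necessarily the method of the article linked above): subtracting the squared equation for the first satellite from each of the others cancels the quadratic term ||<em>x</em>||², leaving a linear system that least squares can solve.</p>
<pre>import numpy as np

def locate(sats, dists):
    # sats: n-by-3 array of satellite positions, dists: n distances.
    # Squaring ||x - s_i|| = d_i and subtracting the first equation
    # from the rest leaves the linear system
    #     2(s_i - s_0) . x = ||s_i||^2 - ||s_0||^2 + d_0^2 - d_i^2.
    s0, d0 = sats[0], dists[0]
    A = 2*(sats[1:] - s0)
    b = (sats[1:]**2).sum(axis=1) - (s0**2).sum() + d0**2 - dists[1:]**2
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
</pre>
<p>Four satellites in general position give three independent linear equations in three unknowns; extra satellites make the system overdetermined, which is where least squares earns its keep.</p>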
<h2>Related posts</h2>
<ul>
<li class="link"><a href="https://www.johndcook.com/blog/2025/12/04/the-navigational-triangle/">The navigational triangle</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2023/03/07/duttons/">Dutton&#8217;s Navigation and Piloting</a></li>
</ul>The post <a href="https://www.johndcook.com/blog/2026/04/14/intersecting-spheres-and-gps/">Intersecting spheres and GPS</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/14/intersecting-spheres-and-gps/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Finding a parabola through two points with given slopes</title>
		<link>https://www.johndcook.com/blog/2026/04/14/artz-parabola/</link>
					<comments>https://www.johndcook.com/blog/2026/04/14/artz-parabola/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Tue, 14 Apr 2026 12:19:42 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Geometry]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246968</guid>

					<description><![CDATA[<p>The Wikipedia article on modern triangle geometry has an image labeled &#8220;Artzt parabolas&#8221; with no explanation. A quick search didn&#8217;t turn up anything about Artzt parabolas [1], but apparently the parabolas go through pairs of vertices with tangents parallel to the sides. The general form of a conic section is ax² + bxy + cy² [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/14/artz-parabola/">Finding a parabola through two points with given slopes</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>The Wikipedia article on modern triangle geometry has an image labeled &#8220;Artzt parabolas&#8221; with no explanation.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/artz_triangle.png" width="330" height="247" /></p>
<p>A quick search didn&#8217;t turn up anything about Artzt parabolas [1], but apparently the parabolas go through pairs of vertices with tangents parallel to the sides.</p>
<p>The general form of a conic section is</p>
<p style="padding-left: 40px;"><em>ax</em>² + <em>bxy</em> + <em>cy</em>² + <em>dx</em> + <em>ey</em> + <em>f</em> = 0</p>
<p>and the constraint <em>b</em>² = 4<em>ac</em> means the conic will be a parabola.</p>
<p>We have six parameters, but they are determined only up to a common scaling factor: you can multiply the equation through by any non-zero constant and still have the same conic. So a general conic has 5 degrees of freedom, and the parabola condition <em>b</em>² = 4<em>ac</em> takes us down to 4. Specifying two points that the parabola passes through uses up 2 more degrees of freedom, and specifying the two slopes uses up the last 2. So it&#8217;s plausible that the problem has a unique solution.</p>
<p>There is indeed a solution, unique up to scaling the parameters. The following code finds parameters of a parabola that passes through (<em>x</em><sub><em>i</em></sub>, <em>y</em><sub><em>i</em></sub>) with slope <em>m</em><sub><em>i</em></sub> for <em>i</em> = 1, 2.</p>
<pre>def solve(x1, y1, m1, x2, y2, m2):
    # Coefficients (a, b, c, d, e, f) of the parabola through
    # (x1, y1) and (x2, y2) with slopes m1 and m2 there.
    # Requires m1 != m2, else λ below divides by zero.
    Δx = x2 - x1
    Δy = y2 - y1
    λ = 4*(Δx*m1 - Δy)*(Δx*m2 - Δy)/(m1 - m2)**2
    k = x2*y1 - x1*y2

    a = Δy**2 + λ*m1*m2
    b = -2*Δx*Δy - λ*(m1 + m2)
    c = Δx**2 + λ
    d =  2*k*Δy + λ*(m1*y2 + m2*y1 - m1*m2*(x1 + x2))
    e = -2*k*Δx + λ*(m1*x1 + m2*x2 - y1 - y2)
    f = k**2 + λ*(m1*x1 - y1)*(m2*x2 - y2)

    return (a, b, c, d, e, f)
</pre>
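<p>As a sanity check, here is a hypothetical verification, repeating <code>solve</code> so the snippet runs on its own. It confirms the parabola condition, that the conic passes through both points, and that the slopes come out right, using implicit differentiation: <em>y</em>&#8242; = &#8722;(2<em>ax</em> + <em>by</em> + <em>d</em>)/(<em>bx</em> + 2<em>cy</em> + <em>e</em>).</p>
<pre>def solve(x1, y1, m1, x2, y2, m2):
    Δx = x2 - x1
    Δy = y2 - y1
    λ = 4*(Δx*m1 - Δy)*(Δx*m2 - Δy)/(m1 - m2)**2
    k = x2*y1 - x1*y2
    a = Δy**2 + λ*m1*m2
    b = -2*Δx*Δy - λ*(m1 + m2)
    c = Δx**2 + λ
    d =  2*k*Δy + λ*(m1*y2 + m2*y1 - m1*m2*(x1 + x2))
    e = -2*k*Δx + λ*(m1*x1 + m2*x2 - y1 - y2)
    f = k**2 + λ*(m1*x1 - y1)*(m2*x2 - y2)
    return (a, b, c, d, e, f)

x1, y1, m1 = 0, 0, 1
x2, y2, m2 = 2, 0, -1
a, b, c, d, e, f = solve(x1, y1, m1, x2, y2, m2)

F = lambda x, y: a*x**2 + b*x*y + c*y**2 + d*x + e*y + f
slope = lambda x, y: -(2*a*x + b*y + d)/(b*x + 2*c*y + e)

assert abs(b**2 - 4*a*c) < 1e-9        # it's a parabola
assert abs(F(x1, y1)) < 1e-9 and abs(F(x2, y2)) < 1e-9
assert abs(slope(x1, y1) - m1) < 1e-9
assert abs(slope(x2, y2) - m2) < 1e-9
</pre>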
<p>[1] The page said &#8220;Artz&#8221; when I first looked at it, but it has since been corrected to &#8220;Artzt&#8221;. Maybe I didn&#8217;t find anything because I was looking for the wrong spelling.</p>The post <a href="https://www.johndcook.com/blog/2026/04/14/artz-parabola/">Finding a parabola through two points with given slopes</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/14/artz-parabola/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>Mathematical minimalism</title>
		<link>https://www.johndcook.com/blog/2026/04/13/the-smallest-math-library/</link>
					<comments>https://www.johndcook.com/blog/2026/04/13/the-smallest-math-library/#comments</comments>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Mon, 13 Apr 2026 14:33:16 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246966</guid>

					<description><![CDATA[<p>Andrzej Odrzywolek recently posted an article on arXiv showing that you can obtain all the elementary functions from just the function and the constant 1. The following equations, taken from the paper&#8217;s supplement, show how to bootstrap addition, subtraction, multiplication, and division from the eml function. See the paper and supplement for how to obtain [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/13/the-smallest-math-library/">Mathematical minimalism</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<p>Andrzej Odrzywolek recently posted an article on <a href="https://arxiv.org/abs/2603.21852v2">arXiv</a> showing that you can obtain all the elementary functions from just the function</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/eml.svg" alt="\operatorname{eml}(x,y) = \exp(x) - \log(y)" width="224" height="18" /></p>
<p>and the constant 1. The following equations, taken from the paper&#8217;s <a href="https://arxiv.org/src/2603.21852v2/anc/SupplementaryInformation.pdf">supplement</a>, show how to bootstrap addition, subtraction, multiplication, and division from the eml function.</p>
<p><img loading="lazy" decoding="async" class="aligncenter" style="background-color: white;" src="https://www.johndcook.com/elm.svg" alt="\begin{align*} \exp(z) &amp;\mapsto \operatorname{eml}(z,1) \\ \log(z) &amp;\mapsto \operatorname{eml}(1,\exp(\operatorname{eml}(1,z))) \\ x - y &amp;\mapsto \operatorname{eml}(\log(x),\exp(y)) \\ -z &amp;\mapsto (\log 1) - z \\ x + y &amp;\mapsto x - (-y) \\ 1/z &amp;\mapsto \exp(-\log z) \\ x \cdot y &amp;\mapsto \exp(\log x + \log y) \end{align*}" width="264" height="193" /></p>
<p>See the paper and supplement for how to obtain constants like π and functions like square and square root, as well as the standard circular and hyperbolic functions.</p>
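<p>The bootstrapping is easy to play with. Here is a minimal Python sketch (my own, not code from the paper) in which everything other than <code>eml</code> itself is built from <code>eml</code> and the constant 1, restricted to arguments where every intermediate logarithm is defined.</p>
<pre>from math import exp, log

ONE = 1.0

def eml(x, y):
    # the only primitive: exp(x) - log(y)
    return exp(x) - log(y)

def exp_(z):
    return eml(z, ONE)                  # exp(z) - log(1) = exp(z)

def log_(z):
    # eml(1, z) = e - log z, so exp_ of that is e^e / z,
    # and eml(1, e^e / z) = e - (e - log z) = log z
    return eml(ONE, exp_(eml(ONE, z)))

def sub(x, y):
    # x - y = exp(log x) - log(exp y), valid for x &gt; 0
    return eml(log_(x), exp_(y))
</pre>
<p>For example, <code>sub(5, 2)</code> returns 3 (up to floating point error), having touched nothing but <code>eml</code> and 1 along the way.</p>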
<h2>Related posts</h2>
<ul>
<li class="link"><a href="https://www.johndcook.com/blog/2021/01/05/bootstrapping-math-library/">Bootstrapping a small math library</a></li>
<li class="link"><a href="https://www.johndcook.com/blog/2026/04/06/tofolli-gates/">Toffoli gates are all you need</a></li>
</ul>The post <a href="https://www.johndcook.com/blog/2026/04/13/the-smallest-math-library/">Mathematical minimalism</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
					<wfw:commentRss>https://www.johndcook.com/blog/2026/04/13/the-smallest-math-library/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title>Lunar period approximations</title>
		<link>https://www.johndcook.com/blog/2026/04/12/lunations/</link>
		
		<dc:creator><![CDATA[John]]></dc:creator>
		<pubDate>Sun, 12 Apr 2026 23:42:01 +0000</pubDate>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Calendars]]></category>
		<guid isPermaLink="false">https://www.johndcook.com/blog/?p=246963</guid>

					<description><![CDATA[<p>The date of Easter The church fixed Easter to be the first Sunday after the first full moon after the Spring equinox. They were choosing a date in the Roman (Julian) calendar to commemorate an event whose date was known according to the Jewish lunisolar calendar, hence the reference to equinoxes and full moons. The [&#8230;]</p>
The post <a href="https://www.johndcook.com/blog/2026/04/12/lunations/">Lunar period approximations</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></description>
										<content:encoded><![CDATA[<h2>The date of Easter</h2>
<p>The church fixed Easter to be the first Sunday after the first full moon after the Spring equinox. They were choosing a date in the Roman (Julian) calendar to commemorate an event whose date was known according to the Jewish lunisolar calendar, hence the reference to equinoxes and full moons.</p>
<p>The <a href="https://www.johndcook.com/blog/2026/04/12/orthodox-western-easter/">previous post</a> explained why the Eastern and Western dates of Easter differ. The primary reason is that both churches use March 21 as the first day of Spring, but the Eastern church uses March 21 on the Julian calendar and the Western church uses March 21 on the Gregorian calendar.</p>
<p>But that&#8217;s not the only difference. The churches chose different algorithms for calculating when the first full moon would be. The date of Easter doesn&#8217;t depend on the actual date of the full moon per se, but on the methods used to predict full moons.</p>
<p>This post will show why determining the date of the full moon is messy.</p>
<h2>Lunation length</h2>
<p>The time from one full moon to the next (or one new moon to the next, which is easier to measure objectively) is between 29 and 30 days. This period is called a <b>lunation</b>. The average length of a lunation is <em>L</em> = 29.530588853 days. This is not a convenient number to work with, and so there&#8217;s no simple way of reconciling the orbital period of the moon with the rotation period of the earth [1]. Lunar calendars alternate months of 29 and 30 days, but that alone isn&#8217;t accurate enough, so such calendars need some fudge factor analogous to leap years.</p>
<p>The value of <em>L</em> was known from ancient times. Meton of Athens calculated in 432 BC that 235 lunar cycles equaled 19 tropical years or 6940 days. This corresponds to <em>L</em> ≈ 29.5319. Around a century later the Greek scholar Callippus refined this to 940 cycles in 76 years or 27,759 days. This corresponds to <em>L</em> ≈ 29.53085.</p>
<p>The problem wasn&#8217;t <em>knowing</em> <em>L</em> but devising a convenient way of <em>working</em> with <em>L</em>. There is no way to work with lunations that is as easy as the way the Julian (or even the more complicated Gregorian) calendar reconciles days with years.</p>
<h2>Approximations</h2>
<p>Let&#8217;s look at the accuracy of several approximations for <em>L</em>. We&#8217;d like an approximation that is not only accurate in an absolute sense, but also accurate relative to its complexity. The complexity of a fraction is measured by a <a href="https://www.johndcook.com/blog/2023/09/17/rational-height-functions/">height function</a>. We&#8217;ll use what&#8217;s called the &#8220;classic&#8221; height function: log( max(<em>n</em>, <em>d</em>) ) where <em>n</em> and <em>d</em> are the numerator and denominator of a fraction. Since we&#8217;re approximating a number bigger than 1, this will be simply log(<em>n</em>).</p>
<p>We will compare the first five convergents, approximations that come from the continued fraction form of <em>L</em>, and the approximations of Meton and Callippus. Here&#8217;s a plot.</p>
<p><img loading="lazy" decoding="async" class="aligncenter size-medium" src="https://www.johndcook.com/lunation.png" width="480" height="360" /></p>
<p>And here&#8217;s the code that produced the plot, showing the fractions used.</p>
<pre>from numpy import log
import matplotlib.pyplot as plt

fracs = [
    (30, 1),       # continued fraction convergents of L ...
    (59, 2),
    (443, 15),
    (502, 17),
    (1447, 49),
    (6940, 235),   # Meton
    (27759, 940)   # Callippus
]

def error(n, d):
    L = 29.530588853    
    return abs(n/d - L)

for f in fracs:
    plt.plot(log(f[0]), log(error(*f)), 'o')
plt.xlabel("log numerator")
plt.ylabel("log error")
plt.show()
</pre>
<p>The approximation 1447/49 is the best by far, both in absolute terms and relative to the size of the numerator. But it&#8217;s not very useful for calendar design because 1447 is not nicely related to the number of days in a year.</p>
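<p>For what it&#8217;s worth, here&#8217;s a small sketch (not part of the analysis above) showing how those convergents fall out of the continued fraction expansion of <em>L</em>, using the standard recurrence for convergent numerators and denominators.</p>
<pre>def convergents(x, n):
    # numerators h and denominators k of the first n continued
    # fraction convergents of x, via the standard recurrence
    # h_i = a_i h_{i-1} + h_{i-2}, k_i = a_i k_{i-1} + k_{i-2}
    h0, h1, k0, k1 = 0, 1, 1, 0
    out = []
    for _ in range(n):
        a = int(x)
        h0, h1 = h1, a*h1 + h0
        k0, k1 = k1, a*k1 + k0
        out.append((h1, k1))
        x = 1/(x - a)
    return out

print(convergents(29.530588853, 6))
# [(29, 1), (30, 1), (59, 2), (443, 15), (502, 17), (1447, 49)]
</pre>
<p>The large partial quotient after 502/17 is why 1447/49 is such a disproportionately good approximation.</p>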
<p>[1] The time between full moons is a synodic month, the time it takes for the moon to return to the same position relative to the sun. This is longer than a sidereal month, the time it takes the moon to complete one orbit relative to the fixed stars.</p>The post <a href="https://www.johndcook.com/blog/2026/04/12/lunations/">Lunar period approximations</a> first appeared on <a href="https://www.johndcook.com/blog">John D. Cook</a>.]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
