As the degree of an interpolating polynomial increases, does the polynomial converge to the underlying function? The short answer is maybe. I want to describe a visual tool to help you investigate this question yourself.

Carl Runge lived from 1856 until 1927. We know his name because he was the first to write about what we now call the Runge-Kutta method for the numerical solution of ordinary differential equations. His paper was published in 1895. Martin Kutta came along six years later. But Runge made many other contributions, including the subject of today's post.

A classical result of Runge's advisor, Karl Weierstrass, is that for any continuous function, *there exists* a sequence of polynomials of increasing order that converge *uniformly* to the function. But this result does not tell us whether the polynomials can be *interpolating* or where the interpolating points might be located.

Runge's famous counterexample for interpolation is the function

$f(x) = \frac{1}{1+25x^2}$

If this function is interpolated at equally spaced points in the interval [-1,1], the polynomials *do not* converge uniformly. In fact, the maximum error goes to infinity.
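You can confirm this divergence with a few lines of code. Here is a sketch in Python (NumPy) rather than MATLAB, using the standard barycentric Lagrange formula to interpolate Runge's function at equally spaced points and measure the maximum error:

```python
import numpy as np

def bary_eval(x, w, fx, xe):
    """Evaluate the polynomial interpolating (x, fx) at the points xe,
    using the barycentric Lagrange formula."""
    d = xe[:, None] - x[None, :]
    hit_i, hit_j = np.where(d == 0)   # evaluation points that land on nodes
    d[d == 0] = 1.0                   # placeholder; fixed up below
    c = w / d
    p = (c @ fx) / c.sum(axis=1)
    p[hit_i] = fx[hit_j]              # at a node the interpolant equals f
    return p

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # Runge's function
errs = {}
for n in (5, 11, 21):
    x = np.linspace(-1.0, 1.0, n)         # equally spaced nodes
    # barycentric weights: w_j = 1 / prod_{k != j} (x_j - x_k)
    w = np.array([1.0 / np.prod(x[j] - np.delete(x, j)) for j in range(n)])
    xe = np.linspace(-1.0, 1.0, 2001)
    errs[n] = np.max(np.abs(bary_eval(x, w, f(x), xe) - f(xe)))
# the maximum error grows with n instead of shrinking
```

Raising $n$ from 5 to 21 makes the maximum error larger, not smaller, which is exactly Runge's point.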

I call my MATLAB® program `interp_gadget`. It lets you vary three parameters, $c$, $n$, and $w$, and choose one of three target functions involving the coefficient $c$.

$f_1(x) = \frac{1}{1+c x^2}$

$f_2(x) = e^{-c x^2}$

$f_3(x) = |cx|$

The parameter $n$ is the number of interpolation points in the interval [-1,1]. The degree of the interpolating polynomial is $n-1$.

The distribution of the points involves the weight $w$. The points are a weighted average of equally spaced points and Chebyshev points, which are concentrated towards the ends of the interval.

$x_{ch} = \cos({\frac{n-\frac{1}{2}:-1:\frac{1}{2}}{n}\pi})$

$x_{eq} = -1:\frac{2}{n-1}:1$

$x = wx_{ch} + (1-w)x_{eq}$
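The three formulas above translate directly into code. A Python (NumPy) sketch of the node generator, which also confirms that at $w = 1$ the points cluster at the ends of the interval:

```python
import numpy as np

def nodes(n, w):
    """Interpolation points: a weighted average of Chebyshev points
    and equally spaced points on [-1, 1], as in the formulas above."""
    j = np.arange(n - 0.5, 0.0, -1.0)     # n-1/2, n-3/2, ..., 1/2
    x_ch = np.cos(j / n * np.pi)          # Chebyshev points, ascending
    x_eq = np.linspace(-1.0, 1.0, n)      # equally spaced points
    return w * x_ch + (1.0 - w) * x_eq

gaps = np.diff(nodes(9, 1.0))             # pure Chebyshev spacing
# the gaps are smallest near the endpoints and largest in the middle
```

At $w = 0$ the call reproduces `linspace`; sliding $w$ toward 1 migrates the points smoothly toward the Chebyshev distribution.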

The green curve in the following animations and plots is the target function $f(x)$ and the blue curve is the polynomial that interpolates the target at the blue circles. The interpolation error is the difference between the two curves.

Let's vary the coefficient $c$ in the bell-shaped target $f_1$, while keeping the number of equally spaced points fixed. The value $c = 25$ gives us Runge's function. As $c$ increases, the peak in the target becomes sharper and the interpolation error increases.

Let's vary the number of points, while keeping the target fixed and the points equally spaced. As $n$ increases, we see the error near the ends of the interval increase dramatically.

Let's vary the weight in the point distribution. As we move from equal spacing towards Chebyshev spacing, the error near the end points decreases significantly.

Now some static plots comparing a few different settings of the parameters. Here is the initial setting with Runge's value of $c$ and nine equally spaced interpolation points.

Increase the number of points while keeping them equally spaced. There is trouble near the end points.

Distributing a modest number of points nearer the ends of the interval is a big improvement.

The Gaussian target $f_2(x)$ behaves pretty much like Runge's.

Behavior with the target $f_3(x)$ warrants further investigation.

Here is Runge's example again, with many equally spaced interpolation points. I've added red lines at $x = \pm .726$. In section 3.4 of a classic numerical analysis text that is now available as an inexpensive Dover reprint, Eugene Isaacson and Herb Keller begin a proof of the fact that polynomial interpolation converges for $x$ between these red lines and diverges for $x$ outside them. But a key portion of their proof is left as an exercise.

So, I have a few tasks for those inclined to undertake them. The first is to finish Isaacson and Keller's argument, which explains where the $.726$ comes from. How does $.726$ depend on the coefficient $c$ in function $f_1(x)$? Where are the red lines for our two other target functions? How does their location depend upon $w$?

I don't know the answers to these questions. If you discover something, please let us know in the comments.

I have submitted `interp_gadget` to the MATLAB Central File Exchange. It is also included in version 4.01 of Cleve's Laboratory.


Published with MATLAB® R2018b

MathWorks is creating a deck of playing cards that will be offered as gifts at our trade show booths. The design of the cards is based on Penrose Tilings and plots of the Finite Fourier Transform Matrix.

This is an example of a Penrose tiling.

Penrose tilings are aperiodic tilings of the plane named after Oxford emeritus physicist Roger Penrose, who studied them in the 1970s. MathWorks' Steve Eddins has written a paper about his MATLAB program for generating Penrose tilings. I will include the paper in this blog post. Steve has also submitted his paper and code to the MATLAB Central File Exchange, Penrose Rhombus Tiling.

Look carefully at the example tiling. It's made entirely from two rhombuses, one colored gold and one colored blue. Each rhombus, in turn, is made from two isosceles triangles. These shapes inspired MathWorks designer Gabby Lydon to reimagine the traditional symbols for the four suits in a deck of cards.

Here are the diamonds and hearts.

Here are the clubs and spades.

The theme of triangles and rhombuses is continued in the face cards.

The backs of the cards feature two mathematical objects that I have described in this blog. I wrote a five-part series about our logo five years ago, beginning with part one. And, I wrote about the graphic produced by the Finite Fourier Transform matrix.

Start with a prime integer.

n = 31;

Take a Fourier transform of the columns of the identity matrix.

F = fft(eye(n,n));

Plot it. Because 31 is prime, this is the complete graph on 31 points. Every point on the circumference is connected to every other point.
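The primality claim is easy to verify: column $k$ of `fft(eye(n))` holds the powers $e^{-2\pi i jk/n}$, $j = 0, \dots, n-1$, so it visits $n/\gcd(n,k)$ of the $n$-th roots of unity — all $n$ of them whenever $n$ is prime and $k \neq 0$. A quick Python check (illustrative, not the MATLAB code):

```python
import numpy as np

def points_visited(n, k):
    """Number of distinct points traced by column k of fft(eye(n)).
    Column k holds exp(-2*pi*1j*j*k/n) for j = 0..n-1, so it visits
    n/gcd(n, k) of the n-th roots of unity."""
    col = np.fft.fft(np.eye(n))[:, k]
    # round to merge floating-point duplicates of the same root of unity
    return len({complex(round(z.real, 9), round(z.imag, 9)) for z in col})
```

For $n = 31$ every nonzero column visits all 31 points, which is why the plot connects every point to every other one; for a composite order like 12, some columns trace far fewer points.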

```
p = plot(F);
axis square
axis off
```

That's too colorful.

```
color = get(gca,'colororder');
set(p,'color',color(1,:))
```

Now graphic design software takes over and makes room for the logo.

All this has reminded me of two FFT-related apps from *Numerical Computing with MATLAB*, `fftgui` and `fftmatrix`. I've added them to Cleve's Laboratory with Version 3.9. This is the image from `fftmatrix` for n = 12. Because 12 is not prime, this is not a completely connected graph. You can vary both the matrix order and the columns to be transformed.

Now, here is Steve's paper about Penrose tiling.

**by Steve Eddins**

This story is about creating planar tilings like this:

This is an example of a Penrose tiling. Penrose tilings are aperiodic tilings named after Roger Penrose, who studied them in the 1970s. This particular form, made from two rhombuses, is called a *P3 tiling*.

Construction of a Penrose P3 tiling is based on four types of isosceles triangles, labeled A, A', B, and B'. The A and A' triangles have an apex angle of 36 degrees, and the B and B' triangles have an apex angle of 108 degrees.

```
subplot(2,2,1)
showLabeledTriangles(aTriangle([],-1,1))
subplot(2,2,2)
showLabeledTriangles(apTriangle([],-1,1))
subplot(2,2,3)
showLabeledTriangles(bTriangle([],-1,1))
subplot(2,2,4)
showLabeledTriangles(bpTriangle([],-1,1))
```

```
vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
vertices =
  -1.0000 + 0.0000i   0.0000 + 0.7265i   1.0000 + 0.0000i
vertices =
  -1.0000 + 0.0000i   0.0000 + 0.7265i   1.0000 + 0.0000i
```

Before proceeding further, let's pause to look at what the functions `aTriangle`, `apTriangle`, `bTriangle`, and `bpTriangle` do.

The function `aTriangle` takes three arguments: `aTriangle(apex,left,right)`. Each of the three arguments is a point in the complex plane that represents one triangle vertex. You specify any two points, passing in the third point as `[]`, and `aTriangle` computes the missing vertex for you. Here's an example showing how to compute a type A triangle whose base is on the real axis, extending from -1 to 1.
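The printed heights pin down the geometry: the A apex sits at $\tan(72°) \approx 3.0777$ above the base (apex angle 36°), and the B apex at $\tan(36°) \approx 0.7265$ (apex angle 108°). Here is a hypothetical reconstruction in Python of the missing-apex computation — Steve's actual `aTriangle` and friends can also fill in a missing *base* vertex, which this sketch does not:

```python
import math

def apex_for_base(left, right, apex_angle_deg):
    """Missing apex of an isosceles triangle whose base runs from
    left to right (complex numbers), given its apex angle.
    (A reconstruction of the geometry, not Steve's code.)"""
    base_angle = math.radians((180.0 - apex_angle_deg) / 2.0)
    mid = (left + right) / 2.0
    half = (right - left) / 2.0
    # rotate the half-base 90 degrees and scale by tan(base angle)
    return mid + 1j * half * math.tan(base_angle)

a_apex = apex_for_base(-1, 1, 36)    # type A and A' triangles
b_apex = apex_for_base(-1, 1, 108)   # type B and B' triangles
# a_apex is about 3.0777i and b_apex about 0.7265i
```

Because the apex is built from the base by rotation and scaling, the same formula works for a base in any position and orientation, not just on the real axis.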

t_a = aTriangle([],-1,1)

```
t_a =
  1×4 table
      Apex       Left    Right    Type
    _________    ____    _____    ____
    0+3.0777i     -1       1       A
```

The result is returned as a table, which is convenient because we'll be creating large collections of these triangles, and it is helpful to be able to refer to the different vertices of the triangles using the notation `t.Apex`, `t.Left`, and `t.Right`.

The function `showLabeledTriangles` takes a table of triangles and displays them all, with the triangle types and their sides labeled. (We'll talk more about the labeling of the sides below.)

```
t_b = bTriangle(t_a.Apex,t_a.Right,[]);
T = [t_a ; t_b]
clf
showLabeledTriangles(T)
```

```
T =
  2×4 table
      Apex       Left        Right        Type
    _________    ____    _____________    ____
    0+3.0777i     -1          1+0i         A
    0+3.0777i      1     2.618+4.9798i     B

vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
   1.0000 + 0.0000i   0.0000 + 3.0777i   2.6180 + 4.9798i
```

The markers on the sides of the triangles help us to distinguish between A and A' triangles, as well as between B and B' triangles. For example, an A triangle has the circle marker on the left side (assuming the apex is oriented at the top), whereas the A' triangle has the circle marker on the right side.

The side labels also help us confirm whether we have a correct arrangement of triangles in our Penrose tiling. Triangles are only allowed to share an edge in the Penrose P3 tiling if the side markers line up and match. For example, in the two triangles shown above, the square marker on the right edge of the A triangle lines up with the square marker on the left edge of the B triangle. If we had used a B' triangle instead, the markers would not have matched.

```
t_bp = bpTriangle(t_a.Apex,t_a.Right,[]);
T = [t_a ; t_bp];
showLabeledTriangles(T)
```

```
vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
   1.0000 + 0.0000i   0.0000 + 3.0777i   2.6180 + 4.9798i
```

In the Penrose P3 tiling, one rhombus is made from an A and an A' triangle, and the other is made from a B and a B' triangle.

```
subplot(1,2,1)
r1 = [ ...
    aTriangle([],-1,1)
    apTriangle([],1,-1)];
showLabeledTriangles(r1)
subplot(1,2,2)
r2 = [ ...
    bTriangle([],-1,1)
    bpTriangle([],1,-1)];
showLabeledTriangles(r2)
```

```
vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
   1.0000 + 0.0000i   0.0000 - 3.0777i  -1.0000 + 0.0000i
vertices =
  -1.0000 + 0.0000i   0.0000 + 0.7265i   1.0000 + 0.0000i
   1.0000 + 0.0000i   0.0000 - 0.7265i  -1.0000 + 0.0000i
```

Construction of the P3 tiling proceeds by starting with one triangle and then successively decomposing it. Each of the four types of triangles has a different rule for decomposition.

An A triangle decomposes into an A triangle and a B' triangle.

```
t_a = aTriangle([],-1,1);
t_a_d = decomposeATriangle(t_a)
clf
subplot(1,2,1)
showLabeledTriangles(t_a)
lims = axis;
subplot(1,2,2)
showLabeledTriangles(t_a_d)
axis(lims)
```

```
t_a_d =
  2×4 table
         Apex             Left            Right         Type
    _______________    _________    _______________    ____
         -1+0i            1+0i      0.61803+1.1756i     A
    0.61803+1.1756i    0+3.0777i         -1+0i          Bp

vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
vertices =
   1.0000 + 0.0000i  -1.0000 + 0.0000i   0.6180 + 1.1756i
   0.0000 + 3.0777i   0.6180 + 1.1756i  -1.0000 + 0.0000i
```

You can look at the very short implementation of `decomposeATriangle` to see how this is done.

```
function out = decomposeATriangle(in)

out = [ ...
    aTriangle(in.Left,in.Right,[])
    bpTriangle([],in.Apex,in.Left) ];
```

The smaller A triangle is determined by placing its apex at the left vertex of the input triangle and placing its left vertex at the right vertex of the input triangle.

The smaller B' triangle is determined by placing its left vertex at the apex of the input triangle and placing its right vertex at the left vertex of the input triangle.

There are similar rules for decomposing the other three types of triangles.

```
t_ap = apTriangle([],-1,1);
t_ap_d = decomposeApTriangle(t_ap)
clf
subplot(1,2,1)
showLabeledTriangles(t_ap)
lims = axis;
subplot(1,2,2)
showLabeledTriangles(t_ap_d)
axis(lims)
```

```
t_ap_d =
  2×4 table
          Apex                Left          Right      Type
    ________________    ________________    _________    ____
          1+0i          -0.61803+1.1756i      -1+0i       Ap
    -0.61803+1.1756i          1+0i          0+3.0777i     B

vertices =
  -1.0000 + 0.0000i   0.0000 + 3.0777i   1.0000 + 0.0000i
vertices =
  -0.6180 + 1.1756i   1.0000 + 0.0000i  -1.0000 + 0.0000i
   1.0000 + 0.0000i  -0.6180 + 1.1756i   0.0000 + 3.0777i
```

```
t_b = bTriangle([],-1,1);
t_b_d = decomposeBTriangle(t_b)
clf
subplot(2,1,1)
showLabeledTriangles(t_b)
lims = axis;
subplot(2,1,2)
showLabeledTriangles(t_b_d)
axis(lims)
```

```
t_b_d =
  3×4 table
          Apex              Left             Right          Type
    _________________    ___________    _________________    ____
        0.23607+0i           1+0i           0+0.72654i        B
        0.23607+0i        0+0.72654i    -0.38197+0.44903i     A
    -0.38197+0.44903i       -1+0i          0.23607+0i         Bp

vertices =
  -1.0000 + 0.0000i   0.0000 + 0.7265i   1.0000 + 0.0000i
vertices =
   1.0000 + 0.0000i   0.2361 + 0.0000i   0.0000 + 0.7265i
   0.0000 + 0.7265i   0.2361 + 0.0000i  -0.3820 + 0.4490i
  -1.0000 + 0.0000i  -0.3820 + 0.4490i   0.2361 + 0.0000i
```

```
t_bp = bpTriangle([],-1,1);
t_bp_d = decomposeBpTriangle(t_bp)
clf
subplot(2,1,1)
showLabeledTriangles(t_bp)
lims = axis;
subplot(2,1,2)
showLabeledTriangles(t_bp_d)
axis(lims)
```

```
t_bp_d =
  3×4 table
          Apex               Left             Right       Type
    ________________    ________________    ___________    ____
       -0.23607+0i         0+0.72654i          -1+0i        Bp
       -0.23607+0i      0.38197+0.44903i    0+0.72654i      Ap
    0.38197+0.44903i      -0.23607+0i          1+0i         B

vertices =
  -1.0000 + 0.0000i   0.0000 + 0.7265i   1.0000 + 0.0000i
vertices =
   0.0000 + 0.7265i  -0.2361 + 0.0000i  -1.0000 + 0.0000i
   0.3820 + 0.4490i  -0.2361 + 0.0000i   0.0000 + 0.7265i
  -0.2361 + 0.0000i   0.3820 + 0.4490i   1.0000 + 0.0000i
```

Let's start with a B triangle and decompose it three times successively.

```
t = bTriangle([],-1,1);
t = decomposeTriangles(t)
t = decomposeTriangles(t)
t = decomposeTriangles(t)
```

```
t =
  3×4 table
          Apex              Left             Right          Type
    _________________    ___________    _________________    ____
        0.23607+0i           1+0i           0+0.72654i        B
        0.23607+0i        0+0.72654i    -0.38197+0.44903i     A
    -0.38197+0.44903i       -1+0i          0.23607+0i         Bp

t =
  8×4 table
          Apex                Left               Right          Type
    _________________   _________________   _________________   ____
    0.38197+0.44903i       0+0.72654i          0.23607+0i        B
    0.38197+0.44903i       0.23607+0i          0.52786+0i        A
       0.52786+0i             1+0i          0.38197+0.44903i     Bp
       0+0.72654i       -0.38197+0.44903i   -0.1459+0.27751i     A
    -0.1459+0.27751i       0.23607+0i          0+0.72654i        Bp
      -0.52786+0i       -0.38197+0.44903i        -1+0i           Bp
      -0.52786+0i        -0.1459+0.27751i   -0.38197+0.44903i    Ap
    -0.1459+0.27751i      -0.52786+0i          0.23607+0i        B

t =
  21×4 table
          Apex                 Left                Right           Type
    __________________   _________________   __________________   ____
     0.1459+0.27751i        0.23607+0i        0.38197+0.44903i     B
     0.1459+0.27751i      0.38197+0.44903i    0.23607+0.55503i     A
     0.23607+0.55503i       0+0.72654i         0.1459+0.27751i     Bp
       0.23607+0i           0.52786+0i        0.47214+0.17151i     A
     0.47214+0.17151i     0.38197+0.44903i      0.23607+0i         Bp
     0.76393+0.17151i       0.52786+0i             1+0i            Bp
     0.76393+0.17151i     0.47214+0.17151i      0.52786+0i         Ap
     0.47214+0.17151i     0.76393+0.17151i    0.38197+0.44903i     B
    -0.38197+0.44903i     -0.1459+0.27751i    -0.09017+0.44903i    A
    -0.09017+0.44903i       0+0.72654i       -0.38197+0.44903i     Bp
     0.1459+0.27751i      -0.1459+0.27751i      0.23607+0i         Bp
     0.1459+0.27751i     -0.09017+0.44903i    -0.1459+0.27751i     Ap
    -0.09017+0.44903i      0.1459+0.27751i      0+0.72654i         B
    -0.61803+0.27751i      -0.52786+0i       -0.38197+0.44903i     Bp
    -0.61803+0.27751i       -0.7082+0i         -0.52786+0i         Ap
       -0.7082+0i        -0.61803+0.27751i        -1+0i            B
    -0.38197+0.44903i     -0.2918+0.17151i    -0.1459+0.27751i     Ap
    -0.2918+0.17151i     -0.38197+0.44903i     -0.52786+0i         B
      -0.055728+0i          0.23607+0i        -0.1459+0.27751i     B
      -0.055728+0i        -0.1459+0.27751i    -0.2918+0.17151i     A
    -0.2918+0.17151i       -0.52786+0i         -0.055728+0i        Bp
```

```
clf
showLabeledTriangles(t)
```

```
vertices =
   0.2361 + 0.0000i   0.1459 + 0.2775i   0.3820 + 0.4490i
   0.3820 + 0.4490i   0.1459 + 0.2775i   0.2361 + 0.5550i
   0.0000 + 0.7265i   0.2361 + 0.5550i   0.1459 + 0.2775i
   0.5279 + 0.0000i   0.2361 + 0.0000i   0.4721 + 0.1715i
   0.3820 + 0.4490i   0.4721 + 0.1715i   0.2361 + 0.0000i
   0.5279 + 0.0000i   0.7639 + 0.1715i   1.0000 + 0.0000i
   0.4721 + 0.1715i   0.7639 + 0.1715i   0.5279 + 0.0000i
   0.7639 + 0.1715i   0.4721 + 0.1715i   0.3820 + 0.4490i
  -0.1459 + 0.2775i  -0.3820 + 0.4490i  -0.0902 + 0.4490i
   0.0000 + 0.7265i  -0.0902 + 0.4490i  -0.3820 + 0.4490i
  -0.1459 + 0.2775i   0.1459 + 0.2775i   0.2361 + 0.0000i
  -0.0902 + 0.4490i   0.1459 + 0.2775i  -0.1459 + 0.2775i
   0.1459 + 0.2775i  -0.0902 + 0.4490i   0.0000 + 0.7265i
  -0.5279 + 0.0000i  -0.6180 + 0.2775i  -0.3820 + 0.4490i
  -0.7082 + 0.0000i  -0.6180 + 0.2775i  -0.5279 + 0.0000i
  -0.6180 + 0.2775i  -0.7082 + 0.0000i  -1.0000 + 0.0000i
  -0.2918 + 0.1715i  -0.3820 + 0.4490i  -0.1459 + 0.2775i
  -0.3820 + 0.4490i  -0.2918 + 0.1715i  -0.5279 + 0.0000i
   0.2361 + 0.0000i  -0.0557 + 0.0000i  -0.1459 + 0.2775i
  -0.1459 + 0.2775i  -0.0557 + 0.0000i  -0.2918 + 0.1715i
  -0.5279 + 0.0000i  -0.2918 + 0.1715i  -0.0557 + 0.0000i
```

Now the labels are getting in the way.

showTriangles(t)

That's better, but it's hard to visualize the rhombuses because the triangle bases are being drawn. Let's switch to a different visualization function that draws the rhombuses with colored shading and without the triangle bases.

showTiles(t)

Let's do five more levels of decomposition.

```
for k = 1:5
    t = decomposeTriangles(t);
end
num_triangles = height(t)
```

num_triangles = 2584
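Those counts — 3, 8, 21, and now 2584 — are alternate Fibonacci numbers. That follows from the decomposition rules above: each A or A' triangle yields one A-type and one B-type child, while each B or B' yields one A-type and two B-type children. A quick check of the bookkeeping (my own observation, sketched in Python, not something from Steve's paper):

```python
def decompose_counts(a, b):
    """Triangle counts after one decomposition step, where a counts
    A/A' triangles and b counts B/B' triangles:
    A -> {A, B'}  and  B -> {B, A, B'}  (likewise for the primed types)."""
    return a + b, a + 2 * b

a, b = 0, 1          # start from a single B triangle
totals = []
for _ in range(8):   # the three decompositions above, then five more
    a, b = decompose_counts(a, b)
    totals.append(a + b)
# totals == [3, 8, 21, 55, 144, 377, 987, 2584]
```

The total therefore grows by a factor approaching $\varphi^2 \approx 2.618$ per level, where $\varphi$ is the golden ratio.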

Now we have lots of triangles. Let's take a look.

```
clf
showTiles(t)
```

Zoom in.

axis([-0.3 0.2 0.1 0.5])

For fun, you can add to the visualization by inserting arcs or other shapes in the triangles. You just need to make everything match up across the different types of triangles. The `showDecoratedTiles` function shows one possible variation.

```
clf
showDecoratedTiles(t)
axis([-0.2 0.1 0.2 0.4])
```

Another interesting thing to try is to start from a pattern of multiple triangles instead of just one. You just need to arrange the initial triangles so that they satisfy the side matching rules. Let's use a circular pattern of alternating A and A' triangles sharing a common apex.

```
t = table;
for k = 1:5
    thetad = 72*(k-1);
    t_a = aTriangle(0,cosd(thetad) + 1i*sind(thetad),[]);
    t_ap = apTriangle(0,t_a.Right,[]);
    t = [t ; t_a ; t_ap];
end
showLabeledTriangles(t)
```

```
vertices =
   1.0000 + 0.0000i   0.0000 + 0.0000i   0.8090 + 0.5878i
   0.8090 + 0.5878i   0.0000 + 0.0000i   0.3090 + 0.9511i
   0.3090 + 0.9511i   0.0000 + 0.0000i  -0.3090 + 0.9511i
  -0.3090 + 0.9511i   0.0000 + 0.0000i  -0.8090 + 0.5878i
  -0.8090 + 0.5878i   0.0000 + 0.0000i  -1.0000 + 0.0000i
  -1.0000 + 0.0000i   0.0000 + 0.0000i  -0.8090 - 0.5878i
  -0.8090 - 0.5878i   0.0000 + 0.0000i  -0.3090 - 0.9511i
  -0.3090 - 0.9511i   0.0000 + 0.0000i   0.3090 - 0.9511i
   0.3090 - 0.9511i   0.0000 + 0.0000i   0.8090 - 0.5878i
   0.8090 - 0.5878i   0.0000 + 0.0000i   1.0000 + 0.0000i
```

```
t2 = t;
for k = 1:4
    t2 = decomposeTriangles(t2);
end
clf
showDecoratedTiles(t2)
```


Published with MATLAB® R2018a

I am addicted. I keep coming back to John Conway's Game of Life. Years ago, I wrote a chapter about Life in Experiments with MATLAB. I wrote a three-part series about Life shortly after I started blogging. Part 1, Part 2, Part 3. I have recently made significant enhancements to my MATLAB program for Life. I never seem to finish the code because I get diverted by actually using the program to explore the Universe. I invite you to join me. But, fair warning, you might become addicted too.

Recall how the Game of Life works. The *universe* is an infinite, two-dimensional rectangular grid. The *population* is a collection of grid cells that are marked as *alive*. The population evolves at discrete time steps known as *generations*. At each step, the fate of each cell is determined by the vitality of its eight nearest neighbors and this rule:

- A live cell with two live neighbors, or any cell with three live neighbors, is alive at the next step.

The fascination of Life is that this deceptively simple rule leads to an incredible variety of patterns, puzzles, and unsolved problems.

Here is the core of the code. Sparse indexing makes it look easy. The graphics is done by the sparse display program, `spy`.

dbtype 69:85 life_lex

```
69    % Whether cells stay alive, die, or generate new cells depends
70    % upon how many of their eight possible neighbors are alive.
71    % Index vectors increase or decrease the centered index by one.
72
73    n = size(X,1);
74    p = [1 1:n-1];
75    q = [2:n n];
76
77    % Count how many of the eight neighbors are alive.
78
79    Y = X(:,p) + X(:,q) + X(p,:) + X(q,:) + ...
80        X(p,p) + X(q,q) + X(p,q) + X(q,p);
81
82    % A live cell with two live neighbors, or any cell with
83    % three live neighbors, is alive at the next step.
84
85    X = (X & (Y == 2)) | (Y == 3);
```
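The same index-vector trick translates line for line into NumPy. Here is a sketch of one generation; Cleve's program stores `X` as a sparse matrix, while this dense version keeps the clamped edge indexing of the excerpt, so it is only faithful while the population stays away from the boundary:

```python
import numpy as np

def life_step(X):
    """One Game of Life generation on a square boolean grid,
    mirroring the MATLAB excerpt above."""
    X = X.astype(bool)
    Z = X.astype(int)
    n = X.shape[0]
    p = np.r_[0, np.arange(n - 1)]       # index shifted by one, clamped
    q = np.r_[np.arange(1, n), n - 1]    # index shifted the other way, clamped
    # Count how many of the eight neighbors are alive.
    Y = (Z[:, p] + Z[:, q] + Z[p, :] + Z[q, :] +
         Z[np.ix_(p, p)] + Z[np.ix_(q, q)] +
         Z[np.ix_(p, q)] + Z[np.ix_(q, p)])
    # A live cell with two live neighbors, or any cell with
    # three live neighbors, is alive at the next step.
    return (X & (Y == 2)) | (Y == 3)
```

A horizontal blinker flips to vertical and back with period 2, and a glider reproduces itself one cell down and across every four generations, which you can confirm with a few calls to `life_step`.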

So how, exactly, does an infinite universe work? The same question is asked by astrophysicists and cosmologists about our own universe. Over the years, I have offered several MATLAB Game of Life programs that have tackled this question in different ways. I finally have a solution that I like.

In all my programs, the population is represented by a *sparse matrix*. This means that the storage requirement, and the execution time, are proportional to the size of the population, not to the size of some grid. The sparse matrix readily accommodates a growing population. There are no boundaries and consequently no boundary conditions.

But how do I *display* an infinite universe? In the past I've had a window that is a snapshot of the entire population. But many interesting Life configurations throw off isolated *gliders*. These are five-celled objects that move forever in one direction. (Think of Voyager I.) A display window that includes all of the gliders must expand accordingly. This means the rest of the population, which often remains bounded, is relegated to a shrinking fraction of the window. The solution is to let isolated gliders leave the display window and count them as they go.

My first example is a single *glider*. At each step two cells die, and two new ones are born. After four steps the original population reappears, but it has moved diagonally up and across the grid. It moves in this direction forever, continuing to exist in the infinite universe. This animated gif shows every step for the first eight steps, then captures every fourth step. When the glider reaches the edge of the window, it is counted and disappears from view. The clock continues to run, even though there is nothing to see.

My latest program is called `life_lex`. The starting populations for `life_lex` are obtained from the Life Lexicon, a valuable Web resource cataloging well over a thousand terms related to the Game of Life. The Lexicon now includes almost seven hundred starting populations. The supporting community is very active -- many discoveries have been made in the last few years. You can see the entire collection by selecting "slide show" in the `life_lex` menu. The show lasts about 11 minutes.

The Lexicon is available in two forms. The text form is used by `life_lex` and is distributed with it as `lexicon.txt`. The HTML form, which is extensively cross linked, is available on the Web. The `lexicon` toggle provides the link.

The *Herschel* is named after the astronomer who discovered Uranus in 1781. As to *how* it came to be named after him, you will have to read the Lexicon. It is one of the most frequent features occurring in larger populations. It starts with a *heptomino*, a population of size 7. The population grows to over 100 before it reaches a steady state of size 24 at time 128. It creates two gliders in the process.

Here are the first eight generations and every fourth one after that.

The story behind the name of the Fx77 configuration is very technical. "Fx" means the signal moves forward and produces a mirror-image output. The "77" comes from the fact that a pair of Herschels appear at time 77. Notice that if you cut the window in half horizontally, the two halves have very similar subpopulations. The population count begins at 67 at time zero, reaches a maximum of 294 at time 179, and eventually stabilizes at 104 for time greater than 206.

Fx77 creates seven gliders. I've been using it to test my glider counting code. It is a good example of a population that remains bounded in space except for its gliders. Here is every tenth generation.

There are many flavors of *soup*. This example is a massive population that is unbounded and does not produce any gliders. The Lexicon informs us that it is an unusual mirror-symmetric soup that produces a *pufferfish* and nothing else.

Here is a screen shot of my latest program, `life_lex`. The toggles let you select the display time for a generation -- single step, slow, or fast. You can page forward or backward through the Lexicon. You can access this blog post. You can select from a menu of more than a dozen suggestions, populations that I have found especially interesting.

Here is the current list in the suggestions menu. It's subject to change as I come across other goodies.

```
'random'
'pre-block'
'blinker'
'glider'
'Gosper glider gun'
'Gabriel''s p138'
'Snark'
'spacefiller'
'Fx77'
'against the grain'
'soup'
'block and glider'
'lightspeed'
'volatility'
'gliders by the dozen'
'total aperiodic'
....
'Noah''s ark'
'slide show'
```

I will include `life_lex` in a new release, 3.8, of Cleve's Laboratory.

In case you want `life_lex` by itself, I will submit it separately to https://www.mathworks.com/matlabcentral/fileexchange/69383-game-of-life. The submission includes a copy of `Lexicon.txt`.


Published with MATLAB® R2018a

Two months ago I wrote a blog post about Teaching Calculus to a Deep Learner. We wrote the code for that post in one afternoon in the MathWorks booth at the SIAM Annual Meeting. Earlier that day, during his invited talk, MIT Professor Gil Strang had spontaneously wondered if it would be possible to teach calculus to a deep learning computer program. None of us in the booth were experts in deep learning.

But MathWorks does have experts in deep learning. When they saw my post, they did not hesitate to suggest some significant improvements. In particular, Conor Daly, in our MathWorks UK office, contributed the code for the following post. Conor takes up Gil's challenge and begins the process of learning about derivatives.

We are going to employ two different neural nets, a convolutional neural net, which is often used for images, and a recurrent neural net, which is often used for sounds and other signals.

Is a derivative more like an image or a sound?

Here are the functions and derivatives that we are going to consider.

```
F = {@(x) x, @(x) x.^2, @(x) x.^3, @(x) x.^4, ...
     @(x) sin(pi*x), @(x) cos(pi*x) };
dF = { @(x) ones(size(x)), @(x) 2*x, @(x) 3*x.^2, @(x) 4*x.^3, ...
     @(x) pi.*cos(pi.*x), @(x) -pi*sin(pi*x) };
Fchar = { 'x', 'x^2', 'x^3', 'x^4', 'sin(\pi x)', 'cos(\pi x)' };
dFchar = { '1', '2x', '3x^2', '4x^3', '\pi cos(\pi x)', '-\pi sin(\pi x)' };
```
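As a sanity check, each entry of `dF` really is the derivative of the corresponding entry of `F`; here is a quick numerical verification sketched in Python with centered differences (not part of Conor's code):

```python
import numpy as np

# the six functions and their claimed derivatives, as in the cell arrays above
F = [lambda x: x, lambda x: x**2, lambda x: x**3, lambda x: x**4,
     lambda x: np.sin(np.pi * x), lambda x: np.cos(np.pi * x)]
dF = [lambda x: np.ones_like(x), lambda x: 2 * x, lambda x: 3 * x**2,
      lambda x: 4 * x**3, lambda x: np.pi * np.cos(np.pi * x),
      lambda x: -np.pi * np.sin(np.pi * x)]

def max_mismatch(f, df, h=1e-6):
    """Largest gap on [-1, 1] between df and a centered difference of f."""
    x = np.linspace(-1.0, 1.0, 201)
    return np.max(np.abs((f(x + h) - f(x - h)) / (2 * h) - df(x)))

# every pair agrees to within centered-difference accuracy
```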

Set some parameters. First, the random number generator state.

rng(0)

A function to generate uniform random variables on [-1, 1].

randu = @(m,n) (2*rand(m,n)-1);

A function to generate random +1 or -1.

randsign = @() sign(randu(1,1));

The number of functions.

m = length(F);

The number of repetitions, i.e. independent observations.

n = 500;

The number of samples in the interval.

nx = 100;

The white noise level.

noise = .001;

Generate the training set predictors `X` and the responses `T`.

```
X = cell(n*m, 1);
T = cell(n*m, 1);
for j = 1:n
    x = sort(randu(1, nx));
    for i = 1:m
        k = i + (j-1)*m;
        % Predictors are x, a random vector from -1, 1, and +/- f(x).
        sgn = randsign();
        X{k} = [x; sgn*F{i}(x)+noise*randn(1,nx)];
        % Responses are +/- f'(x)
        T{k} = sgn*dF{i}(x)+noise*randn(1,nx);
    end
end
```

Separate the training set from the test set.

```
idxTest = ismember( 1:n*m, randperm(n*m, n) );
XTrain = X( ~idxTest );
TTrain = T( ~idxTest );
XTest = X( idxTest );
TTest = T( idxTest );
```

Choose some test indices to plot.

```
iTest = find( idxTest );
idxM = mod( find(idxTest), m );
idxToPlot = zeros(1, m);
for k = 0:(m-1)
    im = find( idxM == k );
    if k == 0
        idxToPlot(m) = im(1);
    else
        idxToPlot(k) = im(1);
    end
end
```

Re-format the data for CNN.

```
[XImgTrain, TImgTrain] = iConvertDataToImage(XTrain, TTrain);
[XImgTest, TImgTest] = iConvertDataToImage(XTest, TTest);
```

Here are the layers of the CNN architecture. Notice that the `'ReLU'`, or "rectified linear unit", that I was so proud of in my previous post has been replaced by the more appropriate `'leakyRelu'`, which does not completely cut off negative values.
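The distinction is simple to state: ReLU zeroes negative inputs entirely, while a leaky ReLU with scale 0.5 passes them through attenuated, so negative activations still carry information. A toy illustration in Python, standing in for the MATLAB layers:

```python
import numpy as np

def relu(x):
    """Plain rectified linear unit: max(x, 0)."""
    return np.maximum(x, 0.0)

def leaky_relu(x, scale=0.5):
    """Like leakyReluLayer(0.5): x where x > 0, otherwise scale*x."""
    return np.where(x > 0, x, scale * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
# relu(x)       -> [ 0.    0.    0.    1.5 ]
# leaky_relu(x) -> [-1.   -0.25  0.    1.5 ]
```

Since derivatives are negative about half the time here, cutting off negative values would discard half the signal; the leaky variant avoids that.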

```
layers = [ ...
    imageInputLayer([1 nx 2], 'Normalization', 'none')
    convolution2dLayer([1 5], 128, 'Padding', 'same')
    batchNormalizationLayer()
    leakyReluLayer(0.5)
    convolution2dLayer([1 5], 128, 'Padding', 'same')
    batchNormalizationLayer()
    leakyReluLayer(0.5)
    convolution2dLayer([1 5], 1, 'Padding', 'same')
    regressionLayer() ];
```

Here are the options for CNN. The solver is `'sgdm'`, which stands for "stochastic gradient descent with momentum".

```
options = trainingOptions( ...
    'sgdm', ...
    'MaxEpochs', 30, ...
    'Plots', 'training-progress', ...
    'MiniBatchSize', 200, ...
    'Verbose', false, ...
    'GradientThreshold', 1, ...
    'ValidationData', {XImgTest, TImgTest} );
```

Train the network. This requires a little over 3 minutes on my laptop. I don't have a GPU.

convNet = trainNetwork(XImgTrain, TImgTrain, layers, options);

Here are plots of randomly selected results. The limits on the y axes are set to the theoretical max and min. Three of the six plots have their signs flipped.

```
PImgTest = convNet.predict( XImgTest );
for k = 1:m
    subplot(3, 2, k);
    plot( XImgTest(1, :, 1, idxToPlot(k)), TImgTest(1, :, 1, idxToPlot(k)), '.' )
    hold on
    plot( XImgTest(1, :, 1, idxToPlot(k)), PImgTest(1, :, 1, idxToPlot(k)), 'o' )
    title([ '(' Fchar{k} ')'' = ' dFchar{k} ] );
    switch k
        case {1,2}, set(gca,'ylim',[-2 2])
        case {3,4}, set(gca,'ylim',[-k k],'ytick',[-k 0 k])
        case {5,6}, set(gca,'ylim',[-pi pi],'ytick',[-pi 0 pi], ...
                'yticklabels',{'-\pi' '0' '\pi'})
    end
end
```

Here are the layers of the RNN architecture, including `'bilstm'` which stands for "bidirectional long short-term memory."

```
layers = [ ...
sequenceInputLayer(2)
bilstmLayer(128)
dropoutLayer()
bilstmLayer(128)
fullyConnectedLayer(1)
regressionLayer() ];
```

Here are the RNN options. `'adam'` is not an acronym; it is an extension of stochastic gradient descent derived from adaptive moment estimation.

```
options = trainingOptions( ...
    'adam', ...
    'MaxEpochs', 30, ...
    'Plots', 'training-progress', ...
    'MiniBatchSize', 200, ...
    'ValidationData', {XTest, TTest}, ...
    'Verbose', false, ...
    'GradientThreshold', 1);
```

Train the network. This takes almost 22 minutes on my machine. It makes me wish I had a GPU.

recNet = trainNetwork(XTrain, TTrain, layers, options);

```
PTest = recNet.predict( XTest );
for k = 1:m
    subplot(3, 2, k);
    plot( XTest{idxToPlot(k)}(1,:), TTest{idxToPlot(k)}(1,:), '.' )
    hold on
    plot( XTest{idxToPlot(k)}(1,:), PTest{idxToPlot(k)}(1,:), 'o' )
    title([ '(' Fchar{k} ')'' = ' dFchar{k} ] );
    switch k
        case {1,2}, set(gca,'ylim',[-2 2])
        case {3,4}, set(gca,'ylim',[-k k],'ytick',[-k 0 k])
        case {5,6}, set(gca,'ylim',[-pi pi],'ytick',[-pi 0 pi], ...
                'yticklabels',{'-\pi' '0' '\pi'})
    end
end
```

```
function [XImg, TImg] = iConvertDataToImage(X, T)
% Convert data to CNN format
    % Re-format data for CNN
    XImg = cat(4, X{:});
    XImg = permute(XImg, [3 2 1 4]);
    TImg = cat(4, T{:});
    TImg = permute(TImg, [3 2 1 4]);
end
```

I used to teach calculus. I have been critical of the way calculus is sometimes taught and more often learned. Here is a typical scenario.

*Instructor*: What is the derivative of $x^4$?

*Student*: $4x^3$.

*Instructor*: Why?

*Student*: You take the $4$, put it in front, then subtract one to get $3$, and put that in place of the $4$ . . .

I am afraid we're doing that here. The learner is just looking for patterns. There is no sense of *velocity*, *acceleration*, or *rate of change*. There is little chance of differentiating an expression that is not in the training set. There is no *product rule*, no *chain rule*, no *Fundamental Theorem of Calculus*.

In short, there is little *understanding*. But maybe that is a criticism of machine learning in general.

Get the MATLAB code

Published with MATLAB® R2018b


This week’s post is by Reece Teramoto.

MathWorks recently took over sponsorship of the Math Modeling Challenge (https://m3challenge.siam.org), a contest for high school juniors and seniors in the U.S. organized by SIAM, the Society for Industrial and Applied Mathematics. Students work in teams to solve real-world problems that professional mathematicians face. Teams are given 14 hours to complete a three-part problem using math modeling and put together a paper showing their results. Teams can choose to use any software or language (or none) to do their mathematical computations.

We decided to put together a set of sample solutions to this past year’s problem done with MATLAB.

This past year’s challenge focused on the problem of wasted food. Students were challenged to perform computations to analyze the amount of wasted food and apply the results to come up with solutions for feeding the millions of food-insecure individuals in the country. You can view the full problem here and the winning solutions here.

In this post, I’ll be walking through a brief overview of each part of the problem as well as a summary of a sample solution put together in MATLAB. The corresponding live script for each example will demonstrate how to do the following in MATLAB:

- Import data
- Process and model data
- Visualize the results (charts, graphs, curve fitting, plots)
- Make meaningful conclusions from the results

While all three parts of the problem demonstrate the steps above, the three parts will each focus on a different tool in MATLAB. We will focus on the Import Tool, Curve Fitting app, and Mapping Toolbox.

Before reading through the rest of this post, please download the full MATLAB examples from File Exchange so that you may refer to them.

- Just Eat It! (Import Tool)
- Food Foolish? (Curve Fitting app)
- Hunger Game Plan? (Mapping Toolbox)
- Want to participate in the challenge?

You can find the full example for this problem by opening the file `Just_Eat_It.mlx` from the archive linked to at the beginning of this blog post.

For this part of the problem, students are to create a mathematical model that a state could use to determine if it could feed its food-insecure population using the wasted food generated in that state. Students are provided with sample data for food waste and insecurity in the state of Texas.

This is one way to approach this problem, employing technical computing skills:

- Calculate how many dollars of food the state consumers waste.
- Calculate how many dollars are needed to feed all the food-insecure individuals in the state.
- Compare the two values to determine how much of the state’s food-insecure population can be fed by using the wasted food of the state’s consumers.
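The final comparison boils down to a few lines. Here is a hedged sketch with made-up variable names; the dollar totals are the rounded figures computed later in this example, not the exact spreadsheet results.

```
% Rounded annual totals for Texas (illustrative values only)
wastedDollars = 2.3e9;      % value of food wasted by Texas consumers
dollarsNeeded = 2.2e9;      % cost to feed all food-insecure Texans

% Fraction of the food-insecure population that the wasted food could cover
fractionFed = min(wastedDollars/dollarsNeeded, 1)
```

Since the wasted-food value exceeds the cost, the fraction caps at 1: everyone could, in principle, be fed.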

First, we can use the MATLAB Import Tool to import the provided spreadsheet data and save everything to tables. This import can be performed with a few clicks:

Once we have our data in MATLAB, we can use many built-in MATLAB functions to organize and summarize our data.

From the data, we can calculate the dollar value of wasted food in Texas for different categories, which we can visualize:

```
bar(foodWasteTable.TypeofFood, foodWasteTable.WastedDollars);
ytickformat('usd')
xlabel('Type of Food')
ylabel('Amount Wasted')
title('Dollars of Wasted Food in Texas (2016)')
% change the y-axis to be in the form of n x 10^6 (millions)
ax = gca;
ax.YAxis.Exponent = 6;
```

```
% sort from smallest to largest food waste
foodWasteTable = sortrows(foodWasteTable, 'WastedDollars');
% combine the 3 smallest categories and label them as "other"
p = pie([sum(foodWasteTable.WastedDollars(1:3)); foodWasteTable.WastedDollars(4:end)]);
legend(["other"; foodWasteTable.TypeofFood(4:end)], 'Location', 'eastoutside')
title('Categories of Wasted Food')
```

As one might expect, the two largest types of wasted food are meat and produce. After further calculations, we find that the value of wasted food in Texas is about $2.3 billion annually.

The second step is finding how many dollars are needed to feed all the food-insecure individuals. Feeding America performs this calculation here. We can easily translate it into MATLAB to find that the annual cost to feed the food-insecure individuals in Texas is about $2.2 billion.

For the final comparison, and based on our assumptions noted in the file `Just_Eat_It.mlx`, we determined that we can feed all the food-insecure individuals in Texas by repurposing the wasted food in the state.

You can find the full example for this problem by opening the file `Food_Foolish.mlx` from the archive linked to at the beginning of this blog post.

Personal choices about food consumption occur primarily at grocery stores, school cafeterias, restaurants, and at home. For this part of the problem, students are to create a mathematical model that can be used to determine the amount of food waste a household generates in a year based on its traits and habits. The model should then be applied to the following households:

- Single parent with a toddler, annual income of $20,500
- Family of four (two parents, two teenage children), annual income of $135,000
- Elderly couple, living on retirement, annual income of $55,000
- Single 23-year-old, annual income of $45,000

We will use the Curve Fitting App in MATLAB to find the relationship between household income and wasted food. Our approach is as follows:

- Gather data for household incomes and wasted food. We can think of each data point as a point on an XY-plane, where the X-axis is household income and the Y-axis is wasted food.
- Use the Curve Fitting App to generate a function that models the trend.
- Visualize this relationship.
- Evaluate the curve at the income levels of the sample family data.
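Steps 2 and 4 can also be done programmatically with the `fit` function, which the Curve Fitting app can generate code for. This is a hedged sketch: the variable names `income` and `waste` and the `'power1'` library model are assumptions, not necessarily the fit chosen in the actual example.

```
% income and waste are column vectors imported from the spreadsheet data
foodWasteCurve = fit(income, waste, 'power1');   % power-law model y = a*x^b
plot(foodWasteCurve, income, waste)              % overlay fit on the data

% Evaluate the fitted curve at the four sample household incomes
sampleIncome = [20500 135000 55000 45000]';
sampleWaste = foodWasteCurve(sampleIncome)
```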

The U.S. Bureau of Labor Statistics has collected data on annual food expenditures for individuals of varying income. You can find the data in this spreadsheet. We can import this data into MATLAB using the Import Tool and store the data in a table.

Once we have the data points, we can use the Curve Fitting App to generate the curve fit. The app makes it easy to quickly try out different fits of a curve to the data points and see the immediate result of each fit:

Once we have our fitted curve, we can pass in the sample family income data:

```
sampleIncome = [20500 135000 55000 45000]';
sampleWaste = foodWasteCurve(sampleIncome);
families = {'Single parent with a toddler', ...
    'Family of four (two parents, two teenage children)', ...
    'Elderly couple, living on retirement', 'Single 23-year-old'};
solution = table(sampleIncome, sampleWaste, 'RowNames', families)
```

We can also plot both a bar graph and line graph of the trend of food waste for various incomes. Then, we can plot points for the 4 sample families given in the problem statement. This graph makes it easy to visualize the relationship between household income and annual dollars of wasted food. Since we assumed that food waste depends *only* on household income, we have a solution for this part of the problem.

```
% bar graph
bar(incomeRange, wasteDemo)
hold on
% line graph
plot(incomeRange, wasteDemo, 'LineWidth', 3, 'Color', 'green')
% plot points of the 4 families from the problem statement
scatter(solution.sampleIncome, solution.sampleWaste, ...
    'Marker', '*', 'LineWidth', 5, 'MarkerEdgeColor', 'red')
hold off
xlabel('Income (USD)')
ylabel('Annual Food Waste (USD)')
title('Food Waste by Income')
```

You can find the full example for this problem by opening the file `Hunger_Game_Plan.mlx` from the archive linked to at the beginning of this blog post.

For this problem, students were to use mathematical modeling to provide insight on a strategy that a local community could use to repurpose wasted food with minimal cost.

Our approach was as follows:

- Use Mapping Toolbox to display the locations of two dozen grocery stores, including all the Stop and Shops, in Boston.
- Display the location of The Greater Boston Food Bank.
- Find the shortest path that starts at the food bank, visits all the grocery stores to collect otherwise-wasted produce from the past week, and ends at the food bank.
- Calculate the trucking costs and dollars of saved produce from the past week.

Since Stop and Shop is the most popular grocery store chain in New England, and The Greater Boston Food Bank is an organization that MathWorks supports, our approach was to find the shortest route for a food-collection truck that starts at The Greater Boston Food Bank, visits each Stop and Shop and a few other stores in the Boston area to collect otherwise-wasted produce, and returns to The Greater Boston Food Bank. We want the shortest path because it minimizes the cost of operating the truck.

This particular scenario is known in computer science as the Travelling Salesman Problem (TSP), in which a salesman starts from their home city and wants to visit multiple cities and return home in the shortest possible path.
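A back-of-the-envelope count shows why brute force is hopeless and a heuristic is needed. Fixing the starting city and identifying each tour with its reverse, $n$ locations admit $(n-1)!/2$ distinct closed circuits. For our 25 stops (24 stores plus the food bank) that is

$\frac{24!}{2} \approx 3.1 \times 10^{23}$

far too many tours to enumerate.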

Mapping Toolbox lets us plot points on a map given their latitude and longitude values. We start by displaying all the Stop and Shop locations in the Boston area (identified by a green marker). The coordinates were obtained from Google Maps.

This TSP with only 13 locations is not much of a challenge, so we add 11 more locations (shown in dark red) chosen at random, as well as The Greater Boston Food Bank (indicated in blue).

```
wmmarker(Lats(1:13), Lons(1:13), 'Icon', 'stopandshoplogo.jpg')
wmmarker(Lats(14:24), Lons(14:24), 'color', darkred)
wmmarker(Lats(25), Lons(25), 'color', 'blue')
```

The previous Cleve’s Corner post described a near-optimal solution of the Travelling Salesman Problem. We can pass the coordinates to `traveler` to obtain the shortest path between them. We can view this path using the Mapping Toolbox as well:

```
Lats = Lats(path);
Lons = Lons(path);
wmline(Lats, Lons)
```

Although the path returned does not follow the exact routes of the roads, we still get a good estimate of the shortest path and the distance traveled. Since the expense of operating an 18-wheeler is a little over \$1 per mile, we can calculate the total cost of this trip: around \$30.

Additionally, we can use some data from the Natural Resources Defense Council to calculate that the average grocery store in the U.S. wastes about \$7500 per week in fruits and vegetables. This means that by completing this round trip, we will salvage almost \$200,000 worth of otherwise-wasted fruits and vegetables from just these locations in the Boston area from the past week. Pretty good!

Hopefully this has given you a glimpse at how easy MATLAB makes it to analyze, visualize, and perform computations on data. To view the full MATLAB solutions for the above problems, please download the files from File Exchange.

Additionally, if you know of any high school juniors and seniors who would be interested in the challenge, encourage them to participate! The challenge is free to enter and there are $100,000 in scholarships at stake. The challenge is done completely online and there are plenty of reference materials to help teams prepare. For more information, please visit the following link:


Find the optimum traveling salesman tour through the capitals of the 48 contiguous states in the USA.

I am continuing with the workspace from my previous blog post.

John Burkardt provides a second data set, state_capitals_ll.txt, giving the longitudes and latitudes of capitals of the 48 contiguous states. (Ignore AK, DC, HI, PR and US.)

[lon,lat] = get_capital_coordinates;

When I use these coordinates for the graph I created in my previous post, we get a more accurate picture of the USA.

```
G = graph(A, cellstr(names));
plot(G, 'xdata', lon, 'ydata', lat);
```

The travelling salesman problem, the TSP, was mathematically formulated in the 19th century. The problem is to find the closed circuit of a list of cities that travels the shortest total distance.

For 25 years MATLAB releases have included a simple demo program named `travel` that finds an approximate solution to the TSP through a few dozen randomly chosen points. The background of the plot, an outline of the USA, is just for artistic effect.

I've made the `travel` demo into a function named `traveler` whose input is the distance matrix `D`, the pairwise distances between the cities. The output is a path length `miles` and permutation vector `p` so that `miles` is the length of the route produced by `lon(p)` and `lat(p)`.

Each time it is called, `traveler` starts with a random permutation of the cities. Then, for several thousand iterations, it tries to reduce the length of the trip by revising the permutation in each of two simple ways. One scheme is to move one city to another position in the ordering. This example starts with the order `[1:7]`. It then moves city `4` to follow city `5`, so the order becomes `[1 2 3 5 4 6 7]`. This reduces the length from `9.71` to `7.24`.

```
half_fig
traveler_scheme1
```

The second scheme is based on the observation that any time a route crosses itself, the length can be reduced by reversing the crossed segment. We don't actually look for crossings; we just try reversing random chunks of the ordering. This example reverses the `[3 4 5]` in `[1:7]` to get `[1 2 5 4 3 6 7]`, thereby reducing the length from `9.47` to `7.65`.

traveler_scheme2
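Both moves are easy to reproduce at the command line. Here is a sketch of the two permutation updates on the examples above; the lengths quoted in the text depend on city coordinates, so only the orderings are shown.

```
p = 1:7;                    % original ordering

% Scheme 1: move city 4 to follow city 5.
q1 = p;
q1(4) = [];                 % remove city 4 ...
q1 = [q1(1:4) 4 q1(5:6)];   % ... and reinsert it after city 5
% q1 is now [1 2 3 5 4 6 7]

% Scheme 2: reverse the chunk [3 4 5].
q2 = p;
q2(3:5) = q2(5:-1:3);
% q2 is now [1 2 5 4 3 6 7]
```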

These two heuristics are not powerful enough to guarantee that `traveler` will find the globally shortest path. It's easy to get stuck in local minima. And starting from a random path means that `traveler` is inefficient for more than a few dozen cities.

But nobody knows how to find a guaranteed shortest path for the TSP and the code for `traveler` is only 58 lines. I will include the code at the end of this post and in Cleve's Laboratory on the MATLAB Central File Exchange.

close

The input to `traveler` is the distance matrix.

D = distances(lon,lat);

The inputs to the function `distances` are the vectors `lon` and `lat`, the longitude and latitude of the cities. The output is the 48-by-48 matrix `D`, the great circle distance between pairs of cities. With some spherical trigonometry, `lon` and `lat`, which are essentially angles measured in degrees, are converted to miles, the lengths of "as the crow flies" geodesics on the surface of the earth. The radius of the earth is a scale factor. Here is the core of the function.

dbtype 11:20 distances.m

```
11    R = 3959;    % radius of the earth (mi.)
12    n = length(lon);
13    D = zeros(n,n);
14    for k = 1:n
15        for j = 1:k-1
16            D(k,j) = R*acos(sind(lat(k))*sind(lat(j)) + ...
17                cosd(lat(k))*cosd(lat(j))*cosd(lon(k)-lon(j)));
18            D(j,k) = D(k,j);
19        end
20    end
```
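The spherical trigonometry inside the double loop is the spherical law of cosines. With latitudes $\phi$ and longitudes $\lambda$, the great circle distance between cities $k$ and $j$ is

$D_{kj} = R \arccos\left( \sin{\phi_k} \sin{\phi_j} + \cos{\phi_k} \cos{\phi_j} \cos({\lambda_k - \lambda_j}) \right)$

where $R = 3959$ miles is the radius of the earth.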

Which two capitals are nearest each other?

```
Dmin = min(min(D(D>0)))
[k,j] = find(D == Dmin)
cities = names(k)
```

```
Dmin =
   34.9187
k =
    37
    19
j =
    19
    37
cities =
  2×1 string array
    "RI"
    "MA"
```

Providence, Rhode Island and Boston, Massachusetts ("our fair city") are less than 35 miles apart.

What about the other extreme?

```
Dmax = max(max(D))
[k,j] = find(D == Dmax)
cities = names(k)
```

```
Dmax =
   2.6629e+03
k =
    17
     4
j =
     4
    17
cities =
  2×1 string array
    "ME"
    "CA"
```

As we might have expected, the capitals of Maine and California are the farthest apart, 2663 miles. That would require one prodigious crow.

Based on some experience that I will describe shortly, I am going to set the random number seed to `347` before we take our road trip with `traveler`.

```
rng(347)
[miles,p] = traveler(D);
miles = round(miles)
```

miles = 10818

That's an average of

avg = miles/48

avg = 225.3750

miles per leg.

Let's highlight our graph of neighboring states with our traveling salesman path linking capital cities.

```
Gp = plot(G, 'xdata', lon, 'ydata', lat, 'edgecolor', turquoise);
highlight(Gp, p, 'edgecolor', darkred, 'linewidth', 2)
title(miles)
```

There are four missing links. The path goes from UT to MT, but those states are not neighbors. We must go through ID. And the leg from FL to SC goes through GA. There are two more in the northeast. Fill in those missing links with dashed lines.

missing_links(A,lon,lat,p,darkred)

Zoom in on the northeast.

axis([-84 -68 33 46])

Here are the state abbreviations, ordered by this path.

fprintf(fmt,names(p))

LA TX OK NM AZ NV CA OR WA ID MT UT CO WY SD ND MN WI MI OH WV PA NY VT ME NH MA RI CT NJ DE MD VA NC SC FL AL GA TN KY IN IL IA NE KS MO AR MS LA

It is interesting to look at the last step. After 3298 iterations, `traveler` arrives at this situation with a path length of 10952.

The link connecting WV to VA crosses the link connecting NC to PA. It takes `traveler` 222 more iterations, but eventually a permutation that reverses the portion of the path from VA to PA is tried. This connects WV to PA and NC to VA to produce the path shown in the previous plot. The total path length is decreased by 134 miles to 10818.

I am confident that we have found the shortest path, but I don't have a proof.

I ran 1000 different initializations of `rng`. Eight runs produced the path with length 10818 that we found with `rng(347)`. None of the runs found a shorter path. Sixteen more runs found a quite different path with length 10821, only three miles longer. Here is this next best path.

I must include an animated gif here. This initially shows every sixth step. It switches to every step when the path length is less than 12000. Few legs in random paths match edges in the neighbor graph so initially most lines are dashed. As we near the minimum, more solid lines appear.

Chances that, without help, `traveler` will find the shortest route through the 48 capital cities are less than 1 in 100. Here is your opportunity to help guide the search. The `traveler_game` is a modification of `traveler` that plots every successful step and provides controls so that you can back up a few steps and try the random strategy again.

I will include the `traveler_game` in the next update, version 3.70, of Cleve's Laboratory on the MATLAB Central File Exchange. That may take a few days.

There are five push buttons.

- > Take one minimization step.
- >> Take repeated steps, until a local minimum is reached.
- < Reverse one step.
- << Reverse many steps.
- ^ Start over in a new figure window.

Here are four typical runs.

Here are the codes for `traveler`, `distances`, and `path_length`.

```
type traveler
type distances
type path_length
```

```
function [len,p] = traveler(D)
%TRAVELER Functional form of old MATLAB demo, "travel".
%   A pretty good, but certainly not the best, solution of the
%   Traveling Salesman problem. Form a closed circuit of a
%   number of cities that travels the shortest total distance.
%
%   [len,p] = traveler(D).
%   Input: D = distances between cities with coordinates x and y.
%   Output: p a permutation, x(p) and y(p) is a path with length len.

%   Copyright 1984-2018 The MathWorks, Inc.

    n = size(D,1);
    p = randperm(n);
    len = path_length(p,D);
    for iter = 1:10000
        % Try to reverse a portion.
        pt1 = floor(n*rand)+1;
        pt2 = floor(n*rand)+1;
        lo = min(pt1,pt2);
        hi = max(pt1,pt2);
        q = 1:n;
        q(lo:hi) = q(hi:-1:lo);
        pnew = p(q);
        lennew = path_length(pnew,D);
        if lennew < len
            p = pnew;
            len = lennew;
            iterp = iter;
        end
        % Try a single point insertion
        pt1 = floor(n*rand)+1;
        pt2 = floor((n-1)*rand)+1;
        q = 1:n;
        q(pt1) = [];
        q = [q(1:pt2) pt1 q((pt2+1):(n-1))];
        pnew = p(q);
        lennew = path_length(pnew,D);
        if lennew < len
            p = pnew;
            len = lennew;
            iterp = iter;
        end
    end
    % Close the permutation.
    p(end+1) = p(1);
end

function D = distances(lon,lat)
% D = distances(lon,lat)
% Input: vectors lon and lat, the longitude and latitude of the cities.
% Output: D(k,j) is the distance (in miles) between cities k and j.

%   Copyright 1984-2018 The MathWorks, Inc.

    % Great circle distance matrix between pairs of cities.
    % https://en.wikipedia.org/wiki/Great-circle_distance.
    R = 3959;    % radius of the earth (mi.)
    n = length(lon);
    D = zeros(n,n);
    for k = 1:n
        for j = 1:k-1
            D(k,j) = R*acos(sind(lat(k))*sind(lat(j)) + ...
                cosd(lat(k))*cosd(lat(j))*cosd(lon(k)-lon(j)));
            D(j,k) = D(k,j);
        end
    end
end

function len = path_length(p,D)
% len = path_length(p,D)
% Calculate current path length for traveling salesman problem.
% This function calculates the total length of the current path
% p in the traveling salesman problem.
%
% This is a vectorized distance calculation.
%
% Create two vectors: p and q = p([n 1:(n-1)]).
% The first is the list of first cities in any leg, and the second
% is the list of destination cities for that same leg.
% (the second vector is an element-shift-right version of the first)
%
% Use column indexing into D to create a vector of
% lengths of each leg which can then be summed.
    n = length(p);
    q = p([n 1:(n-1)]);
    len = sum(D(q + (p-1)*n));
end
```

Get the MATLAB code

Published with MATLAB® R2018a


Create a MATLAB® graph object from information about neighbors among the 48 contiguous states in the USA.

A few years ago, MathWorks asked Tom and Ray Magliozzi, MIT alumni and hosts of the popular show Car Talk on National Public Radio, to come to our offices and meet some visiting students. They said yes on one condition. They figured MathWorkers would be a good source of new material for the puzzler segment of their show. It turned out that we had only one puzzle to offer that they hadn't heard before. But one was enough, and they came for the visit.

Here is that puzzle.

*My friend tells me that she is about to take a road trip. She plans to visit each of the 48 contiguous states in the USA exactly once. She tells me she is going to start someplace in the West and doesn't tell me where she is going to finish. However, I am able to agree to meet her in the last state she will visit. Which state is that?*

I will reveal the state later in this post, but try to answer the puzzle before I do. (Readers who are not familiar with USA geography don't have to worry -- few Americans can provide the answer.)

The web site of John Burkardt at Florida State University is a tremendous repository of software, data sets and other goodies. I am going to use state neighbors. The first column in this file provides the two-letter abbreviations of the names of the 48 contiguous states.

```
S = get_state_neighbors;
names = S{:,1};
fmt = [repmat('%s ',1,16) '\n'];
fprintf(fmt, names)
```

AL AZ AR CA CO CT DE FL GA ID IL IN IA KS KY LA ME MD MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY

The list is ordered alphabetically by the full state name, so Wyoming is last.

last = names(end)

last = "WY"

Burkardt's data also gives us the neighbors of each state. We can form an adjacency matrix `A` where `A(i,j)` is one if states `i` and `j` share a common border and zero otherwise.

A = adjacency(S);

For example, Wyoming has six neighbors.

last_neighbors = names(A(end,:))

```
last_neighbors =
  6×1 string array
    "CO"
    "ID"
    "MT"
    "NE"
    "SD"
    "UT"
```

Every good adjacency deserves a `spy` plot.

spy(A)

The last row (and column) has six nonzeros for Wyoming's six neighbors.

Which states have the most neighbors?

```
num_neigh = sum(A);
max_num_neigh = max(num_neigh)
max_neigh = names(num_neigh == max_num_neigh)
```

```
max_num_neigh =
     8
max_neigh =
  2×1 string array
    "MO"
    "TN"
```

Missouri and Tennessee each have eight neighbors.

Which state has the fewest neighbors?

```
min_num_neigh = min(num_neigh)
min_neigh = names(num_neigh == min_num_neigh)
```

```
min_num_neigh =
     1
min_neigh =
    "ME"
```

Maine has only one neighbor, namely New Hampshire. This provides the answer to the MathWorks Car Talk puzzle. My friend's trip must either start or finish in Maine. Once she's in Maine, she can't leave without going through New Hampshire. Since she is starting in the West, she must finish in Maine.

Every adjacency matrix also has an associated graph.

G = graph(A,cellstr(names))

```
G =
  graph with properties:

    Edges: [105×1 table]
    Nodes: [48×1 table]
```

The default layout for plotting this graph is `force`, which uses attractive and repulsive forces on the nodes.

plot(G)

Interchange the axes by switching the viewpoint.

view(-90,-90)

The plot now resembles the United States. California, Oregon and Washington are on the west coast. Florida is a peninsula in the southeast. The New England states dominate the northeast. MO and TN show off their many neighbors.

Utah, Arizona, New Mexico and Colorado are the corners of one of two quadrilaterals in the graph, mimicking their actual meeting at Four Corners. The other quadrilateral involves the four states that surround Lake Michigan.

All this without any coordinates, just knowing the connectivity of neighboring states.

My next post will add the coordinates of the state capitals and tackle the classic Traveling Salesman Problem.

Thanks to John Burkardt for his data sets and for pointing out that the wording of the puzzle needs to preclude starting in Maine.

Get the MATLAB code

Published with MATLAB® R2018a


I probably first saw this matrix in 1960 in John Todd's class at Caltech. But I forgot about it until Tahar Loulou jogged my memory with a comment following my blog post in late May. It deserves a place in our gallery of interesting matrices.

I'm afraid that I don't know very much about T. S. Wilson himself. The only paper we could find on the Internet is the joint paper about pendulum dampers listed in the references. This paper, which does not mention the matrix, was published in 1940. The authors were from a British company called D. Napier & Son Ltd. The earliest appearance of the matrix that we know of is the 1946 paper by J. Morris referenced below. It is possible that by this time Wilson and Morris were colleagues at the Royal Aircraft Establishment in Farnborough in the UK.

I think that Wilson and Todd must have known each other. Todd acknowledges Wilson in several papers and books.

One of the references we used in Todd's class is the booklet written by Marvin Marcus, also referenced below. The booklet has a short section collecting some "particular" matrices. $A_{13}$ is our old friend the Hilbert matrix and $A_{14}$ is Wilson's matrix.

Here is the matrix. Let's use the Live Editor® and the Symbolic Math Toolbox®.

W = sym([5 7 6 5; 7 10 8 7; 6 8 10 9; 5 7 9 10]);

The first noteworthy property is the determinant. The determinant has to be an integer because all the matrix elements are integers.

d = det(W);

The determinant is 1, the smallest possible for a nonsingular matrix with integer entries.

Since the determinant is one, the inverse must also have integer entries.
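This follows from the adjugate formula. Every entry of $\mathrm{adj}(W)$ is a cofactor of $W$, hence an integer, and

$W^{-1} = \frac{\mathrm{adj}(W)}{\det(W)} = \mathrm{adj}(W)$

when $\det(W) = 1$.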

Winv = round(inv(W));

The matrix is symmetric. Let's use `chol` to verify that it is positive definite.

```
try
    R = chol(W);
catch
    disp("Not SPD.")
end
```

Since `chol` was successful we know that `W` is positive definite. Factor out the diagonal to get a nice rational $LDL^T$ factorization.

```
D = diag(diag(R));
L = R'/D;
D = D^2;
LDLT = L*D*L';
```

The characteristic polynomial is a quartic with integer coefficients.

```
p = charpoly(W,'lambda');
```

Since the degree of *p* is less than 5, the exact eigenvalues can be expressed in closed form.

e = eig(W);

We asked for it. Like it or not, that's the exact answer. Since `p` does not factor nicely over the rationals, the closed form is intimidating. It's hard to tell which eigenvalue is the largest, for example. Notice that even though the eigenvalues are real, the formulae involve complex numbers whose imaginary parts necessarily cancel.

For the remainder of this article, let's revert to floating point arithmetic.

W = double(W)

```
W =
     5     7     6     5
     7    10     8     7
     6     8    10     9
     5     7     9    10
```

Now the computed eigenvalues are approximate, but more useful.

e = eig(W)

```
e =
    0.0102
    0.8431
    3.8581
   30.2887
```

Is this matrix poorly conditioned?

condW = cond(W)

condW = 2.9841e+03

So the condition is "only" about 3000. That may not seem to be very badly conditioned. But let's look more closely. Set up a linear system where we know the exact solution.

```
xact = ones(4,1);
b = W*xact
```

```
b =
    23
    32
    33
    31
```

And solve it.

x = W\b

```
x =
    1.0000
    1.0000
    1.0000
    1.0000
```

We expect to get all ones, and we do. But now perturb the right-hand side a little.

```
delta_b = [.01 -.01 .01 -.01]'
b_prime = b + delta_b
```

```
delta_b =
    0.0100
   -0.0100
    0.0100
   -0.0100
b_prime =
   23.0100
   31.9900
   33.0100
   30.9900
```

This perturbation changes the solution drastically.

x_prime = W\b_prime

```
x_prime =
    1.8200
    0.5000
    0.8100
    1.1100
```
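The size of this change is just what the standard perturbation bound allows. If $Wx = b$ and $W(x + \delta x) = b + \delta b$, then

$\frac{\|\delta x\|}{\|x\|} \leq \kappa(W) \, \frac{\|\delta b\|}{\|b\|}$

Here $\|\delta b\|/\|b\|$ is about $3.3 \times 10^{-4}$, so with $\kappa(W) \approx 2984$ the bound permits a relative change in $x$ of nearly 100 percent; we observed roughly half that.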

I want to argue that Wilson's matrix *is* badly conditioned, when compared to matrices with these properties.

- 4-by-4
- Symmetric
- Nonsingular
- Integer entries between 1 and 10.

My experiment involves generating one million random matrices with these properties. Random symmetric matrices are created by

```
U = triu(randi(10,4,4));
A = U + triu(U,1)';
```

One typical run produced these statistics.

- Number of matrices = 1000000
- Singular = 1820
- Positive definite = 11217
- Condition > condW = 2115
- Maximum condition = 4.8087e4

The distribution of the condition numbers looks like this. We can see that Wilson's matrix is unusual. Only about 0.21 percent of comparable matrices have a larger condition number.

By the way, the matrix with the largest condition number in this run was

condMax

```
condMax =
     1     3    10    10
     3     4     8     9
    10     8     3     9
    10     9     9     3
```

Thanks to Tahar Loulou whose comment got this all started, to Steve Lord who contributed another comment, and to Jan Peter Schäfermeyer, Nick Higham and Sven Hammarling for their email pointing to references.

R. W. Zdanowich and T. S. Wilson, "The elements of pendulum dampers," *Proceedings of the Institution of Mechanical Engineers*, 143_028_02, pp. 182-210, 1940, link.

J. Morris, "XVI. An escalator process for the solution of linear simultaneous equations", *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, 37:265, pp. 106-120, 1946, link.

Jacques Delforge, "Application of the concept of the conditioning of a matrix to an identification problem," *Comp. and Maths with Appls.*, Vol. 4, pp. 139-148, 1978, link.

E. Bodewig, Matrix calculus, North Holland, Amsterdam, 2nd edition, 1959.

John Todd, "The condition of a certain matrix," *Proceedings of the Cambridge Philosophical Society*, vol. 46, pp. 116-118, 1950, link.

John Todd, Basic Numerical Algorithms: Vol. 2: Numerical Algorithms, Birkhäuser Verlag, Basel, 1977.

Marvin Marcus, Basic Theorems in Matrix Theory, National Bureau of Standards, Applied Mathematics Series, number 57, page 22, 1960, link.

Get the MATLAB code

Published with MATLAB® R2018a


MIT's Professor Gil Strang gave two talks in one morning recently at the SIAM annual meeting. Both talks derived from his experience teaching a new course at MIT on linear algebra and neural nets. His first talk, "The Structure of a Deep Neural Net", was in a minisymposium titled "Deep Learning and Deep Teaching", which he organized. Another talk in that minisymposium was by Drexel's Professor Pavel Grinfeld on "An Informal Approach to Teaching Calculus." An hour later, Gil gave his second talk, "Teaching About Learning." It was an invited talk at the SIAM Conference on Applied Mathematics Education.

Inspired by Pavel's talk about teaching calculus, Gil began his second talk with some spontaneous remarks. "Can we teach a deep learner calculus?" he wondered. The system might be trained with samples of functions and their derivatives and then be asked to find derivatives of other functions that were not in the training set.

Immediately after Gil's second talk, I asked the other MathWorkers attending the meeting if we could take up Gil's challenge. Within a few hours, Mary Fenelon, Christine Tobler and Razvan Carbunescu had the essential portions of the following demonstration of the Neural Network Toolbox™ working at the MathWorks booth.

Our first task is less ambitious than differentiation. It is simply to recognize the shapes of functions. By a function we mean the MATLAB vector obtained by sampling a familiar elementary function at a finite set of ordered random points drawn uniformly from the interval $[-1, 1]$. Derivatives, which we have not done yet, would be divided differences. We use six functions, $x$, $x^2$, $x^3$, $x^4$, $\sin{\pi x}$, and $\cos{\pi x}$. We attach a random sign and add white noise to the samples.

```
F = {@(x) x, @(x) x.^2, @(x) x.^3, @(x) x.^4, ...
     @(x) sin(pi*x), @(x) cos(pi*x)};
labels = ["x", "x^2", "x^3", "x^4", "sin(pi*x)", "cos(pi*x)"];
```
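Although we have not done derivatives yet, the divided differences mentioned above are easy to form from the sorted samples. Here is a small illustrative sketch in Python/NumPy (the demonstration itself is in MATLAB), using $f(x) = x^2$, for which the first-order divided difference between neighboring points $a$ and $b$ is exactly $a + b$:

```python
import numpy as np

# Sample f(x) = x^2 at sorted random points in [-1, 1],
# mirroring what the MATLAB code above does for each function.
rng = np.random.default_rng(2)
x = np.sort(2*rng.random(100) - 1)
y = x**2

# First-order divided differences approximate f'(x) = 2x
# at the midpoints of consecutive sample points.
dd = np.diff(y) / np.diff(x)
mid = (x[:-1] + x[1:]) / 2

# For a quadratic, (b^2 - a^2)/(b - a) = a + b, i.e. exactly 2*midpoint,
# so the residual below is pure floating-point roundoff.
print(np.max(np.abs(dd - 2*mid)))
```

For the quadratic this agreement is exact in theory; for the other five functions the divided differences would only approximate the derivative, with error depending on the local spacing of the random points.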

Here are a few definitions and parameters.

Set random number generator state.

```
rng(2)
```

Generate uniform random variable on [-1, 1].

```
randu = @(m,n) (2*rand(m,n)-1);
```

Generate random +1 or -1.

```
randsign = @() sign(randu(1,1));
```

Number of functions.

```
m = length(F);
```

Number of repetitions.

```
n = 1000;
```

Number of samples in the interval.

```
nx = 100;
```

Noise level.

```
noise = .0001;
```

```
C = cell(m,n);
for j = 1:n
    x = sort(randu(nx,1));
    for i = 1:m
        C{i,j} = randsign()*F{i}(x) + noise*randn(nx,1);
    end
end
```

Let's plot instance one of each function. (With this initialization of `rng`, the $x^2$ and $\sin(\pi x)$ curves have negative signs.)

```
set(gcf,'position',[300 300 300 300])
for i = 1:m
    plot(x,C{i,1},'.')
    axis([-1 1 -1 1])
    title(labels(i))
    snapnow
end
close
```

Our deep learning network is one that has proved to be successful in signal processing, text, and other applications with sequential data. There are six layers. The nonlinear activation layer, `relu`, for REctified Linear Unit, is essential. `ReLU(x)` is simply `max(0,x)`. LSTM stands for Long Short-Term Memory. Softmax is a generalization of the logistic function used to compute probabilities.

```
inputSize = nx;
numClasses = m;
numHiddenUnits = 100;
layers = [ ...
    sequenceInputLayer(inputSize)
    reluLayer
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
```
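The two nonlinear ingredients are easy to check by hand. Here is a minimal Python/NumPy sketch of ReLU and softmax, just for illustration (the MATLAB layers compute the same elementwise maximum and normalized exponential):

```python
import numpy as np

def relu(x):
    # ReLU zeros out the negative entries and passes positives through
    return np.maximum(0, x)

def softmax(z):
    # Shift by the max for numerical stability, then normalize,
    # so the result is a vector of probabilities summing to one
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([-2.0, 0.5, 3.0])
print(relu(scores))   # the negative entry becomes 0
p = softmax(scores)
print(p)              # nonnegative values that sum to 1
print(p.sum())
```

In the network above, softmax turns the six outputs of the fully connected layer into the class probabilities that the classification layer scores.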

The first option, `'adam'`, is the stochastic optimization algorithm, adaptive moment estimation. An epoch is one forward pass and one backward pass over all of the training vectors, updating the weights. In our experience with this network, six passes is enough.

```
maxEpochs = 6;
miniBatchSize = 27;
options = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu', ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'GradientThreshold',1, ...
    'Verbose',0, ...
    'Plots','training-progress');
```
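The update rule behind adaptive moment estimation keeps running exponential averages of the gradient and its square. Here is a stripped-down scalar sketch in Python, with the commonly published default hyperparameters; it is an illustration of the idea, not the toolbox's implementation:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    # Running exponential averages of the gradient and its square
    m = beta1*m + (1 - beta1)*grad
    v = beta2*v + (1 - beta2)*grad**2
    # Bias corrections compensate for the zero initialization of m and v
    mhat = m / (1 - beta1**t)
    vhat = v / (1 - beta2**t)
    # The effective step size adapts to the recent gradient magnitude
    theta = theta - lr*mhat/(math.sqrt(vhat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2, whose gradient is 2*theta
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2*theta, m, v, t)
print(abs(theta))
```

Because the step is scaled by the root of the averaged squared gradient, early steps move at roughly the learning rate regardless of the gradient's size, which is part of why adam needs so little tuning.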

With our setting of the `Plots` option, `trainNetwork` opens a custom figure window that dynamically shows the progress of the optimization.

It takes 15 or 20 seconds to train this network on my laptop. Big-time neural nets with more layers and more epochs can make use of GPUs and pools of parallel workers.

```
C = reshape(C',1,[]);
Y = repelem(categorical(labels'), n);
net = trainNetwork(C,Y,layers,options);
```

Generate more functions to form a test set.

```
nt = 100;
Ctest = cell(m,nt);
for j = 1:nt
    x = sort(randu(nx,1));
    for i = 1:m
        Ctest{i,j} = randsign()*F{i}(x) + noise*randn(nx,1);
    end
end
```

Classify the functions in the test set.

```
miniBatchSize = 27;
Ctest = reshape(Ctest',1,[]);
Ytest = repelem(categorical(labels'), nt);
Ypred = classify(net,Ctest,'MiniBatchSize',miniBatchSize);
```

Here are the results. We see scores above 95 percent, except for learning to distinguish between plots of $x^2$ and $x^4$. That's understandable.

```
T = table(Ypred, Ytest);
heatmap(T, 'Ypred', 'Ytest');
```

Here are typical failed tests. It's not hard to see why the network is having trouble with these.

```
set(gcf,'position',[300 300 300 300])
for j = 1:m
    for i = [1:j-1 j+1:m]
        mismatch = find(Ytest == labels(i) & Ypred == labels(j));
        if ~isempty(mismatch)
            % Plot one of each type of mismatch
            for k = 1
                plot(linspace(-1, 1), Ctest{mismatch(k)},'.')
                title(labels(i)+" or "+labels(j))
                snapnow
            end
        end
    end
end
close
```

Thanks to my colleagues attending the SIAM meeting -- Mary, Christine and Razvan. And thanks to Gil for his spontaneous idea.


David Wilson, from the Auckland University of Technology in New Zealand, has alerted me to this remarkable coincidence. Here is the "Burning Ship" from my post about fractals four weeks ago.

And here is a real photo, entitled *The Long, Long Night*, taken in August 1915 by Frank Hurley, of Antarctic explorer Ernest Shackleton's ship *Endurance*, frozen in the ice in the Weddell Sea.

Photo credit: National Library of Australia, http://nla.gov.au/nla.obj-158931586/view.
