
The adjacency matrix of a hypercube demonstrates the new MATLAB graph object.

My previous blog post was about the Patience Chinese Rings Puzzle. The puzzle's state lives in a six-dimensional space. There are 64 vertices, numbered from 0 to 63. Each vertex is connected to six other vertices, the ones whose numbers, expressed in binary, differ from its own by one bit. That connectivity arrangement -- those vertices and the connections between them -- forms a *graph* known as a *hypercube*.
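The one-bit-difference rule is easy to express with an exclusive or. Here is a quick sketch in Python (the post's own code is MATLAB) that lists the neighbors of a vertex by flipping each bit in turn; `neighbors` is a name invented for this illustration.

```python
# Neighbors of a hypercube vertex: flip each of the d bits in turn.

def neighbors(v, d=6):
    """Vertices adjacent to v in a d-dimensional hypercube."""
    return [v ^ (1 << i) for i in range(d)]

print(neighbors(0))    # the six powers of two: [1, 2, 4, 8, 16, 32]
print(neighbors(5))    # 000101 with each of its six bits flipped
```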

Recent versions of MATLAB include two new objects, `graph` and `digraph`. Here is an abbreviated `help` entry for `graph`.

```
>> help graph
 graph  Undirected Graph

    G = graph(A) uses the square symmetric matrix A as an adjacency
    matrix and constructs a weighted graph with edges corresponding
    to the nonzero entries of A.  If A is logical then no weights
    are added.

    G = graph(A,NAMES) additionally uses the cell array of strings
    NAMES as the names of the nodes in G.  NAMES must have as many
    elements as size(A,1).
```

And here is a function that generates the adjacency matrix for a hypercube.

```
type hypercube
```

```
function A = hypercube(d)
% hypercube(d) is a logical adjacency matrix for a d-dimensional hypercube.
   pow2 = 2.^(0:d-1);
   f2b = @(j) floor(rem(j./pow2,2));   % flint to binary
   b2f = @(b) b*pow2';                 % binary to flint
   n = 2^d;
   A = zeros(n,n,'logical');
   for j = 0:n-1
      % Convert column index to binary.
      b = f2b(j);
      % Flip bits to get corresponding row indices.
      for i = 1:d
         b(i) = ~b(i);
         k = b2f(b);
         b(i) = ~b(i);
         A(k+1,j+1) = 1;
      end
   end
end
```

I first encountered hypercubes thirty years ago. For a couple of years in the late 1980s I was involved with one of the world's first commercial parallel computers, the IPSC, the *Intel Personal Super Computer*. Here is a link to my blog post in 2013.

The IPSC was constructed from single board computers, similar to the mother boards in the PCs of the day. We had three models -- *d5*, *d6* and *d7* -- with 32, 64 or 128 processors. Since it was impractical to connect every processor directly to every other processor, the connection network formed a hypercube. For a machine with $n = 2^d$ processors, that's $n \log_2{n}$ instead of $n^2$ connections. For the d6 that's 384 connections instead of 4096.
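The count is simple arithmetic; a few lines of Python (used here just for illustration) tabulate hypercube links versus a full interconnect for the three models.

```python
from math import log2

# Hypercube connections, n*log2(n), versus a full interconnect, n^2,
# for the three IPSC models.
for d in [5, 6, 7]:
    n = 2 ** d
    print('d%d: %4d connections vs %5d' % (d, int(n * log2(n)), n * n))
```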

The d6 model has the same graph as the patience puzzle that I wrote about in my previous post. (Actually, the IPSC's graph is a directed graph, a *digraph*, because the connections are two-way. But undirected graphs have simpler plots.)

A 3-dimensional hypercube is not a conventional cube. The 3-d hypercube is only the vertices and edges of the cube. The cube itself includes its surfaces and interior. If you had a wire-frame cube made from rubber bands and squished it around without destroying the connectivity, you wouldn't have a cube any more, but you would still have a model of a hypercube.

Here is the adjacency matrix of a 3-d hypercube. (It's so small that it's not necessary to make use of the sparsity.)

```
A = hypercube(3)
```

```
A =

  8×8 logical array

   0   1   1   0   1   0   0   0
   1   0   0   1   0   1   0   0
   1   0   0   1   0   0   1   0
   0   1   1   0   0   0   0   1
   1   0   0   0   0   1   1   0
   0   1   0   0   1   0   0   1
   0   0   1   0   1   0   0   1
   0   0   0   1   0   1   1   0
```

One view of this matrix is its `spy` plot.

```
spy(A)
```

To get other views, let's use the new MATLAB graph object. We'll label the nodes with '0' through '7'.

```
nodes = num2cellstr(0:7)
G = graph(A,nodes)
```

```
nodes =

  1×8 cell array

    '0'    '1'    '2'    '3'    '4'    '5'    '6'    '7'

G =

  graph with properties:

    Edges: [12×1 table]
    Nodes: [8×1 table]
```

By itself, a graph doesn't have any geometric structure. In order to plot it you specify a *layout*, or coordinates, for the nodes. Coming up with good layouts is an art, as well as a science. Here are the currently available layouts.

```
   'auto'       (Default) Automatic choice of layout method based on
                the structure of the graph.
   'circle'     Circular layout.
   'force'      Force-directed layout.  Uses attractive and repulsive
                forces on nodes.
   'layered'    Layered layout.  Places nodes in a set of layers.
   'subspace'   Subspace embedding layout.  Uses projection onto an
                embedded subspace.
   'force3'     3-D force-directed layout.
   'subspace3'  3-D subspace embedding layout.
```

The default layout of the 3-d hypercube graph is a perspective view of an ordinary cube.

```
plot(G)
axis square
```

Let's move up to four dimensions.

```
A = hypercube(4);
spy(A)
```

I think the default layout of the 4-d hypercube is attractive.

```
nodes = num2cellstr(0:15)';
G = graph(A,nodes);
plot(G)
axis square
```

But `'circle'` is another attractive layout. Notice that '3' is not connected to '4' because 0011 and 0100 differ by more than a single bit. The blue dot in the center is not a node -- it is just the point where many edges cross.

```
plot(G,'layout','circle')
axis square
```

Here is yet another view.

```
plot(G,'layout','layered')
axis square
```

I'm still not satisfied. I want one cube inside of another. I can supply my own coordinates for the vertices.

```
type cubelet
```

```
function x = cubelet
% x = cubelet
% 8-by-3 array of vertices of a unit cube.
   x = [-1 -1 -1
         1 -1 -1
        -1  1 -1
         1  1 -1
        -1 -1  1
         1 -1  1
        -1  1  1
         1  1  1];
end
```

```
x = cubelet;
x = [3*x; x];
plot(G,'xdata',x(:,1),'ydata',x(:,2),'zdata',x(:,3))
axis equal off
view(-20,11)
```

This served as a logo for the IPSC.

The Laplacian of a graph is a matrix that has the degrees of the nodes on the diagonal and -1's off the diagonal marking the edges. For the 4-d hypercube the degrees are all 4. So here is the Laplacian.

```
L = full(laplacian(G));
frmt = [' ' repmat('%3d',1,16) '\n'];
fprintf('L = \n')
fprintf(frmt,L)
```

```
L =
   4  -1  -1   0  -1   0   0   0  -1   0   0   0   0   0   0   0
  -1   4   0  -1   0  -1   0   0   0  -1   0   0   0   0   0   0
  -1   0   4  -1   0   0  -1   0   0   0  -1   0   0   0   0   0
   0  -1  -1   4   0   0   0  -1   0   0   0  -1   0   0   0   0
  -1   0   0   0   4  -1  -1   0   0   0   0   0  -1   0   0   0
   0  -1   0   0  -1   4   0  -1   0   0   0   0   0  -1   0   0
   0   0  -1   0  -1   0   4  -1   0   0   0   0   0   0  -1   0
   0   0   0  -1   0  -1  -1   4   0   0   0   0   0   0   0  -1
  -1   0   0   0   0   0   0   0   4  -1  -1   0  -1   0   0   0
   0  -1   0   0   0   0   0   0  -1   4   0  -1   0  -1   0   0
   0   0  -1   0   0   0   0   0  -1   0   4  -1   0   0  -1   0
   0   0   0  -1   0   0   0   0   0  -1  -1   4   0   0   0  -1
   0   0   0   0  -1   0   0   0  -1   0   0   0   4  -1  -1   0
   0   0   0   0   0  -1   0   0   0  -1   0   0  -1   4   0  -1
   0   0   0   0   0   0  -1   0   0   0  -1   0  -1   0   4  -1
   0   0   0   0   0   0   0  -1   0   0   0  -1   0  -1  -1   4
```

The Laplacian could also be obtained from the original adjacency matrix.

```
L = diag(sum(A)) - A;
```
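The same construction is easy to sketch in Python, with the adjacency test and the row-sum diagonal written out directly; `hypercube_adjacency` and `laplacian` are names made up for this illustration.

```python
# L = diag(row sums of A) - A, for the 4-d hypercube.

def hypercube_adjacency(d):
    n = 2 ** d
    return [[1 if bin(i ^ j).count("1") == 1 else 0 for j in range(n)]
            for i in range(n)]

def laplacian(A):
    n = len(A)
    return [[(sum(A[i]) if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]

L = laplacian(hypercube_adjacency(4))
print(L[0])   # degree 4 on the diagonal, -1 marking the four edges
```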

The coordinates of possible layouts for the plot of the graph can be obtained by picking three of the eigenvectors of the Laplacian. Here are all of the eigenvalues and three of the eigenvectors. (The eigenvalues are all integers.)

```
[V,E] = eig(L);
evals = round(diag(E)');
fprintf('evals = \n')
fprintf(frmt,evals)
evecs3 = V(:,1:3)
```

```
evals =
   0   2   2   2   2   4   4   4   4   4   4   6   6   6   6   8

evecs3 =
   -0.2500    0.0216    0.0025
   -0.2500    0.1620    0.4067
   -0.2500    0.0245   -0.1449
   -0.2500    0.1649    0.2593
   -0.2500    0.2546   -0.2521
   -0.2500    0.3950    0.1521
   -0.2500    0.2575   -0.3996
   -0.2500    0.3979    0.0046
   -0.2500   -0.3979   -0.0046
   -0.2500   -0.2575    0.3996
   -0.2500   -0.3950   -0.1521
   -0.2500   -0.2546    0.2521
   -0.2500   -0.1649   -0.2593
   -0.2500   -0.0245    0.1449
   -0.2500   -0.1620   -0.4067
   -0.2500   -0.0216   -0.0025
```
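Those integer eigenvalues are no accident: the Laplacian of a d-dimensional hypercube is known to have eigenvalue $2k$ with multiplicity $\binom{d}{k}$, for $k = 0, \ldots, d$. A few lines of Python reproduce the list for $d = 4$.

```python
from math import comb

# Laplacian eigenvalues of a d-dimensional hypercube: 2k with
# multiplicity C(d,k), a standard spectral result.
d = 4
evals = sorted(2 * k for k in range(d + 1) for _ in range(comb(d, k)))
print(evals)   # [0, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 6, 6, 6, 6, 8]
```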

And here is the resulting plot.

```
G = graph(A,nodes);
plot(G,'xdata',V(:,1),'ydata',V(:,2),'zdata',V(:,3))
axis square off
set(gca,'clipping','off')
zoom(2)
view(-75,45)
```

Both the state space of the puzzle that got me started on this blog post and the connection network of the d6 model of the parallel computer that consumed me thirty years ago are six-dimensional hypercubes.

```
A = hypercube(6);
spy(A)
```

A plot of a cube inside of a cube inside of a cube inside of a cube is a bit messy. The circle layout is an interesting design, but it is hard to see which nodes are connected to which.

```
nodes = num2cellstr(0:63)';
G = graph(A,nodes);
plot(G,'layout','circle')
axis square
```

Here is a three-dimensional layout.

```
plot(G,'layout','subspace3')
axis off square
set(gca,'clipping','off')
zoom(2)
view(-108,-21)
```

The Bucky Ball provides an elegant example of a graph. Here's the `help` entry.

```
help bucky
```

```
 BUCKY  Connectivity graph of the Buckminster Fuller geodesic dome.
    B = BUCKY is the 60-by-60 sparse adjacency matrix of the
    connectivity graph of the geodesic dome, the soccer ball,
    and the carbon-60 molecule.

    [B,V] = BUCKY also returns xyz coordinates of the vertices.
```

When we put `bucky` into MATLAB years ago we went to a whole lot of trouble to compute those vertex coordinates. (To see how much trouble, `type bucky`.) But now, with the three dimensional `force3` layout, the `graph/plot` function does that work. Putting a little spin on it makes it interesting to watch.

```
type bucky_gif
```

```
B = graph(bucky);
plot(B,'layout','force3')
axis square off vis3d
set(gca,'clipping','off')
zoom(2)
[a,e] = view;
gif_frame('bucky_spin.gif')
for d = 10:10:360
   view(a+d,e+d)
   gif_frame
end
gif_frame('reset')
```

It's more fun to use the `cameratoolbar` and put the spin on the ball yourself.

Get the MATLAB code

Published with MATLAB® R2017a


MIT's Professor Daniel Frey recently introduced me to an ancient mechanical puzzle known as "Chinese Rings", "Patience", or "Baguenaudier." I have modified Daniel's simulator to produce a new app. The state space of the puzzle forms a hypercube.

There are dozens of YouTube videos showing solutions and speed contests for the tavern puzzle called "Patience". The best video is by Ty Yann, "Baguenaudier Chinese Rings." Here is the link to his puzzle tutorial.

Wikipedia tells us that *Baguenaudier* is French for "time-waster".

Prof. Frey has written a nice MATLAB simulator for his course this spring at MIT, 2.086, "Numerical Computation for Mechanical Engineers." My new app is based on his program.

The puzzle has six steel hooks, often made from nails, attached to a base. A steel ring is attached to each hook. A long movable piece known as the shuttle is initially threaded around the nails and through the rings. The objective is to free the shuttle. It ain't easy.

Image credit: Daniel Frey, MIT

My new interactive app, `patience`, is included with version 2.2 of Cleve's Laboratory in MATLAB Central. I hope you will try to solve the puzzle yourself before you resort to the "no patience" automated solve button. The following animated gif shows several steps at the start of the solution, then skips many intermediate steps before showing the last several.

The position of the six hooks, up or down, provides a six-bit binary number that characterizes the state of the mechanism. Initially, the rings are all engaged, so the state in binary is `000000`, which, of course, is a decimal 0. The objective is to free all the rings, giving a binary `111111`, or decimal 63. The app shows the decimal value at each step in the figure title.

The hooks are numbered from 1 through 6 starting on the left. So, we will use "little endian" binary, with the least significant bit at the left.

This one line in `patience.m`, which refers to the state vector `S` and a hook number `n`, controls the possible moves.

```
dbtype 52:53 patience
```

```
52        % Is the move allowed ?
53        ok = (n==1) || (n==2)&&~S(1) || (n>2)&&(~S(n-1)&&all(S(1:n-2)));
```

The rule says

- Hook 1 can always be toggled.
- Hook 2 can be toggled if hook 1 is up.
- Any hook numbered 3 through 6 can be toggled if the hook immediately to its left is up and all the others before it are down.

That's all you need to know to operate the puzzle.

It turns out that at any step at most two, and sometimes only one, of the six hooks is a legal move.
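That claim can be checked by brute force. Here is the move rule transcribed into Python (an illustration, not the app's code; hook n is `S[n-1]` since Python indexes from zero), applied to all 64 states.

```python
# The move rule from patience.m, transcribed to Python.

def legal_moves(S):
    moves = []
    for n in range(1, 7):
        ok = (n == 1) or (n == 2 and not S[0]) or \
             (n > 2 and not S[n - 2] and all(S[:n - 2]))
        if ok:
            moves.append(n)
    return moves

# How many legal moves does each of the 64 states allow?
counts = {len(legal_moves([(s >> i) & 1 for i in range(6)])) for s in range(64)}
print(counts)   # {1, 2}: always one or two legal moves
```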

For example, if the state is

S = [1 1 0 0 1 0]

there are two possible moves. You can click on hook 1 to give

S = [0 1 0 0 1 0]

Or, you can click on hook 4 to give

S = [1 1 0 1 1 0]

The optimum path from a given starting state, such as 0, to the goal, 63, can be found by a brute force recursive search that tries all possible moves at each step. The code is at the end of this post. Here is the shortest path from 0 to 63.

```
path = shortest_path(0)
```

```
path =
  Columns 1 through 13
     0    2    3   11   10    8    9   13   12   14   15   47   46
  Columns 14 through 26
    44   45   41   40   42   43   35   34   32   33   37   36   38
  Columns 27 through 39
    39   55   54   52   53   49   48   50   51   59   58   56   57
  Columns 40 through 43
    61   60   62   63
```

Notice that a shortest path never repeats any state because if we find ourselves reaching a state that we've already visited, we could shorten that path by skipping the extra steps.
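As a cross-check on the recursive search, a breadth-first search over the 64 states finds the same 42-move path. This Python sketch reuses the move rule from `patience.m`; `next_states` and `bfs_path` are names invented here.

```python
from collections import deque

# Breadth-first search from a starting state to the goal, 63.

def next_states(s):
    S = [(s >> i) & 1 for i in range(6)]
    out = []
    for n in range(1, 7):
        ok = (n == 1) or (n == 2 and not S[0]) or \
             (n > 2 and not S[n - 2] and all(S[:n - 2]))
        if ok:
            out.append(s ^ (1 << (n - 1)))
    return out

def bfs_path(start, goal=63):
    prev = {start: None}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = prev[s]
            return path[::-1]
        for t in next_states(s):
            if t not in prev:
                prev[t] = s
                queue.append(t)

path = bfs_path(0)
print(len(path) - 1)   # 42 moves
```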

It turns out that the only starting state that visits all vertices is `31 = [1 1 1 1 1 0]`, which corresponds to starting with the sixth hook up and all the others down.

Each step changes the state by a power of two. You can recover the sequence of mouse clicks that would generate this path by taking the log2 of the changes.

```
hooks = log2(abs(diff(path))) + 1
```

```
hooks =
  Columns 1 through 13
     2    1    4    1    2    1    3    1    2    1    6    1    2
  Columns 14 through 26
     1    3    1    2    1    4    1    2    1    3    1    2    1
  Columns 27 through 39
     5    1    2    1    3    1    2    1    4    1    2    1    3
  Columns 40 through 42
     1    2    1
```

Let's plot the decimal value of the state for the shortest path from 0 to 63. Here is the progress of this value over the 42 steps. The even numbered steps involve toggling the first hook, which changes the value of the state by plus or minus one. The sixth hook is moved only once, at step 11, changing the value by +32. The fifth hook contributes its +16 at step 27.

plot_path

Here are the lengths of the shortest paths from each starting position.

```
L = zeros(64,1);
for n = 0:63
   L(n+1) = length(shortest_path(n));
end
bar(L)
```

The puzzle's state lives in a six-dimensional space. There are 64 vertices, numbered from 0 to 63. Each vertex is connected to the six vertices whose numbers, expressed in binary, differ from its own by one bit. That connectivity arrangement -- those vertices and the connections between them -- forms a *graph* known as a *hypercube*.

Recent versions of MATLAB include methods for a new object, the *graph*. I plan to discuss hypercubes and graphs in my next blog post.

Here is the code for the recursive search that finds the shortest path. The code for the complete `patience` app is now included with Cleve's Laboratory in MATLAB Central.

```
type shortest_path
```

```
function path = shortest_path(state,m,path)
% shortest_path, or shortest_path(state), where state is a decimal state
% value between 0 and 63, is the shortest path of allowable puzzle moves
% from that state to the objective, 63.
%
% shortest_path(S,m,path) is the recursive call.

   if nargin == 0
      state = 0;         % Default
   end
   if nargin <= 1        % Initial call
      m = 0;             % Previous n
      path = [];
   end
   if state == 63        % Terminate recursion
      path = state;
      return
   end
   if state == 31
      m = 0;
   end

   % Search for shortest path
   pow2 = 2.^(0:5);
   lmin = inf;
   for n = [1:m-1 m+1:6]
      Sn = mod(floor(state./pow2),2);   % Convert state to binary
      Sn(n) = ~Sn(n);                   % Flip one bit
      ok = (n==1) || (n==2)&&~Sn(1) || ...
           (n>2)&&(~Sn(n-1)&&all(Sn(1:n-2)));   % Allowable move
      if ok
         staten = Sn*pow2';             % Recursive call
         pn = shortest_path(staten,n,[path staten]);
         if length(pn) < lmin
            lmin = length(pn);
            pbest = pn;
         end
      end
   end
   path = [state pbest];
end % shortest_path
```

Thanks to Daniel Frey.




"ULP" stands for "unit in the last place." An *ulps plot* samples a fundamental math function such as $\sin{x}$, or a more esoteric function like a Bessel function. The samples are compared with more accurate values obtained from a higher precision computation. A plot of the accuracy, measured in ulps, reveals valuable information about the underlying algorithms.

`libm` is the library of elementary math functions that supports the C compiler. `fdlibm` is "freely distributable" source for `libm` developed and put into the public domain over 25 years ago by K. C. Ng and perhaps a few others at Sun Microsystems. I wrote about `fdlibm` in our newsletter in 2002, The Tetragamma Function and Numerical Craftsmanship.

Mathematically `fdlibm` shows immaculate craftsmanship. We still use it today for our elementary transcendental functions. And I suspect all other mathematical software projects do as well. If they don't, they should.

`ulp(x)` is the distance from `x` to the next larger floating point number. It's the same as `eps(x)`.

To assess the accuracy of a computed value

y = f(x)

compare it with the more accurate value obtained from the Symbolic Math Toolbox

```
Y = f(sym(x,'f'))
```

The `'f'` flag says to convert `x` to a `sym` exactly, without trying to guess that it is an inverse power of 10 or the `sqrt` of something. The relative error in `y`, measured in units in the last place, is then

u = (y - Y)/eps(abs(Y))

Since this is *relative* error, it is a stringent measure near the zeros of `f(x)`.

If `y` is the floating point number obtained by correctly rounding `Y` to double precision, then

-0.5 <= u <= 0.5

This is the best that can be hoped for. Compute the exact mathematical value of `f(x)` and make a single rounding error to obtain the final result. Half-ulp accuracy is difficult to obtain algorithmically, and too expensive in execution time. All of the functions in MATLAB that are derived from `fdlibm` have better than one-ulp accuracy.

-1.0 < u < 1.0
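Here is what the measurement looks like in Python, with `sqrt` standing in for `f` and a 50-digit `decimal` computation supplying the reference values (`math.ulp` requires Python 3.9 or later; `ulps_error` is a name invented for this sketch).

```python
import math
from decimal import Decimal, getcontext

# Compare double precision math.sqrt against a 50-digit reference and
# express the difference in units of the last place.
getcontext().prec = 50

def ulps_error(x):
    y = math.sqrt(x)                 # value under test
    Y = Decimal(x).sqrt()            # higher-precision reference
    return float((Decimal(y) - Y) / Decimal(math.ulp(float(abs(Y)))))

for x in [2.0, 3.0, 10.0]:
    print(x, ulps_error(x))
```

Since IEEE 754 square root is correctly rounded, every sample lands within half an ulp; the library functions discussed below are not held to quite that standard.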

Each of the following plots involves 100,000 random arguments `x`, uniformly distributed in an appropriate interval.

We see about 0.8 ulp accuracy from this sample. That's typical.

Argument reduction is the first step in computing $\sin{x}$. An integer multiple $n$ of $\pi/2$ is subtracted from the argument to bring it into the interval

$$ -\frac{\pi}{4} \le x - n \frac{\pi}{2} \le \frac{\pi}{4} $$

Then, depending upon whether $n$ is odd or even, a polynomial approximation of degree 13 to either $\sin$ or $\cos$ gives the nearly correctly rounded result for the reduced argument, and hence for the original argument. The ulps plot shows a slight degradation in accuracy at odd multiples of $\pi/4$, which are the extreme points for the polynomial approximations.
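A double-precision Python sketch shows the mechanics of that reduction. The real `fdlibm` carries $\pi/2$ to extra precision during the subtraction, so this version is only illustrative, and `sin_via_reduction` is a made-up name.

```python
import math

# Remove n*pi/2, then use sin or cos of the reduced argument,
# with the sign determined by n mod 4.
def sin_via_reduction(x):
    halfpi = math.pi / 2
    n = round(x / halfpi)
    r = x - n * halfpi          # roughly in [-pi/4, pi/4]
    return (math.sin(r), math.cos(r), -math.sin(r), -math.cos(r))[n % 4]

for x in [0.5, 2.0, 4.0, 6.0]:
    print(x, sin_via_reduction(x), math.sin(x))
```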

It is important to note that the accuracy is better than one ulp even near the end-points of the sample interval, $0$ and $2\pi$. This is where $\sin{x}$ approaches $0$ and the approximation must follow carefully so that the relative error remains bounded.

Again, roughly 0.8 ulp accuracy.

Similar argument reduction results in similar behavior near odd multiples of $\pi/4$. In between these points, at $\pi/2$ and $3\pi/2$, $\tan{x}$ has a pole and the approximation must follow suit. The algorithm uses a reciprocal and the identity

$$ \tan x = -1/\tan{(x+\frac{\pi}{2})} $$

This comes close to dividing by zero as you approach a pole, but the resulting approximation remains better than one ulp.

Good to within slightly more than 0.8 ulp. The underlying approximation is a piecewise polynomial with breakpoints at a few multiples of 1/16 that are evident in the plot and marked on the axis.

About the same accuracy as the previous plots.

The argument reduction involves the key value

$$ r = \ln{2} \approx 0.6931 $$

and the identity

$$ \exp{x} = 2^n \exp{(x-n r)} $$

The resulting ulps plot shows the extremes of the error at odd multiples of $r/2$.
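The same reduction is easy to sketch in Python: scale by an exact power of two with `ldexp` and exponentiate only the reduced argument. As with the sine sketch, this double-precision version of `exp_via_reduction` (a made-up name) is only illustrative.

```python
import math

# exp(x) = 2^n * exp(x - n*r) with r = log 2.
def exp_via_reduction(x):
    r = math.log(2.0)
    n = round(x / r)
    return math.ldexp(math.exp(x - n * r), n)   # exact scaling by 2^n

for x in [0.1, 1.0, 10.0, -3.0]:
    print(x, exp_via_reduction(x), math.exp(x))
```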

Now for two functions that are not in `fdlibm`. If you follow this blog, you might have noticed that I am a big fan of the Lambert W function. I blogged about it a couple of years ago, The Lambert W Function. The Wikipedia article is excellent, Lambert W Function. And, you can just Google "lambert w function" for many more interesting links.

The Lambert W function is not available in MATLAB itself. The Symbolic Math Toolbox has an @double method that accesses the symbolic code for type double arguments. Or, my blog post mentioned above has some simple (but elegant, if I do say so myself) code.

Neither code is one-ulp accurate. The primary branch of the function has a zero at the origin. As we get near that zero, the relative error measured in *ulps* is unbounded. The absolute accuracy is OK, but the relative accuracy is not. In fact, you might see a billion ulps error. That's a *gigaulp*, or *gulp* for short.
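For the record, here is a bare-bones Newton iteration for the primary branch in Python, solving $w e^w = x$. It is in the spirit of the simple codes mentioned above, not a transcription of either, and it shares their limitation: good absolute accuracy, but nowhere near one ulp of relative accuracy next to the zero at the origin.

```python
import math

# Newton iteration for the primary branch of Lambert W, x > -1/e.
def lambert_w(x):
    w = math.log1p(x)                # reasonable starting guess
    for _ in range(50):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1))
        w -= step
        if abs(step) <= 1e-16 * (1 + abs(w)):
            break
    return w

print(lambert_w(1.0))    # the omega constant, 0.567143...
```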

As `lambertw(x)` crosses a power of two, the unit in the last place, `eps(lambertw(x))`, jumps by a factor of two. Three of these points are marked by ticks on the x-axis in the ulps plot.

```
for a = [1/8 1/4 1/2]
   z = fzero(@(x) lambertw(x)-a,.5)
end
```

```
z =
    0.1416
z =
    0.3210
z =
    0.8244
```

Our codes for Bessel functions have fairly good, although not one-ulp, absolute accuracy. But they do not have high relative accuracy near the zeros. Here are the first five zeros, and an ulps plot covering the first two, for $J_0(x)$, the zero-th order Bessel function of the first kind.

```
for a = (1:5)*pi
   z = fzero(@(x) besselj(0,x), a)
end
```

```
z =
    2.4048
z =
    5.5201
z =
    8.6537
z =
   11.7915
z =
   14.9309
```

Here is the inverse of the error function. It looks very interesting, but I haven't investigated it yet.

I have recently updated Cleve's Laboratory in the MATLAB Central file exchange. The latest version includes `ulpsapp.m`, which generates the ulps plots in this blog.




A headline in the *New York Times* at the end of 2016 said "Growth of U.S. Population Is at Slowest Pace Since 1937". This prompted me to revisit an old chestnut about fitting and extrapolating census data. In the process I have added a couple of nonlinear fits, namely the logistic curve and the double exponential Gompertz model.

This experiment is older than MATLAB. It started as an exercise in *Computer Methods for Mathematical Computations*, by Forsythe, Malcolm and Moler, published by Prentice-Hall 40 years ago. We were using Fortran back then. The data set has been updated every ten years since. Today, MATLAB graphics makes it easier to vary the parameters and see the results, but the underlying mathematical principles are unchanged:

- Using polynomials of even modest degree to predict the future by extrapolating data is a risky business.

Recall that the famous computational scientist Yogi Berra said

- "It's tough to make predictions, especially about the future."

The data are from the decennial census of the United States for the years 1900 to 2010. The population is in millions.

```
    1900     75.995
    1910     91.972
    1920    105.711
    1930    123.203
    1940    131.669
    1950    150.697
    1960    179.323
    1970    203.212
    1980    226.505
    1990    249.633
    2000    281.422
    2010    308.746
```

The task is to extrapolate beyond 2010. Let's see how an extrapolation of just six years to 2016 matches the Census Bureau announcement. Before you read any further, pause and make your own guess.

Here is the opening screen of the January 2017 edition of my `censusapp`, which is included in Cleve's Laboratory. The plus and minus buttons change the extrapolation year in the title. If you go beyond 2030, the plot zooms out.

The pull-down menu offers these models. Forty years ago we had only polynomials.

```
models = {'census data','polynomial','pchip','spline', ...
   'exponential','logistic','gompertz'}'
```

```
models =
  7×1 cell array
    'census data'
    'polynomial'
    'pchip'
    'spline'
    'exponential'
    'logistic'
    'gompertz'
```

The Census Bureau news release that prompted the story in *The Times* said the official population in 2016 was 323.1 million. That was on Census Day, which is April 1 of each year. The Census Bureau also provides a dynamic Population Clock that operates continuously. But let's stick with the 323.1 number.

Polynomials like to wiggle. Constrained to match data in a particular interval, they go crazy outside that interval. Today, there are 12 data points. The app lets you vary the polynomial degree between 0 and 11. Polynomials with degree less than 11 approximate the data in a least squares sense. The polynomial of degree 11 interpolates the data exactly. As the degree is increased, the approximation of the data becomes more accurate, but the behavior beyond 2010 (or before 1900) becomes more violent. Here are degrees 2, 7, 9, and 11, superimposed on one plot.

The quadratic fit is the best behaved. When evaluated at year 2016, it misses the target by six million. Of course, there is no reason to believe that the US population grows like a second degree polynomial in time.

The interpolating polynomial of degree 11 tries to escape even before it gets to 2010, and it goes negative late in 2014.

MATLAB has two piecewise cubic interpolating polynomials. The classic `spline` is smooth because it has two continuous derivatives. Its competitor `pchip` sacrifices a continuous second derivative to preserve shape and avoid overshoots. I blogged about splines and pchips a few years ago.

Neither is intended for extrapolation, but we will do it anyway. Their behavior beyond the interval is determined by their end conditions. The classic `spline` uses the so-called *not-a-knot* condition. It is actually a single cubic in the last two subintervals. That cubic is also used for extrapolation beyond the endpoint. `pchip` uses just the last three data points to create a different shape-preserving cubic for use in the last subinterval and beyond.

Let's zoom in on the two. Both are predicting a decreasing rate of growth beyond 2010, just as the Census Bureau is observing. `pchip` gets lucky and comes within 0.2 million of the announcement for 2016.

As I said, there is no good reason to model population growth by a polynomial, piecewise or not. But because the rate of growth can be expected to be proportional to the size of the population, there is good reason to use an exponential.

$$ p(t) \approx a e^{bt} $$

There have been many proposals for ways to modify this model to avoid its unbounded growth. I have just added two of these to `censusapp`. One is the logistic model.

$$ p(t) \approx \frac{a}{1+b e^{-ct}} $$

And the other is the Gompertz double exponential model, named after Benjamin Gompertz, a 19th century self-educated British mathematician and astronomer.

$$ p(t) \approx a e^{-b e^{-ct}} $$

In both of these models the growth is limited because the approximating term approaches $a$ as $t$ approaches infinity.

In all three of the exponential models, the parameters $a$, $b$, and possibly $c$, appear nonlinearly. In principle, I could use `lsqcurvefit` to search in two or three dimensions for a least squares fit to the census data. But I have an alternative. By taking one or two logarithms, I have a *separable least squares* model where at most one parameter, $a$, appears nonlinearly.

For the exponential model, take one logarithm.

$$ \log{p} \approx \log{a} + bt $$

Fit the logarithm of the data by a straight line and then exponentiate the result. No search is required.
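In Python, the whole exponential fit is a dozen lines of ordinary least squares on $\log p$, using the census table above. The helper `predict` is a name invented for this sketch, and $t$ is measured in decades to keep the normal equations well conditioned.

```python
import math

# Separable exponential fit: straight-line least squares on log(p),
# then exponentiate.
years = list(range(1900, 2011, 10))
pops = [75.995, 91.972, 105.711, 123.203, 131.669, 150.697,
        179.323, 203.212, 226.505, 249.633, 281.422, 308.746]

t = [(y - 1900) / 10 for y in years]
logp = [math.log(p) for p in pops]

n = len(t)
tbar = sum(t) / n
ybar = sum(logp) / n
b = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, logp)) / \
    sum((ti - tbar) ** 2 for ti in t)
loga = ybar - b * tbar

def predict(year):
    return math.exp(loga + b * (year - 1900) / 10)

print(predict(2016))    # extrapolated population, in millions
```

As the post goes on to show, this pure exponential extrapolates high; the 2016 value it predicts overshoots the Census Bureau's 323.1 million.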

For the logistic model, take one logarithm.

$$ \log{(a/p-1)} \approx \log{b} - ct $$

For any value of $a$, the parameters $\log{b}$ and $c$ appear linearly and can be found without a search. So use a one-dimensional minimizer to search for $a$. I could use `fminbnd`, or its textbook version, `fmintx`, from *Numerical Methods with MATLAB*.

For the Gompertz model, take two logarithms.

$$ \log{\log{a/p}} \approx \log{b} - ct $$

Again, do a one-dimensional search for the minimizing $a$, solving for $\log{b}$ and $c$ at each step.

Here are the three resulting fits, extrapolated over more than 200 years to the year 2250. The pure exponential model reaches 5 billion people by that time, and is growing ever faster. I think that's unreasonable.

The value of $a$ in the Gompertz fit turns out to be 4309.6, so the population will be capped at 4.3 billion. But the model reaches only 1.5 billion two hundred years from now. Again unlikely.

The value of $a$ in the logistic fit turns out to be 756.4, so the predicted US population will slightly more than double over the next two hundred years. Despite the Census Bureau's observation that our rate of growth has slowed recently, we are not yet even half the way to our ultimate population limit. I'll let you be the judge of that prediction.

I have recently updated Cleve's Laboratory in the MATLAB Central file exchange. One of the updates changes the name of `censusgui` to `censusapp` and adds the two exponential models. If you do install this new version of the Laboratory, you can answer the following question.

The fit generated by `pchip` defines a cubic for use beyond the year 2000 that predicts the population will reach a maximum in the not too distant future and decrease after that. What is that maximum and when does it happen?




I don't recall where I found this seasonal fractal. And I can't explain how it works. So please submit a comment if you can shed any light on either of these questions.

I have featured a fractal at this time of the year in two previous years. I see Christmas trees or perhaps a holly decoration.

All these figures are obtained by varying the parameters in a formula that generates complex numbers $z$ from the partial sums.

$$ z = \sum_n \exp{((\phi n^p + \sigma) \pi i)} $$

A vector of points in the complex plane is produced by taking $n$ to be a vector of consecutive integers and using the MATLAB cumulative summation function `cumsum` to compute the partial sums. There are 8600 points in the figure above and 100,000 points in the figures below.
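The partial sums are a one-loop computation; here is the formula transcribed into Python (for illustration) with the default parameters $\phi$ the golden ratio, $\sigma = 1/8$, and $p = 2$.

```python
import cmath

# Partial sums z_n = sum_k exp((phi*k^p + sigma)*pi*i).
phi = (1 + 5 ** 0.5) / 2
sigma = 1 / 8
p = 2

z, s = [], 0
for k in range(8600):
    s += cmath.exp((phi * k ** p + sigma) * cmath.pi * 1j)
    z.append(s)

print(len(z), abs(z[0]))   # every term lies on the unit circle
```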

The default value of the parameter $\phi$ is my old friend the `golden ratio`.

$$ \phi = \frac{1+\sqrt{5}}{2} $$

In previous posts I've taken $\phi$ to be other rational and irrational numbers, but today I am sticking to this value.

The parameter $\sigma$ controls the angular orientation. Taking $\sigma$ near $1/8$ makes the large Christmas tree vertical.

While trying to understand how this thing works, I've varied the power $p$ from its usual value of 2 and taken hundreds of thousands of points. This produces today's pictures. Different values of $p$ produce wildly different results.

For a real variable $x$, the expression $\exp (x \pi i)$ is periodic and lies on the unit circle in the complex plane. So we are plotting the cumulative sum of values taken from around the unit circle. At first glance, this appears to be a complex valued random number generator. But it is a lousy generator because we can see Christmas trees in the output.

```
type greetings_gifs
```

```
function greetings_gifs
% Generate animated season's greeting gifs.

   % Generate the fractal
   phi = (1+sqrt(5))/2;
   s = 1/8;
   n = 100000;
   for gifnum = 1:4
      switch gifnum
         case 1, p = 2;
         case 2, p = 2/3;
         case 3, p = 1.25;
         case 4, p = 4;
      end
      w = exp((phi*(0:n).^p+s)*pi*1i);
      z = cumsum(w);

      % Find local extrema
      ks = extrema(z);

      % Make an animated gif
      plotit(z,p,ks,gifnum)
   end % gifnum

   % ------------------------

   function ks = extrema(z)
      n = length(z)-1;
      m = n/40;
      ks = [];
      for j = 0:m:n-m
         zj = z(j+1:j+m);
         w = zj - mean(zj);
         k = find(abs(w) == max(abs(w))) + j;
         ks = [ks k];
      end
   end % extrema

   function plotit(z,p,ks,gifnum)
   % Make an animated gif
      shg
      plot(z,'.')
      axis square
      ax = axis;
      gif_frame(['greetings' int2str(gifnum) '.gif'])
      clf
      axis(ax)
      axis square
      gif_frame
      for j = 1:length(ks)
         k = ks(j);
         plot(z(1:k),'k.','markersize',0.5)
         axis(ax)
         axis square
         hold on
         plot(z(ks(1:j)),'g.','markersize',18)
         plot(z(k),'r.','markersize',24)
         hold off
         title(sprintf('p = %4.2f',p))
         gif_frame
      end
      gif_frame(5)
      gif_frame('reset')
   end % plotit

end % greetings_gifs
```

*Happy New Year.*

Get the MATLAB code

Published with MATLAB® R2016a


When I was in high school in the 1950's, I didn't know anything about matrices. But I nearly encountered one when I wrote a paper for my physics class about the color scheme that allowed new color TV broadcasts to be compatible with existing black-and-white TV receivers.

During my high school years in Utah, I worked part-time and summers as a cameraman and projectionist at what was then called KTVT, Channel 4, the local NBC affiliate in Salt Lake City. The TV cameras at the time employed vacuum tubes, not transistors. They were the size of suitcases, weighed over 100 pounds, and were mounted on dollies.

The first coast-to-coast TV broadcast in color was the Tournament of Roses parade on January 1, 1954. KTVT had a prototype color receiver, perhaps the only one in Utah at the time. We set up bleachers in the studio and invited guests to see the new technology.

The NTSC compatible color television standard, adopted in 1953, allows color information to be encoded in a broadcast in such a way that existing black-and-white receivers will still function without modification. The standard remained in use in the US until broadcasters ceased analog transmissions in 2009.

I wrote a paper for my high school physics class about the NTSC compatible color scheme. It involves a *luminance-chrominance* conversion between the RGB, red-green-blue, signal used in color studio equipment and home receivers, and a YIQ encoding. Y represents the brightness, or *luma*, of the signal, and is used by itself for black-and-white. The letters I and Q stand for *in-phase* and *quadrature*. These two components combine with the brightness to produce color.

Here are the conversion formulas that I had in my high school paper.

$$ Y = 0.2989 R + 0.5870 G + 0.1140 B $$

$$ I = 0.5959 R - 0.2744 G - 0.3216 B $$

$$ Q = 0.2115 R - 0.5229 G + 0.3114 B $$

I now know how to write the conversion formulas in matrix notation. If $x$ is the 3-vector containing RGB and $y$ is the 3-vector of YIQ, then

$$ y = A x $$

where $A$ is the matrix

$$ A = \left[ \begin{array}{rrr} 0.2989 & 0.5870 & 0.1140 \\ 0.5959 & -0.2744 & -0.3216 \\ 0.2115 & -0.5229 & 0.3114 \end{array} \right] $$

In retrospect, this was my very first matrix.

Of course, conversion at the receiving end from YIQ to RGB is done by

$$ x = A^{-1} y $$

Fast forward 25 years so we can use MATLAB.

A = [0.2989  0.5870  0.1140
     0.5959 -0.2744 -0.3216
     0.2115 -0.5229  0.3114]
T = inv(A)

A =
    0.2989    0.5870    0.1140
    0.5959   -0.2744   -0.3216
    0.2115   -0.5229    0.3114

T =
    1.0002    0.9560    0.6211
    1.0001   -0.2720   -0.6470
    1.0000   -1.1060    1.7030

The first column of $A^{-1}$ is all 1's, so each of R, G and B gets a full dose of Y.
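Why all 1's? A gray pixel has $R = G = B$, so it should produce pure luminance: $Y = 1$, $I = Q = 0$. That means $A$ maps the vector of all ones (approximately) to the first unit vector, and hence the first column of $A^{-1}$ is all 1's. A pure-Python check of the row sums (illustrative only):

```python
# The YIQ conversion matrix from the post.
A = [[0.2989,  0.5870,  0.1140],
     [0.5959, -0.2744, -0.3216],
     [0.2115, -0.5229,  0.3114]]

# A gray pixel has R = G = B, so A*[1,1,1]' is the vector of row sums:
# approximately [1, 0, 0].  Hence the first column of inv(A) is all 1's.
row_sums = [sum(row) for row in A]
```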

Although I had nothing to do with putting them there, these two matrices are central to the functions `ntsc2rgb` and `rgb2ntsc` in the `colorspaces` folder of the Image Processing Toolbox. In the following code `A` is a 3D full color RGB image.

Here is the core of the NTSC to RGB function. The image is reshaped to a tall, skinny 2D array and the transformation is applied on the right by the transpose of `T`.

dbtype 36:36 ntsc2rgb
dbtype 41:43 ntsc2rgb
dbtype 48:48 ntsc2rgb

36    T = [1.0 0.956 0.621; 1.0 -0.272 -0.647; 1.0 -1.106 1.703];
41    m = size(A,1);
42    n = size(A,2);
43    A = reshape(A(:),m*n,3)*T';
48    A = reshape(A,m,n,3);

The conversion from RGB to NTSC uses *matrix division* on the right by `T` transpose. This may be the only place in MATLAB where the forward slash matrix operator appears.

dbtype 30:31 rgb2ntsc
dbtype 35:35 rgb2ntsc

30    T = [1.0 0.956 0.621; 1.0 -0.272 -0.647; 1.0 -1.106 1.703].';
31    [so(1),so(2),thirdD] = size(A);
35    A = reshape(reshape(A,so(1)*so(2),thirdD)/T,so(1),so(2),thirdD);

Since analog NTSC broadcasts are history, these conversions are probably not very common any more. But the single formula

$$ Y = 0.2989 R + 0.5870 G + 0.1140 B $$

is the basis for the `rgb2gray` function.
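A minimal sketch of that grayscale formula in pure Python (the function name `luma` is mine, not the toolbox's):

```python
def luma(r, g, b):
    # The Y of YIQ: the grayscale weighting used by rgb2gray.
    return 0.2989 * r + 0.5870 * g + 0.1140 * b

# Pure red, green, and blue of equal intensity map to different grays:
# green contributes the most to perceived brightness, blue the least.
grays = [luma(1, 0, 0), luma(0, 1, 0), luma(0, 0, 1)]
```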

NBC introduced a peacock logo to promote the debut of its "in living color" broadcasts. The peacock has lost a few feathers over the years, but it still provides an excellent color test pattern.

(NBC SVG by FOX 52 [Public domain], via Wikimedia Commons)

Here are the red, green and blue components. The bright portions represent high values. All three have relatively bright backgrounds and bright peacocks in the center; these combine to produce white. The R component has bright feathers on the left to contribute to the yellow, orange and red feathers in the full logo. The G component has bright components on the bottom to contribute to the yellow and green in the full. And B has bright feathers on the upper right to give blue and purple.

Convert to YIQ. The Y component is the grayscale rendition to drive old black-and-white TVs. The I component goes from orange to blue and the Q component goes from purple to green.

Talking about color, here is one of my favorite graphics. Let's start with an animated .gif of `colorcubes(3)`, which contains 27 cubelets. If you run `colorcubes` yourself, you will drive the rotation with your mouse.

You can see white at one vertex and black at the opposite vertex. Red, a bright green, and blue are at three vertices, while their complements -- cyan, magenta, and yellow -- oppose them. Actually, the green vertex is too bright. It is often called "lime". A darker green is along the bottom edge, halfway between lime and black. Here's a little quiz: What color is the hidden cubelet in the center?

Now a look at the construction of `colorcubes(5)`.

Here is the `colorcubes` code.

```
type colorcubes
```

function colorcubes(n,w)
% COLORCUBES  A cube of cubes in the RGB color space.
%   COLORCUBES, with no arguments, shows 5^3 = 125 cubes with
%   colors equally spaced in the RGB color space.
%   COLORCUBES(n) shows n-by-n-by-n colors.
%   COLORCUBES(2) shows 8 colors: R, G, B, C, M, Y, W, K (black).
%   COLORCUBES(n,w) uses cubes of width w.  Default is w = 0.85.
%   Rotate the cube with the mouse or arrow keys.

%   Copyright 2016 The MathWorks, Inc.

   if nargin < 1, n = 5; end
   if nargin < 2, w = 0.85; end
   initgraphics(n)
   [x,y,z] = cube(w);
   m = n-1;
   for i = m:-1:0
      for j = m:-1:0
         for k = 0:m
            r = k/m;
            g = 1-j/m;
            b = 1-i/m;
            surface(i+x,j+y,k+z, ...
                'facecolor',[r g b], ...
                'facelighting','gouraud');
            drawnow
         end %k
      end %j
   end %i

   % ------------------------

   % INITGRAPHICS  Initialize the colorcubes axis.
   %   INITGRAPHICS(n) for n-by-n-by-n display.
   function initgraphics(n)
      clf reset
      shg
      set(gcf,'color','white')
      axis([0 n 0 n 0 n]);
      axis off
      axis vis3d
      rotate3d on
   end %initgraphics

   function [x,y,z] = cube(w)
   % CUBE  Coordinates of the faces of a cube.
   %   [x,y,z] = cube(w); surface(x,y,z)
   %   plots a cube of width w.
      u = [0 0; 0 0; w w; w w];
      v = [0 w; 0 w; 0 w; 0 w];
      z = [w w; 0 0; 0 0; w w];
      s = [nan nan];
      x = [u; s; v];
      y = [v; s; u];
      z = [z; s; w-z];
   end %cube

end % colorcubes


Published with MATLAB® R2016a


(Please replace the erroneous posting from yesterday, Nov. 28, with this corrected version.)

Here is a very short course in Linear Algebra. The Singular Value Decomposition provides a natural basis for Gil Strang's Four Fundamental Subspaces.

Screen shot from Gil Strang MIT/MathWorks video lecture, "The Big Picture of Linear Algebra".

Gil Strang tells me that he began to think about linear algebra in terms of four fundamental subspaces in the 1970's when he wrote the first edition of his textbook, *Introduction to Linear Algebra*. The fifth edition, which was published last May, features the spaces on the cover.

The concept is a centerpiece in his video lectures for MIT course 18.06. It even found its way into the new video series about ordinary differential equations that he and I made for MIT and MathWorks. His paper included in the notes for 18.06 is referenced below.

Suppose that $A$ is an $m$ -by- $n$ matrix that maps vectors in $R^n$ to vectors in $R^m$. The four fundamental subspaces associated with $A$, two in $R^n$ and two in $R^m$, are:

- column space of $A$, the set of all $y$ in $R^m$ resulting from $y = Ax$,
- row space of $A$, the set of all $x$ in $R^n$ resulting from $x = A^Ty$,
- null space of $A$, the set of all $x$ in $R^n$ for which $Ax = 0$,
- left null space of $A$, the set of all $y$ in $R^m$ for which $A^T y = 0$.

The row space and the null space are *orthogonal* to each other and span all of $R^n$. The column space and the left null space are also *orthogonal* to each other and span all of $R^m$.

The *dimension* of a subspace is the number of linearly independent vectors required to span that space. *The Fundamental Theorem of Linear Algebra* is

- The dimension of the row space is equal to the dimension of the column space.

In other words, the number of linearly independent rows is equal to the number of linearly independent columns. This may seem obvious, but it is actually a subtle fact that requires proof.

The *rank* of a matrix is this number of linearly independent rows or columns.
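The theorem can be checked numerically. Here is a pure-Python sanity check via Gaussian elimination (an illustration only; in floating point the rank should really be computed from the SVD, as described below):

```python
def rank(M, tol=1e-10):
    # Rank via Gaussian elimination with partial pivoting (pure Python).
    M = [row[:] for row in M]
    nrows, ncols = len(M), len(M[0])
    r = 0
    for c in range(ncols):
        if r == nrows:
            break
        pivot = max(range(r, nrows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, nrows):
            f = M[i][c] / M[r][c]
            for j in range(c, ncols):
                M[i][j] -= f * M[r][j]
        r += 1
    return r

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]                       # rank 2: row3 = 2*row2 - row1
At = [list(col) for col in zip(*A)]   # transpose

# The number of independent rows equals the number of independent columns.
```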

The natural bases for the four fundamental subspaces are provided by the SVD, the Singular Value Decomposition, of $A$.

$$ A = U \Sigma V^T $$

The matrices $U$ and $V$ are *orthogonal*.

$$ U^T U = I_m, \ \ V^T V = I_n $$

You can think of orthogonal matrices as multidimensional generalizations of two dimensional rotations. The matrix $\Sigma$ is *diagonal*, so its only nonzero elements are on the main diagonal.

The shape and size of these matrices are important. The matrix $A$ is rectangular, say with $m$ rows and $n$ columns; $U$ is square, with the same number of rows as $A$; $V$ is also square, with the same number of columns as $A$; and $\Sigma$ is the same size as $A$. Here is a picture of this equation when $A$ is tall and skinny, so $m > n$. The diagonal elements of $\Sigma$ are the *singular values*, shown as blue dots. All of the other elements of $\Sigma$ are zero.

The signs and the ordering of the columns in $U$ and $V$ can always be taken so that the singular values are nonnegative and arranged in decreasing order.

For any diagonal matrix like $\Sigma$, it is clear that the rank, which is the number of independent rows or columns, is just the number of nonzero diagonal elements.

In MATLAB, the SVD is computed by the statement.

[U,Sigma,V] = svd(A)

With inexact floating point computation, it is appropriate to take the rank to be the number of *nonnegligible* diagonal elements. So the function

r = rank(A)

counts the number of singular values larger than a tolerance.
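For a 2-by-2 matrix the singular values have a closed form, the square roots of the eigenvalues of $A^TA$, which makes the tolerance-counting idea easy to sketch in pure Python. This illustrates what `rank` counts, not its implementation; the matrix is the one from the example later in the post.

```python
import math

def singular_values_2x2(a, b, c, d):
    # Singular values of [[a, b], [c, d]]: square roots of the
    # eigenvalues of the symmetric 2-by-2 matrix A'*A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    disc = math.sqrt(((p - r) / 2) ** 2 + q * q)
    mid = (p + r) / 2
    return [math.sqrt(max(mid + disc, 0.0)), math.sqrt(max(mid - disc, 0.0))]

def numerical_rank(svals, tol=1e-10):
    # rank = number of singular values larger than a tolerance
    return sum(1 for s in svals if s > tol)

svals = singular_values_2x2(-3, -9, 4, 12)
```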

Multiply both sides of $A = U\Sigma V^T $ on the right by $V$. Since $V^T V = I$, we find

$$ AV = U\Sigma $$

Here is the picture. I've drawn a green line after column $r$ to show the rank. The only nonzero elements of $\Sigma$, the singular values, are the blue dots.

Write out this equation column by column.

$$ Av_j = \sigma_j u_j, \ \ j = 1,...,r $$

$$ Av_j = 0, \ \ j = r+1,...,n $$

This says that $A$ maps the first $r$ columns of $V$ onto nonzero multiples of the first $r$ columns of $U$ and maps the remaining columns of $V$ onto zero.

- $U(:,1:r)$ spans the column space.
- $V(:,r+1:n)$ spans the null space.
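These column-by-column relations can be verified directly. A pure-Python check, using the rounded four-digit values from the 2-by-2 example worked later in the post:

```python
A = [[-3, -9],
     [4, 12]]
U = [[-0.6000, 0.8000],
     [0.8000, 0.6000]]
V = [[0.3162, -0.9487],
     [0.9487, 0.3162]]
sigma1 = 15.8114          # the only nonzero singular value; r = 1

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

v1 = [V[0][0], V[1][0]]   # first right singular vector
v2 = [V[0][1], V[1][1]]   # second right singular vector
u1 = [U[0][0], U[1][0]]

Av1 = matvec(A, v1)       # should match sigma1 * u1 (column space)
Av2 = matvec(A, v2)       # should be nearly zero    (null space)
```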

Transpose the equation $A = U\Sigma V^T $ and multiply both sides on the right by $U$. Since $U^T U = I$, we find

$$ A^T U = V \Sigma^T $$

Here's the picture, with the green line at the rank.

Write this out column by column.

$$ A^T u_j = \sigma_j v_j, \ \ j = 1,...,r $$

$$ A^T u_j = 0, \ \ j = r+1,...,m $$

This says that $A^T$ maps the first $r$ columns of $U$ onto nonzero multiples of the first $r$ columns of $V$ and maps the remaining columns of $U$ onto zero.

- $V(:,1:r)$ spans the row space.
- $U(:,r+1:m)$ spans the left nullspace.

Here is an example involving lines in two dimensions, so $m = n = 2$. Start with these vectors.

u = [-3 4]'
v = [1 3]'

u =
    -3
     4

v =
     1
     3

The matrix $A$ is their outer product.

A = u*v'

A =
    -3    -9
     4    12

Compute the SVD.

[U,Sigma,V] = svd(A)

U =
   -0.6000    0.8000
    0.8000    0.6000

Sigma =
   15.8114         0
         0         0

V =
    0.3162   -0.9487
    0.9487    0.3162

As expected, $\Sigma$ has only one nonzero singular value, so the rank is $r = 1$.

The first left and right singular vectors are our starting vectors, normalized to have unit length.

ubar = u/norm(u)
vbar = v/norm(v)

ubar =
   -0.6000
    0.8000

vbar =
    0.3162
    0.9487

The columns of $A$ are proportional to each other, and to $\bar{u}$. So the column space is just the line generated by multiples of either column and $\bar{u}$ is the normalized basis vector for the column space. The columns of $A^T$ are proportional to each other, and to $\bar{v}$. So $\bar{v}$ is the normalized basis vector for the row space.

The only nonzero singular value is the product of the normalizing factors.

sigma = norm(u)*norm(v)

sigma = 15.8114

The second right and left singular vectors, that is the second columns of $V$ and $U$, provide bases for the null spaces of $A$ and $A^T$.
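A pure-Python check of those two null spaces, using the rounded values above:

```python
A = [[-3, -9],
     [4, 12]]
u2 = [0.8000, 0.6000]      # second column of U
v2 = [-0.9487, 0.3162]     # second column of V

# A maps v2 (nearly) to zero; A' maps u2 to zero.
Av2 = [A[0][0] * v2[0] + A[0][1] * v2[1],
       A[1][0] * v2[0] + A[1][1] * v2[1]]
Atu2 = [A[0][0] * u2[0] + A[1][0] * u2[1],
        A[0][1] * u2[0] + A[1][1] * u2[1]]
```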

Here is the picture.

Gilbert Strang, "The Four Fundamental Subspaces: 4 Lines", undated notes for MIT course 18.06, <http://web.mit.edu/18.06/www/Essays/newpaper_ver3.pdf>

Gilbert Strang, *Introduction to Linear Algebra*, Wellesley-Cambridge Press, fifth edition, 2016, x+574 pages, <http://bookstore.siam.org/wc14>

Gilbert Strang, "The Fundamental Theorem of Linear Algebra", *The American Mathematical Monthly*, Vol. 100, No. 9. (Nov., 1993), pp. 848-855, <http://www.jstor.org/stable/2324660?seq=1#page_scan_tab_contents>, also available at <http://www.souravsengupta.com/cds2016/lectures/Strang_Paper1.pdf>


Published with MATLAB® R2016b



My favorite ordinary differential equation provides a good test of ODE solvers, both numeric and symbolic. It also provides a nice illustration of the underlying existence theory and error analysis.

Here is one of my favorite ordinary differential equations. The dot means differentiation with respect to $t$. The equation does not express $\dot{y}$ directly, so it is implicit.

$$ \dot{y}^2 + y^2 = 1 $$

We don't need a computer to verify that the analytic solution is a combination of $\sin{t}$ and $\cos{t}$. But the constants $y=1$ and $y=-1$ are also solutions.

Our ODE solvers require an explicit equation, but when we solve this equation for $\dot{y}$ we get two solutions, differing in sign.

$$ \dot{y} = \pm \sqrt{1-y^2} $$

That $\pm$ sign makes this interesting.
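A quick pure-Python check makes the sign issue concrete: $y = \sin{t}$ satisfies the implicit equation everywhere, but its derivative $\cos{t}$ switches from the $+$ branch of the square root to the $-$ branch at $t = \pi/2$.

```python
import math

ts = [0.1 * k for k in range(32)]            # samples of t in [0, 3.1]
# ydot^2 + y^2 - 1 vanishes for y = sin(t), ydot = cos(t).
residual = [math.cos(t) ** 2 + math.sin(t) ** 2 - 1 for t in ts]
# Which branch of ydot = +/- sqrt(1 - y^2) are we on?
branch = [1 if math.cos(t) >= 0 else -1 for t in ts]
```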

Let's plunge ahead with the $+$ sign. Specify the equation, an interval for $t$ and an initial condition.

```
f = @(t,y) sqrt(1-y^2)
format short
tspan = [0 pi]
y0 = 0
```

f = @(t,y)sqrt(1-y^2)

tspan =
         0    3.1416

y0 =
     0

An attempt to compute a numeric solution with `ode45` runs into trouble when $y$ gets slightly larger than $1$ and the square root of a negative number generates an imaginary result. So temporarily use a shorter $t$ interval.

try
   ode45(f,tspan,y0);
catch
   ode45(f,[0 1.47],y0)
end
pi_axis([-0.1 1.1])

Error using odeplot (line 63)
Error updating the ODEPLOT window. Solution data may have been corrupted. Argument Y cannot be complex.

Error in ode45 (line 435)

Change the equation so that it provides a guard against the square root of negative numbers.

f = @(t,y) sqrt(1-min(y,1)^2)
ode45(f,tspan,y0);
pi_axis([-0.1 1.1])

f = @(t,y)sqrt(1-min(y,1)^2)

This is the correct solution, $\sin{t}$. For a while. But when $y$ reaches $1$ it has to stay there because its derivative is always nonnegative.

If we want to continue with $\sin{t}$ beyond $\pi/2$ we have to switch to the minus sign.

f = @(t,y) (1-2*double(t>pi/2))*real(sqrt(1-y.^2))
ode45(f,tspan,y0)
pi_axis([-0.1 1.1])

f = @(t,y)(1-2*double(t>pi/2))*real(sqrt(1-y.^2))

We're solving the autonomous differential equation,

$$ \dot{y} = f(y) $$

where

$$ f(y) = \sqrt{1-y^2} $$

An important quantity in both the existence theory for ODEs and the error analysis for numeric methods is the *Jacobian*.

$$ J = \frac{\partial{f}}{\partial{y}} $$

In this case $J$ is 1-by-1.

$$ J = -\frac{y}{\sqrt{1 - y^2}} $$

When $y$ approaches $\pm 1$ the Jacobian approaches infinity.

Existence and uniqueness theorems for ODEs rely on a bounded Jacobian (or a bounded Lipschitz constant). The derivations of numerical methods rely on a bounded Jacobian. Error control in modern ODE software relies on a bounded Jacobian. All of these are up in the air with this example.
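A tiny pure-Python sketch (illustrative only) shows the blow-up as $y$ approaches 1:

```python
import math

def jacobian(y):
    # J = -y / sqrt(1 - y^2), the 1-by-1 Jacobian of f(y) = sqrt(1 - y^2)
    return -y / math.sqrt(1 - y * y)

# |J| grows without bound as y -> 1.
vals = [abs(jacobian(1 - 10.0 ** (-k))) for k in range(2, 10)]
```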

We can see the lack of uniqueness happening when we change the initial condition to $y(0) = -1$. The Jacobian is infinite at the start and both the constant $y(t) = -1$ and the trig function $y(t) = -\cos{t}$ are solutions. A small perturbation in the initial condition switches the numerical solution from one to the other.

f = @(t,y) sqrt(1-max(min(y,1),-1)^2)
y0 = -1
ode45(f,tspan,y0);
pi_axis([-1.2 0])
hold on
format long
y0 = -.99999999
ode45(f,tspan,y0);
pi_axis([-1.1 1.1])
hold off

f = @(t,y)sqrt(1-max(min(y,1),-1)^2)

y0 =
    -1

y0 =
  -0.999999990000000

I was curious to see how the Symbolic Toolbox handles this equation with this initial condition. There is good news and surprising news.

```
syms y(t)
ode = diff(y)^2+y^2==1
init = y(0)==-1
ysym = dsolve(ode,init)
```

ode(t) = diff(y(t), t)^2 + y(t)^2 == 1

init = y(0) == -1

ysym =
  -1
  -cosh(t*1i)

I am pleased to see the two solutions confirmed, but I am surprised to see the second one expressed as a hyperbolic trig function of an imaginary argument.

Simplification generates a more familiar form.

simplify(ysym)

ans =
  -1
  -cos(t)

Let's return to the original implicit equation and differentiate the entire equation with respect to $t$.

$$ \frac{d}{dt} \left( \dot{y}^2 + y^2 = 1 \right) $$

$$ 2 \dot{y} \ddot{y} + 2 y \dot{y} = 0 $$

We can divide through this equation by $2 \dot{y}$, provided that ${\dot{y}}$ is nonzero. This eliminates the constant solutions and produces an old friend, the harmonic oscillator.

$$ \ddot{y} + y = 0 $$

Our ODE solvers need the second order equation to be expressed as a first order system.

$$ \dot{y_1} = y_2, \ \ \dot{y_2} = -y_1 $$

Now there is no trouble computing $\sin{t}$ and $-\cos{t}$.

f = @(t,y)[y(2); -y(1)]
y0 = [0; -1]
ode23(f,tspan,y0)
pi_axis([-1 1])

f = @(t,y)[y(2);-y(1)]

y0 =
     0
    -1
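As a cross-check outside MATLAB, the same first-order system can be integrated with a few lines of classical Runge-Kutta. This pure-Python sketch (a fixed-step stand-in, not the toolbox solver) marches the state from $y(0) = [0, -1]$ to $t = \pi$, where the exact state is $[0, 1]$:

```python
import math

def f(t, y):
    # harmonic oscillator as a first-order system: y1' = y2, y2' = -y1
    return [y[1], -y[0]]

def rk4(f, tspan, y0, nsteps):
    # classical fourth-order Runge-Kutta with fixed steps
    t, y = tspan[0], list(y0)
    h = (tspan[1] - tspan[0]) / nsteps
    for _ in range(nsteps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

y_end = rk4(f, [0, math.pi], [0, -1], 200)
```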

I have been using this little function to label the x-axis with multiples of $\pi$ and scale the y-axis.

```
type pi_axis
```

function pi_axis(yaxis)
   axis([0 pi yaxis])
   set(gca,'xtick',0:pi/4:pi, ...
       'xticklabels',{0 '\pi/4' '\pi/2' '3\pi/4' '\pi'})
end


Published with MATLAB® R2016a


I am launching a project that I call *Cleve's Laboratory*. My most recent Cleve's Corner column in MathWorks News & Notes is about the project. The MATLAB app itself is available for download from the MATLAB Central File Exchange.

*Cleve’s Laboratory* collects much of the work I have done over the last several years in one place. The experiments come from this blog, my two online books, *Cleve’s Corner* columns in News & Notes, and new work.

Each experiment centers on an interactive “app” that allows experimenters to try out the ideas for themselves. Several of the experiments, including `hello_world`, `klock`, and `biorhythms`, are designed to introduce newcomers to MATLAB.

This figure shows snapshots of some of the graphical entry pages. Even these are live pages driven by MATLAB. The wave in the first icon moves every time you open the page. The clock reads the correct time. The Sudoku puzzle changes. The double pendulum swings.

Here are the apps in the current release. If you have been following this blog, you might recognize some of them.

- logo_wave
- biorhythms
- klock
- lifex
- fibonacci
- fern
- house_mult
- t_puzzle
- tictactoe
- flame
- predprey
- mandelbrot
- durerperm
- waterwave
- expshow
- sudoku
- orbits
- golden_spiral
- walker
- hello_world
- blackjack
- swinger
- eigsvdapp
- censusapp
- touchtone
- pdeapp
- interpapp
- waves
- tumbling_box


Published with MATLAB® R2016a