I recently started following a new blog by Martin Trauth, called MATLAB Recipes for Earth Sciences. Even if you are not an earth scientist (which, by the way, is part of my scientific lineage), you may find many useful nuggets there.

Martin is a geoscience professor at the University of Potsdam. He's written well-regarded textbooks, one with the same title as his blog (to which he currently posts at a very healthy pace, one I cannot sustain!).

Martin's posts cover a huge range of topics, many of which are helpful to, though not at all specific to, geoscientists. Some notable titles I read recently include:

- Missing Version History of MATLAB Functions
- Counting Flamingos with MATLAB
- Better Avoid Running Means
- Detecting Change Points in Time Series with MATLAB
- Calculating 3D Point Clouds From Stereo Images Using MATLAB
- Classical Linear Regression of Log-Transformed Data

In addition to posts on computation, Martin's blog also addresses working with Lego Mindstorms hardware with MATLAB. And more.

I think Martin's site is one you might want to start following. Do you know of other sites in a similar vein? Let me know here.


Published with MATLAB® R2017a

Today I’d like to introduce a guest blogger, David Garrison, who is a MATLAB Product Manager here at MathWorks. This is the first in a series of blogs over the next few months describing MathWorks involvement in an exciting project related to the 2017 solar eclipse.

- Part 1: The Citizen CATE Experiment
- Part 2: Training the Volunteers
- Part 3: Rehearsing for the Eclipse
- Part 4: Imaging the Eclipse

Here is Part 1 of the Series.

As many of you know, a total solar eclipse will occur on August 21, 2017, with the path of totality passing over the continental United States. This is a rare event: the last total eclipse visible from the U.S. occurred in February 1979. The path of totality for the 2017 eclipse is shown on the map below.

For solar astronomers, a solar eclipse is the best time to observe the sun’s corona (its outer atmosphere). The glow of the corona is a million times less bright than the sun’s photosphere so the only time to observe and gather data on the corona is when the sun’s disk is blocked out during an eclipse. The 2017 eclipse will provide solar astronomers an unprecedented opportunity to gather data. That data will be used for both static and dynamic studies of the corona.

The Citizen CATE Experiment is a citizen science project run by the National Solar Observatory (NSO). Dr. Matthew Penn is the project's Principal Investigator. The project's objective is to capture images of the sun's corona at sites along the eclipse path using a network of telescopes operated by volunteers. Those volunteers include high school and university groups, astronomy clubs, and individuals. Here is the project’s vision:

*The CATE Experiment should unify people from coast to coast, make them aware that they are part of something bigger than themselves, re-ignite a new wave of citizen science, and inspire students by showing them a path to becoming a scientist.*

Volunteers will use a standard set of equipment including a telescope, camera, and laptop computer provided to them by the project. Software will be used to acquire the camera images and do some initial pre-processing of the data. After the data is taken, it will be uploaded to NSO servers and made available to the public. A full description of the science goals of the project can be found here.

Volunteers will be trained by state coordinators in equipment use and data collection. After the eclipse, volunteers will keep the equipment or donate it to a local middle school or high school. There is interest from other researchers in using this network of volunteers to do future large scale astronomical studies.

I first heard about the Citizen CATE Experiment when I attended the 2015 Northeast Astronomy Forum (NEAF) – a conference and tradeshow for amateur astronomers. Dr. Penn was one of the speakers at the conference and described his idea to create a network of citizen scientists spread out across the continental U.S. to image the eclipse. As I watched his presentation, two thoughts occurred to me. First, as an amateur astronomer, I wanted to be involved in the project as a volunteer. Second, the eclipse will be a big science event in 2017 witnessed by millions of people in the U.S. and maybe there is an opportunity for MathWorks to help.

After the NEAF presentation, I submitted my name to be a project volunteer. I have been accepted and will collect data on the eclipse in Hartsville, Tennessee. Two other MathWorks staff have volunteered and will observe the eclipse from sites in South Carolina. MathWorks also agreed to be a Citizen CATE sponsor. We are providing some financial support as well as providing software for the volunteers. A full list of project sponsors can be found on the Citizen CATE website.

During the eclipse, volunteers will use MATLAB, the Image Acquisition Toolbox, and the Image Processing Toolbox to record and process images from the eclipse. Images will be recorded at several exposures and combined into a set of high dynamic range images of the corona. More details on that in a subsequent post.
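To sketch the basic idea (illustrative only: the file names and exposure times below are made up, and the actual Citizen CATE processing will certainly differ), frames taken at several exposures can be scaled to a common intensity estimate and combined:

```matlab
% Illustrative sketch only -- not the Citizen CATE pipeline.
% Hypothetical frames of the same scene at three exposure times (ms).
frames    = {'corona_short.tif','corona_mid.tif','corona_long.tif'};
exposures = [0.4 1.6 6.4];

hdr = 0;
for k = 1:numel(frames)
    img = double(imread(frames{k}));   % read one exposure
    hdr = hdr + img / exposures(k);    % scale intensities by exposure time
end
hdr = hdr / numel(frames);             % average the scaled estimates
imshow(hdr, [])                        % display with automatic intensity scaling
```

Real HDR pipelines also handle saturated and under-exposed pixels with per-pixel weights; this sketch just shows why multiple exposures help recover the corona's huge brightness range.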

Are you planning to travel to see the eclipse? Are there any local eclipse events in your area? We'd love to hear your plans here.

That's all for now. In my next post I'll describe the equipment and software for the project and the process for training the Citizen CATE Volunteers.


Today's guest blogger is Anoush Najarian who leads the MATLAB Performance Team at MathWorks. And while she mostly focuses on helping MATLAB run fast, in her spare time, she likes to use MATLAB for hobby projects in robotics, math, and games.

Let me tell you what happened when I got tired of losing at Mancala, and decided to write some MATLAB code to play it. Shout-out to my daughter, sixth grader Natalie, for introducing me to the game, and being a partner in these experiments.

Now, there are many ways to play the games in the Mancala family. The rule set we wrote the code for is: you pick from any hole and drop one stone at a time while circling the board counterclockwise, dropping a stone into your home whenever you pass through it. If you drop your last stone into your home, you get a 'free' turn. If you drop your last stone into a non-empty hole, you get to continue with what I call an 'automatic' move, picking up all the stones from that hole. In the initial position, there are four stones in every hole.

You know how some games have a first-player advantage? It turns out that in Mancala, you can find a way not only to win (which is nice), but to win all the marbles (awesome), and to do so on your very first move!

```
% Here is driver code to find one (of many!)
% all-48-marble-win-on-first-move solutions, which runs in ~20s on my
% laptop!
gametrees = {0 [4 4 4 4 4 4 4 4 4 4 4 4] []};
nummoves = 0;
while nummoves < 50
    newgametrees = {};
    L = size(gametrees, 1);
    [maxscore, ind] = max(cell2mat(gametrees(:, 1)));
    [totalscore, board, winningstreak] = gametrees{ind, :}; % display one highest-score entry
    for g = 1:L
        [totalscore, board, winningstreak] = gametrees{g, :};
        for t = 1:12
            if board(t) == 0, continue, end
            [score, freemove, newboard] = mancalafirstmove(t, board);
            if freemove
                if nummoves < 9 || totalscore + score > maxscore * 0.75 % optimization to prune search tree
                    newwinningstreak = [winningstreak, t];
                    newgametrees(end+1, :) = {totalscore+score, newboard, newwinningstreak};
                end
            end
            if totalscore+score == 48
                disp('Found 48-marble winning sequence!');
                disp(newwinningstreak);
                return;
            end
        end
    end
    gametrees = newgametrees;
    nummoves = nummoves + 1;
end

% The driver code calls a move function which runs through 'automatic'
% moves recursively. Our code generates a 30-step-long sequence of plays
% for the sweeping 48-marble win on your first move!
function [score, freemove, board] = mancalafirstmove(apick, board)
score = 0;
moves = eye(12);
pickspot = apick;
freemove = mancalamove(pickspot);

    function freemove = mancalamove(pickspot)
        numpieces = board(pickspot);
        board(pickspot) = 0;
        addapiece = moves(pickspot, :);
        gain = floor((numpieces-pickspot)/12) + 1;
        score = score + gain;
        freemove = (pickspot == mod(numpieces, 12));
        numpieces = numpieces - gain;
        for i = 1:numpieces
            board = circshift(addapiece, -1*i) + board;
        end
        if not(freemove)
            newpickspot = mod(pickspot-numpieces, 12);
            if newpickspot == 0, newpickspot = 12; end
            if board(newpickspot) > 1
                freemove = mancalamove(newpickspot);
            end
        end
    end
end
```

```
Found 48-marble winning sequence!
     8     5     4     2     9     5     7    11     9     5     3     9     5
     1     2     1    12     1    10     1     3     1     8     1     6     1
     4     1     2     1
```

Let the Battle of the First Move play itself out!

Some of the other Mancala rule sets out there include: no 'free' move, no 'automatic' move, picking only from the side of the board you are sitting next to, and different numbers of holes and marbles!

For example, suppose 'automatic' moves and free moves are allowed, but you can only pick from your side of the board. Then you can still win, capturing a pretty impressive 42 marbles on your first move! A tiny change in the driver code (loop over 1:6 instead of 1:12) will give you the sequence of plays to use for this variation!

What we'd really like to build up to here is to use the game-playing code for training the AI. We recently watched exciting videos like Deep Learning in 11 Lines of MATLAB Code, and are eager to try deep reinforcement learning for games.

Let us know about your experiments with coding and modeling games here!


*Today's guest post comes from Sean de Wolski, one of my fellow Application Engineers. You might recognize him from MATLAB answers and the pick of the week blog!*

Have you ever wanted MATLAB to pause in debug mode when a certain condition is met? Perhaps when a specific problem-file out of a directory is read or on an iteration of a `for`-loop that yields an unexpected result? Perhaps that `for`-loop has to iterate a few dozen or hundred times before the problem iteration occurs and you don't want to walk through step by step in debug mode?

In this post, we're going to survey a few different ways to handle this.

I was talking with a MATLAB user recently about the merits of the `keyboard` function, of which I'm not a big fan. `keyboard` pulls you into debug mode as soon as it's encountered. They claimed to like it for difficult debugging when certain criteria are met.

Here's a simple example scenario. CM is the cameraman image:

CM = imread('cameraman.tif');
imshow(CM)

This is a simple algorithm to show how many pixels are brighter than each pixel value in the *uint8* range of `[0 255]`.

```
xg = zeros(1, 256);
for ii = 0:255
    xg(ii+1) = sum(CM(:)>ii);
end
plot(0:255, xg)
ylabel('Pixels Greater Than X')
axis tight
```

Somewhere around iteration 125 appears to be giving me problems; I'd expect the curve to be positive through to 255.

I want to see what's happening on iteration 125 and the subsequent ones. One way would be to put a breakpoint on the calculation line, hit run, and then hit the continue button 125 times, hoping I don't get into a rhythm and accidentally skip it. This has obvious shortcomings: it doesn't scale and is time consuming.

The approach with `keyboard` would look like this:

```
xg = zeros(1, 256);
for ii = 0:255
    if ii == 125
        keyboard
    end
    xg(ii+1) = sum(CM(:)>ii);
end
```

Now on the 125th iteration, we enter the if-statement and can step in debug mode into the problem area. This is pretty straightforward and is a safe, decent solution. However, it requires changing the code in order to debug it, and changing it again once we've fixed the bug.

What if we could just stop there? Well, you can! `dbstop`, the function underlying breakpoints, supports a wide range of conditional breakpoints.

These breakpoints can be set from the breakpoints menu on the editor tab or with `dbstop` directly. Here's the same code encapsulated in a function called `badForLoop`.

```
function badForLoop(CM)
xg = zeros(1, 256);
for ii = 0:255
    xg(ii+1) = sum(CM(:)>ii);
end
plot(0:255, xg)
ylabel('Pixels Greater Than X')
axis tight
end
```

To set it from the breakpoints menu, select "Set Condition" while the cursor is on the line you'd like to stop on.

Then enter your condition; any valid MATLAB code will work here.

To set the same breakpoint programmatically:

dbstop in badForLoop at 4 if (ii==125)

badForLoop(CM);

If you think you have a fix, you can disable the breakpoint without clearing it by clicking on the yellow breakpoint itself (an X will appear over it, indicating it's disabled). If the problem is resolved, clicking the breakpoint again will clear it, or you can re-enable it by right-clicking. This can be done from the menu as well, and clearing can be done with `dbclear`.

The analogous workflow with `keyboard` would be to highlight the selection and comment it out to "disable" and then delete to clear.

The fix for this toy problem was that the image was stored as an *int8* with a range `[-128 127]` rather than a *uint8* with range `[0 255]`.
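One way such a mis-stored image might be repaired (a sketch; `CMbad` below is a tiny stand-in matrix, not the actual image):

```matlab
% Hypothetical repair: shift int8 data ([-128 127]) into uint8 range ([0 255]).
CMbad  = int8([-128 -1; 0 127]);      % stand-in for the mis-stored image
CMgood = uint8(int16(CMbad) + 128);   % widen to int16, shift, narrow to uint8
% CMgood is now [0 127; 128 255]
```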

Additionally, `dbstop` has some predefined conditions to stop on, including on errors, warnings, or when NaNs or Infs are encountered. If you haven't discovered `dbstop if error` yet, I strongly suggest giving it a try. When an error occurs, it stops so you can view the state of the world as it was when the error occurred.
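The predefined conditions are set straight from the command line; this quick sketch shows the ones I reach for most:

```matlab
dbstop if error     % pause wherever an error is thrown, with the workspace intact
dbstop if warning   % pause when any warning is issued
dbstop if naninf    % pause when a NaN or Inf is first produced
% ... run the code you are debugging, poke around in debug mode ...
dbclear all         % remove every breakpoint when you're done
```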

Do you have any debugging war stories where one of these tricks could've helped? What about other uses for the `keyboard` command? Let us know here.


Disclaimer: I read very few MATLAB books. I'm thinking you can guess why!

I recently got a copy of the latest edition of this book, MATLAB Guide, Third Edition, by Desmond and Nick Higham.

And I have enjoyed wandering through it. Des and Nick bring the clarity of their teaching to the book (or is it the other way around?). It is both a good reference and an introduction to MATLAB, though it doesn't replace (nor does it preclude) getting your hands on the keyboard and working problems out for yourself; they encourage you to get tapping away to build your own MATLAB experience.

The book's topics span a large area of computing, e.g., matrices, linear algebra (of course!), numerical methods for solving ODEs and PDEs, and an introduction to object-oriented programming towards the end. In fact, they introduce you to many topics that could be of interest, from sparse arrays to big data. In addition, the Highams introduce a couple of important toolboxes: Symbolic Math and Parallel Computing.

In the book, I especially like the new Asides in gray boxes (that would be grey to many non-Americans) that give nice background material and reference for a wide variety of topics, from lambda calculus to reproducible research. The latter is, of course, an important topic for all technical people, though not always recognized as such.

Towards the end, the book includes the authors' list of the top 111 MATLAB functions. It's a useful lens, fun to go through, and it provided me with a chuckle, because when I talk to people about the top N anything in MATLAB, I invariably end up introducing more like 2*N topics!

If you want a solid book in hand for teaching, learning, or reference, with access to programs, links, and resources, you just might want to check this one out.


*I'd like to introduce today's guest blogger, Dave Bergstein, a MATLAB Product Manager at MathWorks. In today's post, Dave discusses recent updates to text processing with MATLAB.*

In today's post I share a text processing example using the new string array and a collection of new text manipulation functions, both introduced in R2016b. I also give recommendations on when best to use `string`, `char`, or `cell` for text and share some of our thinking on the future.

Also be sure to check out Toshi's post Introducing String Arrays and Loren's post Singing the Praises of Strings.

My friend in New York City talks about the delays on her bus route. Let's look at some data to see what typical delays are for trains and buses. The Open NY Initiative shares data which includes over 150,000 public transit events spanning 6 years. I downloaded this data as a CSV file from: https://data.ny.gov/Transportation/511-NY-MTA-Events-Beginning-2010/i8wu-pqzv

**Import the Data**

I read the data into a table using `readtable` and specify the `TextType` name-value pair as `string` to read the text as string arrays.

data = readtable('511_NY_MTA_Events__Beginning_2010.csv','TextType','string');

Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the VariableDescriptions property.

Here is a list of variables in the table:

data.Properties.VariableNames

```
ans =
  1×13 cell array
    'EventType'    'OrganizationName'    'FacilityName'    'Direction'
    'City'    'County'    'State'    'CreateTime'    'CloseTime'
    'EventDescription'    'RespondingOrganiz…'    'Latitude'    'Longitude'
```

`data.EventDescription` is a string array which contains the event descriptions. Let's take a closer look at the events.

eventsStr = data.EventDescription;

Unlike character vectors or cell arrays of character vectors, each element of a string array is a string itself. See how I can index the string array just as I would a numeric array and get string arrays back.

eventsStr(1:3)

```
ans =
  3×1 string array
    "MTA NYC Transit Bus: due to Earlier flooding Q11 into Hamilton Beach normal service resumed"
    "MTA NYC Transit Bus: due to Construction, northbound M1 Bus area of 147th Street:Adam Clayton Powell Junior"
    "MTA NYC Transit Subway: due to Delays, Bronx Bound # 2 & 3 Lines at Nevins Street Station (Brooklyn)"
```

Many of the event descriptions report delays like 'operating 10 minutes late'. See for example how the 26-minute delay is reported in event 5180.

eventsStr(5180)

ans = "MTA Long Island Rail Road: due to Debris on tracks, westbound Montauk Branch between Montauk Station (Suffolk County) and Jamaica Station (Queens) The 6:44 AM from Montauk due Jamaica at 9:32 AM, is operating 26 minutes late due to an unauthorized vehicle on the tracks near Hampton Bays."

**Identify Delays**

I want to find all the events which contain ' late '. MATLAB R2016b also introduced more than a dozen new functions for working with text. These functions work with character vectors, cell arrays of character vectors, and string arrays. You can learn about these functions from the characters and strings page in our documentation.
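To give a quick taste of a few of these functions on a toy string (separate from the transit data):

```matlab
% A few of the text functions introduced in R2016b, on a toy string.
s = "northbound M1 bus";
startsWith(s,'north')                    % returns logical 1
endsWith(s,'bus')                        % returns logical 1
replace(s,'M1','M2')                     % returns "northbound M2 bus"
extractBetween(s,'northbound ',' bus')   % returns "M1"
```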

I convert the text to all lowercase and determine which events contain ' late ' using the `contains` function.

```
eventsStr = lower(eventsStr);
idx = contains(eventsStr,' late ');
lateEvents = eventsStr(idx);
```

**Extract the Delay Times**

I extract the minutes late from phrases like 'operating 10 minutes late' using the functions `extractAfter` and `extractBefore`.

Let's look at the first late event. The exact phrase we are seeking doesn't appear in this event. When we look for the text following 'operating' we get back a `missing` string.

```
lateEvents(1)
extractAfter(lateEvents(1),'operating')
```

```
ans =
  "mta long island rail road: due to delays, westbound babylon branch between speonk station (speonk) and new york penn station (manhattan) the 5:08 a.m. departure due ny @ 7:02 a.m. is 15 minutes late @ babylon."

ans =
  <missing>
```

Let's look at the second late event. This string contains the phrase 'operating 14 minutes late'. Extracting the text after 'operating' we get '14 minutes late due to signal problems'. Extracting the text before 'minutes late' we get back ' 14 ' which we can convert to a numeric value using `double`.

```
lateEvents(2)
s = extractAfter(lateEvents(2),'operating')
s = extractBefore(s,'minutes late')
minLate = double(s)
```

```
ans =
  "mta long island rail road: due to delays westbound ronkonkoma branch out of bethpage station (suffolk county) the 8:01 am train due into penn station at 8:47 am is operating 14 minutes late due to signal problems"

s =
  " 14 minutes late due to signal problems"

s =
  " 14 "

minLate = 14
```

Success! We extracted the train delay from the event description. Now let's put this all together. I extract the minutes late from all the events and drop the missing values using the `rmmissing` function. I then convert the remaining values to numbers using `double` and plot a histogram of the results.

```
s = extractAfter(lateEvents,'operating');
s = extractBefore(s,'minutes late');
s = rmmissing(s);
minLate = double(s);
histogram(minLate,0:5:40)
ylabel('Number of Events')
xlabel('Minutes Late')
title({'Transit Delays','NY Metropolitan Transit Authority'})
```

It looks like reported delays are often 10-15 minutes. This simple routine captures many of the transit delays, but not all. The pattern doesn't always fit (consider again `lateEvents(1)`). I also left out any delays that may be reported in hours. Can you improve it?

String arrays are a great choice for text data like the example above because they are memory efficient and perform better than cell arrays of character vectors (previously known as `cellstr`).

Let's compare the memory usage. I convert the string array to a cell array of character vectors with the `cellstr` command and check the memory with `whos`. See the Bytes column - it shows the string array is about 12% more efficient.

```
eventsCell = cellstr(eventsStr);
whos events*
```

```
  Name             Size           Bytes     Class     Attributes

  eventsCell    151225x1       73208886     cell
  eventsStr     151225x1       64662486     string
```

The memory savings can be much greater for many smaller pieces of text. For example, suppose I want to store each word as a separate array element. First I join all 150,000 reports into a single long string using the `join` function. I then split this long string on spaces using the `split` function. The result is a string array storing over 4 million words in separate elements. Here the memory savings is nearly 2X.

```
wordsStr = split(join(eventsStr));
wordsCell = split(join(eventsCell));
whos words*
```

```
  Name            Size            Bytes     Class     Attributes

  wordsCell    4356256x1      535429652     cell
  wordsStr     4356256x1      284537656     string
```

String arrays also perform better. You can achieve the best performance using string arrays in combination with the text manipulation functions introduced in R2016b. Here I compare the performance of `replace` on a string array with that of `strrep` on a cell array of character vectors. See how `replace` with a string array is about 4X faster than `strrep` with a cell array.

```
f1 = @() replace(eventsStr,'delay','late');
f2 = @() strrep(eventsCell,'delay','late');
timeit(f1)
timeit(f2)
```

```
ans = 0.062507

ans = 0.23239
```

So, should you use string arrays for all your text? Maybe not yet. MATLAB has three different ways to store text:

- character vectors (`char`)
- string arrays (`string`)
- cell arrays of character vectors (`cell`)

For now (as of R2017a), we encourage you to use string arrays to store text data such as the transit events. We don’t recommend using string arrays elsewhere yet since string arrays aren’t yet accepted everywhere in MATLAB. Notice how I used a character vector for specifying the filename in `readtable` and a cell array of character vectors for the figure title.

What about in the future? We feel string arrays provide a better experience than character vectors and cell arrays of character vectors. Our plan is to roll out broader use of string arrays over time.

In the next few releases we will update more MATLAB functions and properties to accept string arrays in addition to character vectors and cell arrays of character vectors. As we do so, it will become easier for you to use string arrays in more places.

Next we will replace cell arrays of character vectors in MATLAB with string arrays. Note that cell arrays themselves aren't going anywhere. They are an important MATLAB container type and good for storing mixed data types or arrays of jagged size among other uses. But we expect their use for text data will diminish and become largely replaced by string arrays which are more memory efficient and perform better for pure text data.

Beyond that, over time, we will use string arrays in new functions and new properties in place of character vectors (but will continue returning character vectors in many places for compatibility). We expect character vectors will continue to live on for version-to-version code compatibility and special use cases.

Speaking of compatibility: we care deeply about version-to-version compatibility of MATLAB code, today more than ever. So, we are taking the following steps in our roll out of string arrays:

- Text manipulation functions (both old and new) return the text type they are passed. This means you can opt in to using string with these functions (string use isn't necessary). Note how I used `split` and `join` above with either string arrays or cell arrays of character vectors.
- We are recommending string arrays today for text data applications. Here there are ways to opt in to string use. In the example, I opted to get a string array from `readtable` using the `TextType` name-value pair. And string arrays were returned from functions like `extractBefore` because I passed a string array as input.
- We added curly brace indexing to string arrays, which returns a character vector for compatibility. Cell arrays return their contents when you index with curly braces `{}`. Code that uses cell arrays of character vectors usually indexes the array with curly braces to access the character vector. Such code can work with string arrays since curly brace indexing will also return a character vector. See how the following code returns the same result whether `f` is a cell array or a string array:

```
d = datetime('now');
f = {'h','m','s'}; % use a cell array
for n = 1:3, d.Format = f{n}; disp(d) end
```

```
9
43
10
```

```
f = ["h","m","s"]; % use a string array
for n = 1:3, d.Format = f{n}; disp(d) end
```

```
9
43
10
```

Expect to hear more from me on this topic. And please share your input with us by leaving a comment below. We're interested to hear from you.

We hope string arrays will help you accomplish your goals and that the steps we're taking provide a smooth adoption. If you haven't tried string arrays yet, learn more from our documentation on characters and strings.


Today you'll see a new demonstration of applying optimization techniques. Today's guest is Takafumi Ohbiraki. This demonstration was part of MATLAB EXPO, held in Tokyo last year (2016).

- Design of an umbrella hook
- FEM model (Using Partial Differential Equation (PDE) Toolbox)
- Process of Optimization
- Design of Experiments (DOE) (Taguchi design)
- Calculation with setting parameters for L18
- Factor Effect Chart
- S/N ratio
- Output
- Optimization using patternsearch solver
- Optimization using ga solver
- Multi-objective optimization using the gamultiobj solver
- Conclusion

Optimization is a universal topic in both engineering and data science, and a technology in high demand. Speaking of optimization in engineering, parameter estimation is often performed on a model describing a physical phenomenon.

Generally, when expressing a physical phenomenon as a model, MATLAB provides functions for ordinary differential equations and partial differential equations to perform simulation. For example, since the problem of heat conduction can be represented by second-order parabolic partial differential equations, it can be simulated with the function `pdepe`.
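As a minimal illustration (a sketch, not part of this demo), the 1-D heat equation u_t = u_xx on [0,1] with u = 0 at both ends and an initial sine profile can be solved like this:

```matlab
% Minimal pdepe sketch for the 1-D heat equation u_t = u_xx.
x = linspace(0,1,21);                        % spatial mesh
t = linspace(0,0.1,5);                       % output times
m = 0;                                       % slab (Cartesian) geometry
pdefun = @(x,t,u,dudx) deal(1, dudx, 0);     % c = 1, flux = du/dx, source = 0
icfun  = @(x) sin(pi*x);                     % initial temperature profile
bcfun  = @(xl,ul,xr,ur,t) deal(ul,0,ur,0);   % u = 0 at both boundaries
sol = pdepe(m, pdefun, icfun, bcfun, x, t);  % sol(j,:) is u(x) at time t(j)
% Exact solution for comparison: u(x,t) = exp(-pi^2*t)*sin(pi*x)
```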

In reality, it is often difficult to convert a physical phenomenon into a closed-form mathematical formula. So it is common to utilize a modeling tool to convert the problem into a mathematical model for analysis.

Here, we will introduce how to optimize a design using Partial Differential Equation Toolbox and Optimization Toolbox, with a structural analysis example.

A cool umbrella hook is always seen in Japanese public restrooms.

As the designer of an umbrella hook, we will consider how to reduce cost by reducing the material used in the hook. However, in order to keep the hook functional, it must not bend.

First, we define the design specifications. We consider opening an elliptical cavity in the member. The material (A or B) of the member, the coordinates (X, Y) of the ellipse, the radii of the ellipse (RX, RY), and the rotation angle of the ellipse (theta) are design parameters. The number of parameters is seven.

We will set the downward load on the tightening portion at the upper right and the fixed end at the left end. We perform an optimization with the aim of minimizing the displacement amount of the lower right point.

All parameters have the following constraints.

We create a function `pdesample` to actually simulate with parameters using Partial Differential Equation Toolbox (PDE Toolbox).

```
input:  [material x-position y-position x-radius y-radius stick theta]
output: strain
```

```
in = [1 5,2,1.5,1.0,0.3,0];
lh = pdesample
out = lh{1}(in)
```

```
lh =
  5×1 cell array
    @pdesample_input
    @pdesample_input_doe_sn
    @myobjfun
    @myobjfun2
    @mynoncon

out = 0.40131
```

This is the initial condition.

The output is about 0.4013. We do not yet know whether this result is good or bad, so we will seek a better answer using optimization techniques.

To actually perform the optimization calculation, if there are too many parameters, it takes a long time to reach the optimum. This time, we will use a two-step optimization method that is often used in quality engineering. In the first step, we reduce the number of variables to be optimized using design of experiments; here we adopt the Taguchi design. The orthogonal table is uploaded to the File Exchange. In variable selection, we paid attention to the signal-to-noise (S/N) ratio and the nominal-is-best response. In the second step, we use optimization algorithms to find good parameters.
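For reference, the nominal-is-best S/N ratio is conventionally computed from repeated observations of a response as 10*log10(mean^2/variance); in MATLAB that is one line (a generic textbook formula, not necessarily the exact computation inside this demo's helper functions):

```matlab
% Nominal-is-best S/N ratio for a vector y of repeated observations:
% larger values mean the response is consistent relative to its spread.
snratio = @(y) 10*log10(mean(y)^2 / var(y));
```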

We will determine which parameters can be dropped using the orthogonal arrays for which the Taguchi design is famous. Taguchi orthogonal arrays come in 2-level, 3-level, and mixed-level tables. We chose the well-known L18 array, which mixes both 2-level and 3-level factors.

L18 array: 18 experiments that efficiently cover combinations of 8 variables; one variable (material) has 2 levels, and seven variables have 3 levels. The variable `design` created by my helper function `orthogonalarray` is a MATLAB table.
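To make the idea concrete, here is a minimal sketch (not the author's `orthogonalarray` helper) of how integer levels from an orthogonal array can be mapped onto real parameter values within given bounds. The 3-run level matrix below is a hypothetical excerpt for illustration, not the real L18 array.

```matlab
% Hypothetical excerpt of an orthogonal array: rows are runs, columns are
% factors, entries are level indices (1..3 here).
levels = [1 1 1;
          2 2 2;
          3 3 3];
bounds = [2.0 8.0;    % x lower/upper bound
          1.0 3.0;    % y lower/upper bound
          0.5 1.9];   % rx lower/upper bound
nlev = 3;
% Linearly interpolate each level index into its bound interval
% (implicit expansion requires R2016b or later).
vals = bounds(:,1)' + (levels-1) .* (bounds(:,2)'-bounds(:,1)') / (nlev-1)
% vals rows: [2 1 0.5; 5 2 1.2; 8 3 1.9]
```

The rows match the values seen in the `design` table: level 2 of x maps to 5, level 2 of rx maps to 1.2, and so on.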

```
design = orthogonalarray('L18','Bounds',[1 2;2.0 8.0;1.0 3.0;0.5 1.9;0.2 0.9;0.1 0.3;-pi/12 pi/12;0 0],...
    'VariableNames',{'material','x','y','rx','ry','dh','theta','xxx'})
```

```
design =

  18×8 table

    material    x    y    rx     ry     dh      theta     xxx
    ________    _    _    ___    ____   ___    _______    ___

       1        2    1    0.5    0.2    0.1    -0.2618     0
       1        2    2    1.2    0.55   0.2          0     0
       1        2    3    1.9    0.9    0.3     0.2618     0
       1        5    1    0.5    0.55   0.2     0.2618     0
       1        5    2    1.2    0.9    0.3    -0.2618     0
       1        5    3    1.9    0.2    0.1          0     0
       1        8    1    1.2    0.2    0.3          0     0
       1        8    2    1.9    0.55   0.1     0.2618     0
       1        8    3    0.5    0.9    0.2    -0.2618     0
       2        2    1    1.9    0.9    0.2          0     0
       2        2    2    0.5    0.2    0.3     0.2618     0
       2        2    3    1.2    0.55   0.1    -0.2618     0
       2        5    1    1.2    0.9    0.1     0.2618     0
       2        5    2    1.9    0.2    0.2    -0.2618     0
       2        5    3    0.5    0.55   0.3          0     0
       2        8    1    1.9    0.55   0.3    -0.2618     0
       2        8    2    0.5    0.9    0.1          0     0
       2        8    3    1.2    0.2    0.2     0.2618     0
```

At experimental points, we calculate the outputs (strain) and S/N ratios.

```
output = zeros(18,1);
sn = zeros(18,1);
for i = 1:18
    output(i,1) = lh{1}(table2array(design(i,1:8)));
    sn(i,1) = lh{2}(table2array(design(i,1:8)));
end
results = [design table(output,sn)]
```

```
results =

  18×10 table

    material    x    y    rx     ry     dh      theta     xxx    output      sn
    ________    _    _    ___    ____   ___    _______    ___    _______    ______

       1        2    1    0.5    0.2    0.1    -0.2618     0     0.346      108.16
       1        2    2    1.2    0.55   0.2          0     0     0.36042    88.027
       1        2    3    1.9    0.9    0.3     0.2618     0     0.42694    57.447
       1        5    1    0.5    0.55   0.2     0.2618     0     0.35443    88.063
       1        5    2    1.2    0.9    0.3    -0.2618     0     0.37875    78.837
       1        5    3    1.9    0.2    0.1          0     0     0.36595    84.527
       1        8    1    1.2    0.2    0.3          0     0     0.33836    121.1
       1        8    2    1.9    0.55   0.1     0.2618     0     0.38466    59.585
       1        8    3    0.5    0.9    0.2    -0.2618     0     0.34978    96.122
       2        2    1    1.9    0.9    0.2          0     0     1.8659     32.802
       2        2    2    0.5    0.2    0.3     0.2618     0     0.92894    132.08
       2        2    3    1.2    0.55   0.1    -0.2618     0     0.97881    90.175
       2        5    1    1.2    0.9    0.1     0.2618     0     1.2566     41.031
       2        5    2    1.9    0.2    0.2    -0.2618     0     1.0356     78.99
       2        5    3    0.5    0.55   0.3          0     0     0.94161    113
       2        8    1    1.9    0.55   0.3    -0.2618     0     1.019      59.27
       2        8    2    0.5    0.9    0.1          0     0     0.95126    106.67
       2        8    3    1.2    0.2    0.2     0.2618     0     0.95404    88.904
```

The factor effect diagram is a graph showing the effect of each factor: the horizontal axis is the level of the factor and the vertical axis is the S/N ratio or the characteristic value. From this graph, we find the conditions with a high S/N ratio and select those with better characteristics.

`factoreffectchart` is my helper function.
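The essence of such a chart can be sketched in a few lines: for each factor, average the response over all runs at each level. This is a simplified illustration with made-up data, not the actual `factoreffectchart` helper.

```matlab
% Made-up example: 6 runs of one 3-level factor, with a response per run.
factorLevel = [1; 1; 2; 2; 3; 3];             % level of the factor in each run
response    = [10; 12; 20; 22; 14; 16];
% Mean response at each level of the factor; these level means are the
% points plotted in a factor effect diagram.
levelMeans = accumarray(factorLevel, response, [3 1], @mean)
% levelMeans is [11; 21; 15]
```

Repeating this for every factor column of the orthogonal array, and plotting the level means side by side, gives the factor effect chart.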

According to this chart, the S/N ratio is very good at the 1st level of both the 4th and 5th parameters (variables).

The 4th variable is the x radius and the 5th variable is the y radius.

```
d = orthogonalarray('L18');
Osn = factoreffectchart(d,results.sn)   % S/N
```

```
Osn =
    86.873    82.546       NaN
    84.781    80.741    88.608
     75.07    90.698    88.362
    107.35    84.679    62.103
    102.29    83.019    68.818
     81.69    78.818    93.621
    85.258     91.02    77.851
    80.213    83.509    90.407
```

To minimize the strain, the 1st level of the 1st variable is good.

The 1st variable is material.

We succeeded in reducing the number of parameters to optimize from 7 to 4.

```
O = factoreffectchart(d,results.output) % Output
```

```
O =
    0.36726     1.1035        NaN
    0.81783    0.72215    0.66618
    0.86336    0.67327    0.66952
    0.64534    0.71116    0.84967
    0.66149    0.67315    0.87153
    0.71387    0.82003    0.67226
    0.68465    0.80391     0.7176
    0.81182    0.71343     0.6809
```

We will optimize 4 parameters using the `patternsearch` solver.

First of all, we are going to optimize the location of the hole that minimizes strain.

The objective function is the code on lines 193 to 197 of "pdesample.m".

```
dbtype pdesample 193:197
close all
```

```
193  function out = myobjfun(in)
194  
195  out = pdesample_input([1 in(1) in(2) 0.5 0.2 in(3) in(4)]);
196  
197  end
```

```
opt = optimoptions('patternsearch');
opt.MeshTolerance = 0.001;
opt.FunctionTolerance = 0.00001;
opt.MaxIterations = 500;
opt.TimeLimit = 10*60;
opt.UseParallel = true;
opt.Display = 'iter';
[x,obj] = patternsearch(lh{3},[7,3,0.2,0],[],[],[],[],[2.0,0.3,0.1,-pi/4],[9.5,4.5,0.3,pi/4],lh{5},opt)
```

```
                                              Max
  Iter   Func-count        f(x)      Constraint    MeshSize    Method
     0        1          0.338279            0            1
     1        6           0.33769            0     0.009772    Update multipliers
     2      329          0.334631            0     0.000955    Update multipliers
Optimization terminated: mesh size less than options.MeshTolerance
 and constraint violation is less than options.ConstraintTolerance.

x =
    9.3125    2.9688    0.14141    0.25

obj =
    0.33463
```

We will also optimize the 4 parameters using the `ga` solver.

```
gaoptions = gaoptimset(@ga);
gaoptions.CrossoverFcn = @crossoverintermediate;
gaoptions.MutationFcn = @mutationadaptfeasible;
gaoptions.CrossoverFraction = 0.5;
gaoptions.TolFun = 0.00001;
gaoptions.PopulationSize = 20;               % population size
gaoptions.InitialPopulation = [7,3,0.2,0];   % initial population
gaoptions.Generations = 500;
% gaoptions.TimeLimit = 10*60;
gaoptions.UseParallel = true;
gaoptions.Display = 'iter';
[x,fval] = ga(lh{3},4,[],[],[],[],[2.0,0.3,0.1,-pi/4],[9.5,4.5,0.3,pi/4],lh{5},gaoptions)
```

```
                                Best           Max         Stall
Generation    Func-count        f(x)       Constraint    Generations
     1           1078         0.33505              0          0
     2           1598        0.334569      0.0008296          0
Optimization terminated: average change in the fitness value less than
options.FunctionTolerance and constraint violation is less than
options.ConstraintTolerance.

x =
    9.3345    3.1148    0.19511    0.36912

fval =
    0.33457
```

The output values from both the `patternsearch` and `ga` solvers are lower than the initial condition, and the two solvers give especially similar results for the x and y positions. We conclude that both results are good. However, the area of the hole is the smallest possible because of the S/N-ratio consideration. To lighten the weight, we will enlarge the area of the hole.

We would like to optimize both the size and location of the hole in order to minimize the strain and lighten the weight.

In the previous case, the x and y radii ended at their minimum values. To increase the size of the hole, we will use the multi-objective optimization solver `gamultiobj`.

The value of strain and the area of the hole are inversely related, so there is a trade-off between them.

The objective function is the code on lines 200 to 205 of "pdesample.m".

dbtype pdesample 200:205

```
200  function out = myobjfun2(in)
201  
202  out(1) = pdesample_input([1 in 0.2 0*pi/12]);
203  out(2) = -pi*in(3)*in(4);
204  
205  end
```

```
mulgaoptions = gaoptimset(@gamultiobj);
mulgaoptions.CrossoverFraction = 0.5;
mulgaoptions.TolFun = 0.001;
mulgaoptions.PopulationSize = 100;
mulgaoptions.Generations = 3000;
mulgaoptions.InitialPopulation = [7 2.5 0.5 0.5];
mulgaoptions.UseParallel = false;
mulgaoptions.TimeLimit = 20*60;
[x,fval,exitflag,output] = gamultiobj(lh{4},4,[1 0 1 0;0 1 0 1;-1 0 1 0;0 -1 0 1],...
    [9.9;4.9;-0.2;-0.2],[],[],[2.0,0.3,0.2,0.2],[9.6,4.5,3,2.6],mulgaoptions);
```

Optimization terminated: time limit exceeded.

Here is the biggest hole.

lh{4}(x(1,:))

```
ans =
    0.33632   -0.62854
```

In this next plot, we show the Pareto frontier, which is popular in design engineering. There you can see the trade-off between area and strain for making design decisions.

```
scatter(fval(:,1),50.4 + fval(:,2),'m')
set(gca,'FontSize',16,'FontName','Meiryo UI')
xlabel('strain'); ylabel('area'); title('pareto frontier')
```

Pareto frontier results.

T1 = array2table([fval(:,1) 50.4+fval(:,2) x],'VariableNames',{'Strain','Area','x','y','rx','ry'})

```
T1 =

  35×6 table

    Strain     Area       x         y          rx         ry
    _______    ______    ______    _______    _______    _______

    0.33632    49.771    8.4674    0.83566    0.54447    0.36746
    2.9137     29.829    6.1874    2.4294     2.9804     2.197
    0.8458     36.385    6.1587    2.0938     2.7957     1.5957
    3.9636     29.123    6.1812    2.593      2.9923     2.2634
    1.553      32.151    5.883     2.3126     2.9526     1.9674
    0.83451    37.023    6.7695    2.1714     2.4951     1.7065
    0.42155    44.452    7.107     2.1224     1.3364     1.4167
    1.2357     33.732    6.0206    2.5458     2.879      1.8429
    0.93663    35.477    6.1315    2.3105     2.9743     1.5971
    0.75125    37.875    6.7413    2.0409     2.5471     1.5653
    1.0758     34.69     6.385     2.4479     2.8222     1.7719
    3.077      29.714    6.1778    2.4877     2.9777     2.2113
    2.8688     29.875    6.022     2.4611     2.9817     2.1911
    0.39498    45.353    7.3136    1.8261     1.4593     1.1008
    1.1526     34.152    6.2626    2.1237     2.9141     1.7747
    0.35257    47.25     8.2497    1.2095     1.0588     0.94702
    3.4396     29.299    6.3585    2.4648     3          2.2389
    1.4782     32.434    6.1279    2.4721     2.9814     1.9181
    2.1484     30.683    5.9299    2.4064     2.9995     2.0924
    0.50403    41.862    7.0168    2.0862     1.8298     1.4853
    2.3041     30.529    5.9573    2.4324     2.9834     2.1201
    1.6972     31.718    5.9601    2.4295     2.9809     1.995
    3.2141     29.467    6.1923    2.4677     3          2.221
    0.34045    48.795    8.1318    1.0284     0.8714     0.58627
    1.0626     35.115    6.6304    2.3413     2.6315     1.8489
    1.3309     33.168    6.023     2.5945     2.9801     1.8405
    0.38738    45.534    7.4165    1.5702     1.5472     1.0012
    1.4303     32.719    5.9897    2.279      2.8842     1.9514
    0.55291    40.816    5.8008    1.8469     2.672      1.1418
    1.8278     31.425    5.9551    2.491      2.9899     2.0201
    0.3506     48.091    8.7273    1.5772     0.62704    1.172
    1.9699     31.13     6.0712    2.4664     2.98       2.0583
    3.7843     29.133    6.0624    2.4919     2.9961     2.2595
    2.7137     30.065    6.191     2.4831     2.9761     2.1749
    3.6181     29.252    5.7451    2.5008     2.9962     2.2467
```
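A Pareto set still leaves you with a decision to make. One common, generic way to pick a single compromise design (not part of the original post) is to normalize both objectives over the set and choose the point closest to the ideal corner. Here `fvals` is a made-up two-column matrix of the form returned by `gamultiobj` objectives (strain and area, both minimized).

```matlab
% Made-up Pareto points: column 1 = strain, column 2 = area (minimize both).
fvals = [0.34 49.8;
         0.40 45.4;
         0.85 36.4;
         3.96 29.1];
% Normalize each objective to [0,1] over the Pareto set
% (implicit expansion requires R2016b or later).
fn = (fvals - min(fvals)) ./ (max(fvals) - min(fvals));
% Pick the point closest to the ideal corner (0,0) in normalized space.
[~, idx] = min(sqrt(sum(fn.^2, 2)));
compromise = fvals(idx, :)
% compromise is [0.85 36.4], the knee of this small made-up front
```

For the real `fval` from `gamultiobj`, the same two lines of normalization and distance would apply unchanged.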

We showed how to use FEM tools and Optimization Toolbox to perform structural design. What can you shape with these tools? Let us know your thoughts here.

Get
the MATLAB code

Published with MATLAB® R2017a

Interesting problems are everywhere. Today's guest blogger, Matt Tearle, loves the chance to apply MATLAB to any intellectually stimulating puzzle. It's a fun way to learn and practice new features.... read more >>

]]>Interesting problems are everywhere. Today's guest blogger, Matt Tearle, loves the chance to apply MATLAB to any intellectually stimulating puzzle. It's a fun way to learn and practice new features.

My wife received this postcard in the mail:

This is a real signpost, directing drivers to various real towns in Maine.

Never one to pass up a teaching moment, I used this to explain to my daughter how you could (theoretically) figure out the location of the signpost if you knew where all these towns were, by finding the point that was the given distance from each town.

Why settle for abstract theory? Why not actually try it out? A little manual work with Google and Wikipedia and we (my long-suffering daughter and I) had the locations of the towns:

```
town = {'Norway','Paris','Denmark','Naples','Sweden','Poland','Mexico','Peru','China'};
lon = -[70.540;70.501;70.804;70.610;70.823;70.393;70.543;70.406;69.517];
lat = [44.213;44.260;43.970;43.972;44.133;44.061;44.557;44.507;44.479];
scatter(lon,lat)
text(lon,lat,town)
```

Add a little context:

```
load usastates.mat
mainelon = usastates(17).Lon;
mainelat = usastates(17).Lat;
plot(mainelon,mainelat)
hold on
scatter(lon,lat)
text(lon,lat,town)
hold off
axis equal
```

We can add the distances to each town.

r = [14;15;23;23;25;27;37;46;94];

Now we need to find the location that is the given distance from each town. There is a circle of locations 14 miles from Norway. Another circle 15 miles from Paris, and so on. The signpost should be at the intersection of all those circles. So let's plot them and see.

But how do we get the circle of points? One way would be manual calculation, using the parametric equation for a circle. But a more accurate approach would take into account the fact that we live on a spherical planet. This means that degrees of latitude and longitude are not equally spaced at all locations. There are formulas for finding all the points equidistant from a point on the Earth, otherwise known as *small circles* (as opposed to *great circles*). But, as is so often the case with MATLAB, there's also a function to do it for you, if you have the right toolbox. In this case, we're performing geographic calculations, so it's not surprising that Mapping Toolbox has what we need.
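For comparison, the "manual calculation" mentioned above can be sketched as a planar approximation: it corrects longitude spacing by cos(latitude) but ignores the full spherical geometry that `scircle1` handles, so treat it only as a rough illustration (the town coordinates and radius are from the data above; the miles-per-degree constant is an approximation).

```matlab
% Approximate circle of radius r (statute miles) around (lat0, lon0).
lat0 = 44.213; lon0 = -70.540; r = 14;   % e.g., Norway, ME at 14 miles
milesPerDegLat = 69;                     % rough miles per degree of latitude
t = linspace(0, 2*pi, 100);
clatApprox = lat0 + (r/milesPerDegLat) * sin(t);
% A degree of longitude shrinks by cos(latitude) away from the equator.
clonApprox = lon0 + (r/(milesPerDegLat * cosd(lat0))) * cos(t);
plot(clonApprox, clatApprox)
```

At Maine's latitudes the difference from the true small circle is modest, but it grows with radius, which is why the toolbox function is the better choice below.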

Now we calculate the lat/lon of circles of a given radius around each town center (using statute miles as the distance measure):

```
[clat,clon] = scircle1(lat,lon,r,[],earthRadius('sm'));
% plot results
plottowns(lat,lon,clat,clon,mainelat,mainelon)
```

That plot doesn't look promising. There's no clear single intersection point.

```
xlim([-71 -70])
ylim([43.5 44.5])
```

Note that `plottowns` is a function I wrote to display the town locations and the small circle around it (in the same color). In the past, I'd hesitate before cluttering up my hard drive with a function file for such a specific purpose. But R2016b delivered a long-requested feature: you can now add local functions to a script.

dbtype('mainesignpost.m','287:307')

```
287  
288  function plottowns(lat,lon,clat,clon,maplat,maplon)
289  % plot state border
290  plot(maplon,maplat,'k')
291  hold on
292  
293  % add town locations
294  clr = parula(length(lon)); % make a set of colors for the lines
295  scatter(lon,lat,[],clr) % plot town locations, with given colors
296  
297  % add circles of distances
298  l = plot(clon,clat);
299  % change colors to match town markers
300  for k = 1:length(l)
301      l(k).Color = clr(k,:);
302  end
303  
304  % adjust axes
305  axis equal
306  axis([-71.5 -68 43 46])
307  hold off
```

Maybe around (-70.7, 44.3) is a point that's close to correct for a few of the towns. And maybe it's not too far off for the rest? Given that the distances are rounded, and we can't know exactly where in the town they're measuring to, maybe it's expecting too much to look for a perfect solution. Instead, how about looking for the best fit? This becomes an optimization problem: find the location (lat/lon) that minimizes the total error in all the distances.

So, given a location $x = (lat,lon)$, the objective function to minimize is

$$ d(x) = \Sigma_k \left( \rho(x,y_k) - r_k \right)^2$$

where $\rho(x,y_k)$ is some distance function between the given point $x$ and the $k^{\rm th}$ town location $y_k$.

Again, Mapping Toolbox provides the function for $\rho$:

distfun = @(x) distance(x(1),x(2),lat,lon);

But if you read the documentation (and you always should!), you find that `distance` returns the distance in degrees of arc. Of course, Mapping Toolbox also provides a function to convert from degrees to (statute) miles:

```
distfun = @(x) deg2sm(distance(x(1),x(2),lat,lon),'Earth');
```

Now that I have $\rho(x)$, it's easy to build the objective ($d(x)$) and run the optimization:

```
d = @(x) sum((distfun(x) - r).^2);
x0 = [mean(lat) mean(lon)]; % Start in the geographic center of all the towns
postlocation = fmincon(d,x0)
```

```
Local minimum possible. Constraints satisfied.

fmincon stopped because the size of the current step is less than
the default value of the step size tolerance and constraints are
satisfied to within the default value of the constraint tolerance.

postlocation =
    44.164   -71.061
```

Let's see the result.

```
plottowns(lat,lon,clat,clon,mainelat,mainelon)
% add the predicted location
hold on
plot(postlocation(2),postlocation(1),'rp')
hold off
```

That's not right, surely. The postcard clearly said "Maine signpost". This location is in New Hampshire! Yes, there's some vagueness about where in each town the distance is measured to, and the distances are rounded to the nearest mile, but that's not a big enough effect to explain this level of inaccuracy.

I was heading down a rabbit-hole of optimization. Should I add constraints? Is it particularly sensitive to the starting location? Then I realized that I was missing something obvious and important.

What does a distance on a signpost mean? Does it mean direct ("as the crow flies") distance? Or -- perhaps more likely, especially for a roadside sign -- driving distance?

That will change things considerably. (Small towns in Maine are not connected by straight highways.)

But how can we compute driving distance? Surely Mapping Toolbox can't do that. Well, no, but then I remembered seeing an example by another MathWorker, Will Campbell, who used the Google Maps API to find driving time between two points. Adapting his code to use driving distance gives the local function `drivedist` that has the same interface as `distance`, but makes a web API call to Google Maps to determine the distance.

dbtype('mainesignpost.m','257:284')

```
257  %%
258  function d = drivedist(lat1,lon1,lat2,lon2)
259  % Google Maps API requires building a URL with approriate search terms
260  base_url = 'https://maps.googleapis.com/maps/api/directions/json?';
261  % Add starting location (lat1/lon1) to URL
262  origin = [num2str(lat1) ',' num2str(lon1)];
263  base_url = [base_url 'origin=' origin '&destination='];
264  % Last part of the URL is your personal key. For security, I've saved this in a text file.
265  keystr = ['&key=' fileread('mykey.txt')];
266  
267  % Loop over destinations (lat2/lon2) and get distances
268  n = length(lat2);
269  d = zeros(n,1);
270  for k = 1:n
271      % Add destination to URL. Finish URL with key
272      destination = [num2str(lat2(k)) ',' num2str(lon2(k))];
273      myurl = [base_url destination keystr];
274      % Send request to Google Maps and unpack the json file returned
275      dists = jsondecode(urlread(myurl));
276      % Was the query successfully completed?
277      if isequal(dists.status,'OK')
278          % Yes. Good. Extract the distance value (which is in meters) and convert to miles
279          d(k) = dists.routes(1).legs(1).distance.value/1609.34;
280      else
281          % Nope. If it's a random glitch, let's just make that one value something big and hope for the best
282          d(k) = 1000;
283      end
284  end
```

A couple of notes about this function. Firstly, it demonstrates the benefits of sharing code. I would have had a hard time figuring out the use of the Google Maps API on my own. With Will's example as a starting point, I was able to adapt it to my purpose pretty easily. Secondly, new MATLAB features just keep making things easier (which means we can do more complex and interesting things!). Will used regular expressions to pull apart the JSON information returned by Google Maps. The `jsondecode` function was introduced in R2016b, which made it much easier to extract the driving distance I needed in my function.
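As a quick illustration of how much `jsondecode` simplifies this kind of parsing, here is a toy JSON string with the same nesting pattern as the Directions response (a simplified stand-in, not actual Google Maps output):

```matlab
% Toy JSON mimicking the routes/legs/distance nesting of the API response.
txt = '{"status":"OK","routes":[{"legs":[{"distance":{"value":32187}}]}]}';
s = jsondecode(txt);   % arrays of objects become struct arrays
% Walk the decoded struct exactly as in drivedist, converting meters to miles.
milesVal = s.routes(1).legs(1).distance.value / 1609.34
% milesVal is about 20
```

Compare that one field access with the regular expressions you would otherwise need to fish the number out of the raw text.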

Another MathWorker also pointed out that, with these newly acquired skills, I could have actually used the Google Maps API to automate finding the locations of the towns.

Now that I have a function for finding driving distance, it should be easy enough to simply switch out the distance function. But before handing this to `fmincon`, it's probably worth considering that the objective function may not vary smoothly with location. Location is continuous in latitude/longitude, but driving distance depends on a discrete, finite set of roads. Rather than use a gradient-based solver, maybe a direct search method from Global Optimization Toolbox might work better.

```
distfun = @(x) drivedist(x(1),x(2),lat,lon);
d = @(x) sum((distfun(x) - r).^2);
[postlocation,dmin,~,optiminfo] = patternsearch(d,x0);
```

Let's see the results

```
plottowns(lat,lon,clat,clon,mainelat,mainelon)
hold on
plot(postlocation(2),postlocation(1),'rp')
hold off
```

That looks a lot more reasonable. Maybe I could try some different starting locations, but it's worth noting that this is expensive on function evaluations:

optiminfo.funccount

ans = 195

Each of those requires multiple calls to Google Maps, for a total of:

numel(lon)*optiminfo.funccount

ans = 1755

The Google Maps API allows only 2500 requests per day (unless you pay for it), so it's best not to go crazy.

Instead, let's see how good our solution is.

evaluateresults(distfun(postlocation),r,town)

That's not bad! (And, yes, that's another local function I'm using to make the plots. I like this new feature!)

dbtype('mainesignpost.m','310:326')

```
310  
311  function evaluateresults(ractual,rgiven,town)
312  soln_vs_actual = [rgiven round(ractual)];
313  figure
314  bar(soln_vs_actual)
315  xticklabels(town)
316  xtickangle(60)
317  ylabel('Distance (miles)')
318  legend('Stated distance','Actual distance','Location','northwest')
319  
320  percerr = 100*abs(diff(soln_vs_actual,1,2))./rgiven;
321  figure
322  bar(percerr)
323  ylim([0 100])
324  xticklabels(town)
325  xtickangle(60)
326  ylabel('Percentage error')
```

This function also uses some other new functions that I love: `xticklabels` and `xtickangle` for adding and rotating text labels on the axes.
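If you haven't tried them yet, a minimal example of the pair (independent of the signpost data) looks like this:

```matlab
% Three made-up bars with text labels on the x axis (requires R2016b or later).
bar([14 15 23])
xticklabels({'Norway','Paris','Denmark'})   % label the three bars by name
xtickangle(60)                              % rotate labels so they don't collide
ylabel('Distance (miles)')
```

Before R2016b, the same effect needed manual fiddling with the axes' `XTick`/`XTickLabel`/`XTickLabelRotation` properties.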

Of course, there's another way to find this signpost, also using Google: search for information on it! It turns out you can find it:

```
realpost = [44.244284,-70.785009];
da = d(realpost);
plottowns(lat,lon,clat,clon,mainelat,mainelon)
hold on
plot(postlocation(2),postlocation(1),'rp')
plot(realpost(2),realpost(1),'mx')
hold off
```

That's looking good. Zoom in...

```
xlim([-71 -70])
ylim([43.5 44.5])
```

Wow! Nice job, MATLAB. The predicted location is just a couple of miles down the road from the real location:

And, as a sanity check, how good could we actually do? What distances do we get from the distance function (with the town locations we're using) for the actual signpost location?

evaluateresults(distfun(realpost),r,town)

Interesting! It turns out that the signpost is wrong (at least according to Google Maps) -- Sweden, ME, is only about 13 miles away, not 25:

Of course, it depends on your route, but the other option suggested by Google Maps is still only 18 miles. Maybe roads have changed since the signpost was made. But regardless, for the distances given and the current roads, the predicted location was about as good as it could possibly be.

```
disp(' Total error:')
disp('--------------------------------------')
disp('Predicted location | Actual location')
fprintf(' %6.2f | %6.2f\n',dmin,da)
```

```
 Total error:
--------------------------------------
Predicted location | Actual location
 110.70 | 170.35
```

As you may have guessed, my poor daughter gave up on this quest long before I did. But I was learning new things. And new tricks in MATLAB are the best things to learn!

How about you? Have random problems or quixotic quests led you to new MATLAB knowledge? Share your favorite moments of MATLAB serendipity here.


Optimization terminated: mesh size less than options.MeshTolerance.

Get
the MATLAB code

Published with MATLAB® R2016b

Today, my guest blogger is Rick Gentile, an engineer in our Signal Processing Product Marketing group. He will introduce a new app which enables you to gain quick insights into your data.... read more >>

]]>Today, my guest blogger is Rick Gentile, an engineer in our Signal Processing Product Marketing group. He will introduce a new app which enables you to gain quick insights into your data.

Signal Processing Toolbox has helped MATLAB users generate and work with signals for many years. In our recent releases, we have expanded the ability to analyze and compare signals in the time, frequency, and time-frequency domains. You can now use these capabilities to gain insights into your data, which can help you to develop and validate your algorithms.

Signal Processing Toolbox provides functions and apps to preprocess, explore, and extract features from signals. We recently added the **Signal Analyzer** app to the toolbox to make it really simple for you to visualize and compare multiple time-based signals that live in the MATLAB Workspace. With the app, you can gain insight into the nature of your signals in the time and frequency domains simultaneously. You will find this very helpful in data analytics and machine learning applications, where identifying patterns and trends, extracting features, and developing custom algorithms are key aspects of the workflow.

To demonstrate some of the new capabilities of the app, I have selected data that is easy to visualize and understand. The short example below is based on what may be a familiar data set to you, a blue whale call. The file `bluewhale.au` contains audio data from a Pacific blue whale. The file is from the library of animal vocalizations maintained by the Cornell University Bioacoustics Research Program. The time scale in the data is compressed by a factor of 10 to raise the pitch and make the calls more audible.

To start, I read the data into the MATLAB Workspace. Note that you can use the command `sound(x,fs)` to listen to the audio. In the code, I start the **Signal Analyzer** app from the **APPS** tab at the top of the MATLAB environment toolstrip. I could have also done this by typing `signalAnalyzer` at the MATLAB prompt.

```
whaleFile = fullfile(matlabroot,'examples','matlab','bluewhale.au');
[x,fs] = audioread(whaleFile);
signalAnalyzer
```

In the picture below, after the app opens, I drag the signal `x` directly from the Workspace Browser (lower left section of the app) into the display region.

The recorded data includes four signal components of interest. The first is commonly referred to as a "trill" and comprises roughly the first 10,000 samples. The next three sounds are referred to as "moans".

You see the **Signal Analyzer** app in the picture below with time, spectrum and panner axes enabled. The middle axes show the spectrum of the signal. The bottom axes contain a panner, where you can zoom and navigate into the whale call of interest.

I then turn on Data Cursors to find where each of the sounds is located in the data. Next, I use this information to extract vectors with the signal segments of interest. The code below extracts these vectors based on the beginning and end of the specific moans. In a data analytics application, this is analogous to exploring a data set and finding a condition of interest in the data stream.

```
moan1 = x(24396:31087);   % Extract Signal for Moan 1
moan2 = x(45499:52550);   % Extract Signal for Moan 2
moan3 = x(65571:72571);   % Extract Signal for Moan 3
```

You can use Signal Processing Toolbox functions to do a range of operations on the data, including finding changes and outliers, filtering and smoothing, or filling gaps where data is missing. Signal Processing Toolbox also has functions to find signal segments within a larger signal based on similarity measures. Let's look at how you can use this to automate the work that was manually performed above.

The `findsignal` function returns the start and stop indices of a segment of the data array, `data`, that best matches the search array, `signal`. The best-matching segment is such that `dist`, the Euclidean distance between the segment and the search array, is smallest. Note that many functions in Signal Processing Toolbox conveniently generate a plot when no output arguments are specified. You will see that in the code below where I repeated some instructions to take advantage of this but you don't need to repeat the lines if only the plot or the outputs are needed.
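A toy example may make the behavior of `findsignal` easier to picture. This is a made-up signal, not the whale data (requires Signal Processing Toolbox, R2016b or later):

```matlab
% Two copies of the pattern [1 2 3] embedded in a longer signal of zeros.
data   = [0 0 1 2 3 0 0 1 2 3 0 0];
signal = [1 2 3];
% Ask for the two best-matching segments; both exact copies are found.
[istart, istop] = findsignal(data, signal, 'MaxNumSegments', 2)
```

With no output arguments, the same call would instead plot the data with the matched segments highlighted, which is the convenience used below.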

I start by taking an additional look at the signals of interest in the app. As shown in the picture below, I compare the three whale moans by simply dragging `moan1`, `moan2`, and `moan3` directly into the display region. The three moans are overlaid in the time domain in the top axes. The bottom axes provide the overlaid spectrum of the three occurrences. From this view, the moan spectra of all three occurrences look more similar than their time-domain counterparts. This is an indication that it may be best to find the signals using the frequency content rather than looking through their time-domain representations.

With this intuition, I now try to use `findsignal` with the spectrogram as an input to see if it matches the signal of interest with the spectra of other occurrences. First I look at the spectrogram for the entire signal. The trill and moans are visible in the image.

spectrogram(x,kaiser(64,3), 60, 256, fs,'yaxis') % View spectrogram for original signal

I now compute the spectrogram of the entire signal and a reference spectrogram based on the first moan and feed these to `findsignal` in order to find all call signal occurrences.

```
[~,F,T,PxxSignal] = spectrogram(x,kaiser(64,3),60,256,fs);
[~,F,T,PxxMoan] = spectrogram(moan1,kaiser(64,3),60,256,fs);
findsignal(PxxSignal,PxxMoan,'Normalization','power','TimeAlignment','dtw','Metric','symmkl',...
    'MaxNumSegments',3);
```

You can see that `findsignal` did return three matches. The start and stop index for each match is highlighted in the plot. This confirmed the intuition we gained by viewing the data in **Signal Analyzer**. For completeness, I make the same call using output arguments to capture the start and stop index of each match. With this information now in the MATLAB Workspace, it is much easier to automate the signal extraction for larger signals, especially if there are many more signal segments to find. Having these locations generated automatically will save you time as you navigate through your own signals.

```
[istart,istop,d] = findsignal(PxxSignal,PxxMoan,'Normalization','power','TimeAlignment',...
    'dtw','Metric','symmkl','MaxNumSegments',3);   % Obtain start and stop index for each match
```

There are many other functions you can use to improve the quality of the data. These functions can all be used in a similar manner, in conjunction with the app, which can provide you with more insights into your data and signal sets.

You can also find more info on the app here.

I hope this example has helped you understand how to visualize and compare multiple time-based signals and find interesting features. Can this help in your area of work? Let us know here.

Get
the MATLAB code

Published with MATLAB® R2016b

Hopefully you recall the recent post where we used a monte-carlo technique to simulate an infinitely-stocked port decanter passing around a table of N people (always starting at position 0), moving with probability p to the left and q to the right, only stopping when everyone at the table had... read more >>

]]>- Introduction
- Gambler's Ruin
- Calculating the probability that position `k` is last
- What routes do we need to consider to compute the expected path length?
- Probability that the port takes route L or R
- Calculating the expected path length when `k` is last
- What is the mean number of passes needed for a table of size N?
- And finally

Hopefully you recall the recent post where we used a Monte Carlo technique to simulate an infinitely-stocked port decanter passing around a table of `N` people (always starting at position 0), moving with probability *p* to the left and *q* to the right, and only stopping when everyone at the table had received some port. Those simulations suggested that simple analytic solutions for the expected path lengths might exist, and today I'll show you what I managed to prove.

In the last post we decided that the functions we wanted to find are

- $p_k(N)$ the probability that position k is the last to receive the port
- $E_k(N)$ the expected number of passes needed when k is the last
- $E_N$ the expected number of passes needed for a table of size N

Thanks to my colleague Rick Amos, we have a starting point for thinking about an analytical solution. He pointed out that our port passing problem was very similar to the random walk process Gambler's Ruin, in which the gambler starts out with some stake (`k`) and wins another unit of that stake with probability `p` (or loses with probability `1-p`). The game continues until the gambler has either lost all the money, or has a total of `N`. The major difference is that in Gambler's Ruin there are two different end conditions (winning with `N` or losing with `0`).

To enable us to attack the port passing problem, I'll need to quote two results from Gambler's Ruin. You can find derivations of these results in the associated files on File Exchange (see the files `BiasedGamblersRuin` and `UnbiasedGamblersRuin`). The first result allows us to answer the question: "what is the probability, given that we are currently at position S, of arriving at position A before arriving at position B?". In terms of Gambler's Ruin this is the same as starting with stake $k=S-B$ and finishing with $N=A-B$. To make some of the equations simpler I'll use $q=1-p$, and note we assume $A<S<B \mbox{ or } B<S<A$.

$$pSAB=\left\lbrace \begin{array}{cc}\frac{k}{N} & p=q=\frac{1}{2}\\\frac{1-r^k}{1-r^N} & p\ne q\mbox{ and } r=\frac{q}{p}\end{array}\right.$$

Similarly, we also need to answer: "what is the expected number of passes (starting at S) to arrive at position A before arriving at position B?".

$$eSAB=\left\lbrace \begin{array}{cc}\frac{N^2-k^2}{3} & p=q=\frac{1}{2}\\\frac{1+r}{1-r}\left(N\left(\frac{2}{1-r^N}-1\right)-k\left(\frac{2}{1-r^k}-1\right)\right) & p\ne q\end{array}\right.$$

Note how each result has a different expression for the biased and unbiased cases. Below we will run through the biased case (which is more complex). Exactly the same code handles the unbiased case and gives the correct result: if you are interested, simply change `pToUse` below to `0.5` and re-execute.

syms pSAB(S,A,B) eSAB(S,A,B) p q k N
pToUse = 0.55;
assume(p == pToUse);
% Define the functions for k and N in terms of S, A, and B.
k(S,A,B) = S-B;
N(S,A,B) = A-B;
if pToUse == 0.5
    pSAB(S,A,B) = k/N;
    eSAB(S,A,B) = (N^2 - k^2)/3;
else
    % In the biased case we will use r rather than p in the equations
    syms r
    assume(r ~= 1);
    pSAB(S,A,B) = (1-r^k)/(1-r^N);
    eSAB(S,A,B) = (1+r)/(1-r) * (N*(2/(1-r^N)-1) - k*(2/(1-r^k)-1));
end
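Before building on these formulas, it is worth sanity-checking the two quoted Gambler's Ruin results numerically. Here is a quick Monte Carlo sketch, written in Python for independence from the MATLAB workflow above (the function names are my own), estimating both the win probability and the duration conditioned on winning:

```python
import random

def ruin_prob(k, N, p):
    """Quoted pSAB result: probability of reaching N before 0 from stake k."""
    if p == 0.5:
        return k / N
    r = (1 - p) / p
    return (1 - r**k) / (1 - r**N)

def simulate(k, N, p, trials, rng):
    """Monte Carlo estimate of win probability and duration given a win."""
    wins = 0
    win_steps = 0
    for _ in range(trials):
        pos, steps = k, 0
        while 0 < pos < N:
            pos += 1 if rng.random() < p else -1
            steps += 1
        if pos == N:
            wins += 1
            win_steps += steps
    return wins / trials, win_steps / max(wins, 1)

rng = random.Random(0)
p_fair, d_fair = simulate(3, 6, 0.5, 20000, rng)  # expect p near 3/6, duration near (6^2-3^2)/3 = 9
p_bias, _ = simulate(3, 8, 0.55, 20000, rng)      # expect near (1-r^3)/(1-r^8) with r = 9/11
```

Both estimates land comfortably within a few standard errors of the quoted formulas.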

To allow us to talk about the general case where position `k` is the last to receive the port around a table of size `N` we will use position `L` to refer to a position one to the right of `k` and `R` to one to the left of `k` (note the use of modulo arithmetic to ensure $k-N<R \le 0 \le L < k$ and that the direction left is defined as an increase in position number, and right is a decrease in position number).

syms L R N k
L = k-1;
R = k+1-N;

For position `k` to be last we either need the port to move from the starting position 0 to L and then to R, or move from 0 to R and then L. This probability is given by

$$p_k = pSAB(0,L,R) \cdot pSAB(L,R,k) + pSAB(0,R,L) \cdot pSAB(R,L,k-N)$$

p0L = pSAB(0, L, R);
p0R = pSAB(0, R, L);
pLR = pSAB(L, R, k);
pRL = pSAB(R, L, k-N);
p0LR = p0L*pLR;
p0RL = p0R*pRL;
pk = simplify(p0LR + p0RL)

pk = -(r^N*(r - 1))/(r^k*(r - r^N))
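As a quick consistency check on this expression (a sketch of my own in Python, using exact rational arithmetic rather than the Symbolic Math Toolbox), the probabilities over all possible last positions k = 1..N-1 should sum to exactly 1:

```python
from fractions import Fraction

def p_last(k, N, r):
    # pk = -(r^N*(r - 1))/(r^k*(r - r^N)), as derived symbolically above
    return -(r**N * (r - 1)) / (r**k * (r - r**N))

r = Fraction(9, 11)   # r = q/p for p = 0.55, q = 0.45
N = 17
total = sum(p_last(k, N, r) for k in range(1, N))
```

Because `Fraction` arithmetic is exact, the sum comes out as exactly 1, not merely close to it.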

In the previous blog post we showed that for the unbiased case this result was independent of position `k`, but in the biased case the position around the table affects the last probability. Let's look at the residuals from some simulations to check if this is correct (the `portPassingSimulation` code is available here).

p = pToUse;
N = 17;
k = 1:N-1;
nSims = 1e5;
residuals = NaN(10,N-1);
expected = double(subs(pk, {'r' 'N' 'k'}, {(1-p)/p, N, k}));
% Simulate 10 times passing around a table of 17
parfor ii = 1:10
    last = portPassingSimulation(nSims, N, p);
    % Group simulations into k (1:N-1) bins and normalize by number of simulations
    lastProb = hist(last, k)/nSims;
    residuals(ii, :) = lastProb - expected;
end
% Are the residuals normally distributed?
probplot('normal', residuals(:))
grid on

We can see that the residuals from the simulation are roughly normally distributed (some tail points deviate slightly), and there is no systematic bias relative to the expected results, because the mean of the residuals is close to zero.

meanR = mean(residuals(:))

meanR = 7.5894e-20

To compute the expected path length, we have to consider all possible paths that finally end at position `k`, work out the probability with which each path occurs, and multiply by the expected length of that path. Fortunately, although there are infinitely many possible paths, we can decompose the problem into just two routes that relate directly to Gambler's Ruin. These are the L route and the R route. Route L is specified by

1. Start at 0 and move to L (but **not R**): `E0L`
2. Move from L to R (but **not k**): `ELR`
3. Then either **(a)** move from R to k-N (but **not k**): `ERk`
4. Or **(b)** move from R to L to k (but **not k-N**): `ERLk`

The alternative route R (which is basically the inverse of route L) is

1. Start at 0 and move to R (but **not L**): `E0R`
2. Move from R to L (but **not k-N**): `ERL`
3. Then either **(a)** move from L to k (but **not k-N**): `ELk`
4. Or **(b)** move from L to R to k-N (but **not k**): `ELRk`

The expected lengths of each of these parts of the path are

syms k N
E0L = eSAB(0, L, R);
E0R = eSAB(0, R, L);
ELR = eSAB(L, R, k);
ERL = eSAB(R, L, k-N);
ERk = eSAB(R, k-N, k);
ELk = eSAB(L, k, k-N);
ERLk = eSAB(R, k, k-N);
ELRk = eSAB(L, k-N, k);

It is interesting to note that the expected path length `eSAB` is symmetric with respect to swapping the variables `p` and `q`. We can see this in the above equations as `ELR==ERL`, `ERk==ELk`, and `ERLk==ELRk`.

simplify(ELR == ERL)
simplify(ERk == ELk)
simplify(ELRk == ERLk)

ans =
TRUE
ans =
TRUE
ans =
TRUE

To combine the parts of route L and route R where there is a choice, we need to include the probabilities of undertaking parts 3 and 4 of route L and parts 3 and 4 of route R. These are simply the probabilities of those paths in Gambler's Ruin.

pRk = pSAB(R, k-N, k);
pLk = pSAB(L, k, k-N);
pRLk = pSAB(R, k, k-N);
pLRk = pSAB(L, k-N, k);

So finally we can combine these parts for our two routes. `EL` is the expected path length for route L and `ER` the expected path length for route R.

EL = E0L + ELR + pRk.*ERk + pRLk.*ERLk;
ER = E0R + ERL + pLk.*ELk + pLRk.*ELRk

ER = ((1/r - 1)*(r + 1)*(2/(1/r - 1) - N*(2/(1/r^N - 1) + 1) + 1))/((r - 1)*(1/r^N - 1)) - (((2/(r^(2 - N) - 1) + 1)*(N - 2) - (2/(r^(1 - k) - 1) + 1)*(k - 1))*(r + 1))/(r - 1) - ((r + 1)*(2/(r - 1) - (N - 1)*(2/(r^(N - 1) - 1) + 1) + 1))/(r - 1) - (((N - 1)*(2/(r^(N - 1) - 1) + 1) - N*(2/(r^N - 1) + 1))*(r + 1)*(r^(N - 1) - 1))/((r^N - 1)*(r - 1))

I've only printed one result out simply to show that at the moment this looks pretty complicated! Hopefully this is going to get simpler some way down the line!

To be able to compute the overall expected path we also need to know the probability that the port takes route L or R, since

$$E_k = p_L E_L + p_R E_R$$

Fortunately, we already have the components to compute these probabilities from the analysis of finish position probabilities. The probability that we take route L is just the probability of going from 0 to L and then to R, divided by the probability that k is last; similarly for route R.

pL = simplify(p0L*pLR/pk)
pR = simplify(p0R*pRL/pk)

pL = (r^N - r^(k + 1))/(r^N - r^2)
pR = -(r*(r - r^k))/(r^N - r^2)
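Since the port must take one of the two routes, these two probabilities should sum to exactly 1 for every k. A quick check of my own in Python, using exact rational arithmetic (the function names here are hypothetical, mirroring `pL` and `pR` above):

```python
from fractions import Fraction

def p_route_L(k, N, r):
    # pL = (r^N - r^(k+1))/(r^N - r^2), as derived above
    return (r**N - r**(k + 1)) / (r**N - r**2)

def p_route_R(k, N, r):
    # pR = -(r*(r - r^k))/(r^N - r^2), as derived above
    return -(r * (r - r**k)) / (r**N - r**2)

r = Fraction(9, 11)   # r = q/p for p = 0.55
N = 17
route_sums = [p_route_L(k, N, r) + p_route_R(k, N, r) for k in range(1, N)]
```

Every entry of `route_sums` is exactly 1, confirming the two route probabilities partition the event that k is last.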

With all these results in place we can compute the expected length by adding the two possible path lengths times their probabilities.

Ek = simplify(pL.*EL + pR.*ER, 400)

Ek = (2*(2*N - k - 3*r^N + N*r^N - N*r^k + k*r^N + 1))/((r^N - 1)*(r - 1)) - (k - 2*N + N*r^N + N*r^k - k*r^N)/(r^N - 1) - 4/(r - 1)^2 - (2*r^N*(r^N + 1)*(N - 1))/((r - r^N)*(r^N - 1))

**Does this agree with simulation?**

We can look to see how well `Ek` agrees with the simulated data by running the simulation and computing the mean number of passes for each finishing position. As we saw in the previous blog post, a simple way to look at the mean number of passes for each last position is by using `varfun` for a table, with a grouping variable.

p = pToUse;
N = 17;
k = 1:N-1;
[last,nPass] = portPassingSimulation(1e6, N, p);
t = varfun(@mean, table(last, nPass), ...
    'GroupingVariable', 'last')

t =
    last    GroupCount    mean_nPass
    ____    __________    __________
     1            9291        95.262
     2           11379        111.75
     3           13888        123.74
     4           17016        130.06
     5           20845        135.69
     6           25541        136.47
     7           31030        136.05
     8           38271         133.8
     9           46659        130.33
    10           56701        125.85
    11           69383        119.94
    12           84602        112.84
    13        1.04e+05        106.39
    14      1.2629e+05        98.765
    15      1.5536e+05        90.384
    16      1.8975e+05        82.086

Plotting these against the analytical results gives

cMean = double(subs(Ek, {'r' 'N' 'k'}, {(1-p)/p N k}));
plot(t.last, t.mean_nPass, 'ro');
hold on
plot(k, cMean, '.-')
grid on
hold off
title 'Simulated vs expected mean number of passes'
xlabel 'Position port received last at the table'
ylabel 'Mean number of passes to reach last position at table'
legend('Simulated results', 'Expected result')

And lastly, the difference between the simulated mean and the expected mean is almost zero.

mean(cMean(:) - t.mean_nPass(:))

ans = -0.041108

The average of the residuals is close to zero showing the simulated data are in agreement with the expression.

Having deduced the expected number of passes needed when the last position is `k`, we can now, finally, compute the expected path length for a table of size `N`: we simply sum, over each possible last position k, the expected path length multiplied by the probability that position k is last.

$$E_N =\sum_{k=1}^{N-1} p_k E_k$$

syms k N r
EN = simplify(symsum(pk*Ek, k, 1, N-1), 1000)

EN = (N^2*(r + 1))/((r^N - 1)*(r - 1)) - ((r + 1)*(N + r - N*r))/(r - 1)^2 + (r*(N - 1)^2*(r + 1))/((r - r^N)*(r - 1))

Thankfully this looks a lot simpler than some of the other results earlier.

**Check against a known analytical solution**

Before continuing we can check this result in a couple of ways. Firstly, let's check specifically against the `N==3` case where we already know the analytic solution (see `SymbolicAnalysisFor3People.mlx`)

$$E_3 =\frac{3}{q^2 -q+1}-1$$

Substituting in the value for N and the expression for r in terms of q gives

E3 = simplify(subs(EN, {N, r}, {3 q/(1-q)}), 400)

E3 = 3/(q^2 - q + 1) - 1

Alternatively, we can check the results against some simulations for different values of N

p = pToUse;
tableSize = 5:3:37;
sim = NaN(size(tableSize));
parfor ii = 1:numel(tableSize)
    N = tableSize(ii);
    [~,nPass] = portPassingSimulation(1e5, N, p);
    sim(ii) = mean(nPass);
end
plot(tableSize, sim, 'ro')
grid on
hold on
N = min(tableSize):max(tableSize);
expected = double(subs(EN, {'r' 'N'}, {(1-p)/p N}));
plot(N, expected);
hold off
title 'Expected passes to finish random walk'
xlabel 'Number of people at the table'
ylabel 'Number of passes'
legend('Simulated results', 'Expected result', 'Location', 'N')

Writing the biased and unbiased results out in full, we get

$$E_N =\left\lbrace \begin{array}{cc}\frac{N(N-1)}{2} & p=q=\frac{1}{2}\\\frac{r+1}{r-1}\left(\frac{N^2}{r^N -1}-\frac{(N-1)^2}{r^{N-1} -1}+\frac{Nr-N-r}{r-1}\right) & p\ne q\end{array}\right.$$

Written this way, it is still quite complicated, but there is a symmetry to it: a term in N, the same term in N-1, and one further term. All in all it is distinctly simpler than the `ER` expression I printed out further up the page. And with that we have produced analytic solutions for all the values we wanted to investigate. Perhaps it is now time to pass the port!
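As one last numeric check (a sketch of my own in Python, using exact rationals in place of the Symbolic Math Toolbox), the biased closed form above should agree with the N = 3 result $E_3 = 3/(q^2-q+1) - 1$, and should approach N(N-1)/2 as r approaches 1:

```python
from fractions import Fraction

def expected_passes(N, r):
    """Biased closed form: E_N = (r+1)/(r-1) * (N^2/(r^N-1) - (N-1)^2/(r^(N-1)-1) + (N*r-N-r)/(r-1))."""
    return (r + 1) / (r - 1) * (
        N**2 / (r**N - 1)
        - (N - 1)**2 / (r**(N - 1) - 1)
        + (N*r - N - r) / (r - 1)
    )

q = Fraction(9, 20)          # q = 0.45, so p = 0.55
r = q / (1 - q)              # r = q/p = 9/11
E3_closed = expected_passes(3, r)
E3_direct = 3 / (q**2 - q + 1) - 1   # known analytic N = 3 result

# Near the fair case r -> 1, E_N should approach N*(N-1)/2 (15 for N = 6)
E6_near_fair = expected_passes(6, Fraction(1000001, 1000000))
```

With `Fraction` arithmetic the N = 3 comparison is exact (both equal 899/301 for q = 9/20), and the near-fair evaluation sits within a small fraction of a pass of 15.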

Manipulating the fairly complicated algebraic expression that resulted from joining all the expected paths and probabilities together is something I thought would be really hard by hand, yet I think you will agree that the Symbolic Math Toolbox has made it tractable. I'm also impressed with its ability to solve recurrence relations (see the associated files that prove the initial definitions of `pSAB` and `eSAB`).

From the article it might look like I've done all the work here myself, so I'd like to thank both Rick Amos (MathWorks) and Barnaby Martin (Durham University) for helping out with many of the details.

Having seen these related blog posts, what problems might you now attack using simulation to guide your mathematical approach? Can you see opportunities to bring statistics and symbolic manipulation together? Let me know here.


Published with MATLAB® R2016b