Loren on the Art of MATLAB

Now is the Time

Loren Shure — Mon, 14 Mar 2022 12:53:20 +0000

It's reached that time for me. I will be retiring from MathWorks at the end of March 2022. It's been 35 years of tremendous growth for MathWorks, and for me. When I started this blog, the original stipulation was I needed to produce 5 posts to show that I had more than 1 or 2 in my head. That was 17 years ago! I just reread my inaugural post and it rings as true today as it did then.

I still love math and linear algebra. I've had the great good fortune to work with MathWorks founders, including my colleague and friend, Cleve Moler. I am grateful to Jack Little, who gave me (or I took!) opportunities that I never would have dreamt of - helping create world-class software, learning new domains in-depth, such as some areas in signal and image processing.

We have added great new functionality and technology over the years, enabling you to write code in many styles, ranging from quick one-off calculations to large systems that can be shared and deployed for many to use.

MATLAB started as a language for matrix computation and evolved to be replete with tools for a wide array of problems in technical computing and model-based design.

During that time, we have continued to focus on what you, our users, need. Sometimes it was completely new features, sometimes it was making certain features easier to use, and sometimes it was to make your code run faster (a focus from our teams all the time, even in the presence of other goals).

I have been lucky to help MathWorks by relocating to Europe several times, for months at a time. Along with travels far and wide, talking to you (users, researchers, engineers, scientists, professors, deans, rectors, and heads of large groups at small and large educational and research institutions, as well as at a broad range of companies) has been the most rewarding part of my job.

From you, I have learned a lot, including things about MATLAB that I didn't know! And certainly things we could do to improve MATLAB. Please keep these suggestions and hassles coming our way. I've also learned plenty about science and engineering. And lots about the world we live in. What a beautiful place with an incredible collection of talented people.

These days I am very glad to see that the issue of inclusion of women and minorities is gaining the traction and focus that it deserves in STEM. In the broader world, and on a daily basis at MathWorks, there is evidence of progress. I didn't think early in my career that we'd have to work so hard to get there, but it's been totally worth every bit of effort.

The first major activity I am planning after I retire is taking a weaving class to create a Mobius scarf! So I'm not straying so far from all the technical stuff I love, but venturing into a more visually artistic rendition.

In the meantime, I leave you in the capable hands of my colleagues here at MathWorks. I hope you'll keep reading as Mike Croucher will very soon continue the tradition of blogging about MATLAB. May you all have prosperous futures and satisfaction as you learn new things and help the world become a better place.

Any comments? Leave them here.

A few nostalgic pictures for you.

Cleve and Loren at MathWorks 25th anniversary

MathWorks Team in the early days - our first real office, lines show me in the front middle, Cleve on far left.

Worldwide gathering of MathWorks for 35th anniversary, lines showing locations for Cleve (right) and me (left)

Copyright 2022 The MathWorks, Inc.

ALIKE (or not) – A Second Go At Beating Wordle

Loren Shure — Tue, 08 Feb 2022 12:51:52 +0000

Today's guest blogger is Matt Tearle, who works on the team that creates our online training content, such as our various Onramp courses to get you started on MATLAB, Simulink, and applications. Matt has written several blog posts here in the past, usually prompted by a puzzle - and today is no different.

STUMP (v: to cause to be at a loss; baffle)

Wordle. It has captured us all. When Adam wrote his post about using MATLAB to solve Wordle puzzles, I had been thinking about doing exactly the same thing. (In the past, I have written code to cheat at Wordscapes and the NY Times Spelling Bee puzzle.) I've seen other friends post about letter distributions. I guess this is what nerds do.

When I read Adam's post, I knew I had to see if I could do better. My first thought was what reader Peter Wittenberg suggested: weighting letter probabilities by where they occurred in the word. I then tried something close to what another reader, TallBrian, suggested, by scoring words according to how much they cut down the possibilities in the next turn. I also experimented with how to choose the word with the most new letters.

But nothing worked. I couldn't make any significant improvement on Adam's 94% success rate. He had noted that there were some words in the official Wordle canon that were not in the set used to develop the algorithm. I was getting suspicious. My data-senses were tingling.

CHECK (v: to make an examination or investigation; inquire)

According to the letter probabilities, TARES is a great opening guess. But I felt like I hadn't seen too many words ending in -S, when I was playing Wordle as a human. Maybe it was time to compare the two word sets. Here I'm just repurposing Adam's code to get the two word lists. The dictionary words (called word5 in Adam's code) are in the string array trainwords. The actual Wordle list (called mystery_words in Adam's code) is in testwords.

[trainwords,testwords] = getdictionaries;

whos

  Name               Size             Bytes  Class     Attributes

  testwords       2315x1             125106  string              
  trainwords      4581x1             247470  string              

The variable names show my bias: I was now thinking about this like a machine learning problem. One set of words had been used to train an algorithm - in this case, not a standard machine learning method, but a bespoke algorithm based on statistics. The other was the test set. Anyone who does machine learning knows that the quality of your model depends critically on the quality of your data. Specifically, you need your training data to accurately represent the actual data your model will be used on.

How do the letter distributions for the "training" and "test" data sets compare? First, I need to count the number of appearances of each letter in each location (for both sets of words):

% Make a list of all capital letters (A-Z)

AZ = string(char((65:90)'));

% Calculate distribution of letters

lprobTrain = letterdistribution(trainwords,AZ);

lprobTest = letterdistribution(testwords,AZ);

whos

  Name               Size             Bytes  Class     Attributes

  AZ                26x1               1500  string              
  lprobTest         26x5               1040  double              
  lprobTrain        26x5               1040  double              
  testwords       2315x1             125106  string              
  trainwords      4581x1             247470  string              

I now have two 26-by-5 matrices of the frequency of each letter of the alphabet in each position. First, let's see the overall distribution:

% Average across the 5 letter positions

distTrain = mean(lprobTrain,2);

distTest = mean(lprobTest,2);

% Plot

bar([distTrain,distTest])

xticks(1:26)

xticklabels(AZ)

xtickangle(0)

legend("Training","Test")

ylabel("Proportion of uses")

title("Total letter distributions")

Wow, something's going on with S. Sure enough, there's a big difference between the usage in the two word lists. There are also more Ds in the training set than in the actual Wordle set. Meanwhile, there are several letters that are used more in the Wordle set than the general dictionary - most notably, R, T, and Y.

Now I'm even more suspicious of words ending in -S. Let's look at the full picture of usage by letter and word location. If you do any data analysis, you will likely have encountered this situation of wanting to visualize values as a function of two discrete variables. In the past, you may have used imagesc for this, and that's certainly a reliable old workhorse. But if you don't religiously read the release notes, you may not be aware of some of the new data analysis charting functions, such as heatmap (introduced in R2017a).

heatmap(1:5,AZ,lprobTrain)

title("Letter distribution for TRAINING data")

heatmap(1:5,AZ,lprobTest)

title("Letter distribution for TEST data")

Ah, sweet vindication! Sure enough, the words we built our strategy on use the common letters in different positions than the actual Wordle words do. In particular, note that Wordle is much more likely to start a word with S than end with one. The distribution of where Es show up is different, too. To make these kinds of comparisons, it might be easier to visualize the difference between the two heatmaps:

% Take the difference in distributions

delta = lprobTrain-lprobTest;

% Find the biggest, to set the color limits symmetrically

dl = max(abs(delta(:)));

% Visualize

hm = heatmap(1:5,AZ,delta,"ColorLimits",dl*[-1 1],"Colormap",turbo);

title(["Difference in letter distribution","Red = higher prevalence in training set","Blue = higher prevalence in test set"])

Notice that I've set the color limits so that 0 is a "neutral" green, while blue or red shows a difference in either direction. The lack of -S words shows up clearly, as does the shift from -E- to -E at the end of words. Together, this suggests fewer plurals (like WORDS) and fewer -ES verb forms (e.g., "Adam LOVES playing Wordle") in the Wordle list.

There are a few other details in there, but they're harder to see because everything is so dominated by the difference in -S words. Let's manually narrow the color range to see more details. This will show -S as less significant than it is, but that's OK - we know about that already.

hm.ColorLimits = [-0.1 0.1];

Now we can see some interesting trends: more -Y words (presumably adjectives like "this is a SILLY topic for a blog post"), as well as more -R and -T, fewer -D words (perhaps -ED past tenses like "until Adam's post, Matt LOVED playing Wordle"), fewer instances of vowels as the second letter, and R and S switching at positions 1 & 2.

Using this heatmap, you can also see that TARES is much more aligned with the training (dictionary) set than the test (Wordle) set: every letter has a positive value in the heatmap. So while the letters are all high probability overall, this specific arrangement is particularly good for the training set and bad for the test set. A simple rearrangement to STARE reverses the situation.
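To make the TARES versus STARE comparison concrete, here is a minimal sketch that sums the difference matrix over the five (letter, position) cells a word occupies. The two three-word lists are made up purely for illustration (they are not the dictionary or Wordle lists), and posprob and positionscore are hypothetical helper names, not functions from the post.

```matlab
AZ = 'A':'Z';

% Tiny illustrative word lists (assumptions, NOT the real data sets)
trainToy = ["TARES";"RATES";"CARES"];
testToy  = ["STARE";"STORY";"SHARE"];

% 26-by-5 matrix: proportion of words with each letter in each position
posprob = @(w) squeeze(mean(char(w) == reshape(AZ,1,1,[]),1))';

% Positive = letter/position more common in the training list
deltaToy = posprob(trainToy) - posprob(testToy);

% Sum the entries a word "lands on": row = letter index, column = position
positionscore = @(word) sum(deltaToy(sub2ind(size(deltaToy), ...
    double(char(word)) - double('A') + 1, 1:5)));

scoreTARES = positionscore("TARES");   % positive: aligned with the training list
scoreSTARE = positionscore("STARE");   % negative: aligned with the test list
```

With the real 26-by-5 delta matrix from the post in place of deltaToy, the same one-line score would rank any five-letter word by how much its letter placement favors one list over the other.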

ASIDE (n: a comment or discussion that does not relate directly to the main subject being discussed)

I noticed that my code jumps freely between strings, chars, and categoricals. You can see some of that in the code here, but my Wordle solver was even more liberal in its use of the different types. That might seem like evidence of bad programming - "pick a data type already!", you cry - but I'm claiming that this is actually a good practice: MATLAB gives you lots of great data types; use them! With the introduction of strings (R2016b), we get questions like "so should we just use strings now?" and "is there any point in char instead of string?". If you're confused about this, here's a simple principle: the unit of a char is a single character, the unit of a string is text of any length. Wordle is all about words... but also all about the letters! That's why it's useful to use both string (for studying words) and char (for letters).
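A quick sketch of that principle, using a word from earlier in the post (the variable names are just for illustration):

```matlab
w = "STARE";                 % string: one element holding the whole word
c = char(w);                 % char: a 1-by-5 array, one element per letter

nWords   = numel(w);         % 1 (one piece of text)
nLetters = strlength(w);     % 5 (length of that text)
nChars   = numel(c);         % 5 (five separate characters)
firstLetter = c(1);          % 'S'

% String functions work on whole arrays of words at once
words = ["STARE";"TARES"];
firstLetters = extractBefore(words,2);   % text before position 2, i.e. the first letter of each word
```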

Also, our dedicated team of developers gave us a whole pile of handy text functions along with strings. But those functions aren't just for strings - like many MATLAB functions, they accept different kinds of inputs. These ones accept text in any form and allow you to do basic text processing without regular expressions. For example, I've hypothesized that Wordle doesn't use -ES and -ED verb forms. Let's see the words that have those endings in each list:

ESDendings = @(words) words(endsWith(words,["ES","ED"]));

ESDtrain = ESDendings(trainwords)'

ESDtrain = 1×619 string
"ABBES"      "ACHED"      "ACHES"      "ACMES"      "ACRES"      "ACTED"      "ADDED"      "ADZES"      "AIDED"      "AIDES"      "AILED"      "AIMED"      "AIRED"      "ALOES"      "ANTED"      "ANTES"      "APSES"      "ARCED"      "ARMED"      "ASHED"      "ASHES"      "ASKED"      "ASSES"      "AXLES"      "BAAED"      "BABES"      "BAKED"      "BAKES"      "BALED"      "BALES"      

ESDtest = ESDendings(testwords)'

ESDtest = 1×23 string
"ABLED"      "BLEED"      "BREED"      "BUSED"      "CLUED"      "CREED"      "CRIED"      "DRIED"      "EMBED"      "FREED"      "FRIED"      "GREED"      "KNEED"      "PLIED"      "PRIED"      "SHIED"      "SPEED"      "SPIED"      "STEED"      "TRIED"      "TWEED"      "UNFED"      "UNWED"      

100*numel(ESDtrain)/numel(trainwords)

ans = 13.5123

100*numel(ESDtest)/numel(testwords)

ans = 0.9935

The handy endsWith function does what it suggests and finds the words with the given endings. Sure enough, the training data from the dictionary has many pairs of -ES and -ED verbs, like ACHED and ACHES (13.5% of the whole list). But Wordle has almost no words that are made by simply appending -ED or -ES to a 4-letter verb. Consequently, prevalence of -ED and -ES words is much lower (only 1%).
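As a small illustration of the "text in any form" point, the same endsWith call works on string arrays, char vectors, and cell arrays of char vectors. The words below come from the lists above; the variable names are only for this sketch.

```matlab
endings = ["ES","ED"];

asString = endsWith(["ACHED";"STARE"],endings);   % string array in, logical column out
asChar   = endsWith('ACHES',endings);             % char vector in, logical scalar out
asCell   = endsWith({'ABBEY','AIDED'},endings);   % cellstr in, logical row out
```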

LATER (adj: coming at a subsequent time or stage)

Having confirmed that the letter distributions were indeed different, I was able to salvage my pride by building my various solution algorithms with the Wordle list and then testing them. Now I was able to successfully solve the puzzle 99% of the time. Great. But also a little unsatisfying. As any data scientist knows, training and testing with the same data set is cheating and not a good measure of how well your algorithm will perform on new data.

But... well, there is no new data. The Wordle word list is set. So it remains a valid question: given the official Wordle list, what is the best way to solve it?

Unfortunately, readers pointed out some details with what Adam had done (which I had followed). That casts doubt on my "solutions". So for now, I'll need to keep tinkering. If the New York Times haven't hidden Wordle away behind a paywall by the time I figure it out, I'll be back. I might even be brave enough to enter the internet-argument-of-the-day: what is the best starting word? 

REPLY (v: to give an answer in words or writing; respond)

Adam's readers had some clever ideas on how they would go about beating this addictive game. Do any of you have a guaranteed opening word? A secret strategy that you will reveal to subscribers for only $19.95? How did you find yours and what makes it so great? Let us know in the comments.

function [word5,mystery_words] = getdictionaries

% Copied from Adam F

% read the list of words into a string array

r = readlines("https://gist.githubusercontent.com/wchargin/8927565/raw/d9783627c731268fb2935a731a618aa8e95cf465/words");

% replace diacritics using a custom function from the Appendix

rs = removediacritics(r);

% keep only the entries that start with a lower case letter

rs = rs(startsWith(rs,characterListPattern("a","z")));

% get rid of entries with apostrophes, like contractions

rs = rs(~contains(rs,"'"));

% Wordle uses all upper case letters

rs = upper(rs);

% get the list of unique five letter words

word5 = unique(rs(strlength(rs)==5));

mystery_id = "1-M0RIVVZqbeh0mZacdAsJyBrLuEmhKUhNaVAI-7pr2Y"; % taken from the sheet's URL linked above

mystery_url = sprintf("https://docs.google.com/spreadsheets/d/%s/gviz/tq?tqx=out:csv",mystery_id);

mystery_words = readlines(mystery_url);

% there's an extra set of double quotes included, so let's strip them out

mystery_words = erase(mystery_words,"""");

% also we're using upper case

mystery_words = upper(mystery_words);

end

function lprob = letterdistribution(words,AZ)

% split our words into their individual letters

letters = split(words,"");

% this also creates leading and trailing blank strings, drop them

letters = letters(:,2:end-1);

% Calculate the distribution of letters in each word position

lcount = zeros(numel(AZ),5);  % preallocate: one row per letter, one column per position

for k = 1:5

    lcount(:,k) = histcounts(categorical(letters(:,k),AZ));

end

lprob = lcount./sum(lcount);  % Normalize

end

% Also from Adam

% citation: Jim Goodall, 2020. Stack Overflow, available at: https://stackoverflow.com/a/60181033

function [clean_s] = removediacritics(s)

%REMOVEDIACRITICS Removes diacritics from text.

%   This function removes many common diacritics from strings, such as

%     á - the acute accent

%     à - the grave accent

%     â - the circumflex accent

%     ü - the diaeresis, or trema, or umlaut

%     ñ - the tilde

%     ç - the cedilla

%     å - the ring, or bolle

%     ø - the slash, or solidus, or virgule

% uppercase

s = regexprep(s,'(?:Á|À|Â|Ã|Ä|Å)','A');

s = regexprep(s,'(?:Æ)','AE');

s = regexprep(s,'(?:ß)','ss');

s = regexprep(s,'(?:Ç)','C');

s = regexprep(s,'(?:Ð)','D');

s = regexprep(s,'(?:É|È|Ê|Ë)','E');

s = regexprep(s,'(?:Í|Ì|Î|Ï)','I');

s = regexprep(s,'(?:Ñ)','N');

s = regexprep(s,'(?:Ó|Ò|Ô|Ö|Õ|Ø)','O');

s = regexprep(s,'(?:Œ)','OE');

s = regexprep(s,'(?:Ú|Ù|Û|Ü)','U');

s = regexprep(s,'(?:Ý|Ÿ)','Y');

% lowercase

s = regexprep(s,'(?:á|à|â|ä|ã|å)','a');

s = regexprep(s,'(?:æ)','ae');

s = regexprep(s,'(?:ç)','c');

s = regexprep(s,'(?:ð)','d');

s = regexprep(s,'(?:é|è|ê|ë)','e');

s = regexprep(s,'(?:í|ì|î|ï)','i');

s = regexprep(s,'(?:ñ)','n');

s = regexprep(s,'(?:ó|ò|ô|ö|õ|ø)','o');

s = regexprep(s,'(?:œ)','oe');

s = regexprep(s,'(?:ú|ù|ü|û)','u');

s = regexprep(s,'(?:ý|ÿ)','y');

% return cleaned string

clean_s = s;

end

Building a Wordle solver

Loren Shure — Tue, 18 Jan 2022 14:59:22 +0000

Today's guest blogger is Adam Filion, a Senior Data Scientist at MathWorks. Adam has worked on many areas of data science at MathWorks, including helping customers understand and implement data science techniques, managing and prioritizing our development efforts, building Coursera classes, and leading internal data science projects.

My wife recently introduced me to the addictive puzzle game Wordle. In the game, you make a series of guesses to figure out the day's secret answer word. The answer is always a five letter English word, and you have six attempts to guess the right answer. After each guess, the game gives you some information about how close you are to the answer.

Figure 1: Examples of how Wordle gives feedback. 

I've always been more of a numbers guy, so after getting stuck on a daily Wordle I decided to see if I could make better guesses using MATLAB. In this post, I'll walk through a simple method of generating suggestions for the Wordle game that can get the right answer within six guesses 94% of the time without knowing Wordle's official word list. The example puzzle is from Jan 12, 2022.

Figure 2: A blank Wordle puzzle. Six guesses remaining!

Table of Contents

Generate our vocabulary
Find the most commonly used letters
Create a score for each word
Choose a word and make our first guess
Account for Wordle's feedback
Make our second guess
Make our third guess
Make our fourth guess
Make our fifth guess
Make our sixth and final guess
Play a random game of Wordle
Play all possible games of Wordle
Areas for improvement
Appendix

Generate our vocabulary

If we're going to play games of Wordle, we need a vocabulary list of five letter English words. Fans of the game have already scraped the Wordle source code and shared the list of 2,315 mystery words and 12,972 guessable words (thanks FiveThirtyEight!). We'll come back to the mystery words later to check our accuracy, but using that list in our solver feels a bit like cheating, so let's pretend we don't know what list Wordle uses. There isn't a single comprehensive list of English words, so let's pick a common source for coders looking for one: the list that Unix systems provide under /usr/share/dict/words. If you're on Windows, you can find the same list in places like GitHub. We can easily read text files like this directly off the web for text processing using readlines. This list includes acronyms and proper nouns, which we can remove by ignoring entries that start with a capital letter. While it doesn't contain the full English language, it gives us a list of 4,581 five letter words to play with. We'll probably be missing some of the words in Wordle's mystery list, but it should still be close enough to make helpful suggestions.

% read the list of words into a string array

r = readlines("https://gist.githubusercontent.com/wchargin/8927565/raw/d9783627c731268fb2935a731a618aa8e95cf465/words");

% replace diacritics using a custom function from the Appendix

rs = removediacritics(r);

% keep only the entries that start with a lower case letter

rs = rs(startsWith(rs,characterListPattern("a","z")));

% get rid of entries with apostrophes, like contractions

rs = rs(~contains(rs,"'"));

% Wordle uses all upper case letters

rs = upper(rs);

% get the list of unique five letter words

word5 = unique(rs(strlength(rs)==5))

word5 = 4581×1 string
"ABACI"      
"ABACK"      
"ABAFT"      
"ABASE"      
"ABASH"      
"ABATE"      
"ABBES"      
"ABBEY"      
"ABBOT"      
"ABEAM"      

Find the most commonly used letters

Now we have our list of five letter words, but how do we pick which word to guess first? Our first guess is made blind, with no clues to the final answer. Since Wordle gives feedback by letter, an easy method is to pick the word that has the most commonly used letters. 

Let's start by splitting each word into its letters and looking at the overall histogram of letters. We can see that some letters are used vastly more often than others.

% split our words into their individual letters

letters = split(word5,"");

% this also creates leading and trailing blank strings, drop them

letters = letters(:,2:end-1);

% view the counts of letter use

h = histogram(categorical(letters(:)));

ylabel("Number of uses in five letter words")

Let's put this in a table for use in creating word scores.

lt = table(h.Categories',h.Values','VariableNames',["letters","score"])

lt = 26×2 table 
	letters	score
1	'A'	1841
2	'B'	556
3	'C'	790
4	'D'	989
5	'E'	2449
6	'F'	445
7	'G'	543
8	'H'	646
9	'I'	1258
10	'J'	69
11	'K'	477
12	'L'	1301
13	'M'	639
14	'N'	1019
⋮

Create a score for each word

We can now create a word score based on the popularity of the letters it uses. Start by replacing each letter with its individual score, then adding up the letter scores to create word scores.

% for each letter, replace it with its corresponding letter score

letters_score = arrayfun(@(x) lt.score(lt.letters==x),letters);

% sum the letter scores to create word scores

word_score = sum(letters_score,2);

% find the top scores and their corresponding words

[top_scores,top_idx] = sort(word_score,1,"descend");

word_scores = table(word5(top_idx),top_scores,'VariableNames',["words","score"]);

Choose a word and make our first guess

While I'm no game theorist, it seems obvious our opening move should be one that uses five different and popular letters to maximize the chance we'll get useful feedback to narrow down our search. After removing words with repeated letters, we see AROSE is the top choice for first word so let's try that.

% find how many unique letters are in each word

word_scores.num_letters = arrayfun(@(x) numel(unique(char(x))),word_scores.words);

% keep only the words with no repeated letters

top_words_norep = word_scores(word_scores.num_letters==5,:);

head(top_words_norep)

ans = 8×3 table 
	words	score	num_letters
1	"AROSE"	9792	5
2	"EARLS"	9628	5
3	"LASER"	9628	5
4	"REALS"	9628	5
5	"ALOES"	9609	5
6	"ASTER"	9589	5
7	"RATES"	9589	5
8	"STARE"	9589	5

Account for Wordle's feedback

Figure 3: Our Wordle puzzle after making our first guess.

After submitting our first guess, we can see that three of the letters, A, R, and O, are in the final answer but in different positions. The letters S and E are not in the word at all. This feedback eliminates a huge number of possible words.

Now that we have this feedback, how can we incorporate it? It's a fairly simple matter of representing the feedback received then looping through those results and eliminating words that are no longer possible solutions. We do so in the filter_words helper function found in the Appendix. With it we pass in our table of words and their scores, the words we've guessed so far, and the encoded results of those guesses. The results are encoded as a matrix with one row per guess and one column per letter. If the letter is incorrect it is encoded as 0, if the letter is in the answer but not in that position it is encoded as 1, and if it is in the correct position it is encoded as 2. 
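The encoding described above can be sketched as follows. This is not the Appendix helper itself; it is a minimal illustration that assumes repeated letters are handled the usual Wordle way (each letter of the answer can be matched at most once). It uses the guesses and final answer from this walkthrough, so it gives away where we end up.

```matlab
guesses = ["AROSE";"RATIO";"CAROL";"MANOR";"VAPOR"];
answer  = 'FAVOR';                       % char vector so we can compare letter by letter

results = zeros(numel(guesses),5);       % one row per guess, one column per letter
for ii = 1:numel(guesses)
    guess = char(guesses(ii));
    exact = guess == answer;             % letters in the correct position
    results(ii,exact) = 2;
    pool = answer(~exact);               % answer letters not yet accounted for
    for k = find(~exact)
        hit = find(pool == guess(k),1);
        if ~isempty(hit)
            results(ii,k) = 1;           % in the answer, but in a different position
            pool(hit) = [];              % use each answer letter only once
        end
    end                                  % anything left at 0 is not in the answer
end
```

Each row of results matches the feedback matrices entered by hand in the sections that follow.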

Make our second guess

We're off to a good start! Passing this information to filter_words, we've narrowed our candidates down from 4,581 words to just 35. 

% our previous guesses

guesses = "AROSE";

% encode the feedback

results = [1,1,1,0,0];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

top_words_filtered = 35×3 table 
	words	score	num_letters
1	"TAROT"	7314	4
2	"RATIO"	7310	5
3	"RADIO"	7037	5
4	"CAROL"	6881	5
5	"CORAL"	6881	5
6	"POLAR"	6845	5
7	"RADON"	6798	5
8	"MOLAR"	6730	5
9	"MORAL"	6730	5
10	"ROYAL"	6726	5
11	"LABOR"	6647	5
12	"LARGO"	6634	5
13	"MANOR"	6448	5
14	"ROMAN"	6448	5
⋮

We can see the top score for the next word is TAROT, but at this point we're probably better off still using words with five unique letters, so let's try RATIO.
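The filter_words helper lives in the Appendix, so here is only a rough sketch of one way such a filter could work, which may differ from the actual implementation: keep a candidate only if, were it the answer, our guess would have produced exactly the feedback we observed. The candidate list is a small illustrative subset of the vocabulary, and the duplicate-letter handling is an assumption.

```matlab
% Small illustrative subset of candidates, not the full 4,581-word vocabulary
candidates = ["TAROT";"RATIO";"CAROL";"FAVOR";"STARE";"LASER"];
guess    = 'AROSE';
observed = [1,1,1,0,0];                  % feedback received for AROSE

keep = false(size(candidates));
for ii = 1:numel(candidates)
    answer = char(candidates(ii));       % pretend this candidate is the answer
    result = zeros(1,5);
    exact = guess == answer;
    result(exact) = 2;                   % right letter, right position
    pool = answer(~exact);
    for k = find(~exact)
        hit = find(pool == guess(k),1);
        if ~isempty(hit)
            result(k) = 1;               % right letter, wrong position
            pool(hit) = [];              % each answer letter matches once
        end
    end
    keep(ii) = isequal(result,observed); % consistent with what we saw?
end
remaining = candidates(keep);            % STARE and LASER are eliminated
```

A filter built this way handles several guesses by applying the same check once per guess and keeping only the candidates that pass every time.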

Figure 4: Our Wordle puzzle after making our second guess.

Make our third guess

Now the "A" is in the right location, and we've eliminated two more popular letters. After adding in this information, there are only 10 candidates left and CAROL is the next top choice.

% our previous guesses

guesses = ["AROSE";"RATIO"];

% encode the feedback

results = [1,1,1,0,0;

    1,2,0,0,1];

% filter down to the remaining candidates, no requirement on unique letters

top_words_filtered = filter_words(word_scores,guesses,results)

top_words_filtered = 10×3 table 
	words	score	num_letters
1	"CAROL"	6881	5
2	"LABOR"	6647	5
3	"MANOR"	6448	5
4	"BARON"	6365	5
5	"VALOR"	6350	5
6	"CAROM"	6219	5
7	"MAYOR"	6064	5
8	"VAPOR"	5803	5
9	"MAJOR"	5498	5
10	"FAVOR"	5494	5

Figure 5: Our Wordle puzzle after making our third guess.

Make our fourth guess

Now we've got two letters in the right spot, and by process of elimination we know "R" must come last. Adding this info, we see there are only five choices left, and three of them start with M, so let's go with MANOR.

% our previous guesses

guesses = ["AROSE";"RATIO";"CAROL"];

% encode the feedback

results = [1,1,1,0,0;

    1,2,0,0,1;

    0,2,1,2,0];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

top_words_filtered = 5×3 table 
	words	score	num_letters
1	"MANOR"	6448	5
2	"MAYOR"	6064	5
3	"VAPOR"	5803	5
4	"MAJOR"	5498	5
5	"FAVOR"	5494	5

Figure 6: Our Wordle puzzle after making our fourth guess.

Make our fifth guess

And now we're left with two choices for two guesses.

% our previous guesses

guesses = ["AROSE";"RATIO";"CAROL";"MANOR"];

% encode the feedback

results = [1,1,1,0,0;

    1,2,0,0,1;

    0,2,1,2,0;

    0,2,0,2,2];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

top_words_filtered = 2×3 table 
	words	score	num_letters
1	"VAPOR"	5803	5
2	"FAVOR"	5494	5

Figure 7: Our Wordle puzzle after our fifth guess.

Make our sixth and final guess

One option left for our final guess. Fingers crossed!

% our previous guesses

guesses = ["AROSE";"RATIO";"CAROL";"MANOR";"VAPOR"];

% encode the feedback

results = [1,1,1,0,0;

    1,2,0,0,1;

    0,2,1,2,0;

    0,2,0,2,2;

    1,2,0,2,2];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

top_words_filtered = 1×3 table 
	words	score	num_letters
1	"FAVOR"	5494	5

Figure 8: Our Wordle puzzle after our sixth guess. Success!

So, it worked out with this Wordle puzzle, but it took all six guesses so we cut it close. How well will this work in general?

Play a random game of Wordle

If MATLAB knows what the answer is, we can automate the process of playing a game of Wordle and see if our algorithm will correctly guess it. We'll start by creating another helper function wordle_feedback in the Appendix to encode the feedback we receive for each guess based on the correct answer.

Now we can automatically play a game using our play_wordle helper function. This accepts our table of five letter words and their scores, along with a word to serve as the answer. It will return the answer we were trying to guess, whether or not we won while playing, and the guesses made along the way. As we play, we'll require that our first three guesses use no repeating letters (assuming such words are still possible), but from the fourth guess on letters can repeat.
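The guess-selection rule just described can be sketched like this. It is an illustration only, not the play_wordle code from the Appendix; the three candidates and their scores are taken from the second-guess table above, and guess_number and the other variable names are hypothetical.

```matlab
% Three remaining candidates and their scores from the second-guess table
words = ["TAROT";"RATIO";"RADIO"];
score = [7314;7310;7037];
candidates = table(words,score);
candidates.num_letters = arrayfun(@(x) numel(unique(char(x))),candidates.words);

guess_number = 2;                        % which guess we are about to make

% For the first three guesses, prefer words with no repeated letters,
% as long as at least one such word is still possible
norep = candidates(candidates.num_letters==5,:);
if guess_number <= 3 && height(norep) > 0
    pool = norep;
else
    pool = candidates;
end

[~,best] = max(pool.score);              % highest-scoring word in the pool
next_guess = pool.words(best);           % RATIO here; TAROT if guess_number were 4
```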

Since we know where to find a list of the mystery words, we can read it from the Google sheet directly into MATLAB.

mystery_id = "1-M0RIVVZqbeh0mZacdAsJyBrLuEmhKUhNaVAI-7pr2Y"; % taken from the sheet's URL linked above

mystery_url = sprintf("https://docs.google.com/spreadsheets/d/%s/gviz/tq?tqx=out:csv",mystery_id);

mystery_words = readlines(mystery_url);

% there's an extra set of double quotes included, so let's strip them out

mystery_words = erase(mystery_words,"""");

% also we're using upper case

mystery_words = upper(mystery_words);

Our algorithm can only guess words from the vocabulary we gave it. About 4% of mystery words are missing from our vocabulary, so even if we play perfectly using the words we know, the best win rate we can expect is 96%.

num_missing = sum(~ismember(mystery_words,word_scores.words))

num_missing = 94

perc_missing = num_missing / numel(mystery_words) * 100

perc_missing = 4.0605

Now that we have the mystery list, we can play a game with a random answer to guess. 

answer_idx = randi(numel(mystery_words));

[answer,win,played_words] = play_wordle(word_scores,mystery_words(answer_idx))

answer = "SPRAY"
win = 1
played_words = 1×6 string
"AROSE"      "LAIRS"      "SPRAT"      "SPRAY"      ""           ""           

Play all possible games of Wordle

We can test our algorithm across the entire 2,315 mystery word vocabulary by running in a loop. We can see that this simple approach will get us the right answer within six guesses about 94% of the time, which is pretty close to the maximum possible of 96%! When we do win, we'll most commonly win in four guesses.

num_games = numel(mystery_words);

wins = nan(num_games,1);

guesses = strings(num_games,6);

answers = strings(num_games,1);

for ii = 1:num_games % for each word in our vocabulary

    % play a game of Wordle where that word is the answer we're guessing

    [answers(ii),wins(ii),guesses(ii,:)] = play_wordle(word_scores,mystery_words(ii));

end

fprintf("This strategy results in winning ~%0.1f%% of the time.\n",sum(wins)/numel(wins)*100)

This strategy results in winning ~94.2% of the time.

num_guesses = sum(guesses(wins==1,:)~="",2);

histogram(num_guesses,"Normalization","probability")

xlabel("Number of guesses when winning Wordle")

ylabel("Fraction of victories")

Here's how the game went for an answer we didn't get correct. 

missed_answers = answers(wins==0);

[answer,win,played_words] = play_wordle(word_scores,missed_answers(1))

answer = "ABLED"
win = 0
played_words = 1×6 string
"AROSE"      "ALIEN"      ""           ""           ""           ""           

There seem to be two patterns to missed answers.

As mentioned above, about 4% of answers aren't in our vocabulary, such as with RAMEN and ZESTY. You can tell when this happens because we lose the game without using all our guesses due to running out of allowable words.
Some answers combine a common letter pattern with a rarely used letter, and we didn't have enough guesses to narrow it down. For example, when the answer is FIXER, there are 39 words in our vocabulary that use "I" in the second position and "ER" at the end. Out of all of them FIXER has the lowest word score due to F and X both being in the bottom seven least used letters. Our six guesses go AROSE, LITER, DINER, RIPER, HIKER, FIBER and we run out of guesses before getting to FIXER.
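To get a feel for how crowded a letter pattern is, you can count the words that share it. Here is a minimal sketch; the short vocab list is made up for illustration and stands in for the real word_scores.words column.

```matlab
% hypothetical mini-vocabulary, for illustration only
vocab = ["FIXER" "LITER" "DINER" "RIPER" "HIKER" "FIBER" "AROSE" "SPRAY"];

% words with "I" in the second position and "ER" at the end
is_candidate = extractBetween(vocab,2,2) == "I" & endsWith(vocab,"ER");

candidates = vocab(is_candidate);
num_candidates = numel(candidates);
```

With the full vocabulary, the same test would show how many guesses a pattern could absorb before the answer turns up.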

Areas for improvement

What are some other things we could try to get our win rate to 100%? Here are a few ideas:

We identified the two main patterns to missed answers above. Clearly the first pattern could be resolved just by adding Wordle's mystery words to our vocabulary.
A solution to the second pattern is less clear. One drawback of our current word scoring approach is that the scores are static, so if a word like FIXER starts with a lower score, that will never change. We could potentially get a few more correct guesses by updating our score as we play by removing the ineligible words and/or solved letter positions from the score computation.
We could also try improving our scoring method by looking for common patterns, called n-grams. Most commonly, n-grams are used to find common word combinations, but they can also be used to find common letter combinations. We could extract the top letter n-grams and incorporate them into our score, since guessing a word with a common n-gram will get us feedback on many similar words.
We're already requiring that our first three guesses use non-repeating letters, which is a strategy I picked through trial-and-error and may not be optimal. We could also use non-overlapping words on the first few guesses, even if we already got some letters correct. This would require us to always use 10 unique letters across our first two guesses, even if we have to make guesses we know can't be correct in order to do so. I experimented with using this universally and it actually decreases the overall win rate very slightly, but there may be a smarter way to use it situationally.
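As a rough sketch of the dynamic scoring idea, the snippet below recomputes letter frequencies using only the words still eligible and only the letter positions not yet solved. The remaining list, the unsolved positions, and the scoring rule (sum of the frequencies of a word's distinct letters) are assumptions for illustration and may not match the scoring used earlier in this post.

```matlab
% hypothetical state partway through a game: "I" and "ER" are already solved
remaining = ["FIXER"; "FIBER"; "HIKER"; "RIPER"];
unsolved = [1 3];                               % positions still unknown

letters = char(remaining);                      % N-by-5 char matrix
letters = letters(:,unsolved);                  % ignore solved positions

% frequency of each letter A-Z among the unsolved positions
edges = (double('A'):double('Z')+1) - 0.5;      % one bin per letter
letter_counts = histcounts(double(letters(:)), edges);
letter_freq = letter_counts / sum(letter_counts);

% score each word by the summed frequency of its distinct unsolved letters
scores = zeros(numel(remaining),1);
for ii = 1:numel(remaining)
    uniq = unique(letters(ii,:));
    scores(ii) = sum(letter_freq(double(uniq) - double('A') + 1));
end

rescored = sortrows(table(remaining, scores), "scores", "descend");
```

In this toy state FIXER is no longer stuck at the bottom, because the letters it shares with every other candidate stop contributing to anyone's score.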

Do you have any other ideas for better strategies? Let us know in the comments.

Appendix

function word_scores_filtered = filter_words(word_scores,words_guessed,results)

% remove words_guessed since those can't be the answer

word_scores_filtered = word_scores;

word_scores_filtered(matches(word_scores_filtered.words,words_guessed),:) = [];

% filter to words that have correct letters in correct positions (green letters)

[rlp,clp] = find(results==2);

if ~isempty(rlp)

    for ii = 1:numel(rlp)

        letter = extract(words_guessed(rlp(ii)),clp(ii));

        % keep only words that have the correct letters in the correct locations

        word_scores_filtered = word_scores_filtered(extract(word_scores_filtered.words,clp(ii))==letter,:);

    end

end

% filter to words that also contain correct letters in other positions (yellow letters)

[rl,cl] = find(results==1);

if ~isempty(rl)

    for jj = 1:numel(rl)

        letter = extract(words_guessed(rl(jj)),cl(jj));

        % remove words with letter in same location

        word_scores_filtered(extract(word_scores_filtered.words,cl(jj))==letter,:) = [];

        % remove words that don't contain letter

        word_scores_filtered(~contains(word_scores_filtered.words,letter),:) = [];

    end

end

% filter to words that also contain no incorrect letters (grey letters)

[ri,ci] = find(results==0);

if ~isempty(ri)

    for kk = 1:numel(ri)

        letter = extract(words_guessed(ri(kk)),ci(kk));

        % remove words that contain incorrect letter

        word_scores_filtered(contains(word_scores_filtered.words,letter),:) = [];

    end

end

end % filter_words

function results = wordle_feedback(answer, guess)

results = nan(1,5);

for ii = 1:5 % for each letter in our guess

    letter = extract(guess,ii); % extract that letter

    if extract(answer,ii) == letter

        % if answer has the letter in the same position

        results(ii) = 2;

    elseif contains(answer,letter)

        % if answer has that letter in another position

        results(ii) = 1;

    else

        % if answer does not contain that letter

        results(ii) = 0;

    end

end

end % wordle_feedback
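% Note: wordle_feedback marks a letter yellow whenever the answer contains it anywhere. The real game treats repeated letters more strictly: each letter in the answer can be matched by only one letter of the guess. The script below is a sketch of that stricter rule for a single answer/guess pair (the two words are just examples). It is not what play_wordle uses, and filter_words would need its grey-letter rule adjusted before the two could be combined.

```matlab
answer = 'ABBEY';   % example answer
guess  = 'BOBBY';   % example guess with a repeated letter

results = zeros(1,5);          % 0 = grey by default
remaining = answer;            % letters of the answer not yet matched

% first pass: exact position matches (green)
green = guess == answer;
results(green) = 2;
remaining(green) = ' ';        % consume matched letters

% second pass: right letter, wrong position (yellow), one match per letter
for ii = find(~green)
    idx = find(remaining == guess(ii), 1);
    if ~isempty(idx)
        results(ii) = 1;
        remaining(idx) = ' ';  % consume so it cannot match again
    end
end
```

Here the third B in the guess stays grey because both B's in the answer have already been accounted for.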

function [word_to_guess,win,guesses] = play_wordle(word_scores, word_to_guess)

top_words = sortrows(word_scores,2,"descend"); % ensure scores are sorted

guesses = strings(1,6);

results = nan(6,5);

max_guesses = 6;

for ii = 1:max_guesses % for each of our guesses

    % filter our total vocabulary to candidate guesses using progressively different strategies

    if ii == 1 % for our first guess, filter down to words with five unique letters and take top score

        top_words_filtered = top_words(top_words.num_letters==5,:);

    elseif ii <= 3 % if we're generating our second or third guess

        % filter out ineligible words and require five unique letters if possible

        min_uniq = 5;

        top_words_filtered = filter_words(top_words(top_words.num_letters==min_uniq,:),guesses(1:ii-1),results(1:ii-1,:));

        % if filtering to five unique letters removes all words, allow more repeated letters

        while height(top_words_filtered) == 0 && min_uniq > min(word_scores.num_letters)

            min_uniq = min_uniq - 1;

            top_words_filtered = filter_words(top_words(top_words.num_letters==min_uniq,:),guesses(1:ii-1),results(1:ii-1,:));

        end

    else % after third guess, set no restrictions on repeated letters

        top_words_filtered = filter_words(top_words,guesses(1:ii-1),results(1:ii-1,:));

    end

    % generate our guess (if we have any)

    if height(top_words_filtered) == 0 % if there are no eligible words in our vocabulary

        win = 0; % we don't know the word and we've lost

        return % make no more guesses

    else % otherwise generate a new guess and get the results

        guesses(1,ii) = top_words_filtered.words(1);

        results(ii,:) = wordle_feedback(word_to_guess,guesses(1,ii));

    end

    % evaluate if we've won, lost, or should keep playing

    if guesses(1,ii) == word_to_guess % if our guess is correct

        win = 1; % set the win flag

        return % make no more guesses

    elseif ii == max_guesses % if we've already used all our guesses and they're all wrong

        win = 0; % we've lost and the loop will end

    else % otherwise we're still playing

    end

end

end % play_wordle

% citation: Jim Goodall, 2020. Stack Overflow, available at: https://stackoverflow.com/a/60181033

function [clean_s] = removediacritics(s)

%REMOVEDIACRITICS Removes diacritics from text.

%   This function removes many common diacritics from strings, such as

%     á - the acute accent

%     à - the grave accent

%     â - the circumflex accent

%     ü - the diaeresis, or trema, or umlaut

%     ñ - the tilde

%     ç - the cedilla

%     å - the ring, or bolle

%     ø - the slash, or solidus, or virgule

% uppercase

s = regexprep(s,'(?:Á|À|Â|Ã|Ä|Å)','A');

s = regexprep(s,'(?:Æ)','AE');

s = regexprep(s,'(?:ß)','ss');

s = regexprep(s,'(?:Ç)','C');

s = regexprep(s,'(?:Ð)','D');

s = regexprep(s,'(?:É|È|Ê|Ë)','E');

s = regexprep(s,'(?:Í|Ì|Î|Ï)','I');

s = regexprep(s,'(?:Ñ)','N');

s = regexprep(s,'(?:Ó|Ò|Ô|Ö|Õ|Ø)','O');

s = regexprep(s,'(?:Œ)','OE');

s = regexprep(s,'(?:Ú|Ù|Û|Ü)','U');

s = regexprep(s,'(?:Ý|Ÿ)','Y');

% lowercase

s = regexprep(s,'(?:á|à|â|ä|ã|å)','a');

s = regexprep(s,'(?:æ)','ae');

s = regexprep(s,'(?:ç)','c');

s = regexprep(s,'(?:ð)','d');

s = regexprep(s,'(?:é|è|ê|ë)','e');

s = regexprep(s,'(?:í|ì|î|ï)','i');

s = regexprep(s,'(?:ñ)','n');

s = regexprep(s,'(?:ó|ò|ô|ö|õ|ø)','o');

s = regexprep(s,'(?:œ)','oe');

s = regexprep(s,'(?:ú|ù|ü|û)','u');

s = regexprep(s,'(?:ý|ÿ)','y');

% return cleaned string

clean_s = s;

end

Pattern from 1997: using feval

Loren Shure — Thu, 06 Jan 2022 14:39:17 +0000

In the early 1990s, to avoid eval and all of its quirks (if you don't know about this, DON'T look it up - it's totally discouraged), we recommended using feval for evaluating functions that might not be known until supplied by the user running the code.  We used this, for example, for evaluating numeric integrals.  We wanted to leave the integrand completely flexible and up to the user.  Yet the integrator had to be able to evaluate the user function, an unknown at the time of creating the integrator.

function I=integ(fcn,fmin,fmax,tol)

if ~ischar(fcn) 

   error(...)

end

% figure out some initial points to evaluate

pts = linspace(fmin, fmax, 20);

fv = feval(fcn,pts);

I = ...

:

end

This had the advantage of not asking MATLAB to "poof" any variables into the workspace.  It helped also avoid situations where there was a possibility of a function and variable having the same name, thereby possibly not giving you the version of the name you expected.  The way you used feval at that time was generally via a character array identifying the function to be called. 

I am only considering the use of feval in the context of characters or strings in MATLAB, and not for some of the more specialized versions such as working with GPUs.

You would call the integration function like this.

area = integ('myfun', 0, pi);

Today, with function handles, we can bypass using feval and use the function handle directly.

function I=integ(fcn,fmin,fmax,tol)

if ~isa(fcn, 'function_handle')  

% might still be nice to allow chars for backward compatibility - but not be permissive about allowing new "strings".

% if ~isa(fcn, 'function_handle') && ~ischar(fcn)

   error(...)

end

% figure out some initial points to evaluate

pts = linspace(fmin, fmax, 20);

fv = fcn(pts);

I = ...

:

end

Call it like this.

area = integ(@myfun, 0, pi);

This is useful for at least a couple of reasons: 

It is generally faster, if only by a bit, because there is one less indirection of function calls.
It evaluates the function for the handle you supply - and can't get confused about other possible name conflicts as a result. Inside the integrator, we have complete control over the name of the function (called fcn in the code) and since it's a function handle, it can't conflict with anything else we may have around in the function or environment.
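Here is a small runnable sketch of the same idea. It is not the integrator from the 1990s: I am using trapz on a fixed grid as a stand-in for the elided integration logic, and the error identifier is made up. It also shows one way to keep accepting an old-style function name by converting it with str2func.

```matlab
fcn = 'sin';                       % legacy-style input; @sin works as well

if ischar(fcn)
    fcn = str2func(fcn);           % convert a function name to a handle
end

if ~isa(fcn, 'function_handle')
    % hypothetical error identifier, for illustration only
    error('integ:badInput', 'fcn must be a function handle or a function name.');
end

% evaluate the integrand directly through the handle, no feval needed
pts = linspace(0, pi, 2001);
fv = fcn(pts);

area = trapz(pts, fv);             % simple trapezoidal estimate
```

After the conversion, the rest of the code only ever sees a function handle, so there is a single code path to maintain.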

Thoughts

I know we use feval for some cases of working with GPUs, but I can't think of any typical MATLAB case where I still need to use feval instead of directly applying the function.  Do you still use feval, perhaps where it's no longer needed?  Let us know here.

Copyright 2022 The MathWorks, Inc.

Benefits of Refactoring Code

Loren Shure — Mon, 29 Nov 2021 21:57:29 +0000


I have seen a lot of code in my life, including code from many different people written for many different purposes, and in many different "styles".  These styles range from quick-and-dirty (I only need to do this once), to fully optimized, documented, and tested (I want this to last a long time while other people use it).  I have found, more often than I expected, that quick-and-dirty code quickly morphs into something useful and heavily used, but without the thought and care needed to make sure the code is really up to the task.

Today I want to argue that as soon as you pick up your quick-and-dirty code to use it again, it is time to refactor and apply some good engineering techniques to whip it into shape.

Refactoring leads to smaller components

First break the code into small logical units.  Each unit is more concise and more focused; it does less, and it's clear what it does and does not do.  Decide on edge cases and error conditions and deal with these in a way that is straightforward and is likely to cause users of this module the least amount of trouble.

Benefits of smaller components

Each piece is easier to understand

As I already said, each part does less and so it's easier to understand what each piece does. In fact, if you can get the piece of code to do one thing well, that often pays off.  One technique for doing this is to reduce the branching (if-elseif-else) and instead have various branches relegated to separate functions.  Another technique is to use an arguments block to check the input arguments.  This generally takes up fewer lines of code and you are able to be concise and precise using it.
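As a minimal sketch of moving branches into separate functions, here is a lookup of function handles that replaces an if-elseif-else chain. The converter names and the temperature example are hypothetical; in real code each entry would more likely point to a named function in its own file.

```matlab
% each former branch becomes its own small function (hypothetical names)
converters = struct( ...
    'C2K', @(T) T + 273.15, ...
    'K2C', @(T) T - 273.15, ...
    'C2F', @(T) T*9/5 + 32);

kind = 'C2F';                      % which conversion the caller asked for

if ~isfield(converters, kind)
    error('convert:unknownKind', 'Unknown conversion "%s".', kind);
end

T_out = converters.(kind)(100);    % dispatch without any branching
```

Adding a new case now means adding one entry rather than touching a growing chain of conditions.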

Reduced complexity has less overhead (mental and otherwise)

When you have smaller components, they are frequently easier to understand, and much easier to test - especially if there are a very limited number of code paths.  You can check out the complexity of your code using either of 2 options to checkcode.

checkcode(filename, "-cyc") 

checkcode(filename, "-modcyc")

Complexity is reduced when you refactor the code so there are not so many nested statements like if and switch statements.

Each piece is easier to test

With smaller pieces, each part is easier to test and debug, and its edge conditions are easier to check.  You can be certain you are covering all the bases more readily.  Of course, using one of the testing frameworks helps here.
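For a small piece of code, a script-based test can be enough. Here is a sketch, assuming a hypothetical file named test_clamp.m and a made-up clamp helper; each %% section is one test, and you would run the file with runtests("test_clamp").

```matlab
% test_clamp.m (hypothetical file name)
% code before the first section is shared by all the tests
clamp = @(x, lo, hi) min(max(x, lo), hi);

%% Test value inside the range is unchanged
assert(clamp(0.5, 0, 1) == 0.5)

%% Test value below the range
assert(clamp(-3, 0, 1) == 0)

%% Test value above the range
assert(clamp(7, 0, 1) == 1)
```

Because the helper does one thing, three short tests cover every path through it.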

Each piece is easier to reuse

Since we're working with smaller units, it's generally easier to reuse the pieces because, I hope, we've split things up in such a way that the interfaces are simpler to use.  With each piece small, there's usually only a limited number of inputs required.

Each piece is easier to read and understand

With smaller input argument lists, I hope the calling syntax is shorter.  If you are calling functions that take optional arguments from within your function, then the more complex the code, the more likely it is to be indented further and further to the right.  Either some code ends up out of view in your window, or the argument list goes on for several lines, especially if you are using the new name=value syntax.  In that case, each "input" is potentially long, possibly causing extra line continuations to fit within the editor window settings you have chosen.

Example

I have an example from the file exchange, X Steam.  If you look at the code, there are lots of switch/case statements with if-else statements inside.  How can you easily debug something near a particular line?  It's hard!  And that's despite the code not being poorly written or structured apart from this nesting.

checkcode Xsteam.m -cyc

L 159 (C 14-19): The McCabe cyclomatic complexity of 'XSteam' is 322.
L 1430 (C 4-6): The value assigned to variable 'err' might be unused.
L 1441 (C 18-22): The McCabe cyclomatic complexity of 'v1_pT' is 2.
L 1457 (C 18-22): The McCabe cyclomatic complexity of 'h1_pT' is 2.
L 1473 (C 18-22): The McCabe cyclomatic complexity of 'u1_pT' is 2.
L 1491 (C 18-22): The McCabe cyclomatic complexity of 's1_pT' is 2.
L 1509 (C 19-24): The McCabe cyclomatic complexity of 'Cp1_pT' is 2.
L 1525 (C 19-24): The McCabe cyclomatic complexity of 'Cv1_pT' is 2.
L 1547 (C 18-22): The McCabe cyclomatic complexity of 'w1_pT' is 2.
L 1569 (C 18-22): The McCabe cyclomatic complexity of 'T1_ph' is 2.
L 1584 (C 18-22): The McCabe cyclomatic complexity of 'T1_ps' is 2.
L 1599 (C 18-22): The McCabe cyclomatic complexity of 'p1_hs' is 2.
L 1614 (C 20-26): The McCabe cyclomatic complexity of 'T1_prho' is 3.
L 1634 (C 18-22): The McCabe cyclomatic complexity of 'v2_pT' is 2.
L 1638 (C 1-2): The value assigned to variable 'J0' might be unused.
L 1639 (C 1-2): The value assigned to variable 'n0' might be unused.
L 1653 (C 18-22): The McCabe cyclomatic complexity of 'h2_pT' is 3.
L 1675 (C 18-22): The McCabe cyclomatic complexity of 'u2_pT' is 3.
L 1701 (C 18-22): The McCabe cyclomatic complexity of 's2_pT' is 3.
L 1727 (C 19-24): The McCabe cyclomatic complexity of 'Cp2_pT' is 3.
L 1749 (C 19-24): The McCabe cyclomatic complexity of 'Cv2_pT' is 3.
L 1777 (C 18-22): The McCabe cyclomatic complexity of 'w2_pT' is 3.
L 1805 (C 18-22): The McCabe cyclomatic complexity of 'T2_ph' is 8.
L 1857 (C 18-22): The McCabe cyclomatic complexity of 'T2_ps' is 8.
L 1912 (C 18-22): The McCabe cyclomatic complexity of 'p2_hs' is 8.
L 1966 (C 18-24): The McCabe cyclomatic complexity of 'T2_prho' is 4.
L 1991 (C 20-26): The McCabe cyclomatic complexity of 'p3_rhoT' is 2.
L 2000 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2011 (C 20-26): The McCabe cyclomatic complexity of 'u3_rhoT' is 2.
L 2020 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2031 (C 20-26): The McCabe cyclomatic complexity of 'h3_rhoT' is 2.
L 2040 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2053 (C 20-26): The McCabe cyclomatic complexity of 's3_rhoT' is 2.
L 2062 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2075 (C 21-28): The McCabe cyclomatic complexity of 'Cp3_rhoT' is 2.
L 2084 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2102 (C 21-28): The McCabe cyclomatic complexity of 'Cv3_rhoT' is 2.
L 2111 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2121 (C 20-26): The McCabe cyclomatic complexity of 'w3_rhoT' is 2.
L 2130 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2148 (C 18-22): The McCabe cyclomatic complexity of 'T3_ph' is 4.
L 2182 (C 18-22): The McCabe cyclomatic complexity of 'v3_ph' is 4.
L 2216 (C 18-22): The McCabe cyclomatic complexity of 'T3_ps' is 4.
L 2249 (C 18-22): The McCabe cyclomatic complexity of 'v3_ps' is 4.
L 2283 (C 18-22): The McCabe cyclomatic complexity of 'p3_hs' is 4.
L 2318 (C 18-22): The McCabe cyclomatic complexity of 'h3_pT' is 5.
L 2348 (C 20-26): The McCabe cyclomatic complexity of 'T3_prho' is 3.
L 2369 (C 17-20): The McCabe cyclomatic complexity of 'p4_T' is 1.
L 2380 (C 17-20): The McCabe cyclomatic complexity of 'T4_p' is 1.
L 2391 (C 17-20): The McCabe cyclomatic complexity of 'h4_s' is 9.
L 2449 (C 17-20): The McCabe cyclomatic complexity of 'p4_s' is 4.
L 2462 (C 18-22): The McCabe cyclomatic complexity of 'h4L_p' is 5.
L 2488 (C 18-22): The McCabe cyclomatic complexity of 'h4V_p' is 5.
L 2513 (C 18-22): The McCabe cyclomatic complexity of 'x4_ph' is 3.
L 2525 (C 18-22): The McCabe cyclomatic complexity of 'x4_ps' is 4.
L 2541 (C 18-22): The McCabe cyclomatic complexity of 'T4_hs' is 15.
L 2606 (C 18-22): The McCabe cyclomatic complexity of 'h5_pT' is 3.
L 2630 (C 18-22): The McCabe cyclomatic complexity of 'v5_pT' is 2.
L 2634 (C 1-3): The value assigned to variable 'Ji0' might be unused.
L 2635 (C 1-3): The value assigned to variable 'ni0' might be unused.
L 2650 (C 18-22): The McCabe cyclomatic complexity of 'u5_pT' is 3.
L 2675 (C 19-24): The McCabe cyclomatic complexity of 'Cp5_pT' is 3.
L 2698 (C 18-22): The McCabe cyclomatic complexity of 's5_pT' is 3.
L 2724 (C 19-24): The McCabe cyclomatic complexity of 'Cv5_pT' is 3.
L 2752 (C 18-22): The McCabe cyclomatic complexity of 'w5_pT' is 3.
L 2781 (C 18-22): The McCabe cyclomatic complexity of 'T5_ph' is 3.
L 2798 (C 18-22): The McCabe cyclomatic complexity of 'T5_ps' is 3.
L 2814 (C 18-24): The McCabe cyclomatic complexity of 'T5_prho' is 3.
L 2836 (C 22-30): The McCabe cyclomatic complexity of 'region_pT' is 15.
L 2867 (C 22-30): The McCabe cyclomatic complexity of 'region_ph' is 18.
L 2953 (C 22-30): The McCabe cyclomatic complexity of 'region_ps' is 16.
L 3008 (C 22-30): The McCabe cyclomatic complexity of 'region_hs' is 33.
L 3168 (C 24-34): The McCabe cyclomatic complexity of 'Region_prho' is 17.
L 3238 (C 19-24): The McCabe cyclomatic complexity of 'B23p_T' is 1.
L 3245 (C 19-24): The McCabe cyclomatic complexity of 'B23T_p' is 1.
L 3255 (C 20-26): The McCabe cyclomatic complexity of 'p3sat_h' is 2.
L 3271 (C 20-26): The McCabe cyclomatic complexity of 'p3sat_s' is 2.
L 3285 (C 19-24): The McCabe cyclomatic complexity of 'hB13_s' is 2.
L 3299 (C 20-26): The McCabe cyclomatic complexity of 'TB23_hs' is 2.
L 3319 (C 29-44): The McCabe cyclomatic complexity of 'my_AllRegions_pT' is 13.
L 3348 (C 1-2): The value assigned to variable 'ps' might be unused.
L 3365 (C 29-44): The McCabe cyclomatic complexity of 'my_AllRegions_ph' is 14.
L 3408 (C 1-2): The value assigned to variable 'ps' might be unused.
L 3426 (C 21-28): The McCabe cyclomatic complexity of 'tc_ptrho' is 8.
L 3444 (C 7): Consider using newline, semicolon, or comma before this statement for readability.
L 3444 (C 9-10): Terminate statement with semicolon to suppress output (in functions).
L 3467 (C 30-46): The McCabe cyclomatic complexity of 'Surface_Tension_T' is 3.
L 3485 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_p' is 1.
L 3488 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_p' is 1.
L 3491 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_T' is 1.
L 3494 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_T' is 1.
L 3497 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_h' is 1.
L 3499 (C 26-37): The McCabe cyclomatic complexity of 'fromSIunit_h' is 1.
L 3501 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_v' is 1.
L 3503 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_v' is 1.
L 3505 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_s' is 1.
L 3507 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_s' is 1.
L 3509 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_u' is 1.
L 3511 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_u' is 1.
L 3513 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_Cp' is 1.
L 3515 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_Cp' is 1.
L 3517 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_Cv' is 1.
L 3519 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_Cv' is 1.
L 3521 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_w' is 1.
L 3523 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_w' is 1.
L 3525 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_tc' is 1.
L 3527 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_tc' is 1.
L 3529 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_st' is 1.
L 3531 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_st' is 1.
L 3533 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_x' is 1.
L 3535 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_x' is 1.
L 3537 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_vx' is 1.
L 3539 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_vx' is 1.
L 3541 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_my' is 1.
L 3543 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_my' is 1.
L 3550 (C 16-20): The McCabe cyclomatic complexity of 'check' is 28.
L 3570 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3571 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3581 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3582 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3592 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3593 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3605 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3606 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3626 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3627 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3637 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3638 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3648 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3649 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3660 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3661 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3681 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3682 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3692 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3693 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3703 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3704 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3714 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3715 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3725 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3726 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3736 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3737 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3747 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3748 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3755 (C 1-2): The preallocated value assigned to variable 'R3' might be unused.
L 3757 (C 5-6): The variable 'R4' appears to change size on every loop iteration. Consider preallocating for speed.
L 3759 (C 11): Terminate statement with semicolon to suppress output (in functions).
L 3760 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3764 (C 1-2): The preallocated value assigned to variable 'R3' might be unused.
L 3768 (C 11): Terminate statement with semicolon to suppress output (in functions).
L 3769 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3773 (C 1-2): The value assigned to variable 'R3' might be unused.
L 3777 (C 11): Terminate statement with semicolon to suppress output (in functions).
L 3778 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3798 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3799 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3809 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3810 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3820 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3821 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3835 (C 17-23): If you are operating on scalar values, consider using STR2DOUBLE for faster performance.
L 3835 (C 45-51): If you are operating on scalar values, consider using STR2DOUBLE for faster performance.
L 3836 (C 5-9): The value assigned to variable 'Check' might be unused.
L 3836 (C 10): Terminate statement with semicolon to suppress output (in functions).
L 3838 (C 9-11): The value assigned to variable 'err' might be unused.

With this many paths through the code, 322, what are the chances that there are no issues, despite the refactored functions?  If I were using this for some work I wanted to publish, I would need to make sure that all the paths I used were correctly computing what I need.  Since that's a hassle, I'd likely refactor the code.
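A report this long is easier to act on if you rank it. Here is a sketch that captures the checkcode output in a variable, pulls the function names and complexity values out of the messages, and sorts them. It runs on a tiny demo function written to a temporary file (not on XSteam), and the regular expression assumes the message wording matches the report above.

```matlab
% write a tiny demo function to a temporary file (hypothetical example)
src = fullfile(tempdir, 'cyc_demo.m');
fid = fopen(src, 'w');
fprintf(fid, '%s\n', ...
    'function y = cyc_demo(x)', ...
    'if x > 0', '    y = 1;', ...
    'elseif x < 0', '    y = -1;', ...
    'else', '    y = 0;', 'end', 'end');
fclose(fid);

% collect the messages in a struct array instead of printing them
info = checkcode(src, '-cyc');
msgs = {info.message};

% keep only the complexity messages and split out name and value
tok = regexp(msgs, 'complexity of ''(\w+)'' is (\d+)', 'tokens', 'once');
tok = tok(~cellfun(@isempty, tok));
names = string(cellfun(@(t) t{1}, tok, 'UniformOutput', false));
complexity = cellfun(@(t) str2double(t{2}), tok);

% list the most complex functions first
report = sortrows(table(names(:), complexity(:), ...
    'VariableNames', {'Function', 'Complexity'}), 'Complexity', 'descend');
disp(report)

delete(src)   % clean up the temporary file
```

Pointed at a large file, the top few rows of the table tell you where refactoring is likely to pay off first.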

Refactor - how?

Apart from going in the code and copy/pasting to elsewhere (could be in the same file), you can use the tools in the editor toolstrip or from the right-click context menu once you've made your selection.

How do you deal with your piles of code?

Do you let the code rule you or do you rule your code?  Please post any additional techniques or benefits I have not mentioned right here.

Copyright 2021 The MathWorks, Inc.

Symmetry, tessellations, golden mean, 17, and patterns

Loren Shure — Sat, 23 Oct 2021 11:03:40 +0000

Seventeen?  Why 17?  Well, as a high school student, I attended HCSSIM, a summer program for students interested in math.  There we learned all kinds of math you don't typically learn about until much later in your studies.  One of the reference books was Calculus on Manifolds by Michael Spivak.  Inside, you learn some of the mysteries of algebra, and, if you read carefully, you will find references to both yellow pigs and the number 17.  I leave it as a challenge to you to learn more about either or both if you are interested.

As I went to college, the number 17 was a part of my life.  Looking through the course catalogue before my first semester, I saw an offering something like "the seventeen regular tilings of the plane", and I signed up.  And isn't it cool that all of these patterns are displayed in tiles within the Alhambra!  I leave you to search the many sites with pictures and drawings of these.

I enjoy the artwork of Rafael Araujo.  If you have watched any webinars I have delivered during 2020-2021, you may notice a piece of Araujo's hanging in the background.  The basis for much of his work is the golden mean (or golden ratio).  Here's a place where you can explore the influence of math on art. 

So what is the golden mean?

It's defined as the positive solution to

$\phi^2 = \phi + 1$

and the value, typically denoted by the Greek letter ϕ, is

$\phi = \frac{1+\sqrt{5}}{2}$

or approximately 

phi = (1+sqrt(5))/2

phi = 1.6180

And there are claims that this ratio is universally(?) pleasing.  You can see approximations to it show up in everyday life.  In the US, we use note cards that are 5x3".

ratio5to3 = 5/3

ratio5to3 = 1.6667

So, close.

plot(0:(5/3):5,0:3,'.')

title("Not quite the golden ratio: " + ratio5to3)

axis equal

axis tight

I have written several blogs that show ways to compute Fibonacci numbers, also related to the golden mean.  Why?  Because ratios of successive Fibonacci numbers converge to the golden mean

$\lim_{n\to\infty} \frac{F_{n+1}}{F_n} = \phi$
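You can watch this convergence numerically. Here is a quick sketch that builds the first 30 Fibonacci numbers with a loop (the choice of 30 terms is arbitrary) and compares the ratios of successive terms with phi.

```matlab
num_terms = 30;                    % arbitrary number of terms
F = ones(1, num_terms);            % F(1) = F(2) = 1

for n = 3:num_terms
    F(n) = F(n-1) + F(n-2);
end

ratios = F(2:end) ./ F(1:end-1);   % F(n+1)/F(n)

phi = (1+sqrt(5))/2;
err = abs(ratios - phi);           % distance from the golden mean
```

The error shrinks quickly, and the ratios alternate between landing above and below phi as they close in.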

History

In the 1990s, we held several MATLAB User Conferences.  In 1997, I gave a talk on Programming Patterns in MATLAB.  I had 17 of them available, but time to only discuss 6 of them.  The regular tilings of the plane seemed like a cool way to categorize and clump together some of the programming patterns I wanted to talk about.  I thought it would be interesting to revisit many of these and see how well they held up over time.  So that's my plan for some of the upcoming posts, though I feel no compulsion to do all of them or in the order they showed up in my original talk.

First pattern - data duplication in service of mathematical operations

Steve Eddins wrote two posts on this topic in 2016: one and two.  And I wrote one as well, on performance implications.

Part 1

When I first started at MathWorks (1987), MATLAB had only double matrices and no other data types or dimensions.  If I wanted to remove the mean of each column of data in a matrix, I would do something like this.

A(4,4) = 0;

A(:) = randperm(16)

A = 4×4
     7     3    11     1
     6    10     5    13
    12     8     4     2
    14    15    16     9

Here I'll calculate the mean of each column.

meanAc = mean(A)

meanAc = 1×4
    9.7500    9.0000    9.0000    6.2500

Then I needed to create an array from meanAc that was the same size as A in order to subtract the means.   Originally, we did this by matrix multiplication.

Ameans1 = ones(4,1)*meanAc

Ameans1 = 4×4
    9.7500    9.0000    9.0000    6.2500
    9.7500    9.0000    9.0000    6.2500
    9.7500    9.0000    9.0000    6.2500
    9.7500    9.0000    9.0000    6.2500

And now I can do the subtraction.

Ameanless1 = A-Ameans1

Ameanless1 = 4×4
   -2.7500   -6.0000    2.0000   -5.2500
   -3.7500    1.0000   -4.0000    6.7500
    2.2500   -1.0000   -5.0000   -4.2500
    4.2500    6.0000    7.0000    2.7500

I then met a customer at my first ICASSP conference (in Phoenix, AZ), Tony, and he asked why I was not using indexing instead - because I never thought about it!  This is cool because I didn't need to do arithmetic to get my expanded mean matrix.

Ameans2 = meanAc(ones(1,4),:)

Ameans2 = 4×4
    9.7500    9.0000    9.0000    6.2500
    9.7500    9.0000    9.0000    6.2500
    9.7500    9.0000    9.0000    6.2500
    9.7500    9.0000    9.0000    6.2500

isequal(Ameans1, Ameans2)

ans = logical
   1

That was all well and good - but potentially not so easy to remember each time you might need it.  

Part 2

In 1996, we had heard plenty from customers that we were making something simple a little too difficult.  And, we were very close to introducing ND arrays, where we wanted to be able to do similar operations in any chosen dimension(s).  So we introduced a new function, repmat.

Now I can remove the column means with code that is easier to read, in my opinion.

Ameanlessr = A - repmat(mean(A),[4,1])

Ameanlessr = 4×4
   -2.7500   -6.0000    2.0000   -5.2500
   -3.7500    1.0000   -4.0000    6.7500
    2.2500   -1.0000   -5.0000   -4.2500
    4.2500    6.0000    7.0000    2.7500

isequal(Ameanless1, Ameanlessr)

ans = logical
   1

Part 3

By 2006, we had a lot of evidence that handling really large data was important for many of our customers, and likely to be an increasing demand.  Up until then, we always created an intermediate matrix the same size as our original one, A, in order to calculate the result.  But this wasn't strictly necessary -- we just need some syntax -- a way to express that all the rows (or columns) would be the same.  Now, of course we need a matrix the same size as A for the answer.  But how many more arrays of that size did we need along the way?  Along came the function, gloriously named bsxfun (standing for binary singleton expansion), and we could perform the computation without fully forming the mxn matrix to subtract from the original.

Ameanlessb = bsxfun(@minus, A, mean(A))

Ameanlessb = 4×4
   -2.7500   -6.0000    2.0000   -5.2500
   -3.7500    1.0000   -4.0000    6.7500
    2.2500   -1.0000   -5.0000   -4.2500
    4.2500    6.0000    7.0000    2.7500

isequal(Ameanless1, Ameanlessb)

ans = logical
   1

Part 4

Finally, in 2016, we decided that the meaning was clear even if it wasn't strictly linear algebra, and we now allow many operations to take advantage of implicit expansion of singleton dimensions.  What this means for us with this problem is now we can simply say

Ameanless2016 = A - mean(A)

Ameanless2016 = 4×4
   -2.7500   -6.0000    2.0000   -5.2500
   -3.7500    1.0000   -4.0000    6.7500
    2.2500   -1.0000   -5.0000   -4.2500
    4.2500    6.0000    7.0000    2.7500

isequal(Ameanless1, Ameanless2016)

ans = logical
   1
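Implicit expansion is not limited to subtracting a row vector from a matrix. As a hedged sketch (the matrix values and variable names here are just for illustration), the same syntax removes row means when the mean is taken along dimension 2, and it combines a column vector with a row vector into a full matrix.

```matlab
A = [ 7  3 11  1
      6 10  5 13
     12  8  4  2
     14 15 16  9];

colCentered = A - mean(A);        % 1-by-4 means expand down the rows
rowCentered = A - mean(A, 2);     % 4-by-1 means expand across the columns

% A column and a row expand against each other to give a 3-by-4 result.
additionTable = (1:3)' + (1:4);

% The older bsxfun form gives the same column-centered result.
sameAsBsxfun = isequal(colCentered, bsxfun(@minus, A, mean(A)));
```

The sizes only need to be compatible: in each dimension they either match or one of them is 1.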

Conclusion

I do not expect a Part 5 to come along in 2026, though of course I could be wrong!

Did any of you attend the conferences in 1993, 1995, and 1997?  Share your memories here!

Copyright 2021 The MathWorks, Inc.

What Do MATLAB and Games Have in Common?

Loren Shure — Tue, 21 Sep 2021 11:32:00 +0000

Today I want to introduce you to Jake Mitchell, a MATLAB user I knew of and was recently reminded of. Jake is a mechanical engineering major who is interested in data science. He uses MATLAB to explore strategies and positions in various games, and then writes about it.  As he does, he shows the core code for the way pieces move and the game unfolds.

Games with Simple Rules

Jake has really nice commentary about possible strategies, based on simulating many, many plays of each game.  In some cases, he also applies machine learning techniques to enable a machine to learn to play, such as tic-tac-toe.  He's got an algorithm for playing Connect-4 and a fun post on Chutes and Ladders.  And he explores games ranging from ones that appear simple, or at least have simple governing rules, to ones that have much more nuance.

Games with More Complexity

I learned to play Settlers of Catan probably 15 years ago.  And I still play occasionally.  Now I will be armed with more strategic knowledge after reading Jake on How I Built the Best Catan Board.

Perhaps my favorite is Jake's analysis of the value of Monopoly properties.  He goes into all the different property types, adding houses and hotels, plus utilities and railroads.  And don't forget about going to jail!  I like the way Jake presents the results as well, sometimes in tables and sometimes in plots.  Here's a plot Jake allowed me to copy, showing the effects of houses and hotels on reaching break-even on the investment.  Plus I really like that he uses the same colors as the Monopoly game so you can easily tell which group of properties is which.

I also like that he delves into the ins and outs of Boardwalk and Park Place!

Games, Anyone?

Have you made or analyzed games using MATLAB?  Clearly some people have, when I check out the File Exchange, or with this search.  If you have, please share with us here!

Copyright 2021 The MathWorks, Inc.

A faster and more accurate sum

Loren Shure — Tue, 07 Sep 2021 12:13:33 +0000

Today's guest blogger is Christine Tobler, who's a developer at MathWorks working on core numeric functions.

Hi everyone! I'd like to tell you a story about round-off error, the algorithm used in sum, and compatibility issues. Over the last few years, my colleague Bobby Cheng has made changes to sum to make it both more accurate and faster, and I thought it would be an interesting story to tell here.

Table of Contents

Numerical issues in sum
Different ways of computing the sum
Making changes to sum
References about accurately computing the sum
Helper Functions

Numerical issues in sum

Even for a simple function like sum, with floating-point numbers the order in which we sum them up matters:

x = (1 + 1e-16) + 1e-16

x = 1

y = 1 + (1e-16 + 1e-16)

y = 1.0000

x - y

ans = -2.2204e-16

So the result of the sum command depends on the order in which the inputs are added up. The sum command in MATLAB started out with the most straight-forward implementation: Start at the first number and add them up one at a time. I have implemented that in sumOriginal at the bottom of this post.

But we ran into a problem: For single precision, the results were sometimes quite wrong:

sumOfOnes = sumOriginal(ones(1e8, 1, 'single'))

sumOfOnes = single
    16777216

What's going on here? The issue is that for this number, the round-off error is larger than 1, as we can see by calling eps

eps(sumOfOnes)

ans = single
    2

Therefore, adding another 1 and then rounding to single precision returns the same number again, as the exact result is rounded down to fit into single precision:

sumOfOnes + 1

ans = single
    16777216

This was particularly noticeable when sum was used to compute the mean of a large array, since the mean computed could be smaller than all the elements of the input array.
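To make that last point concrete, here is a minimal sketch of a mean computed with a naive running sum in single precision. The array length of 2e7 is an assumption chosen only so that the running total passes flintmax('single'); the loop mirrors the one-at-a-time approach described above rather than MATLAB's current sum.

```matlab
n = 2e7;                          % more elements than flintmax('single')
x = ones(n, 1, 'single');

s = single(0);
for ii = 1:n
    s = s + x(ii);                % stops growing once s reaches 16777216
end

limitSingle = flintmax('single'); % largest consecutive integer in single
naiveMean = s / n;                % smaller than every element of x
```

Every element of x is 1, yet naiveMean comes out noticeably below 1 because the running total stalls at flintmax('single').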

Different ways of computing the sum

Now we already had a threaded version of sum in MATLAB, where we would compute the sum of several chunks of an array in parallel, and then add those up in the end. It turned out that this version didn't have the same issues:

numBlocks = 8;

sumInBlocks(ones(1e8, 1, 'single'), numBlocks)

ans = single
    100000000

We have made this change in MATLAB's sum even for the non-threaded case, and it addresses this and other similar cases. But keep in mind that while this is an improvement for many cases, it's not a perfect algorithm and still won't always give the correct answer. We could modify the number of blocks that we split the input into, and not every choice works out well:

sumInBlocks(ones(1e8, 1, 'single'), 4)

ans = single
    67108864

sumInBlocks(ones(1e8, 1, 'single'), 128)

ans = single
    99999832

Let's look at another case where we're not just summing up the number one (still summing up the same number every time, because it makes it easier to compare with the exact value). We'll be looking at the relative error of the result here, which is more relevant than whether we get the exact integer correct:

x = repmat(single(3155), 54194, 1);

exactSum = 3155*54194;

numBlocks = 2.^(0:9)

numBlocks = 1×10
     1     2     4     8    16    32    64   128   256   512

err = zeros(1, length(numBlocks));

for i=1:length(numBlocks)

    err(i) = abs(sumInBlocks(x, numBlocks(i)) - exactSum) / exactSum;

end

loglog(numBlocks, err)

xlabel('Number of blocks')

ylabel('Relative error in sumInBlocks function')

So there's definitely a balancing act for choosing the exact number of blocks, with the plot above just representing one dataset.

There are other possible algorithms for computing the sum; I'm including some links at the bottom. Some more complicated ones would result in a slowdown in computation time, which would put a burden on cases that aren't experiencing the issues with accuracy described above.

We chose this "computing by blocks" algorithm because the change could be made in such a way that sum became faster. This was done using loop-unrolling: Instead of adding one number at a time, several numbers are added to separate running totals at the same time. I've included an implementation in sumLoopUnrolled below.

By the way: I'm not explicitly giving the choice of numBlocks we made, because we wouldn't want anyone to rely on this - it might change again in the future, after all.

Making changes to sum

In R2017b, we first shipped this new version of sum, for the case of single-precision numbers only. All practical issues with results being off by magnitudes for large arrays of single data had been addressed, and the function even got faster at the same time. However, we also got quite some feedback from people who were unhappy about the behavior change; while the new behavior gives as good or better results, their code was relying on the old behavior and adjusting to the new behavior was painful.

While we don't want to get stuck in never being able to improve our functions for fear of breaking a dependency, we definitely want to make the process of updating as pain-free as possible. Generally speaking, we aim for run-to-run reproducibility: If a function is called twice with the same inputs, the outputs will be the same. However, if the machine, OS or MATLAB version (among other externals) change, this can also change the exact value that's returned by MATLAB, within round-off error. It's common for the output of matrix multiplication to change for performance' sake, for example. But with sum, no changes had been made for a long time, and so more people had come to rely on its exact behavior than we would have expected.

So after making this change to sum for single values, we waited a few releases to evaluate customer feedback on the change. In R2020b, we added the same behavior for all other datatypes. This time, we added a Release Note that describes both the performance improvement and mentions the change as a "Compatibility Consideration". While we still had some feedback on this being an inconvenient change, it was less than in the first case.

Have you had changes in round-off error with a new MATLAB release break some of your code? Do you find the compatibility considerations in the release notes useful? Please let us know in the comments.

References about accurately computing the sum

Nick Higham, "What is stochastic rounding?", Blog post from 7/7/2020.
Blanchard, Pierre, Nicholas J. Higham, and Theo Mary. "A class of fast and accurate summation algorithms." SIAM Journal on Scientific Computing 42, no. 3 (2020): A1541-A1557.

Helper Functions

function s = sumOriginal(x)

    s = 0;

    for ii=1:length(x)

        s = s + x(ii);

    end

end

function s = sumInBlocks(x, numBlocks)

    len = length(x);

    blockLength = ceil(len / numBlocks);

    s = sumOriginal(x(1:blockLength));

    iter = blockLength;

    while iter < len

        s = s + sumOriginal(x(iter+1:min(iter+blockLength, len)));

        iter = iter + blockLength;

    end

end

function s = sumLoopUnrolled(x) %#ok 

    % Example of loop unrolling for numBlocks == 4. For simplicity, we assume

    % the length of x is divisible by 4.

    %

    % Note this technique is faster in built-in code compiled with the right

    % compiler flags. It won't necessarily be faster in MATLAB code like this.

    s1 = 0;

    s2 = 0;

    s3 = 0;

    s4 = 0;

    for ii=1:4:length(x)

        s1 = s1 + x(ii);

        s2 = s2 + x(ii+1);

        s3 = s3 + x(ii+2);

        s4 = s4 + x(ii+3);

    end

    s = s1 + s2 + s3 + s4;

end

Copyright 2021 The MathWorks, Inc.

Cone Programming and Optimal Discrete Dynamics

Loren Shure — Wed, 18 Aug 2021 12:17:07 +0000

Today's guest blogger is Alan Weiss, who writes documentation for Optimization Toolbox and other mathematical toolboxes.

Table of Contents

Cone Programming
Discrete Dynamics With Cone Constraints
Find Optimal Time
Final Thoughts
Helper Functions

Cone Programming

Hi, folks. The subject for today is cone programming, and an application of cone programming to controlling a rocket optimally. Since R2020b the coneprog solver has been available to solve cone programming problems. What is cone programming? I think of it as a generalization of quadratic programming. All quadratic programming problems can be represented as cone programming problems. But there are cone programming problems that cannot be represented as quadratic programs.

So again, what is cone programming? It is a problem with a linear objective function and linear constraints, like a linear program or quadratic program. But it also incorporates cone constraints. In three dimensions [x, y, z], you can represent a cone as, for example, the radius of a circle in the x-y direction is less than or equal to z. In other words, the cone constraint is the inequality constraint

$ x^2+y^2\le z^2 $,

or equivalently

$ \|[x,y]\|\le z $ for nonnegative z.

Here is a picture of the boundary of the cone $ \|[x,y]\|\le z $ for nonnegative z.

[X,Y] = meshgrid(-2:0.1:2);

Z = sqrt(X.^2 + Y.^2);

surf(X,Y,Z)

view(8,2)

xlabel("x")

ylabel("y")

zlabel("z")

Of course, you can scale, translate, and rotate a cone constraint. The formal definition of a general cone constraint uses a matrix Asc, vectors bsc and d, and scalar gamma with the constraint in x represented as

norm(Asc*x - bsc) <= d'*x - gamma;

The coneprog solver in Optimization Toolbox requires you to use the secondordercone function to formulate cone constraints. For example,

Asc = diag([1,1/2,0]);

bsc = zeros(3,1);

d = [0;0;1];

gamma = 0;

socConstraints = secondordercone(Asc,bsc,d,gamma);

f = [-1,-2,0];

Aineq = [];

bineq = [];

Aeq = [];

beq = [];

lb = [-Inf,-Inf,0];

ub = [Inf,Inf,2];

[x,fval] = coneprog(f,socConstraints,Aineq,bineq,Aeq,beq,lb,ub)

Optimal solution found.
x = 3×1
    0.4851
    3.8806
    2.0000
fval = -8.2462
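This small problem can also be checked by hand, which is a useful sanity test of the formulation. The sketch below is an illustration, not part of the solver workflow: it assumes the substitution u = x(1), w = x(2)/2, under which the objective becomes -(u + 4w) and the cone constraint becomes norm([u, w]) <= x(3) <= 2. By the Cauchy-Schwarz inequality, the best point lies on the circle of radius 2 in the direction [1, 4]. The names cvec, uw, xCheck, fvalCheck, and slack are made up for this example.

```matlab
cvec = [1, 4];                      % objective coefficients in (u, w)
uw = 2*cvec/norm(cvec);             % best point on the circle of radius 2
xCheck = [uw(1); 2*uw(2); 2];       % back to the original variables
fvalCheck = [-1, -2, 0]*xCheck;     % same f as in the coneprog call

% Evaluate the general cone constraint norm(Asc*x - bsc) <= d'*x - gamma.
Asc = diag([1, 1/2, 0]);
bsc = zeros(3, 1);
d = [0; 0; 1];
gamma = 0;
slack = (d'*xCheck - gamma) - norm(Asc*xCheck - bsc);  % about 0: constraint is active
```

The closed-form objective value is -2*sqrt(17), which agrees with the fval that coneprog reports to the displayed precision.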

It can be simpler to use the problem-based approach to access cone programming. This functionality was added in R2021a. For the previous example using the problem-based approach:

x = optimvar('x',3,"LowerBound",[-Inf,-Inf,0],"UpperBound",[Inf,Inf,2]);

Asc = diag([1,1/2,0]);

prob = optimproblem("Objective",-x(1)-2*x(2));

prob.Constraints = norm(Asc*x) <= x(3);

[sol,fval] = solve(prob)

Solving problem using coneprog.
Optimal solution found.
sol = struct with fields:
    x: [3×1 double]
fval = -8.2462

Notice that, unlike most nonlinear solvers, you do not need to specify an initial point for coneprog. This comes in handy in the following example.

Discrete Dynamics With Cone Constraints

Suppose that you want to control a rocket to land gently at a particular location using minimal fuel. Suppose that the fuel used is proportional to the applied acceleration times time. Do not model the changing weight of the rocket as you burn fuel; we are supposing that this control is for a relatively short time, where the weight does not change appreciably. There is gravitational acceleration g = 9.81 in the negative z direction. There is also linear drag on the rocket that acts in the negative direction of velocity with coefficient 1/10. This means after time t, without any applied acceleration or gravity, the velocity changes from v to $ v\exp(-t/10) $.

In continuous time the equations of motion for position $ p(t) $, velocity $ v(t) $, and applied acceleration $ a(t) $ are

$ \frac{dp}{dt} = v(t) $

$ \frac{dv}{dt} = -v(t)/10 + a(t) + g*[0,0,-1] $.

Here are some approximate equations of motion, using discrete time with N equal steps of length $ t = T/N $:

$ p(i+1) = p(i) + t*(v(i) + v(i+1))/2 $ (trapezoidal rule)

$ v(i+1) = v(i)*\exp(-t/10) + t*(a(i) + g*[0, 0, -1]) $ (Euler integration).

Therefore,

$ p(i+1) = p(i) + t*v(i)*(1 + \exp(-t/10))/2 + t^2*(a(i) + g*[0, 0, -1])/2 $.
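The combined position update follows from substituting the velocity update into the trapezoidal rule. Here is a minimal numerical check of that substitution for a single step. The position, velocity, and acceleration values are arbitrary sample numbers chosen for this sketch, and the variable names are assumptions.

```matlab
g = 9.81;
T = 40;
N = 100;
t = T/N;                            % step length

p = [-1000, -800, 1200];            % sample position at step i
v = [100, 50, -40];                 % sample velocity at step i
a = [1, -2, 3];                     % arbitrary applied acceleration

% Velocity update, then position by the trapezoidal rule.
vNext = v*exp(-t/10) + t*(a + g*[0, 0, -1]);
pTrap = p + t*(v + vNext)/2;

% Combined formula with v(i+1) eliminated.
pComb = p + t*v*(1 + exp(-t/10))/2 + t^2*(a + g*[0, 0, -1])/2;

maxDiff = max(abs(pTrap - pComb));  % agreement up to round-off
```

Because both forms are linear in p, v, and a, either one can be used as an equality constraint in the optimization problem.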

Now for the part that leads to cone programming. Suppose that the applied acceleration at each step is bounded by a constant Amax. These constraints are

$ \|a(i)\| \le {\rm Amax} $ for all i.

The cost to minimize should be the sum of the norms of the accelerations times t. Cone programming requires the objective function to be linear in optimization parameters. You can reformulate this cost to be linear by introducing new optimization variables s(i) that are subject to a new set of cone constraints:

$ {\rm cost} = \sum s(i)*t $

$ \|a(i)\| \le s(i) $.

Suppose that the rocket is traveling initially at velocity $ v0 = [100,50,-40] $ at position $ p0 = [-1000,-800,1200] $. Calculate the acceleration required to bring the rocket to position $ [0,0,0] $ with velocity $ [0,0,0] $ at time $ T = 40 $. Break up the calculation into 100 steps ($ t=40/100 $). Suppose that the maximum acceleration $ \rm{Amax} = 2g $.

The makeprob function at the end of this script accepts the time T, initial position p0, and initial velocity v0, and returns a problem that describes the discrete dynamics and cost.

p0 = [-1000,-800,1200];

v0 = [100,50,-40];

prob = makeprob(40,p0,v0)

prob = 
  OptimizationProblem with properties:

       Description: ''
    ObjectiveSense: 'minimize'
         Variables: [1×1 struct] containing 4 OptimizationVariables
         Objective: [1×1 OptimizationExpression]
       Constraints: [1×1 struct] containing 4 OptimizationConstraints

  See problem formulation with show.

Set options to solve the cone programming problem using an optimality tolerance 100 times smaller than the default. Use the "schur" linear solver, which can be more accurate for this problem.

opts = optimoptions("coneprog","OptimalityTolerance",1e-8,"LinearSolver","schur");

[sol,cost] = solve(prob,Options=opts)

Solving problem using coneprog.
Optimal solution found.
sol = struct with fields:
    a: [99×3 double]
    p: [100×3 double]
    s: [99×1 double]
    v: [100×3 double]
cost = 312.7740

The plottrajandaccel function at the end of this script plots both the trajectory and the norm of the acceleration as a function of time step.

plottrajandaccel(sol)

The optimal acceleration is nearly "bang-bang." The rocket accelerates at about $ 2g $ at first, then has close to zero acceleration until near the end.  Near the end, the rocket accelerates at maximum to slow the descent and land with zero velocity. The total cost of this control is about 313.

Find Optimal Time

Find the optimal time T for the rocket to land, meaning the time that causes the rocket to use the least possible fuel. The findT function at the end of this script calls fminbnd to locate the minimal-cost time. I experimented briefly to find that [20,60] is a reasonable range for times T for the minimum, and I used those bounds in the fminbnd call. If you take a time much less than 20 you get an infeasible problem:

badprob = makeprob(15,p0,v0);

badsol = solve(badprob,Options=opts)

Solving problem using coneprog.
Problem is infeasible.
badsol = struct with fields:
    a: []
    p: []
    s: []
    v: []

(As an aside, if you try to make T an optimization variable then the problem is no longer a coneprog problem. Instead, it is a problem for fmincon, which takes much longer to solve in this case, and requires you to provide an initial point.)

Topt = findT(opts)

Solving...
Done
Topt = 22.3294

Plot the optimal trajectory and acceleration.

probopt = makeprob(Topt,p0,v0);

[solopt,costopt] = solve(probopt,Options=opts)

Solving problem using coneprog.
Optimal solution found.
solopt = struct with fields:
    a: [99×3 double]
    p: [100×3 double]
    s: [99×1 double]
    v: [100×3 double]
costopt = 171.1601

plottrajandaccel(solopt)

The optimal cost is about 171, which is roughly half of the cost for the original parameters. This time, the control is more nearly bang-bang. The rocket accelerates at maximum at first, then stops accelerating for some time. Again, during the final times the rocket accelerates at maximum to land with zero velocity.

Final Thoughts

Cone programming is a surprisingly versatile framework for solving many convex optimization problems. For another nontrivial example, see Minimize Energy of Piecewise Linear Mass-Spring System Using Cone Programming, Problem-Based. For other problems that can be put in the cone programming framework, see Lobo, Miguel Sousa, Lieven Vandenberghe, Stephen Boyd, and Hervé Lebret. “Applications of Second-Order Cone Programming.” Linear Algebra and Its Applications 284, no. 1–3 (November 1998): 193–228. https://doi.org/10.1016/S0024-3795(98)10032-0

Do you find cone programming or discrete dynamics useful? Do you have any examples of your own to share? Let us know here.

Helper Functions

This code creates the makeprob function.

function trajectoryproblem = makeprob(T,p0,v0)

N = 100;

g = 9.81;

pF = [0 0 0];

Amax = 2*g;

p = optimvar("p",N,3);

v = optimvar("v",N,3);

a = optimvar("a",N-1,3);

s = optimvar("s",N-1,"LowerBound",0,"UpperBound",Amax);

trajectoryproblem = optimproblem;

t = T/N;

trajectoryproblem.Objective = sum(s)*t;

scons = optimconstr(N-1);

for i = 1:(N-1)

    scons(i) = norm(a(i,:)) <= s(i);

end

acons = optimconstr(N-1);

for i = 1:(N-1)

    acons(i) = norm(a(i,:)) <= Amax;

end

vcons = optimconstr(N+1,3);

vcons(1,:) = v(1,:) == v0;

vcons(2:N,:) = v(2:N,:) == v(1:(N-1),:)*exp(-t/10) + t*(a + repmat([0 0 -g],N-1,1));

vcons(N+1,:) = v(N,:) == [0 0 0];

pcons = optimconstr(N+1,3);

pcons(1,:) = p(1,:) == p0;

pcons(2:N,:) = p(2:N,:) == p(1:(N-1),:) + (1+exp(-t/10))/2*t*v(1:(N-1),:) + t^2/2*(a + repmat([0 0 -g],N-1,1));

pcons((N+1),:) = p(N,:) == pF;

trajectoryproblem.Constraints.acons = acons;

trajectoryproblem.Constraints.scons = scons;

trajectoryproblem.Constraints.vcons = vcons;

trajectoryproblem.Constraints.pcons = pcons;

end

This code creates the plottrajandaccel function.

function plottrajandaccel(sol)

figure

psol = sol.p;

p0 = psol(1,:);

pF = psol(end,:);

plot3(psol(:,1),psol(:,2),psol(:,3),'rx')

hold on

plot3(p0(1),p0(2),p0(3),'ks')

plot3(pF(1),pF(2),pF(3),'bo')

hold off

view([18 -10])

xlabel("x")

ylabel("y")

zlabel("z")

legend("Steps","Initial Point","Final Point")

figure

asolm = sol.a;

nasolm = sqrt(sum(asolm.^2,2));

plot(nasolm,"rx")

xlabel("Time step")

ylabel("Norm(acceleration)")

end

This code creates the fvalT function, which is used by findT.

function Fval = fvalT(T,opts)

p0 = [-1000,-800,1200];

v0 = [100,50,-40];

tprob = makeprob(T,p0,v0);

opts = optimoptions(opts,"Display","off");

[~,Fval] = solve(tprob,Options=opts);

end

This code creates the findT function.

function Tmin = findT(opts)

disp("Solving...")

Tmin = fminbnd(@(T)fvalT(T,opts),20,60);

disp("Done")

end

Copyright 2021 The MathWorks, Inc.

Finding the Optimal Value

Loren Shure — Tue, 03 Aug 2021 12:10:46 +0000

Have you ever needed to solve an optimization problem where there were local minima?  What strategy do you use to solve it, trying to find the "best" answer?  Today I'm going to talk about a simple strategy, readily available in the Global Optimization Toolbox.

Solve a Simple Problem

Or at least let's try.  I have some data and I want to fit a particular form of a curve to it.   First let's look at the pharmacokinetic data. Here's the reference: Parameter estimation in nonlinear algebraic models via global optimization. Computers & Chemical Engineering, Volume 22, Supplement 1, 15 March 1998, Pages S213-S220 William R. Esposito, Christodoulos A. Floudas.

The data are time vs. concentration

t = [ 3.92,  7.93, 11.89, 23.90, 47.87, 71.91, 93.85, 117.84 ]

t = 1×8
    3.9200    7.9300   11.8900   23.9000   47.8700   71.9100   93.8500  117.8400

c = [0.163, 0.679, 0.679, 0.388, 0.183, 0.125, 0.086, 0.0624 ]

c = 1×8
    0.1630    0.6790    0.6790    0.3880    0.1830    0.1250    0.0860    0.0624

I like to see the data, in part to be sure I have no entry mistakes, and in part to get a feel for the overall system.  In fact, let's visualize the data.

plot(t,c,'o')

xlabel('Time')

ylabel('Concentration')

3 Compartment Model

As in the reference, we fit a 3 compartment model, sum of 3 decaying exponentials.

$ c = b_1 e^{-b_4 t} + b_2 e^{-b_5 t} + b_3 e^{-b_6 t} $

and we can express that model as an anonymous function of t (time) and the model parameters [b(1) b(2) ... b(6)].

model = @(b,t) b(1)*exp(-b(4)*t) + b(2)*exp(-b(5)*t) + b(3)*exp(-b(6)*t)

model = function_handle with value:
    @(b,t)b(1)*exp(-b(4)*t)+b(2)*exp(-b(5)*t)+b(3)*exp(-b(6)*t)

Define Optimization Problem

We next define the optimization problem to solve using the problem-based formulation.  This allows us to choose the solver we want, supply the data, and naturally express constraints and options.

problem = createOptimProblem('lsqcurvefit', ...

    'objective', model, ...

    'xdata', t, 'ydata', c, ...

    'x0',ones(1,6),...

    'lb', [-10 -10 -10  0   0   0 ],...

    'ub', [ 10  10  10 0.5 0.5 0.5], ...

    'options',optimoptions('lsqcurvefit',...

    'OutputFcn', @curvefittingPlotIterates,...

    'Display','none'))

problem = struct with fields:
    objective: @(b,t)b(1)*exp(-b(4)*t)+b(2)*exp(-b(5)*t)+b(3)*exp(-b(6)*t)
           x0: [1 1 1 1 1 1]
        xdata: [3.9200 7.9300 11.8900 23.9000 47.8700 71.9100 93.8500 117.8400]
        ydata: [0.1630 0.6790 0.6790 0.3880 0.1830 0.1250 0.0860 0.0624]
           lb: [-10 -10 -10 0 0 0]
           ub: [10 10 10 0.5000 0.5000 0.5000]
       solver: 'lsqcurvefit'
      options: [1×1 optim.options.Lsqcurvefit]

Solve the Problem

First solve the problem directly once.

b = lsqcurvefit(problem)

b = 1×6
    0.1842    0.1836    0.1841    0.0172    0.0171    0.0171

You'll notice that the model does not do a stellar job fitting the data or even following the shape of the data.  
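One way to put a number on the quality of this fit is the sum of squared residuals, which is the quantity lsqcurvefit minimizes. The sketch below recomputes it from the rounded coefficients displayed above, so the value is only approximate. The names bSingle, residuals, and resnorm are assumptions made for this example.

```matlab
t = [ 3.92,  7.93, 11.89, 23.90, 47.87, 71.91, 93.85, 117.84 ];
c = [0.163, 0.679, 0.679, 0.388, 0.183, 0.125, 0.086, 0.0624 ];
model = @(b,t) b(1)*exp(-b(4)*t) + b(2)*exp(-b(5)*t) + b(3)*exp(-b(6)*t);

% Coefficients from the single lsqcurvefit run, as displayed (4 digits).
bSingle = [0.1842 0.1836 0.1841 0.0172 0.0171 0.0171];

residuals = model(bSingle, t) - c;  % model minus data at each time
resnorm = sum(residuals.^2);        % sum of squared residuals

plot(t, c, 'o', t, model(bSingle, t), '-')
xlabel('Time')
ylabel('Concentration')
legend('Data', 'Single-start fit')
```

The three exponentials have nearly identical decay rates, so the fitted curve behaves like a single decaying exponential and cannot follow the early rise in the data.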

MultiStart

Let's see if we can do better by starting at a bunch of different points.

ms = MultiStart;

ms.Display = 'iter';

rng default

figure

tic

[~,fval,exitflag,output,solutions] = run(ms, problem, 50)

    Run       Local       Local      Local    Local   First-order
   Index     exitflag      f(x)     # iter   F-count   optimality
       1         3        0.222         8        63     0.0006396
       2         3     0.000154        21       154      0.001864
       3         3     0.009442        44       315       0.01989
       4         3    1.462e-05        34       245      0.002586
       5         3    1.454e-05        19       140     1.079e-05
       6         3    1.475e-05        24       175      0.006883
       7         3     0.009445        50       357      0.002266
       8         3    1.495e-05        32       231      0.006853
       9         3    1.466e-05        35       252       0.00478
      10         3      0.00944        80       567       0.01042
      11         3    1.471e-05        40       287      0.005472
      12         3    1.566e-05        24       175      0.001576
      13         3     0.009439        24       175     0.0005121
      14         3     0.009451        41       294       0.02935
      15         3     0.009493        26       189      0.004837
      16         3    1.476e-05        40       287      0.006352
      17         3    1.494e-05        40       287      0.008288
      18         3     0.009446        62       441       0.01296
      19         3    1.457e-05        22       161      0.001755
      20         3    1.488e-05        53       378      0.007608
      21         3      0.00944        37       266      0.006878
      22         3    1.464e-05        24       175      0.003709
      23         3     0.009449        43       308       0.02515
      24         3     0.009447        47       336      0.007942
      25         3    1.455e-05        23       168      0.001621
      26         3     0.009442        32       231       0.01328
      27         3    1.479e-05        40       287      0.004821
      28         3    1.479e-05        18       133      0.006878
      29         3    1.456e-05        72       511     0.0009721
      30         3    1.455e-05        42       301      0.001122
      31         3     0.009441        47       336       0.01537
      32         3     0.009451        47       336      0.008942
      33         3    0.0001729        14       105     0.0003276
      34         3     0.009442        44       315       0.01062
      35         3    0.0001751        21       154      7.71e-05
      36         3    1.509e-05        26       189      0.009896
      37         3     0.009458        39       280       0.02208
      38         1    1.454e-05        24       175     7.815e-08
      39         3     0.009441        60       427        0.0107
      40         3    1.472e-05        34       245      0.002981
      41         3    1.503e-05        22       161       0.00585
      42         3      0.00952        15       112      0.008492
      43         3     0.009439        21       154      0.000769
      44         3    1.462e-05        64       455     0.0005576
      45         3     0.009439        17       126     0.0001567
      46         3    1.471e-05        30       217      0.001973
      47         3     0.009444        38       273       0.02022
      48         3    1.474e-05        24       175      0.004799
      49         3    1.522e-05        42       301      0.008228
      50         3     0.009445        40       287       0.02166

MultiStart completed the runs from all start points.

All 50 local solver runs converged with a positive local solver exit flag.
fval = 1.4540e-05
exitflag = 1
output = struct with fields:
                funcCount: 12726
         localSolverTotal: 50
       localSolverSuccess: 50
    localSolverIncomplete: 0
    localSolverNoSolution: 0
                  message: 'MultiStart completed the runs from all start points.↵↵All 50 local solver runs converged with a positive local solver exit flag.'
solutions = 1×50 object
  (array of 50 elements, each a 1×1 GlobalOptimSolution)

serialTime = toc;
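A couple of quick sanity checks can be read straight off the output structure. The sketch below rebuilds that structure by hand from the values displayed above, so it runs without the toolbox; with the real output returned by run, skip the first statement. The names successRate and avgEvals are illustrative only.

```matlab
% Hand-built copy of the displayed output struct (values taken from above).
output = struct('funcCount', 12726, ...
                'localSolverTotal', 50, ...
                'localSolverSuccess', 50);

% Fraction of local solver runs that ended with a positive exit flag.
successRate = output.localSolverSuccess / output.localSolverTotal;

% Average number of objective evaluations per local solver run.
avgEvals = output.funcCount / output.localSolverTotal;

fprintf('Success rate: %.0f%%, average F-count per run: %.2f\n', ...
        100*successRate, avgEvals);
```

For this run, that works out to about 254.5 function evaluations per start point.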

Visualize the Best Solution

The 50th solution, which is the one plotted above, is not necessarily the best one. Luckily for us, MultiStart orders the solutions from best to worst, so we need only look at the first one.

curvefittingPlotIterates(solutions)

You can now see that the 50th solution was not the best one: the mean squared error for the final solution displayed here is smaller by well over a factor of 10 (1.4540e-05 versus 0.009445 for run 50).
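To make the best-to-worst ordering concrete, here is a minimal sketch. It uses a plain struct array as a hypothetical stand-in for the GlobalOptimSolution array, so it runs without the toolbox, and fills it with the f(x) values for runs 47 to 50 from the serial table above. The sort is my own illustration of the idea, not MultiStart's internal code.

```matlab
% Hypothetical stand-in for the solutions array: a struct array that
% carries only an Fval field. Values are f(x) for runs 47-50 above.
localFvals = [0.009444, 1.474e-05, 1.522e-05, 0.009445];
runs = struct('Fval', num2cell(localFvals));

% Sort ascending by objective value to mimic best-to-worst ordering.
[~, order] = sort([runs.Fval]);
sorted = runs(order);

best  = sorted(1).Fval;    % smallest objective value comes first
worst = sorted(end).Fval;  % largest objective value comes last
fprintf('best = %.4g, worst = %.4g\n', best, worst);
```

With the real array, the same kind of field access applies: solutions(1).Fval is the best objective value and solutions(1).X is the corresponding parameter vector.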

MultiStart with Parallel Computing

Next, I will see whether we can improve performance by using all 4 of my cores as local parallel workers.

ms.UseParallel = true;

gcp;

tic;

rng default

run(ms, problem, 50);

Running the local solvers in parallel.

    Run       Local       Local      Local    Local   First-order
   Index     exitflag      f(x)     # iter   F-count   optimality
       1         3        0.222         8        63     0.0006396
       3      0.00944        80       567       0.01042
       9         3    1.466e-05        35       252       0.00478
       2         3     0.000154        21       154      0.001864
       3    1.476e-05        40       287      0.006352
       3     0.009493        26       189      0.004837
       3     0.009451        41       294       0.02935
       3         3     0.009442        44       315       0.01989
       3    1.464e-05        24       175      0.003709
       3      0.00944        37       266      0.006878
       3    1.488e-05        53       378      0.007608
       4         3    1.462e-05        34       245      0.002586
       8         3    1.495e-05        32       231      0.006853
       7         3     0.009445        50       357      0.002266
       6         3    1.475e-05        24       175      0.006883
       5         3    1.454e-05        19       140     1.079e-05
       3    1.479e-05        18       133      0.006878
       3    1.479e-05        40       287      0.004821
       3     0.009442        32       231       0.01328
       3    1.455e-05        23       168      0.001621
       3     0.009439        24       175     0.0005121
       3    1.566e-05        24       175      0.001576
       3    1.471e-05        40       287      0.005472
       3     0.009442        44       315       0.01062
       3    0.0001729        14       105     0.0003276
       3     0.009451        47       336      0.008942
       3    1.457e-05        22       161      0.001755
       3     0.009446        62       441       0.01296
       3    1.494e-05        40       287      0.008288
       3     0.009447        47       336      0.007942
       3     0.009449        43       308       0.02515
       3     0.009441        47       336       0.01537
       3    1.455e-05        42       301      0.001122
       3    1.456e-05        72       511     0.0009721
       3      0.00952        15       112      0.008492
       3     0.009439        17       126     0.0001567
       3     0.009458        39       280       0.02208
       3    1.509e-05        26       189      0.009896
       3    0.0001751        21       154      7.71e-05
       3    1.472e-05        34       245      0.002981
       3     0.009441        60       427        0.0107
       1    1.454e-05        24       175     7.815e-08
       3    1.503e-05        22       161       0.00585
       3    1.522e-05        42       301      0.008228
       3     0.009444        38       273       0.02022
       3    1.462e-05        64       455     0.0005576
       3     0.009439        21       154      0.000769
       3    1.474e-05        24       175      0.004799
       3     0.009445        40       287       0.02166
       3    1.471e-05        30       217      0.001973

MultiStart completed the runs from all start points.

All 50 local solver runs converged with a positive local solver exit flag.

parallelTime = toc;

Calculate Speedup

Speedup may not be evident until the second run because of pool startup time. Since I started my pool earlier, I get to see a decent speedup.

speedup = serialTime/parallelTime

speedup = 2.5014
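One way to put that number in context is parallel efficiency, the speedup divided by the number of workers. A quick sketch, assuming all 4 workers mentioned earlier took part in the parallel run (numWorkers and efficiency are illustrative names):

```matlab
speedup    = 2.5014;  % value reported above
numWorkers = 4;       % assumed: all 4 local cores were used as workers

% Parallel efficiency: fraction of the ideal (linear) speedup achieved.
efficiency = speedup / numWorkers;
fprintf('Parallel efficiency: %.1f%%\n', 100*efficiency);
```

An efficiency somewhat below 100% is not surprising here. The local solver runs vary quite a bit in length (the F-count column ranges from 63 to 567), and distributing the work likely adds some overhead.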

Do You Have Problems Where the Solution is Sensitive to the Starting Point?

Tell us about your exploration of the solution space for your problem. If the solution is sensitive to where you start, you might consider using MultiStart and other techniques from the Global Optimization Toolbox.
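If you don't have the toolbox handy, you can still get a rough feel for start-point sensitivity. The sketch below is not MultiStart. It is a plain loop around fminsearch (base MATLAB) launched from a handful of random start points, using the data and model from the appendix. The number of starts, the sampling range [0, 1], and the iteration limits are arbitrary assumptions chosen for illustration.

```matlab
% Data and model copied from showPlot in the appendix.
t = [  3.92,  7.93, 11.89, 23.90, 47.87, 71.91, 93.85, 117.84 ];
c = [ 0.163, 0.679, 0.679, 0.388, 0.183, 0.125, 0.086, 0.0624 ];
model = @(b,t) b(1)*exp(-b(4).*t) + b(2).*exp(-b(5).*t) + ...
    b(3).*exp(-b(6).*t);

% Objective: sum of squared residuals between model and data.
sse = @(b) sum((model(b,t) - c).^2);

rng default                  % reproducible start points
nStarts = 10;                % assumed; use more for a wider survey
starts = rand(nStarts, 6);   % assumed sampling range [0, 1]
opts = optimset('Display','off','MaxFunEvals',5000,'MaxIter',5000);

fvals = zeros(nStarts, 1);
for k = 1:nStarts
    % Each local search starts from a different point.
    [~, fvals(k)] = fminsearch(sse, starts(k,:), opts);
end

% Sort so the best local result comes first, then display all of them.
sortedFvals = sort(fvals);
disp(sortedFvals)
```

If the sorted values differ noticeably from one another, the local searches are ending in different places, which suggests the problem is sensitive to the starting point.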

Copyright 2021 The MathWorks, Inc.

Appendix

Here's the code for plotting the iterates.

dbtype curvefittingPlotIterates

    function stop = curvefittingPlotIterates(x,optimValues,state)
        % Output function that plots the iterates of the optimization algorithm.

        %   Copyright 2010 The MathWorks, Inc.

        persistent x0 r;
        if nargin == 1
            showPlot(x(1).X,x(1).X0{:},x(1).Fval)
        else
            switch state
                case 'init' % store initial point for later use
                    x0 = x;
                case 'done'
                    if ~(optimValues.iteration == 0)
                        % After optimization, display solution in plot title
                        r = optimValues.resnorm;
                        showPlot(x,x0,r)
                    end
            end
        end
        if nargout > 0
            stop = false;
            clear function
        end
    end

    function showPlot(b,b0,r)
        f = @(b,x) b(1)*exp(-b(4).*x) + b(2).*exp(-b(5).*x) +...
            b(3).*exp(-b(6).*x);

        persistent h ha
        if isempty(h) || ~isvalid(h)
            x = [  3.92,  7.93, 11.89, 23.90, 47.87, 71.91, 93.85, 117.84 ];
            y = [ 0.163, 0.679, 0.679, 0.388, 0.183, 0.125, 0.086, 0.0624 ];
            plot(x,y,'o');
            xlabel('t')
            ylabel('c')
            title('c=b_1e^{-b_4t}+b_2e^{-b_5t}+b_3e^{-b_6t}')
            axis([0 120 0 0.8]);
            h = line(3:120,f(b,3:120),'Color','r','Tag','PlotIterates');

        else
            set(h,'YData',f(b,get(h,'XData')));
        end
        s = sprintf('Starting Value   Fitted Value\n\n');

        for i = 1:length(b)
            s = [s, sprintf('b(%d): % 2.4f      b(%d): % 2.4f\n',i,b0(i),i,b(i))];
        end
        s = [s,sprintf('\nMSE = %2.4e',r)];

        if isempty(ha) || ~isvalid(ha)
            % Create textbox
            ha = annotation(gcf,'textbox',...
                [0.5 0.5 0.31 0.32],...
                'String',s,...
                'FitBoxToText','on',...
                'Tag','CoeffDisplay');
        end
        ha.String = s;
        drawnow

    end