For several years I’ve thought about the interplay of statistics and common sense. Probability is more abstract than physical properties like length or color, and so common sense is more often misguided in the context of probability than in visual perception. In probability and statistics, the analogs of optical illusions are usually called paradoxes: St. Petersburg paradox, Simpson’s paradox, Lindley’s paradox, etc. These paradoxes show that common sense can be seriously wrong, without having to consider contrived examples. Instances of Simpson’s paradox, for example, pop up regularly in application.

Some physicists say that you should always have an order-of-magnitude idea of what a result will be before you calculate it. This implies a belief that such estimates are usually possible, and that they provide a sanity check for calculations. And that’s true in physics, at least in mechanics. In probability, however, it is quite common for even an expert’s intuition to be way off. Calculations are more likely to find errors in common sense than the other way around.

Nevertheless, common sense is vitally important in statistics. Attempts to minimize the need for common sense can lead to nonsense. You need common sense to formulate a statistical model and to interpret inferences from that model. Statistics is a layer of exact calculation sandwiched between necessarily subjective formulation and interpretation. Even though common sense can go badly wrong with probability, it can also do quite well in some contexts. Common sense is necessary to map probability theory to applications and to evaluate how well that map works.

This evening I ran across a couple lines from Ed Catmull that are more accurate than the vet’s quote.

Do not fall for the illusion that by preventing errors, you won’t have errors to fix. The truth is, the cost of preventing errors is often far greater than the cost of fixing them.

From Creativity, Inc.

The inequality is strict unless all the *x*’s are zero, and the constant *e* on the right side is optimal. Torsten Carleman proved this theorem in 1923.

We fear bad things that we’ve seen on the news because they make a powerful emotional impression. But the things rare enough to be newsworthy are precisely the things we should not fear. Conversely, the risks we should be concerned about are the ones that happen too frequently to make the news.

- R
- Version control
- Linear algebra
- Advanced math
- Bayesian statistics
- Category theory
- Foreign languages
- How to not waste time
- Women

IgorCarron’s response didn’t fit into the list above. He said “I wish I had known that sensing all the way to machine learning is about approximating the identity” and gave a link to this post.

I like the term “Data Scientist” for now. I expect that term will be meaningless in 5 years.

Sounds about right.


**Related post**: Take chances, make mistakes, and get messy

You can rent time on a virtual machine for around $0.05 per CPU-hour. You could pay more or less depending on on-demand vs reserved, Linux vs Windows, etc.

Suppose the total cost of hiring someone — salary, benefits, office space, equipment, insurance liability, etc. — is twice their wage. This implies that a minimum wage worker in the US costs as much as 300 CPUs.

This also implies that **programmer time is three orders of magnitude more costly than CPU time**. It’s hard to imagine such a difference. If you think, for example, that it’s worth minutes of programmer time to save hours of CPU time, you’re grossly under-valuing programmer time. It’s worth **seconds** of programmer time to save hours of CPU time.
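The arithmetic behind these figures can be sketched in a few lines of Python. The CPU price and the factor of two for overhead come from the text above. The $7.25 federal minimum wage and the $40 per hour programmer wage are assumptions added for illustration, and the function names are made up for this sketch.

```python
# Back-of-the-envelope comparison of labor cost and CPU cost.
CPU_COST_PER_HOUR = 0.05      # dollars per CPU-hour, from the text
OVERHEAD_MULTIPLIER = 2       # total cost of an employee is twice the wage
FEDERAL_MINIMUM_WAGE = 7.25   # dollars per hour (assumed US federal minimum)
PROGRAMMER_WAGE = 40.00       # dollars per hour; hypothetical, for illustration


def cpus_per_worker(hourly_wage):
    """Number of rented CPUs that cost as much per hour as one worker."""
    return OVERHEAD_MULTIPLIER * hourly_wage / CPU_COST_PER_HOUR


def worker_seconds_per_cpu_hour(hourly_wage):
    """Seconds of a worker's time that cost the same as one CPU-hour."""
    return 3600 / cpus_per_worker(hourly_wage)


print(cpus_per_worker(FEDERAL_MINIMUM_WAGE))         # roughly 300 CPUs
print(cpus_per_worker(PROGRAMMER_WAGE))              # on the order of 1000 CPUs
print(worker_seconds_per_cpu_hour(PROGRAMMER_WAGE))  # a few seconds
```

With these assumed wages, an hour of CPU time costs about as much as a couple of seconds of programmer time, which is the sense in which seconds trade against hours.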

**Update**: Use promo code KeenCon-JohnCook to get 75% off registration.

**What would Donald Knuth do**? Do a depth-first search on all technologies that might be relevant, and write a series of large, beautiful, well-written books about it all.

**What would Alexander Grothendieck do**? Develop a new field of mathematics that solves the problem as a trivial special case.

**What would Richard Stallman do**? Create a text editor so powerful that, although it doesn’t solve your problem, it does allow you to solve your problem by writing a macro and a few lines of Lisp.

**What would Larry Wall do**? Bang randomly on the keyboard and save the results to a file. Then write a language in which the file is a program that solves your problem.

What would you add to the list?


Compare Cost and Performance of Replication and Erasure Coding

Hitachi Review Vol. 63 (July 2014)

John D. Cook

Robert Primmer

Ab de Kwant

Discussions about technology choices seldom consider who we become by using a tool. Different tools encourage different ways of thinking. Over time, different tools lead to different habits of mind.

**Cheer 1**: He’s not being secretive, fearing that someone will scoop his results. There have been a few instances of one academic scooping another’s research, but these are rare and probably not worth worrying about. Besides, a public GitHub repo is a pretty good way to prove your priority.

**Cheer 2**: Rather than being afraid someone will find an error, he’s inviting a world-wide audience to look for errors.

**Cheer 3**: He’s writing a dissertation that someone might actually want to read! That’s not the fastest route to a degree. It’s even actively discouraged in some circles. But it’s generous and great experience.


The podcast was posted this afternoon here.

**Related post**: Looking like you know what you’re doing

I’ve stopped posting to @DailySymbol. It was a fun experiment, but it was time to wrap it up.

My most popular account, @CompSciFact, now has over 100,000 followers. It’s interesting how some Twitter accounts take off and some don’t. CompSciFact has done quite well but I’ve shut down several other accounts that never gained much of a following.

You can find a list of my accounts here with a very brief description of each. Some of the accounts are a little broader than the name implies.


The AirConf events will be broadcast via G+ hangouts.

Most of the rides involve sitting in an inner tube and floating down a course with rapids, waterfalls, swells, etc. At many points there are back currents. You could be headed toward a fall but then find yourself reversing direction. It’s surprising to have to work to make yourself go downhill. At most if not all these points there are employees standing in the water to grab hold of rafts and pull people in the right direction who need a little help.

One question I had is **what causes the back currents**. Ultimately you could solve Navier-Stokes equations, but it would be nice to understand at a more rule-of-thumb level how these currents work. It would also be interesting to see **whether a park could reduce the number of guides** while keeping the rides as fun. The guides also serve as lifeguards, so the park may need to position people in all the same spots even if they didn’t need as many guides.

The slowest person in the family was consistently yours truly. I’d start out in front and inevitably end up bringing up the rear. I was curious **how I could be so inept at a mostly passive activity**.

I was also curious **how they designed the rapids to be so safe**. You’re repeatedly tossed straight toward rocks (perfectly smooth artificial rocks, but still not things you want to hit your head on) at a fairly high speed, and yet you never hit one. It has something to do with how they position jets to push you away from the rocks, but that would be interesting to understand in more detail.

Another thing I was curious about is **what the park does with its water in the off-season**. Schlitterbahn in New Braunfels is actually two parks, an older park that uses untreated water from the Comal river, and a newer park that uses treated water. When the parks close for the season, the older park must just let its water return to the river. (At least one of the rides ends in the river, so they’re already returning water to the river.)

The question of **what to do with the treated water** in the new park is more interesting. I assume they cannot just dump a huge volume of chlorinated water into the river. Aside from ecological consequences, I wonder whether they’d even want to dump the water. Is it economical to store the water somewhere when the park closes for the year? If not, do they store it anyway because they have no way to dispose of it, or do they treat it so that they can dispose of it? I suppose they could circulate the water occasionally while the park is closed, though that seems expensive. I wonder whether different waterparks solve this problem in different ways.

If I could propose a new ride for Schlitterbahn, it would be a video presentation about how the park was designed followed by Q&A with a couple engineers. This would be a terrible business decision, but a few visitors would love it.

It’s amazing how much cleaner your code looks the third time writing it. First time, hack; Second over-engineer; Third = goldilocks.

Now the discovery of ideas as general as these is chiefly the willingness to make a brash or speculative abstraction, in this case supported by the pleasure of purloining words from the philosophers: “Category” from Aristotle and Kant, “Functor” from Carnap …, and “natural transformation” from the current informal parlance.

Computing: the only industry that becomes less mature as more time passes.

The immaturity of computing is used to excuse every ignorance. There’s an enormous body of existing wisdom but we don’t care.

I don’t know whether computing is becoming less mature, though it may very well be on average, even if individual developers become more mature.

One reason is that computing is a growing profession, so people are entering the field faster than they are leaving. That lowers average maturity.

Another reason is chronological snobbery, alluded to in Fogus’s second tweet. Chronological snobbery is pervasive in contemporary culture, but especially in computing. Tremendous hardware advances give the illusion that software development has advanced more than it has. What could I possibly learn from someone who programmed back when computers were 100x slower? Maybe a lot.


lhs2TeX is roughly the Haskell analog of Sweave and Pweave. This post takes the sample code I wrote for Sweave and Pweave before and gives a lhs2TeX counterpart.

    \documentclass{article}
    %include polycode.fmt
    %options ghci
    \long\def\ignore#1{}
    \begin{document}

    Invisible code that sets the value of the variable $a$.

    \ignore{
    \begin{code}
    a = 3.14
    \end{code}
    }

    Visible code that sets $b$ and squares it.

    (There doesn't seem to be a way to display the result of a block of code directly. Seems you have to save the result and display it explicitly in an eval statement.)

    \begin{code}
    b = 3.15
    c = b*b
    \end{code}

    $b^2$ = \eval{c}

    Calling Haskell inline: $\sqrt{2} = \eval{sqrt 2}$

    Recalling the variable $a$ set above: $a$ = \eval{a}.

    \end{document}

If you save this code to a file `foo.lhs`, you can run

    lhs2TeX -o foo.tex foo.lhs

to create a LaTeX file `foo.tex` which you could then compile with `pdflatex`.

One gotcha that I ran into is that your `.lhs` file must contain at least one code block, though the code block may be empty. You cannot just have code in `\eval` statements.

Unlike R and Python, the Haskell language itself has a notion of literate programming. Haskell specifies a format for literate comments. lhs2TeX is a popular tool for processing literate Haskell files but not the only one.

One of the pages that stuck in my mind was a photo of Samuel Eilenberg. His name meant nothing to me at the time, but the caption titled “A subway topologist” caught my imagination.

… Polish-born Professor Samuel Eilenberg sprawls contemplatively in his Greenwich Village apartment in New York City. “Sometimes I like to think lying down,” he says, “but mostly I like to think riding on the subway.” Mainly he thinks about algebraic topology — a field so abstruse that even among mathematicians few understand it. …

I loved the image of Eilenberg staring intensely at the ceiling or riding around on a subway thinking about math. Since then I’ve often thought about math while moving around, though usually not on a subway. I’ve only lived for a few months in an area with a subway system.

The idea that a field of math would be unknown to many mathematicians sounded odd. I had no idea at the time that mathematicians specialized.

Algebraic topology doesn’t seem so abstruse now. It’s a routine graduate course and you might get an introduction to it in an undergraduate course. The book was published in 1963, and I suppose algebraic topology would have been more esoteric at the time.


I’d rather write a PowerShell script than a bash script, but I’d rather use the bash console interactively. The PowerShell console is essentially the old `cmd.exe` console. (I haven’t kept up with PowerShell in a while, so maybe there have been some improvements, but it’s my impression that the scripting language has moved forward and the console has not.) PSReadLine adds some bash-like console conveniences such as Emacs-like editing at the command prompt.

**Update**: Thanks to Will for pointing out Clink in the comments. Clink sounds like it may be even better than PSReadLine.

SICP gives a Scheme program to solve the problem:

    (define (count-change amount)
      (cc amount 5))

    (define (cc amount kinds-of-coins)
      (cond ((= amount 0) 1)
            ((or (< amount 0) (= kinds-of-coins 0)) 0)
            (else (+ (cc amount
                         (- kinds-of-coins 1))
                     (cc (- amount
                            (first-denomination kinds-of-coins))
                         kinds-of-coins)))))

    (define (first-denomination kinds-of-coins)
      (cond ((= kinds-of-coins 1) 1)
            ((= kinds-of-coins 2) 5)
            ((= kinds-of-coins 3) 10)
            ((= kinds-of-coins 4) 25)
            ((= kinds-of-coins 5) 50)))

Concrete Mathematics explains that the number of ways to make change for an amount of *n* cents is the coefficient of *z*^*n* in the power series for the following:

$$\frac{1}{(1 - z)(1 - z^{5})(1 - z^{10})(1 - z^{25})(1 - z^{50})}$$

Later on the book gives a more explicit but complicated formula for the coefficients.

Both show that there are 292 ways to make change for a dollar.
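As a rough cross-check, here is a Python sketch of both approaches. The first function is a direct translation of the Scheme procedure `cc`. The second treats the generating function as a product of geometric series 1/(1 − *z*^*d*), one for each denomination *d*, and multiplies them out up to the power needed. The function names are made up for this sketch and appear in neither book.

```python
COINS = (1, 5, 10, 25, 50)  # denominations in cents, as in the Scheme code


def count_change_recursive(amount, kinds=len(COINS)):
    """Direct translation of the SICP procedure cc."""
    if amount == 0:
        return 1
    if amount < 0 or kinds == 0:
        return 0
    return (count_change_recursive(amount, kinds - 1)
            + count_change_recursive(amount - COINS[kinds - 1], kinds))


def count_change_series(amount):
    """Coefficient of z**amount in the product of 1/(1 - z**d) over coins d."""
    coeffs = [1] + [0] * amount  # start with the constant power series 1
    for d in COINS:
        # Multiplying by 1/(1 - z**d) = 1 + z**d + z**(2d) + ... amounts to
        # a running sum with stride d, done in place from low to high powers.
        for i in range(d, amount + 1):
            coeffs[i] += coeffs[i - d]
    return coeffs[amount]


print(count_change_recursive(100), count_change_series(100))
```

The series version does work proportional to the amount times the number of denominations, while the recursive version revisits the same subproblems many times.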

See the full list of my daily tip Twitter accounts here.

The icon for the site is taken from one of Leonardo da Vinci’s anatomical drawings.

Since jigsaw pieces are irregularly shaped, it may be surprising to learn that the pieces are actually arranged in a regular grid. At least they usually are. There are exceptions such as circular puzzles or puzzles that throw in a couple small pieces that throw off the grid regularity.

How many aspect ratios can you have with a rectangular grid of 1,000 points? Which ratio comes closest to the golden ratio? More generally, answer the same questions with 10^*n* points for positive integer *n*.
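If you would rather explore the question by brute force, here is one possible sketch in Python. It assumes that a grid and its rotation count as the same shape, and that "closest" means the smallest absolute difference between the long-to-short ratio and the golden ratio. Both are interpretations added here, and the function names are made up.

```python
from math import isqrt, sqrt

GOLDEN_RATIO = (1 + sqrt(5)) / 2


def grid_shapes(points):
    """All (short, long) side pairs with short * long == points."""
    return [(s, points // s)
            for s in range(1, isqrt(points) + 1)
            if points % s == 0]


def closest_to_golden(points):
    """The grid shape whose long/short ratio is nearest the golden ratio."""
    return min(grid_shapes(points),
               key=lambda shape: abs(shape[1] / shape[0] - GOLDEN_RATIO))


# Explore grids with 10**n points for the first few n.
for n in range(1, 7):
    shapes = grid_shapes(10**n)
    print(n, len(shapes), closest_to_golden(10**n))
```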

**More puzzles**:

A knight’s random walk

Peculiar property of 3909511

Roman numeral problem

A perspective problem

To first approximation, the earth is a sphere. The next step in sophistication is to model the earth as an ellipsoid.

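The surface area of an ellipsoid with semi-axes *a* ≥ *b* ≥ *c* is

$$S = 2\pi \left( c^2 + \frac{ab}{\sin\varphi} \left( E(\varphi, k) \sin^2\varphi + F(\varphi, k) \cos^2\varphi \right) \right)$$

where

$$\cos\varphi = \frac{c}{a}$$

and

$$m = k^2 = \frac{a^2 (b^2 - c^2)}{b^2 (a^2 - c^2)}.$$

The functions *E* and *F* are incomplete elliptic integrals

$$E(\varphi, k) = \int_0^\varphi \sqrt{1 - k^2 \sin^2\theta} \, d\theta$$

and

$$F(\varphi, k) = \int_0^\varphi \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}}$$

implemented in SciPy as `ellipeinc` and `ellipkinc`. Note that the SciPy functions take *m* as their second argument rather than its square root *k*.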

For the earth, *a* = *b* and so *m* = 1.

The following Python code computes the ratio of earth’s surface area as an ellipsoid to its area as a sphere.

    # NumPy supplies these; the top-level scipy aliases for them are deprecated
    from numpy import pi, sin, cos, arccos
    from scipy.special import ellipkinc, ellipeinc

    # values in meters based on GRS 80
    # http://en.wikipedia.org/wiki/GRS_80
    equatorial_radius = 6378137
    polar_radius = 6356752.314140347

    a = b = equatorial_radius
    c = polar_radius

    phi = arccos(c/a)
    # in general, m = (a**2 * (b**2 - c**2)) / (b**2 * (a**2 - c**2))
    m = 1

    temp = ellipeinc(phi, m)*sin(phi)**2 + ellipkinc(phi, m)*cos(phi)**2
    ellipsoid_area = 2*pi*(c**2 + a*b*temp/sin(phi))

    # sphere with radius equal to average of polar and equatorial
    r = 0.5*(a+c)
    sphere_area = 4*pi*r**2

    print(ellipsoid_area/sphere_area)

This shows that the ellipsoid model leads to 0.112% more surface area relative to a sphere.

Source: See equation 19.33.2 in the NIST Digital Library of Mathematical Functions.

**Update**: It was suggested in the comments that it would be better to compare the ellipsoid area to that of a sphere of the same volume. So instead of using the average of the polar and equatorial radii, one would take the geometric mean of the polar radius and two copies of the equatorial radius. Using that radius, the ellipsoid has 0.0002% more area than the sphere.
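Both comparisons can be cross-checked without elliptic integrals. When *a* = *b* the ellipsoid is an oblate spheroid, and its area has an elementary closed form in terms of the eccentricity. The sketch below uses that closed form, which is a swapped-in technique rather than the general formula above. It needs only the standard library, and the function names are made up for this sketch.

```python
from math import pi, sqrt, log

# GRS 80 values in meters, as in the code above
a = 6378137.0          # equatorial radius
c = 6356752.314140347  # polar radius


def oblate_spheroid_area(a, c):
    """Surface area of an oblate spheroid (a = b > c), elementary closed form."""
    e = sqrt(1 - (c / a) ** 2)  # eccentricity of the meridian ellipse
    return 2 * pi * a**2 + pi * (c**2 / e) * log((1 + e) / (1 - e))


def sphere_area(r):
    """Surface area of a sphere of radius r."""
    return 4 * pi * r**2


area = oblate_spheroid_area(a, c)

# Sphere whose radius is the average of the polar and equatorial radii
ratio_mean_radius = area / sphere_area(0.5 * (a + c))

# Sphere with the same volume: radius is the geometric mean of a, a, and c
ratio_equal_volume = area / sphere_area((a * a * c) ** (1 / 3))

print(f"{100 * (ratio_mean_radius - 1):.3f}% more than the mean-radius sphere")
print(f"{100 * (ratio_equal_volume - 1):.4f}% more than the equal-volume sphere")
```

The closed form applies only when two semi-axes are equal. A triaxial ellipsoid still needs the elliptic integrals.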

***

Poe, E.

Near a Raven

Midnights so dreary, tired and weary,

Silently pondering volumes extolling all by-now obsolete lore,

During my rather long nap — the weirdest tap!

An ominous vibrating sound disturbing my chamber’s antedoor.

“This,” I whispered quietly, “I ignore.”

…

So he sitteth, observing always, perching ominously on these doorways.

Squatting on the stony bust so untroubled, O therefore.

Suffering stark raven’s conversings, I am so condemned, subserving,

To a nightmare cursed, containing miseries galore.

Thus henceforth, I’ll rise (from a darkness, a grave) — nevermore!

***

The number of letters in most words encodes a digit of pi. Words with 10 letters encode a zero. Words with more than 10 letters encode two consecutive digits of pi. The poem encodes the first 740 digits of pi.
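The encoding is easy to check by machine. The sketch below applies the rules just described to the opening lines quoted above. It assumes that apostrophes are ignored and that hyphens and other punctuation separate words. Those assumptions are mine, chosen because they reproduce the expected digits for these lines, and the function name `decode` is made up.

```python
import re


def decode(text):
    """Turn word lengths into digits using the rules described above."""
    # Assumption: apostrophes do not count as letters and do not split words.
    text = text.replace("'", "").replace("\u2019", "")
    digits = []
    for word in re.findall(r"[A-Za-z]+", text):
        n = len(word)
        if n == 10:
            digits.append("0")     # ten letters encode a zero
        else:
            digits.append(str(n))  # 1-9 give one digit, 11 or more give two
    return "".join(digits)


opening = """Poe, E.
Near a Raven
Midnights so dreary, tired and weary,
Silently pondering volumes extolling all by-now obsolete lore,
During my rather long nap - the weirdest tap!
An ominous vibrating sound disturbing my chamber's antedoor.
"This," I whispered quietly, "I ignore."
"""

print(decode(opening))
```

Run on these lines, the decoder prints the first 42 digits of pi, starting 3, 1, 4, 1, 5, 9.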

The grammar stage of the trivium could be literal language grammar, but it also applies more generally to absorbing the basics of any subject and often involves rote learning.

The logic stage is more analytic, examining the relationships between the pieces gathered in the grammar stage. Students learn to construct sound arguments.

The rhetoric stage is focused on eloquent and persuasive expression. It is more outwardly focused than the previous stages, more considerate of others. Students learn to create arguments that are not only logically correct, but also memorable, enjoyable, and effective.

It would be interesting to see a classical approach to teaching programming. Programmers often don’t get past the logic stage, writing code that works (as far as they can tell). The rhetoric stage would train programmers to look for solutions that are not just probably correct, but so clear that they are persuasively correct. The goal would be to write code that is testable, maintainable, and even occasionally eloquent.

Parthenon replica in Nashville, TN.
