According to the normalized comparison chart on langpop.com, at the time of writing this post, D is easily more popular than all functional programming languages combined. Here’s a portion of the chart zooming in on D and functional languages.

This hardly seems possible. As I stated up front, all measures of programming language popularity are crude proxies for what we’d really like to know. But still, this says that by at least one measure, D is more popular than any functional language.

The langpop.com site lets you twiddle the weights assigned to different components of the comparison. And for a wide range of weight choices, D comes out ahead of any functional language.

My preference would be to set the Google Files component should be zero since it’s only looking at web page languages. (The page won’t let you set a component to exactly zero, but you can make it tiny, say 0.001.) Github stats give you a good view into the open source world, but it would be an example of availability bias to assume the whole world of software development looks like Github. Craigslist might be a good window into commercial software development.

]]>Computing: the only industry that becomes less mature as more time passes.

The immaturity of computing is used to excuse every ignorance. There’s an enormous body of existing wisdom but we don’t care.

I don’t know whether computing is becoming less mature, though it may very well be on average, even if individual developers become more mature.

One reason is that computing is a growing profession, so people are entering the field faster than they are leaving. That lowers average maturity.

Another reason is chronological snobbery, alluded to in Fogus’s second tweet. Chronological snobbery is pervasive in contemporary culture, but especially in computing. Tremendous hardware advances give the illusion that software development has advanced more than it has. What could I possibly learn from someone who programmed back when computers were 100x slower? Maybe a lot.

**Related posts**:

lhs2TeX is roughly the Haskell analog of Sweave and Pweave. This post takes the sample code I wrote for Sweave and Pweave before and gives a lhs2TeX counterpart.

\documentclass{article} %include polycode.fmt %options ghci \long\def\ignore#1{} \begin{document} Invisible code that sets the value of the variable $a$. \ignore{ \begin{code} a = 3.14 \end{code} } Visible code that sets $b$ and squares it. (There doesn't seem to be a way to display the result of a block of code directly. Seems you have to save the result and display it explicitly in an eval statement.) \begin{code} b = 3.15 c = b*b \end{code} $b^2$ = \eval{c} Calling Haskell inline: $\sqrt{2} = \eval{sqrt 2}$ Recalling the variable $a$ set above: $a$ = \eval{a}. \end{document}

If you save this code to a file `foo.lhs`

, you can run

lhs2TeX -o foo.tex foo.lhs

to create a LaTeX file `foo.tex`

which you could then compile with `pdflatex`

.

One gotcha that I ran into is that your `.lhs`

file must contain at least one code block, though the code block may be empty. You cannot just have code in `\eval`

statements.

Unlike R and Python, the Haskell language itself has a notion of literate programming. Haskell specifies a format for literate comments. lhs2TeX is a popular tool for processing literate Haskell files but not the only one.

]]>One of the pages that stuck in my mind was a photo of Samuel Eilenberg. His name meant nothing to me at the time, but the caption titled “A subway topologist” caught my imagination.

… Polish-born Professor Samuel Eilenberg sprawls contemplatively in his Greenwich Village apartment in New York City. “Sometimes I like to think lying down,” he says, “but mostly I like to think riding on the subway.” Mainly he thinks about algebraic topology — a field so abstruse that even among mathematicians few understand it. …

I loved the image of Eilenberg staring intensely at the ceiling or riding around on a subway thinking about math. Since then I’ve often thought about math while moving around, though usually not on a subway. I’ve only lived for a few months in an area with a subway system.

The idea that a field of math would be unknown to many mathematicians sounded odd. I had no idea at the time that mathematicians specialized.

Algebraic topology doesn’t seem so abstruse now. It’s a routine graduate course and you might get an introduction to it in an undergraduate course. The book was published in 1963, and I suppose algebraic topology would have been more esoteric at the time.

]]>

I’d rather write a PowerShell script than a bash script, but I’d rather use the bash console interactively. The PowerShell console is essentially the old `cmd.exe`

console. (I haven’t kept up with PowerShell in a while, so maybe there have been some improvements, but it’s my impression that the scripting language has moved forward and the console has not.) PSReadLine adds some bash-like console conveniences such as Emacs-like editing at the command prompt.

**Update**: Thanks to Will for pointing out Clink in the comments. Clink sounds like it may be even better than PSReadLine.

SICP gives a Scheme program to solve the problem:

(define (count-change amount) (cc amount 5)) (define (cc amount kinds-of-coins) (cond ((= amount 0) 1) ((or (< amount 0) (= kinds-of-coins 0)) 0) (else (+ (cc amount (- kinds-of-coins 1)) (cc (- amount (first-denomination kinds-of-coins)) kinds-of-coins))))) (define (first-denomination kinds-of-coins) (cond ((= kinds-of-coins 1) 1) ((= kinds-of-coins 2) 5) ((= kinds-of-coins 3) 10) ((= kinds-of-coins 4) 25) ((= kinds-of-coins 5) 50)))

Concrete Mathematics explains that the number of ways to make change for an amount of *n* cents is the coefficient of *z*^*n* in the power series for the following:

Later on the book gives a more explicit but complicated formula for the coefficients.

Both show that there are 292 ways to make change for a dollar.

]]>See the full list of my daily tip Twitter accounts here.

The icon for the site is taken from one of Leonardo da Vinci’s anatomical drawings.

]]>Since jigsaw pieces are irregularly shaped, it may be surprising to learn that the pieces are actually arranged in a regular grid. At least they usually are. There are exceptions such as circular puzzles or puzzles that throw in a couple small pieces that throw off the grid regularity.

How many aspect ratios can you have with a rectangular grid of 1,000 points? Which ratio comes closest to the golden ratio? More generally, answer the same questions with 10^*n* points for positive integer *n*.

**More puzzles**:

A knight’s random walk

Peculiar property of 3909511

Roman numeral problem

A perspective problem

To first approximation, the earth is a sphere. The next step in sophistication is to model the earth as an ellipsoid.

The surface area of an ellipsoid with semi-axes *a* ≥ *b* ≥ *c* is

where

and

The functions *E* and *F* are incomplete elliptic integrals

and

implemented in SciPy as `ellipeinc`

and `ellipkinc`

. Note that the SciPy functions take *m* as their second argument rather its square root *k*.

For the earth, *a* = *b* and so *m* = 1.

The following Python code computes the ratio of earth’s surface area as an ellipsoid to its area as a sphere.

from scipy import pi, sin, cos, arccos from scipy.special import ellipkinc, ellipeinc # values in meters based on GRS 80 # http://en.wikipedia.org/wiki/GRS_80 equatorial_radius = 6378137 polar_radius = 6356752.314140347 a = b = equatorial_radius c = polar_radius phi = arccos(c/a) # in general, m = (a**2 * (b**2 - c**2)) / (b**2 * (a**2 - c**2)) m = 1 temp = ellipeinc(phi, m)*sin(phi)**2 + ellipkinc(phi, m)*cos(phi)**2 ellipsoid_area = 2*pi*(c**2 + a*b*temp/sin(phi)) # sphere with radius equal to average of polar and equatorial r = 0.5*(a+c) sphere_area = 4*pi*r**2 print(ellipsoid_area/sphere_area)

This shows that the ellipsoid model leads to 0.112% more surface area relative to a sphere.

Source: See equation 19.33.2 here.

**Update**: It was suggested in the comments that it would be better to compare the ellipsoid area to that of a sphere of the same volume. So instead of using the average of the polar and equatorial radii, one would take the geometric mean of the polar radius and two copies of the equatorial radius. Using that radius, the ellipsoid has 0.0002% more area than the sphere.

***

Poe, E.

Near a Raven

Midnights so dreary, tired and weary,

Silently pondering volumes extolling all by-now obsolete lore,

During my rather long nap — the weirdest tap!

An ominous vibrating sound disturbing my chamber’s antedoor.

“This,” I whispered quietly, “I ignore.”

…

So he sitteth, observing always, perching ominously on these doorways.

Squatting on the stony bust so untroubled, O therefore.

Suffering stark raven’s conversings, I am so condemned, subserving,

To a nightmare cursed, containing miseries galore.

Thus henceforth, I’ll rise (from a darkness, a grave) — nevermore!

***

The number of letters in most words encodes a digit of pi. Words with 10 letters encode a zero. Words with more than 10 letters encode two consecutive digits of pi. The poem encodes the first 740 digits of pi.

]]>The grammar stage of the trivium could be literal language grammar, but it also applies more generally to absorbing the basics of any subject and often involves rote learning.

The logic stage is more analytic, examining the relationships between the pieces gathered in the grammar stage. Students learn to construct sound arguments.

The rhetoric stage is focused on eloquent and persuasive expression. It is more outwardly focused than the previous stages, more considerate of others. Students learn to create arguments that are not only logically correct, but also memorable, enjoyable, and effective.

It would be interesting to see a classical approach to teaching programming. Programmers often don’t get past the logic stage, writing code that works (as far as they can tell). The rhetoric stage would train programmers to look for solutions that are not just probably correct, but so clear that they are persuasively correct. The goal would be to write code that is testable, maintainable, and even occasionally eloquent.

Parthenon replica in Nashville, TN.

]]>Gaussian elimination tells you nothing about the final solution until it’s almost done. The first phase, factorization, takes O(*n*^3) steps, where *n* is the number of unknowns. This is followed by the back-substitution phase which takes O(*n*^2) steps. The factorization phase tells you nothing about the solution. The back-substitution phase starts filling in the components of the solution one at a time. In application *n* is often so large that the time required for back-substitution is negligible compared to factorization.

Iterative methods start by taking a guess at the final solution. In some contexts, this guess may be fairly good. For example, when solving differential equations, the solution from one time step gives a good initial guess at the solution for the next time step. Similarly, in sequential Bayesian analysis the posterior distribution mode doesn’t move much as each observation arrives. Iterative methods can take advantage of a good starting guess while methods like Gaussian elimination cannot.

Iterative methods take an initial guess and refine it to a better approximation to the solution. This sequence of approximations converges to the exact solution. In theory, Gaussian elimination produces an exact answer in a finite number of steps, but iterative methods never produce an exact solution after any finite number of steps. But in actual computation with finite precision arithmetic, no method, iterative or not, ever produces an exact answer. The question is not which method is exact but which method produces an acceptably accurate answer first. Often the iterative method wins.

Successful projects often work like iterative numerical methods. They start with an approximation solution and iteratively refine it. All along the way they provide a useful approximation to the final product. Even if, in theory, there is a more direct approach to a final product, the iterative approach may work better in practice.

Algorithms iterate toward a solution because that approach may reach a sufficiently accurate result sooner. That may apply to people, but more important for people is the psychological benefit of having something to show for yourself along the way. Also, iterative methods, whether for linear systems or human projects, are robust to changes in requirements because they are able to take advantage of progress made toward a slightly different goal.

**Related post**: Ten surprises from numerical linear algebra

where the sum is over all positive integers *n*.

Euler also introduced a multivariate generalization of the zeta function

where the sum is over all decreasing *k*-tuples of positive integers. This generalized zeta function satisfies the following beautiful identity:

The multivariate zeta function and identities such as the one above are important in number theory and are the subject of open conjectures.

]]>For the majority of the bankrupt projects we studied, there was not a single technological issue to explain the failure.

Gerald Weinberg said something similar in his Second Law of Consulting:

]]>No matter how it looks at first, it’s always a people problem.

All the usual disclaimers about benchmarks apply, your mileage may vary, etc. See the paper for details.

Here I give my summary of their summary of their results. The authors ran separate benchmarks on Mac and Windows. The results were qualitatively the same, so I just report the Windows results here.

Times in the table below are relative to the fastest C++ run.

Language | Time |
---|---|

C++ | 1.00 |

Java | 2.10 |

Julia | 2.70 |

CPython | 155.31 |

Python with Numba | 1.57 |

R | 505.09 |

R using compiler package | 243.38 |

The most striking result is that the authors were able to run their Python code 100x faster, achieving performance comparable to C++, by using Numba.

]]>People often dislike static type systems because they’ve only met weak ones. A weak or not very expressive type system gets in your way all the time. It prevents you from writing functions you want to write that you know are fine. … The solution is not to abandon the type system but to make the type system more expressive.

In particular, he mentions Haskell’s polymorphic types and type inference as ways to make strong static typing convenient to use.

]]>The real problem is that research on expertise focuses on fields where “expertise” is a well-posed and objectively codified notion. This means mature fields that are closed and bounded, and can be easily observed, modeled and studied under laboratory conditions. So it is not surprising that the work … is based on fields like “medicine, music, chess and sports” … all sharply circumscribed and regulated domains.

Rao’s essay may be a bit too harsh on closed-world domains and a bit too romantic about open-world domains, but the distinction between the domains is important. You don’t become a successful entrepreneur, for example, the same way you become a successful violinist. Closed worlds place a much higher emphasis on error-elimination, at least initially, than do open worlds. In an open world, the concept of an error may not even make sense. Where there is no law there is no sin.

Consulting is a more open world than academia. As Rao notes, academia can close off an otherwise open world through “bureaucratic productivity measures like publications and citations.” Clients are happy if you solve their problems. They could not care less whether your solutions are publishable and all that implies. Original and thoroughly footnooted work that doesn’t solve their problems is not appreciated.

Clients are not going to give you an oral exam to see whether you’ve mastered some canon. And they don’t care if you cross academic boundary lines to use something “outside your field.” They do care about credentials sometimes, but in a pragmatic way: they may need someone with the right credentials to review something. In that case, your credentials are part of the solution.

**Related posts**:

Doing all the exercises in a book isn’t a bad way to learn something, though it depends on the book, what you’re trying to accomplish, and on the quality and quantity of the exercises.

Have you ever gone through a book working every exercise? If so, what book? How was your experience?

]]>

The fountain in Sergels torg (English: Sergel’s Square) in Sockholm is in the shape of a “superellipse.” Piet Hein proposed this shape as a practical and aesthetically pleasing compromise between a circle and a rectangle.

What Piet Hein dubbed a superellipse is a curve satisfying

|*x*/*a*|^{2.5} + |*y*/*b*|^{2.5} = 1

In the case of Sergels torg, *a*/b = 6/5. A superellipse is what an analyst would call an *L*^{p} ball with *p* = 2.5 and rescaled axes.

Incidentally, I used curves of this form in a dose-finding method a few years ago. A clinical trial design would pick *a*, *b*, and *p* to match a physician’s desired trade-off between probability of efficacy and probability of toxicity.

**Update**: A few minutes after I posted this, I realized that there was a bowl on my desk shaped like a superellipse:

The bowl came from Ikea, so there’s another connection to Sweden.

**Related post**: Volumes of Lp balls

]]>

Apparently the door of the prison was really unlocked all the time; but it was only you who thought of trying the handle.

In context he was saying that Charles Williams had recovered a clear understanding of Milton that had been obfuscated for over a century.

That line made me think of a quote from Alexander Grothendieck (that I haven’t been able to find this morning) to the effect that progress in mathematics has often waited for someone to be bold enough to ask a simple question or introduce a simple concept.

**Update**: Thanks to Roland Elliot for providing the quote I was missing. Ronald Brown says that Grothendieck said in a letter to him in 1982:

]]>The introduction of the cipher 0 or the group concept was general nonsense too, and mathematics was more or less stagnating for thousands of years because nobody was around to take such childish steps …

I’m more likely to implement suggestions that come with specific details. For example, if you say “I don’t like your theme” then there’s not much I can do unless you suggest an alternative. But if you say something like “I’d suggest changing this line of CSS because …” I can at least try it out.

]]>Haskell, on the other hand, is more deeply mathematical. Fortran resembles math in the small, but Haskell resembles math in the large. Haskell has mathematical structures at a higher level of abstraction than just formulas. As a recent article put it

While Fortran provides a comfortable translation of mathematical formulas, Haskell code begins to resemble mathematics itself.

On its surface, Perl is one of the least English-like programming languages. It is often criticized as looking like “line noise” because of its heavy use of operators. By contrast, Python has been called “executable pseudocode” because the source is easier to read, particularly for novice programmers. And yet at a deeper level, Perl is more English-like than other programming languages such as Python.

Larry Wall explains in Natural Language Principles in Perl that he designed Perl to incorporate features of spoken language. For example, Perl has analogs of articles and pronouns. (Larry Wall studied both linguistics and computer science in college.) Opinions differ on how well his experiment with incorporating natural language features into programming languages has worked out, but it was an interesting idea.

**Related posts**:

Haskell / Category theory decoder ring

]]>I find it more plausible when someone says a new technology has new trade-offs than when someone says a new technology is “better.” Rarely does one thing improve on another by all criteria, but often one thing is an improvement on another by the criteria you value most.

If you want to persuade me to adopt something new, you’ll gain credibility by being candid about its drawbacks. Explain by what criteria you think the new thing is better, by what criteria it is worse, and why the former should matter more to me in my circumstances.

]]>

This has been a surprise to me. I’m more used to creating a mathematical model so you can compute something or apply some theorem. But sometimes you can move a project along just by providing a name for a concept. A meandering discussion can snap into focus because someone has a name for an idea everyone vaguely understands.

Sometimes it may be clear that only part of a mathematical definition applies. In this case math can guide the discussion by asking whether the rest of the definition applies. “It sounds like we’ve got a widget here. A widget has to have these five properties and clearly we have the first three. Let’s think about whether the last two hold.” The answers don’t have to be positive to be useful. You might realize something important in the process of explaining why your thing is **not** a widget.

Sometimes a definition may not apply at all and still be useful! “This reminds me of a widget. It’s not a widget in any strict sense. But if it were, this is what we’d do next. I wonder whether we can do something like that.”

]]>

Order statistics are robust in a sense. The median of a sample, for example, is a very robust measure of central tendency. If Bill Gates walks into a room with a large number of people, the mean wealth jumps tremendously but the median hardly budges.

But order statistics are not robust in this sense: the **identity** of the sample in any given position can be very sensitive to perturbation. Suppose a room has an odd number of people so that someone has the median wealth. When Bill Gates and Warren Buffett walk into the room later, the **value** of the median income may not change much, but the **person** corresponding to that income will change.

One way to evaluate machine learning algorithms is by how often they pick the right winner in some sense. For example, dose-finding algorithms are often evaluated on how often they pick the best dose from a set of doses being tested. This can be a terrible criteria, causing researchers to be mislead by a particular set of simulation scenarios. It’s more important how often an algorithm makes a **good** choice than how often it makes the **best** choice.

Suppose five drugs are being tested. Two are nearly equally effective, and three are much less effective. A good experimental design will lead to picking one of the two good drugs most of the time. But if the best drug is only slightly better than the next best, it’s too much to expect any design to pick the best drug with high probability. In this case it’s better to measure the expected utility of a decision rather than how often a design makes the best decision.

]]>

One difficulty is that although Haskell articles use terms like *functor* and *monad* from category theory, they seldom actually talk about categories per se. If we’ve got functors, where are the categories? (This reminds me of Darth Vader asking “If this is a consular ship, where is the ambassador?”)

In Haskell literature, everything implicitly lives **Hask**, the category of Haskell types, or in some subcategory of **Hask**. This means that the category itself is not the focus of attention. In category theory, functors often operate between very different classes of objects, such as topological spaces and their fundamental groups, and so it’s more important to state what category something lives in.

Another potential stumbling block is to think of Haskell types as categories and values as objects. That would be reasonable, since in computer science an “object” is an instance of a type. But the right correspondence is to think of Haskell types as categorical objects. Instances of types are below the level of abstraction we’re working at. This is analogous to how category theory treats objects as black boxes with no way to talk about what’s inside.

Finally, Haskell monads look a little different from categorical monads. Haskell’s `return`

corresponds directly to *unit*, usually written as η, in category theory. But Haskell monads have a bind operator `>>=`

while mathematical monads have a join operator μ. These are not equivalent, though you can implement each in terms of the other:

join :: Monad m => m (m a) -> m a join x = x >>= id (>>=) :: Monad m => m a -> (a -> m b) -> m b x >>= f = join (fmap f x)

To read more along these lines, see the Wikibooks article on Haskell and Category theory.

**Update**: Stephen Diehl suggested I mention the differences between the idealized category **Hask** and the implementation of the Haskell language. These are discussed here.

It is easy to forget that the man who writes a good love sonnet needs not only to be enamoured of a woman, but also to be enamoured of the Sonnet.

]]>

In high school you get used to hearing about “algebra” and “calculus” as subjects, then the first time you hear someone say “an algebra” or “a calculus” it sounds bizarre. Similarly, once you’re used to hearing about logic, statistics, or topology, it sounds exotic to hear someone say “a logic,” “a statistic,” or “a topology,” something like hearing “a speed of light.”

When I signed up for a topology course, visions of “rubber sheet geometry” danced in my head. I was taken aback when the course started out by defining a topology to be a collection of sets closed under finite intersections and arbitrary unions. Where did the rubber sheet go?

Sometimes when a subject takes on an indefinite article, the new definition comes early in the development. For example, “a statistic” is any function of a random sample, a disappointing definition that comes early in a statistics course. But sometimes the definition with the indefinite article comes much later.

You can take a couple courses in algebra in high school and even a college course in algebra (i.e. groups, rings, and fields) without ever hearing the definition of “an algebra.” And you can take several courses in calculus without ever hearing of “a calculus.”

The phrase “an algebra” can have several meanings, and the phrase “a calculus” even more. The core definition of an algebra is a vector space with a bilinear product. But there are more general ideas of an algebra, some so general that they mean “things and operations on things.” (That’s basically universal algebra.)

Often you put an adjective in front of *algebra* to specify what kind of algebra you’re talking about: Banach algebra, σ-algebra, Boolean algebra, etc. Usually in mathematics, an adjective in front of a noun narrows the definition. For example, an Abelian group is a particular kind of group. But often an algebra with a qualifier, such as a σ-algebra, is not an algebra by what I called the core definition above.

The idea of “a calculus” is even more vague. If it has a formal definition, I’ve never seen it. A calculus is just a subject with a set of useful rules for calculating things. Sometimes there’s a strong analogy to calculus in the sense of differential calculus, as in the calculus of finite differences, a.k.a. umbral calculus. Sometimes the analogy is more of a stretch, as in λ calculus or π calculus.

So what is the point of this post? I don’t really have one. Just throwing out some half-baked musings.

]]>