I didn’t write my clients and leads on balls, but I did write them on index cards. And it helped a great deal. It’s easier to think about projects when you have physical representations you can easily move around. Moving lines up and down in an org-mode file, or even moving boxes around in 2D in OneNote, doesn’t work as well.

Electronic files are great for storing, editing, and querying ideas. But they’re not the best medium for *generating* ideas. See Create offline, analyze online. See also Austin Kleon’s idea of having two separate desks, one digital and one analog.

Suppose you want to find whether a number *n* is divisible by 7. Start with the last digit of *n* and write a 1 under the last digit, and a 3 under the next to last digit. The digits under the digits of *n* cycle through 1, 3, 2, 6, 4, 5, repeatedly until there is something under each digit in *n*. Now multiply each digit of *n* by the digit under it and add the results.

For example, suppose n = 1394. Write the digits 1, 3, 2, and 6 underneath, from right to left:

1394

6231

The sum we need to compute is 1*6 + 3*2 + 9*3 + 4*1 = 6 + 6 + 27 + 4 = 43.

This sum, 43 in our example, has the same remainder when divided by 7 as the original number, 1394. Since 43 is not divisible by 7, neither is 1394. In fact, 43 leaves a remainder of 1 when divided by 7, so 1394 also leaves a remainder of 1.

You could apply this method repeatedly, though in this case 43 is small enough that it’s easy enough to see that it leaves a remainder of 1.

Suppose you started with a 1000-digit number *n*. Each digit is no more than 9, and is multiplied by a number no more than 6, so the sum would be at most 54,000. So you’ve gone from a 1000-digit number to at most a 5-digit number in one step. One or two more steps should be enough for the remainder to be obvious.
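Here’s how the whole procedure might look in Python (the function name is mine):

```python
def weighted_digit_sum(n):
    # Multipliers cycle through the remainders of powers of 10 mod 7.
    multipliers = [1, 3, 2, 6, 4, 5]
    digits = [int(d) for d in str(n)][::-1]  # process last digit first
    return sum(d * multipliers[i % 6] for i, d in enumerate(digits))

print(weighted_digit_sum(1394))  # 43, as computed above
```

The returned sum has the same remainder mod 7 as the original number, so you can apply the function repeatedly until the remainder is obvious.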

Why does this method work? The key is that the multipliers 1, 3, 2, 6, 4, 5 are the remainders when the powers of 10 are divided by 7. Since 10^{6} has a remainder of 1 when divided by 7, the numbers 10^{6a+b} and 10^{b} have the same remainder by 7, and that’s why the multipliers have period 6.

All the trick is doing is expanding the base 10 representation of a number and adding up the remainders when each term is divided by seven. In our example, 1394 = 1000 + 3*100 + 9*10 + 4, and mod 7 this reduces to 1*6 + 3*2 + 9*3 + 4*1, the exact calculation above.

The trick presented here is analogous to casting out nines. But since every power of 10 leaves a remainder of 1 when divided by 9, all the multipliers in casting out nines are 1.

You could follow the pattern of this method to create a divisibility rule for any other divisor, say 13 for example, by letting the multipliers be the remainders when powers of 10 are divided by 13.
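Here’s a sketch of how you might generate the multipliers for any divisor relatively prime to 10 (the loop below would never terminate for a divisor sharing a factor with 10):

```python
def multipliers(d):
    # Remainders of successive powers of 10 mod d. The sequence is
    # periodic, repeating once the remainder returns to 1, so stop there.
    # Assumes d is relatively prime to 10.
    ms = [1]
    m = 10 % d
    while m != 1:
        ms.append(m)
        m = (m * 10) % d
    return ms

print(multipliers(7))   # [1, 3, 2, 6, 4, 5]
print(multipliers(13))  # [1, 10, 9, 12, 3, 4]
print(multipliers(9))   # [1], i.e. casting out nines
```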

**Related post**: Divisibility rules in hex

((17 + 12√2)^{n} + (17 – 12√2)^{n} – 2)/32

Now 17 – 12√2 is 0.029… and so the term (17 – 12√2)^{n} approaches zero very quickly as *n* increases. So *f*(*n*) is very nearly

((17 + 12√2)^{n} – 2)/32

The error in the approximation is less than 0.001 for all *n*, so the approximation is *exact* when rounded to the nearest integer. So the *n*th square triangular number is

⌊((17 + 12√2)^{n} + 14)/32⌋

where ⌊*x*⌋ is the greatest integer less than or equal to *x*.

Here’s how you might implement this in Python:

from math import sqrt, floor

def f(n):
    x = 17 + 12*sqrt(2)
    return floor((x**n + 14)/32.0)

Unfortunately this formula isn’t that useful in ordinary floating point computation. When *n* = 11 or larger, the result needs more bits than are available in the significand of a floating point number. The computed result is accurate to around 15 digits, but the exact result is longer than 15 digits.

The result can also be computed with a recurrence relation:

*f*(*n*) = 34 *f*(*n*-1) – *f*(*n*-2) + 2

for *n* > 2. (Or *n* > 1 if you define *f*(0) to be 0, a reasonable thing to do).

This only requires integer arithmetic, so its only limitation is the size of the integers you can compute with. Since Python has arbitrarily large integers, the following works for arbitrarily large arguments.

def f(n):
    if n < 2:
        return n
    return f(n-1)*34 - f(n-2) + 2

This is a direct implementation, though it’s inefficient because of the redundant function evaluations. It could be made more efficient by, for example, using memoization.

A triangular number is the sum of the first so many positive integers. For example, 10 is a triangular number because it equals 1+2+3+4. These numbers are called triangular because you can arrange coins in a triangle: one coin in the first row, two in the second, three in the third, and so on.

The smallest number that is both triangular and square is 1. The next smallest is 36. There are infinitely many numbers that are both triangular and square, and there’s even a formula for the *n*th number that is both triangular and square:

((17 + 12√2)^{n} + (17 – 12√2)^{n} – 2)/32

Source: American Mathematical Monthly, February 1962, page 169.

For more on triangle numbers and their generalizations, see Twelve Days of Christmas and Tetrahedral Numbers.

There is also a way to compute the square triangular numbers recursively discussed in the next post.

After the alphabet and the tables of multiplication, nothing has proved quite so useful in my professional life as these six little expressions.

The six expressions he refers to are nicknamed the *vergeet-me-nietjes* in Dutch, which translates to forget-me-nots in English. They are also known as Dr. Myosotis’s equations because myosotis is the genus for forget-me-nots. The equations give the angular and linear deflections of a cantilever beam.

Imagine a beam anchored at one end and free on the other, subject to one of three kinds of load: a bending moment *M* at the free end, a point force *P* at the free end, or a force *w* distributed over the length of the beam. The equations below give the rotation (angular deflection) and displacement (linear deflection) of the free end of the beam.

Load | Rotation | Displacement
---|---|---
Bending moment | ML/EI | ML^{2}/2EI
Point load | PL^{2}/2EI | PL^{3}/3EI
Distributed load | wL^{3}/6EI | wL^{4}/8EI

Here *E* is the modulus of elasticity, *L* is the length of the beam, and *I* is the area moment of inertia.
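The six formulas are simple enough to bundle into one small function. Here’s a sketch (the function name and signature are mine; any consistent system of units works):

```python
def cantilever_end(load, magnitude, L, E, I):
    # Rotation and displacement of the free end of a cantilever beam,
    # per the forget-me-not formulas above. 'load' selects the load type.
    EI = E * I
    if load == "moment":       # bending moment M at the free end
        M = magnitude
        return M*L/EI, M*L**2/(2*EI)
    if load == "point":        # point force P at the free end
        P = magnitude
        return P*L**2/(2*EI), P*L**3/(3*EI)
    if load == "distributed":  # force w distributed over the beam
        w = magnitude
        return w*L**3/(6*EI), w*L**4/(8*EI)
    raise ValueError("unknown load type")

# e.g. a 1 kN tip load on a 2 m steel beam with EI = 2e5 N m^2
print(cantilever_end("point", 1000.0, 2.0, 200e9, 1e-6))
```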

People often answer the question with a list of their favorite books, perhaps skewed in favor of long books. But I don’t think you should take books that have been your favorites in the past. You should take what you think would be your favorite books **on a deserted island**. I expect my tastes there would be very different than they are in my current suburban life.

I think of books that I could only read on a desert island, books that I’ve enjoyed in small portions but ordinarily would not have the patience to read cover-to-cover. For example, I’ve found portions of Donald Knuth’s series The Art of Computer Programming enjoyable and useful, but I can’t say I’ve read it all. Perhaps on a deserted island with little to do and few distractions I’d enjoy going through it carefully line by line, attempting all the exercises. I might even learn MMIX, something I can’t imagine doing under ordinary circumstances.

Along those lines, I might want to take some works by Thomas Aquinas such as his Summa Theologica or his commentaries on Aristotle. The little I’ve read of Aquinas has been profound, and more approachable than I expected. Still, I find it hard to read much of him. Alone on an island I might take the time to read him carefully.

For math, I might want to take Spivak’s differential geometry series, depending on how long I expect to be on this island. If I’m going to be there too long and I’m limited on books, I might want to take something else that’s more dense and less familiar.

For science, I’d take Gravitation by Misner, Thorne, and Wheeler. I’ve intended to read that book for many years and have started a couple times. In college I couldn’t afford this price; now I can’t afford the time.

For fiction, I’d take Patrick O’Brian’s Aubrey/Maturin series because I haven’t read it, I’ve heard it is very well written, and it’s long.


The root problem turned out to be an assumption that sounds reasonable. You’re making two changes, one to a numerator and one to a denominator. The total change equals the sum of the results of each change separately. Except that’s not so.

At this point, a mathematician would say “Of course you can’t split the effects like that. It’s nonlinear.” But it’s worth pursuing a little further. For one thing, it doesn’t help a general audience to just say “it’s nonlinear.” For another, it’s worth seeing when it is appropriate, at least approximately, to attribute the effects this way.

You start with a numerator and denominator, *N*/*D*, then change *N* to *N*+*n* and change *D* to *D*+*d*. The total change is then (*N*+*n*)/(*D*+*d*) – *N*/*D*.

The result from only the change in the numerator is *n*/*D*. The result from only the change in denominator is *N*/(*D*+*d*) – *N*/*D*.

The difference between the total change and the sum of the two partial changes is

–*nd*/(*D*(*D*+*d*)).

The assumption that you can take the total change and attribute it to each change separately is wrong in general. But it is correct if *n* or *d* is zero, and it is approximately correct when *nd* is small. This can make the bug harder to find. It could also be useful when *nd* is indeed small and you don’t need to be exact.

Also, if all the terms are positive, the discrepancy is negative, i.e. the total change is less than the sum of the partial changes. Said another way, allocating the change to each cause separately over-estimates the total change.
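The algebra is easy to check with exact rational arithmetic. A quick sketch:

```python
from fractions import Fraction

def discrepancy(N, D, n, d):
    # (total change) minus (sum of the two partial changes);
    # the algebra says this always equals -n*d / (D*(D+d)).
    total = Fraction(N + n, D + d) - Fraction(N, D)
    partials = Fraction(n, D) + (Fraction(N, D + d) - Fraction(N, D))
    return total - partials

print(discrepancy(3, 4, 1, 2))  # -1/12
```

Note that for positive terms the discrepancy is negative, as claimed below.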


**JC**: Thanks for giving me a copy of the book when we were at SciPy 2015. It’s a nice book. It’s about a lot more than computational physics.

**KH**: Right. If you think of it as physical science in general, that’s the group we’re trying to target.

**JC**: Targeting physical science more than life science?

**KH**: Yes. You can see that more in the data structures we cover which are very float-based rather than things like strings and locations.

**AS**: To second that, I’d say that all the examples are coming from the physical sciences. The deep examples, like in the parallelism chapter, are most relevant to physicists.

**JC**: Even in life sciences, there’s a lot more than sequences of base pairs.

**KH**: Right. A lot of people have asked what chapters they should skip. It’s probable that ecologists or social scientists are not going to be interested in the chapter about HDF5. But the rest of the book, more or less, could be useful to them.

**JC**: I was impressed that there’s a lot of scattered stuff that you need to know that you’ve brought into one place. This would be a great book to hand a beginning grad student.

**KH**: That was a big motivation for writing the book. Anthony’s a professor now and I’m applying to be a professor and I can’t spend all my time ramping students up to be useful researchers. I’d rather say “Here’s a book. It’s yours. Come to me if it’s not in the index.”

**JC**: And they’d rather have a book that they could access any time than have to come to you. Are you thinking of doing a second edition as things change over time?

**AS**: It’s on the table to do a second edition eventually. Katy and I will have the opportunity if the book is profitable and the material becomes out of date. O’Reilly could ask someone else to write a second edition, but they would ask us first.

**JC**: Presumably putting out a second edition would not be as much work as creating the first one.

**KH**: I sure hope not!

**AS**: There’s a lot of stuff that’s not in this book. Greg Wilson jokingly asked us when Volume 2 would come out. There may be a need for a more intermediate book that extends the topics.

**KH**: And maybe targets languages other than Python where you’re going to have to deal with configuring and building, installing and linking libraries, that kind of stuff. I’d like to cover more of that, but Python doesn’t have that problem!

**JC**: You may sell a lot of books when the school year starts.

**KH**: Anthony and I both have plans for courses based around this book. Hopefully students will find it helpful.

**JC**: Maybe someone else is planning the same thing. It would be nice if they told you.

**AS**: A couple people have approached us about doing exactly that. Something I’d like to see is for people teaching courses around it to pull their curriculum together.

**JC**: Is there a web site for the book, other than an errata page at the publisher?

**KH**: Sure, there’s physics.codes. Anthony put that up.

**AS**: There’s also a GitHub repo, physics-codes. That’s where you can find code for all the examples, and that should be kept up to date. We also have a YouTube channel.

**JC**: When did y’all start writing the book?

**AS**: It was April or May last year when we finally started writing. There was a proposal cycle six or seven months before that. Katy and I were simultaneously talking to O’Reilly, so that worked out well.

**KH**: In a sense, the book process started for me in graduate school with The Hacker Within and Software Carpentry. A lot of the flows in the book come from the outlines of Hacker Within tutorials and Software Carpentry tutorials years ago.

**AS**: On that note, what happened for me, I took those tutorials and turned them into a masters course for AIMS, African Institute for Mathematical Sciences. At the end I thought it would be nice if this were a book. It didn’t occur to me that there was a book’s worth of material until the end of the course at AIMS. I owe a great debt to AIMS in that way.

**JC**: Is there something else you’d like to say about the book that we haven’t talked about?

**KH**: I think it would be a fun exercise for someone to try to determine which half of the chapters I wrote and which Anthony wrote. Maybe using some sort of clustering algorithm or pun detection. If anyone wants to do that sort of analysis, I’d love to see if you guess right. Open competition. Free beer from Katy if you can figure out which half. We split the work in half, but it’s really mixed around. People who know us well will probably know that Anthony’s chapters have a high density of puns.

**AS**: I think the main point that I would like to see come across is that the book is useful to a broader audience outside the physical sciences. Even for people who are not scientists themselves, it’s useful to describe the mindset of physical scientists to software developers or managers. That communication protocol kinda goes both ways, though I didn’t expect that when we started out.

**JC**: I appreciate that it’s *one* book. Obviously it won’t cover everything you need to know. But it’s not like here’s a book on Linux, here’s a book on git, here are several books on Python. And some of the material in here isn’t in any book.

**KH**: Like licensing. Anthony had the idea to add the chapter on licensing. We get asked all the time “Which license do you use? And why?” It’s confusing, and you can get it really wrong.

* * *

Check out Effective Computation in Physics. It’s about more than physics. It’s a lot of what you need to know to get started with scientific computing in Python, all in one place.

**Related**:

- Scientific computing in Python (notes from SciPy 2015)
- Python resources
- Winston Churchill, Bessie Braddock, and Python

If *b* is a power of 2, we showed in the previous post that the last digit of *P* is *b*-1.

If *b* is odd, we can find the last digit using Euler’s totient theorem. Let φ(*b*) be the number of positive integers less than *b* and relatively prime to *b*. Then Euler’s theorem tells us that 2^{φ(b)} ≡ 1 mod *b* since *b* is odd. So if *r* is the remainder when 57885161 is divided by φ(*b*), then the last digit of *Q* = *P*+1 in base *b* is the same as the last digit of 2^{r} in base *b*.

For example, suppose we wanted to know the last digit of *P* in base 15. Since φ(15) = 8, and the remainder when 57885161 is divided by 8 is 1, the last digit of *Q* base 15 is 2. So the last digit of *P* is 1.

So we know how to compute the last digit in base *b* if *b* is a power of 2 or odd. Factor out all the powers of 2 from *b* so that *b* = 2^{k} *d* with *d* odd. We can find the last digit base 2^{k}, and we can find the last digit base *d*, so can we combine these to find the last digit base *b*? Yes, that’s exactly what the Chinese Remainder Theorem is for.

To illustrate, suppose we want to find the last digit of *P* in base 12. *P* ≡ 3 mod 4 and *P* ≡ 1 mod 3, so *P* ≡ 7 mod 12. (The numbers are small enough here to guess. For a systematic approach, see the post mentioned above.) So the last digit of *P* is 7 in base 12.
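If you just want the answer rather than the arithmetic practice, Python’s three-argument `pow` makes short work of it via fast modular exponentiation. A sketch (the function name is mine):

```python
def last_digit_of_P(b):
    # Last base-b digit of P = 2**57885161 - 1.
    # pow(2, e, b) computes 2**e mod b without forming the huge power.
    return (pow(2, 57885161, b) - 1) % b

print(last_digit_of_P(12))  # 7, matching the example above
print(last_digit_of_P(15))  # 1
```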

If you’d like to write a program to play around with this, you need to be able to compute φ(*n*). You may have an implementation of this function in your favorite environment; in Python, for example, it’s `sympy.ntheory.totient`. If not, it’s easy to compute φ(*n*) if you can factor *n*. You just need to know two things about φ. First, it’s a multiplicative function, meaning that if *a* and *b* are relatively prime, then φ(*ab*) = φ(*a*) φ(*b*). Second, if *p* is a prime, then φ(*p*^{k}) = *p*^{k} – *p*^{k-1}.
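Those two facts translate directly into code. Here’s a sketch that factors by trial division, so it needs no library:

```python
def totient(n):
    # Euler's phi from the prime factorization: phi is multiplicative,
    # and phi(p**k) = p**k - p**(k-1). Factor n by trial division.
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            pk = 1
            while n % p == 0:
                n //= p
                pk *= p
            result *= pk - pk // p
        p += 1
    if n > 1:          # whatever remains is a single prime factor
        result *= n - 1
    return result

print(totient(15))  # 8
```

Trial division is fine for playing around; for large *n* you’d want a proper factoring routine.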

You can convert binary numbers to other bases that are powers of two, say 2^{k}, by starting at the right end and converting to the new base in blocks of size *k*. So in base 4, we’d start at the right end and convert pairs of bits to base 4. Until we get to the left end, all the pairs are 11, and so they all become 3’s in base four. So in base 4, *P* would be written as a 1 followed by 28,942,580 threes.

Similarly, we could write *P* in base 8 by starting at the right end and converting triples of bits to base 8. Since 57885161 = 3*19295053 + 2, we’ll have two bits left over when we get to the left end, so in base 8, *P* would be written as 3 followed by 19,295,053 sevens. And since 57885161 = 4*14471290 + 1, the hexadecimal (base 16) representation of *P* is a 1 followed by 14,471,290 F’s.
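You can check this pattern on small Mersenne numbers. A sketch (the function name is mine):

```python
def mersenne_digits(e, k):
    # Digits of 2**e - 1 in base 2**k, most significant first.
    b = 2**k
    n = 2**e - 1
    digits = []
    while n:
        digits.append(n % b)
        n //= b
    return digits[::-1]

# 2**11 - 1 in base 8: 11 = 3*3 + 2, so a leading 2**2 - 1 = 3
# followed by three 7s.
print(mersenne_digits(11, 3))  # [3, 7, 7, 7]
```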

What about base 10, i.e. how many digits does *P* have? The number of digits in a positive integer *n* is ⌊log_{10}*n*⌋ + 1 where ⌊*x*⌋ is the greatest integer less than or equal to *x*. For positive numbers, this just means chop off the decimals and keep the integer part.

We’ll make things slightly easier for a moment and find how many digits *P*+1 has.

log_{10}(*P*+1) = log_{10}(2^{57885161}) = 57885161 log_{10}(2) = 17425169.76…

and so *P*+1 has 17425170 digits, and so does *P*. How do we know *P* and *P*+1 have the same number of digits? There are several ways you could argue this. One is that *P*+1 is not a power of 10, so the number of digits couldn’t change between *P* and *P*+1.
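In code, the count is one line:

```python
from math import log10, floor

# Number of decimal digits in P = 2**57885161 - 1.
digits = floor(57885161 * log10(2)) + 1
print(digits)  # 17425170
```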

**Update**: How many digits does *P* have in other bases?

We can take the formula above and simply substitute b for 10: The base *b* representation of a positive integer *n* has ⌊log_{b}*n*⌋ + 1 digits.

As above, we’d rather work with *Q* = *P*+1 than *P*. The numbers *P* and *Q* will have the same number of digits unless *Q* is a power of the base *b*. But since *Q* is a power of 2, this could only happen if *b* is also a power of two, say *b* = 2^{k}, with *k* a divisor of 57885161. But 57885161 is prime, so this could only happen if *b* = *Q*. If *b* > *P*, then *P* has one digit base *b*. So we can concentrate on the case *b* < *Q*, in which case *P* and *Q* have the same number of digits. The number of digits in *Q* is then ⌊57885161 log_{b}(2)⌋ + 1.

If *b* = 2^{k}, we can generalize the discussion above about bases 4, 8, and 16. Let *q* and *r* be the quotient and remainder when 57885161 is divided by *k*. If *r* > 0, the leading digit is 2^{r} – 1 and the remaining *q* digits are *b*-1.

**Next post**: Last digit of largest known prime


The 61st Fibonacci number is 2504730781961. The 62nd is 4052739537881. Since these end in 1 and 1, the 63rd Fibonacci number must end in 2, etc. and so the pattern starts over.

It’s not obvious that the cycle should have length 60, but it is fairly easy to see that there must be a cycle. There are only 10*10 possibilities for two consecutive digits. Since the Fibonacci numbers are determined by a two-term recurrence, and since the last digit of a sum is determined by the sum of the last digits, the sequence of last digits must repeat eventually. Here “eventually” means after at most 10*10 terms.

Replace “10” by any other base in the paragraph above to show that the sequence of last digits must be cyclic in any base. In base 16, for example, the period is 24. In hexadecimal notation the 25th Fibonacci number is 12511 and the 26th is 1DA31, so the 27th must end in 2, etc.

Here’s a little Python code to find the period of the last digits of Fibonacci numbers working in any base b.

from sympy import fibonacci as f

def period(b):
    for i in range(1, b*b + 1):
        if f(i) % b == 0 and f(i + 1) % b == 1:
            return i

This shows that in base 100 the period is 300. So in base 10 the last *two* digits repeat every 300 terms.

The period seems to vary erratically with base as shown in the graph below.

Once at twelve on one night’s drear, ’twas while I, weak and tired thought here

On the words in lots of quaint and odd old tomes of mind’s lost lore,

While I dozed, so near a nap, there came but then a soft, quick tap,

As of one who made a rap, a rap at my front room’s closed door.

“‘Tis some guest,” I spoke, voice low, “who taps at my front room’s closed door,

Well, Just this, and not much more.”

Morice’s entire poem is available here.

**Related post**: A rewriting of The Raven that gives a mnemonic for the digits of pi.

A young developer approached me after a conf talk and said, “You must feel really bad about the failure of object-oriented programming.” I was confused. I said, “What do you mean, object-oriented programming was a failure? Why do you think that?”

He said, “OOP was supposed to fix all of our software engineering problems and it clearly hasn’t. Building software today is just as hard as it was before OOP came along.”

I said, “Have you ever looked at the programs we were building in the early 1980s? At how limited their functionality and UIs were? OOP has been an incredible success. It enabled us to manage complexity as we grew from 100KB applications to today’s 100MB applications.”

Of course OOP hasn’t solved all software engineering problems. Neither has anything else. But OOP has been enormously successful in allowing ordinary programmers to write much larger applications. It has become so pervasive that few programmers consciously think about it; it’s simply how you write software.

I’ve written several posts poking fun at the excesses of OOP and expressing moderate enthusiasm for functional programming, but I appreciate OOP. I believe functional programming will influence object oriented programming, but not replace it.

**Related**:

- Some problems simply have no solution.
- Some problems have no simple solution.
- Some problems have many solutions.
- Determining that a solution exists may be half the work of finding it.
- Solutions that work well locally may blow up when extended too far.
- Boundary conditions are the hard part.
- Something that starts out as a good solution may become a very bad solution.
- You can fool yourself by constructing a solution where one doesn’t exist.
- Expand your possibilities to find a solution, then reduce them to see how good the solution is.
- You can sometimes do what sounds impossible by reframing your problem.


The *n*th harmonic number, *H*_{n}, is the sum of the reciprocals of the integers up to and including *n*. For example,

*H*_{4} = 1 + 1/2 + 1/3 + 1/4 = 25/12.

Here’s a curious fact about harmonic numbers, known as **Wolstenholme’s theorem**:

For a prime *p* > 3, the numerator of *H*_{p-1} is divisible by *p*^{2}.

The example above shows this for *p* = 5. In that case, the numerator is not just divisible by *p*^{2}, it *is* *p*^{2}, though this doesn’t hold in general. For example, *H*_{10} = 7381/2520. The numerator 7381 is divisible by 11^{2} = 121, but it’s not equal to 121.

The generalized harmonic numbers *H*_{n,m} are the sums of the reciprocals of the first *n* positive integers, each raised to the power *m*. Wolstenholme’s theorem says something about these numbers too:

For a prime *p* > 3, the numerator of *H*_{p-1,2} is divisible by *p*.

For example, *H*_{4,2} = 205/144, and the numerator is clearly divisible by 5.

You can play with harmonic numbers and generalized harmonic numbers in Python using SymPy. Start with the import statement

from sympy.functions.combinatorial.numbers import harmonic

Then you can get the *n*th harmonic number with `harmonic(n)` and generalized harmonic numbers with `harmonic(n, m)`.

To extract the numerators, you can use the method `as_numer_denom` to turn the fractions into (numerator, denominator) pairs. For example, you can create a list of the numerators of the first 10 harmonic numbers with

[harmonic(n).as_numer_denom()[0] for n in range(10)]

You might notice that `harmonic(0)` returns 0, as it should. The sum defining the harmonic numbers is empty in this case, and empty sums are defined to be zero.
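Wolstenholme’s theorem itself is easy to check numerically. Here’s a sketch using the standard library’s `Fraction`, so it doesn’t depend on SymPy (the function name is mine):

```python
from fractions import Fraction

def harmonic_numerator(n, m=1):
    # Numerator of the generalized harmonic number H_{n,m} in lowest terms.
    return sum(Fraction(1, k**m) for k in range(1, n + 1)).numerator

# For primes p > 3: numerator of H_{p-1} is divisible by p**2,
# and numerator of H_{p-1,2} is divisible by p.
for p in [5, 7, 11, 13]:
    print(harmonic_numerator(p - 1) % p**2,
          harmonic_numerator(p - 1, 2) % p)  # 0 0 each time
```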

Traditional numerical integration routines work well in low dimensions. (“Low” depends on context, but let’s say less than 10 dimensions.) When you have the choice of a well-understood, deterministic method, and it works well, by all means use it. I’ve seen people who are so used to problems requiring Monte Carlo methods that they use these methods on low-dimensional problems where they’re not the most efficient (or robust) alternative.

Except for very special integrands, high dimensional integration requires either Monte Carlo integration or some variation such as quasi-Monte Carlo methods.

As I wrote about here, naive Monte Carlo integration isn’t practical for high dimensions. It’s unlikely that any of your integration points will fall in the region of interest without some sort of importance sampler. Constructing a good importance sampler requires some understanding of your particular problem. (Sorry. No one-size-fits-all black-box methods are available.)
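A quick experiment shows the problem: uniform samples from a high-dimensional cube almost never land inside the inscribed ball. A sketch:

```python
import random

def hit_fraction(dim, trials=100_000, seed=1):
    # Fraction of points drawn uniformly from [-1, 1]^dim
    # that fall inside the unit ball.
    rng = random.Random(seed)
    hits = sum(
        sum(rng.uniform(-1, 1)**2 for _ in range(dim)) <= 1
        for _ in range(trials)
    )
    return hits / trials

print(hit_fraction(2))   # about 0.785, i.e. pi/4
print(hit_fraction(20))  # essentially zero
```

If the unit ball were your region of interest in 20 dimensions, naive sampling would waste essentially every point, which is why importance sampling becomes necessary.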

Quasi-Monte Carlo methods (QMC) are interesting because they rely on something like a random sequence, but “more random” in a sense, i.e. more effective at exploring a high-dimensional space. These methods work well when an integrand has low effective dimension. The effective dimension is low when nearly all the action takes place on a lower dimensional manifold. For example, a function may ostensibly depend on 10,000 variables, while for practical purposes 12 of these are by far the most important. There are some papers by Art Owen that say, roughly, that Quasi-Monte Carlo methods work well if and only if the effective dimension is much smaller than the nominal dimension. (QMC methods are especially popular in financial applications.)

While QMC methods can be more efficient than Monte Carlo methods, it’s hard to get bounds on the integration error with these methods. Hybrid approaches, such as randomized QMC, can perform remarkably well and give theoretical error bounds.

Markov Chain Monte Carlo (MCMC) is a common variation on Monte Carlo integration that uses **dependent** random samples. In many applications, such as Bayesian statistics, you’d like to be able to create independent random samples from some probability distribution, but this is not practical. MCMC is a compromise. It produces a Markov chain of dependent samples, and asymptotically these samples have the desired distribution. The dependent nature of the samples makes it difficult to estimate error and to determine how well the integration estimates using the Markov chain have converged.

(Some people speak of the Markov chain itself converging. It doesn’t. The *estimates* using the Markov chain may converge. Speaking about the chain converging may just be using informal language, or it may betray a lack of understanding.)

There is little theory regarding the convergence of estimates from MCMC, aside from toy problems. An estimate can appear to have converged, while an important region of the integration problem remains unexplored. Despite its drawbacks, MCMC is the only game in town for many applications.

If you’d like help with high dimensional integration, or any other kind of numerical integration, please contact me to discuss how we might work together.

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

The folks behind Mathematica and Wolfram Alpha announced yesterday that now they too can read email.

You can get some idea of the rapid development of the scientific Python stack and its future direction by watching the final keynote of the conference by Jake VanderPlas.

I used Python for a while before I discovered that there were so many Python libraries for scientific computing. At the time I was considering learning Ruby or some other scripting language, but I committed to Python when I found out that Python has far more libraries for the kind of work I do than other languages do. **It felt like I’d discovered a secret hoard of code**. I expect it would be easier today to discover the scientific Python stack. (It really is becoming more of an integrated **stack**, not just a collection of isolated libraries. This is one of the themes in the keynote above.)

When people ask me why I use Python, rather than languages like Matlab or R, my response is that I do a mixture of mathematical programming and general programming. **I’d rather do mathematics in a general programming language than do general programming in a mathematical language**.

One of the drawbacks of Python, relative to C++ and related languages, is speed. This is a problem in languages like R as well. However, with Python there are ways to speed up code without completely rewriting it, such as Cython and Numba. The only reliable way I’ve found to make R much faster is to rewrite it in another language.

Another drawback of Python until recently was that data manipulation and exploration were not as convenient as one would hope. However, that has changed due to developments such as Pandas, initiated by Wes McKinney. For more on how that came to be and where it’s going, see his keynote from the second day of SciPy 2015.

It’s not clear why Python has become the language of choice for so many people in scientific computing. Maybe if people like Travis Oliphant had decided to use some other language for scientific programming years ago, we’d all be using that language now. Python wasn’t intended to be a scientific programming language. And as Jake VanderPlas points out in his keynote, Python still is not a scientific programming language, but the foundation for a scientific programming stack. Maybe Python’s strength is that it’s *not* a scientific language. It has drawn more computer scientists to contribute to the core language than it would have if it had been more of a domain-specific language.

* * *

If you’d like help moving to the Python stack, please let me know.

Yesterday at SciPy 2015 Allen Downey did something similar for his contact info and gave me the idea for the image above.

LinkedIn doesn’t quite fit; you have to know that LinkedIn profiles stick `linkedin.com/in/` before the user name.

- All of the below
- None of the below
- All of the above
- One of the above
- None of the above
- None of the above

Which answer is correct?

It’s easy to prove this assertion by brute force. Since the last digit of *b*^{n} only depends on the last digit of *b*, it’s enough to verify that the statement above holds for 0, 1, 2, …, 9.

Although that’s a valid proof, it’s not very satisfying. Why fifth powers? We’re implicitly working in base 10, as humans usually do, so what is the relation between 5 and 10 in this context? How might we generalize it to other bases?

The key is that 5 = φ(10) + 1 where φ(*n*) is the number of positive integers less than *n* and relatively prime to *n*.

Euler discovered the φ function and proved that if *a* and *m* are relatively prime, then

*a*^{φ(m)} = 1 (mod *m*)

This means that *a*^{φ(m)} – 1 is divisible by *m*. (Technically I should use the symbol ≡ (U+2261) rather than = above since the statement is a congruence, not an equation. But since Unicode font support on various devices is disappointing, not everyone could see the proper symbol.)

Euler’s theorem tells us that if *a* is relatively prime to 10 then *a*^{4} ends in 1, so *a*^{5} ends in the same digit as *a*. That proves our original statement for numbers ending in 1, 3, 7, and 9. But what about the rest, numbers that are divisible by 2 or 5? We’ll answer that question and a slightly more general one at the same time. Let *m* = αβ where α and β are distinct primes. In particular, we could choose α = 2 and β = 5. We will show that

*a*^{φ(m)+1} = *a* (mod *m*)

for all *a*, whether relatively prime to *m* or not. This would show in addition, for example, that in base 15, every number keeps the same last “digit” when raised to the 9th power since φ(15) = 8.
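As a numerical sanity check, here is a short Python sketch verifying the statement for *m* = 10 and *m* = 15. The `phi` helper is a direct count, not an efficient implementation:

```python
from math import gcd

def phi(m):
    """Euler's totient: count of 1 <= k < m with gcd(k, m) == 1."""
    return sum(1 for k in range(1, m) if gcd(k, m) == 1)

# For m a product of two distinct primes, a^(phi(m)+1) == a (mod m)
# for every a, not just those relatively prime to m.
for m in (10, 15):          # 10 = 2*5, 15 = 3*5
    e = phi(m) + 1          # exponent 5 for base 10, 9 for base 15
    assert all(pow(a, e, m) == a % m for a in range(m))
```

Since φ(15) = 8, this confirms that in base 15 every number keeps its last “digit” when raised to the 9th power.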

We only need to be concerned with the case of *a* being a multiple of α or β, since Euler’s theorem proves our theorem for the rest of the cases. If *a* is a multiple of αβ our theorem obviously holds, so assume *a* is a multiple of α alone, *a* = *k*α with *k* less than β (and hence relatively prime to β).

We need to show that αβ divides (*k*α)^{φ(αβ)+1} – *k*α. This expression is clearly divisible by α, so the remaining task is to show that it is divisible by β. We’ll show that (*k*α)^{φ(αβ)} – 1 is divisible by β.

Since α and *k* are relatively prime to β, Euler’s theorem tells us that α^{φ(β)} and *k*^{φ(β)} are congruent to 1 (mod β). This implies that (*k*α)^{φ(β)} is congruent to 1, and so (*k*α)^{φ(α)φ(β)} is also congruent to 1 (mod β). One of the basic properties of φ is that for relatively prime arguments α and β, φ(αβ) = φ(α)φ(β), and so we’re done.

**Exercise**: How much more can you generalize this? Does the base have to be the product of two distinct primes?

I do use software to schedule my tweets in advance. Most of the tweets from my personal account are live. Most of the tweets from my topic accounts are scheduled, though some are live. All replies are manual, not automated, and I don’t scrape content from anywhere.

Occasionally I read the responses to these accounts and sometimes I reply. But with over half a million followers (total, not unique) I don’t try to keep up with all the responses. If you’d like to contact me, you can do so here. That I do keep up with.


… I said that if science could come up with something like the Jump it could surely solve a problem like that. Severin seized hold of that word, “science.” Science, he said, is not some mysterious larger-than-life force, it’s just the name we give to bright ideas that individual guys have when they’re lying in bed at night, and that if the fuel thing bothered me so much, there was nothing stopping me from having a bright idea to solve it …

Although I completely agree that “algorithmic wizardry” is over-rated in general, my personal experience has been a little different. My role on projects has frequently been to supply a little bit of algorithmic wizardry. I’ve often been asked to look into a program that is taking too long to run and been able to speed it up by an order of magnitude or two by improving a numerical algorithm. (See an example here.)

James Hague says that “rarely is there some … algorithm that casts a looming shadow over everything else.” I believe he is right, though I’ve been called into projects precisely on those rare occasions when an algorithm *does* cast a shadow over everything else.

Here are some of the most popular posts on this site and some other things I’ve written.

If you’d like to subscribe to this site you can do so by RSS or email. I also have a monthly newsletter.

You can find out more about me and my background here.

You can also find me on Twitter and Google+.

If you have any questions or comments, here’s my contact info.

When it comes to writing code, the number one most important skill is how to keep a tangle of features from collapsing under the weight of its own complexity. I’ve worked on large telecommunications systems, console games, blogging software, a bunch of personal tools, and very rarely is there some tricky data structure or algorithm that casts a looming shadow over everything else. But there’s always lots of state to keep track of, rearranging of values, handling special cases, and carefully working out how all the pieces of a system interact. To a great extent the act of coding is one of organization. Refactoring. Simplifying. Figuring out how to remove extraneous manipulations here and there.

Algorithmic wizardry is easier to teach and easier to blog about than organizational skill, so we teach and blog about it instead. A one-hour class, or a blog post, can showcase a clever algorithm. But how do you present a clever bit of organization? If you jump to the solution, it’s unimpressive. “Here’s something simple I came up with. It may not look like much, but trust me, it was really hard to realize this was all I needed to do.” Or worse, “Here’s a moderately complicated pile of code, but you should have seen how much more complicated it was before. At least now someone stands a shot of understanding it.” Ho hum. I guess you had to be there.

You can’t appreciate a feat of organization until you experience the disorganization. But it’s hard to have the patience to wrap your head around a disorganized mess that you don’t care about. Only if the disorganized mess is your responsibility, something that means more to you than a case study, can you wrap your head around it and appreciate improvements. This means that while you can learn algorithmic wizardry through homework assignments, you’re unlikely to learn organization skills unless you work on a large project you care about, most likely because you’re paid to care about it.

**Related posts**:

AI: From “It’s so horrible how little progress has been made” to “It’s so horrible how much progress has been made” in one step.

When I read this I thought of Pandora (the mythical figure, not the music service).

“Are you still working on opening that box? Any progress?”

“No, the lid just … won’t … budge … Oh wait, I think I got it.”

**Related post**: Why the robots aren’t coming in the way you expect by Mark Burgess

I believe that reading only packaged microwavable fiction ruins the taste, destabilizes the moral blood pressure, and makes the mind obese.

I agree with that. That’s why I shop at Amazon.

If I liked to read best-selling junk food, I could find it at any bookstore. But I like to read less popular books, books I can only find from online retailers like Amazon. In fact, most of Amazon’s revenue comes from obscure books, not bestsellers.

Suppose I want to read something by, I don’t know, say, Ursula K. Le Guin. I doubt I could find a copy of any of her books, certainly not her less popular books, within 20 miles of my house, and I live in the 4th largest city in the US. There’s nothing by her in the closest Barnes and Noble. But I could easily find anything she’s ever written on Amazon.

If you’d like to support Amazon so they can continue to bring us fine authors like Ursula K. Le Guin, authors you can’t find in stores that mostly sell packaged microwavable fiction, you can buy one of the books mentioned on this blog from Amazon.

Equations are typically applied left to right. When you write *A* = *B* you imply that it may be useful to replace *A* with *B*. This is helpful to keep in mind when learning something new: the order in which an equation is written gives a hint as to how it may be applied. However, this way of thinking can also be a limitation. Clever applications often come from realizing that you can apply an equation in the opposite of the usual direction.

For example, Euler’s reflection formula says

Γ(*z*) Γ(1-*z*) = π / sin(π*z*).

Reading from left to right, this says that two unfamiliar/difficult things, values of the Gamma function, are related to a more familiar/simple thing, the sine function. It would be odd to look at this formula and say “Great! Now I can compute sines if I just know values of the Gamma function.” Instead, the usual reaction would be “Great! Now I can relate the value of Gamma at two different places by using sines.”
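As a quick sanity check, the reflection formula is easy to verify numerically in Python using the standard library’s `math.gamma`:

```python
from math import gamma, sin, pi, isclose

# Check Euler's reflection formula Γ(z) Γ(1-z) = π / sin(πz)
# at a few non-integer points (it fails at integers, where both
# sides blow up).
for z in (0.1, 0.5, 1.25, 2.7):
    lhs = gamma(z) * gamma(1 - z)
    rhs = pi / sin(pi * z)
    assert isclose(lhs, rhs, rel_tol=1e-9)
```

At *z* = 1/2 both sides equal π, recovering the familiar fact that Γ(1/2) = √π.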

When we see Einstein’s equation

*E* = *mc*^{2}

the first time, we think about creating energy from matter, such as the mass lost in nuclear fission. This applies the formula from left to right, relating what we want to know, an amount of energy, to what we do know, an amount of mass. But you could also read the equation from right to left, calculating the amount of energy, say in an accelerator, necessary to create a particle of a given mass.

Calculus textbooks typically have a list of equations, either inside the covers or in an appendix, that relate an integral on the left to a function or number on the right. This makes sense because calculus students compute integrals. But mathematicians often apply these equations in the opposite direction, replacing a number or function with an integral. To a calculus student this is madness: why replace a familiar thing with a scary thing? But integrals aren’t scary to mathematicians. Expressing a function as an integral is often progress. Properties of a function may be easier to see in integral form. Also, the integral may lend itself to some computational technique, such as reversing the order of integration in a double integral, or reversing the order of taking a limit and an integral.

Calculus textbooks also have lists of equations involving infinite sums, the summation always being on the left. Calculus students want to replace the scary thing, the infinite sum, with the familiar thing, the expression on the right. Generating functions turn this around, wanting to replace things with infinite sums. Again this would seem crazy to a calculus student, but it’s a powerful problem solving technique.

Differential equation students solve differential equations. They want to replace what they find scary, a differential equation, with something more familiar, a function that satisfies the differential equation. But mathematicians sometimes want to replace a function with a differential equation that it satisfies. This is common, for example, in studying special functions. Classical orthogonal polynomials satisfy 2nd order differential equations, and the differential equation takes a different form for different families of orthogonal polynomials. Why would you want to take something as tangible and familiar as a polynomial, something you might study as a sophomore in high school, and replace it with something as abstract and mysterious as a differential equation, something you might study as a sophomore in college? Because some properties, properties that you would not have cared about in high school, are more clearly seen via the differential equations.

The difference between pure functional languages and traditional imperative languages is not quite that simple in practice.

Programming with pure functions is conceptually easy but can be awkward in practice. You could just pass each function the state of the world before the call, and it returns the state of the world after the call. It’s unrealistic to pass a program’s entire state as an argument each time, so you’d like to pass just that state that you need to, and have a convenient way of doing so. You’d also like the compiler to verify that you’re only passing around a limited slice of the world. That’s where **monads** come in.

Suppose you want a function to compute square roots and log its calls. Your square root function would have to take two arguments: the number to find the root of, and the state of the log before the function call. It would also return two arguments: the square root, and the updated log. This is a pain, and it makes function composition difficult.
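To make the awkwardness concrete, here is a hypothetical sketch in Python of the logging square root described above. The `logged_sqrt` name is mine, not from any library; the point is that the log must be threaded through every call by hand, which is exactly the plumbing a monad hides:

```python
from math import sqrt

def logged_sqrt(x, log):
    """Compute sqrt(x) and return it along with an updated call log.

    The log is threaded through explicitly: passed in, extended,
    passed back out. Composing such functions means manually wiring
    the log from one call into the next.
    """
    result = sqrt(x)
    new_log = log + [f"sqrt({x}) = {result}"]
    return result, new_log

# Composing two logged calls: the caller does the bookkeeping.
root, log = logged_sqrt(16.0, [])
root, log = logged_sqrt(root, log)
print(root)  # 2.0
print(log)   # ['sqrt(16.0) = 4.0', 'sqrt(4.0) = 2.0']
```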

**Monads provide a sort of side-band** for passing state around, things like our function call log. You’re still passing around the log, but you can do it implicitly using monads. This makes it easier to call and compose two functions that do logging. It also lets the compiler check that you’re passing around a log but not arbitrary state. A function that updates a log, for example, can affect the state of the log, but it can’t do anything else. It can’t launch missiles.

Once monads get large and complicated, it’s hard to know what side effects they hide. **Maybe they can launch missiles after all**. You can only be sure by *studying the source code*. Now how do you know that calling a C function, for example, doesn’t launch missiles? You *study the source code*. In that sense Haskell and C aren’t entirely different.

The Haskell compiler does give you assurances that a C compiler does not. But ultimately you have to study source code to know what a function does and does not do.

**Related post**: Monads are hard because …