- Some problems simply have no solution.
- Some problems have no simple solution.
- Some problems have many solutions.
- Determining that a solution exists may be half the work of finding it.
- Solutions that work well locally may blow up when extended too far.
- Boundary conditions are the hard part.
- Something that starts out as a good solution may become a very bad solution.
- You can fool yourself by constructing a solution where one doesn’t exist.
- Expand your possibilities to find a solution, then reduce them to see how good the solution is.
- You can sometimes do what sounds impossible by reframing your problem.

The *n*th harmonic number, *H*_{n}, is the sum of the reciprocals of the integers up to and including *n*. For example,

*H*_{4} = 1 + 1/2 + 1/3 + 1/4 = 25/12.

Here’s a curious fact about harmonic numbers, known as **Wolstenholme’s theorem**:

For a prime *p* > 3, the numerator of *H*_{p-1} is divisible by *p*^{2}.

The example above shows this for *p* = 5. In that case, the numerator is not just divisible by *p*^{2}, it *is* *p*^{2}, though this doesn’t hold in general. For example, *H*_{10} = 7381/2520. The numerator 7381 is divisible by 11^{2} = 121, but it’s not equal to 121.

The generalized harmonic numbers *H*_{n,m} are the sums of the reciprocals of the first *n* positive integers, each raised to the power *m*. Wolstenholme’s theorem says something about these numbers too:

For a prime *p* > 3, the numerator of *H*_{p-1,2} is divisible by *p*.

For example, *H*_{4,2} = 205/144, and the numerator is clearly divisible by 5.

You can play with harmonic numbers and generalized harmonic numbers in Python using SymPy. Start with the import statement

from sympy.functions.combinatorial.numbers import harmonic

Then you can get the *n*th harmonic number with `harmonic(n)` and generalized harmonic numbers with `harmonic(n, m)`.

To extract the numerators, you can use the method `as_numer_denom` to turn the fractions into (numerator, denominator) pairs. For example, you can create a list of the numerators of the first 10 harmonic numbers with

[harmonic(n).as_numer_denom()[0] for n in range(10)]

You might notice that `harmonic(0)` returns 0, as it should. The sum defining the harmonic numbers is empty in this case, and empty sums are defined to be zero.
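
Putting the pieces together, here’s a minimal sketch (using SymPy’s `harmonic` and `primerange`; it’s an illustration I’ve added, not part of the original discussion) that checks both divisibility claims of Wolstenholme’s theorem for small primes:

    from sympy import primerange
    from sympy.functions.combinatorial.numbers import harmonic

    for p in primerange(5, 50):
        num1 = harmonic(p - 1).as_numer_denom()[0]      # numerator of H_{p-1}
        num2 = harmonic(p - 1, 2).as_numer_denom()[0]   # numerator of H_{p-1,2}
        assert num1 % p**2 == 0   # Wolstenholme: divisible by p^2
        assert num2 % p == 0      # generalized form: divisible by p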

Traditional numerical integration routines work well in low dimensions. (“Low” depends on context, but let’s say less than 10 dimensions.) When you have the choice of a well-understood, deterministic method, and it works well, by all means use it. I’ve seen people who are so used to problems requiring Monte Carlo methods that they use these methods on low-dimensional problems where they’re not the most efficient (or robust) alternative.

Except for very special integrands, high dimensional integration requires either Monte Carlo integration or some variation such as quasi-Monte Carlo methods.

As I wrote about here, naive Monte Carlo integration isn’t practical for high dimensions. It’s unlikely that any of your integration points will fall in the region of interest without some sort of importance sampler. Constructing a good importance sampler requires some understanding of your particular problem. (Sorry. No one-size-fits-all black-box methods are available.)
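
To see how quickly naive sampling breaks down, here’s a minimal sketch (an illustration I’ve added, not from the post referenced above) that estimates the volume of the unit ball inside the cube [-1, 1]^d. As the dimension grows, essentially no uniform samples land inside the ball, so the estimate collapses:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    for d in [2, 5, 10, 20]:
        x = rng.uniform(-1, 1, size=(n, d))
        inside = (np.sum(x**2, axis=1) <= 1).mean()   # fraction of points in the ball
        print(d, inside * 2**d)                       # estimated volume of the d-ball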

Quasi-Monte Carlo methods (QMC) are interesting because they rely on something like a random sequence, but one that is more evenly distributed than a truly random sequence and hence more effective at exploring a high-dimensional space. These methods work well when an integrand has low effective dimension. The effective dimension is low when nearly all the action takes place on a lower dimensional manifold. For example, a function may ostensibly depend on 10,000 variables, while for practical purposes 12 of these are by far the most important. There are some papers by Art Owen that say, roughly, that Quasi-Monte Carlo methods work well if and only if the effective dimension is much smaller than the nominal dimension. (QMC methods are especially popular in financial applications.)

While QMC methods can be more efficient than Monte Carlo methods, it’s hard to get bounds on the integration error with these methods. Hybrid approaches, such as randomized QMC, can perform remarkably well and give theoretical error bounds.
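
Here’s a small sketch of randomized QMC, assuming SciPy’s `scipy.stats.qmc` module (available in SciPy 1.7 and later): integrate a smooth function over the unit cube using a scrambled Sobol' sequence. Because scrambling randomizes the points, rerunning with different seeds gives a crude error estimate, which is part of the appeal of randomized QMC:

    import numpy as np
    from scipy.stats import qmc

    def f(x):
        return np.prod(np.cos(x), axis=1)   # exact integral over [0,1]^6 is sin(1)^6

    sampler = qmc.Sobol(d=6, scramble=True, seed=42)
    points = sampler.random_base2(m=14)     # 2^14 low-discrepancy points
    print(f(points).mean(), np.sin(1.0)**6)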

Markov Chain Monte Carlo (MCMC) is a common variation on Monte Carlo integration that uses **dependent** random samples. In many applications, such as Bayesian statistics, you’d like to be able to create independent random samples from some probability distribution, but this is not practical. MCMC is a compromise. It produces a Markov chain of dependent samples, and asymptotically these samples have the desired distribution. The dependent nature of the samples makes it difficult to estimate error and to determine how well the integration estimates using the Markov chain have converged.
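
To make that concrete, here’s a minimal random-walk Metropolis sketch (an illustration only, not tied to any particular application). The samples are dependent, but long-run averages over the chain estimate expectations under the target distribution:

    import numpy as np

    def log_target(x):
        return -0.5 * x**2 - 0.25 * x**4   # unnormalized log density

    rng = np.random.default_rng(1)
    x, chain = 0.0, []
    for _ in range(50_000):
        proposal = x + rng.normal(scale=1.0)
        # accept with probability min(1, target(proposal) / target(x))
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        chain.append(x)

    samples = np.array(chain[5_000:])            # discard burn-in
    print(samples.mean(), (samples**2).mean())   # estimates of E[X] and E[X^2]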

(Some people speak of the Markov chain itself converging. It doesn’t. The *estimates* using the Markov chain may converge. Speaking about the chain converging may just be using informal language, or it may betray a lack of understanding.)

There is little theory regarding the convergence of estimates from MCMC, aside from toy problems. An estimate can appear to have converged, while an important region of the integration problem remains unexplored. Despite its drawbacks, MCMC is the only game in town for many applications.

If you’d like help with high dimensional integration, or any other kind of numerical integration, please contact me to discuss how we might work together.

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

The folks behind Mathematica and Wolfram Alpha announced yesterday that now they too can read email.

You can get some idea of the rapid development of the scientific Python stack and its future direction by watching the final keynote of the conference by Jake VanderPlas.

I used Python for a while before I discovered that there were so many Python libraries for scientific computing. At the time I was considering learning Ruby or some other scripting language, but I committed to Python when I found out that Python has far more libraries for the kind of work I do than other languages do. **It felt like I’d discovered a secret hoard of code**. I expect it would be easier today to discover the scientific Python stack. (It really is becoming more of an integrated **stack**, not just a collection of isolated libraries. This is one of the themes in the keynote above.)

When people ask me why I use Python, rather than languages like Matlab or R, my response is that I do a mixture of mathematical programming and general programming. **I’d rather do mathematics in a general programming language than do general programming in a mathematical language**.

One of the drawbacks of Python, relative to C++ and related languages, is speed. This is a problem in languages like R as well. However, with Python there are ways to speed up code without completely rewriting it, such as Cython and Numba. The only reliable way I’ve found to make R much faster is to rewrite it in another language.
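
To illustrate the Python side of that, here’s a minimal Numba sketch (the function and sizes below are made up for illustration; actual speedups depend on the workload). The point is that the Python code itself doesn’t have to change:

    import numpy as np
    from numba import njit

    def pairwise_min_dist(points):
        # plain Python loops: slow in CPython, fast once JIT-compiled
        n = len(points)
        best = np.inf
        for i in range(n):
            for j in range(i + 1, n):
                d = np.sum((points[i] - points[j])**2)
                best = min(best, d)
        return np.sqrt(best)

    fast_pairwise_min_dist = njit(pairwise_min_dist)   # compiled version, no rewrite

    points = np.random.default_rng(0).random((2000, 3))
    print(fast_pairwise_min_dist(points))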

Another drawback of Python until recently was that data manipulation and exploration were not as convenient as one would hope. However, that has changed due to developments such as Pandas, initiated by Wes McKinney. For more on how that came to be and where it’s going, see his keynote from the second day of SciPy 2015.

It’s not clear why Python has become the language of choice for so many people in scientific computing. Maybe if people like Travis Oliphant had decided to use some other language for scientific programming years ago, we’d all be using that language now. Python wasn’t intended to be a scientific programming language. And as Jake VanderPlas points out in his keynote, Python still is not a scientific programming language, but the foundation for a scientific programming stack. Maybe Python’s strength is that it’s *not* a scientific language. It has drawn more computer scientists to contribute to the core language than it would have if it had been more of a domain-specific language.

* * *

If you’d like help moving to the Python stack, please let me know.

Yesterday at SciPy 2015 Allen Downey did something similar for his contact info and gave me the idea for the image above.

LinkedIn doesn’t quite fit; you have to know that LinkedIn profiles stick `linkedin.com/in/` before the user name.

- All of the below
- None of the below
- All of the above
- One of the above
- None of the above
- None of the above

Which answer is correct?

It’s easy to prove this assertion (that *b*^{5} always ends in the same digit as *b*) by brute force. Since the last digit of *b*^{n} only depends on the last digit of *b*, it’s enough to verify that the statement above holds for 0, 1, 2, …, 9.
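
Here is that brute-force check as one line of Python:

    # b^5 and b have the same last digit for every digit b
    assert all(pow(b, 5, 10) == b for b in range(10))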

Although that’s a valid proof, it’s not very satisfying. Why fifth powers? We’re implicitly working in base 10, as humans usually do, so what is the relation between 5 and 10 in this context? How might we generalize it to other bases?

The key is that 5 = φ(10) + 1 where φ(*n*) is the number of positive integers less than *n* and relatively prime to *n*.

Euler discovered the φ function and proved that if *a* and *m* are relatively prime, then

*a*^{φ(m)} = 1 (mod *m*)

This means that *a*^{φ(m)} – 1 is divisible by *m*. (Technically I should use the symbol ≡ (U+2261) rather than = above since the statement is a congruence, not an equation. But since Unicode font support on various devices is disappointing, not everyone could see the proper symbol.)

Euler’s theorem tells us that if *a* is relatively prime to 10 then *a*^{4} ends in 1, so *a*^{5} ends in the same digit as *a*. That proves our original statement for numbers ending in 1, 3, 7, and 9. But what about the rest, numbers that are divisible by 2 or 5? We’ll answer that question and a slightly more general one at the same time. Let *m* = αβ where α and β are distinct primes. In particular, we could choose α = 2 and β = 5. We will show that

*a*^{φ(m)+1} = *a* (mod *m*)

for all *a*, whether relatively prime to *m* or not. This would show in addition, for example, that in base 15, every number keeps the same last “digit” when raised to the 9th power since φ(15) = 8.

We only need to be concerned with the case of *a* being a multiple of α or β since Euler’s theorem proves our theorem for the rest of the cases. If *a* = αβ our theorem obviously holds, so assume *a* is some multiple of α, *a* = *k*α with *k* less than β (and hence relatively prime to β).

We need to show that αβ divides (*k*α)^{φ(αβ)+1} – *k*α. This expression is clearly divisible by α, so the remaining task is to show that it is divisible by β. We’ll show that (*k*α)^{φ(αβ)} – 1 is divisible by β.

Since α and *k* are relatively prime to β, Euler’s theorem tells us that α^{φ(β)} and *k*^{φ(β)} are congruent to 1 (mod β). This implies that (*k*α)^{φ(β)} is congruent to 1, and so (*k*α)^{φ(α)φ(β)} is also congruent to 1 (mod β). One of the basic properties of φ is that for relatively prime arguments α and β, φ(αβ) = φ(α)φ(β), and so we’re done.
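
Here’s a quick numerical check of the general statement, a sketch using SymPy’s `totient` function for φ:

    from sympy import totient

    # a^(phi(m)+1) is congruent to a (mod m) when m is a product of two distinct
    # primes, for every residue a, not just those relatively prime to m
    for alpha, beta in [(2, 5), (3, 5), (2, 7), (5, 11)]:
        m = alpha * beta
        e = int(totient(m)) + 1
        assert all(pow(a, e, m) == a for a in range(m))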

**Exercise**: How much more can you generalize this? Does the base have to be the product of two distinct primes?

I do use software to schedule my tweets in advance. Most of the tweets from my personal account are live. Most of the tweets from my topic accounts are scheduled, though some are live. All replies are manual, not automated, and I don’t scrape content from anywhere.

Occasionally I read the responses to these accounts and sometimes I reply. But with over half a million followers (total, not unique) I don’t try to keep up with all the responses. If you’d like to contact me, you can do so here. That I do keep up with.


… I said that if science could come up with something like the Jump it could surely solve a problem like that. Severin seized hold of that word, “science.” Science, he said, is not some mysterious larger-than-life force, it’s just the name we give to bright ideas that individual guys have when they’re lying in bed at night, and that if the fuel thing bothered me so much, there was nothing stopping me from having a bright idea to solve it …

Although I completely agree that “algorithmic wizardry” is over-rated in general, my personal experience has been a little different. My role on projects has frequently been to supply a little bit of algorithmic wizardry. I’ve often been asked to look into a program that is taking too long to run and been able to speed it up by an order of magnitude or two by improving a numerical algorithm. (See an example here.)

James Hague says that “rarely is there some … algorithm that casts a looming shadow over everything else.” I believe he is right, though I’ve been called into projects precisely on those rare occasions when an algorithm *does* cast a shadow over everything else.

Here are some of the most popular posts on this site and some other things I’ve written.

If you’d like to subscribe to this site you can do so by RSS or email. I also have a monthly newsletter.

You can find out more about me and my background here.

You can also find me on Twitter and Google+.

If you have any questions or comments, here’s my contact info.

When it comes to writing code, the number one most important skill is how to keep a tangle of features from collapsing under the weight of its own complexity. I’ve worked on large telecommunications systems, console games, blogging software, a bunch of personal tools, and very rarely is there some tricky data structure or algorithm that casts a looming shadow over everything else. But there’s always lots of state to keep track of, rearranging of values, handling special cases, and carefully working out how all the pieces of a system interact. To a great extent the act of coding is one of organization. Refactoring. Simplifying. Figuring out how to remove extraneous manipulations here and there.

Algorithmic wizardry is easier to teach and easier to blog about than organizational skill, so we teach and blog about it instead. A one-hour class, or a blog post, can showcase a clever algorithm. But how do you present a clever bit of organization? If you jump to the solution, it’s unimpressive. “Here’s something simple I came up with. It may not look like much, but trust me, it was really hard to realize this was all I needed to do.” Or worse, “Here’s a moderately complicated pile of code, but you should have seen how much more complicated it was before. At least now someone stands a shot of understanding it.” Ho hum. I guess you had to be there.

You can’t appreciate a feat of organization until you experience the disorganization. But it’s hard to have the patience to wrap your head around a disorganized mess that you don’t care about. Only if the disorganized mess is your responsibility, something that means more to you than a case study, can you wrap your head around it and appreciate improvements. This means that while you can learn algorithmic wizardry through homework assignments, you’re unlikely to learn organization skills unless you work on a large project you care about, most likely because you’re paid to care about it.

**Related posts**:


AI: From “It’s so horrible how little progress has been made” to “It’s so horrible how much progress has been made” in one step.

When I read this I thought of Pandora (the mythical figure, not the music service).

“Are you still working on opening that box? Any progress?”

“No, the lid just … won’t … budge … Oh wait, I think I got it.”

**Related post**: Why the robots aren’t coming in the way you expect by Mark Burgess

I believe that reading only packaged microwavable fiction ruins the taste, destabilizes the moral blood pressure, and makes the mind obese.

I agree with that. That’s why I shop at Amazon.

If I liked to read best-selling junk food, I could find it at any bookstore. But I like to read less popular books, books I can only find from online retailers like Amazon. In fact, most of Amazon’s revenue comes from obscure books, not bestsellers.

Suppose I want to read something by, I don’t know, say, Ursula K. Le Guin. I doubt I could find a copy of any of her books, certainly not her less popular books, within 20 miles of my house, and I live in the 4th largest city in the US. There’s nothing by her in the closest Barnes and Noble. But I could easily find anything she’s ever written on Amazon.

If you’d like to support Amazon so they can continue to bring us fine authors like Ursula K. Le Guin, authors you can’t find in stores that mostly sell packaged microwavable fiction, you can buy one of the books mentioned on this blog from Amazon.

Equations are typically applied left to right. When you write *A* = *B* you imply that it may be useful to replace *A* with *B*. This is helpful to keep in mind when learning something new: the order in which an equation is written gives a hint as to how it may be applied. However, this way of thinking can also be a limitation. Clever applications often come from realizing that you can apply an equation in the opposite of the usual direction.

For example, Euler’s reflection formula says

Γ(*z*) Γ(1-*z*) = π / sin(π*z*).

Reading from left to right, this says that two unfamiliar/difficult things, values of the Gamma function, are related to a more familiar/simple thing, the sine function. It would be odd to look at this formula and say “Great! Now I can compute sines if I just know values of the Gamma function.” Instead, the usual reaction would be “Great! Now I can relate the value of Gamma at two different places by using sines.”
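
Here’s a quick numerical illustration of reading the identity in both directions, a sketch using SciPy’s `gamma`:

    import numpy as np
    from scipy.special import gamma

    z = 0.3
    print(gamma(z) * gamma(1 - z))     # two Gamma values ...
    print(np.pi / np.sin(np.pi * z))   # ... related through one sine value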

When we see Einstein’s equation

*E* = *mc*^{2}

the first time, we think about creating energy from matter, such as the mass lost in nuclear fission. This applies the formula from left to right, relating what we want to know, an amount of energy, to what we do know, an amount of mass. But you could also read the equation from right to left, calculating the amount of energy, say in an accelerator, necessary to create a particle of a given mass.

Calculus textbooks typically have a list of equations, either inside the covers or in an appendix, that relate an integral on the left to a function or number on the right. This makes sense because calculus students compute integrals. But mathematicians often apply these equations in the opposite direction, replacing a number or function with an integral. To a calculus student this is madness: why replace a familiar thing with a scary thing? But integrals aren’t scary to mathematicians. Expressing a function as an integral is often progress. Properties of a function may be easier to see in integral form. Also, the integral may lend itself to some computational technique, such as reversing the order of integration in a double integral, or reversing the order of taking a limit and an integral.

Calculus textbooks also have lists of equations involving infinite sums, the summation always being on the left. Calculus students want to replace the scary thing, the infinite sum, with the familiar thing, the expression on the right. Generating functions turn this around, wanting to replace things with infinite sums. Again this would seem crazy to a calculus student, but it’s a powerful problem solving technique.

Differential equation students solve differential equations. They want to replace what they find scary, a differential equation, with something more familiar, a function that satisfies the differential equation. But mathematicians sometimes want to replace a function with a differential equation that it satisfies. This is common, for example, in studying special functions. Classical orthogonal polynomials satisfy 2nd order differential equations, and the differential equation takes a different form for different families of orthogonal polynomials. Why would you want to take something as tangible and familiar as a polynomial, something you might study as a sophomore in high school, and replace it with something as abstract and mysterious as a differential equation, something you might study as a sophomore in college? Because some properties, properties that you would not have cared about in high school, are more clearly seen via the differential equations.

The difference between pure functional languages and traditional imperative languages is not quite that simple in practice.

Programming with pure functions is conceptually easy but can be awkward in practice. You could just pass each function the state of the world before the call, and it returns the state of the world after the call. It’s unrealistic to pass a program’s entire state as an argument each time, so you’d like to pass just that state that you need to, and have a convenient way of doing so. You’d also like the compiler to verify that you’re only passing around a limited slice of the world. That’s where **monads** come in.

Suppose you want a function to compute square roots and log its calls. Your square root function would have to take two arguments: the number to find the root of, and the state of the log before the function call. It would also return two arguments: the square root, and the updated log. This is a pain, and it makes function composition difficult.
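
Here’s a small Python sketch of that explicit state-passing style (Python rather than Haskell, and purely illustrative) to show why composing such functions is awkward:

    import math

    def logged_sqrt(x, log):
        return math.sqrt(x), log + [f"sqrt({x})"]

    def logged_double(x, log):
        return 2 * x, log + [f"double({x})"]

    # Composition means manually shuttling the log between every pair of calls.
    value, log = logged_sqrt(16.0, [])
    value, log = logged_double(value, log)
    print(value, log)   # 8.0 ['sqrt(16.0)', 'double(4.0)']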

**Monads provide a sort of side-band** for passing state around, things like our function call log. You’re still passing around the log, but you can do it implicitly using monads. This makes it easier to call and compose two functions that do logging. It also lets the compiler check that you’re passing around a log but not arbitrary state. A function that updates a log, for example, can affect the state of the log, but it can’t do anything else. It can’t launch missiles.

Once monads get large and complicated, it’s hard to know what side effects they hide. **Maybe they can launch missiles after all**. You can only be sure by *studying the source code*. Now how do you know that calling a C function, for example, doesn’t launch missiles? You *study the source code*. In that sense Haskell and C aren’t entirely different.

The Haskell compiler does give you assurances that a C compiler does not. But ultimately you have to study source code to know what a function does and does not do.

**Related post**: Monads are hard because …

The curve is the plot of exp(*it*) – exp(6*it*)/2 + *i* exp(-14*it*)/3 with *t* running from 0 to 2π.

Here’s Python code to draw the curve.

    import matplotlib.pyplot as plt
    from numpy import pi, exp, real, imag, linspace

    def f(t):
        return exp(1j*t) - exp(6j*t)/2 + 1j*exp(-14j*t)/3

    t = linspace(0, 2*pi, 1000)
    plt.plot(real(f(t)), imag(f(t)))

    # These two lines make the aspect ratio square
    fig = plt.gcf()
    fig.gca().set_aspect('equal')

    plt.show()

Maybe there’s a more direct way to plot curves in the complex plane rather than taking real and imaginary parts.

Updated code for the aspect ratio per Janne’s suggestion in the comments.

**Related posts**:

Several people have been making fun visualizations that generalize the example above.

Brent Yorgey has written two posts, one choosing frequencies randomly and another that animates the path of a particle along the curve and shows how the frequency components each contribute to the motion.

Mike Croucher developed a Jupyter notebook that lets you vary the frequency components with sliders.

John Golden created visualizations in GeoGebra here and here.

Jennifer Silverman showed how these curves are related to decorative patterns that were popular in the 1960s. She also created a coloring book and a video.

Dan Anderson accused me of nerd sniping him and created this visualization.

- Business
- Clinical trials
- Computing
- Creativity
- Graphics
- Machine learning
- Math
- Music
- Python
- Science
- Software development
- Statistics
- Typography
- Misc

You can also subscribe to my Twitter feeds via RSS if you’d like.

Finding Windows ports of Unix utilities is easy. The harder part is finding a shell that behaves as expected. (Of course “as expected” depends on your expectations!)

There have been many projects to port Unix utilities to Windows, particularly GnuWin32 and Gow. Some of the command shells I’ve tried are:

- Cmd
- PowerShell
- Eshell
- Bash
- Clink

I’d recommend the combination of Gow and Clink for most people. If you’re an Emacs power user you might like Eshell.

The built-in command line on Windows is `cmd`. It’s sometimes called the “DOS prompt” though that’s misleading. DOS died two decades ago and the `cmd` shell has improved quite a bit since then.

`cmd` has some features you might not expect, such as `pushd` and `popd`. However, I don’t believe it has anything analogous to `dirs` to let you see the directory stack.

PowerShell is a very sophisticated scripting environment, but the interactive shell itself (e.g. command editing functionality) is basically `cmd`. (I haven’t kept up with PowerShell and that may have changed.) This means that writing a PowerShell script is completely different from writing a batch file, but the experience of navigating the command line is essentially the same as `cmd`.

You can run shells inside Emacs. By default, `M-x shell` brings up a `cmd` prompt inside an Emacs buffer. You can also use Emacs’ own shell with the command `M-x eshell`.

Eshell is a shell implemented in Emacs Lisp. Using Eshell is very similar across platforms. On a fresh Windows machine, with nothing like Gow installed, Eshell provides some of the most common Unix utilities. You can use the `which` command to see whether you’re using a native executable or Emacs Lisp code. For example, if you type `which ls` into Eshell, you get the response

eshell/ls is a compiled Lisp function in `em-ls.el'

The primary benefit of Eshell is that it provides integration with Emacs. As the documentation says,

Eshell is *not* a replacement for system shells such as `bash` or `zsh`. Use Eshell when you want to move text between Emacs and external processes …

Eshell does not provide some of the command editing features you might expect from `bash`. But the reason for this is clear: if you’re inside Emacs, you’d want to use the full power of Emacs editing, not the stripped-down editing features of a command line. For example, you cannot use `^foo^bar` to replace `foo` with `bar` in the previous command. Instead, you could retrieve the previous command and edit it just as you edit any other line inside Emacs.

In `bash` you can use `!^` to recall the first argument of the previous command and `!$` to recall the last; in Eshell you can get the last argument using `$_` instead. Many of the other `bash` shortcuts that begin with `!` work as expected: `!foo`, `!!`, `!-3`, etc. Directory navigation commands like `cd -`, `pushd`, and `popd` work as in `bash`.

Gow comes with a `bash` shell, a Windows command line program that creates a `bash`-like environment. I haven’t had much experience with it, but it seems to be a faithful bash implementation with few compromises for Windows, for better and for worse. For example, it doesn’t understand backslashes as directory separators.

There are other implementations of `bash` on Windows, but I either haven’t tried them (e.g. win-bash) or have had a bad experience with them (e.g. Cygwin).

Clink is not a shell *per se* but an extension to `cmd`. It adds the functionality of the Gnu `readline` library to the Windows command line, and so you can use all the Emacs-like editing commands that you can with `bash`: Control-a to move to the beginning of a line, Control-k to delete the rest of a line, etc.

Clink also gives you Windows-like behavior that Windows itself doesn’t provide, such as being able to paste text onto the command line with Control-v.

I’ve heard that Clink will work with PowerShell, but I was not able to make it work.

The command editing and history shortcuts beginning with `!` mentioned above all work with Clink, as do substitutions like `^foo^bar`.

In my opinion, the combination of Gow and Clink gives a good compromise between a Windows and Unix work environment. And if you’re running Windows, a compromise is probably what you want. Otherwise, simply run a (possibly virtual) Linux machine. Attempts to make Windows too Unix-like run down an uncanny valley where it’s easy to waste a lot of time.

In some contexts software is regulated but data is not, or at least software comes under different regulations than data. For example, maybe you have to maintain test records for software but not for data.

Suppose as part of some project you need to search for files containing the word “apple” and you use the command line utility `grep`. The text “apple” is data, input to the `grep` program. Since grep is a widely used third party tool, it doesn’t have to be validated, and you haven’t written any code.

Next you need to search for “apple” and “Apple” and so you search on the regular expression “[aA]pple” rather than a plain string. Now is the regular expression “[aA]pple” code? It’s at least a tiny step in the direction of code.
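
Here’s a small Python illustration (mine, not part of the regulatory example) of how that pattern is data handed to a regex engine, yet already behaves like a little program:

    import re

    pattern = re.compile(r"[aA]pple")   # the "data" compiles to a matching machine
    strings = ["apple pie", "Apple Inc.", "grape", "pineapple"]
    print([s for s in strings if pattern.search(s)])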

What about more complicated regular expressions? Regular expressions are equivalent to deterministic finite automata, which sure seem like code. And that’s only regular expressions as originally defined. The term “regular expression” has come to mean more expressive patterns. Perl regular expressions can even contain arbitrary Perl code.

In practice we can agree that certain things are “code” and others are “data,” but there are gray areas where people could sincerely disagree. And someone wanting to be argumentative could stretch this gray zone to include everything. One could argue, for example, that all software is data because it’s input to a compiler or interpreter.

You might say “data is what goes into a database and code is what goes into a compiler.” That’s a reasonable rule of thumb, but databases can store code and programs can store data. Programmers routinely have long discussions about what belongs in a database and what belongs in source code. Throw regulatory considerations into the mix and there could be incentives to push more code into the database or more data into the source code.

* * *

See Slava Akhmechet’s essay The Nature of Lisp for a longer discussion of the duality between code and data.

This is a thumbnail version of a large, high-resolution image by Ulysse Carion. Thanks to Aleksey Shipilëv (@shipilev) for pointing it out.

It’s hard to see in the thumbnail, but the map gives the change in velocity needed at each branch point. You can find the full 2239 x 2725 pixel image here or click on the thumbnail above.

This decomposition of a positive integer as a sum of distinct Fibonacci numbers (for example, 10 = 8 + 2) is unique if you impose the extra requirement that consecutive Fibonacci numbers are not allowed. [1] It’s easy to see that the rule against consecutive Fibonacci numbers is necessary for uniqueness. It’s not as easy to see that the rule is sufficient.

Every Fibonacci number is itself the sum of two consecutive Fibonacci numbers—that’s how they’re defined—so clearly there are at least two ways to write a Fibonacci number as the sum of Fibonacci numbers, either just itself or its two predecessors. In the example above, since 8 = 5 + 3, you could also write 10 as 5 + 3 + 2.

The *n*th Fibonacci number is approximately φ^{n}/√5 where φ = 1.618… is the golden ratio. So you could think of a Fibonacci sum representation for *x* as roughly a base φ representation for √5*x*.

You can find the Fibonacci representation of a number *x* using a greedy algorithm: Subtract the largest Fibonacci number from *x* that you can, then subtract the largest Fibonacci number you can from the remainder, etc.

Programming exercise: How would you implement a function that finds the largest Fibonacci number less than or equal to its input? Once you have this it’s easy to write a program to find Fibonacci representations.
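
Here’s one possible sketch of the greedy algorithm (a spoiler for the exercise, so look away if you want to work it out yourself):

    def largest_fib_leq(x):
        # walk up the Fibonacci sequence 1, 2, 3, 5, ... until it would pass x
        a, b = 1, 2
        while b <= x:
            a, b = b, a + b
        return a

    def fibonacci_representation(x):
        # greedy: repeatedly subtract the largest Fibonacci number that fits
        parts = []
        while x > 0:
            f = largest_fib_leq(x)
            parts.append(f)
            x -= f
        return parts

    print(fibonacci_representation(10))    # [8, 2]
    print(fibonacci_representation(100))   # [89, 8, 3]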

* * *

[1] This is known as Zeckendorf’s theorem, published by E. Zeckendorf in 1972. However, C. G. Lekkerkerker had published the same result 20 years earlier.

Some say they enjoy the blog, but I post more often than they care to keep up with, particularly if they’re only interested in the non-technical posts.

Others have said they’d like to know more about my consulting business. There are some interesting things going on there, but I’d rather not write about them on the blog.

The newsletter will address both of these groups. I’ll highlight a few posts from the blog, some technical and some not, and I’ll say a little about what I’ve been up to.

If you’d like to receive the newsletter, you can sign up here.

I won’t share your email address with anyone and you can unsubscribe at any time.

… we should clarify. From what, or whom, are we hiding information?

[T]raditional languages … bend over backwards to ensure that modules hide internal routines and data structures from other modules. The goal is to achieve module independence (a minimum coupling). The fear seems to be that modules strive to *attack each other like alien antibodies*. Or else, that *evil bands of marauding modules* are out to clobber the precious family data structures.

This is not what we’re concerned about. The purpose of hiding information, as we mean it, is simply to minimize the effects of a possible design-change by localizing things that might change within each component.

Quote from Thinking Forth. Emphasis added.


The sample code uses PyPDF2. I’m using Conda for my Python environment, and PyPDF2 isn’t directly available for Conda. I searched Binstar with

`binstar search -t conda pypdf2`

The first hit was from JimInCO, so I installed PyPDF2 with

`conda install -c https://conda.binstar.org/JimInCO pypdf2`

I scanned a few pages from a book to PDF, turning the book around every other page, so half the pages in the PDF were upside down. I needed a script to rotate the even numbered pages. The script counts pages from 0, so it rotates the *odd* numbered pages from its perspective.

    import PyPDF2

    pdf_in = open('original.pdf', 'rb')
    pdf_reader = PyPDF2.PdfFileReader(pdf_in)
    pdf_writer = PyPDF2.PdfFileWriter()

    for pagenum in range(pdf_reader.numPages):
        page = pdf_reader.getPage(pagenum)
        if pagenum % 2:
            page.rotateClockwise(180)
        pdf_writer.addPage(page)

    pdf_out = open('rotated.pdf', 'wb')
    pdf_writer.write(pdf_out)
    pdf_out.close()
    pdf_in.close()

It worked as advertised on the first try.

- AlgebraFact
- AnalysisFact
- CompSciFact
- Diff_eq
- MedVocab
- NetworkFact
- PerlRegex
- ProbFact
- RegexTip
- ScienceTip
- SciPyTip
- StatFact
- TeXtip
- TopologyFact
- UnitFact
- UnixToolTip

If you would like to subscribe to more Twitter accounts via RSS, you could subscribe to the BazQux service and create a custom RSS feed for whatever Twitter, Google+, or Facebook accounts you’d like to follow.

Rosenzweig discusses experiments designed to study decision making. In order to make clean comparisons, subjects are presented with discrete choices over which they have no control. They cannot look for more options or exercise any other form of agency. The result is an experiment that is easy to analyze and easy to publish, but so unrealistic as to tell us little about real-world decision making.

In his book Left Brain, Right Stuff, Rosenzweig quotes Philip Tetlock’s summary:

Much mischief can be wrought by transplanting this hypothesis-testing logic, which flourishes in controlled lab settings, into the hurly-burly of real-world settings where *ceteris paribus* never is, and never can be, satisfied.

In his new book How to Fly a Horse, Kevin Ashton says that the Mozart story above is a myth based on a forged letter. According to Ashton,

Mozart’s real letters—to his father, to his sister, and to others—reveal his true creative process. He was exceptionally talented, but he did not write by magic. He sketched his compositions, revised them, and sometimes got stuck. He could not work without a piano or harpsichord. He would set work aside and return to it later. … Masterpieces did not come to him complete in uninterrupted streams of imagination, nor without an instrument, nor did he write them whole and unchanged. The letter is not only forged, it is false.

**Related posts**:


Simplifying fractions sometimes makes things clearer, but not always. It depends on context, and context is something students don’t understand at first. So it makes sense to be pedantic at some stage, but then students need to learn that **clear communication trumps pedantic conventions**.

Along these lines, there is an old taboo against having radicals in the denominator of a fraction. For example, 3/√5 is not allowed and should be rewritten as 3√5/5. This is an arbitrary convention now, though there once was a practical reason for it, namely that in hand calculations it’s easier to multiply by a long decimal than to divide by it. So, for example, if you had to reduce 3/√5 to a decimal in the old days, you’d look up √5 in a table to find it equals 2.2360679775. It would be easier to compute 0.6*2.2360679775 by hand than to compute 3/2.2360679775.

As with unreduced fractions, radicals in the denominator might be not only mathematically equivalent but psychologically preferable. If there’s a 3 in some context, and a √5, then it may be clear that 3/√5 is their ratio. In that same context someone may look at 3√5/5 and ask “Where did that factor of 5 in the denominator come from?”

A possible justification for rules above is that they provide standard forms that make grading easier. But this is only true for the simplest exercises. With moderately complicated exercises, following a student’s work is harder than determining whether two expressions represent the same number.

One final note on pedantic arithmetic rules: If the order of operations isn’t clear, make it clear. Add a pair of parentheses if you need to. Or write division operations as one thing above a horizontal bar and another below, not using the division symbol. Then you (and your reader) don’t have to worry whether, for example, multiplication has higher precedence than division or whether both have equal precedence and are carried out left to right.
