Trevor Oakes

Notes on Deforestation: Transforming programs to eliminate trees

2012-06-19T21:17:00-06:00

As I am making notes as I read papers, It seemed a shame to let them bit rot in “the cloud”.

If you don’t already have the paper in front of you, get it from one of these links.

Paper Outline

The idea is to create a new optimization for optimizing functional compilers to reduce memory overhead. The paper:

Defines pure treeless form for a first-order lazy functional language
Extends that model with “Blazing” - marking trees according to type, to allow some intermediate values that don’t affect efficiency
Adds back most higher order functions using higher-order macros

It came out of the author’s previous work on “listlessness,” but is simpler, more general and the result is a functional, not an imperative program. For example, the algorithm developed in the second paper only applies when pre-order traversal can be verified. However, the class of treeless functions is not a superset of the class of listless programs; the deforestation algorithm does not optimize programs like the average function (/ (sum xs) (length xs)).

Some Definitions

Deforestation - an algorithm that turns a function composed of functions that have treeless forms to a function in treeless form
Treeless form - a linear function definition with no intermediate trees (or in a simpler case, lists)
linear function - a function where the inputs do not appear more than once in the body of the function. square x = x * x is not linear, while square x = exp ( 2 * log x) is.
Blazing - marking a term (from a forestry term to make a cut in the bark of a tree) by its type

These next few are more widespread but I had to look up because I was hazy on what they meant.

first-order logic - predicates are sets
higher-order logic - predicates can be sets of sets i.e. Predicates and functions can be arguments to predicates
arity - the number of arguments to a function, i.e. unary, binary, ternary, etc.
unfolding - replacing an instance of the left-hand side of an equation by the corresponding right hand side

Motivating Example Breakdown

Near the beginning of the paper we are given a functional program as a motivating example sum (map square (upto 1 n)) which can be transformed into

h 0 1 n
where
h a m n = if m > n
          then a
          else h (a + square m) (m+1) n

using the deforestation technique in the paper. It wasn’t clear to me where all the parts of the deforested program came from so I broke it up:

m comes from upto, representing, by turn, each member of the list
+1 is the stepping function of the initial list (from upto)
a is the accumulator from the fold that the treeless version of sum is implemented with

Proofs for Optimization and Termination

Instead of proving that the algorithm optimizes, Wadler instead proves that it cannot make the program less efficient, which was easier, but in fact if there are intermediate trees the transformation will be an optimization. In order to ensure that the the algorithm is an optimization, terms must be linear and “every argument of function application and every selector of a case term [must be] a variable.” Insisting that treeless definitions are linear guarantees that they can be unfolded without sacrificing efficiency. If evaluating the variable is expensive, deforestation of non-linear terms can cause the variable to be evaluated more than once. The restriction that particular arguments must be variables guarantees that no intermediate trees are created.

The other important thing is to prove that this transformation itself terminates. The proofs are pretty clear, so read them if you’re interested. There is one interesting corner case though: at some point we need to introduce new definitions, otherwise some fusings like flip wouldn’t terminate, but where? Keep a list of terms encountered in the deforest process. If the term is found again as a renaming, create a function and use instead of having the deforest function recurse again.

Blazing

Our example program still doesn’t optimize as much as we’d like it to. upto creates a “tree” of integers. Types like int, bool and polymorphic types are blazed so that intermediate terms of that type may remain. Intermediate “tree” forms do not cause large memory overheads. You may be surprised to see polymorphic things in that list, but this is safe because we know they are not manipulated because they are polymorphic and so they can be shared. If an expression is in blazed treeless form we can then perform a modified deforestation algorithm. Blazed treeless form means that places where intermediate values may appear must be variables or blazed safe, and the term must be linear in variables blazed unsafe, i.e. duplication of safe variables does not create expensive computations.

Macros

The idea of this section of the paper it to keep most of the expressiveness of higher-order functions in a first-order language, by treating higher-order functions as macros that build the appropriate first order functions which can then be deforested. They are one way to simplify things like this (program transformation systems) without losing most of the expressive power. The restriction: higher order macros cannot be recursive. This approach was unproven (at time of writing). If you know of papers that discuss this, speak up in the comments. (Goguen uses a different method to keep higher-order functions using parameterized modules instead.) Another alternative would be to create an alternative algorithm that applies to higher-order functions directly so you don’t have to treat them as macros.

And an Unanswered Question

Wadler states that the deforestation optimization is more suitable for lazy languages, because in a strict language deforestation can transform an undefined program into a value returning one. How? And what would be an example program?

Intro to Deforestation and Macros

2012-03-17T21:39:00-06:00

My new project is to write a deforestation macro for Racket. Since I’ll be spending a lot of time on that in the next few months and there will probably be a corresponding amount of related content here on the blog, for the rest of the post I’ll unpack that sentence for you so you can follow along in the future. Read on to get a grounding in deforestation and Racket macros.

Racket

Racket is a member of the Lisp family of languages, more recently diverging from Scheme. Racket takes that core and adds more interesting continuations, macros and an emphasis on real programs and practical research. That means that it has tons of libraries. Actually Racket is more like a language implementation platform, with a primary language #lang racket but there is also a lazy language, standards compliant versions of scheme, a functional reactive programming language (a functional abstraction over gui programming), learning languages and even a logic programming language. And its easy to create your own.

One way is with macros, and Lispy languages are really great at doing macros. One reason is that they are homoiconic, i.e. the representation of the program is actually a data structure in the same language. For lisp this means lists. Perhaps you’ve heard the idea that code is data. If not, you’ve probably never worked on a compiler or interpreter. Remember that a program uses code to transform input data to output data. Now think about a compiler, it is a program that takes in code and produces code. The code you pass to gcc is data to gcc. Code is just data. Essentially homoiconicity means that not only is code data, but data that your own code can understand.

Macros

This is where macros come in. Now, I’d like to be clear at the outset of a section on macros that I am not talking about the abomination that C++ calls macros. I shudder involuntarily at the thought of C++ “macros”.

Macros look at your code as data and transform it into a different program. Essentially they allow you to create your own language, Racket+, which the macros you created, like a compiler, transform the rest of your program into Racket. Then Racket eventually interprets/compiles your program to the native platform. Lots of Racket features are actually implemented using macros so it’s macros almost all the way down.

Language extensions to the language in the language? Sign me up. Actually the point of this project is for me to learn macros since the best way to learn them is to write them.

What I’ve described so far applies to Scheme, Lisp and even shudder C++ macros. The problem with C++ macros is that they perform text to text transformations. This process is error prone and makes me sick inside. Lisp macros manipulate parenthesized expressions of Lisp data, or s-expressions. This is a huge improvement over text manipulation, comparable to the difference between storing all your data in a string that you parse and extract and manipulate versus using actual data structures like arrays, maps, and trees.

This is a big improvement, but definitely not perfect. Lisp macros have a hygiene problem. Apparently “hygiene” is a mathematical term, but what it means is that names defined or used in a Lisp macro can clash with names defined in the outer program, by other macros, or even the same macro used multiple times. If two macros who work in isolation are used together, there is no way to know for sure that they will work together without checking the source code. Sure, there are workarounds, gensyms, packages, etc. but why deal with that when there is a better system: hygienic macros from Scheme or Racket?

Essentially hygienic macros extend the idea of lexical scope to macros. Lexical (or static) scope is the idea that it should be obvious from looking at nearby code what a name means. In languages with first-class functions or macros this is not trivial property to have so some don’t bother, but lexical scope is good for the same reason that avoiding global variables and unnecessary mutation are good. They make your programs easier to reason about.

Racket takes the idea of hygiene even further than Scheme. As for how exactly I’m not yet exactly clear. Something about safe ways to violate the hygiene rules but still stay squeaky clean, making them more flexible/powerful. The commenters on Why Hygienic Macros Rock discuss it a little bit. I’ll have to get back to you on why, or feel free lay down the law in the comments.

Deforestation

The point of deforestation, in the computer science sense, is to make functional abstractions efficient. By focusing on saying what you want and not how it should be done you can write clearer more concise programs:

Functional Version

square n = n * n
sumOfSquaresTo n = sum (map square [1..n])

versus:

The Imperative Version

def square(n):
    return n * n
def sumOfSquaresTo(n):
    sum = 0
    for number in range(1,n+1):
        sum += square(number)
    return sum

The functional version is more concise and once you learn the basics of functional programming, just as clear. Now before you complain, I realize that you could use the sum function, but it would do something similar with a loop. I lifted the loop out of the sum function because I wanted to make the algorithm’s performance characteristics obvious.)

Although the functional version wins in conciseness, there is a tradeoff - the simple way to compile it will use two intermediate lists, the numbers from 1 to n and the first n squares, while the imperative version uses no lists (at least in python3, xrange in python 2 would have the same effect). Creating and storing these lists are less efficient than we would hope. Stream fusion is the process of transforming operations on lists that use intermediate lists into fast imperative code behind the scenes. Deforestation is the same process but more general, it also works on tree-like data, and removes intermediate trees, making the code more cpu and memory efficient. Strategies like these have been applied the Glorious Haskell Compiler (usually referred to as ghc), but my task is to write a Racket macro to do the same thing.

Since I’ll be reading and taking notes on deforestation papers, summaries of them will probably pop up here in the future. If you want to get a head start, you overachiever you, here are the papers I will read:

First Class and Higher Order Functions

2012-03-08T21:24:00-07:00

In order to expand my mind, I’ve been exploring different languages, but sometimes I’ve had to come back to the first language I learned in depth: Java. And when I do, the thing I miss most is first class, higher order functions. That and pattern matching.

Let me show you what I mean:

#lang racket (require rackunit) ; A higher order function is simply a function that accepts a ; function as an argument or returns a function, or even both. ; Here is a handy higher order function, zipWith: (define (zipWith f l1 l2) (if (or (empty? l1) (empty? l2)) empty (cons (f (first l1) (first l2)) (zipWith f (rest l1) (rest l2))))) ; It takes in a function f and two lists, l1 and l2, and returns ; a list where the function has been applied to pairs of arguments ; from l1 and l2. Here is its type: ;; zipWith ((a b)->c) list list -> list ; So if know how much candy your kids got on Halloween (define kids-candy '(10 13 10 50 17)) ; and you keep track of how much candy you threw away (or ate) ; to prevent them from getting totally sick (define candy-disposed '(2 4 8 5 7)) ; then you can easily figure out how much each kid has after your tax (check-equal? (zipWith - kids-candy candy-disposed) '(8 9 2 45 10)) ; Compose is a higher order function that happens to take functions as ; arguments and return a function (define (compose f g) (lambda (x) (f (g x)))) (define (cube x) (expt x 3)) (define (cube-root x) (expt x (/ 1 3))) ; Actually, in Racket, zipWith is equivalent to the builtin map ;; map ( (a0 a1 .. an)->b list list .. list ) -> list ; so we can show that a function composed with its inverse ; is like doing nothing to its input (ignoring rounding errors) thus: (for ([h (map compose (list sin cos cube) (list asin acos cube-root))]) (displayln (h 0.5))) ; 0.5 ; 0.4999999999999999 ; 0.5000000000000001

The cool thing about a language with first class functions, like Racket, is that if the language did not provide us with map, we can write zipWith.

But besides that, allowing functions as arguments to and return values from functions allows you to abstract more things than you can without that ability. Many design patterns exist because the language does not have first class functions, but because that idea is very useful, people emulate first class functions poorly using object orientation. The visitor pattern, for example, is essentially a fold.

A programmer can emulate first class functions by hand using object orientation, but take a look at an attempt to do what took 7 short lines above in Racket in Java and ask yourself, why not use a language with them in the first place?

public static Function compose( final Function f, final Function g) { return new Function() { @Override public Double apply(Double x) { return f.apply(g.apply(x)); } }; } @SuppressWarnings("unchecked") public static void main(String[] args) { ArrayList> functions = Lists.newArrayList( new Function() { @Override public Double apply(Double x) { return Math.cos(x); } }, new Function() { @Override public Double apply(Double x) { return Math.tan(x); } }, new Function() { @Override public Double apply(Double x) { return x * x; } }); ArrayList> inverse = Lists.newArrayList( new Function() { @Override public Double apply(Double x) { return Math.acos(x); } }, new Function() { @Override public Double apply(Double x) { return Math.atan(x); } }, new Function() { @Override public Double apply(Double x) { return Math.sqrt(x); } }); for (int i = 0; i < functions.size(); i++) { System.out.println(compose(functions.get(i), inverse.get(i)).apply(0.5)); } }

John Carmack on Static Analysis

2012-03-01T15:22:00-07:00

… and functional programming.

The most important thing I have done as a programmer in recent years is to aggressively pursue static code analysis. Even more valuable than the hundreds of serious bugs I have prevented with it is the change in mindset about the way I view software reliability and code quality.
John Carmack, #AltDevBlogADay altdevblogaday.com/2011/12/24/…

Now if you don’t know who that is, that’s ok, but out of the people in industry that I know, which to be fair isn’t many, John Carmack is the programmer that I repect the most. It’s not just because he created some very popular games (Doom and Quake for instance) but that he essentially created the western game industry as we know it by creating whole new graphics techniques.

First he wrote the first all-software 2d game engine for Commander Keane, and then he created the whole 3d games genre with Doom. The techniques he developed and even a decent chunk of the code that John Carmack has written lives in practically every 3d western game you’ve ever played. Wow.

I’m really glad that Carmack recognizes the importance of static analysis and functional programming languages on code quality, and its good to hear when reasearch we have about software productivity is supported by luminaries in industry. I’ve started the video where he begins talking about static analysis which goes until 1:12:35.

Stay til then and you’ll hear him recommend Haskell and Ocaml, which makes me smile.

Programming Languages and Sheet Music

2012-02-22T10:06:00-07:00

If the juxtaposition seems strange to you, I agree with you that at first glance there doesn’t seem to be much in common between the two. If you’ve chalked it up the strangeness of the author and moved on to something else I wouldn’t blame you, but I still think you’re wrong.

Programming languages and sheet music share some interesting historical similarities besides their importantance as mathematically grounded tools for capturing and spreading ideas. Very different ideas.

A Very Shory History of Musical Notation and Programming Languages

Music is old. Really old. It shouldn’t be surprising then that written methods of describing music are also old. The oldest surviving example of musical notation, a mesopotamian cuneiform tablet (i.e. modern day Iraq), is about three thousand years old. Other examples of early musical notations also survive from Greece, the Byzantine Empire, India, China, Russia, Japan, Persia and Indonesia among many other countries.

The musical notation that we all know, however, comes from Western Europe, developed by monks in the abbeys and monastaries of the middle ages. Although originating to notate religious music, the modern staff notation became codified and spread with the rise and spread of classical music, and this is the musical notation system we have with us today.

Programming languages as we think of them did not arrive until much later, but that doesn’t mean that the history of programming languages started in the 1940’s with the first electronic computer.

No, that date probably belongs at least as far back as 1801 due to the creation of the Jacquard Loom. Besides the mechanical machine there was a language of sorts. Instead of code as bits or text files there where cards with holes punched in them. Instead of computing information it creates you a rug or some fabric. This invention, and its method of programs “written” on cards with punched holes, inspired Charles Babbage in his design of his computing machines: the Difference Engine and the Analytical Engine.

Neither of these was actually completed in his lifetime. A fully built Difference Engine wasn’t completed until 1991 and work only began very recently on a 10 year multi-million dollar project to build a working version of the Analytical Engine. The Analytical Engine as designed is the first instance of a Turing complete machine, making Ada Lovelace’s program to compute a sequence of Bernoulli numbers the first program written in a Turing-complete language. (For further reading I submit to you: A humorous and fictional account of Babbage and Lovelace.)

Unfortunately for Babbage, failing to get your brilliant ideas working in the real world doesn’t help you much to convince people that they actually are brilliant. Hence the idea of a general purpose computing machine (and the languages needed to program them) had to wait for more than 50 years to be realized. That doesn’t mean that different programming languages and computing machines didn’t arise in the meantime.

For example the autopiano, invented in 1876, had a sort of language for describing how to play music, while Herman Hollerith used punch cards to encode the 1890 census and then quickly tabulate the results. A steady progression of mechanical calculating machines eventually led to the creation of the “modern,” electrically powered computer in the 1940’s and with it various assembly languages, and then the explosion of general purpose languages that happened after the creation of Fortran, Lisp and Algol.

Wait a second! Back up a bit. A language, albeit a very limited one, that tells a piano how to play music? We’re back to musical notation again. This time, a sort of hybrid between the two: a human illegible notation for music and a very, very limited language for programming an autopiano. Some may contend that this is not programming, but it is an encoding of a series of steps to “compute” a song.

At this point it might be helpful to describe what a programming language is. A programming language allows you to compute the values of functions, or the results of algorithms, a series of well-defined steps. The duality of functions, mathematical functions, and series of steps is not a coincidence. Computation, and therefore programming languages, are intertwined with math. In a bit, we’ll see how exactly.

But first let’s see what math has to do with sheet music.

The Mathematical Basis of Musical Notation

The most obvious example is that there are numbers written on every piece of music giving the timing: 4/4 or 3/4 or 6/8. Music implies repetition of sound, but it only sounds good when it forms some sort of regular pattern. Sheet music breaks that up into wholes, or measures, and describes them using meter. The sheet music also usually has a recommeded timing, or length of certain kind of note. A 3/4 time means that there are three major beats in a measure, and 1 whole beat is marked as a quarter note. Why not as a whole note? I don’t know.

The notion and therfore the notation of rythym is inextricably connected with regular repetitions, and those sound good when they come at regular intervals and they form some relation to each other. In fact the notes are a form of binary fractions. You could represent the length of notes in binary:

	Note Name	“Binary”
	Double Whole Note	`1000000000`
	Whole Note	`0100000000`
	Half Note	`0010000000`
	Quarter Note	`0001000000`
	Eigth Note	`0000100000`
	Sixteenth Note	`0000010000`
	Thirty-second Note	`0000001000`
	Sixty-fourth Note	`0000000100`
	Hundred Twenty-Eighth Note	`0000000010`

Rests can also be encoded in the same manner. (Wikipedia has a table of these notes, the rests and what they look like for those who are fascinated by music but not familiar with its notation. Nice to meet you too, nobody.)

Then you can think of a dotted note as adding a version of itself right-shifted by one bit: dotted_note = undotted_value + (undotted_value >> 1). A quarter note 0001000000, when dotted,becomes 0001100000. Two tied notes can be added together: a whole note 0100000000 tied with a quarter note 0001000000 becomes 0101000000.

This system is made more complicated when tuplets are introduced but the ratios and the math remain. These are only the easiest to see. Because music is based on sound, the mathematics of sound also affect the notation of music, a topic way out of my comfort zone. Besides the application to musical notation, branches of mathematics, such as set theory and abstract algebra, are also related to music theory, at least, according to wikipedia.

The Mathematical Basis of Programming Languages

The Church-Turing Thesis

Early in history of computing, mathematicians wanted to figure out what exactly computation means. They proposed different ways of modelling computations. Alan Turing developed a machine that provides the canonical definition of algorithms. If it runs on a Turing machine it is an algorithm or “an effective method to calculate a function, expressed as a finite list of well-defined instructions.” An effective method means that the method always: gives an answer-the correct answer, can complete the calculation in a finite number of steps, and works for all instances of a class of problems.

Other mathematicians came up with ideas rooted in recursive mathematical functions. For example Alonzo Church created the lambda calculus, the repeated application of functions. Other models for computability were also created, but it was shown that these are equivalent. This thesis, the Church-Turing Thesis, expresses the idea that the idea of computable functions, embodied in Church’s lambda calculus, is exactly equivalent to the idea of algorithms, embodied in the Universal Turing machine. Both rely on the idea that given infinite time and space, they can solve (and not solve) the same functions. Lambda calculus relies on infinity through recursion, while the Turing Machine invokes infinity by requiring an infinite tape.

There are other theories (recursion, reckonable in the system S1, canonical systems, register machine, combinatory logic, pointer machine, cellular automata) are also considered by the Church-Turing thesis to be equivalent to each other. (Sidenote: Since the idea of an algorithm is the intuitive definition of mechanical computation and not a formal idea, the Church-Turing thesis cannot be proven. i.e. a proof cannot show that a formal system, a Turing machine, is equivalent to an intuitive idea, mechanical computation.)

Since programming languages encode programs, whether they are thought of as algorithms as in imperative languages, or as functions in functional languages, or logical contraint resolution in logical languages, or any other way of computing a function, they are computing and manipulating mathematical objects.

The Curry-Howard Correspondence

This is another way that programming languages, like musical notation, are steeped in math: type systems.

Simply put, a typed computer program is exactly equivalent to a logical proof. Programs are proofs and types are logical propositions. This allows you to think about the problem in whatever way is most helpful at a time, and switch back and forth as needed. This probably seems very unexpected, but it can be very useful.

This means that proofs can be written in a language like Coq, checked by computer and run. In addition, types can be used to prove different properties of programs - for example that your kernel is free of implementation bugs such as deadlocks, livelocks, buffer overflows, arithmetic exceptions or use of uninitialised variables.

So, unless you are programming in assembly or the untyped lambda calculus, the Curry-Howard correspondence grounds your programming language in mathematical theory, just as musical notation is.

“Thought Stuff”

Why is programming fun? What delights may it’s practitioner expect as his reward?
…
[T]here is the delight in working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He creates his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures…
Fred Brooks The Mythical Man-Month

As the English language was to Shakespeare so is the programming language to a programmer or musical notation to the composer, a means to capture and describe thought stuff.

These three things have something in common, a series of abstract symbols and associated meanings. Instead of pictograms for things, actions and descriptions we have words made up of abstract symbols called letters. Then we combine these groups of abstract symbols into groups of groups of symbols called sentences. With just 26 symbols we have created 250,000 words. And with all these words we can create an infinite number of syntactically and grammatically valid sentences. With only 26 symbols we can still create an infinite number of words, although I’m not sure what “sinfalderogalteblishanieraphtholology” or “qqqqqqqqqqqqqqqqqqqqqqqqqqqqqq” should mean.

It is strange and fascinating that a few simple rules, a finite grammar, can create infinite possibilities. In the same way that we can create an infinite number of sentences or even words for the english languages, there are an infinite number of programs for any programming language, or songs that can be captured by a musical notation. Our job is to capture these ideas, explore the space of interesting and beatiful english, songs and programs.

There’s a reason though that programming fascinates me more than others and I think the Fred Brooks captures pretty well the reason why. I’ll let Fred explain.

Yet, the program construct, unlike the poet’s words, is real in the sense that it moves and works, producing visible outputs separately from the contruct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation at the keyboard, and a display screen comes to life showing things that never were nor could be.
Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities we have in common with all men.
Fred Brooks The Mythical Man-Month

GreenTea's 2nd Place Entry Postmortem Translated by Hand

2011-12-23T22:10:00-07:00

Editorial stuff

This is the original but I got tired of reading the Google Translation and thought that more people would enjoy reading the postmortem in native(-ish) English. I’d like to thank GreenTea for linking to this and for any Russian speakers, proshu prosheniye za oshibki perovedeniya. Before we begin, here is the code for the final bot.

Begin Translation

As I write this post the finals of the Google AI Contest 2011 - Ants is in full swing. The organizers are gradually cutting off the weaker bots to allow the stronger ones to play more. According to the timer on the official site the winner will be announced in 1 day, 5 hours, 35 minutes and 5, 4, 3, 2 … seconds.

Oh yeah, what was I saying? At the moment my bot GreenTea is in third place (Ed. it finished in second place). And it is the focus of this post. I don’t even know where best to start since so much effort and time was invested, beginning with the beta itself before hills were added to the game …

This year’s beta ended up dragging on for an inexcusable amount of time. This greatly undermined my faith in enthusiasm and in open source. Then suddenly the developers wrote on the site “Sponsored by Google” and then an intense commotion began to stir as well as some commits. Come on! Although better that than the alternative :) - I really thought that it was going to stall at the beta stage and that would be it … And the whole beta (I can’t even remember how long anymore, 4 months?) I coded something little by little, adding and improving.

If there’s a reason I love this type of competition, it’s because there is LOTS of time. You don’t have to hurry or rush: you can sit down, think everything through, try stuff, test, make a change … make 5 more changes, do some more testing, etc. Basically its enjoyable to program within my level of understanding.

Game Mechanics

The map is a rectangular torus of a certain size, where by going off of top you’ll arrive at bottom and if you go over the left side you will end up on the right side. Each player starts with the same number of holes and an ant next to each hole. Every tile on the map can be either land, or water which is impassable for an ant. Ants should collect food because for each piece collected an ant will appear out of one of that player’s holes, chosen at random. Food appears randomly on the map. If ants belonging to different players after the move phase are within the attack radius, they will fight. The combat rules are not exactly easy to explain but suffice it to say that, in general, more ants a player has in the battle, the fewer ants they will lose. A hole is destroyed when an ant stands on a hole belonging to a different player. Destroying a hole is worth 2 points and losing a hole costs you 1 point. So if the choice is between capturing an enemy’s hole and saving your own, you should only prefer the first option if you have more than 1 hole left. For each turn your bot needs to tell each ant where to go. Oh, I almost forgot to mention that each map is symmetric and the food also pops up symmetrically random so that everything is fair for each player. And that’s it! These aren’t tricky rules.

In contrast to Planet Wars, where moves were not clear, this game is easier to break into sub-problems and then solve each individually. Well, for example, it’s obvious that you need to race for the food that is evenly spread across the map and then assault the enemy with your large numbers of ants. The funny thing is that even though the competition is easier and clearer for everyone individually, it is just as complex to defeat everyone else and the dificulty of the competition didn’t decrease at all.

My Tools

Java

I am a java developer. It is a simple and fast language. It allows you to very quickly pump out tons of code (if you need to implement some fresh idea) without worrying about all sorts of system things like memory leaks (hi C++ !!). For these reasons I chose to write my bot in Java. At some point in the middle of the beta I tried switching to SBCL Lisp for fun. I didn’t work out … My path finding algorithms, written exactly the same way but in Lisp were 5 time slower than the Java versions. It’s possible that this is because I don’t know Lisp well enough …

IntelliJ IDEA Community Edition

Excellent development environment. I had practically no complaints throughout the process. Once, however, the incremental compilation cache got corrupted ?and it complained that every line of code was a compilation error. I ended up having to clear the cache by hand and restart.

git

I really regretted not using version control on Planet Wars. I fixed this error, and from the start committed my code to a git repository. Git won me over with its many options and speed (in comparison with svn).

Perl Scripts

In the previous contest, Planet Wars, I created scripts to automatically test my bot against my own previous versions. It would be a sin not to use them like this! I had to modify the script to adapt it to multiplayer maps and the free for all mode - i.e. every man for himself. For example:

After Game three the bot:

Beat 1.5 bots and lost to 4.5 bots (A tie is split between the two)
Scored 0 points while the other bots scored 2.16 points on average

And the Summary Info:

4.5 wins (45%)
5.5 losses
4 points scored total (sum from all games)
3.66 is the sum of the mean scores for all opponents

Google Documents Spreadsheets

I used it to keep track of my improvements, bot versions and the results of testing the different versions against each other on different maps and with different food spawning patterns.

Sampling Profiler hprof

This tool was a large help to me. Looking ahead about 2 weeks before the deadline, my bot still very frequently timed out (took more than 500 ms on at least one turn). I couldn’t understand how this was happening. At first glance the code was clean and ideal, and looking at the data for the game the bot timed out showed that the bot should have had time to finish the turn (for example 300ms left). Then why is it stuttering and falling on its face? I had begun thinking about all sorts of mystical causes, like the garbage collection running at some point and pausing execution. Perhaps this is part of the problem, but the solution was much simpler than that. I don’t remember how exactly, but I ran into hprof. You run it in its own process as an agent to java-e (something like this: “agentlib:hprof=cpu=samples,interval=10,depth=10”) and as the program runs (in my case as the bot takes its turns, say 200 turns) it gathers statistics about which pieces of code run most often. With this help I found many not-so-obvious places that were running slowly. There was some old, creaky code which could be made 10x simpler! Actually, there was quite a bit of it. Because of these tweaks my bot ran about 40% faster and all of my timeouts on the official site disappeared.

Birth and Evolution

The young do not remember, but I do, that it all began in an open space. Ants were nomads and had no holes. For what purpose are holes when ants are born from a piece of food? In these wide open spaces the childhood and youth of GreenTea were spent. In the beginning while still very young, we walked towards our food in a straight line, now and then bumping into the wall. Sometimes we would get stuck to them, struggle, but could not escape. Later I learned to ~~walk~~ find the shortest path to food …

The first part implemented was spreading out evenly on the visible portion of the map. Implementing the rule to go where none of my ants already were was pretty easy. We find the N nearest ants and build a vector from each of them to the current ant. We give each vector a magnitude of 10 (a magic constant) and sum the vectors together to get the vector the current ant should move in. N may vary, the larger N the better they spread out but this also takes longer, and begins to be noticeable later in the game when you have lots of ants.

The figure shows how the vectors to the ant (yellow) are summed up to create the vector for the ant (green). This is similar to how molecules of gas interact.

Then there was pathfinding. I honestly didn’t use Google because I wanted to figure it out myself. My first algorithm ended up greedy, recursive, and admittedly, not very fast. But this was quite good enough in the beginning.

But the carefree childhood on the open spaces did not last long. Now and then other ants appeared, who could also search for food not just in a straight line but by shorter paths, and for some reason they did not want to live and let live …

The second thing I implemented was combat code. The thing is that during the beta the rules deciding combat as well as the attack radius were unstable. Because of that it was necessary to write something more or less universal so that I wouldn’t have to rewrite it again. I didn’t manage to think of anything better than going through all possible moves for my ants and the opponents ants which could be drawn into a fight. I chose out of all those choices the ones that prevented any enemy move from killing all of my combatants in one fell swoop. If there were more than a few of these “safe” moves, the one that causes the most problems for the enemy is chosen. In short: we seek the most aggressive of the safest moves. These two factors are often mutually exclusive so safety always comes first. Initially the algorithm was configured to fight battles with 7 ants or fewer (counting both my and enemy ants). This is a maximum of 78,125 possible moves. In practice, adjacent ants and water tiles eliminate combinations. When I let there be 8 ants in a battle I was guaranteed to timeout.

And the ants understood that together they were powerful. And they began to annihilate their enemies, pulverizing them without pity or rage.

As far as I remember the battle code, together with pathfinding and spreading out directive, led the GreenTea bot to a quick rise to the top of the beta site’s rankings. It was the rare opponent that could hold out against the crazy micro which just worked amazingly, and no wonder, it was looking at all the possible moves! For battle groups of >8 ants there was a primitive algorithm but it got used so infrequently because on the revealed map large numbers of ants rarely clumped together up to that point in time (more on that later).

Improvement

Someone could disagree with me, but it seems to me that Ants differs from Planet Wars in that improvement is significantly easier. Because of this, when you watch the replay - its immediately apparent where the bot made a mistake. Here’s pretty much how most of the improvements happened: watching the replay “Aha - here it was a little dumb and lost an ant, here then went wonky and gathered into a clump when it should have spread out in different directions and look for food …” Planet Wars required tough math and accurate scoring. One mistake and it was all over. Here you can make a few mistakes, its possible to make a few good turns and even a few OK turns without causing you to lose. But all the same the person who wins is the one who makes fewer mistakes.

Looking at my Google Doc spreadsheet I made 76 improvements, each of which was tested on different maps and different food spawning methods and showed between a 3 and 15% in the number of victories in a 1 vs 1 battle with the previous version, and between 1 and 10% improvement in FFA battles. I will describe the most significant improvements to the bot made from the beginning. Why not start once the rules of the game were finalized? Because from that point many important improvements were made which influenced the excellent play on the open maps, even with holes as a part of the game.

u4. Improving Pathfinding - Paths should look more like a straight line

Heh. My first clumsy attempt at pathfinding found the shortest path but it looks slightly off.

Red - original path found to food. Green - correct path. Although its not obvious that they are the same length, both paths have the same lengths when measured using tiles.

u6. When searching for food and enemies use the actual shortest distance

Old version of bot incorrectly used the red path as the distance when the green path is the actual distance to the food. In this case we would have wasted some moves going towards the food when the enemy will get there first.

(Ed. Apparently he switched from using the Manhattan distance, which the starter bot provided, to something like breadth first search, although apparently not that. BFS started at u56.)

u9. Combat Algorithm

Read about it above.

u11. Collect food before attacking

Since the more ants you have the faster you can grow, it is critical to eat food as soon as possible to get ants as fast as possible. That is why it is important, even in the middle of a fight, to pull one ant out of the fight to go pick up the food.

u14. The Combat Algorithm prunes duplicate positions.

This significantly sped up search allowing battle groups to be up to 8 ants.

Implementation

Wrote a hash function that takes all positions of moved ants and produces an integer i.e. hash value of the position. Then I keep track of all executed positions in a HashSet and before executing a position further, the algorithm checks to make sure the position has not been executed before. Skipping execution when it had seen the position before saved a lot of time. unfortunately the hash function was not unique enough, i.e. there where some collisions when it thought it had seen the position before when it had not and should have considered the position but ignored it. This completely broke the algorithm and the bot made stupid moves. In the end I had to return to keeping a complete listing of positions and only allowing 7 ants in a battle group.

u18. Use 4 ants to generate the spreading vector instead of 3

This looks much better and smoother and pretty significantly improved winning percentage.

u21. The Combat Algorithm started keeping track of ants who had moved and who spawned over holes

An ant who moved is for example one went after food instead of fighting.

u25. Pathfinding Algorithm modification

Removed recursion from pathfinding in a way that seemed intuitive to me. In retrospect it seems to have become very similar to A* :)

u26. Timeout Preventing Code

In a few places the code checks how much time is left. If more than 90% of the time has been used, then an exception is thrown that when caught tidily completes the turn.

u27. Better choice of which square to move to eat a piece of food

If you can move closer to another piece of food or further from your ants prefer that way to eat the piece of food. Quite a subtle improvement. Separately considers from which side to approach the food to eat it.

Moving to the right is better for two reasons. 1) This brings it closer to the other piece of food, which the other ant isn’t moving towards. 2) This distances him from your other ant, at the same time extending your visible territory.

u29. If the enemy will reach the food first, then don’t move toward that food. If there will be a fight over the food send another ant in pursuit.

If just the nearest ant goes towards the food and the opponent also moves towards the food, they are assured mutual annihilation, while our other ant sits idly by. The second ant should move as close to the food as soon as possible.

u35. Avoiding Enemy Ants, if there are more than two Dangerous enemy bots.

If a bot is being pressured from all sides then engaging in even trades is a very bad idea, since all of the bot’s forces would be spread between multiple opponents. On the other hand, passive bots will grow but eventually peter out. One should only attack if it is 100% safe.

And now we’ve arrived to the tail end of the beta, a phase that reminds everyone of the dominance of the German bot mathis. This bot was good managing large, concentrated groups of ants in battle. He lined them up in a chain of ants, while preventing random losses (from, say, an ant moving one step too close to the enemy and getting himself killed). And then this line would simply crush everyone. My combat system, however, was unable to cope because it couldn’t handle large groups of ants working together. The simplified system of combat wasn’t getting me anywhere. Ants on my front line acted like a disorganized herd and made a lot of mistakes. And yet it did so well at spreading ants around inside its own territory. In addition he separated himself by how effectively he crushes ants which managed to get within his territory. It turns out that mathis worked out where the zone of complete safety was and constantly tried to increase its size. At that point in time I was unable to construct as brutal tactics. From my end there were many small improvements that are not worth mentioning.

End of the Beta and Start of the Official Google AI Challenge - Ants 2011

With the official start there were two new things: holes and a new type of map, the maze, which created a bunch a problems for me:

Spreading my ants out by vectors didn’t work on mazes. Mostly the ants ended up getting killed near the hill.
Experienced severe slowdown as soon as the number of ants got larger than 50. This was because almost all pathfinding, which there was a lot of, didn’t finish quickly, and got confused and lost.

Improvements of Version 2.1

A new method of spreading out in the maze. This was a nightmare, but worked better than the vectors in addition to not slowing down … Here’s how the improvement worked: Create a map of what is visible. Each tile corresponds to a tile on the full map. Every turn I incremented the value at visible tiles by 1 and kept these values between turns. Over time, places where ants have been frequently will have quite large values. And when an ant is choosing where to go, it will prefer a tile with a smaller visibility score, i.e. that has been seen fewer times, and then ants don’t sit around in one place but try and expand away from the hill.

The main thing was to do this only for maze maps. I decided if a map was a maze map by gathering statistics for all the pathfinding and if the found shortest paths ran into lots of obstacles and there weren’t many straight-ish paths I assumed it was a maze.

Hole Defense

My hole defense was not very intricate: as soon as enemy ants were within N moves of the hole, all of my nearby ants would throw themselves at the enemy. For a long time N was 25, near the end I changed it to 15. I might have been worth making even smaller.

If you have only one hole left
If you have a lot of ants on the map.

On a map where you start with more than one hole it’s not necessary to defend all of them from the start. A bot that does that wastes a lot of time, will miss out on food collection and get crushed by the enemy who now has a large numerical advantage. So, I defend my holes if I have 1 hole or the number of my ants is greater than C times the number of holes I have (where C is a magic constant equal to 10).

u54. New Type of Attack - “Medium”

This was the first attempt to do something better than a simple attack in order to compete with mathis.

u56. Improved Movement Towards Goals

Now examines the shortest path to all free ants within a certain area of accessible tiles. You can call it the embryo of breadth-first search (BFS).

u70. 1) Many Performance Tweaks 2) Food Gathering Tweak

When going for a piece of food, the ant can go for a more distant cluster of food than the closest, if it will move him away from the main cluster of ants.

In this case it is better to go for the further away food.

Nearing the Finish Line

One week left before the finals. On both the official and tcp server a German bot named xathis is dominating. His 1st version arrived about week after the official start and immediately took the lead. Then, almost at the end of the competition his second version has appeared - and broke away from everyone again by a large margin. And where did mathis disappear to? (Ed. xathis’ name is Mathis Lichtenberger)

I used a week of my vacation time from work to spend on one last push.

The drawback of the current method of distributing ants in a maze with a comprehensive map was that often ants are pushed into dead ends and could not get out of them. This was a natural consequence of the method chosen; the way out of the dead end is already well explored by other ants and its impossible to get out with the same method that got you there.

I decided to try a much bigger emphasis on breadth first search. The idea is that there are a few types of goals. Each goal has its own priorities. Around a goal of one type waves spread out in the search, increasing the importance of the priority in the cell but by a diminishing amount the further away it is from the source, using the inverse of the square of the distance from the target. Types of Objectives:

Boundary point on the edge of visible region - they lead to unexplored territory
Enemy ants
Enemy holes
Points which haven’t been seen for more than 20 turns, but are found in the already explored region. There could be food there which we should gather.

In that way after 4 rounds of BFSes the map has been filled with priorities which increase the closer a tile is to one (or a few) of the goals. All that is left to do is take each ant and move it towards the most high priority tile in its view radius. This improvement led to a truly tangible improvement in strength. This despite the fact that I did not have time to optimize the coefficients for each type of goal.

Final Weekend Before the Final

These were the most active days for past few months. I tried to figure out the best system for battle with large numbers of ants. I moved forward with the help of unit tests. I found in the replays situations where I made mistakes, make a corresponding unit test and tried to fix the problem. The code turned out horrible, but I managed to fix 15 problems and not regress on any earlier ones :). This set of improvements game win a 60-70% win rate against the previous version 1vs1. And up to a 10% improvement in a free for all. Looking at the fact that the new version managed to gain second place on the official server, I held out great hopes before the start of finals!

What I Ran Out of Time To Do

All the same the final battle system ended up not nearly as good as xathis’. In order to see this, all one needs to do is watch a few games against him. I also don’t like I follow already captured territory in search of food when spreading out for maze maps mixed with the map of priorities. It seems that it would have been better to use a uniform grid coverage system like xathis or tetrapohedron. It would be nice if scouts would dynamically join battles if needed.

It’s time to finish this up. The article ended up not a small one. (Ed. I’ll say) :)

I want to thank Roma Ishenko :), my work colleague, for helping me brainstorm over lunch, thinking over new opportunities for improvement.

AI Challenge Code Dump

2011-12-19T10:29:00-07:00

Phew. The Google AI contest closed for entries last night. I had been working on a bot since the start of the contest, but I gave it up after running into timeout issues that I couldn’t track down.

With 36 hours to the deadline I jumped back into the game, read through the forums and decided to try a different approach than I had been pursuing. The code is ugly and there are almost no comments (none if you count only those that will make sense to anyone else) but I thought that this blog could use some more code on it, and some of you may be interested in how my diffusion approach looks.

With all those caveats and hedging, here you go:

import java.io.IOException; import java.util.*; /** TODO if closeToFood: efficiently eat food if closeToFrontier: explore if notCloseToFrontier: either move to conflict or to (nearest) part of external border if closeToEnemyAnt: if more numerous: attack else: flee and summon help if closeToEnemyAnthill: strike or surround defend own ant hill (keep it watched and then call in help) if reserves very large suspend food gathering and interior intelligence maintenance */ /* Version Name Ideas: Antilles*, Bossk*, Carth* /Chewie/Cracken, Dengar, Exar/Ebe, Fett/Fortuna, Greedo/Grievous, Han/Horn, Ig88, Jabba, Katarn/Kenobi/Kun, Lars/Leia/Lobot, Maul/Mothma, Organa, Porkins, Quadrinaros, Rendar/Ruhk, Solo, Tarkin, Vader, Wicket/Windu, Xizor, Yoda, Zorba/Zuckuss */ /* This Version is Carth */ public class MyBot extends Bot { public MyBot() { } public static void main(String[] args) throws IOException { new MyBot().readSystemInput(); } Map visitedAgo = new HashMap(); Set seenFood = new HashSet(); Set seenEnemyHills = new HashSet(); @Override public void doTurn() { Ants state = getAnts(); Board.setAnts(state); findExloreTargets(); double[][] diffArray = new double[Board.getRows()][Board.getCols()]; Map consts = new HashMap(visitedAgo); for (Tile ea : state.getEnemyAnts()) { for (Tile ma : state.getMyAnts()) { if (state.getDistance(ea, ma) <= 4) { List possibleDirs = state.getDirections(ma, ea); Tile escapeTo; if (Board.canIssueOrder(ma, possibleDirs.get(0))) { escapeTo = Board.getTile(ma, Aim.getOpposite(possibleDirs.get(0))); consts.put(escapeTo, 10000); } else if (possibleDirs.size() > 1 && Board.canIssueOrder(ma, possibleDirs.get(1))) { escapeTo = Board.getTile(ma, Aim.getOpposite(possibleDirs.get(1))); consts.put(escapeTo, 10000); } } } } for (Tile a : state.getMyAnts()) consts.put(a, 0); //TODO don't do as const? for (Tile mh : state.getMyHills()) consts.put(mh, 0); for (Tile w : Board.waterTiles) consts.put(w, 0); final int FOOD = 5000;//TODO use turns of hillAnts left to decide value of getting food for (Tile f : seenFood) consts.put(f, FOOD); final int E_HILL = 9000000; for (Tile eh : seenEnemyHills) consts.put(eh, E_HILL); final int E_ANT = 100; for (Tile ea : state.getEnemyAnts()) consts.put(ea, E_ANT); for (int r = 0; r < Board.getRows(); r++) { for (int c = 0; c < Board.getCols(); c++) { Tile t = new Tile(r, c); diffArray[r][c] = consts.containsKey(t) ? consts.get(t) : 0; } } final int ITERS = state.getViewRadius2() * 7; for (int i = 0; i < ITERS; i++) { for (int r = 0; r < Board.getRows(); r++) { for (int c = 0; c < Board.getCols(); c++) { Tile t = new Tile(r, c); if (!consts.containsKey(t)) { double total = 0; for (Aim a : Aim.allDirections()) { Tile neighbor = Board.getTile(t, a); total += diffArray[neighbor.getRow()][neighbor.getCol()]; } diffArray[r][c] = total / 4; } } } } for (Tile mh : state.getMyHills()) { diffArray[mh.getRow()][mh.getCol()] = Double.NEGATIVE_INFINITY; } List myAnts = new ArrayList(state.getMyAnts()); for (Tile ma : myAnts) { double[] choices = new double[5]; choices[0] = Board.canIssueOrder(ma) ? diffArray[ma.getRow()][ma.getCol()] : Integer.MIN_VALUE; int i = 1; for (Aim a : Aim.allDirections()) { Tile nt = Board.getTile(ma, a); choices[i] = Board.canIssueOrder(ma, a) ? diffArray[nt.getRow()][nt.getCol()] : Integer.MIN_VALUE; i++; } double maxV = choices[0]; int maxI = 0; for (i = 1; i < 5; i++) { if (choices[i] > maxV) { maxV = choices[i]; maxI = i; } } switch (maxI) { case 1: Board.issueOrder(ma, Aim.NORTH); break; case 2: Board.issueOrder(ma, Aim.EAST); break; case 3: Board.issueOrder(ma, Aim.SOUTH); break; case 4: Board.issueOrder(ma, Aim.WEST); break; default: break; } } } private void findExloreTargets() { Set curSeenFood = new HashSet(); for (Tile sf : seenFood) { if (!Board.isVisible(sf)) { curSeenFood.add(sf); } } Set curSeenHill = new HashSet(); for (Tile seh : seenEnemyHills) { if (!Board.isVisible(seh)) { curSeenHill.add(seh); } } for (Tile eh : Board.getEnemyHills()) { curSeenHill.add(eh); } seenEnemyHills = curSeenHill; for (int r = 0; r < Board.getRows(); r++) { for (int c = 0; c < Board.getCols(); c++) { Tile t = new Tile(r, c); if (Board.isVisible(t)) { visitedAgo.remove(t); if (Board.getTileType(t) == TileType.FOOD) { curSeenFood.add(t); } } else if (Board.afaikIsWalkable(t)) { int newVal = visitedAgo.containsKey(t) ? visitedAgo.get(t) + 1 : 1; visitedAgo.put(t, newVal); } } } seenFood = curSeenFood; } }

That isn’t the only code I modified, a majority of the modified code is in two more files, Board.java and Assigner.java, plus a small tweak to Aim.java (adding the allDirections method). If you need to see them check out the full gist.

Board.java is a static version rewrite of Ants.java, a sort of utility class. Perhaps its not the best coding practice to make it static so I didn’t have to pass it around, but I was in a hurry and by definition the code would only live for 34 hours (making the code on the server, sort of Zombie code).

The Assigner.java class was my attempt at a breadth-first search solution. It was more efficient at getting food, but timed out too much and so I dumped it instead of trying to sort out its performance.

Don't Watch TV News

2011-11-30T17:04:00-07:00

Don’t watch TV news, especially the 24 hour news coverage. The incentives are aligned so that news is fast, inaccurate and shocking, and often inaccurate in order to be shocking.

A number of studies have consistently shown that TV news makes people more stupid, i.e. more misinformed, more irrational, more confused, and less able to draw reasonable conclusions or make good decisions.

Instead take the time to accomplish or progress towards your goals. I’ve slowly been cutting down on my TV and for fun internet browsing. It actually makes a difference in how I feel about the direction of my life. Not so much because I’m “wasting” less time, but that I am moving towards my goals. And that feels good. TV news should be the easiest to cut from your life, mostly because it’s crap. If I want news I’ll catch some NPR or read the Economist, the more leisurely news schedule is actually a good thing.

Don’t believe me that TV news is crap?

Some News Leaves People Knowing Less
Part Deux: Many Think US is bailing out Greece; NPR, Jon Stewart Out-Fox Cable News
Iyengar, S. 1991. Is Anyone Responsible? How Television Frames Political Issues. Chicago : The University of Chicago Press.
Iyengar, S. et Kinder, D. R. 1987. News That Matters: Television and American Opinion. Chicago : The University of Chicago Press.
The New Videomalaise : Effects of Televised Incivility on Political Trust DC Mutz - American Political Science Review, 2005 - Cambridge Univ Press

Algorithms Are Not a Four-Letter Word

2011-11-24T15:30:00-07:00

Algorithms are important. And actually interesting too, especially if you get away from the dry textbooks and into some applications.

This presentation does that with minimum spanning tree algorithms and shows how they can be used to generate mazes. Watching as the mazes are created is beautiful, so check them out if that’s all you get from this post.

In addition he uses these algorithms as examples of deliberate practice, something I’ll be talking about a lot more, hopefully soon. That’s because it is important for maximizing one’s potential productivity, and Jamis makes a convincing case for deliberate practice.

Haskell Amuse-Bouche: Google Tech Talk [Video]

2011-11-21T15:26:00-07:00

As I mentioned in a previous post one of the languages I will be learning soon is Haskell. If you’ve never heard of Haskell or you have and don’t know why you should be interested in learning it, this Google Tech Talk is for you. I really enjoyed it and am excited to have my mind wrung out and then filled up with new ways of thinking about programming.

10x Productivity and Developing Expertise: Introduction

2011-11-16T17:47:00-07:00

Abstract

The fact that some programmers are ten times better than others is widely cited and gained a lot of exposure thanks to bloggers like Joel Spolsky. The original experiment to uncover this effect is over 40 years old, and yet the reasons why some are so much more effective than others is poorly understood, in the research and especially among programmers in the field.

This post will hopefully be the start of a series of posts on the differences between the best, worst and median programmers as seen in factors of ten difference in productivity, even when experiments control for experience. I’ll delve into how an individual programmer can become more like the most effective and productice programmers and how small teams, also affected by large differences in productivity, can make themeselves more competitive and productive. We’ll tour the psychological studies and mine the software engineering studies to find out what has been proven to work.

Based on the research I’ve done so far, likely future topics include: increasing efficiency, avoiding rework through design best practices, accurate requirements, focus on quality, doing less through simplicity and reuse, and team programming methodologies such as Agile and CCMI.

Oh, if this sounds like a scholarly research paper … that’s because it is based on a paper I did. But now with the new format I can dig even deeper and spend more time on specific issues that didn’t fit the audience of the actual paper. Hooray for the internet and stay tuned.

Introduction

Programming productivity is a phenomenon primarily studied from the perspective of a business owner or manager, and it seems obvious why. Increasing the value of what you produce relative to your costs leads almost directly to profit, the purpose for a company’s very existence. If a company cannot thrive in a competitive market, increased productivity allows it to decrease prices to survive. The correlation between productivity and profit is even stronger when talking about software because the marginal cost of delivering completed software to a customer is essentially zero. Almost the entire cost to a software company is the cost of production.

Since most of the research on programming productivity is from the employer’s perspective, it is less obvious to an individual how they could increase their own productivity. It is also less obvious why a programmer would want to increase his productivity.

The rationale that most excites me is the opportunity to increase my influence and power to achieve my own goals and make an impact on the world. If I choose projects that have a positive impact on the world, then by becoming better at programming I can have a larger positive impact on the world. That excites me quite a bit.

However, there is another reason to try and become more productive, the same one as for businesses in fact: profit.

More productive employees are more valuable to and better compensated by businesses as evidenced by the wide ranges of salaries for developers. According to the Bureau of Labor Statistics, those at the twenty-fifth percentile make 1.54 times as much as those at the seventy-fifth percentile, while those at the tenth percentile make 1.9 times as much (Bureau 2011). For most individual developers there is significant room to improve their individual earnings.

Presumably a wide variability in programmer performance justifies the wide variability in programmer earnings. In fact there is much more variability in performance than in earnings. Whereas some programmers make about 2.5 times what others make (Bureau 2011), some programmers are ten times as productive as others or more. In the 1960s Sackman, Erikson and Grant conducted the first study to note this variability. Despite some methodological flaws, their data demonstrated a greater than tenfold difference in productivity between the best programmers and the worst. From their sample of professional programmers with at least seven years’ experience, they found no relationship between experience and productivity or code quality (Sackman 1968). This result has been confirmed in study after study: even among programmers with the same level of experience, some are ten times as productive as others (Curtis 1981; DeMarco 1985; Curtis 1986; Boehm 1988; Valett 1989; Boehm 2000).

Knowing that programmer productivity varies so much, several questions demand answers. What separates the best programmers from mediocre programmers? Is it possible to significantly improve my own productivity? How? For the remainder of the series I will do my best to answer these questions. This is still an active area of research, however, with many possible upcoming breakthroughs, but I will focus on things that have been well researched thus far.

Making Things Happen: A Simple Help Queue

2011-11-09T17:13:00-07:00

I’m currently a TA for BYU’s intro to programming class (CS 142) which is fun and interesting and a good opportunity to cement the intro concepts in my mind by teaching and explaining them as simply as possible.

There was one negative about being a TA. We had to stare at this all day:

This is the Help Queue as viewed by a TA, which sits open eternally, (while on shift at least) watching us like those annoying people who read over your shoulder, while you are trying to read some programming article or work on the wiki or run through next week’s lab. Then all of a sudden it’s like the bat signal, interrupting what you’re doing, because somebody, somewhere, needs help.

That’s fine and dandy but there are two problems with it though: it’s ugly and it doesn’t quite work the way we the TA’s or the students like.

See? Ugly.

Batman to the Rescue

I wish. Instead I volunteered to play around with the code.

Did I know Python? Not really, unless you count completing the first few challenges from the Python Challenge.

Did I know Google App Engine? Well, not unless you count hearing about it.

Had I ever written css? Nope, never.

Did that stop me? Nope, even though it would be generous to say I was poorly prepared for the project.

The hardest part of jumping into an existing project, as usual, was reading, understanding and getting up to speed with the existing code.

6 or seven iterations later based on feedback, 18 coding hours and only an hour and a half spent on the css, I ended up with this: Note that Sara requested help after Jared but gets priority because she is ready to pass off and go home.

I’ve gotten really excited about this. More than I probably should. Family members and roommates alike respond with a big whoop. However, I was really surprised by how easy picking up and creating a decent looking page using CSS really was. I’m not going to claim that it is beautiful though. Far from it.

“Ooh, Shiny! Can I Touch It?”

Yes, yes you may. I created a demo site so that you could play with and not cause too much mayhem.

Here’s the link TA’s use to give help. And here’s what students see when they need to ask for help.

There are still a few performance issues, made especially apparent by the recent changes in AppEngine’s price structure. Also the code is a little messy.

If anyone’s interested I’ll refactor it, fix performance issues and release it as a small open source project on my github account. Comments and Questions Welcome

Ants AI Challenge: Getting Started

2011-11-02T15:48:00-06:00

The Ants AI Challenge is an very cool competition to test and improve your coding chops. Essentially you write a program to control a colony of ants like the Bugger Queen from Ender’s Game would.

The official website does a pretty good job of getting you started, but I’ll give you some reccomendations having recently run through the startup process.

Go Through the 5 Minute Quickstart Guide

This will get you through the account setup process and show you how to upload your bot to be tested after you’ve modified it.

Go Through The Tutorial

This will help you develop simple versions of most of the tactics you’ll likely be working on for the rest of the contest:

Avoiding Collisions
Gathering Food
Exploring the Map
Managing New Ant Arrival through the Ant Hills
Exploring the Map
Attacking and Defending Hills
Battles with other Ants
Pathfinding

Download and Learn How to Use the Tools

As you build and improve your bot you’ll want to make sure that it still works and is improving. The simplest way to do that is to run a game using the tools. Actually the tutorial will go over this, but its important enough to say twice.

If your bot is crashing you’ll want to see what went wrong, but its not obvious at first glance how to do that. What you need to do is add the -E option to your game playing script. This will print out any errors (and any debug information piped to stderr) into the game_logs folder by bot. This has been extremely helpful for me.

Some Other Cool Resources

Just keep in mind that I haven’t dug into some of these yet.

Better starter bot code - Some changes have been made to the starter bots in the code repository, but haven’t yet been pushed to the zip files the website gives out. I know for a fact that the Java version is missing some useful code (seen vs unseen tracking). Download a zipfile or tarball if the code looks different than what you’ve got.

A better visualizer - so that you can see what your bot is thinking without having to resort to print statements. Very cool.

Tcp Servers - Is the official server playing games too slow for you? Run games on your own computer playing other bots doing the same thing through a tcp server. All the best bots are doing it …. :)

One Reason Array Based Programming Is Amazing [Video]

2011-10-27T13:36:00-06:00

I mentioned array based programming yesterday as one of the paradigms to consider when choosing a new programming language. If you want a good reason why you should consider them, take a look at this video of Conway’s Game of Life in APL, one of a few Array Based Programming Languages:

For some more craziness see some code golf for Project Euler Problems using K, another array based language. Remember though, this is code golf, so solutions should be expected to be hard to understand and much less readable than production code.

Expand Your Mind With Your Next Programming Language

2011-10-26T23:46:00-06:00

Choosing A Language

Rationale

When choosing a language to program in there are usually many factors that help you make the decision. The one most commonly relied on is probably familiarity. That is often okay because it is often the correct answer for the more important question, “Which language will help me get this project done best?” Usually the answer is the language you know best.

When I don’t have to use a language I already know for business reasons or expediency, I try and pick a language that will expand my mind, even if I don’t expect to use it for something “serious” or for pay.

Why do I struggle to accomplish the same tasks I can blaze through in Java or C++? I promise it’s not because I’m a glutton for punishment. I do it because knowing more than one language allows me to step out of the “I’m programming in X” mindset, where my solutions and thinking are restricted by language X. I want to have all of the ideas from programming available to me as I think about and solve the problem. Then I can implement those ideas in whichever language I end up using.

Methodology

Alright, by this point you’re hopefully on board with the idea that the next language you learn should expand your mind. Keeping that in mind, how should we choose our next language?

The important thing to understand is that most programming languages are more alike than they are different. Usually this is great. Once you’ve learned one programming language it is much easier to learn a second. For example it is not too hard for a C++ programmer to get used to Java, or even for a Java programmer to switch to Python or Ruby. The major ideas between these languages are mostly the same. A programmer who is switching will spend most of their learning time getting used to the libraries i.e. where everything they are used to is at.

These large similarities make it harder for us to pick a programming language that will be sufficiently different from what we are used to. If we don’t pick something weird enough we won’t get the kind of mental challenge and growth we are looking for.

A helpful tool to figure out how similar languages are is to group them into paradigms. I’m going to assume that your first or most recent experience is in an imperative / Object-Oriented language like Java or C++. If that’s not the case then I’d recommend starting there so that you can communicate easily with the majority of the profession who did start there.

If that doesn’t describe you, make a list of the languages that you know and cross off any of the paradigms that they fit. If you’re not sure, Wikipedia can help you out. Then pick a language from the remaining paradigms that interests you the most.

I’ve only listed one example for each paradigm. This is because an exhaustive list is 1. impossible, 2. exhausting to put together and 3. nobody will be satisfied when you’re done. If you don’t like my choice for a paradigm, choose your own. That’s fine. Great even. The best language to learn is one where you know you can get some help. If that help is someone you know in person, even better. That’s why when you see my plan for the future, you shouldn’t be surprised that two of them aren’t even on my primary list. My roommate right now is pretty good at Haskell and (almost) always available to answer questions in person. As for Racket instead of Scheme, there are good textbooks and one of my professors knows Racket really well.

Anyways …

Recommendations

Object-Oriented: Take your pick
Imperative/Procedural: An assembly language
Functional: Scheme
Logic: Prolog
Concatenative (or Stack-based): Joy
Array Based: J - on wikipedia
Dataflow: Lucid - on haskellWiki
Multi-Paradigm: Oz - on wikipedia

Minor Paradigms

After you’ve gone through each of the major categories this list could help you choose a sub-paradigm.

OO as Alan Kay envisioned: Smalltalk
OO Aspects: AspectJ
OO Roles: Perl 5 with Moose
OO protoype based: io
Functional “dynamically” typed: Scheme
Functional “statically” typed: Standard ML
Functional lazy: Haskell
Functional for error tolerance and distributed systems: Erlang
Statistics: R
Hardware Description: SystemVerilog
Text manipulation: Awk (and regex if you don’t know it well)

What About Me?

What I Know

1 semester of C
1 semester of Assembly
2 years of Java
A few weeks of Python
2 semesters of C++
Learning Scala right now

So I’ve hit the imperative and object oriented angle pretty hard and barely started touching on the functional side with Scala.

My Plan

Racket for programming languages course - Eager, dynamically typed functional language
Haskell from a friend before he graduates - Lazy, statically typed functional language
Prolog
Hadoop
J
Joy
????

P=NP: A Timey Wimey Conversation

2011-10-19T20:33:00-06:00

If you are baffled by the discussion about P=NP and what it all means, that’s ok. I was too. Luckily for you and past me this post can communicate with both of you through time and space.

Let’s get cracking then. Computer scientists do not know whether P = NP or P != NP.

Hold on a second, isn’t that trivial? Isn’t that the same thing as finding out whether N = 1 or not, or maybe if P = 0 or not? Wait, what do N and P stand for anyways?

That last one is a really good question. N and P aren’t variables, so you can’t solve for them algebraically like you tried to do there. They are actually abbreviations for different groups of problems. So N stands for nondeterministic and P for polynomial.

Ok, so you’re saying that there are two groups of problems, polynomial and nondete… whatever polynomial problems and that computer scientists don’t know if these groups of problems are the same or not? This still sounds easy. If there are different categories aren’t they by definition not the same?

Hmmm …. I guess it might be important to explain the basics of time complexity. Time comple-

Time complexity? Like how time travel stories get really complex and stop making sense if you think about them too much?

Heh, not quite, but speaking of time travel stories have you seen Primer?

You’ve at least read Heinlein’s By His Bootstraps?

No. You know I haven’t.

Oh, right.

Look you’re the one who contacted me. Let’s get on with this if you insist.

Alright. Time complexity means that some problems can be solved faster than others. Sure. The problems that can be solved quickly are also part of the set of problems that take longer to solve because you could take longer to solve the problem if you wanted to. To get more concrete, problems in P can be solved in polynomial time on a deterministic computer, and problems in NP can be solved in polynomial time on a nondeterministic computer.

Hold on. You keep mentioning “determinism” but I have no clue what that is supposed to mean.

Hmmm …

Well first of all determinism is best explained with its fraternal twin nondeterminism. As you know, an algorithm is a series of steps that when followed produce a consistent result. I guess I do now. Ok, well imagine that there is a choice in those series of steps, like a fork in the road, but only one path leads to your desired result. A deterministic computer would pick one of the paths somehow and try that path. Eventually it might run into a dead end and decide that it should have taken the other path and will come back and try that one.

Now this is where things get a little weird, a nondeterministic computer works almost the same except that it has a perfect, temporarry cloning machine. When it gets to a fork in the road it doesn’t even make a choice it makes a perfect copy of itself, including another cloning machine and one copy goes down one path and the other down the second. When one of the copies finds the answer he shoots a flare into the air (letting the computer’s operator know that it has an answer) and all the other copies disappear.

Now you could imagine that with all those free copies you do a lot more work in the same amount of time. There’s only one problem though, as you might have guessed we have no really good ideas for creating this magical cloning device. The only nondeterministic computer exists in Theory Land, a magical place Computer Scientists like to pretend exists. Why? … Because it lets them focus on the essential parts and not all the tricky implementation details when coming up with new ideas. Unfortunately for us, our friend the nondeterministic computer doesn’t seem likely to leave Theory Land anytime soon.

Then why talk about it?

Well, NP problems can be solved on a deterministic computer, which we do have in the real world. This just takes much longer than to solve problems from P. A deterministic computer has to do all the work himself, take all the paths until he gets an answer. In fact as soon as your problem is bigger than a toy example, the time to solve the problem grows so quickly that solving it is impractical.

Is there any way for a computer to solve large versions of those problems?

Well, if you don’t have to get the 100% accurate solution there are often ways to get slightly imprecise answers much quicker than it would take to find the perfect solution. The point is that finding the exact solution very quickly becomes impractical. If P=NP though, we have proven that there is an easy way to solve all these problems.

Isn’t that a good thing though?

For some things, absolutely! For instance we could improve shipping and infrustructure efficiency if we could solve the travelling salseman problem. There would be some problems though. We rely on the fact that some problems are hard in order to keep your credit card number and other sensitive data safe when you send it over the internet.

Come on, how often do I do that?

Ok, imagine a world without Amazon.

My book selection! And I’d have to actually get off my butt and go somewhere, like … Barnes and Noble I guess. No thanks.

I know exactly how you feel. Rolls eyes. A world where encryption can’t be trusted or can be easily broken may be a step backwards.

I guess I can see why finding out that P=NP might be important. You would have to implement some sort of disaster plan in order to prevent an e-commerce collapse. Speaking of disaster plans do you have Zombie plan yet?

Nope, I just kind of made it up as I went along. What!?! Uhh … nevermind. I don’t have much time left before this connection closes …

And this is what you choose to tell me? Can’t you at least throw out some stock ticker symbols?

No, shut up and let me finish one last thing. But … Shush! I forgot to tell you how people try to prove P=NP or P!=NP. Proving that a problem in NP is also P does get you faster solution for that problem, but that just means we mislabeled the problem in the first place. NP-complete problems are a subset of NP and have been proven equivalent to each other. You can actually translate each of them into any other NP-complete problem which is cool. Is it? Shush.

NP-complete problems have the property that if you can prove that one NP-complete problem is actually in P then you have proven that all NP-complete problems are in P and actually at the same time, that all NP problems are also in P.

I guess that is kind of cool. But I still don’t see why proving either P=NP or P!=NP is important.

Actually it may not be. Some people think it is pretty important, or at least prestigious and spend many hours trying to prove one of the two theorems. Every few years someone thinks they have it solved, only to have missed something. Others feel finding out one way or the other may actually not be that important.

You really could have picked something more valuable for us both than this.

Hey, it’ll save you from some confusion, and it is an important theory from Computer Science, which you are studying.

I am?

Uhh, I’m almost out of time, so before I go, here’s a piece of paper with the essentials in case you forget. Farewell and good luck!

The Note

P and N are not variables

P and NP are abbreviations for the names of two different classes of problems, (polynomial and nondeterministic polynomial)

P is a subset of NP. Problems in P are also in NP but as far as we know not vice versa.

Problems in P likely have practical solutions while problems in NP have no practical exact solutions.

If P=NP all problems that a computer can solve have a fairly easy solution (whether we curently know that solution yet or not)

This would be good for some things (transportationg and shipping among others) and bad for others (encryption and e-commerce)

NP-complete problems are a subgroup of NP and are all equivalent to each other. They are important because they can can be used to prove whether P=NP or P!=NP. Proving that one NP-complete problem is in P is equivalent to proving all NP problems are in P.

AAPL Buy $83 Sell $424