<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-26866996</id><updated>2024-10-09T02:30:10.415-05:00</updated><category term="products"/><category term="analysis"/><category term="java"/><category term="reports"/><category term="tips"/><category term="ideas"/><category term="language design"/><category term="bugs"/><category term="functional programming"/><category term="lisp"/><category term="techniques"/><category term="lighthearted"/><category term="best practices"/><category term="mathematica"/><category term="scheme"/><category term="types"/><category term="news"/><category term="performance"/><category term="ml"/><category term="ruby"/><category term="careers"/><category term="definitions"/><category term="design principles"/><category term="groovy"/><category term="javascript"/><category term="lambda calculus"/><category term="python"/><category term="webapps"/><category term="c++"/><category term="compilers"/><category term="contest"/><category term="ila"/><category term="interviews"/><category term="mathematca"/><title type='text'>JFKBits</title><subtitle type='html'>Programming Language Engineering Perspectives</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default?redirect=false'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default?start-index=26&amp;max-results=25&amp;redirect=false'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>158</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-26866996.post-6917875458426511670</id><published>2010-03-08T11:43:00.000-06:00</published><updated>2010-03-08T11:44:09.455-06:00</updated><title type='text'>Code Turbulence</title><content type='html'>Code turbulence: the chaotic, stochastic changes in the code base resulting from functionally equivalent changes to names, function signatures, etc. that require everyone to stop their progress and synchronize with.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/6917875458426511670/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/6917875458426511670' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6917875458426511670'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6917875458426511670'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2010/03/code-turbulence.html' title='Code Turbulence'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-5099288202315671606</id><published>2009-07-23T15:57:00.002-05:00</published><updated>2009-07-23T16:02:13.559-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="contest"/><category scheme="http://www.blogger.com/atom/ns#" term="performance"/><title type='text'>Speedlink: Engine Yard SHA-1 Contest Winner Writeup</title><content type='html'>Another blog from the Engine Yard&#39;s winning team: &lt;a href=&quot;http://fnord.phfactor.net/2009/07/23/paul-wins-an-iphone/&quot;&gt;Paul Wins an iPhone&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;This fills in some of the details left out in &lt;a href=&quot;http://www.win.tue.nl/cccc/sha-1-challenge.html&quot;&gt;the more prominent post&lt;/a&gt;, such as &lt;a href=&quot;http://cr.yp.to/nearsha.html&quot;&gt;the hand-optimized C code for getting 10x improvement on the SHA-1 for ordinary CPUs (not the GPU)&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/5099288202315671606/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/5099288202315671606' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/5099288202315671606'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/5099288202315671606'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/07/speedlink-engine-yard-sha-1-contest.html' title='Speedlink: Engine Yard SHA-1 Contest Winner Writeup'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-221943282193001536</id><published>2009-07-21T20:10:00.003-05:00</published><updated>2009-07-22T08:58:03.870-05:00</updated><title type='text'>A Crack at Cracking the EngineYard&#39;s SHA-1 Challenge</title><content type='html'>That was fun. This is a brief report on my strategy for the just-concluded Engine Yard programming contest. My farm of thirty-something CPU cores, with Polar SSL&#39;s SHA-1 hash code and my wrappings managed to come in about #57 with a Hamming distance of 36 for &lt;tt&gt;key dijkstra parameters format i10n expo dedicated crispin programming output inspect beta&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;I wasn&#39;t originally going to enter, but after reading some of the blogs and forum discussions, my interest was piqued and I started thinking about the problem.&lt;br /&gt;&lt;br /&gt;I began to get pulled in by counting up the order of magnitude for the search space, and the claim of someone in a discussion forum that he could &quot;solve&quot; the contest in about 5 minutes with about 30,000 nodes.  My estimate was that the search space was about 10^63, large enough to make a lot of decisions pretty clear.&lt;br /&gt;&lt;br /&gt;Also, I had higher priority activities (work, family, intermittent power outages that were traced back to a tree rubbing away the insulation on one wire of our service line and neutral), so I didn&#39;t start coding for the contest until it was already underway, which of course is a bit self-defeating, but I&#39;m happy with where I ended up.&lt;br /&gt;&lt;br /&gt;Hardware: I have a collection of nodes at work that are typically unused, most of them of relatively recent vintage meaning somewhere around 3GHz.  Mac, Linux, and Windows are all represented, most of them dual core with a couple of 4- or 8-core machines, all of them AMD 64 chips.  At the peak I had 37 cores running.&lt;br /&gt;&lt;br /&gt;Software: Prototyped in Mathematica, then recoded in C.  &lt;a href=&quot;http://polarssl.org/?page=show_source&amp;type=source&amp;file=sha1&quot;&gt;Polar SSL&#39;s SHA1&lt;/a&gt;, perfectly acceptable Hamming distance function based on a bytewise bit counting lookup table, and a simple enumeration technique for generating candidate phrases that ignores the capitalization and random character suffix parts of the search space.&lt;br /&gt;&lt;br /&gt;Parallelism: Manually start the contest executable by hand:&lt;pre&gt;&lt;br /&gt;% nohup contest &amp;&lt;br /&gt;&lt;/pre&gt; Not scalable, but fine for the dozen-odd machines I had.  I did two or three rounds of manually killing all the executables and restarting them with an updated version, which did get old.  To distribute the workload, the executable picks a random initial phrase.  Random generator is seeded with the random seed based on the current time and process ID.&lt;br /&gt;&lt;br /&gt;Tracking minimum: contest executable is compiled with the current best known minimum.&lt;br /&gt;&lt;br /&gt;Status reporting: All programs are started from the same network file system directory and write updates there.  When an executable finds a new minimum, it writes the phrase to a file using the Hamming distance, hostname, and PID in the filename.  My Windows explorer window, sorted by name looks something like this, and I just ignore or delete files with higher scores from nodes that don&#39;t know a better score has been found:&lt;pre&gt;&lt;br /&gt;36-mylinuxbox-10485.txt&lt;br /&gt;36-my8corewindowsbox-7361.txt&lt;br /&gt;38-mylinuxbox-9587.txt&lt;br /&gt;38-mymacmini-82075.txt&lt;br /&gt;38-an8corelinuxbox-30256.txt&lt;br /&gt;...&lt;/pre&gt;&lt;br /&gt;Performance: the SHA-1 algorithm (of course) seemed to be the limiting factor for my C code, clocking in somewhere around 1M hashes/sec.  My initial simple and stupid method of forming candidate phrases was responsible for cutting performance in about half:&lt;pre&gt;sprintf(phrase,&quot;%s %s %s %s %s %s %s %s %s %s %s %s&quot;,&lt;br /&gt;  contestDictionaryWords[word[11]]], contestDictionaryWords[word[10]]], &lt;br /&gt;  contestDictionaryWords[word[9]]], contestDictionaryWords[word[8]]], &lt;br /&gt;  contestDictionaryWords[word[7]]], contestDictionaryWords[word[6]]], &lt;br /&gt;  contestDictionaryWords[word[5]]], contestDictionaryWords[word[4]]], &lt;br /&gt;  contestDictionaryWords[word[3]]], contestDictionaryWords[word[2]]], &lt;br /&gt;  contestDictionaryWords[word[1]]], contestDictionaryWords[word[0]]]);&lt;br /&gt;&lt;/pre&gt;All this copying is horribly slow, but it got me underway.  Before retiring for the night on Monday, I refined it so this copy was only done every 1000 times.&lt;br /&gt;&lt;br /&gt;When I went back and wrote something that optimally copies only what&#39;s changed, I saw around a 50% to 75% speed improvement.  One of the tricks in the improvement is to make two copies of the dictionary: one with a space appended to each word (for the first 11 words) and one without the space.  Here&#39;s how the final phrase initialization code looks:&lt;pre&gt;&lt;br /&gt;void initPhrase()&lt;br /&gt;{&lt;br /&gt;  int i, w;&lt;br /&gt;  unsigned char *p;&lt;br /&gt;&lt;br /&gt;  initRandom();&lt;br /&gt;&lt;br /&gt;  /* Start with a random enumeration to partition work */&lt;br /&gt;  for(i=0; i &lt; 12; i++)&lt;br /&gt;    word[i] = random() % 1000;&lt;br /&gt;&lt;br /&gt;  p = phrase;&lt;br /&gt;  for(i=11; i &gt; 0; i--){&lt;br /&gt;    w = word[i];&lt;br /&gt;    strcpy(p, contestDictionaryWordsWithSpace[w]);&lt;br /&gt;    phrasepartend[i] = p += contestDictionaryWordWithSpaceLen[w];&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  w = word[0];&lt;br /&gt;  strcpy(p, contestDictionaryWords[w]);&lt;br /&gt;  p += contestDictionaryWordLen[w];&lt;br /&gt;  phraselen = p-phrase;&lt;br /&gt;&lt;br /&gt;  printf(&quot;initial phrase: %s\n&quot;, phrase);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Hamming distance: my initial code for this seemed to perform well enough that I never bothered optimizing it further, e.g. performing the XORs a 32-bit word at a time:&lt;pre&gt;&lt;br /&gt;unsigned challengeDistance(unsigned char hash[20])&lt;br /&gt;{&lt;br /&gt;  int i, distance;&lt;br /&gt;  distance = 0;&lt;br /&gt;  for(i=0; i &lt; 20; i++){&lt;br /&gt;    distance += numOnes[challengeHash[i] ^ hash[i]];&lt;br /&gt;  }&lt;br /&gt;  return distance;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt; For each byte, xor the hash with the corresponding byte in the challenge hash, and use a precomputed lookup table to count the ones.&lt;br /&gt;&lt;br /&gt;Performance reporting: The main program structure is a &lt;tt&gt;while(1)&lt;/tt&gt; loop that generates and evaluates candidate phrases in batches of a few million.  These batches are times and current results are printed to the console.  I had dozens of console windows where I could keep an eye on progress: &lt;pre&gt;&lt;br /&gt;390392448 evals, 1000000.000000 eval/sec, 40 session min, 36 min&lt;br /&gt;400392448 evals, 909090.909091 eval/sec, 40 session min, 36 min&lt;br /&gt;410392448 evals, 909090.909091 eval/sec, 40 session min, 36 min&lt;br /&gt;420392448 evals, 909090.909091 eval/sec, 40 session min, 36 min&lt;br /&gt;430392448 evals, 1000000.000000 eval/sec, 40 session min, 36 min&lt;br /&gt;440392448 evals, 909090.909091 eval/sec, 40 session min, 36 min&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Conclusion: I put together a modest-performing program that performed about 100 billion searches during the contest and found a phrase that had a score of 36.  One thing I&#39;d like to do is analyze the performance of the SHA-1 hash implementation itself and understand why it takes a few thousand machine cycles, what the limiting factor is, and whether there&#39;s a better performing implementation.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/221943282193001536/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/221943282193001536' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/221943282193001536'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/221943282193001536'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/07/crack-at-cracking-engineyards-sha-1.html' title='A Crack at Cracking the EngineYard&#39;s SHA-1 Challenge'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-9155394117619478813</id><published>2009-04-28T13:30:00.007-05:00</published><updated>2009-04-28T13:59:33.709-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="mathematca"/><title type='text'>What Tools do I Need to Program from a Winnebago</title><content type='html'>Geez, does this guy get it: &lt;a href=&quot;http://collison.ie/blog/2009/04/hacking-for-fun-and-profit-with-mathematica-and-the-google-analytics-api&quot;&gt;Patrick Collison blog : Hacking for fun and profit with Mathematica and the Google Analytics API&lt;/a&gt;.  It&#39;s always nice to see Mathematica used in more traditional programming and software prototyping senses.&lt;br /&gt;&lt;br /&gt;The maps in Patrick&#39;s post prompted me to post a fun result I got yesterday.&lt;br /&gt;&lt;br /&gt;Below is a map of the United States showing me where all the comfy cities are --- according to my definition of comfortable.  These are the cities that show up in &lt;a href=&quot;http://www.google.com/url?sa=t&amp;source=web&amp;ct=res&amp;cd=1&amp;url=http%3A%2F%2Freference.wolfram.com%2Fmathematica%2Fref%2FCityData.html&amp;ei=v033SciAKI3IM6WhoK4P&amp;usg=AFQjCNEbCMiNrVxk7f1C0O9o8KZTaiO9LA&quot;&gt;CityData&lt;/a&gt; where the current temperature, grabbed from &lt;a href=&quot;http://reference.wolfram.com/mathematica/ref/WeatherData.html&quot;&gt;WeatherData&lt;/a&gt;, lies within my personally-defined comfort zone of 74-84 degrees Fahenheit (23.2C-28.9C).&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFugKqQUJhV98yFgZ86oaZWPOYkuKJj9YjgqId-2asRUFbLhaVrj0gqrjOWb3hwUmbjorvTe1xVg5tMDL5BvffXPpgX4trWbWneYFryuIfTG5L0HY-D4h-JJFU8YsoWWSjv2j7/s1600-h/us-comfort-zone.png&quot;&gt;&lt;img style=&quot;display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 360px; height: 163px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFugKqQUJhV98yFgZ86oaZWPOYkuKJj9YjgqId-2asRUFbLhaVrj0gqrjOWb3hwUmbjorvTe1xVg5tMDL5BvffXPpgX4trWbWneYFryuIfTG5L0HY-D4h-JJFU8YsoWWSjv2j7/s400/us-comfort-zone.png&quot; border=&quot;0&quot; alt=&quot;&quot;id=&quot;BLOGGER_PHOTO_ID_5329813873107777730&quot; /&gt;&lt;/a&gt;&lt;br /&gt;The post&#39;s title is a &lt;a href=&quot;http://stackoverflow.com&quot;&gt;stackoverflow.com&lt;/a&gt; nod and hints at the real question I had.  What I was curious to do but didn&#39;t have time is to explore the historic temperatures and answer the question &quot;if I lived in a motor home, where could I go during the year so that I was always living in a temperature range that I specify?&quot;  I don&#39;t really want to do that, I&#39;m used to the four seasons and lots of weather variety, but I started to wonder what those places would be.&lt;br /&gt;&lt;br /&gt;There&#39;s absolutely nothing clever to the code for this, it&#39;s sheer brute force actually. We just ask for all the U.S. cities in the database, look up their temperature, select those in the right range, and then plot them.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Needs[&quot;Units`&quot;];&lt;br /&gt;&lt;br /&gt;CityHasDesirableTemperatureQ[city_, low_, high_] := &lt;br /&gt; With[{lowC = ConvertTemperature[low, Fahrenheit, Celsius], &lt;br /&gt;   highC = ConvertTemperature[high, Fahrenheit, Celsius], &lt;br /&gt;   temp = WeatherData[city, &quot;Temperature&quot;]},&lt;br /&gt;  lowC &amp;lt;= temp &amp;lt;= highC];&lt;br /&gt;&lt;br /&gt;allCities = CityData[{All, &quot;UnitedStates&quot;}];&lt;br /&gt;&lt;br /&gt;Graphics[{Gray, CountryData[&quot;UnitedStates&quot;, &quot;Polygon&quot;], &lt;br /&gt;  PointSize[Large], Red, &lt;br /&gt;  Point[Reverse[CityData[#, &quot;Coordinates&quot;]]] &amp; /@ &lt;br /&gt;   Select[allCities, CityHasDesirableTemperatureQ[#, 74, 84] &amp;]}]&lt;br /&gt;&lt;/pre&gt;</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/9155394117619478813/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/9155394117619478813' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/9155394117619478813'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/9155394117619478813'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/04/what-tools-do-i-need-to-program-from.html' title='What Tools do I Need to Program from a Winnebago'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFugKqQUJhV98yFgZ86oaZWPOYkuKJj9YjgqId-2asRUFbLhaVrj0gqrjOWb3hwUmbjorvTe1xVg5tMDL5BvffXPpgX4trWbWneYFryuIfTG5L0HY-D4h-JJFU8YsoWWSjv2j7/s72-c/us-comfort-zone.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-7798046333430782003</id><published>2009-03-18T09:55:00.007-05:00</published><updated>2009-03-18T16:06:53.527-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="mathematica"/><category scheme="http://www.blogger.com/atom/ns#" term="products"/><title type='text'>Ship It</title><content type='html'>We just shipped the server side of one of my products, the &lt;a href=&quot;http://www.wolfram.com/products/gridmathematica/install/&quot;&gt;Wolfram Lightweight Grid System&lt;/a&gt;.  &lt;br /&gt;&lt;br /&gt;I am reminded of the immortal words of &lt;a href=&quot;http://coco.etechwds.com/&quot;&gt;Steve Bjork&lt;/a&gt;, famed &lt;a href=&quot;http://en.wikipedia.org/wiki/TRS-80_Color_Computer&quot;&gt;CoCo&lt;/a&gt; game programmer, when he said you know your program is going to be good because you start to hate it.  He seemed to capture that feeling you get when the thrill of seeing something cool come to life is long gone, and you&#39;re grinding through all the details that need to be nailed down so you can ship.&lt;br /&gt;&lt;br /&gt;Check out some of the &lt;a href=&quot;http://www.wolfram.com/broadcast/screencasts/lightweightgridsystem/&quot;&gt;screencasts&lt;/a&gt;.  Also, check out &lt;a href=&quot;http://blog.wolfram.com/2009/03/18/the-evolution-of-parallel-computing-with-mathematica/&quot;&gt;Roman&#39;s blog post&lt;/a&gt; for some of his longer-term perspective on parallel computing within Mathematica.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/7798046333430782003/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/7798046333430782003' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/7798046333430782003'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/7798046333430782003'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/03/ship-it.html' title='Ship It'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-8999698749446195178</id><published>2009-03-09T16:36:00.003-05:00</published><updated>2009-03-09T16:45:13.313-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="products"/><title type='text'>Speedlink: Harvest</title><content type='html'>&lt;a href=&quot;http://www.getharvest.com/&quot;&gt;Harvest&lt;/a&gt; is a time-tracking webapp that has a broader scope than &lt;a href=&quot;http://jfkbits.blogspot.com/2009/03/speedlink-time-tracking-tool.html&quot;&gt;Time Tracking Tool&lt;/a&gt;, and so far I like it a bit better for my needs.  The web interface is just a tad more efficient than that that of TTT.  &lt;br /&gt;&lt;br /&gt;Harvest is geared for businesses and generating invoices for billable hours, so it has this concept of Timesheets that let you pick tasks from your projects.  This matches how I work pretty well, even though I&#39;m not directly billing anybody.  On my timesheet I can see at a glance where my time has gone for that day, and how much time total I&#39;ve spent so far.  This is the number one thing I need on an ongoing basis because my schedule is so irregular and I task switch so often between home and work.&lt;br /&gt;&lt;br /&gt;I haven&#39;t looked into what support it has for planning, and tracking how estimates match actual figures, but I imagine the support is not there.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/8999698749446195178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/8999698749446195178' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8999698749446195178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8999698749446195178'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/03/speedlink-harvest.html' title='Speedlink: Harvest'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-8462145073396317728</id><published>2009-03-07T10:49:00.003-06:00</published><updated>2009-03-07T10:54:03.820-06:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="products"/><title type='text'>Speedlink: Time Tracking Tool</title><content type='html'>&lt;a href=&quot;http://products.lontzen.net/ttt/index.php&quot;&gt;Time Tracking Tool&lt;/a&gt;, a program that I&#39;m evaluating.  This is pretty close to what I&#39;d want to use.  It has the Play/Pause notion, as well as a task hierarchy.  When you activate a task, it shows the time you started, to help remind you if (when) you forget to pause it when you leave the computer.  That&#39;s probably the chief drawback to the whole problem, when you get distracted from a task (phone call, someone stops by your desk to ask a question), by nature it&#39;s hard to remember to tell the computer what&#39;s going on.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/8462145073396317728/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/8462145073396317728' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8462145073396317728'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8462145073396317728'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/03/speedlink-time-tracking-tool.html' title='Speedlink: Time Tracking Tool'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-8298278948587045350</id><published>2009-03-05T10:00:00.003-06:00</published><updated>2018-11-12T12:09:35.660-06:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="mathematica"/><title type='text'>Stephen Lifts the Lid on Wolfram|Alpha</title><content type='html'>We finally have said something public about &lt;a href=&quot;http://www.wolframalpha.com/&quot;&gt;Wolfram|Alpha&lt;/a&gt;.  The tagline for the product is &quot;Computational Knowledge Engine&quot;, and &lt;a href=&quot;http://blog.wolfram.com/2009/03/05/wolframalpha-is-coming/&quot;&gt;Stephen&#39;s blog post&lt;/a&gt; introduces the concept this way:&lt;br /&gt;
&lt;blockquote&gt;
Fifty years ago, when computers were young, people assumed that they’d quickly be able to handle all these kinds of things.&lt;br /&gt;
&lt;br /&gt;
And that one would be able to ask a computer any factual question, and have it compute the answer.&lt;/blockquote&gt;
</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/8298278948587045350/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/8298278948587045350' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8298278948587045350'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8298278948587045350'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2009/03/stephen-lifts-lid-on-wolframalpha.html' title='Stephen Lifts the Lid on Wolfram|Alpha'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-802754259195970848</id><published>2008-07-18T17:10:00.003-05:00</published><updated>2008-07-18T17:15:23.854-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="lighthearted"/><category scheme="http://www.blogger.com/atom/ns#" term="mathematica"/><title type='text'>Quote of the Day</title><content type='html'>We just wrapped up the &lt;a href=&quot;http://www.wolfram.com/news/events/summerschool2008/&quot;&gt;Advanced Mathematica Summer School&lt;/a&gt;, two weeks of lectures and on-site consulting and coaching for attendees in developing a Mathematica project idea.  As a developer, this was an awesome opportunity to work hands-on with real users and develop a better feeling for (1) real world problems, and (2) ways that people use our products, versus what we may assume they want and what they know.&lt;br /&gt;&lt;br /&gt;During the lightning presentations this morning, one attendee wanted to express his appreciation:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&lt;em&gt;&quot;I want to thank Wictor for writing some really cool code for me,&lt;br /&gt;and I want to thank Paul-Jean for helping me understand the code that Wiktor wrote.&quot;&lt;/em&gt;&lt;br /&gt;&lt;/blockquote&gt;</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/802754259195970848/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/802754259195970848' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/802754259195970848'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/802754259195970848'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/07/quote-of-day.html' title='Quote of the Day'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-1275331679331408415</id><published>2008-07-16T23:22:00.004-05:00</published><updated>2008-07-16T23:42:09.132-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="java"/><category scheme="http://www.blogger.com/atom/ns#" term="mathematica"/><category scheme="http://www.blogger.com/atom/ns#" term="reports"/><title type='text'>Yes We Controlled A Rover</title><content type='html'>I&#39;m just recovering from this weekend&#39;s &lt;a href=&quot;http://icfpcontest.org/&quot;&gt;ICFP Programming Contest&lt;/a&gt; in which we had to write a controller for a simulated Mars Rover.  I&#39;m pretty sure we&#39;re not winning anything, even assuming we ever get Mathematica in the hands of the contest organizers.  After the rules were announced the week before, I frantically went around trying to figure out if we could even enter a contest with Mathematica, a commercial product that requires a license key to run, finally coming up with two workable options.  Only to discover on Saturday that one of the options wasn&#39;t going to work because the submission form was capped at 20MB.  I had counted on being able to upload a stripped down copy of Mathematica weighing in at around 140MB.  The contest staffer on the IRC chat channel told me we would work it out later.&lt;br /&gt;&lt;br /&gt;We used Java to do the socket stuff, and JLink to go between Java and Mathematica.  This isn&#39;t as hard as it might sound, and especially as all three of us on the team develop Java/Mathematica hybrid products (Wolfram Workbench is Eclipse-based and webMathematica works with Tomcat).&lt;br /&gt;&lt;br /&gt;The code we worked on during the contest had Java in control and called Mathematica to process each event from the server (i.e. the rover).  This worked fine, but I wish I&#39;d listened more carefully to the 3 senior people at work who advised Mathematica be in control and call Java.&lt;br /&gt;&lt;br /&gt;Last night I inverted the control and rewrote the little event loop in Mathematica, and suddenly the light went on that we really missed out by not having the Mathematica front end part of our interactive development process.  I was quickly able to write a Dynamic-based dashboard: literally we could watch the controller&#39;s state updated in real time, including calculated state.  We saw things like instantaneous speed, heading, and the rover&#39;s steering and acceleration state.  It seems it would have been so much more helpful to see these numbers as our rover wandered around, to get more familiar with the problem.  As it was we stared at static log files of telemetry messages after the run was over.&lt;br /&gt;&lt;br /&gt;I hope I can write a little more about our experience later.  It was fun though.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/1275331679331408415/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/1275331679331408415' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/1275331679331408415'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/1275331679331408415'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/07/yes-we-controlled-rover.html' title='Yes We Controlled A Rover'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-7741846473411739104</id><published>2008-06-25T16:57:00.006-05:00</published><updated>2008-06-25T17:07:29.583-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="java"/><category scheme="http://www.blogger.com/atom/ns#" term="news"/><title type='text'>Replacing One Line with Two in Eclipse</title><content type='html'>Eclipse 3.4 is out today.  In reading through the list of new features, I find one that I literally could have used yesterday.&lt;br /&gt;&lt;br /&gt;Simply, I was doing a refactoring where I wanted to replace one import declaration by two, so my desired replacement text was two lines.  Eclipse has a nice multi-file search/replace feature.  However, I couldn&#39;t find or figure out if the replace text field would accept some syntax to indicate &quot;I want a line break here.&quot;  Trying the obvious &lt;tt&gt;\n&lt;/tt&gt; resulted in a literal backslash-n.&lt;br /&gt;&lt;br /&gt;In Eclipse 3.4, we can now use &lt;tt&gt;\R&lt;/tt&gt; to indicate a platform-dependent line break, and &lt;tt&gt;\n&lt;/tt&gt; and &lt;tt&gt;\r&lt;/tt&gt; now work as expected.  Usually I hate being on the bleeding edge, but this time I curse my reluctance to have been trying out the release candidates.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/7741846473411739104/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/7741846473411739104' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/7741846473411739104'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/7741846473411739104'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/06/replacing-one-line-with-two-in-eclipse.html' title='Replacing One Line with Two in Eclipse'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-1181994440200541303</id><published>2008-06-17T13:44:00.006-05:00</published><updated>2008-06-17T16:18:36.139-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="mathematica"/><category scheme="http://www.blogger.com/atom/ns#" term="news"/><title type='text'>Breakout</title><content type='html'>In relation to the release of Firefox 3.0, someone at &lt;a href=&quot;http://wolfram.com&quot;&gt;work&lt;/a&gt; mentioned that we have &lt;a href=&quot;http://www.wolfram.com/broadcast/screencasts/&quot;&gt;screencasts&lt;/a&gt; too, and I was happy to see one where Theo shows off &lt;a href=&quot;http://www.wolfram.com/broadcast/screencasts/theo/breakout/&quot;&gt;Luc Berthelet&#39;s version of Breakout&lt;/a&gt;.  Theo scrolls through the code on the screencast, so by judicious pausing you can probably copy it, but I wish I could find the notebook somewhere, for, you know, research purposes.&lt;br /&gt;&lt;br /&gt;In a serendipitous connection, Wikipedia tipped me off that in the past month or so Luc has broken out of his position at Electronic Arts to start &lt;a href=&quot;http://www.tirnua.com/&quot;&gt;Tir Nua&lt;/a&gt;, something in the virtual world line of goods.  One of his new employees that he brought with him is a name familiar around my workplace, former Wolfram Research employee &lt;a href=&quot;http://www.tirnua.com/Team/Sarah.php&quot;&gt;Sarah Flannery&lt;/a&gt;.  I remember hearing Sarah give a talk last fall about the webMathematica-based Wiki she&#39;d worked up for Sims Online players, and how she&#39;d tackled the runaway inflation in the Sims Online economy (all sources and no sinks).&lt;br /&gt;&lt;br /&gt;People like Luc and Sarah are fun users for Wolfram employees like me who consider themselves mainline software designers and like to see Mathematica applied more like a regular programming language in areas very different from what you might typically expect.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/1181994440200541303/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/1181994440200541303' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/1181994440200541303'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/1181994440200541303'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/06/breakout.html' title='Breakout'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-6960579986257425978</id><published>2008-06-04T22:02:00.004-05:00</published><updated>2008-06-04T22:37:11.038-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="products"/><title type='text'>Picking a CLOS Implementation to Play With</title><content type='html'>Being provoked into trying to learn a bit more about CLOS&#39;s method of multiple dispatch, and wanting to find an implementation, led me on this search path in a few spare minutes:&lt;br /&gt;&lt;br /&gt;1. Search Google for &lt;a href=&quot;http://www.google.com/search?q=clos&quot;&gt;CLOS&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;2. Click &lt;a href=&quot;http://www.dreamsongs.com/CLOS.html&quot;&gt;http://www.dreamsongs.com/CLOS.html&lt;/a&gt;, which is on the site of Richard Gabriel (&quot;worse is better&quot;)&lt;br /&gt;&lt;br /&gt;3. Click the intriguing title &lt;a href=&quot;http://www.dreamsongs.com/Files/clos-cacm.pdf&quot;&gt;CLOS: Integrating Object-Oriented and Functional Programming&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;4. Read with interest on the &lt;em&gt;first page&lt;/em&gt; the essence of multi methods:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&lt;em&gt;A generic function, or polymorphic function as it is sometimes called, is one whose implementation depends on the types of its arguments. That is, the code that is executed for a particular set of arguments is mechanically determined by the types of the arguments.&lt;br /&gt;&lt;br /&gt;In strictly functional languages, an operation is defined by a single, monolithic piece of code; any argument-type conditionality is expressed as code explicitly programmed in by the user. In contrast, the CLOS notion of generic functions supports automatic selection from among separately defined, typespecific implementational parts. Yet, from the client’s point of view, generic functions and traditional ordinary functions are called in the same way: The procedural abstraction barrier is still in force.&lt;br /&gt;&lt;/em&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;5. Start looking for implementations.  Search reddit for &lt;a href=&quot;http://www.reddit.com/r/programming/search?q=lisp+implementations&amp;x=0&amp;y=0&quot;&gt;&quot;lisp implementations&quot;&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;6. Click &lt;a href=&quot;http://common-lisp.net/~dlw/LispSurvey.html&quot;&gt;Common Lisp Implementations: A Survey&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;7. The survey lists 11 Common Lisp implementations.  Of these, only 5 are listed as available on the Unixes and Windows.  Of those, only 3 do not have commercial licenses.  Two of the remaining choices have weird names: Armed Bear Common Lisp, which does not look promising (its own home page says, under &quot;Bugs&quot;, that &quot;ABCL&#39;s CLOS is intolerably slow&quot;), and Embedded Common Lisp (ECL).  It&#39;s difficult to tell what immediately from Wikipedia or Google what kind of support has.  But the remaining candidate, GNU clisp, appears to be in good standing, judging from the Wikipedia article (where it is claimed this is the platform Paul Graham used for Viaweb), and the &lt;a href=&quot;http://sourceforge.net/project/stats/detail.php?group_id=1355&amp;ugn=clisp&amp;mode=alltime&amp;&amp;type=prdownload&quot;&gt;Sourceforge stats&lt;/a&gt; which claim 100 downloads a day in the past week.&lt;br /&gt;&lt;br /&gt;8. Find the &lt;a href=&quot;http://clisp.cons.org/&quot;&gt;Clisp site&lt;/a&gt;.  Another Lisp site &lt;a href=&quot;http://www.lispcast.com/drupal/node/29&quot;&gt;without an obvious Download link&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/6960579986257425978/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/6960579986257425978' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6960579986257425978'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6960579986257425978'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/06/picking-clos-implementation.html' title='Picking a CLOS Implementation to Play With'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-5011795672120113712</id><published>2008-05-30T21:14:00.005-05:00</published><updated>2008-05-30T23:42:45.428-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="analysis"/><category scheme="http://www.blogger.com/atom/ns#" term="techniques"/><title type='text'>Multiple Definitions</title><content type='html'>We had an interesting discussion at work yesterday debating whether Mathematica supports multimethods with its variety of pattern-matching.  So I&#39;m taking the opportunity to do a mini-survey of the multiple dispatch spectrum, starting with overloading.  As far as Mathematica, it clearly gives you the power of selecting from multiple definitions based on runtime information; more on this in a minute.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Method_overloading&quot;&gt;Overloading &lt;/a&gt; allows you to write different definitions of a function or method, and the definition used when you call the function depends on the number and types of the arguments you pass them.  That is, the overload resolution is done by the compiler at compile time with static type analysis.  Whatever type your arguments are, and exactly those types, will determine which definition is chosen.&lt;br /&gt;&lt;br /&gt;Multimethods let you write different definitions of a function or method, and the definition used when you call the function depends on the number and types of the arguments you pass to them, as examined at runtime.  Now you can write one definition for a class, and specializations for its subclasses if so desired, and the definition will be chosen based on the actual type of the arguments at runtime.&lt;br /&gt;&lt;br /&gt;You get into a fuzzy gray third area if the runtime values can also be used to select different definitions.  This is where Mathematica lies, because its pattern matches can be used to differentiate between a list of at least one element and a list of at least two elements, or between the symbols &lt;a href=&quot;http://reference.wolfram.com/mathematica/ref/Null.html&quot;&gt;Null&lt;/a&gt; and &lt;a href=&quot;http://reference.wolfram.com/mathematica/ref/Infinity.html&quot;&gt;Infinity&lt;/a&gt;.  What&#39;s useful for writing symbolic algorithms turns out to be useful for regular programmers.&lt;br /&gt;&lt;br /&gt;It seems that ML&#39;s structural pattern matching is also in this fuzzy gray third area, and that helps me make an interesting connection.  For my purposes, multiple dispatch is interesting because it&#39;s the way to do &lt;a href=&quot;http://jfkbits.blogspot.com/2007/12/revisiting-tree-traversal-patterns.html&quot;&gt;expression tree traversal&lt;/a&gt;.  That is, it lets you write pretty printers and type checkers and the like without needing to code the dispatch yourself (if the node is actually a lambda abstraction, do &lt;em&gt;this&lt;/em&gt;, but if it&#39;s a cons cell, do &lt;em&gt;that&lt;/em&gt;).  What I&#39;m noticing now is that one way or another, multimethods and pattern matching are giving you the notational convenience that I enjoy in writing tree traversals, with still perhaps an edge to pattern matching on that score.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/5011795672120113712/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/5011795672120113712' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/5011795672120113712'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/5011795672120113712'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/multiple-definitions.html' title='Multiple Definitions'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-1213044604726032688</id><published>2008-05-28T22:11:00.002-05:00</published><updated>2008-05-28T22:23:54.428-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="compilers"/><category scheme="http://www.blogger.com/atom/ns#" term="products"/><title type='text'>The ucc Compiler</title><content type='html'>Someone with the name dreamAnders has posted announcements to comp.compilers and comp.lang.c for his small open-source C compiler, &lt;a href=&quot;http://sourceforge.net/projects/ucc&quot;&gt;ucc&lt;/a&gt;, the chief attraction of which is that the source code is meant to be small and self-explanatory.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/1213044604726032688/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/1213044604726032688' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/1213044604726032688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/1213044604726032688'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/ucc-compiler.html' title='The ucc Compiler'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-6600800364113906018</id><published>2008-05-21T21:41:00.005-05:00</published><updated>2008-05-21T23:13:00.638-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="analysis"/><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="scheme"/><title type='text'>On &quot;I can&#39;t believe I&#39;m praising Tcl&quot;</title><content type='html'>Today we look at a refreshing use case for a programming language, where economy of expression in calling functions from a read-eval-print loop is prized.  Raganwald recently tagged &lt;a href=&quot;http://www.yosefk.com/blog/i-cant-believe-im-praising-tcl.html&quot;&gt;&quot;I can&#39;t believe I&#39;m praising TCL&quot;&lt;/a&gt;, in which the embedded systems author helped us understand how TCL made for a great debugger command environment, and the &quot;pop infix languages (C/Java/Python/Ruby/you name it)&quot; don&#39;t.  &lt;br /&gt;&lt;br /&gt;In this case the author wants to define some glue function or functions, and then in the language&#39;s interactive interpreter, call his function over and over.  He&#39;s not programming; he&#39;s commanding, so the function calls need to be short and sweet, so he doesn&#39;t mind typing them for hour after hour as he thinks about the real problem, a buggy piece of embedded hardware.  The author wants a command shell, where he uses his command interface to an embedded device as a kind of gdb replacement.  An example session to set breakpoints, inspect memory, etc., looks like this:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ pmem 0 stat&lt;br /&gt;IDLE&lt;br /&gt;$ pmem 0 bkpt 0 0xbff&lt;br /&gt;$ pmem 0 bkpt 1 0xa57&lt;br /&gt;$ pmem 0 cmd run&lt;br /&gt;$ pmem 0 stat&lt;br /&gt;DEBUG&lt;br /&gt;$ pmem 0 pc&lt;br /&gt;0xbff&lt;br /&gt;$ pmem 0 rstack&lt;br /&gt;3 return addresses&lt;br /&gt;addr 0: 0x0005&lt;br /&gt;addr 1: 0x05a8&lt;br /&gt;addr 2: 0x0766&lt;br /&gt;$ pmem 0 cmd stp&lt;br /&gt;$ pmem 0 pc&lt;br /&gt;0xc00&lt;br /&gt;&lt;/pre&gt; The question then is whether a language you may be designing or using could support something close to this syntactic economy for calling functions.&lt;br /&gt;&lt;br /&gt;The argument for Tcl over the pop infix language may perhaps be best summarized by this quote: &lt;blockquote&gt;And then we have interactive shells. And in Python it’s doit(&quot;xx&quot;,&quot;yy&quot;). And in Lisp it’s (doit &quot;xx&quot; &quot;yy&quot;), or (doit :xx :yy), or (doit xx yy) if you make it a macro. And in Ruby it’s doit :xx :yy, if you use symbols and omit parens. And that’s about as good as you can get without using your own parser as in doit &quot;xx yy&quot;, which can suck in the (more rare) case when you do need to evaluate expressions before passing parameters, and doesn’t completely remove overhead. Also note how all these languages use (), which makes you press Shift, instead of [] which doesn’t. Ruby and Perl let you omit (), but it costs in readability. And [] is unanimously reserved for less important stuff than function calls.&lt;/blockquote&gt;&lt;br /&gt;&lt;h2&gt;Analysis&lt;/h2&gt;&lt;br /&gt;First we see the emphasis is not on defining functions, on programming, but on using, on the syntax for calling, functions.  The author wants an internal DSL (Domain Specific Language).&lt;br /&gt;&lt;br /&gt;Second, it should be noted that in discussing () that Scheme lets you use [] as well as ().  There&#39;s good Scheme style, where [] is reserved for the &lt;tt&gt;let&lt;/tt&gt; blocks, but if you open up Dr. Scheme or Chez Scheme, define some choice Turtle graphics functions, and start typing commands like &lt;tt&gt;[penup] [rt 45] [pendown] [fd 100]&lt;/tt&gt; it will work fine.&lt;br /&gt;&lt;br /&gt;One thing the author noted is that TCL&#39;s preference for strings over variables makes &lt;tt&gt;bkpt&lt;/tt&gt; a string and &lt;tt&gt;$bkpt&lt;/tt&gt; a variable, whereas in the pop infix languages, it&#39;s the variables that get lexical economy and strings that need delimiters.  Because of this preference, calling a Tcl command lets you pass in what look like symbols, but you treat them as strings in the command definition.  Hence, for the author&#39;s use case, a chief consideration seemed to be a way to write symbolic arguments, where the command in question may take a one-of-several subcommand or option name, without lexical decoration like string delimiters or single quotes or colons.  I wonder if this was really a language design goal of Tcl, because it&#39;s hard to understand the motivation for the string-vs-variable syntax any other way.  For all that, enumerated types or sum types are a known language feature that meet the author&#39;s criterion.  In Standard ML you could define a &lt;tt&gt;datatype Subcommand = bkpt | status | memset&lt;/tt&gt; or the like, and now undecorated references like &lt;tt&gt;bkpt&lt;/tt&gt; can appear as arguments.&lt;br /&gt;&lt;br /&gt;Note if you do define your functions as Scheme macros, to address the symbol/string problem, and if you modified Scheme to accept an S-expression forest on each line (i.e. no need to delimit a top-level input line with parens), you&#39;d have the economic expression of Tcl.  I think this is worth considering in some circles where Scheme may be more familiar.&lt;br /&gt;&lt;h2&gt;Footnote&lt;/h2&gt; This could be a nice motify for the &quot;language design of the day&quot;: extend the basic Scheme-like interpreter to support an extensible debugger command interface.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/6600800364113906018/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/6600800364113906018' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6600800364113906018'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6600800364113906018'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/on-i-cant-believe-im-praising-tcl.html' title='On &quot;I can&#39;t believe I&#39;m praising Tcl&quot;'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-6237862612538930732</id><published>2008-05-16T22:39:00.004-05:00</published><updated>2008-05-17T00:19:50.804-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="language design"/><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="scheme"/><category scheme="http://www.blogger.com/atom/ns#" term="techniques"/><title type='text'>Call by Need Lambda a Poor Man&#39;s Macro?</title><content type='html'>I&#39;ve been seriously considering why more languages don&#39;t include a call-by-need lambda (hereafter called &quot;lazy lambda&quot;).  With its delayed evaluation it offers macro-like powers for writing non-looping forms (they&#39;re bad for loop forms since the arguments are evaluated but once by design), but they don&#39;t have the bloating effect of macro expansion (if code size is more important to you than the time overhead of a function call), and they are hygienic.  They&#39;re not a cure-all, but this seems to be an approach which can still be wielded effectively by trained professionals.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;How to Use Lazy Lambda&lt;/h2&gt;&lt;br /&gt;Here&#39;s how a lazy lambda would work in an existing language like Scheme.  You have a new abstraction &lt;tt&gt;lazy-lambda&lt;/tt&gt; with precisely the same syntax as &lt;tt&gt;lambda&lt;/tt&gt;.  When applied, &lt;tt&gt;lazy-lambda&lt;/tt&gt; follows call by need evaluation rules.  That means arguments to the lazy procedure are not evaluated at the call site, but only when their corresponding formal parameter is first referenced inside the body.  On this reference, the evaluated argument value is remembered for future references inside the body.  Here&#39;s how you might write something like &lt;a href=&quot;http://jfkbits.blogspot.com/2008/02/call-by-name-yo-elvis.html&quot;&gt;Groovy&#39;s Elvis operator&lt;/a&gt;:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;(define ?: (lazy-lambda (expr default) &lt;br /&gt;  (if (null? expr) default expr)))&lt;br /&gt;&lt;/pre&gt; The thing I like is that this is automatically hygienic: it works fine even if you call it in an environment with variables named &lt;tt&gt;expr&lt;/tt&gt; or &lt;tt&gt;default&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Implementation&lt;/h2&gt;&lt;br /&gt;I like to divide my thinking about new features into two phases: introduction and use.  When &lt;tt&gt;lazy-lambda&lt;/tt&gt; is introduced, the parser needs to create a structure essentially identical to that of a lambda, namely an arguments list and a body expression, but of course it needs to be marked as a different type from lambda so the evaluator can distinguish the two.  Lazy lambda is used in two ways, once when it is evaluated (e.g. when a &lt;tt&gt;lazy-lambda&lt;/tt&gt; expression is returned from a function) and once when it is applied.&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&lt;strong&gt;Summary of Lazy Lambda Implementation&lt;/strong&gt;&lt;br /&gt;Introduction: &lt;tt&gt;&#39;(lazy-lambda args body)&lt;/tt&gt;&lt;br /&gt;Evaluation: &lt;tt&gt;&#39;(closure (lazy-lambda args body) env)&lt;/tt&gt;&lt;br /&gt;Application: Bind args to thunks, evaluate thunks in a memoizing way&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;When &lt;tt&gt;lazy-lambda&lt;/tt&gt; is evaluated, it should create a closure, the pair of the &lt;tt&gt;lazy-lambda&lt;/tt&gt; expression and the environment in which it was evaluated.  This closure needs to be marked as a different type from a regular closure. Alternatively the evaluator can be arranged to check the type of the &quot;lambda&quot; expression: a closure may look like &lt;tt&gt;&#39;(closure (lambda args body) env)&lt;/tt&gt; or &lt;tt&gt;&#39;(closure (lazy-lambda args body) env)&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;What happens a &lt;tt&gt;lazy-lambda&lt;/tt&gt; closure is applied? We know you don&#39;t evaluate the arguments, but what then?  As with eager closures, you first create a new environment scope.  Then, instead of binding the formal arguments to the evaluated values of the arguments, you bind the formal arguments to some box (container) of the unevaluated argument expressions.  The container needs to be distinct from any other language type so that the evaluator knows how to treat it.  That is, once we set up the new environment scope, we will simply evaluate the body of the &lt;tt&gt;lazy-lambda&lt;/tt&gt; under this new environment, and the references to the formal arguments need to be handled in this special memoizing way.  So let&#39;s introduce an expression type &lt;tt&gt;delayed&lt;/tt&gt;, which is not directly creatable in the language, and we bind each formal argument &lt;tt&gt;x&lt;/tt&gt; with actual argument expression &lt;tt&gt;A&lt;/tt&gt; to the &quot;value&quot; &lt;tt&gt;(delayed A env)&lt;/tt&gt;.  The &lt;tt&gt;env&lt;/tt&gt; value will be needed when we evaluate &lt;tt&gt;A&lt;/tt&gt;, because we will need to evaluate it in the calling environment, not whatever environment is in effect when the symbol is first referenced.  (Think about what gets returned by &lt;tt&gt;(lambda (x) ((lazy-lambda (x) (begin (set! x (+1 x)) x))) x)&lt;/tt&gt;.)  Then when the evaluator handles variable references and gets a &lt;tt&gt;delayed&lt;/tt&gt; value back from the environment lookup, it&#39;s time to do the memoizing: evaluate &lt;tt&gt;A&lt;/tt&gt; in environment &lt;tt&gt;env&lt;/tt&gt;, rebind (&lt;tt&gt;set!&lt;/tt&gt;) the referenced variable with its evaluated value, and return that.&lt;br /&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;br /&gt;None of the popular scripting languages I can think of (Javascript, Perl, Python, Ruby) have a macro facility, but most of them have anonymous functions which evaluate to closures in the Scheme sense.  On the other hand, they also tend to have a richer set of control structures (Perl&#39;s &lt;tt&gt;unless&lt;/tt&gt;), and they have other mechanisms (dare I say classes and objects?) which address most of the killer applications for macros, and hence for lazy-lambda.  But for all that, I&#39;d have to figure that those languages, and the tinier languages or DSLs, could add this feature.&lt;br /&gt;&lt;br /&gt;Macros are subtle things to get right, and I&#39;m sure there are deficiencies I haven&#39;t addressed here.  But that shouldn&#39;t stop us from thinking about these issues, and I think there&#39;s some potential value in the call by need lambda.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/6237862612538930732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/6237862612538930732' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6237862612538930732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6237862612538930732'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/call-by-need-lambda-poor-mans-macro.html' title='Call by Need Lambda a Poor Man&#39;s Macro?'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-7092982407729258840</id><published>2008-05-14T22:48:00.003-05:00</published><updated>2008-05-14T23:05:02.958-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="java"/><category scheme="http://www.blogger.com/atom/ns#" term="language design"/><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="scheme"/><category scheme="http://www.blogger.com/atom/ns#" term="techniques"/><title type='text'>Thoughts on an S-Expression Parser</title><content type='html'>In this post we look at a tiny Scheme parser, and generalize it to a non-language-specific S-expression parser that can be subclassed to handle any language design, as a base class for studying language varieties.&lt;br /&gt;&lt;br /&gt;As I mentioned last time, I recently wrote a tiny Schemish interpreter, and then started extracting from it a basic interpreter platform that other people or myself could use to try out different language designs or implementations.  One practical goal would be for instructors to provide the framework and have students modify it.  The approach is not mine, it&#39;s from Sam Kamin&#39;s &quot;&lt;a href=&quot;http://www.amazon.com/gp/redirect.html?ie=UTF8&amp;amp;location=http%3A%2F%2Fwww.amazon.com%2FProgramming-Languages-Samuel-N-Kamin%2Fdp%2F0201068249%3Fie%3DUTF8%26s%3Dbooks%26qid%3D1209783903%26sr%3D8-2&amp;amp;tag=j09-20&amp;amp;linkCode=ur2&amp;amp;camp=1789&amp;amp;creative=9325&quot;&gt;Programming Languages: An Interpreter-Based Approach&lt;/a&gt;&lt;img src=&quot;http://www.assoc-amazon.com/e/ir?t=j09-20&amp;amp;l=ur2&amp;amp;o=1&quot; alt=&quot;&quot; style=&quot;border: medium none  ! important; margin: 0px ! important;&quot; width=&quot;1&quot; border=&quot;0&quot; height=&quot;1&quot; /&gt;&quot; (1990).&lt;br /&gt;&lt;br /&gt;The original Schemish parser had these two parsing methods:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;protected Expr parseExpr() throws ParseError&lt;br /&gt;{&lt;br /&gt;  Token token = consumeToken();&lt;br /&gt;  switch(token.type)&lt;br /&gt;  {&lt;br /&gt;  case &#39;&quot;&#39;: return new StringLit(token.text);&lt;br /&gt;  case &#39;(&#39;: return parseExprList(token);&lt;br /&gt;  default:&lt;br /&gt;    // Test the token to see if it&#39;s a numeric&lt;br /&gt;    Int intExpr = Int.fromString(token.text);&lt;br /&gt;    if(intExpr == null)&lt;br /&gt;      return new Atom(token.text);&lt;br /&gt;    else&lt;br /&gt;      return intExpr;&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;  &lt;br /&gt;protected Expr parseExprList(Token openParen) throws ParseError&lt;br /&gt;{&lt;br /&gt;  Vector acc = new Vector();&lt;br /&gt;  while(peekToken().type != &#39;)&#39;)&lt;br /&gt;  {&lt;br /&gt;    Expr expr = parseExpr();&lt;br /&gt;    acc.add(expr);&lt;br /&gt;  }&lt;br /&gt;  Token closeParen = consumeToken();&lt;br /&gt;  &lt;br /&gt;  // Handle special forms&lt;br /&gt;  ExprList retval = null;&lt;br /&gt;  if(acc.size() &amp;gt; 0 &amp;amp;&amp;amp; ((Expr)acc.firstElement()).isAtom())&lt;br /&gt;  {&lt;br /&gt;    Expr head = (Expr)acc.firstElement();&lt;br /&gt;    if(head.isAtom(&quot;lambda&quot;))&lt;br /&gt;    {&lt;br /&gt;      String headName = head.getAtom().getText();&lt;br /&gt;      String lambdaUsage = &quot;Syntax error: &quot;+&lt;br /&gt;        &quot;expected (&quot;+headName+&quot; (arg ...) body)&quot;;&lt;br /&gt;      if(acc.size() != 3)&lt;br /&gt;        throwError(openParen, lambdaUsage);&lt;br /&gt;      Expr argExpr = (Expr)acc.get(1);&lt;br /&gt;      if(!argExpr.isExprList())&lt;br /&gt;        throwError(openParen, lambdaUsage);&lt;br /&gt;      ExprList argList = (ExprList)argExpr;&lt;br /&gt;      Expr[] args = argList.getElements();&lt;br /&gt;      HashSet argSet = new HashSet(); // to check for duplicates&lt;br /&gt;      for(int i=0; i &amp;lt; args.length; ++i)&lt;br /&gt;      {&lt;br /&gt;        if(!args[i].isAtom())&lt;br /&gt;          throwError(openParen, lambdaUsage);&lt;br /&gt;        boolean wasAdded = argSet.add(args[i]);&lt;br /&gt;        if(!wasAdded)&lt;br /&gt;          throwError(openParen, &quot;Syntax error: argument &quot;+&lt;br /&gt;            args[i].getAtom()+&quot; appears more than once&quot;);&lt;br /&gt;      }&lt;br /&gt;      Expr bodyExpr = (Expr)acc.get(2);&lt;br /&gt;      retval = Lambda(argList, bodyExpr);&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;    &lt;br /&gt;  if(retval == null)&lt;br /&gt;    retval = new ExprList(acc);&lt;br /&gt;  &lt;br /&gt;  retval.filename = m_filename;&lt;br /&gt;  retval.firstLine = openParen.line;&lt;br /&gt;  retval.lastLine = closeParen.line;&lt;br /&gt;  return retval;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The lambda-handling code is big, and obscures the structure of the list parsing part. That may not be so bad if you&#39;re working on this one interpreter, you don&#39;t forget the basic list-parsing structure, because it&#39;s simple.  You&#39;ll care more about all the special forms.  But for my purposes, of wanting to write many interpreters from the same code base, we want to be able to talk about different interpreters, and it&#39;s a little awkward: you&#39;re always sharing code patches, and having to explain where they go.  Instead, why not give a sufficient base class, and share complete subclasses?&lt;br /&gt;&lt;br /&gt;The improvement gives a parser class which handles only S-expressions; it knows only atoms and expression lists, it doesn&#39;t know any language constructs at all: it knows no keywords.  You can think of it as generating just a tree structure, like an XML parser.  Parsers for a particular language design will be written as subclasses, and will override the methods &lt;tt&gt;constructAtom&lt;/tt&gt; and &lt;tt&gt;constructExprList&lt;/tt&gt;.  These methods are reminiscent of the actions in YACC-like parser generators, the blocks of code that construct parse tree elements given data named by the grammar rule symbols and from the general lexer and parser state (e.g. line numbers in our case).&lt;br /&gt;&lt;br /&gt;Thus, &lt;tt&gt;parseExpr&lt;/tt&gt; and &lt;tt&gt;parseExprList&lt;/tt&gt; reduce to tiny fragments and subclasses can flesh out the meat of special forms in &lt;tt&gt;constructExprList&lt;/tt&gt;:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;public Expr parseExpr() throws ParseException&lt;br /&gt;{&lt;br /&gt;  Token token = consumeToken();&lt;br /&gt;  Expr retval = (token.type == &#39;(&#39;)? &lt;br /&gt;    parseExprList(token) &lt;br /&gt;    : constructAtom(token);&lt;br /&gt;  return retval;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;protected Expr parseExprList(Token openParen)&lt;br /&gt;  throws ParseException&lt;br /&gt;{&lt;br /&gt;  Vector acc = new Vector();&lt;br /&gt;  while(peekToken().type != &#39;)&#39;)&lt;br /&gt;  {&lt;br /&gt;    Expr element = parseExpr();&lt;br /&gt;    acc.add(element);&lt;br /&gt;  }&lt;br /&gt;  Token closeParen = consumeToken();&lt;br /&gt;&lt;br /&gt;  Expr retval = constructExprList(acc, m_filename, &lt;br /&gt;    openParen.line, closeParen.line);&lt;br /&gt;  return retval;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;protected Expr constructAtom(Token token)&lt;br /&gt;{&lt;br /&gt;  return new Atom(token.text);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;protected Expr constructExprList(&lt;br /&gt;  AbstractCollection exprs, String srcId, &lt;br /&gt;  int startLine, int endLine)&lt;br /&gt;{&lt;br /&gt;  ExprList retval = new ExprList(exprs);&lt;br /&gt;  retval.filename = srcId;&lt;br /&gt;  retval.firstLine = startLine;&lt;br /&gt;  retval.lastLine = endLine;&lt;br /&gt;  return retval;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Now the Schemish interpreter can subclass Parser and call it SchemeParser, overriding constructExprList to handle special syntactic forms, and overriding constructAtom to handle language-specific literals, such as character literals or rational number literals (&lt;tt&gt;2/3&lt;/tt&gt;).</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/7092982407729258840/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/7092982407729258840' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/7092982407729258840'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/7092982407729258840'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/thoughts-on-s-expression-parser.html' title='Thoughts on an S-Expression Parser'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-3543252807701643030</id><published>2008-05-09T23:15:00.004-05:00</published><updated>2008-05-10T00:45:13.454-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="java"/><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="scheme"/><title type='text'>Thoughts and Code for the S-Expression Lexer</title><content type='html'>My recent project has been a tiny Schemish interpreter, and more recently have been considering a design that would work well as the Java counterpart to Kamin&#39;s chapter 1 interpreter, a calculator language using Lisp syntax, which is to be morphed into any one of a variety of different languages based on S-expression syntax.&lt;br /&gt;&lt;br /&gt;So, this post is just for sharing some observations in working on the lexical analyzer part, as well as its code.&lt;br /&gt;&lt;br /&gt;Just so it&#39;s abundantly clear, the problem a lexical analyzer solves, for S-expressions anyway, is to get a stream of parens, quotes, and symbols out of a character input source.&lt;br /&gt;&lt;br /&gt;The apostrophe quote operator, as for creating list literals like &lt;tt&gt;&#39;(lambda (x) x)&lt;/tt&gt;, appears to be a token not requiring whitespace.  I&#39;ve never seen the apostrophe not have preceding whitespace, but after testing MzScheme, ChezScheme and a few other minor Scheme implementations, it&#39;s apparent that &lt;tt&gt;a&#39;b&#39;c&lt;/tt&gt; is the same as &lt;tt&gt;a &#39;b &#39;c&lt;/tt&gt;.  I&#39;ve not done any serious development in Scheme, but I wonder whether this is common knowledge among Scheme programmers.&lt;br /&gt;&lt;br /&gt;Similarly, Scheme read-eval-print loops appear to accept forests of S-expressions at the prompt, not just one expression.  If you type &lt;tt&gt;1 2 3 4&lt;/tt&gt; and the values get echoed back.  This is useful for multiple defines, or for pasting code.  Obviously, any REPL loop should support this behavior if at all possible.&lt;br /&gt;&lt;br /&gt;I was happy to see that StreamTokenizer tracks line numbers.  Having line numbers available &quot;for free&quot; helps error messages instantly.  For what it&#39;s worth, I want my scanner to track column numbers too, but I understand if James Gosling (marked as the author of StreamTokenizer) didn&#39;t want to get into arguments about what a tab is worth.&lt;br /&gt;&lt;br /&gt;The xUnit test frameworks are a much welcome tool for testing language processors.  In 1998 I was fiddling with diff-based Perl scripts for automated regression testing.  Writing JUnit tests is a lot more fun, productive, and exact than relying on textual comparison with blessed output.&lt;br /&gt;&lt;br /&gt;Rather than subclass &lt;a href=&quot;http://java.sun.com/j2se/1.4.2/docs/api/java/io/StreamTokenizer.html&quot;&gt;StreamTokenizer&lt;/a&gt;, I wanted to change the consumer&#39;s model of looping from &quot;while the current token is not end-of-file&quot; to a standard Iterator terminating on &lt;tt&gt;!hasNext()&lt;/tt&gt;.  This required making use of &lt;tt&gt;pushBack()&lt;/tt&gt; on every token get, but I considered the ease of use for the client had a slight edge.&lt;br /&gt;&lt;br /&gt;Using an iterator means you need an object to return, so there is a small &lt;tt&gt;Token&lt;/tt&gt; class returned by the iterator.  Token bundles the public fields of the StreamTokenizer that represent the token value.  I opted not to have StreamTokenizer parse numbers.  Since the intended use is ultimately interpreters for arbitrary languages merely based on S-expression syntax, I needed to let them have their own numeric literal syntax and domain of representation (double? float? BigInteger?).  Now for some code.  Token looks like this:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;package jfkbits;&lt;br /&gt;import java.io.StreamTokenizer;&lt;br /&gt;&lt;br /&gt;public class Token&lt;br /&gt;{&lt;br /&gt;  public static final int SYMBOL = StreamTokenizer.TT_WORD;&lt;br /&gt;  public int type;&lt;br /&gt;  public String text;&lt;br /&gt;  public int line;&lt;br /&gt;&lt;br /&gt;  public Token(StreamTokenizer tzr)&lt;br /&gt;  {&lt;br /&gt;    this.type = tzr.ttype;&lt;br /&gt;    this.text = tzr.sval;&lt;br /&gt;    this.line = tzr.lineno();&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public String toString()&lt;br /&gt;  {&lt;br /&gt;    switch(this.type)&lt;br /&gt;    {&lt;br /&gt;    case SYMBOL:&lt;br /&gt;    case &#39;&quot;&#39;:&lt;br /&gt;      return this.text;&lt;br /&gt;    default:&lt;br /&gt;      return String.valueOf((char)this.type);&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt; Here&#39;s how the type field, an int, as defined in StreamTokenizer, works: for &quot;ordinary characters&quot;, the type is the character literal cast to an int.  The LispTokenizer marks &#39;(&#39;, &#39;)&#39;, and the apostrophe quote operator &#39;\&#39;&#39; as ordinary.  In code using Token, this reads very naturally, as in &lt;tt&gt;if(token.type == &#39;(&#39;) return parseExprList();&lt;/tt&gt;.  For &quot;words&quot;, atoms in our case, the type is a negative int defined by &lt;tt&gt;StreamTokenizer.TT_WORD&lt;/tt&gt;, which Token redefines as SYMBOL.  If we read a Token t with &lt;tt&gt;t.type==Token.SYMBOL&lt;/tt&gt;, the good stuff (like &quot;42&quot;, &quot;x&quot;, &quot;eval&quot;, or &quot;lambda&quot;) is in &lt;tt&gt;t.text&lt;/tt&gt;.  String literals have the type code of the delimiter, so &lt;tt&gt;t.type==&#39;&quot;&#39;&lt;/tt&gt; means we&#39;ve got a string literal, the contents of which (without the delimiter!) are also in &lt;tt&gt;t.text&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;And what about string literals?  Strictly speaking, the same decision I made about numeric literals should also apply to string literals.  Namely, that different languages have different syntaxes and potentially different representations.  Perhaps I should not configure StreamTokenizer to recognize string literals.  In that case, the parser would get atoms containing the double quotes, themselves, and the parser would be expected to split it apart.  Currently, I don&#39;t expect this tool to be used for studying string literals very much.&lt;br /&gt;&lt;br /&gt;And finally, for the code itself:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;package jfkbits;&lt;br /&gt;&lt;br /&gt;import java.io.BufferedReader;&lt;br /&gt;import java.io.IOException;&lt;br /&gt;import java.io.Reader;&lt;br /&gt;import java.io.StreamTokenizer;&lt;br /&gt;import java.io.StringReader;&lt;br /&gt;import java.util.Iterator;&lt;br /&gt;&lt;br /&gt;public class LispTokenizer implements Iterator&lt;br /&gt;{&lt;br /&gt;  // Instance variables have default access to allow unit tests access.&lt;br /&gt;  StreamTokenizer m_tokenizer;&lt;br /&gt;  IOException m_ioexn;&lt;br /&gt;&lt;br /&gt;  /** Constructs a tokenizer that scans input from the given string.&lt;br /&gt;   * @param src A string containing S-expressions.&lt;br /&gt;   */&lt;br /&gt;  public LispTokenizer(String src)&lt;br /&gt;  {&lt;br /&gt;    this(new StringReader(src));&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  /** Constructs a tokenizer that scans input from the given Reader.&lt;br /&gt;   * @param r Reader for the character input source&lt;br /&gt;   */&lt;br /&gt;  public LispTokenizer(Reader r)&lt;br /&gt;  {&lt;br /&gt;    if(r == null)&lt;br /&gt;      r = new StringReader(&quot;&quot;);&lt;br /&gt;    BufferedReader buffrdr = new BufferedReader(r);&lt;br /&gt;    m_tokenizer = new StreamTokenizer(buffrdr);&lt;br /&gt;    m_tokenizer.resetSyntax(); // We don&#39;t like the default settings&lt;br /&gt;&lt;br /&gt;    m_tokenizer.whitespaceChars(0, &#39; &#39;);&lt;br /&gt;    m_tokenizer.wordChars(&#39; &#39;+1,255);&lt;br /&gt;    m_tokenizer.ordinaryChar(&#39;(&#39;);&lt;br /&gt;    m_tokenizer.ordinaryChar(&#39;)&#39;);&lt;br /&gt;    m_tokenizer.ordinaryChar(&#39;\&#39;&#39;);&lt;br /&gt;    m_tokenizer.commentChar(&#39;;&#39;);&lt;br /&gt;    m_tokenizer.quoteChar(&#39;&quot;&#39;);&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public boolean hasNext()&lt;br /&gt;  {&lt;br /&gt;    if(m_ioexn != null)&lt;br /&gt;      return false;&lt;br /&gt;    try&lt;br /&gt;    {&lt;br /&gt;      m_tokenizer.nextToken();&lt;br /&gt;    }&lt;br /&gt;    catch(IOException e)&lt;br /&gt;    {&lt;br /&gt;      m_ioexn = e;&lt;br /&gt;      return false;&lt;br /&gt;    }&lt;br /&gt;    if(m_tokenizer.ttype == StreamTokenizer.TT_EOF)&lt;br /&gt;      return false;&lt;br /&gt;    m_tokenizer.pushBack();&lt;br /&gt;    return true;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  /** Return the most recently caught IOException, if any,&lt;br /&gt;   * &lt;br /&gt;   * @return&lt;br /&gt;   */&lt;br /&gt;  public IOException getIOException()&lt;br /&gt;  {&lt;br /&gt;    return m_ioexn;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public Token nextToken()&lt;br /&gt;  {&lt;br /&gt;    return (Token)next();&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public Object next()&lt;br /&gt;  {&lt;br /&gt;    try&lt;br /&gt;    {&lt;br /&gt;      m_tokenizer.nextToken();&lt;br /&gt;    }&lt;br /&gt;    catch(IOException e)&lt;br /&gt;    {&lt;br /&gt;      m_ioexn = e;&lt;br /&gt;      return null;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    Token token = new Token(m_tokenizer);&lt;br /&gt;    return token;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void remove()&lt;br /&gt;  {&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt; A token stream can be processed something like this:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;LispTokenizer tzr = new LispTokenizer(&quot;(define x 42)&quot;);&lt;br /&gt;for(Iterator it=tzr; it.hasNext(); ) {&lt;br /&gt;   Token token = it.nextToken();&lt;br /&gt;   processToken(token);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt; And finally, some unit tests.&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;package jfkbits;&lt;br /&gt;&lt;br /&gt;import java.io.StreamTokenizer;&lt;br /&gt;import java.util.Iterator;&lt;br /&gt;&lt;br /&gt;import junit.framework.TestCase;&lt;br /&gt;&lt;br /&gt;public class LispTokenizerTest extends TestCase&lt;br /&gt;{&lt;br /&gt;  public LispTokenizerTest(String name)&lt;br /&gt;  {&lt;br /&gt;    super(name);&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void testLispTokenizerIterator()&lt;br /&gt;  {&lt;br /&gt;    LispTokenizer tzr;&lt;br /&gt;    &lt;br /&gt;    tzr = new LispTokenizer(&quot;&quot;);&lt;br /&gt;    assertFalse(tzr.hasNext());&lt;br /&gt;&lt;br /&gt;    tzr = new LispTokenizer(&quot; &quot;);&lt;br /&gt;    assertFalse(tzr.hasNext());&lt;br /&gt;&lt;br /&gt;    tzr = new LispTokenizer(&quot;\n&quot;);&lt;br /&gt;    assertFalse(tzr.hasNext());&lt;br /&gt;    &lt;br /&gt;    tzr = new LispTokenizer(&quot;7&quot;);&lt;br /&gt;    assertTrue(tzr.hasNext());&lt;br /&gt;    checkToken(1, &quot;7&quot;, Token.SYMBOL, tzr.next());&lt;br /&gt;    assertFalse(tzr.hasNext());&lt;br /&gt;&lt;br /&gt;    tzr = new LispTokenizer(&quot;()&quot;);&lt;br /&gt;    assertTrue(tzr.hasNext());&lt;br /&gt;    checkToken(1, null, &#39;(&#39;, tzr.next());&lt;br /&gt;    checkToken(1, null, &#39;)&#39;, tzr.next());&lt;br /&gt;    assertFalse(tzr.hasNext());&lt;br /&gt;&lt;br /&gt;    tzr = new LispTokenizer(&quot;(newline)&quot;);&lt;br /&gt;    assertTrue(tzr.hasNext());&lt;br /&gt;    checkToken(1, null, &#39;(&#39;, tzr.next());&lt;br /&gt;    checkToken(1, &quot;newline&quot;, Token.SYMBOL, tzr.next());&lt;br /&gt;    checkToken(1, null, &#39;)&#39;, tzr.next());&lt;br /&gt;    assertFalse(tzr.hasNext());&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  private void checkToken(int line, String text, int type, Object tokenObj)&lt;br /&gt;  {&lt;br /&gt;    assertNotNull(tokenObj);&lt;br /&gt;    assertTrue(tokenObj instanceof Token);&lt;br /&gt;    Token token = (Token)tokenObj;&lt;br /&gt;    assertEquals(line, token.line);&lt;br /&gt;    if(text != null &amp;&amp; token.type == StreamTokenizer.TT_WORD)&lt;br /&gt;      assertEquals(text, token.text);&lt;br /&gt;    assertEquals(type, token.type);&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void testCharacterMapping()&lt;br /&gt;  {&lt;br /&gt;    assertEquals((int)&#39;(&#39;, mkTokenizer(&quot;(&quot;).nextToken().type);&lt;br /&gt;    assertEquals((int)&#39;)&#39;, mkTokenizer(&quot;)&quot;).nextToken().type);&lt;br /&gt;    assertEquals((int)&#39;\&#39;&#39;, mkTokenizer(&quot;&#39;&quot;).nextToken().type);&lt;br /&gt;&lt;br /&gt;    assertEquals(StreamTokenizer.TT_WORD, mkTokenizer(&quot;0&quot;).nextToken().type);&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void testSimpleLispExpressions()&lt;br /&gt;  {&lt;br /&gt;    test(&quot;&quot;,new String[]{});&lt;br /&gt;    test(&quot;()&quot;,new String[]{&quot;(&quot;,&quot;)&quot;});&lt;br /&gt;    test(&quot; ()&quot;,new String[]{&quot;(&quot;,&quot;)&quot;});&lt;br /&gt;    test(&quot;\n()&quot;,new String[]{&quot;(&quot;,&quot;)&quot;});&lt;br /&gt;    test(&quot;() &quot;,new String[]{&quot;(&quot;,&quot;)&quot;});&lt;br /&gt;    test(&quot;()\n&quot;,new String[]{&quot;(&quot;,&quot;)&quot;});&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void testLispExpressionsWithComments()&lt;br /&gt;  {&lt;br /&gt;    test(&quot;;Comment here\n()&quot;, new String[]{&quot;(&quot;,&quot;)&quot;});&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void testLispExpressionsWithStrings()&lt;br /&gt;  {&lt;br /&gt;    test(&quot;\&quot;\&quot;&quot;,new String[]{&quot;&quot;});&lt;br /&gt;    test(&quot;\&quot;a\&quot;&quot;,new String[]{&quot;a&quot;});&lt;br /&gt;    test(&quot;\&quot; a\&quot;&quot;,new String[]{&quot; a&quot;});&lt;br /&gt;    test(&quot;\&quot;a \&quot;&quot;,new String[]{&quot;a &quot;});&lt;br /&gt;    test(&quot;(print \&quot;Hello world.\\n\&quot;);&quot;,&lt;br /&gt;      new String[]{&quot;(&quot;,&quot;print&quot;,&quot;Hello world.\n&quot;,&quot;)&quot;});&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  public void testFactorial()&lt;br /&gt;  {&lt;br /&gt;    String src = &lt;br /&gt;      &quot;;;\n&quot;+&lt;br /&gt;      &quot;;; Compute the factorial of a given number\n&quot;+&lt;br /&gt;      &quot;\n&quot;+&lt;br /&gt;      &quot;(defun fact (n)\n&quot;+ &lt;br /&gt;      &quot;  (if (&amp;lt; n 2)\n&quot;+ &lt;br /&gt;      &quot;      1\n&quot;+&lt;br /&gt;      &quot;    (* n (fact (- n 1)))\n&quot;+&lt;br /&gt;      &quot;    )\n&quot;+&lt;br /&gt;      &quot;  )\n&quot;&lt;br /&gt;      ;&lt;br /&gt;    String[] expected = {&lt;br /&gt;      &quot;(&quot;,&quot;defun&quot;,&quot;fact&quot;,&quot;(&quot;,&quot;n&quot;,&quot;)&quot;,&lt;br /&gt;      &quot;(&quot;,&quot;if&quot;,&quot;(&quot;,&quot;&amp;lt;&quot;,&quot;n&quot;,&quot;2&quot;,&quot;)&quot;,&lt;br /&gt;      &quot;1&quot;,&lt;br /&gt;      &quot;(&quot;,&quot;*&quot;,&quot;n&quot;,&quot;(&quot;,&quot;fact&quot;,&quot;(&quot;,&quot;-&quot;,&quot;n&quot;,&quot;1&quot;,&quot;)&quot;,&quot;)&quot;,&quot;)&quot;,&lt;br /&gt;      &quot;)&quot;,&lt;br /&gt;      &quot;)&quot;&lt;br /&gt;    };&lt;br /&gt;    test(src,expected);&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  static void test(String src, String[] expectedTokens)&lt;br /&gt;  {&lt;br /&gt;    LispTokenizer tzr = mkTokenizer(src);&lt;br /&gt;    int i = 0;&lt;br /&gt;    for(Iterator it=tzr; it.hasNext(); i++)&lt;br /&gt;    {&lt;br /&gt;      Token token = (Token)it.next();&lt;br /&gt;      assertNotNull(token);&lt;br /&gt;      assertTrue(&quot;Expected &quot;+expectedTokens.length+&quot; tokens, got more&quot;, &lt;br /&gt;        i &amp;lt; expectedTokens.length);&lt;br /&gt;      assertEquals(expectedTokens[i], token.toString());&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  static LispTokenizer mkTokenizer(String src)&lt;br /&gt;  {&lt;br /&gt;    return new LispTokenizer(src);&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/3543252807701643030/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/3543252807701643030' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/3543252807701643030'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/3543252807701643030'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/thoughts-on-s-expression-lexer.html' title='Thoughts and Code for the S-Expression Lexer'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-5684415033924757230</id><published>2008-05-07T22:25:00.003-05:00</published><updated>2008-05-07T23:17:13.081-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="language design"/><title type='text'>Motivations for Little Interpreters</title><content type='html'>We&#39;re continuing a series about writing little interpreters.  Although it is fun and cool, so are isolated guitar licks for about the first five minutes; long-term you probably will want to play whole songs if you want to have listeners.  Similarly, you need an application, killer or not, for an interpreter or it will lie around collecting bit dust and require &quot;maintenance&quot; (right, Dijkstra?).&lt;br /&gt;&lt;br /&gt;I suppose you could say I&#39;m talking about an interpreter for a Domain Specific Language here, but really all I mean is that if you care enough to write an interpreter, you may as well take the opportunity to build into it something you can use, and for this post I&#39;ll address the area of a useful domain.&lt;br /&gt;&lt;br /&gt;Specifically, if you care about an interpreter it means it will support a domain you care about: photos, 3D graphics, matrices.&lt;br /&gt;&lt;br /&gt;In the implementation language of the interpreter, you will want to define&lt;br /&gt;1. Data structures&lt;br /&gt;2. Ways to construct and operate on them&lt;br /&gt;&lt;br /&gt;and you&#39;ll make provisions for these in your language.  Thinking these through carefully, and probably writing them into a runtime library complete with unit tests, is a respectable way to approach things before sticking an interpreter front end onto the data and code of your domain of interest.&lt;br /&gt;&lt;br /&gt;Let me give you one simple example of an interpreter I wrote for personal productivity.  In 1995 I wrote &quot;Joel&#39;s Calculator&quot; to help me write status reports for my manager, and also to do some basic statistical analysis.  (No, I didn&#39;t have Excel; I was working with an HP-UX workstation.)  Our embedded systems group had gotten a rare opportunity for a complete product rewrite, and we needed to track our time and estimates carefully.  We had to track time spent on particular modules and count overhead, and in our weekly status report submit an account of time on our various modules, with overhead (anything not in a particular module) evenly divided among our projects.  I would keep a log of time spent that looked something like this: &lt;pre&gt;Tue May 7&lt;br /&gt;0840-0920 Email&lt;br /&gt;0920-1135 Serial driver&lt;br /&gt;1250-1300 Ovr&lt;br /&gt;1300-1400 Phase 2 mtg&lt;br /&gt;1400-1600 Serial driver testing&lt;br /&gt;1600-1715 Code review &lt;br /&gt;&lt;/pre&gt; It was an easy way to keep track, in a text editor.  Then at the end of the week, I would sum the &quot;billable hours&quot; for modules like Serial driver.  What I ended up wanting was a way to do &quot;time arithmetic&quot;:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;serial = 11:35-9:20 + 16:00-14:00&lt;br /&gt;ovr = 9:20-8:40+0:10+14:00-12:50+17:15-16:00&lt;br /&gt;&lt;/pre&gt; This worked very well for me as a workflow.  I could write simple expressions like this, assign them to variables, combine the results gradually, and so on.  This was really a desk calculator, but I was pleased with the effect that I could enter times textually in a very natural way, use simple infix arithmetic notation, and have it just work.&lt;br /&gt;&lt;br /&gt;In this case my chief domain of interest was pretty simple, a pair of hour and minutes, and the operations on the domain were also fairly simple, but a little challenging to get right.&lt;br /&gt;&lt;br /&gt;Of course, you may be coming at this from the perspective that the goal and the domain are already well defined.  For example, you&#39;re working with a library of interest and you&#39;d like to play with it interactively; you can envision making Scheme bindings and writing abstractions to make it do something useful.  Or you even have a very powerful system written in an existing but limited language, but you need to bust out of the limits.  Using a lightweight Lisp/Scheme interpreter that you can modify to script some macros to generate code in another language may solve some scaling problems for you.&lt;br /&gt;&lt;br /&gt;So, if you want to start a little interpreter project, which I&#39;d like to encourage, before you get started, pick a goal, pick a domain, and refine the domain as the first step.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/5684415033924757230/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/5684415033924757230' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/5684415033924757230'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/5684415033924757230'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/motivations-for-little-interpreters.html' title='Motivations for Little Interpreters'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-8560480465935698013</id><published>2008-05-06T15:05:00.003-05:00</published><updated>2008-05-06T15:20:03.747-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="products"/><title type='text'>Google AppEngine Activation Arrives</title><content type='html'>I signed up for &lt;a href=&quot;http://code.google.com/appengine/&quot;&gt;Google AppEngine&lt;/a&gt;, &lt;a href=&quot;http://reddit.com/r/programming/search?q=appengine&quot;&gt;well-covered on programming.reddit.com&lt;/a&gt;, about a month ago, but got the &quot;don&#39;t email us, we&#39;ll email you&quot; message at that time.  Today I got my email letting me know I can start farming on the Google spread.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/8560480465935698013/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/8560480465935698013' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8560480465935698013'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/8560480465935698013'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/google-appengine-activation-arrives.html' title='Google AppEngine Activation Arrives'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-6742801363557922308</id><published>2008-05-02T21:42:00.004-05:00</published><updated>2008-05-02T22:55:16.143-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="ideas"/><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="scheme"/><title type='text'>Little Lisps: Programming Candy or Spinach?</title><content type='html'>Last time (&lt;a href=&quot;http://jfkbits.blogspot.com/2008/04/streamtokenizer-and-declarative-lexing.html&quot;&gt;StreamTokenizer and Declarative Lexing&lt;/a&gt;), I mentioned an idea of presenting language designs in this space as a puzzle to be solved, like a crossword puzzle.  I invented the idea there and its been growing on me. &lt;br /&gt;&lt;br /&gt;The previous week or so I&#39;d been reading Sam Kamin&#39;s &quot;&lt;a href=&quot;http://www.amazon.com/gp/redirect.html?ie=UTF8&amp;amp;location=http%3A%2F%2Fwww.amazon.com%2FProgramming-Languages-Samuel-N-Kamin%2Fdp%2F0201068249%3Fie%3DUTF8%26s%3Dbooks%26qid%3D1209783903%26sr%3D8-2&amp;amp;tag=j09-20&amp;amp;linkCode=ur2&amp;amp;camp=1789&amp;amp;creative=9325&quot;&gt;Programming Languages: An Interpreter-Based Approach&lt;/a&gt;&lt;img src=&quot;http://www.assoc-amazon.com/e/ir?t=j09-20&amp;amp;l=ur2&amp;amp;o=1&quot; alt=&quot;&quot; style=&quot;border: medium none  ! important; margin: 0px ! important;&quot; width=&quot;1&quot; border=&quot;0&quot; height=&quot;1&quot; /&gt;&quot;, so this didn&#39;t seem like such a crazy idea as it may sound.  In &quot;Programming Languages&quot;, Kamin starts with an interpreter for an extremely simple language using Lisp syntax, and procedes with each chapter to show what modifications need to be made to get interpreters for Lisp, Scheme, SASL, APL, Smalltalk, Prolog and others.  They all use Lisp syntax, so the code changes are kept quite manageable, and the reader can focus on the essential differences in scoping, evaluation strategies, and the like.&lt;br /&gt;&lt;br /&gt;It seems, as a first step, if the Language Design Crossword Puzzle were to be a reality, that making available a standard interpreter source such as Kamin&#39;s is a reasonable idea.  &quot;Given interpreter0, add macros&quot; would be a puzzle.  &quot;Given interpreter0, add object serialization.&quot;  This is similar in spirit to &lt;a href=&quot;http://jfkbits.blogspot.com/2008/02/evaluation-animation-and-debugging.html&quot;&gt;comments I made earlier about the utility of lambda calculus&lt;/a&gt; for studying language features.&lt;br /&gt;&lt;br /&gt;But is the Language Design Crossword Puzzle a good idea?  What&#39;s the point?  &lt;a href=&quot;http://arcanesentiment.blogspot.com/2008/03/dont-i-like-to-hack-lisp.html&quot;&gt;Arcane Sentiment&lt;/a&gt;, a blog I discovered and subscribed to today, introspects on writing little language implementations.  He describes certain parts of the exercise as &quot;programming candy&quot;, and ironically, they&#39;re often the parts written in C, a series of little programming victories.  The hard, ill-defined problems to be written in Lisp are the parts that tend to slow him down and demotivate him. (Arcane, I hope I&#39;m fairly characterizing that post.  Please correct me if not.)&lt;br /&gt;&lt;br /&gt;I had to chuckle in self-recognition, as earlier this week I was watching the first few SICP videos, evaluating the examples in a Scheme-ish interpreter I&#39;d whipped up on Monday and Tuesday, extending it during pauses in dialog while Sussman wrote on the blackboard. Until he hit the pi fraction example and I realized I wasn&#39;t at all sure if I wanted to right at that moment be writing code to rationalize denominators and factor fractions or whatever else Scheme might do to support exact rational number (fraction) arithmetic (e.g. &lt;tt&gt;(/ 1/2 (+ 1/8 1/3))&lt;/tt&gt;).  That problem at that moment was not interesting for me, and was not well-specified; are rationals always reduced to lowest terms?  How is conversion to and from reals handled?  I&#39;d have to go study R5RS to learn the expected behavior.  Handling the number tower was not my goal going into this project.&lt;br /&gt;&lt;br /&gt;Why does Arcane Sentiment dabble in Lispy implementations?  Why do I?  Why reinvent the wheel?  For me, it&#39;s a way to learn, to study.  Toy implementations are rewarding, as they let you discard parts of a language implementation system that are indeed hard, and focus on particular points of interest.  You need to be careful not to oversimplify, if you intend to take your lessons back to a real system.  But this approach is something we advise junior programmers all the time: if you&#39;re struggling with how a library or language feature is working in the application, try writing a small example program first until you understand how it works.&lt;br /&gt;&lt;br /&gt;So, I propose we have the best of both world.  Language design problems can be programming candy, as well as programming spinach, that is something good for you.  My wife has been making a spinach recipe from her Spain and Portugal cookbook which features raisins and pine nuts.  It rocks.&lt;br /&gt;&lt;br /&gt;The other question is, is there interest in a &quot;language design puzzle&quot; feature?  Before we get to that, let me ask a more relevant question: what aspects of programming language implementation or operation are of interest to &lt;em&gt;you&lt;/em&gt;?  Macros?  Evaluation strategy?  Optimizations?  Drop me a line to the Google mail account jfkbits and let me know.&lt;br /&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Blog challenge: write a post using the phrase &quot;free as in spinach&quot;.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/6742801363557922308/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/6742801363557922308' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6742801363557922308'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6742801363557922308'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/05/little-lisps-programming-candy-or.html' title='Little Lisps: Programming Candy or Spinach?'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-6253285118758147662</id><published>2008-04-30T13:26:00.005-05:00</published><updated>2008-04-30T14:31:22.800-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="ideas"/><category scheme="http://www.blogger.com/atom/ns#" term="java"/><category scheme="http://www.blogger.com/atom/ns#" term="lisp"/><category scheme="http://www.blogger.com/atom/ns#" term="scheme"/><category scheme="http://www.blogger.com/atom/ns#" term="techniques"/><title type='text'>StreamTokenizer and Declarative Lexing</title><content type='html'>One thing I&#39;ve noticed about marketing is how frequently an appeal is made to change your lifestyle to include more of whatever is being sold.  I first noticed this when I read in a catalog that Ikea wants me out of that rat race of the outside world, so I can spend more time at home.  And by the way, don&#39;t I want my home to be a comfortable inviting place to spend time?  After that, I&#39;ve seen it everywhere.  Here at JFKBits we want to know why the world doesn&#39;t write more language processors.  Everyone can enjoy a world of recreational symbol translation.  I was on vacation last week, reading a programming language book by the beach.  Maybe someday we&#39;ll run programming language designs in this space and JFKBits readers can implement them as a kind of crossword puzzle of the day, in an hour on the train home from work.  So today, we&#39;re thinking about how to get a lexer up and going more quickly.&lt;br /&gt;&lt;hr/&gt;&lt;br /&gt;When you&#39;re designing a syntax, it&#39;s typically done at a high-level:&lt;br /&gt;&lt;ul&gt; &lt;li&gt;Comment syntax: line-oriented or delimited, or both?  do comments nest?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Identifier syntax: Limited choice for the first character (e.g. letters and &#39;_&#39;), followed by mix of alphanumeric and underscore.  In Lisp, symbols syntax is much more flexible.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Operators and grouping syntax: parentheses, brackets, multi-character operators like ++, -&gt;, and &amp;&amp;=.&lt;/li&gt; &lt;/ul&gt;&lt;br /&gt;It might be useful, in terms of quickly prototyping a domain-specific language or language-aware utility (e.g. a tool that finds buggy code patterns), to have a lexer generator that lets you declare properties in these terms rather than translating to regular expressions.&lt;br /&gt;&lt;br /&gt;Java&#39;s &lt;a href=&quot;http://java.sun.com/j2se/1.4.2/docs/api/java/io/StreamTokenizer.html&quot;&gt;StreamTokenizer&lt;/a&gt;, while not a universal tool, is a step in that direction.&lt;br /&gt;&lt;br /&gt;What I like about StreamTokenizer is that it raises the level of programming closer to the problem domain.  Regular expressions are the traditional and well-understood means of specifying a lexer&#39;s accepted language, but this declarative style is possibly a better programmer&#39;s tool.  (Note well that I&#39;m talking about StreamTokenizer, not StringTokenizer which is a much simpler state machine.)&lt;br /&gt;&lt;br /&gt;For example, in playing with Kamin&#39;s interpreters with their S-expression-based syntax, I&#39;ve been using this configuration to scan LISP-type code in Java:&lt;br /&gt;&lt;pre class=&quot;code&quot;&gt;&lt;br /&gt;tokenizer = new StreamTokenizer(new BufferedReader(reader));&lt;br /&gt;tokenizer.resetSyntax(); // We don&#39;t like the default settings&lt;br /&gt;tokenizer.whitespaceChars(0, &#39; &#39;);&lt;br /&gt;tokenizer.wordChars(&#39; &#39;+1,255);&lt;br /&gt;tokenizer.ordinaryChar(&#39;(&#39;);&lt;br /&gt;tokenizer.ordinaryChar(&#39;)&#39;);&lt;br /&gt;tokenizer.commentChar(&#39;;&#39;);&lt;br /&gt;tokenizer.quoteChar(&#39;&quot;&#39;);&lt;br /&gt;&lt;/pre&gt; The basic idea is that StreamTokenizer is a state machine with hard-coded states and transition arcs, and you configure it with what characters are in the relevant character classes.  For example, from the state of skipping whitespace, there&#39;s a transition to a string literal state, when a character from the &quot;quoteChar&quot; class is encountered.  The string literal state accumulates, transitioning back to itself on characters not in the quoteChar class.  It&#39;s simply up to you to configure which character or characters constitute a quote delimiter.  The essential observation is that lexers for many useful languages share state machines of identical shapes, and they differ only in the definitions of character classes.&lt;br /&gt;&lt;br /&gt;Of course, not every language fits the preprogrammed state machine shape.  StreamTokenizer can&#39;t even handle Java, because it has no capacity for multi-character operators.  It can be configured so that &#39;+&#39; is an &quot;ordinary character&quot;, meaning it is lexed as a single-character token, but there&#39;s no way for a parser to know if two &#39;+&#39; tokens in sequence came from the input &quot;++&quot; or &quot;+ +&quot;.  This is what I mean when I say StreamTokenizer is not a universal lexing tool.&lt;br /&gt;&lt;br /&gt;But I still wonder if there&#39;s room for this declarative manner of input, building on higher-level concepts like identifiers and operators, for a more general lexing tool. This could be combined with a programmatic knowledge base of some of the standard lexical idioms running around.  You could say &quot;give me Python-style identifiers, with the standard Java operators except these four which don&#39;t apply to my language.&quot;  I&#39;m not at all sure there is need for such generality, but I think it&#39;s worth writing down, as an idea for future inspiration.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/6253285118758147662/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/6253285118758147662' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6253285118758147662'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/6253285118758147662'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/04/streamtokenizer-and-declarative-lexing.html' title='StreamTokenizer and Declarative Lexing'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-3974714574002971069</id><published>2008-04-17T11:48:00.004-05:00</published><updated>2008-04-17T13:59:25.114-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="java"/><title type='text'>Small Tokenizing World</title><content type='html'>I wanted to see what Java programs out there parse Lisp syntax using the handy &lt;a href=&quot;http://java.sun.com/j2se/1.4.2/docs/api/java/io/StreamTokenizer.html#quoteChar(int)&quot;&gt;StreamTokenizer&lt;/a&gt;, and the first one I found is &lt;a href=&quot;http://www.samskivert.com/internet/deep/1997/03/19/body.html&quot;&gt;Mike Bayne&lt;/a&gt;&#39;s, circa 1997.&lt;br /&gt;&lt;br /&gt;Mike was one of the bright guys who did version 2 of the Prizm IDE that I helped work on for my senior project at &lt;a href=&quot;http://www.rose-hulman.edu/&quot;&gt;Rose-Hulman&lt;/a&gt;.  Now he runs his own online game company, &lt;a href=&quot;http://www.threerings.net/&quot;&gt;ThreeRings.net&lt;/a&gt;, out of San Francisco.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/3974714574002971069/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/3974714574002971069' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/3974714574002971069'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/3974714574002971069'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/04/small-tokenizing-world.html' title='Small Tokenizing World'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-26866996.post-4357318540104274699</id><published>2008-04-15T23:40:00.004-05:00</published><updated>2008-04-15T23:48:23.383-05:00</updated><title type='text'>AP Computer Science Language</title><content type='html'>I&#39;ve been reading Sam Kamin&#39;s &quot;Programming Languages: An Interpreter Based Approach&quot; from 1990, which compares programming languages through the device of writing an interpreter in Pascal for a form of the language using Lisp-based syntax.&lt;br /&gt;&lt;br /&gt;On reading all that Pascal source code, it takes me back. I reflect that I learned Pascal as a senior in high school, to prepare for the AP Computer Science test.  That was in 1990.  I see that the AP tests now use Java as their language.  I wonder, was there any other language between Pascal in 1990 and Java in 2008?&lt;br /&gt;&lt;br /&gt;If I had to bet on what language would succeed Java as that used for the AP exam, I&#39;d bet on Python.</content><link rel='replies' type='application/atom+xml' href='http://jfkbits.blogspot.com/feeds/4357318540104274699/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/26866996/4357318540104274699' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/4357318540104274699'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/26866996/posts/default/4357318540104274699'/><link rel='alternate' type='text/html' href='http://jfkbits.blogspot.com/2008/04/ap-computer-science-language.html' title='AP Computer Science Language'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry></feed>