<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Colin Ross</title>
	
	<link>http://vcolin.com</link>
	<description>Software development etc.</description>
	<lastBuildDate>Sat, 20 Jun 2009 13:38:34 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/vColin" type="application/rss+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /><item>
		<title>Haskell resource: Typeclassopedia</title>
		<link>http://vcolin.com/2009/02/16/haskell-resource-typeclassopedia/</link>
		<comments>http://vcolin.com/2009/02/16/haskell-resource-typeclassopedia/#comments</comments>
		<pubDate>Mon, 16 Feb 2009 13:23:08 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[type classes]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=419</guid>
		<description><![CDATA[Brent Yorgey has published a first draft of what looks to be a fantastic resource for the Haskell programmer who wants to understand what all those strange type classes like Applicative, Monoid and Functor actually mean.

I am in the process of reading though it and the quote
Hmm, it doesn’t compile &#8230; maybe I’ll stick in an fmap [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/02/16/xml-parsing-haskell/' rel='bookmark' title='Permanent Link: XML parsing with Haskell'>XML parsing with Haskell</a> <small>Although I have been aware of the existence of XML,...</small></li><li><a href='http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/' rel='bookmark' title='Permanent Link: Creating a minimal HTTP server in Haskell'>Creating a minimal HTTP server in Haskell</a> <small>I have been spending quite a lot of time lately...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>Brent Yorgey has published a first draft of what looks to be a fantastic resource for the Haskell programmer who wants to understand what all those strange type classes like Applicative, Monoid and Functor actually mean.</p>
<p><span id="more-419"></span></p>
<p>I am in the process of reading though it and the quote</p>
<blockquote><p>Hmm, it doesn’t compile &#8230; maybe I’ll stick in an fmap here&#8230; nope, let’s see&#8230; maybe I need another (.) somewhere? &#8230; umm &#8230;</p></blockquote>
<p>describes me all too well at times.</p>
<p>While it may not yet be complete, it is still quite a meaty piece of work, weighing in at 48 pages, including a full 111 useful references.</p>
<p>Read it now: <a href="http://www.cis.upenn.edu/~byorgey/papers/typeclassopedia-draft-090216.pdf">The Typeclassopedia</a>. Alternatively, <a href="http://byorgey.wordpress.com/2009/02/16/the-typeclassopedia-request-for-feedback/">the announcement</a>.</p>
<p>All it needs now is a name which I can type successfully on the first attempt.</p>


<p>Related posts:<ol><li><a href='http://vcolin.com/2009/02/16/xml-parsing-haskell/' rel='bookmark' title='Permanent Link: XML parsing with Haskell'>XML parsing with Haskell</a> <small>Although I have been aware of the existence of XML,...</small></li><li><a href='http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/' rel='bookmark' title='Permanent Link: Creating a minimal HTTP server in Haskell'>Creating a minimal HTTP server in Haskell</a> <small>I have been spending quite a lot of time lately...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/02/16/haskell-resource-typeclassopedia/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>XML parsing with Haskell</title>
		<link>http://vcolin.com/2009/02/16/xml-parsing-haskell/</link>
		<comments>http://vcolin.com/2009/02/16/xml-parsing-haskell/#comments</comments>
		<pubDate>Mon, 16 Feb 2009 13:08:50 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[parsec]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=404</guid>
		<description><![CDATA[Although I have been aware of the existence of XML, I have never had any real experience of working with it. It looks like that is changing.

It all started when, on a whim, I purchased XSLT by Doug Tidwell. As I began reading it, typing in various bits and pieces of XML, I came to [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/' rel='bookmark' title='Permanent Link: Creating a minimal HTTP server in Haskell'>Creating a minimal HTTP server in Haskell</a> <small>I have been spending quite a lot of time lately...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>Although I have been aware of the existence of XML, I have never had any real experience of working with it. It looks like that is changing.</p>
<p><span id="more-404"></span></p>
<p>It all started when, on a whim, I purchased <a href="http://oreilly.com/catalog/9780596527211/">XSLT by Doug Tidwell</a>. As I began reading it, typing in various bits and pieces of XML, I came to the conclusion that that I really needed to sit down and learn what XML is actually all about.</p>
<p>Up until that point my idea of XML was as a relatively simple format with tags, and so long as the tags match, everything is ok. How hard could it be to create an XML parser? It surely  must be pretty straightforward, and could be achieved in a few dozen lines of code.</p>
<p>And so began journey into the depths of the <a href="http://www.w3.org/TR/xml/">XML 1.0 specification</a>. My code is currently just creeping over 2000 lines and has yet to pass my initial test suite, although it is getting closer.</p>
<p>I used the <a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/parsec">parsec library</a> to do the parsing of the file together with the <a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/compact-string">compact-string library</a> to deal with encodings and they worked pretty smoothly once I figured out what I was trying to achieve with them. In fact, getting the initial XML parser was relatively straightforward &#8211; just some grunt work to implement it all. The problems mainly occurred with various well-formedness constraints and (something I have yet to tackle at all) the validity constraints. Additionally, since there are some <a href="http://www.w3.org/XML/Test/">nice tests</a> to go with the specification, my focus is currently on getting them to pass, and in particular writing out my parsed XML document in canonical form.</p>
<p>My development of the XML parser has (will have!) the following stages:</p>
<ul>
<li>Read the file as UTF16</li>
<li>Preprocess to get rid of carriage returns</li>
<li>Parse the basic structure</li>
<li>Expand various entities and make sure they also parse correctly</li>
<li>Ensure the various well-formedness constraints are satisfied</li>
<li>Ensure the various validity constraints are satisfied</li>
<li>Completion!</li>
</ul>
<p>I am currently in the middle of stage 5 partly because I keep getting distracted by making the canonical export work (it is needed for checking results of some tests). Unfortunately, I believe I am still doing the easy stuff and that the most difficult stage is the next one where the validity constraints have to be enforced.</p>
<p>Nonetheless, it has been a valuable learning experience &#8211; my knowledge of XML has increased many-fold as has my familiarity with parsec and compact-string libraries.</p>
<p>One thing I have noticed with parsec is that I probably over-use the &#8220;try&#8221; combinator. I took a very lazy approach and basically implemented my own version of the parsec combinators which scatter &#8220;try&#8221;s like there is no tomorrow, to ensure I didn&#8217;t have to do any hard thinking about when they are actually required. While this means that parsing successfully is easy, it does seem to mean that if there is a parse error, I get almost no clue where the error was, which is somewhat unfortunate. Still, hopefully as time goes on, I can start to remove the &#8220;try&#8221;s bit by bit and gain some performance improvements along the way.</p>
<p>I did come up with a perhaps-useful combinator of my own during the process of implementing the XML specification. At several stages, there is a rule of the form</p>
<pre>NotVowel = Char - [aeiou]</pre>
<p>which should match any character except a vowel. My first attempt to implement this used &#8220;manyTill&#8221; like so</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">notVowel <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span>
   manyTill char vowel</pre></div></div>

<p>but this has the unfortunate effect of &#8220;eating&#8221; the first vowel encountered. Instead, it should be implemented something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">notVowel <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span>
   m <span style="color: #339933; font-weight: bold;">&lt;-</span> optionMaybe <span style="color: #339933; font-weight: bold;">$</span> C<span style="color: #339933; font-weight: bold;">.</span>lookAhead vowel
   <span style="color: #06c; font-weight: bold;">case</span> m <span style="color: #06c; font-weight: bold;">of</span>
      Just <span style="color: #339933; font-weight: bold;">_</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> P<span style="color: #339933; font-weight: bold;">.</span>unexpected <span style="background-color: #3cb371;">&quot;Vowel encountered&quot;</span>
      Nothing <span style="color: #339933; font-weight: bold;">-&gt;</span> char</pre></div></div>

<p>or, in its more general form:</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">type</span> Parser <span style="color: #339933; font-weight: bold;">=</span> P<span style="color: #339933; font-weight: bold;">.</span>Parsec Input XmlState
butNot <span style="color: #339933; font-weight: bold;">::</span> <span style="color: green;">&#40;</span>Parser a<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>Parser b<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>Parser a<span style="color: green;">&#41;</span>
butNot p e <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span>
   m <span style="color: #339933; font-weight: bold;">&lt;-</span> optionMaybe <span style="color: #339933; font-weight: bold;">$</span> C<span style="color: #339933; font-weight: bold;">.</span>lookAhead e
   <span style="color: #06c; font-weight: bold;">case</span> m <span style="color: #06c; font-weight: bold;">of</span>
      Just <span style="color: #339933; font-weight: bold;">_</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> P<span style="color: #339933; font-weight: bold;">.</span>unexpected <span style="background-color: #3cb371;">&quot;butNot failure&quot;</span>
      Nothing <span style="color: #339933; font-weight: bold;">-&gt;</span> p</pre></div></div>

<p>This seems to do the job marvellously well. Basically, it looks for the &#8220;bad&#8221; input and if it finds it, fails. If it doesn&#8217;t find the &#8220;bad&#8221; input, then it happily parses &#8220;good&#8221; input.</p>


<p>Related posts:<ol><li><a href='http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/' rel='bookmark' title='Permanent Link: Creating a minimal HTTP server in Haskell'>Creating a minimal HTTP server in Haskell</a> <small>I have been spending quite a lot of time lately...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/02/16/xml-parsing-haskell/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The difference between scripting and programming</title>
		<link>http://vcolin.com/2009/01/27/difference-scripting-programming/</link>
		<comments>http://vcolin.com/2009/01/27/difference-scripting-programming/#comments</comments>
		<pubDate>Tue, 27 Jan 2009 09:51:54 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[scripting]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=399</guid>
		<description><![CDATA[Jeff Atwood asks the question &#8220;What&#8217;s the difference between a programming language and a scripting language?&#8220;. Although he doesn&#8217;t seem to come to a conclusion about that, he does decide that he is a scripter rather than a programmer. This is based on his experiences with BASIC (a scripting language) and C (a programming language). So [...]


No related posts.]]></description>
			<content:encoded><![CDATA[<p></p><p>Jeff Atwood asks the question &#8220;<a href="http://www.codinghorror.com/blog/archives/001216.html">What&#8217;s the difference between a programming language and a scripting language?</a>&#8220;. Although he doesn&#8217;t seem to come to a conclusion about that, he does decide that he is a scripter rather than a programmer. This is based on his experiences with BASIC (a scripting language) and C (a programming language). So what is the essential difference?</p>
<p><span id="more-399"></span>As with all such questions, there are many possible answers. As many commenters noted on Jeff&#8217;s blog, this is not black and white territory &#8211; there is a continuum between scripting languages and programming languages. But this also avoids the essential question, which is trying to find the difference between the two varieties.</p>
<p>By definition, a scripting language lets you do scripting. And a programming language lets you do programming. So is there an inherent difference between scripting and programming? Superficially, I would say no &#8211; both involve typing some instructions into an editor in order to be executed later and perform some action.</p>
<p>Where I think the difference lies is in the timescales of the time of execution. When scripting, we expect the action to happen once we are finished coding. Indeed, this may well be the purpose of doing the scripting in the first place &#8211; to get some task completed, whether that be some database manipulation or file system tidy up or whatever. However, when programming, I get the sense there is a more long-term purpose. Maybe we are coding a library for others to use. Maybe we are creating a product to sell. When programming, we are not after a quick fix, we are seeking to create something more enduring. As such, programming languages tend to be stricter about what they accept so that less can go wrong later on.</p>
<p>So, what is the difference between a programming language and a scripting language? Well, given the previous paragraph, a scripting language is one that provides sufficient flexibility and functionality to get things done. This will probably be in the form of libraries that come packaged with the implementation, in some sort of  &#8221;batteries included&#8221; way as Python&#8217;s standard libraries are <a href="http://www.python.org/about/">described</a>. Because results are to be obtained quickly, the functionality needs to be readily available with a minimum of fuss. When programming on the other hand, with the longer-term view in mind, there is more time to look around for the right library, or even to write it yourself if need be. C is like this &#8211; you don&#8217;t get much with C except for the basics. You want to do something fancy, you need to go and find it.</p>
<p>Of course the interesting thing is that while few would argue that C is a scripting language, even if it came with a variety of libraries,  it is quite credible to treat scripting languages as programming languages. If what you write with Python, or Perl, or Ruby etc. is actually a library for future use by others, then arguably you are programming rather than scripting. And if you are programming with a language, that suggests that the language itself is a programming language.</p>
<p>Haskell, my current language has a foot in both camps. Fundamentally, I think it is a programming language, but the beauty of it is that it can be treated for scripting. There is a move to make this more explicit with the <a href="http://www.haskell.org/haskellwiki/Haskell_Platform">Haskell Platform</a> which is a project to create what is effectively a Haskell standard library to provide all the common functionality necessary to be productive with Haskell. When this is ready (hopefully sometime this year) there should be less need to poke about among the various Haskell packages looking for an appropriate database package (for example). Instead, there will be one provided which should be good enough until proven otherwise.</p>
<p>Eventually, it all boils down to preconceptions. There is no fundamental difference between a programming language and a scripting language, and it doesn&#8217;t make sense to try and make that distinction. Instead, the real question is &#8220;What is the difference between programming and scripting&#8221;, and the answer to that is one of attitude. Personally, I don&#8217;t particularly like either, and when working in Haskell I swing wildly between one extreme and the other &#8211; prototyping quickly (scripting) and then tidying up and modularising (programming). Instead I would just describe what I do as &#8220;coding&#8221;. Haskell is, then, a coding language, and I like it.</p>


<p>No related posts.</p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/27/difference-scripting-programming/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Does your IDE define you or support you?</title>
		<link>http://vcolin.com/2009/01/18/ide-define-support/</link>
		<comments>http://vcolin.com/2009/01/18/ide-define-support/#comments</comments>
		<pubDate>Sun, 18 Jan 2009 19:15:54 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[ide]]></category>
		<category><![CDATA[snooker]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=391</guid>
		<description><![CDATA[As I have been watching the Masters snooker, there has been a great deal of discussion about Ronnie O&#8217;Sullivan breaking his cue before the tournament and then managing to play exceptionally well with a replacement which he has never used before.  The majority of commentators seem to believe that this shows O&#8217;Sullivan&#8217;s natural prowess and [...]


No related posts.]]></description>
			<content:encoded><![CDATA[<p></p><p>As I have been watching the Masters snooker, there has been a great deal of discussion about Ronnie O&#8217;Sullivan <a href="http://www.dailymail.co.uk/sport/othersports/article-1120287/I-just-want-smash-cues-Destructive-Ronnie-reveals-inner-anger-booking-semi-final-spot.html">breaking his cue</a> before the tournament and then managing to play exceptionally well with a replacement which he has never used before.  The majority of commentators seem to believe that this shows O&#8217;Sullivan&#8217;s natural prowess and flair for the game of snooker.  I tend to agree.</p>
<p><span id="more-391"></span></p>
<p>At the same time, I have been following a <a href="http://intoverflow.wordpress.com/2009/01/13/why-haskell-is-beyond-ready-for-prime-time/">discussion about Haskell at Integer Overflow</a> where there has been a bit of a discussion in the comments about there not being an IDE for Haskell and that this is holding back the adoption of Haskell as a mainstream software engineering language. The argument goes that once the code base gets beyond a certain size, an IDE is necessary to simply get any work done. I tend to disagree.</p>
<p>The two situations are somewhat related.</p>
<p>For O&#8217;Sullivan, the cue is simply a conduit for getting the snooker skills inherent in his arm and brain onto the table. For other, mere mortal, snooker players, the cue is more than that. For them, the cue is an essential ingredient in their success.</p>
<p>Programmers tend to act similarly with regard to IDEs. On the one hand, there are those than can use an IDE, but it is simply a way of getting the code written. On the other hand are those that cannot function without their IDE. Put the latter breed of programmer in front of a basic text editor and command prompt and they are lost. They need the support and comfort that the IDE they know and love provides.</p>
<p>Imagine if a new breed of snooker cue  was invented &#8211; the steadi-cue. This cue had the uncanny ability to remain completely steady so that even a rank amateur could make much more accurate shots. It is not inconceivable that a player using such a cue could beat the natural ability of O&#8217;Sullivan. However, the snooker purist would be aghast &#8211; snooker tournaments would no longer be about who is the best player; after all, you would barely have to practise.</p>
<p>When a programmer uses an IDE the same thing occurs. However, because programmers are not generally judged on their ability but rather on what they produce, nobody has a problem with using the equivalent of the steadi-cue; the IDE. They may not win a best-programmer competition, but getting a product out that works is more important.</p>
<p>Which is all well and good if that is how you want to regard yourself &#8211; as a cog in a machine, albeit an important cog. However, if you want to actually excel in what you do, try casting that IDE away once in a while at least. It can be liberating. For starters, you gain a far greater understanding of what is actually going on when you click the &#8220;Build&#8221; button for example. All those compilation flags you never worry about suddenly become a lot more interesting when you have to figure them out for yourself.</p>
<p>There is also a sense of liberation when the shackles of the IDE are cast off &#8211; programming can then occur almost anywhere at any time and a text editor can start up a lot quicker than a fully-featured IDE. Having tasted the freedom, I don&#8217;t want to go back.</p>
<p>Still&#8230; sometimes I wish I had intellisense.</p>


<p>No related posts.</p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/18/ide-define-support/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>Finding a member of an infinite list</title>
		<link>http://vcolin.com/2009/01/12/finding-member-infinite-list/</link>
		<comments>http://vcolin.com/2009/01/12/finding-member-infinite-list/#comments</comments>
		<pubDate>Mon, 12 Jan 2009 20:00:31 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[infinite list]]></category>
		<category><![CDATA[membership]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=367</guid>
		<description><![CDATA[Haskell, and GHC in particular never cease to amaze me.  While thinking about writing an infinite sorted list data type, I was playing around with ghci and discovered that it allows you to find a member of a list which is infinitely far from the beginning.  Then I discovered it didn&#8217;t.

An initial misunderstanding
The expression I [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/' rel='bookmark' title='Permanent Link: Notes on working with finite sorted lists'>Notes on working with finite sorted lists</a> <small>In a previous post, I wrote some information on ways...</small></li><li><a href='http://vcolin.com/2009/01/08/fast-right-smart/' rel='bookmark' title='Permanent Link: The fast, the right and the smart'>The fast, the right and the smart</a> <small>Working my way through the puzzles in Project Euler is...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://haskell.org/">Haskell</a>, and <a href="http://haskell.org/ghc/">GHC</a> in particular never cease to amaze me.  While thinking about writing an infinite sorted list data type, I was playing around with ghci and discovered that it allows you to find a member of a list which is infinitely far from the beginning.  Then I discovered it didn&#8217;t.</p>
<p><span id="more-367"></span></p>
<h3>An initial misunderstanding</h3>
<p>The expression I used was</p>
<pre>elem 2 ([1..] ++ [2..])</pre>
<p>since this would appear to construct an infinite list by adding an infinite list of <code>2</code>s to the end of an infinite list of <code>1</code>s, I was fully expecting to have to interrupt the calculation as ghci happily chewed through all the <code>1</code>s.</p>
<p>Instead, the answer <code>True</code> popped up immediately.  Impressive.</p>
<p>Then I realised my mistake.</p>
<p>I had created my lists incorrectly &#8211; the correct code is in fact:</p>
<pre>elem 2 ([1,1..] ++ [2,2..])</pre>
<p>and &#8216;lo and behold, the expected chewing does indeed occur.  It did get me thinking though.</p>
<h3>From misunderstanding to thoughtfulness</h3>
<p>In theory, a clever enough data structure would be able to return <code>True</code> for a suitable equivalent of this expression.</p>
<p>The question is just finding what that data structure should be.  I have been thinking about infinite lists recently, after my <a href="http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/">previous post on finite lists</a>, and in particular which of the functions defined in <code>Data.List</code> could be guaranteed to complete when the list is infinite.</p>
<p>It turns out that almost all can be converted to infinite lists, but many of the functions will require additional constraints on the elements in the list in order to ensure that they return.</p>
<p>Consider the <code>elem</code> function.  With an arbitrary infinite list, there is no guarantee that a call to <code>elem</code> will return.  An example of a list where it wouldn&#8217;t return is</p>
<pre>elem 0 [1..]</pre>
<p>We can resolve this problem by requiring that all the elements of the list are greater than or equal to some value.  In this case, we know that all elements are at least <code>1</code>, so we can immediately return <code>False</code>.</p>
<p>However, how do we then deal with the following:</p>
<pre>elem 2 $ [1,3..]</pre>
<p>Here, the infinite list consists of all the positive odd integers.  Again, it is clear that <code>2</code> is not a member, but the current implementation of <code>elem</code> does not terminate.  We can get around the problem by using the knowledge that the elements of the list are sorted.  In which case we can use</p>
<pre>head (takeWhile (&lt;2) [1,3..]) == 2</pre>
<p>which gets us the correct answer of <code>True</code>.  It also allows us to discard our previous condition &#8211; since being sorted means that the minimum element will be the head of the list.  But all is not well.</p>
<p>Consider now, the problem</p>
<pre>elem 2 $ [1, 1 + 1/2, 1 + 3/4, 1 + 7/8, 1 + 15/16 etc.]</pre>
<p>(which isn&#8217;t valid Haskell, but you get the idea).  In this case, every element of the list is less than the element we are searching for, so neither <code>elem</code> nor <code>takeWhile</code> would ever terminate.  Instead, we need to impose another constraint on the list, namely that the list is unbounded &#8211; so that we will be guaranteed that <code>takeWhile</code> will return.</p>
<p>Or will we?</p>
<p>Consider now the list formed in the following way:</p>
<pre>[1,1..] ++ [2..]</pre>
<p>This monstrosity satisfies the sorted, minimum element and unbounded constraints, but still would not return.  There are two possible remedies to this.  First, and most severe, is to require that all the elements of the list be unique.  This will resolve this problem as the <code>[1,1..]</code> would then be invalid.  The other is to require that there is only a finite number of each element.  This second constraint is less severe than the first, and so is preferred.</p>
<p>So, given that the list satisfies the following:</p>
<ul>
<li>is sorted</li>
<li>is unbounded</li>
<li>has a finite number of all elements</li>
</ul>
<p>I believe we can be guaranteed to find any given element in finite time.</p>
<h3>From thoughtfulness to enlightenment</h3>
<p>Assuming that we are now in a position to find whether an element is a member of a suitable constrained list, are there any ways in which we can  relax the constraints?</p>
<p>Well, firstly there is nothing particularly special about the ascending order &#8211; we could easily change that to descending order provided we correspondingly alter the minimum element condition to instead be a maximum element condition.</p>
<p>Having done this, a more general approach now presents itself &#8211; we can find a given element in an infinite list provided the following conditions hold on the list l with ordering function<code> f :: a -&gt; a -&gt; Ordering</code></p>
<ul>
<li>unbounded: there exists m such that <code>all (==GT) $ map (f m) l</code></li>
<li>sorted: <code>all (/=GT) $ zipWith f l (tail l)</code></li>
<li>finite counts: <code>not $ any (isInfinite.genericLength) $ group l</code></li>
</ul>
<p>where I have tried to use Haskell notation as far as possible, even though actually trying to execute the code on infinite lists would probably cause non-termination.</p>
<p>All that is needed now is an implementation, so that is the next step.</p>


<p>Related posts:<ol><li><a href='http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/' rel='bookmark' title='Permanent Link: Notes on working with finite sorted lists'>Notes on working with finite sorted lists</a> <small>In a previous post, I wrote some information on ways...</small></li><li><a href='http://vcolin.com/2009/01/08/fast-right-smart/' rel='bookmark' title='Permanent Link: The fast, the right and the smart'>The fast, the right and the smart</a> <small>Working my way through the puzzles in Project Euler is...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/12/finding-member-infinite-list/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Notes on working with finite sorted lists</title>
		<link>http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/</link>
		<comments>http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/#comments</comments>
		<pubDate>Sun, 11 Jan 2009 17:48:27 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[lists]]></category>
		<category><![CDATA[project euler]]></category>
		<category><![CDATA[sorting]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=357</guid>
		<description><![CDATA[In a previous post, I wrote some information on ways to work with sorted infinite lists.  Since then I have investigated some more and realise that the code I gave could be considerably improved.  Particularly if you are interested in sorted finite lists.

The improvement
I made the remark that while
elem n primes
would not return for the [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/01/08/fast-right-smart/' rel='bookmark' title='Permanent Link: The fast, the right and the smart'>The fast, the right and the smart</a> <small>Working my way through the puzzles in Project Euler is...</small></li><li><a href='http://vcolin.com/2009/01/12/finding-member-infinite-list/' rel='bookmark' title='Permanent Link: Finding a member of an infinite list'>Finding a member of an infinite list</a> <small>Haskell, and GHC in particular never cease to amaze me....</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>In a <a href="http://vcolin.com/2009/01/08/fast-right-smart/">previous post</a>, I wrote some information on ways to work with sorted infinite lists.  Since then I have investigated some more and realise that the code I gave could be considerably improved.  Particularly if you are interested in sorted finite lists.</p>
<p><span id="more-357"></span></p>
<h3>The improvement</h3>
<p>I made the remark that while</p>
<pre>elem n primes</pre>
<p>would not return for the infinite list primes, the alternative</p>
<pre>(head $ dropWhile (&lt;n) primes) == n</pre>
<p>would work.</p>
<p>This is true, but it raises a couple of issues that need to be addressed before it can be used more widely.</p>
<p>The first problem is that it actually doesn&#8217;t work for finite lists, which is rather annoying but can be resolved by adding an appropriate test before taking the head:</p>
<pre>contains n ns = (not $ null rest) &amp;&amp; (head rest == n)
   where
   rest = dropWhile (&lt;n) ns</pre>
<p>The other issue with finite lists is that this is actually pretty inefficient and a better way to test for membership is to use sets. A much faster approach is this:</p>
<pre>contains n ns = Set.member n $ Set.fromList ns</pre>
<p>Even though this would appear to be doing a lot more work, it does appear to be considerably faster.  However, this does not work for infinite lists, because the call to <code>Set.fromList</code> never terminates.</p>
<p>I have been making increased use of both of these methods to test for membership as I continue to work through <a href="http://projecteuler.net">Project Euler</a> probems.  Now at level 2!</p>


<p>Related posts:<ol><li><a href='http://vcolin.com/2009/01/08/fast-right-smart/' rel='bookmark' title='Permanent Link: The fast, the right and the smart'>The fast, the right and the smart</a> <small>Working my way through the puzzles in Project Euler is...</small></li><li><a href='http://vcolin.com/2009/01/12/finding-member-infinite-list/' rel='bookmark' title='Permanent Link: Finding a member of an infinite list'>Finding a member of an infinite list</a> <small>Haskell, and GHC in particular never cease to amaze me....</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The fast, the right and the smart</title>
		<link>http://vcolin.com/2009/01/08/fast-right-smart/</link>
		<comments>http://vcolin.com/2009/01/08/fast-right-smart/#comments</comments>
		<pubDate>Thu, 08 Jan 2009 21:55:27 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[project euler]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=332</guid>
		<description><![CDATA[Working my way through the puzzles in Project Euler is pretty enlightening at times.  I made it to level 1 yesterday to much acclaim and adulation.  Taking this as an opportunity to look back at the sort of  patterns I have been coding threw up a couple of interesting practises which I have been performing.

Manipulating [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/' rel='bookmark' title='Permanent Link: Notes on working with finite sorted lists'>Notes on working with finite sorted lists</a> <small>In a previous post, I wrote some information on ways...</small></li><li><a href='http://vcolin.com/2009/01/12/finding-member-infinite-list/' rel='bookmark' title='Permanent Link: Finding a member of an infinite list'>Finding a member of an infinite list</a> <small>Haskell, and GHC in particular never cease to amaze me....</small></li><li><a href='http://vcolin.com/2009/01/07/validation-num2str/' rel='bookmark' title='Permanent Link: Validation of use of Num2Str'>Validation of use of Num2Str</a> <small>It&#8217;s always good to see that the code you write...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>Working my way through the puzzles in <a href="http://projecteuler.net">Project Euler</a> is pretty enlightening at times.  I made it to level 1 yesterday to much acclaim and adulation.  Taking this as an opportunity to look back at the sort of  patterns I have been coding threw up a couple of interesting practises which I have been performing.<br />
<span id="more-332"></span></p>
<h3>Manipulating strings to and from numbers</h3>
<p>A large number of the problems involving the rearrangement of numbers, for example, finding which pair of two-digit primes when rearranged in ascending order and concatenated give the cube of the difference of the original primes (I just made that up &#8211; there may well not be an answer).  In this example, if our two primes were represented as ab and cd then we would need to perform the rearrangement to get ba and dc as well.  The tactic I have been adopting for this is judicious use of <code>show</code> and <code>read</code>.  For the example I just gave, the call would something like</p>
<pre>reverseNum = read.reverse.show</pre>
<p>In order to find all the two-digit numbers my current ploy is to make use of the permutations function in Data.List and apply it to a string of digits.  After a bit of processing, the correct answers pop out.  For example, here is the code to return all nine-digit numbers without zeros:</p>
<pre>nineDigitsWithStrings = map read $ permutations "123456789"</pre>
<p>Its short and clear, and not a lot can go wrong with it.  But the use of <code>read</code> just looks a bit ugly.  I had a nagging feeling that the right way to do this was to actually do the conversion &#8220;properly&#8221; by stepping though the list multiplying by ten at each step, to get the same, proper answer.  Something like this in fact:</p>
<pre>nineDigitsWithNums = map makeNum $ permutations [1..9]
   where
   makeNum ds = makeNum' (reverse ds)
   makeNum' [] = 0
   makeNum' (n:ns) = n + 10 * makeNum' ns</pre>
<p>A couple of things to note &#8211; firstly, the reverse is required so that  the proper digits get more multiplications.  Second &#8211; it appears to be a lot more complicated than the string version, but let&#8217;s be honest, there is a lot of complexity hiding within a call to <code>read</code> of which there is none in this version.  I had no idea which might be faster so I created a quick test to see if one was clearly more efficient than the other.  The results were somewhat surprising &#8211; the quick and dirty string-based method was around 20% faster and used about 20% less memory.  So much for doing things the &#8220;right&#8221; way!  Why this is the case I do not know (yet), but the reverse calls might be the culprit.  Unfortunately, they are somewhat important in getting the right answer.</p>
<h3>Dealing with sorted lists</h3>
<p>Another pattern I noticed using occurred when dealing with sorted infinite lists.  These keep cropping up &#8211; for example the list of primes is a prime (ho-ho!) example.</p>
<p>In this case, I would often want to grab the members of the list smaller than a given number.  Naively, I would write</p>
<pre>filter (&lt;n) primes</pre>
<p>but this would never complete, due to the infiniteness of the list.  Knowing that the list is sorted, the correct effect can be obtained by a slight tweak:</p>
<pre>takeWhile (&lt;n) primes</pre>
<p>which completes quite happily.  This sort of thinking is also useful when interrogating a list for the existence of a particular element.  Again, a naive approach would be to use</p>
<pre>elem n primes</pre>
<p>which again doesn&#8217;t complete due to the infiniteness of the list.  Adopting the same sort of approach as before, we can re-write this as</p>
<pre>(head $ dropWhile (&lt;n) primes) == n</pre>
<p>which (once you get your head around what it is actually doing) is simple, and more to the point works.</p>
<p>Having these solutions at your fingertips allows you to avoid the problem of arbitrarily truncating the infinite list, which would be the other option.  It isn&#8217;t a good option though &#8211; truncate too early and you miss the elements you need.  But truncate too late and you spend far more time than you need to in dealing with the list.  Haskell&#8217;s lazy lists are a real boon in this situation.</p>


<p>Related posts:<ol><li><a href='http://vcolin.com/2009/01/11/notes-working-finite-sorted-lists/' rel='bookmark' title='Permanent Link: Notes on working with finite sorted lists'>Notes on working with finite sorted lists</a> <small>In a previous post, I wrote some information on ways...</small></li><li><a href='http://vcolin.com/2009/01/12/finding-member-infinite-list/' rel='bookmark' title='Permanent Link: Finding a member of an infinite list'>Finding a member of an infinite list</a> <small>Haskell, and GHC in particular never cease to amaze me....</small></li><li><a href='http://vcolin.com/2009/01/07/validation-num2str/' rel='bookmark' title='Permanent Link: Validation of use of Num2Str'>Validation of use of Num2Str</a> <small>It&#8217;s always good to see that the code you write...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/08/fast-right-smart/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Validation of use of Num2Str</title>
		<link>http://vcolin.com/2009/01/07/validation-num2str/</link>
		<comments>http://vcolin.com/2009/01/07/validation-num2str/#comments</comments>
		<pubDate>Wed, 07 Jan 2009 20:17:25 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Num2Str]]></category>
		<category><![CDATA[project euler]]></category>
		<category><![CDATA[scotland]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=320</guid>
		<description><![CDATA[It&#8217;s always good to see that the code you write has some real-life application, however small.  So, it was heartening to find a Project Euler problem that required exactly the functionality provided by my Num2Str utility.

Project Euler
In an attempt to hone my Haskell skills, I have been playing around with Project Euler.  Combining as it [...]


Related posts:<ol><li><a href='http://vcolin.com/2008/12/21/converting-numbers-strings/' rel='bookmark' title='Permanent Link: Converting numbers to strings'>Converting numbers to strings</a> <small>As a quick exercise for myself I coded up some...</small></li><li><a href='http://vcolin.com/2009/01/08/fast-right-smart/' rel='bookmark' title='Permanent Link: The fast, the right and the smart'>The fast, the right and the smart</a> <small>Working my way through the puzzles in Project Euler is...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>It&#8217;s always good to see that the code you write has some real-life application, however small.  So, it was heartening to find a <a href="http://projecteuler.net">Project Euler</a> problem that required exactly the functionality provided by my <a href="http://vcolin.com/projects/num2str/">Num2Str</a> utility.</p>
<p><span id="more-320"></span></p>
<h3>Project Euler</h3>
<p>In an attempt to hone my Haskell skills, I have been playing around with Project Euler.  Combining as it does programming with mathematics it is the perfect playground for me.  The programs required are usually relatively short in both length and running time and so are ideal when you have a few minutes spare.</p>
<h3>Validation!</h3>
<p>Upon reaching the 17th challenge, I was delighted to be confronted by the problem:</p>
<blockquote><p>If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are 3 + 3 + 5 + 4 + 4 = 19 letters used in total.</p>
<p>If all the numbers from 1 to 1000 (one thousand) inclusive were written out in words, how many letters would be used?</p>
<p>NOTE: Do not count spaces or hyphens. For example, 342 (three hundred and forty-two) contains 23 letters and 115 (one hundred and fifteen) contains 20 letters. The use of &#8220;and&#8221; when writing out numbers is in compliance with British usage.</p></blockquote>
<p>Immediately reaching for my trusty num2str function, I found that my algorithm agreed with the results given for 1,2,3,4,5 and 342.  Validation is a great thing and in this case gave me a sense of enormous well-being.  Admittedly it did require a minor post-processing step to remove spaces and hyphens, but that was to be expected and easy in Haskell (<code>filter isLetter</code> anyone?).</p>
<p>It was but a small step from validation to merrily running the full problem and getting the correct answer.</p>
<h3>Small victories</h3>
<p>It is small victories such as this that make life worth living.  I am now on a quest to become the highest rated <a href="http://projecteuler.net/index.php?section=scores&amp;country=Scotland">Scottish Haskeller</a> (login required) on Project Euler.  I have some way to go but am currently in 7th place.  Onwards for Freedom!</p>


<p>Related posts:<ol><li><a href='http://vcolin.com/2008/12/21/converting-numbers-strings/' rel='bookmark' title='Permanent Link: Converting numbers to strings'>Converting numbers to strings</a> <small>As a quick exercise for myself I coded up some...</small></li><li><a href='http://vcolin.com/2009/01/08/fast-right-smart/' rel='bookmark' title='Permanent Link: The fast, the right and the smart'>The fast, the right and the smart</a> <small>Working my way through the puzzles in Project Euler is...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/07/validation-num2str/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing a chat server</title>
		<link>http://vcolin.com/2009/01/06/testing-chat-server/</link>
		<comments>http://vcolin.com/2009/01/06/testing-chat-server/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 23:05:06 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[chat server]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=310</guid>
		<description><![CDATA[After my feeble fumblings towards what is quite possibly the most minimal HTTP server designed (minimal as in functionality rather than in minimal but complete according to specification), it was clear that part of the problem was trying to use HTTP preformance tools to test the chat server. To rectify this, it made sense that [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/' rel='bookmark' title='Permanent Link: Creating a minimal HTTP server in Haskell'>Creating a minimal HTTP server in Haskell</a> <small>I have been spending quite a lot of time lately...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>After my feeble fumblings towards what is quite possibly the most minimal HTTP server designed (minimal as in functionality rather than in minimal but complete according to specification), it was clear that part of the problem was trying to use HTTP preformance tools to test the chat server. To rectify this, it made sense that I write my own little test utility that would allow me to hammer the chat server and test it in a more appropriate way.</p>
<p><span id="more-310"></span></p>
<h4>Aside</h4>
<p>It is still clear that there are some issues with the <a href="http://sequence.complete.org/node/258">original chat server</a> since the shutdowns shouldn&#8217;t be occurring regardless of what kind of mad clients try to connect.  However, the hope is that with a more focused test utility, the problems can be resolved more easily.</p>
<h3>The plan</h3>
<p>Rather conveniently, there is a corresponding simple <a href="http://sequence.complete.org/node/257">TCP client</a> for the original <a href="http://sequence.complete.org/node/258">simple TCP server</a>.  Once again, this is where we shall begin our quest.</p>
<p>The idea is to convert the client from a command line user-interface to an automated tool that will connect to a server, send some input, receive some output, close the connection and then return the output received.  In this way we will be able to test that we receive from the server what we expect.</p>
<h3>The implementation</h3>
<p>Rather suprisingly, implementation went relatively smoothly.  I rapidly got a command line utility that takes a host, port and number of connections to make.  It then forks a new thread for each one and each writes a single line, consisting of its number.  Each then listens for a second beforewriting a final string and then closing the connection.</p>
<p>While this is happening, I am also listening to replies from the server.  Any replies are tagged with the client id and added to a giant map.  When the terminating string is recognised, that client is removed from a large list and when all clients have completed, this in turn returns.  The value returned is a map of client id to the strings received from the server by that client. This should now allow some analysis to ensure all are working as they should.</p>
<h3>Initial testing</h3>
<p>First testing was against the <a href="http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/">minimal HTTP server</a>.  And it worked.  But being keen to test it aggressively, I immediately cranked up the number of connections to make to 2000.  At that point I got this rather worrying error.</p>
<pre>getProtocolByName: does not exist (no such protocol name: tcp)</pre>
<p>Thinking that it was likely that I had deleted the TCP protocol, I suspected an issue with open sockets once more.  This was confirmed by a quick search over the number of connections.  The maximum I was allowed without getting the delightful error above was in fact 1021.  This looked fair enough when the output from <code>socklist</code> included the following TCP sockets</p>
<pre>type  port      inode     uid    pid   fd  name
tcp   9999    2284399    1000  15202    3  minimal
tcp    631    2245789       0      0    0
tcp     25      99077       0      0    0</pre>
<p>and <code>ulimit -a</code> provided the information</p>
<p>open files                      (-n) 1024</p>
<p>which implied that there would be 1021 sockets available.</p>
<h4>Interestingly enough&#8230;</h4>
<p>Doing a <a href="http://www.google.com/search&amp;q=getprotocolbyname+does+not+exist+no+such+protocol+name+tcp">Google search for &#8220;getprotocolbyname does not exist no such protocol name tcp&#8221;</a> turns up several Haskell-related links.  Could this be an indicator that Haskell is often being used for large-scale socket manipulation? Or that Haskell programmers are more likely to be doing strange things with large numbers of sockets?  Or even that Haskell programmers don&#8217;t know what they are doing and do crazy things like that?</p>
<h3>Further testing</h3>
<p>Since I was flush with my success, I moved on to testing the original chat server.  Here, as expected, things did not go as well.</p>
<p>Small number of connections were fine.  But when I rolled out the big guns and set 500 connections at the server, it promptly fell over (gracefully as ever).</p>
<p>Although disappointing that the chat server is somehow broken, the fact that the test utility reproduces the problem is reassuring as I slowly add to my minimal HTTP server making it more fully-featured.</p>
<h3>Scary code</h3>
<p>The thing about Haskell is that it lets you do some frightening stuff in a single line.  Case in point:</p>
<pre>sequence_ $ map forkIO $ zipWith ($) (zipWith ($) (repeat (start host portStr chan)) [1..n]) $ map (:[]) $ map show [1..n]</pre>
<p>which is just scary.  Of course there is probably a much simpler way to do something like this, but if it works, don&#8217;t mess with it, at least unless there is some sort of reliable test framework to rely on.</p>
<h3>Conclusion</h3>
<p>Getting a basic test framework was easy, but the proof of the pudding will be in the eating.  Or, in other words, I shall wait and see how much use I can make of it as I progress.  The final state of the code is reproduced below &#8211; for all its pretty colours, it is really quite messy!</p>
<h4>The code</h4>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">&amp;gt; <span style="color: #06c; font-weight: bold;">module</span> Main <span style="color: #06c; font-weight: bold;">where</span>
&nbsp;
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">Prelude</span> <span style="color: #06c; font-weight: bold;">hiding</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">catch</span><span style="color: green;">&#41;</span>
&nbsp;
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Network <span style="color: green;">&#40;</span>connectTo<span style="color: #339933; font-weight: bold;">,</span> withSocketsDo<span style="color: #339933; font-weight: bold;">,</span> PortID<span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">..</span><span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>
&nbsp;
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">IO</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">IO</span><span style="color: #339933; font-weight: bold;">.</span>Error <span style="color: green;">&#40;</span>isEOFError<span style="color: green;">&#41;</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span>Environment <span style="color: green;">&#40;</span>getArgs<span style="color: green;">&#41;</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span>Exception <span style="color: green;">&#40;</span>finally<span style="color: #339933; font-weight: bold;">,</span> <span style="font-weight: bold;">catch</span><span style="color: #339933; font-weight: bold;">,</span> Exception<span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">..</span><span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span>Concurrent
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span>Concurrent<span style="color: #339933; font-weight: bold;">.</span>STM
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">qualified</span> Data<span style="color: #339933; font-weight: bold;">.</span>List <span style="color: #06c; font-weight: bold;">as</span> L
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">qualified</span> Data<span style="color: #339933; font-weight: bold;">.</span>Map <span style="color: #06c; font-weight: bold;">as</span> M
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">Monad</span>
&nbsp;
&amp;gt; main <span style="color: #339933; font-weight: bold;">=</span> withSocketsDo <span style="color: #339933; font-weight: bold;">$</span> <span style="color: #06c; font-weight: bold;">do</span>
&amp;gt;     <span style="color: green;">&#91;</span>host<span style="color: #339933; font-weight: bold;">,</span> portStr<span style="color: #339933; font-weight: bold;">,</span> nStr<span style="color: green;">&#93;</span> &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> getArgs
&amp;gt;     <span style="color: #06c; font-weight: bold;">let</span> n <span style="color: #339933; font-weight: bold;">=</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">read</span> nStr <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span><span style="color: green;">&#41;</span>
&amp;gt;     chan &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> atomically newTChan
&amp;gt;     <span style="font-weight: bold;">sequence_</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="font-weight: bold;">map</span> forkIO <span style="color: #339933; font-weight: bold;">$</span> <span style="font-weight: bold;">zipWith</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">$</span><span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">zipWith</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">$</span><span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">repeat</span> <span style="color: green;">&#40;</span>start host portStr chan<span style="color: green;">&#41;</span><span style="color: green;">&#41;</span> <span style="color: green;">&#91;</span>1<span style="color: #339933; font-weight: bold;">..</span>n<span style="color: green;">&#93;</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="font-weight: bold;">map</span> <span style="color: green;">&#40;</span>:<span style="color: green;">&#91;</span><span style="color: green;">&#93;</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="font-weight: bold;">map</span> <span style="font-weight: bold;">show</span> <span style="color: green;">&#91;</span>1<span style="color: #339933; font-weight: bold;">..</span>n<span style="color: green;">&#93;</span>
&amp;gt;     output &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> mainLoop chan  <span style="color: green;">&#91;</span>1<span style="color: #339933; font-weight: bold;">..</span>n<span style="color: green;">&#93;</span> <span style="color: #339933; font-weight: bold;">$</span> M<span style="color: #339933; font-weight: bold;">.</span>fromList <span style="color: #339933; font-weight: bold;">$</span> <span style="font-weight: bold;">map</span> <span style="color: green;">&#40;</span>\i <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="color: green;">&#40;</span>i<span style="color: #339933; font-weight: bold;">,</span> <span style="color: green;">&#91;</span><span style="color: green;">&#93;</span><span style="color: green;">&#41;</span><span style="color: green;">&#41;</span> <span style="color: green;">&#91;</span>1<span style="color: #339933; font-weight: bold;">..</span>n<span style="color: green;">&#93;</span>
&amp;gt;     <span style="font-weight: bold;">putStrLn</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">show</span> output<span style="color: green;">&#41;</span>
&nbsp;
&amp;gt; start <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">String</span> <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="color: #cccc00; font-weight: bold;">String</span> <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; TChan <span style="color: green;">&#40;</span><span style="color: #cccc00; font-weight: bold;">Int</span><span style="color: #339933; font-weight: bold;">,</span> <span style="color: #cccc00; font-weight: bold;">String</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="color: green;">&#91;</span><span style="color: #cccc00; font-weight: bold;">String</span><span style="color: green;">&#93;</span> <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span>
&amp;gt; start host portStr chan i ss <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span>
&amp;gt;     <span style="color: #06c; font-weight: bold;">let</span> port <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">fromIntegral</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">read</span> portStr <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span><span style="color: green;">&#41;</span>
&amp;gt;     sock &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> connectTo host <span style="color: #339933; font-weight: bold;">$</span> PortNumber port
&amp;gt;
&amp;gt;     netTID   &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> spawn <span style="color: #339933; font-weight: bold;">$</span> listenLoop i <span style="color: green;">&#40;</span>hGetLine sock<span style="color: green;">&#41;</span> chan
&amp;gt;     <span style="font-weight: bold;">sequence_</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="font-weight: bold;">map</span> <span style="color: green;">&#40;</span>hPutStrLn sock<span style="color: green;">&#41;</span> ss
&amp;gt;     threadDelay <span style="color: red;">1000000</span>
&amp;gt;     atomically <span style="color: #339933; font-weight: bold;">$</span> writeTChan chan <span style="color: green;">&#40;</span>i<span style="color: #339933; font-weight: bold;">,</span> <span style="background-color: #3cb371;">&quot;COMPLETE&quot;</span><span style="color: green;">&#41;</span>
&amp;gt;     hClose sock</pre></div></div>



<p>Related posts:<ol><li><a href='http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/' rel='bookmark' title='Permanent Link: Creating a minimal HTTP server in Haskell'>Creating a minimal HTTP server in Haskell</a> <small>I have been spending quite a lot of time lately...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/06/testing-chat-server/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Creating a minimal HTTP server in Haskell</title>
		<link>http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/</link>
		<comments>http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 22:48:22 +0000</pubDate>
		<dc:creator>Colin Ross</dc:creator>
				<category><![CDATA[HTTP]]></category>
		<category><![CDATA[Haskell]]></category>
		<category><![CDATA[ab]]></category>
		<category><![CDATA[httperf]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[server]]></category>

		<guid isPermaLink="false">http://vcolin.com/?p=297</guid>
		<description><![CDATA[I have been spending quite a lot of time lately looking at A simple TCP server written by the elusive and somewhat anonymous mrd.  It is, as the name suggests a simple TCP server which acts as a rudimentary chat server.  Any input given is output to all clients.  All very straightforward.  [...]


Related posts:<ol><li><a href='http://vcolin.com/2009/01/06/testing-chat-server/' rel='bookmark' title='Permanent Link: Testing a chat server'>Testing a chat server</a> <small>After my feeble fumblings towards what is quite possibly the...</small></li><li><a href='http://vcolin.com/2009/02/16/xml-parsing-haskell/' rel='bookmark' title='Permanent Link: XML parsing with Haskell'>XML parsing with Haskell</a> <small>Although I have been aware of the existence of XML,...</small></li></ol>]]></description>
			<content:encoded><![CDATA[<p></p><p>I have been spending quite a lot of time lately looking at <a href="http://sequence.complete.org/node/258">A simple TCP server</a> written by the elusive and somewhat anonymous mrd.  It is, as the name suggests a simple TCP server which acts as a rudimentary chat server.  Any input given is output to all clients.  All very straightforward.  Or so I thought.</p>
<p><span id="more-297"></span></p>
<h3>First steps</h3>
<p>My problems began many months ago when I began building a kind of generic network server on top of this basic framework.  Everything seemed to be going well until I wondered to myself how it would cope under load.  So I fired up <a href="http://httpd.apache.org/docs/2.0/programs/ab.html">ab</a> and set about stressing my server.  Sadly the stress was too much and my server exited cleanly.</p>
<p>Cleanly?  Yes, it seemed that my server just shut down without a murmur.  I found that a little odd &#8211; I was expecting it to become unresponsive and churn away to itself, or to crash with some error about too many sockets open or the like.  What I was not expecting was what appeared to all intents and purposes to be a nice normal exit as though everything was normal.</p>
<p>Much head-scratching, hand-wringing, nail-biting and teeth-gnashing later, I  was a bit of a wreck but my server would still merrily shutdown under load. Clearly this was a time to go back to first principles.  Unfortunately it  wasn&#8217;t so clear to me at the time, so I continued ever onwards, changing bits and pieces here and there to try and tease the answer out of the rather  convoluted server that was evolving before my very eyes.  Unsurprisingly, the answer was not forthcoming.</p>
<h3>Time to get serious</h3>
<p>Fast-forward a few months and I had another go.  More fool me.  Once again after a few days work when I performed the stress test, the graceful shutdown again occurred.  This time I stopped myself from beginning to create large convoluted forking mechanisms and instead decided to try testing the original building block itself.</p>
<p>Lo and behold &#8211; the same behaviour was present in the original.  Now we were (finally) getting somewhere.  Even better, the shutdowns weren&#8217;t my fault &#8211;  they must be either present in the basic server or else are some artefact of the interaction with the operating system.</p>
<p>Time constraints got in the way however, and I didn&#8217;t get a chance to look at what exactly was the problem.  And so the issue sat in the back of my mind waiting for the opportunity to taunt me once more.  And that time came a few days ago.</p>
<p>I sat down with the code and promised myself that I would resolve the issue. First, I made sure the problem was still present &#8211; which it was.  My next step was to try the <a href="http://www.haskell.org/haskellwiki/IRC_channel">Haskell IRC channel</a> but there were no clues to be found there.  So I resorted to poking the code myself with surgical precision looking for any tumour that could be causing the symptoms.</p>
<h3>Specifics</h3>
<p>At this point I should probably start providing some specifics about what I was actually doing, since I suspect there are actually several issues present.  The test run is</p>
<pre>ab -n 1000 -c 100 http://localhost:9999/</pre>
<p>with the following behaviours:</p>
<ul>
<li>left to run</li>
<li>interrupted with ctrl-c</li>
<li>interrupted with ctrl-c, rerun, interrupted, rerun etc.</li>
</ul>
<p>There are few things to be aware of here.  First, <code>ab</code> is sending HTTP requests and expecting HTTP responses.  It doesn&#8217;t get these, instead it gets a whole bunch of HTTP requests sent back if it is lucky and they are likely to be all mixed up.  Here is a typical log of the console when running:</p>
<pre>new client
data: GET / HTTP/1.0
data: Host: localhost:9999
data: User-Agent: ApacheBench/2.3
data: Accept: */*
data:
data: GET / HTTP/1.0
data: Host: localhost:9999
data: User-Agent: ApacheBench/2.3
data: Accept: */*
data:
data: Host: localhost:9999
data: User-Agent: ApacheBench/2.3
data: Accept: */*
new client
new client
data: GET / HTTP/1.0</pre>
<p>As a result of these malformed HTTP responses, the individual responses will, if all is well, time out.  So, there are potentially 1000 requests waiting for responses.</p>
<p>The second thing to be aware of is that the behaviour is non-deterministic. This is not unexpected, as we are at the mercy of threading.  Sometimes the first test (left to run) will not cause the server to shutdown, other times it will.  Sometimes the second test (interrupted) will not cause the server to shutdown, other times it also will.  However, the third test has always resulted in the server shutting down, almost always on the second interruption &#8211; although not always.</p>
<h3>The investigation begins</h3>
<p>The first exploratory incision was made by removing the</p>
<pre>hPutStrLn h line</pre>
<p>line.  This does appear to have the effect of making the server reliable once more.  Unfortunately it also has the effect of making the char server very quiet, since nothing is ever output to the clients.  Rather an extreme solution, but it begins to suggest where the problem might lie.</p>
<p>The next attempt was to reintroduce the client communication but to make it small &#8211; by simply outputting a single character.  This relatively small change retrieved the shutdown once more.  So even a single character seems to cause problem.  Unless, it is not the character itself, but the flushing of the character that is the issue?</p>
<p>Attempt number three reintroduced the original client communication but removed the</p>
<pre>hFlush h</pre>
<p>line trailing it.  Now, I fully expected that this would again change the client behaviour most likely by preventing any output, but I was curious as to whether it was the flush rather than the write that was the problem.   Interestingly, the server still shutdown but it seemed to be a little more reliable as it was only the third (repeated interrupts and reruns) which cause the shutdown.</p>
<p>At this point, I sat back a little and considered what (little) I knew.  I  knew that the server was shutting down cleanly.  I knew that the process of writing to the clients was causing the problem.  I knew precious little else.</p>
<h3>Resorting to logic</h3>
<p>Having exhausted my surgical probing for now, I resorted to trying to reason logically about the problem.  The main issue was the fact that the shutdown was controlled &#8211; this suggested by looking at the code that there must be an exception being called.</p>
<pre>start servSock `finally` sClose servSock</pre>
<p>Commenting out the <code>finally</code> clause I was confident I would see some nice error pointing me to the cause of all my problems.  But it was not to be. Commenting out the <code>finally</code> had no apparent effect whatsoever.  Deflated, I replaced it and thought some more.</p>
<p>My next attempt to track the problem was, I thought, inspired.  Since I had already reasoned that there was clearly some sort of I/O issue, I decided to remove all the internal <code>putStrLn</code>s in case they were somehow overloading something.  Namely, these likely culprits</p>
<pre>putStrLn "new client"

putStrLn $ "data: " ++ line

when (dropped &gt; 0) $
putStrLn ("clients lost: " ++ show dropped)</pre>
<p>Once again though, my hopes were dashed &#8211; the console was much quieter, and  then provided me with the by-now all-too-familiar bash prompt.  Still it shutdown without a murmur.</p>
<p>The next plan of attack was to try and find if there was some uncaught  exception somewhere which was somehow causing the server to exit.  A quick look on <a href="http://haskell.org/hoogle/">Hoogle</a> led to <code>setUncaughtExceptionHandler</code>.  Feeling confident again, I liberally sprinkled some trivial handlers to do nothing more that output something so I could see where the uncaught exception might be.  But as was becoming an apparent trend, nothing was output.</p>
<h3>From despair to documentation</h3>
<p>Around this time I was beginning to despair of actually finding the problem. Instead, I thought that perhaps the way to go was to accept the shortcomings of my little server and instead make the server talk HTTP to the client so that <code>ab</code> would get meaningful responses.  However I felt that this was akin to sweeping the problem under the carpet.</p>
<p>By this stage I was desperate and had resorted to reading documentation.  While doing do, I stumbled upon <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#9">GHC&#8217;s implementation of concurrency</a>.  Up to this point had been simply compiling my server with the command</p>
<pre>ghc --make original</pre>
<p>but this suggested to me that perhaps a <code>-threaded</code> would be worth a try.  Try I did, and while it seemed to take a bit more effort to get the server to quit, quit it nevertheless did.  Not to be daunted I decided  that while I was  messing around with threads, I might as well change the calls to <code>forkIO</code> to <code>forkOS</code> to see if that made a difference.  Make a difference it did &#8211; the server was now again much easier to overload.  Back to the drawing board once more.</p>
<p>It was time to start poking at the operating system.  I set up a script to try and start 2000 instances of my server to see what happened.</p>
<pre>for i in `seq 3000 4999`
do
./original $i &amp;
echo $i
done</pre>
<p>I kept an eye on the number of sockets being used with <code>socklist | wc -l</code>, but I needn&#8217;t have worried.  The OS seemed quite happy to start up that many instances.  It even seemed to manage when I then ran <code>ab</code> against one of them. However, the server still died under load.  Still, I thought, it might be worth increasing the <code>ulimit</code>s just in case.  In particular, the maximum  number of open files might be worth increasing in case that is having an effect.  After quite a lot of tinkering with making the number of open files ludicrously large in both the server shell and the testing shell, I came to the probably inevitable conclusion that this was not causing the problem  either.</p>
<h3>Blame the code, not the OS!</h3>
<p>By now, I had exhausted my efforts at trying to blame the operating system, so I was back to poking at the server code.</p>
<p>It seemed that it was time to try to sweep the problem under the carpet for now, do I modified the code to transform it from being a simple chat server to effectively be a very dull HTTP server by simply returning a trivial response to each request, and then closing the connection.  To do this, I removed the <code>forM clients</code> loop and instead added the following lines to just after the &#8220;new client&#8221; output:</p>
<pre>hPutStrLn h "HTTP/1.1 200 Ok\r\nContent Length: 2\r\n\r\nHi"
hClose h</pre>
<p>Now, upon running the first (uninterrupted test, everything works much better. Which is to say, the server still shuts down, but it does it now with a proper error, namely</p>
<pre>hPutStr: illegal operation (handle is closed)</pre>
<p>This is progress!  In addition, <code>ab</code> reports that there 1000 requests made (as expected) and that 64 failed with exceptions.  There is also only one place in the code where <code>hPutStr</code> is used, so the search is narrowed down quite considerably.</p>
<p>The question is now this: how can the handle be closed when it has only just connected?  Also, upon repeating the test, I found that sometimes, the additional error</p>
<pre>accept: invalid argument (Bad file descriptor)</pre>
<p>is returned.  This however, seems more than likely due to my over-zealous testing before sufficient sockets have cleaned themselves up, so I don&#8217;t worry too much about it.  As expected, if I curb my impatience and wait until</p>
<pre>socklist | wc -l</pre>
<p>returns a nice small number like 6 rather than a silly number like 1214, the test behaves better.  Unfortunately, repeating the test yet more times  discovers that the number of sockets seems irrelevant and that sometimes the test fails spectacularly badly and sometimes it almost works well &#8211; but the server always shuts down with the <code>illegal operation error</code>, occasionally coupled with the <code>invalid argument error</code>.</p>
<p>Now, the <code>accept</code> error looks like a problem with the underlying socket which could be resolved with a suitable <code>catch</code> clause, so I add one in:</p>
<pre>start servSock `catch` (\e -&gt; error "TROUBLE!") `finally` sClose servSock</pre>
<p>Voila!  Now we get the server exiting with the single word &#8220;TROUBLE&#8221;.  The plan now is to try drilling back in the callstack to that point to isolate the call that is causing the problem.</p>
<h3>Distractions</h3>
<p>As an aside, I notice at this point that the <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/network/Network.html">network package</a> that I am  using is two versions out of date &#8211; 2.1.0.0 vs 2.2.0.1.  Updating doesn&#8217;t appear to have any effect though, but it is probably better to keep up-to-date.</p>
<p>Another aside I noticed while looking at the code is that the <code>tselect</code> function appears to favour newer connections &#8211; they get added to the front of the list and then have their input dealt with first.  This means that the first client to connect may just sit there, never getting its input read while the last client to connect is dealt with immediately.  In order to see what effect this had, I changed the line</p>
<pre>mainLoop servSock acceptChan $ (ch,h):clients</pre>
<p>to</p>
<pre>mainLoop servSock acceptChan $ clients ++ [(ch,h)]</pre>
<p>which, while admittedly will be slower to perform might have a beneficial effect.  So did it?  Well, with fast, original, pre-pending version, the server seems to quit with the double error described above.  However with the slow,  new, append version, we get a single error.  Unfortunately the new version is reluctant to actually return any response &#8211; the console gets a few &#8220;new client&#8221; messages, but not the full 1000.</p>
<h4>Revelations coming thick and fast</h4>
<p>More revelations came to me now &#8211; due to the alteration of the server to return a response immediately upon connection, a lot of the code was now redundant. To see just how redundant, I removed the bits that weren&#8217;t needed.  This left me with the following code:</p>
<pre>module Main where
import Prelude hiding (catch)
import Network (listenOn, accept, sClose, Socket,
withSocketsDo, PortID(..))

import System.IO
import System.Environment (getArgs)
import Control.Exception (finally, catch)
import Control.Concurrent

main = withSocketsDo $ do
[portStr] &lt;- getArgs
let port = fromIntegral (read portStr :: Int)
servSock &lt;- listenOn $ PortNumber port
putStrLn $ "listening on: " ++ show port
acceptLoop servSock `catch` (\e -&gt; error $ "TROUBLE" ++ (show e)) `finally` sClose servSock

acceptLoop :: Socket -&gt; IO ()
acceptLoop servSock = do
(cHandle, host, port) &lt;- accept servSock

hPutStrLn cHandle "HTTP/1.1 200 Ok\r\nContent Length: 2\r\n\r\nHi"
hFlush cHandle
hClose cHandle

acceptLoop servSock</pre>
<p>The result when attempting to test this were surprising.  The server was now fine, from what I could tell, but <code>ab</code> would now give errors of the form</p>
<pre>apr_socket_recv: Connection reset by peer (104)</pre>
<p>after a number of successful requests.  Investigation via <a href="http://www.google.co.uk/search?q=apr_socket_recv+connection+reset+by+peer">Google</a> led to a <a href="http://www.gossamer-threads.com/lists/apache/dev/327159">discussion on an Apache dev mailing list</a>] which threw up the <code>-r</code> option. Using this allowed the test to complete but with a large number of errors.</p>
<h3>If thing aren&#8217;t working, swap tools</h3>
<p>Beginning to wonder if this will all ever end, I decide that <code>ab</code> may be at fault, and try out <a href="http://www.hpl.hp.com/research/linux/httperf/">httperf</a>] instead.  Initial runs were reasonably successful, but were all giving <code>connreset</code> errors which looked odd. Unfortunately all my efforts to fix this by using <code>HTTP/1.0</code> and/or <code>Connection: close</code> were to no avail and then <code>httperf</code> decided to become rather unuseful and consistently fail with an assertion failure.</p>
<p>Noone said this was going to be easy!</p>
<h3>Inspiration strikes</h3>
<p>And then there is an inspiration.  Because of course, behind a lot of the  madness there is the simple fact of human error.  To be specific, the use of <code>putStrLn</code> rather than <code>putStr</code>.  Oops.  This was adding some extra characters to the response, which was no doubt confusing things somewhat.  With this change, <code>httperf</code> works once more, but still announces <code>connreset</code> errors which I have now decided to ignore.</p>
<p>The good news is that the server itself now seems to be solid.  Which it really should be, since it doesn&#8217;t do any real work any more.  Sadly, the more complex version does not benefit so greatly and still experiences the illegal operation errors.</p>
<h4>The home straight</h4>
<p>But then there was more good news, accompanied, almost inevitably by more human error.  In face the call I was using to <code>httperf</code> was invalid since I was trying to use the <code>--num-calls</code> option which I thought meant make a certain number of calls to the server.  In fact it means the number of calls to  issue on each connection &#8211; and requires that the server support persistent connections: which mine does not.  Removing this option makes the <code>httperf</code> test pass with flying colours &#8211; hurrah!  This valuable information was found via <a href="http://www.comlore.com/httperf/httperf-quickstart-guide.pdf">Theodore Bullock&#8217;s httperf QuickStart Guide</a>.</p>
<p>While <code>ab</code> still complains with its <code>apr_socket_recv</code> it looks as though  this might just be because the server is simply too fast.  The documentation for <code>ab</code> does recommend that the client and server are not on the same machine.  So for now, I am content that I have a solid platform to build upon and will now use `httperf` as my stress testing mechanism.</p>
<h3>Conclusion</h3>
<p>So what am I left with after all this investigation?  Several lessons learnt  and a very minimal but nonetheless robust Haskell HTTP server.</p>
<ol>
<li>When basing work on others&#8217; work, test the original thoroughly first.</li>
<li>When using command line tools, make sure you know what the options do.</li>
<li>When implementing a protocol, be careful that you follow it.</li>
<li>Writing down what you investigate helps organise your thoughts.</li>
</ol>
<h4>The code</h4>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">&amp;gt; <span style="color: #06c; font-weight: bold;">module</span> Main <span style="color: #06c; font-weight: bold;">where</span>
&nbsp;
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">Prelude</span> <span style="color: #06c; font-weight: bold;">hiding</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">catch</span><span style="color: green;">&#41;</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Network <span style="color: green;">&#40;</span>listenOn<span style="color: #339933; font-weight: bold;">,</span> accept<span style="color: #339933; font-weight: bold;">,</span> sClose<span style="color: #339933; font-weight: bold;">,</span> Socket<span style="color: #339933; font-weight: bold;">,</span> withSocketsDo<span style="color: #339933; font-weight: bold;">,</span> PortID<span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">..</span><span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">IO</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span>Environment <span style="color: green;">&#40;</span>getArgs<span style="color: green;">&#41;</span>
&amp;gt; <span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span>Exception <span style="color: green;">&#40;</span>finally<span style="color: #339933; font-weight: bold;">,</span> <span style="font-weight: bold;">catch</span><span style="color: green;">&#41;</span>
&nbsp;
&amp;gt; main <span style="color: #339933; font-weight: bold;">=</span> withSocketsDo <span style="color: #339933; font-weight: bold;">$</span> <span style="color: #06c; font-weight: bold;">do</span>
&amp;gt;     <span style="color: green;">&#91;</span>portStr<span style="color: green;">&#93;</span> &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> getArgs
&amp;gt;     <span style="color: #06c; font-weight: bold;">let</span> port <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">fromIntegral</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">read</span> portStr <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span><span style="color: green;">&#41;</span>
&amp;gt;     servSock &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> listenOn <span style="color: #339933; font-weight: bold;">$</span> PortNumber port
&amp;gt;     <span style="font-weight: bold;">putStrLn</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="background-color: #3cb371;">&quot;listening on: &quot;</span> <span style="color: #339933; font-weight: bold;">++</span> <span style="font-weight: bold;">show</span> port
&amp;gt;     acceptLoop servSock
&amp;gt;         `<span style="font-weight: bold;">catch</span>` <span style="color: green;">&#40;</span>\e <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="font-weight: bold;">error</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="background-color: #3cb371;">&quot;===EXCEPTION: &quot;</span> <span style="color: #339933; font-weight: bold;">++</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">show</span> e<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">++</span> <span style="background-color: #3cb371;">&quot;===&quot;</span><span style="color: green;">&#41;</span>
&amp;gt;         `finally` sClose servSock 
&nbsp;
&amp;gt; acceptLoop <span style="color: #339933; font-weight: bold;">::</span> Socket <span style="color: #339933; font-weight: bold;">-</span>&amp;gt; <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span>
&amp;gt; acceptLoop servSock <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span>
&amp;gt;     <span style="color: green;">&#40;</span>cHandle<span style="color: #339933; font-weight: bold;">,</span> host<span style="color: #339933; font-weight: bold;">,</span> port<span style="color: green;">&#41;</span> &amp;lt;<span style="color: #339933; font-weight: bold;">-</span> accept servSock
&nbsp;
&amp;gt;     hPutStr cHandle <span style="color: #339933; font-weight: bold;">$</span> <span style="background-color: #3cb371;">&quot;HTTP/1.1 200 Ok<span style="background-color: #3cb371; font-weight: bold;">\r</span><span style="background-color: #3cb371; font-weight: bold;">\n</span>&quot;</span> <span style="color: #339933; font-weight: bold;">++</span>
&amp;gt;                       <span style="background-color: #3cb371;">&quot;Content-Length: 2<span style="background-color: #3cb371; font-weight: bold;">\r</span><span style="background-color: #3cb371; font-weight: bold;">\n</span>&quot;</span> <span style="color: #339933; font-weight: bold;">++</span>
&amp;gt;                       <span style="background-color: #3cb371;">&quot;<span style="background-color: #3cb371; font-weight: bold;">\r</span><span style="background-color: #3cb371; font-weight: bold;">\n</span>&quot;</span> <span style="color: #339933; font-weight: bold;">++</span>
&amp;gt;                       <span style="background-color: #3cb371;">&quot;Hi&quot;</span>
&amp;gt;     hClose cHandle
&amp;gt;     acceptLoop servSock</pre></div></div>



<p>Related posts:<ol><li><a href='http://vcolin.com/2009/01/06/testing-chat-server/' rel='bookmark' title='Permanent Link: Testing a chat server'>Testing a chat server</a> <small>After my feeble fumblings towards what is quite possibly the...</small></li><li><a href='http://vcolin.com/2009/02/16/xml-parsing-haskell/' rel='bookmark' title='Permanent Link: XML parsing with Haskell'>XML parsing with Haskell</a> <small>Although I have been aware of the existence of XML,...</small></li></ol></p>]]></content:encoded>
			<wfw:commentRss>http://vcolin.com/2009/01/06/creating-minimal-http-server-haskell/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
