<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title></title>
 <link href="http://rlgomes.github.com/atom.xml" rel="self"/>
 <link href="http://rlgomes.github.com/"/>
 <updated>2017-04-09T04:22:32+00:00</updated>
 <id>http://rlgomes.github.com/</id>
 <author>
   <name>Rodney Gomes</name>
   <email>rodneygomes@gmail.com</email>
 </author>

 
 <entry>
   <title>Is numpy really that much faster?</title>
   <link href="http://rlgomes.github.com/work/python/numpy/python3/2017/04/02/15.11-is-numpy-really-that-much-faster.html"/>
   <updated>2017-04-02T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/numpy/python3/2017/04/02/15.11-is-numpy-really-that-much-faster-?</id>
   <content type="html">&lt;p&gt;&lt;strong&gt;numpy&lt;/strong&gt; has been all the rage for quite some time within the Python community, and I
have yet to find a nice write-up that really compares the performance of
&lt;strong&gt;numpy&lt;/strong&gt; against regular Python lists for specific tasks, so I decided to
write up a quick and dirty comparison.&lt;/p&gt;

&lt;p&gt;First, let’s simply compare the amount of memory required to store 1 million
zeros in a &lt;strong&gt;python&lt;/strong&gt; list vs 1 million zeros in a &lt;strong&gt;numpy&lt;/strong&gt; array:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;memory_profiler&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;profile&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@profile&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'__main__'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;and in &lt;strong&gt;numpy&lt;/strong&gt;:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;memory_profiler&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;profile&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;np&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@profile&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;zeros&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'__main__'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We used the neat little library &lt;a href=&quot;https://pypi.python.org/pypi/memory_profiler&quot;&gt;memory_profiler&lt;/a&gt;
to take care of profiling memory usage for the &lt;code class=&quot;highlighter-rouge&quot;&gt;main()&lt;/code&gt; function and in this one
little test we can already see the tremendous benefits of using &lt;strong&gt;numpy&lt;/strong&gt; over
traditional python lists:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;python 1_million_python_list.py
Filename: 1_million_python_list.py

Line &lt;span class=&quot;c&quot;&gt;#    Mem usage    Increment   Line Contents&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;================================================&lt;/span&gt;
     3     13.1 MiB      0.0 MiB   @profile
     4                             def main&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;:
     5     21.0 MiB      7.9 MiB       data &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; 0 &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;_ &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;range&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;0, 1000000&lt;span class=&quot;o&quot;&gt;)]&lt;/span&gt;


2017-04-02 15:25:09 &lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0 &lt;span class=&quot;nb&quot;&gt;pwd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/home/rlgomes/workspace/python3/numpy &lt;span class=&quot;nv&quot;&gt;venv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;env &lt;span class=&quot;nv&quot;&gt;duration&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;65.931s                 
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;vs&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;python 1_million_numpy_array.py
Filename: 1_million_numpy_array.py

Line &lt;span class=&quot;c&quot;&gt;#    Mem usage    Increment   Line Contents&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;================================================&lt;/span&gt;
     5     26.1 MiB      0.0 MiB   @profile
     6                             def main&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;:
     7     26.4 MiB      0.3 MiB       data &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; np.zeros&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;1000000&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;


2017-04-02 15:23:57 &lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0 &lt;span class=&quot;nb&quot;&gt;pwd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/home/rlgomes/workspace/python3/numpy &lt;span class=&quot;nv&quot;&gt;venv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;env &lt;span class=&quot;nv&quot;&gt;duration&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;.290s                                                                                          
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

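&lt;p&gt;Going by the shell durations above (65.931s vs 0.290s), the &lt;strong&gt;numpy&lt;/strong&gt; version ran
227x faster, and its memory increment (0.3 MiB vs 7.9 MiB) was 26x smaller. Keep
in mind that both runs include the overhead of &lt;strong&gt;memory_profiler&lt;/strong&gt; itself, and
that &lt;code class=&quot;highlighter-rouge&quot;&gt;np.zeros&lt;/code&gt; creates 64-bit floats by default, so a fully
populated array of a million values holds about 7.6 MiB of data.&lt;/p&gt;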
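
&lt;p&gt;One way to sanity-check numbers like these without a profiler is to ask the
objects how many bytes they hold. The following is only a minimal sketch, assuming
64-bit CPython: the helper names &lt;code class=&quot;highlighter-rouge&quot;&gt;list_container_bytes&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;nominal_array_bytes&lt;/code&gt; are made up for illustration, and the
&lt;strong&gt;numpy&lt;/strong&gt; part is skipped when the library is not installed.&lt;/p&gt;

```python
import sys


def list_container_bytes(n):
    """Bytes used by the list object itself (its array of pointers).

    The elements are not counted; here they all point at the same int 0.
    """
    data = [0 for _ in range(n)]
    return sys.getsizeof(data)


def nominal_array_bytes(n, itemsize=8):
    """Bytes a packed array of n fixed-width items holds (8 for int64/float64)."""
    return n * itemsize


if __name__ == '__main__':
    n = 1000000
    print('list container:      %.1f MiB' % (list_container_bytes(n) / 2**20))
    print('packed 8-byte array: %.1f MiB' % (nominal_array_bytes(n) / 2**20))
    try:
        import numpy as np
        # nbytes reports the size of the data buffer, whether or not the
        # operating system has committed those pages yet
        print('np.zeros(n).nbytes:  %.1f MiB' % (np.zeros(n).nbytes / 2**20))
    except ImportError:
        print('numpy not installed, skipping the nbytes check')
```

&lt;p&gt;Comparing these figures against the profiler increments gives a feel for how
much of the measured difference comes from the container itself.&lt;/p&gt;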

&lt;p&gt;Now let’s assume the data already exists in memory and we don’t care about
the time it took to load it or how much space it occupies. What happens when
we try to calculate statistics over a million random numbers?&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;timeit&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;random&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;np&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;python_list&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;numpy_array&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;python_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'sum(python_list)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;globals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;locals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%.5&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fs for python sum(list)'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'np.sum(numpy_array)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;globals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;locals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%.5&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fs for python numpy.sum(array)'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'max(python_list)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;globals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;locals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%.5&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fs for python max(list)'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'np.amax(numpy_array)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;globals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;locals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%.5&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fs for python numpy.amax(array)'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The above gives the following output on my machine:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;python million_integer_stats.py
0.56150s &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;python sum&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;list&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
0.06869s &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;python numpy.sum&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;array&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
2.05134s &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;python max&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;list&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
0.06859s &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;python numpy.amax&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;array&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That puts &lt;strong&gt;numpy&lt;/strong&gt; at roughly 8x faster at calculating the sum of a million random floats
and 29.9x faster at finding the maximum value.&lt;/p&gt;
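
&lt;p&gt;The ratios can be recomputed from the printed timings. The sketch below is a
hedged illustration rather than part of the original benchmark: &lt;code class=&quot;highlighter-rouge&quot;&gt;speedup&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;best_of&lt;/code&gt; are hypothetical helper names, and &lt;code class=&quot;highlighter-rouge&quot;&gt;best_of&lt;/code&gt; uses the standard
library’s &lt;code class=&quot;highlighter-rouge&quot;&gt;timeit.repeat&lt;/code&gt; to keep the fastest of several trials, which tends to be
less noisy than a single run.&lt;/p&gt;

```python
import timeit


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def best_of(func, repeat=5, number=100):
    """Call func `number` times per trial and return the fastest trial in seconds."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


if __name__ == '__main__':
    # ratios recomputed from the timings printed above
    print('sum: %.1fx' % speedup(0.56150, 0.06869))  # prints "sum: 8.2x"
    print('max: %.1fx' % speedup(2.05134, 0.06859))  # prints "max: 29.9x"

    # the same helper applied to fresh measurements on a smaller list
    data = [float(i) for i in range(100000)]
    print('%.5fs for sum(list)' % best_of(lambda: sum(data), number=10))
    print('%.5fs for max(list)' % best_of(lambda: max(data), number=10))
```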

&lt;p&gt;Other comparisons will likely continue to show how much more efficient
&lt;strong&gt;numpy&lt;/strong&gt; is at statistical analysis, and there is a ton more functionality
the library has to offer, from operating on matrices to fitting polynomials to
existing data.&lt;/p&gt;

&lt;p&gt;The main thing to take away from this post is to use &lt;strong&gt;numpy&lt;/strong&gt; when you are doing
any kind of math over hundreds of thousands of numbers, as it will perform much
better and remove the need to code up your own statistical functions.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Human friendly context aware duration parsing library</title>
   <link href="http://rlgomes.github.com/work/python/date/parsing/2017/03/04/15.59-human-friendly-context-aware-date-parsing.html"/>
   <updated>2017-03-04T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/date/parsing/2017/03/04/15.59-human-friendly-context-aware-date-parsing</id>
   <content type="html">&lt;p&gt;I recently needed the ability to parse durations from human readable strings that
were also context aware. The context is the date to start your duration
calculation from, so that if you started on January 1st 2017 and wanted 2 months
you’d get exactly 59 days: 31 (number of days in January 2017) + 28 (number of days in
February 2017). If I instead gave it the context of April 1st I’d get 61 days, since
April has 30 days and May has 31.&lt;/p&gt;
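
&lt;p&gt;To make the arithmetic concrete, here is a minimal standard-library sketch of
calendar-aware month counting. It is not how &lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt; is implemented; the
function names &lt;code class=&quot;highlighter-rouge&quot;&gt;add_months&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;days_spanned&lt;/code&gt; are hypothetical and only
illustrate why the starting date changes the answer.&lt;/p&gt;

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the length of the target month, so
    January 31st plus one month lands on the last day of February.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def days_spanned(start, months):
    """Number of days covered by `months` calendar months starting at `start`."""
    return (add_months(start, months) - start).days


if __name__ == '__main__':
    print(days_spanned(date(2017, 1, 1), 2))  # 31 + 28 = 59
    print(days_spanned(date(2017, 4, 1), 2))  # 30 + 31 = 61
```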

&lt;p&gt;I tried to find an existing library with no luck, so I wrote &lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt; to take care
of the job, in the hope that someone else will find it useful. You can get your
hands on delta easily through &lt;a href=&quot;https://pypi.python.org/pypi&quot;&gt;pypi&lt;/a&gt; like so:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pip install delta
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Once installed you can use it like so:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;delta&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;datetime&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;datetime&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'1 year 2 months and 3 days'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'2 months and 3.5 weeks'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;datetime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2017&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You can see that &lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt; lets you supply a context or leave it out; when
you don’t supply the context it will assume the current date. Another thing you
may have noticed is that you can get quite expressive with the duration expressions.
All of the following are accepted:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;1 year 2 months and 3 weeks
2 months, 3 weeks and 12 days
1y 2m 3w 4d
3.5 years and 2.7 days
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt; will handle all of those without any issues.&lt;/p&gt;
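
&lt;p&gt;For a rough idea of how expressions like these can be broken into number and
unit pairs, here is a small regex-based sketch. It is an assumption-laden
illustration and not &lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt;’s actual parser: the &lt;code class=&quot;highlighter-rouge&quot;&gt;tokenize&lt;/code&gt; function, the
&lt;code class=&quot;highlighter-rouge&quot;&gt;UNITS&lt;/code&gt; table and the reading of &lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt; as months are all hypothetical.&lt;/p&gt;

```python
import re

# hypothetical alias table: every spelling maps to one canonical unit name
UNITS = {}
for canonical, aliases in {
    'years': ('y', 'year', 'years'),
    'months': ('m', 'month', 'months'),
    'weeks': ('w', 'week', 'weeks'),
    'days': ('d', 'day', 'days'),
}.items():
    for alias in aliases:
        UNITS[alias] = canonical

# a number (optionally with a decimal part) followed by a unit word
TOKEN = re.compile(r'(\d+(?:\.\d+)?)\s*([a-z]+)')


def tokenize(expression):
    """Split a duration expression into a dict of unit name to amount.

    Filler such as "and" or commas is ignored because it is never
    preceded by a number.
    """
    parts = {}
    for number, unit in TOKEN.findall(expression.lower()):
        if unit not in UNITS:
            raise ValueError('unknown unit: %s' % unit)
        key = UNITS[unit]
        parts[key] = parts.get(key, 0.0) + float(number)
    return parts


if __name__ == '__main__':
    for text in ('1 year 2 months and 3 weeks',
                 '2 months, 3 weeks and 12 days',
                 '1y 2m 3w 4d',
                 '3.5 years and 2.7 days'):
        print(text, '->', tokenize(text))
```

&lt;p&gt;Turning the resulting amounts into an actual number of days is where the context
date comes in, since months and years vary in length.&lt;/p&gt;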

&lt;p&gt;If you find &lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt; useful then head over to the &lt;a href=&quot;https://github.com/rlgomes/delta&quot;&gt;GitHub&lt;/a&gt;
project and open an issue or contribute a PR for any additional features you’d
like.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Shell tips and tricks</title>
   <link href="http://rlgomes.github.com/work/scripting/shell/zsh/2016/12/03/20.00-shell-tips-n-tricks.html"/>
   <updated>2016-12-03T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/scripting/shell/zsh/2016/12/03/20.00-shell-tips-n-tricks</id>
   <content type="html">&lt;p&gt;It’s been a while since I’ve written a blog post, and I thought I’d dedicate this
one to some of the tricks I use when working with my shell in my terminal
that make me more productive on a daily basis.&lt;/p&gt;

&lt;p&gt;To start, I like to use zsh because it has a lot of features that few other
shells have, such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;argument autocompletion for commands such as kill (which completes from the list of currently running
processes) and for quite a few other commands&lt;/li&gt;
  &lt;li&gt;shared history (among different zsh sessions)&lt;/li&gt;
  &lt;li&gt;hook functions (we’ll talk about these below)&lt;/li&gt;
  &lt;li&gt;many more features…&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are a ton of them but the ones that I find the most useful are the
hook functions which you can read more about
&lt;a href=&quot;http://zsh.sourceforge.net/Doc/Release/Functions.html#Hook-Functions&quot;&gt;here&lt;/a&gt;.
I find myself using the hook functions to make my life easier every day when
working. The first very simple set of things I do with the hooks is to use
the &lt;strong&gt;chpwd&lt;/strong&gt; hook to automatically change the title of my terminal to match
the name of the directory I’m currently in. In my &lt;strong&gt;.zshrc&lt;/strong&gt; I have:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;set_window_title&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; 
    &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; -ne &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\0&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;33]0;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PWD&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\0&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;07&quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# set the title on the first pass through here&lt;/span&gt;
set_window_title

&lt;span class=&quot;c&quot;&gt;# update the console title on directory change&lt;/span&gt;
chpwd&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    set_window_title
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The previous snippet makes your title match the current working
directory at all times, but it doesn’t cope well when the current
working directory name is very long. So I’ve devised a slightly different
solution that truncates the longer title, puts in a few dots and shows the
last part of the path, which is more important to the user, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;pad&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; -lt &lt;span class=&quot;nv&quot;&gt;$2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;then
        &lt;/span&gt;pad &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; &quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else
        &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

ltrunc&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;FILL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;...&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$3&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; !&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;then
        &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;FILL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$3&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fi
    &lt;/span&gt;print -P &lt;span class=&quot;s2&quot;&gt;&quot;%&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$FILL&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

lpad_title&lt;span class=&quot;o&quot;&gt;(){&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;VAR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;ltrunc &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; 32&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
    pad &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$VAR&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; 32
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

set_window_title&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; 
    &lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$VIRTUAL_ENV&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; !&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; 
    &lt;span class=&quot;k&quot;&gt;then
        &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;VENV&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;basename &lt;span class=&quot;nv&quot;&gt;$VIRTUAL_ENV&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt; 
        &lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$VENV&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;)&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fi

    &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;STRPATH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;lpad_title &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PWD&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; -ne &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\0&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;33]0;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PREFIX&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;STRPATH&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\0&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;07&quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;It’s definitely more complex but makes for better looking titles when the directory
name is longer than 32 characters.&lt;/p&gt;
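
&lt;p&gt;The truncate-then-pad logic is easier to follow outside of shell syntax. The Python
sketch below mirrors what the zsh functions are meant to do, as far as I understand
zsh’s prompt truncation: keep the tail of the path, put the fill in front, and pad
with trailing spaces to a fixed width. The function names are hypothetical and this
is not a replacement for the zsh code.&lt;/p&gt;

```python
def ltrunc(text, width, fill='...'):
    """Keep the tail of text and replace the cut-off head with fill.

    The result is at most `width` characters long, fill included.
    """
    if len(text) > width:
        return fill + text[len(text) - (width - len(fill)):]
    return text


def fixed_width_title(text, width=32):
    """Truncate on the left, then pad with trailing spaces to exactly `width`."""
    return ltrunc(text, width).ljust(width)


if __name__ == '__main__':
    # brackets make the trailing padding visible
    print('[%s]' % fixed_width_title('/home/rlgomes/workspace/python3/numpy/some/deep/path'))
    print('[%s]' % fixed_width_title('/tmp'))
```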

&lt;p&gt;The next trick I like to use is to wrap “long” running commands so I can get
desktop notifications when they’ve completed. This can be done in pretty much any
shell since it just uses &lt;strong&gt;aliases&lt;/strong&gt; to achieve the desired effect. This is an
example implementation:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;NOTIFY_CMDS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=(&lt;/span&gt;grunt npm bower git make ant python&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;

run_cmd&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# run the command and capture its exit status before anything else runs&lt;/span&gt;
    &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$@&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;RETURN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;CMD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$@&lt;/span&gt; | tr -d &lt;span class=&quot;s1&quot;&gt;'\r'&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$RETURN&lt;/span&gt; -eq 0 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;then
        &lt;/span&gt;notify-send -t 5000 &lt;span class=&quot;s2&quot;&gt;&quot;Shell&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CMD&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;], finished&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else
        &lt;/span&gt;notify-send -t 5000 &lt;span class=&quot;s2&quot;&gt;&quot;Shell&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CMD&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;], finished with failure&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fi
    return&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$RETURN&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;CMD &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NOTIFY_CMDS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[@]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;do
    &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;alias&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$CMD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;run_cmd &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CMD&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above aliases every command listed in &lt;strong&gt;NOTIFY_CMDS&lt;/strong&gt; so that it
runs through the run_cmd function, which runs the command, checks its return
code and, in my case, uses &lt;strong&gt;libnotify&lt;/strong&gt; through &lt;strong&gt;notify-send&lt;/strong&gt; to show a
desktop notification naming the command and whether it finished successfully or
with a failure.&lt;/p&gt;

&lt;p&gt;So why do I find the notification mechanism above so useful? Well, I like to
start those long-running tasks and then move on to something else while things
are building, compiling or copying, without having to keep coming back to the
window to see if the command has completed. With this notification I know I’ll
get a little popup that I can glance at and know the result right then and
there, without having to switch back to that window.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Pseudo machine learning</title>
   <link href="http://rlgomes.github.com/work/machine-learning/python/machine%20learning/prediction/2016/10/09/10.59-pseudo-machine-learning-datascience.html"/>
   <updated>2016-10-09T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/machine-learning/python/machine%20learning/prediction/2016/10/09/10.59-pseudo-machine-learning-datascience</id>
<content type="html">&lt;p&gt;I’ve been reading up on a lot of Python machine learning toolkits and on how
exactly one can apply data science to everyday scenarios. While doing so I’ve
been wondering how many of these machine learning algorithms could be written in
a less mathematical way: by analyzing the problem at hand and devising a data
structure (model) that can capture, from a given data sample, what makes
something fall into category A or B, and then using that structure to classify
new data.&lt;/p&gt;

&lt;p&gt;For this specific experiment I’m going to use the
&lt;a href=&quot;https://www.kaggle.com/snap/amazon-fine-food-reviews&quot;&gt;Amazon fine food reviews&lt;/a&gt;
data set from &lt;a href=&quot;https://www.kaggle.com/&quot;&gt;Kaggle&lt;/a&gt;, which has about 500K reviews from Amazon
including the score given by the person who wrote each review. What I’ll
attempt to do is create a model that can take, let’s say, 250K reviews and
figure out which sets of words (n-grams) make for a high score of 5 vs a low
score of 1, then apply this to the other 250K reviews and see how close the
model comes to predicting each review’s score based solely on the words used. The
idea here is that we’d end up with a model that basically understands the sentiment
of what the person was writing and whether they actually liked or disliked the thing
they were reviewing.&lt;/p&gt;

&lt;p&gt;Let’s get started by downloading the data from the &lt;strong&gt;Kaggle&lt;/strong&gt; source above;
for this example we’re going to use the &lt;strong&gt;sqlite&lt;/strong&gt; database provided. I created
a quick &lt;a href=&quot;http://docs.python-guide.org/en/latest/dev/virtualenvs/&quot;&gt;virtualenv&lt;/a&gt; like so:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;virtualenv env -p python3
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Then proceeded to write a simple script called &lt;strong&gt;analysis.py&lt;/strong&gt;, like so:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlite3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'amazon-fine-foods/database.sqlite'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'select score, summary, text from reviews limit 5'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'__main__'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Which produces the following output:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;5, &lt;span class=&quot;s1&quot;&gt;'Good Quality Dog Food'&lt;/span&gt;, &lt;span class=&quot;s1&quot;&gt;'I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than  most.'&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;1, &lt;span class=&quot;s1&quot;&gt;'Not as Advertised'&lt;/span&gt;, &lt;span class=&quot;s1&quot;&gt;'Product arrived labeled as Jumbo Salted Peanuts...the peanuts were actually small sized unsalted. Not sure if this was an error or if the vendor intended to represent the product as &quot;Jumbo&quot;.'&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;4, &lt;span class=&quot;s1&quot;&gt;'&quot;Delight&quot; says it all'&lt;/span&gt;, &lt;span class=&quot;s1&quot;&gt;'This is a confection that has been around a few centuries.  It is a light, pillowy citrus gelatin with nuts - in this case Filberts. And it is cut into tiny squares and then liberally coated with powdered sugar.  And it is a tiny mouthful of heaven.  Not too chewy, and very flavorful.  I highly recommend this yummy treat.  If you are familiar with the story of C.S. Lewis\'&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;The Lion, The Witch, and The Wardrobe&quot;&lt;/span&gt; - this is the treat that seduces Edmund into selling out his Brother and Sisters to the Witch.&lt;span class=&quot;s1&quot;&gt;')
(2, '&lt;/span&gt;Cough Medicine&lt;span class=&quot;s1&quot;&gt;', '&lt;/span&gt;If you are looking &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;the secret ingredient &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;Robitussin I believe I have found it.  I got this &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;addition to the Root Beer Extract I ordered &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;which was good&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; and made some cherry soda.  The flavor is very medicinal.&lt;span class=&quot;s1&quot;&gt;')
(5, '&lt;/span&gt;Great taffy&lt;span class=&quot;s1&quot;&gt;', '&lt;/span&gt;Great taffy at a great price.  There was a wide assortment of yummy taffy.  Delivery was very quick.  If your a taffy lover, this is a deal.&lt;span class=&quot;s1&quot;&gt;')
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Now the way I’d break this down is that for each review body I’d assume the
language used highly influences the score given to the review. Statements
such as &lt;code class=&quot;highlighter-rouge&quot;&gt;I disliked the way&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;I truly enjoyed&lt;/code&gt; would be indicative
of a negative or a positive review respectively, and the sum of all of these
kinds of statements would have a high correlation to the score given by the
user. That means we want to create a structure that can capture the average
score associated with the use of certain groupings of words. Those groupings
are known as n-grams (you can read more on them &lt;a href=&quot;https://en.wikipedia.org/wiki/N-gram&quot;&gt;here&lt;/a&gt;), and
we can easily break up those reviews into n-grams like so:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;' '&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The above simply breaks up the text provided into words and then iterates
through the resulting elements in order, constructing n-grams of length &lt;code class=&quot;highlighter-rouge&quot;&gt;minimum&lt;/code&gt;
up to, but not including, &lt;code class=&quot;highlighter-rouge&quot;&gt;maximum&lt;/code&gt; along the way (Python’s &lt;code class=&quot;highlighter-rouge&quot;&gt;range&lt;/code&gt; excludes
its upper bound).&lt;/p&gt;
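
&lt;p&gt;To make that concrete, here is a small self-contained sketch that runs the same
logic on a made-up sentence. The sample text and the narrower &lt;code class=&quot;highlighter-rouge&quot;&gt;maximum=4&lt;/code&gt; are
assumptions picked only to keep the output short; they are not taken from the
data set:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def ngramize(text, minimum=2, maximum=6):
    # same logic as the function above, repeated so this snippet runs on its own
    words = text.split()
    result = []

    for ngrams in range(minimum, maximum):
        # zip the word list against shifted copies of itself to get sliding windows
        tuples = zip(*[words[index:] for index in range(ngrams)])
        result += [' '.join(ngram) for ngram in tuples]

    return result

# hypothetical sample sentence, not taken from the data set
sample = 'I truly enjoyed this taffy'

for ngram in ngramize(sample, minimum=2, maximum=4):
    print(ngram)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This prints seven n-grams: the four 2-word groupings from &lt;code class=&quot;highlighter-rouge&quot;&gt;I truly&lt;/code&gt; to
&lt;code class=&quot;highlighter-rouge&quot;&gt;this taffy&lt;/code&gt;, followed by the three 3-word groupings from &lt;code class=&quot;highlighter-rouge&quot;&gt;I truly enjoyed&lt;/code&gt; to
&lt;code class=&quot;highlighter-rouge&quot;&gt;enjoyed this taffy&lt;/code&gt;.&lt;/p&gt;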

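&lt;p&gt;Now that we have our n-grams, I’ve simply decided to stick with the defaults
above for the time being, which as written yield n-grams of 2 to 5 words since
&lt;code class=&quot;highlighter-rouge&quot;&gt;range&lt;/code&gt; stops one short of &lt;code class=&quot;highlighter-rouge&quot;&gt;maximum&lt;/code&gt;. We’ll now want to use the existing score
to calculate the average score associated with each n-gram’s usage throughout
the various reviews we use as our training data. Here’s the quick and dirty
approach I came up with:&lt;/p&gt;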

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;json&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlite3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;' '&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'amazon-fine-foods/database.sqlite'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'select score, summary, text from reviews limit 5'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# this will store, for each ngram, the average score associated with said&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# ngram's usage&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;review&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;review&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;indent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;__main__&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
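
&lt;p&gt;As an aside, the score bookkeeping in &lt;code class=&quot;highlighter-rouge&quot;&gt;main&lt;/code&gt; can also be sketched with two
&lt;code class=&quot;highlighter-rouge&quot;&gt;defaultdict&lt;/code&gt; counters. This is only an alternative sketch, not the script
used in this post, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;samples&lt;/code&gt; list is made-up data standing in for real
reviews that have already been broken into n-grams:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from collections import defaultdict

# hypothetical (score, ngrams) pairs, not taken from the data set
samples = [
    (5, ['great taffy', 'great price']),
    (1, ['not as', 'great price']),
    (4, ['great taffy']),
]

totals = defaultdict(int)       # sum of the scores seen for each ngram
appearances = defaultdict(int)  # number of times each ngram was seen

for score, ngrams in samples:
    for ngram in ngrams:
        totals[ngram] += score
        appearances[ngram] += 1

# average score per ngram
averages = {ngram: totals[ngram] / appearances[ngram] for ngram in totals}
print(averages)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With this toy data &lt;code class=&quot;highlighter-rouge&quot;&gt;great taffy&lt;/code&gt; averages 4.5, &lt;code class=&quot;highlighter-rouge&quot;&gt;great price&lt;/code&gt; averages 3.0 and
&lt;code class=&quot;highlighter-rouge&quot;&gt;not as&lt;/code&gt; averages 1.0, which shows how an n-gram shared by good and bad reviews
drifts toward the middle of the scale.&lt;/p&gt;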

&lt;p&gt;Running the above produces a lengthy mapping of n-grams to average scores:
just 5 reviews yield over 900 n-grams. A few silly things already show up in
the output, such as n-grams that span sentence boundaries, which obviously
don’t make sense:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  &lt;span class=&quot;s2&quot;&gt;&quot;good product.I love these chips&quot;&lt;/span&gt;: 5.0,
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;as well as HTML tags in the middle of the n-grams constructed:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  &lt;span class=&quot;s2&quot;&gt;&quot;it's ready.&amp;lt;br /&amp;gt;&amp;lt;br /&amp;gt;Tastes&quot;&lt;/span&gt;: 5.0,
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;So let’s use &lt;a href=&quot;https://www.crummy.com/software/BeautifulSoup/bs4/doc/&quot;&gt;Beautiful Soup&lt;/a&gt;
to strip the HTML tags out of our way and, instead of calculating n-grams for
the whole text, let’s do so at the sentence level by splitting our text on the
period that ends each sentence. Here’s how things look now in terms of the Python
solution:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;bs4&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BeautifulSoup&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;json&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlite3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# remove HTML tags&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;soup&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BeautifulSoup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'html.parser'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;soup&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# process sentence by sentence&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sentence&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'.'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sentence&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;' '&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'amazon-fine-foods/database.sqlite'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'select score, summary, text from reviews limit 5'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# this will store, for each ngram, the average score associated with said&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# ngram's usage&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;review&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;review&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;indent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;__main__&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Running the analysis script against the first 100 reviews takes 1.3s and
against the first 1000 takes 4s. I’m sure we could spend a ton of time optimizing
things, but right now I’ll take correctness over speed and come back to
performance later.&lt;/p&gt;

&lt;p&gt;Now that we have a mapping of n-grams to the average score associated with
each n-gram, we can start to verify how well this model works. The first idea
that occurred to me was to see how closely the model predicts the scores for
the exact same reviews it was trained on. I calculated the predicted score by
averaging the values of all the n-grams found within a review and printed it
side by side with the actual score. The output below comes from using the first
1000 reviews as training data and applying the model to the first 20
reviews:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;predicted: 4.87, actual: 5.00
predicted: 1.21, actual: 1.00
predicted: 4.01, actual: 4.00
predicted: 2.48, actual: 2.00
predicted: 4.92, actual: 5.00
predicted: 4.00, actual: 4.00
predicted: 4.89, actual: 5.00
predicted: 4.83, actual: 5.00
predicted: 4.94, actual: 5.00
predicted: 4.91, actual: 5.00
predicted: 4.88, actual: 5.00
predicted: 4.89, actual: 5.00
predicted: 1.50, actual: 1.00
predicted: 4.07, actual: 4.00
predicted: 4.90, actual: 5.00
predicted: 4.78, actual: 5.00
predicted: 2.41, actual: 2.00
predicted: 4.84, actual: 5.00
predicted: 4.92, actual: 5.00
predicted: 4.90, actual: 5.00
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;So far this is very promising, but let’s not forget this is the model running
against the exact same data it was trained on.&lt;/p&gt;
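
&lt;p&gt;To make the scoring step concrete, here is a minimal sketch of how a single
review could be scored from the n-gram averages. The &lt;code&gt;predict_score&lt;/code&gt;
helper and the three n-gram values below are made up for illustration and are
not taken from the real data set:&lt;/p&gt;

```python
def predict_score(review_ngrams, ngram_scores):
    """Average the learned score of every n-gram that was seen in training."""
    known = [ngram_scores[ngram] for ngram in review_ngrams if ngram in ngram_scores]

    if not known:
        # nothing in common with the training data, so there is no prediction
        return None

    return sum(known) / len(known)


# hypothetical model: each n-gram mapped to its average review score
ngram_scores = {'really good': 5.0, 'good coffee': 4.0, 'never again': 1.0}

print(predict_score(['really good', 'good coffee', 'coffee beans'], ngram_scores))
```

&lt;p&gt;The n-gram that was never seen during training is ignored, so this prints
4.5, the average of 5.0 and 4.0.&lt;/p&gt;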

&lt;p&gt;Now we’ve reached the point where we want to train with the first 10K
reviews, then run the model against the next 10K reviews and calculate
the root mean square error (&lt;a href=&quot;https://en.wikipedia.org/wiki/Root-mean-square_deviation&quot;&gt;RMSE&lt;/a&gt;)
of our predictions.&lt;/p&gt;
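
&lt;p&gt;For reference, the RMSE is the square root of the mean of the squared
differences between the actual and predicted scores. A minimal sketch in plain
python, with a hypothetical &lt;code&gt;rmse&lt;/code&gt; helper and made-up scores:&lt;/p&gt;

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error between two equally long lists of scores."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# made-up scores, for illustration only
actual = [5.0, 1.0, 4.0]
predicted = [4.0, 2.0, 4.0]

print('RMSE: %2.2f' % rmse(actual, predicted))
```

&lt;p&gt;With errors of 1, -1 and 0 this prints RMSE: 0.82.&lt;/p&gt;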

&lt;p&gt;The new python script looks like so:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;bs4&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BeautifulSoup&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;square&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;json&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlite3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# remove HTML tags&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;soup&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BeautifulSoup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'html.parser'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;soup&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# process sentence by sentence&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sentence&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'.'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sentence&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;' '&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'amazon-fine-foods/database.sqlite'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'select score, summary, text from reviews limit 20000'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# this will store, for each ngram, the average score associated with said&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# ngram's usage&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# train on the first 10000 rows; the remaining rows stay on the cursor&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;review&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fetchmany&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;review&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# lets use the n-gram scores to calculate the predicted score for an&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# existing review to see just how close we can get&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;review&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;review&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;


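        &lt;span class=&quot;c&quot;&gt;# skip reviews that share no n-grams with the training data&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt;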
        &lt;span class=&quot;c&quot;&gt;#print('actual: %2.2f prediction: %2.2f' % (score, predicted_score))&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'RMSE: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%2.2&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;f'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;square&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;__main__&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The output from this script gave us the following:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;RMSE: 1.16
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Which means our predictions are typically off by about 1.16 from the review
score provided by the person who wrote the review. That isn’t horrible: on a
scale of 1-5 we’re off by roughly 23% with a super simple model we wrote in less
than an hour. There are a few things we can try to make this work better, and
one of them is to use the notion of &lt;a href=&quot;https://en.wikipedia.org/wiki/Stop_words&quot;&gt;stop words&lt;/a&gt;
to avoid creating n-grams composed solely of stop words, since those are “fluff”
in the spoken language. I won’t go into those just yet, as I want to see how this
experiment does as we train on a bigger chunk of the data set. So I trained the
model against the first 100K reviews, predicted the following 100K,
and got:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;RMSE: 1.04
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Which means that with a larger training set we’re seeing the model behave
better in terms of predictions. Another thing to note is that while the script
runs it’s building an enormous hash table of all the n-grams found, which can
grow to several gigabytes. So for my last experiment I trained with the first
250K reviews to see how the RMSE looks for predicting the remaining 250K
reviews. The process hit 3.7GB of memory usage and after a little over 5
minutes we got the following:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;RMSE: 1.04
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Which means that overall we’re able to predict a review score with an error
of about 21%, using a method that is very easy to explain and follow, without
the complexity of more sophisticated machine learning algorithms.&lt;/p&gt;
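
&lt;p&gt;Going back to the stop word idea mentioned earlier, here is a rough sketch of
what that filter could look like. I haven’t measured its effect on the RMSE, and
both the &lt;code&gt;drop_stop_word_ngrams&lt;/code&gt; helper and the tiny stop word set are
assumptions made up for illustration:&lt;/p&gt;

```python
# tiny hand-picked stop word set for illustration; a real list would be longer
STOP_WORDS = {'a', 'an', 'and', 'in', 'is', 'it', 'of', 'the', 'to', 'was'}


def drop_stop_word_ngrams(ngrams, stop_words=STOP_WORDS):
    """Keep only n-grams containing at least one word that is not a stop word."""
    return [
        ngram for ngram in ngrams
        if any(word.lower() not in stop_words for word in ngram.split())
    ]


print(drop_stop_word_ngrams(['it was', 'was great', 'of the', 'the coffee']))
```

&lt;p&gt;Only the n-grams made up entirely of stop words are dropped, so this prints
['was great', 'the coffee'].&lt;/p&gt;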

&lt;p&gt;Another idea I had was to train the model with the first 100K reviews and
then write my own negative or positive review to see if the score I got
reflected what I had written, which would give me a pretty good sentiment
analysis tool. So I restructured the existing script so I could train
separately from predicting, and save the model after training so I could later
use it to analyze text and get a value from 1 (negative sentiment) to 5
(positive sentiment). This is what the tool looks like now:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/env python3&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pickle&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlite3&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;click&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;requests&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;bs4&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BeautifulSoup&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;square&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# remove HTML tags&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;soup&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BeautifulSoup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'html.parser'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;soup&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# process sentence by sentence&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sentence&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'.'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sentence&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;' '&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tuples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@click.group&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;cli&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;pass&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@cli.command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--limit'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'-l'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'how many entries from the amazon fine foods sqlite db to '&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
                   &lt;span class=&quot;s&quot;&gt;'train with'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--amazon-reviews-path'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'amazon-fine-foods'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'the path to the directory containing the amazon fine '&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
                   &lt;span class=&quot;s&quot;&gt;'foods review database.sqlite file'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--output-model'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'model.pickle'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'output filename to save the trained model'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;train&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amazon_reviews_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;amazon_reviews_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'database.sqlite'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'select score, text from reviews limit &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;d'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# this will store, for each ngram, the average score associated with that&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# ngram's usage&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;review&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;review&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
                               &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'appearances'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'wb'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;pickle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dump&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pickle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HIGHEST_PROTOCOL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@cli.command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--skip'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'-s'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'how many entries to skip ahead, ie more than the ones '&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;'you trained against'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--limit'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'-l'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'how many entries after the --skip value to predict scores for'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--amazon-reviews-path'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'amazon-fine-foods'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'the path to the directory containing the amazon fine '&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
                   &lt;span class=&quot;s&quot;&gt;'foods review database.sqlite file'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--input-model'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'model.pickle'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'input filename of the previously trained model'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;predict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;skip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amazon_reviews_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'rb'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pickle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;amazon_reviews_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'database.sqlite'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'select score, text from reviews limit &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;d'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt;
                             &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;skip&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# lets use the n-gram scores to calculate the predicted score for an&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# existing review to see just how close we can get&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;review&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;skip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;review&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt;

        &lt;span class=&quot;c&quot;&gt;#print('actual: %2.2f prediction: %2.2f' % (score, predicted_score))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;predicted_score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'RMSE: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%2.2&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;f'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;square&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@cli.command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'text'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@click.option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'--input-model'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'model.pickle'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'input filename of the previously trained model'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;analyze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'rb'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pickle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngramize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngram_scores&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ngram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ngrams_found&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'score: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;__main__&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cli&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
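
&lt;p&gt;The least obvious line in the script is the &lt;code class=&quot;highlighter-rouge&quot;&gt;zip&lt;/code&gt; over shifted copies of the word list inside &lt;code class=&quot;highlighter-rouge&quot;&gt;ngramize&lt;/code&gt;. Here is a minimal sketch of just that step, using a hypothetical helper named &lt;code class=&quot;highlighter-rouge&quot;&gt;word_ngrams&lt;/code&gt; that is not part of the script. Note that with the defaults of 2 and 6 the loop in &lt;code class=&quot;highlighter-rouge&quot;&gt;ngramize&lt;/code&gt; covers sizes 2 through 5, since &lt;code class=&quot;highlighter-rouge&quot;&gt;range&lt;/code&gt; excludes its upper bound:&lt;/p&gt;

```python
def word_ngrams(words, n):
    """Return every run of n consecutive words, joined by a space."""
    # words[0:], words[1:], ..., words[n-1:] are copies of the list shifted
    # left by 0..n-1 positions; zip stops at the shortest copy, so each tuple
    # it yields is one window of n consecutive words.
    shifted = [words[index:] for index in range(n)]
    return [' '.join(window) for window in zip(*shifted)]


# illustrative input only, the same sentence used in the analyze examples
words = 'I like this scripting tool'.split()
bigrams = word_ngrams(words, 2)
trigrams = word_ngrams(words, 3)
print(bigrams)
print(trigrams)
```

&lt;p&gt;A sentence shorter than the requested size simply yields no n-grams of that size, because the shortest shifted copy is empty.&lt;/p&gt;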

&lt;p&gt;The script grew significantly in size, but now we can train the
model and then test it against different ranges of the review database
as well as against an arbitrary string of text. So let’s first train against the first 100K
reviews:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./senti.py train -l 100000
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Now we can run against the next 1000 reviews in the database and see how high or
low the RMSE is:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py predict -s 100000 -l 1000
RMSE: 0.75
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
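
&lt;p&gt;For reference, the RMSE printed by &lt;code class=&quot;highlighter-rouge&quot;&gt;predict&lt;/code&gt; is the square root of the mean of the squared differences between actual and predicted scores, so it is expressed in the same units as the review score. Below is a small standard-library sketch of that calculation; the &lt;code class=&quot;highlighter-rouge&quot;&gt;rmse&lt;/code&gt; helper and the error values are made up for illustration and do not come from the review database:&lt;/p&gt;

```python
from math import sqrt


def rmse(errors):
    """Root mean squared error of a list of (actual - predicted) differences."""
    squared = [error * error for error in errors]
    return sqrt(sum(squared) / len(squared))


# hypothetical errors, not taken from the review database
errors = [1.0, -1.0, 0.5, -0.5]
print('RMSE: %2.2f' % rmse(errors))
```

&lt;p&gt;Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones do.&lt;/p&gt;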

&lt;p&gt;Nothing new in these last few runs, but now comes the more interesting part,
in which we see how well the model rates a new, arbitrary piece
of text as positive or negative:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I like this scripting tool'&lt;/span&gt;
score: 4.236562683156721
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Now what if we express that we &lt;code class=&quot;highlighter-rouge&quot;&gt;really like&lt;/code&gt; this scripting tool?&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I really like this scripting tool'&lt;/span&gt;
score: 4.246059117643124
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The score was slightly higher. And what if we &lt;code class=&quot;highlighter-rouge&quot;&gt;love&lt;/code&gt; this scripting tool:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I love this scripting tool'&lt;/span&gt;
score: 4.637583998372766
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;How well does negative sentiment detection work:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I do not like this scripting tool'&lt;/span&gt;
score: 3.639240038501364
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That wasn’t as low as we’d like, but let’s see if expressing a strong dislike
results in a lower value:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I really do not like this scripting tool'&lt;/span&gt;
score: 3.8001524620308818
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That actually came out slightly higher, and nowhere near where I’d want it to
be, so I started expressing in a harsher manner how negatively I felt about this tool:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I hate this scripting tool'&lt;/span&gt;
score: 2.586253369272237
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;I’m now starting to see that, as well as this tool has been working, it’s having
a hard time predicting scores for negative sentiments. Experimenting a bit,
I’m seeing that short text behaves badly. My theory is that small
n-grams such as “I really”, “this scripting” and others that don’t express actual
sentiment on their own hurt the prediction: both “I really hate” and
“I really like” get skewed by how “I really” is used across the various reviews.
So I decided to train the model only on n-grams with a minimum length of 3 and a
maximum length of 6, and the resulting model is behaving much better:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I like this'&lt;/span&gt;
score: 4.549878345498784
&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I love this'&lt;/span&gt;
score: 4.7824878387769285
&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I really love this'&lt;/span&gt;
score: 4.54860095976375
&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I do not like this'&lt;/span&gt;
score: 3.118788898794749
&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py analyze &lt;span class=&quot;s1&quot;&gt;'I hate this'&lt;/span&gt;
score: 1.0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That is behaving quite a bit better than before, and now I’m curious whether the
model does a better job of predicting the scores for the next 100K reviews:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt; &lt;/span&gt;./senti.py predict -l 100000 -s 100000
RMSE: 1.01
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;So we actually improved, even if just a tiny bit, on the previous RMSE of
1.04.&lt;/p&gt;
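
&lt;p&gt;To make the n-gram restriction concrete, here is a rough sketch of extracting
only word n-grams between 3 and 6 words long. It assumes plain whitespace
tokenization, and the helper name word_ngrams is hypothetical; the real senti.py
may tokenize differently:&lt;/p&gt;

```python
def word_ngrams(text, min_n=3, max_n=6):
    """Return every run of min_n..max_n consecutive words found in text."""
    words = text.split()  # assumption: simple whitespace tokenization
    grams = []
    for n in range(min_n, max_n + 1):
        # slide a window of n words across the sentence
        for start in range(len(words) - n + 1):
            grams.append(" ".join(words[start:start + n]))
    return grams


print(word_ngrams("I really like this"))
```

&lt;p&gt;With this restriction a two-word fragment such as “I really” never becomes a
feature on its own; it only counts as part of a longer phrase.&lt;/p&gt;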

&lt;p&gt;This was simply an experiment to see how well this whole idea would work and I’m
surprised I was able to get any usefulness out of something I wrote up in a few
hours while documenting and experimenting with the approach as I went along. If
you attempt to use the code here make sure to install the following
requirements:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;beautifulsoup4==4.5.1
click==6.6
numpy==1.11.2
requests==2.11.1
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;You can do that by issuing &lt;code class=&quot;highlighter-rouge&quot;&gt;pip install xyz==1.2.3&lt;/code&gt; for each of those lines, and
the above script should just work, provided you’ve downloaded the
&lt;a href=&quot;https://www.kaggle.com/snap/amazon-fine-food-reviews&quot;&gt;Amazon fine food reviews&lt;/a&gt;
data set.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>iPython Notebooks Rock!</title>
   <link href="http://rlgomes.github.com/work/python/ipython/awesome/notebooks/2014/07/06/10.42-using-ipython-notebooks.html"/>
   <updated>2014-07-06T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/ipython/awesome/notebooks/2014/07/06/10.42-using-ipython-notebooks</id>
   <content type="html">&lt;p&gt;Decided to write a quick and short post on using &lt;strong&gt;ipython notebooks&lt;/strong&gt; as I
came across this just today and found it to be extremely useful. If you already
used ipython then I think &lt;strong&gt;ipython notebooks&lt;/strong&gt; will be super quick to pick up.
The idea behind the &lt;strong&gt;ipython notebooks&lt;/strong&gt; is that they’re an ipython session in
the browser that you can save and share with others. Then within that session
you can easily edit the existing session so you can undo parts of it that you
didn’t want to share and the final product is a clean session of python code
that you were attempting to show someone else how something works.&lt;/p&gt;

&lt;p&gt;The nice thing about &lt;strong&gt;ipython notebook&lt;/strong&gt; is that if you’re already using 
&lt;strong&gt;ipython&lt;/strong&gt; then you already have it installed and ready to go:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; ipython notebook
...
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The previous command will open your browser on an ipython session and you can 
start writing into the Web UI the same expressions you’d do in &lt;strong&gt;ipython&lt;/strong&gt; on 
the command line like so:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import random
data = [ random.randint(1,1000) for _ in range(0, 10) ]
print(data)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Once you’ve filled in a cell on the screen you can hit &lt;strong&gt;Ctrl+Enter&lt;/strong&gt; or go to
the top and press the play button to have your code interpreted and the result
rendered in the ipython notebook. For the above you’d have something like so:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/2014/ipython_notebooks/simple_ipython_session.png&quot; alt=&quot;simple session&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Now you can save the above session and share it with a colleague with ease. Just
click on the &lt;strong&gt;File&lt;/strong&gt; -&amp;gt; &lt;strong&gt;Download as&lt;/strong&gt; and pick your option. You can start to
see just how useful this is when you share your &lt;strong&gt;ipynb&lt;/strong&gt; file with another
pythonista and actually share a piece of executable code that can be used to
iterate on an idea within the ipython shell.&lt;/p&gt;

&lt;p&gt;For a more interesting example make sure to &lt;strong&gt;pip&lt;/strong&gt; install the vincent library
and then have a look at the &lt;a href=&quot;/images/2014/ipython_notebooks/vincent_session.ipynb&quot;&gt;vincent session&lt;/a&gt;
from within your own &lt;strong&gt;ipython notebook&lt;/strong&gt; session by starting that session in a
directory that contains the &lt;strong&gt;ipynb&lt;/strong&gt; file. If the previous loading worked
correctly you’ll be looking at a similar session to the following one:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/2014/ipython_notebooks/vincent_session.png&quot; alt=&quot;vincent session&quot; /&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Pythonbrew - the best way to install and manage python on your system</title>
   <link href="http://rlgomes.github.com/work/python/brew/install/2013/06/20/18.00-pythonbrew-best-way-to-install-and-manage-python-on-your-system.html"/>
   <updated>2013-06-20T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/brew/install/2013/06/20/18.00-pythonbrew---best-way-to-install-and-manage-python-on-your-system</id>
   <content type="html">&lt;p&gt;This will be a short post but I just wanted to make sure to write up a simple
post on how exactly pythonbrew works and how much time and effort it can save
you when managing different python installs on any given OS.&lt;/p&gt;

&lt;p&gt;So pythonbrew is quite simple to install, you can start by having a look at the
github project &lt;a href=&quot;https://github.com/utahta/pythonbrew&quot;&gt;here&lt;/a&gt;, but installing it
is as easy as:&lt;/p&gt;

&lt;console&gt;
curl -kL http://xrl.us/pythonbrewinstall | bash
&lt;/console&gt;

&lt;p&gt;and then adding the following to your bashrc:&lt;/p&gt;

&lt;console&gt;
[[ -s $HOME/.pythonbrew/etc/bashrc ]] &amp;amp;&amp;amp; source $HOME/.pythonbrew/etc/bashrc
&lt;/console&gt;

&lt;p&gt;With this you now have pythonbrew ready to go, and you can check which 
versions of python are currently available to install with:&lt;/p&gt;

&lt;console&gt;
&amp;gt;pythonbrew list -k
# Pythons
Python-1.5.2
Python-1.6.1
Python-2.0.1
Python-2.1.3
Python-2.2.3
Python-2.3.7
Python-2.4.6
Python-2.5.6
Python-2.6.8
Python-2.7.3
Python-3.0.1
Python-3.1.4
Python-3.2.3
Python-3.3.0
&lt;/console&gt;

&lt;p&gt;Isn’t it impressive that you can install any python version from 
1.5.2 to the very latest bleeding-edge 3.3.0? Installing any of those
versions is as easy as:&lt;/p&gt;

&lt;console&gt;
&amp;gt; pythonbrew install Python-2.4.6
Downloading Python-2.4.6.tgz as /home/rlgomes/.pythonbrew/dists/Python-2.4.6.tgz
######################################################################## 100.0%
Extracting Python-2.4.6.tgz into /home/rlgomes/.pythonbrew/build/Python-2.4.6

This could take a while. You can run the following command on another shell to track the status:
  tail -f &quot;/home/rlgomes/.pythonbrew/log/build.log&quot;

Patching Python-2.4.6
Installing Python-2.4.6 into /home/rlgomes/.pythonbrew/pythons/Python-2.4.6
pythonbrew list
Downloading distribute_setup.py as /home/rlgomes/.pythonbrew/dists/distribute_setup.py
######################################################################## 100.0%
Installing distribute into /home/rlgomes/.pythonbrew/pythons/Python-2.4.6
Installing pip into /home/rlgomes/.pythonbrew/pythons/Python-2.4.6

Installed Python-2.4.6 successfully. Run the following command to switch to Python-2.4.6.
  pythonbrew switch 2.4.6
&lt;/console&gt;

&lt;p&gt;Checking which versions are currently installed is also very easy:&lt;/p&gt;

&lt;console&gt;
&amp;gt; pythonbrew list
# pythonbrew pythons
  Python-2.4.6
  Python-2.7.2
  Python-2.7.3 (*)
  Python-3.2
  Python-3.3.0
&lt;/console&gt;

&lt;p&gt;As you can see, the above states that version 2.7.3 is currently in use. To 
switch the current terminal session over to another version you can 
do this like so:&lt;/p&gt;

&lt;console&gt;
&amp;gt; pythonbrew use 2.4.6
Using `Python-2.4.6
&lt;/console&gt;

&lt;p&gt;By this point you realize this is just too darn easy and makes using multiple 
python versions on the same host painless. The other nice thing is that 
pythonbrew already has virtual environments built in, and you can see the rest 
of those commands right in the README provided on github.&lt;/p&gt;

&lt;p&gt;You should really start using pythonbrew from now on as it makes installing and
changing python versions quite easy.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>From Prototype To Production</title>
   <link href="http://rlgomes.github.com/work/writings/software/engineering/tutorial/2012/10/21/18.44-From-Prototype-To-Production.html"/>
   <updated>2012-10-21T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/writings/software/engineering/tutorial/2012/10/21/18.44-From-Prototype-To-Production</id>
   <content type="html">&lt;p&gt;&lt;strong&gt;START 12:01 October 14th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Following up on my previous post, I’d like to give a more real world example
of how a modern software engineer should be able to write and test code using 
the vast number of tools/frameworks available to get the job done in a timely 
fashion while producing high quality code.&lt;/p&gt;

&lt;p&gt;To start of course we’ll need some sort of a project to build while writing 
about how to solve the issues we come across. So for this writing I will attempt
to create a key/value store in &lt;strong&gt;Python&lt;/strong&gt; that can be used to mock out a real 
key/value store such as Redis, LevelDB or Amazon’s Dynamo. The current 
requirements will be:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support 3 simple API calls:
    &lt;ul&gt;
      &lt;li&gt;GET key&lt;/li&gt;
      &lt;li&gt;SET key value&lt;/li&gt;
      &lt;li&gt;DEL key&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;The protocol used to communicate should be human readable and really 
efficient, just like Redis’s communication protocol is.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As part of this writing I will track in real time how many hours I’m spending
on this project while writing the post so that at the end you can get an idea
of how little overhead writing tests and documentation while developing really 
has.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;STOP: 12:11 on October 14th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;START 14:57 on October 14th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So the first thing we’ll have to do is to define the exact protocol we’re going
to use in a way that can be easily consumed by others who will attempt to talk 
the same protocol or create clients to talk this custom protocol. We mentioned
earlier we’d be using a protocol similar to what Redis uses. You can read up
on Redis’s communication protocol &lt;a href=&quot;http://redis.io/topics/protocol&quot;&gt;here&lt;/a&gt; and 
we’ll be greatly simplifying this protocol for this writing like so:&lt;/p&gt;

&lt;pre&gt;
*[number of arguments] CR LF
[command name] CR LF
[argument 1] CR LF
[argument 2] CR LF
&lt;/pre&gt;

&lt;p&gt;This means that the first thing sent is the indication of how many arguments 
will follow, terminated by a carriage return and line feed (\r\n). Then
each of the arguments follows on a single line terminated by \r\n.&lt;/p&gt;

&lt;p&gt;So sending a &lt;strong&gt;SET&lt;/strong&gt; request for the key &lt;strong&gt;A&lt;/strong&gt; to set it to the value 100 would 
look like so on the wire:&lt;/p&gt;

&lt;pre&gt;
*3\r\nSET\r\nA\r\n100\r\n
&lt;/pre&gt;
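
&lt;p&gt;To make the framing concrete, here is a small sketch of how one complete
request could be split back into its parts. It is only an illustration of the
wire format described above, not the server code; the function name
parse_request is hypothetical and it assumes the whole request is already in
one string:&lt;/p&gt;

```python
def parse_request(raw):
    """Split one wire-format request into [command, arg1, arg2, ...]."""
    lines = raw.split("\r\n")
    # a well-formed request ends with CR LF, which leaves an empty last element
    if lines[-1] != "":
        raise ValueError("request must end with CR LF")
    lines.pop()
    if not lines or not lines[0].startswith("*"):
        raise ValueError("request must start with *[number of arguments]")
    count = int(lines[0][1:])
    parts = lines[1:]
    # the count covers the command name plus its arguments
    if len(parts) != count:
        raise ValueError("expected %d lines, got %d" % (count, len(parts)))
    return parts


print(parse_request("*3\r\nSET\r\nA\r\n100\r\n"))
```

&lt;p&gt;Note that the leading count includes the command name itself, which is why a
SET with a key and a value is announced as 3.&lt;/p&gt;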

&lt;p&gt;Replies will also be very similar to the way Redis deals with this type of thing
and we’ll basically start a response with a &lt;strong&gt;+&lt;/strong&gt; on success followed by a single
line response, or we’ll start with &lt;strong&gt;-&lt;/strong&gt; if there was an error followed by a
single line with the error message.&lt;/p&gt;

&lt;p&gt;We now have to decide what we’ll use to build the protocol server on.
Currently one of the most flexible and best performing options in the Python world 
is &lt;strong&gt;Twisted&lt;/strong&gt;, which can be used to create your own custom protocol or, 
better yet, to build your own HTTP, FTP, etc. server in minutes. I 
had to brush up on my knowledge of &lt;strong&gt;Twisted&lt;/strong&gt; and how to create my own custom
protocol, and after reading through the documentation for some 15 minutes I found
that what I wanted was the &lt;strong&gt;LineReceiver&lt;/strong&gt; implementation, which lets me build my 
protocol on top of line-by-line reads from any connection. A first example 
put together with the &lt;strong&gt;LineReceiver&lt;/strong&gt; class may look like 
this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;twisted.internet.protocol&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Factory&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;twisted.protocols.basic&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LineReceiver&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;twisted.internet&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reactor&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Answer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LineReceiver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;answers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'How are you?'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'Fine'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;I don't know what you mean&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lineReceived&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;answers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;answers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;answers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AnswerFactory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Factory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;pass&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;buildProtocol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Answer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;reactor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;listenTCP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9999&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AnswerFactory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;reactor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This of course is just an example of how you can use Twisted to make a line
reading protocol handler. Now let’s actually use this to read our new custom 
protocol, which is a multi-line protocol that needs to reconstruct each command
from the multiple lines that it is broken up into on the wire.&lt;/p&gt;

&lt;p&gt;Now even before we start writing the actual server handler code, let’s first 
write up a few very simple unit tests that we can use to verify that we have 
working &lt;strong&gt;set&lt;/strong&gt; and &lt;strong&gt;get&lt;/strong&gt; commands:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;socket&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;unittest&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ProtocolTest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unittest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TestCase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;setUp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AF_INET&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SOCK_STREAM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'localhost'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9999&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;tearDown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_send_cmd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;cmd_string&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'*&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\r\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;cmd_string&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\r\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmd&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;cmd_string&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\r\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendall&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmd_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_read_response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_read_response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;''&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'+'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# successful response with a single line response&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# unsuccessful response with error message&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_basic_set_and_get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_send_cmd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'SET'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'A'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;resp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_send_cmd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'GET'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'A'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'100'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'expected 100 got &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resp&lt;/span&gt;
        
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'__main__'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;unittest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;There is quite a lot of test code displayed there, but that’s because we had to
create those helper methods for sending commands and receiving responses. The
actual test itself is just 3 lines: send a &lt;strong&gt;SET&lt;/strong&gt; command, then verify with a
&lt;strong&gt;GET&lt;/strong&gt; command that the server recorded our value of 100 correctly.&lt;/p&gt;

&lt;p&gt;We’re now back to the server code because we need to restructure our
&lt;strong&gt;CommandReader&lt;/strong&gt; so that it can actually read each command line by line and
translate that into the right underlying set/get command. In a first approach
at writing our &lt;strong&gt;CommandReader&lt;/strong&gt;, what we need to do is make this
&lt;strong&gt;LineReceiver&lt;/strong&gt; act as a state machine that transitions between commands in a
very well defined manner. Every line that starts with an asterisk is expected
to begin a new command, which is consumed until all of its arguments are read;
the command is then dispatched and the response is sent back to the waiting client.&lt;/p&gt;
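
&lt;p&gt;To make that framing concrete, here is a small sketch of how a client might
encode a command for this reader. It assumes a &lt;strong&gt;*N&lt;/strong&gt; header line followed by one
bare argument per line, each ending in CRLF; the &lt;strong&gt;encode_command&lt;/strong&gt; name is
hypothetical and is not part of the original test helpers:&lt;/p&gt;

```python
def encode_command(*args):
    """Frame a command as a '*N' header followed by one argument per line.

    Assumption: arguments travel as bare lines with CRLF endings, which is
    what a line-based reader consumes. This is not full Redis framing, which
    also sends a length line before each argument.
    """
    lines = ['*%d' % len(args)]
    lines.extend(str(arg) for arg in args)
    return '\r\n'.join(lines) + '\r\n'


# Example: the SET command used by the unit test.
wire = encode_command('SET', 'A', 100)
print(repr(wire))
```

&lt;p&gt;The asterisk line tells the server how many lines to collect before it has a
complete command, which is all the state the reader has to track.&lt;/p&gt;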

&lt;p&gt;A first approach may look like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CommandReader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LineReceiver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    state machine command reader
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reading_argument_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
       
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lineReceived&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'*'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# new command starting&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-unexpected start of new command&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\r&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# command complete lets dispatch and return OK&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'+OK'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# we're still reading the arguments&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Of course, if we run our test against this server, the client thinks its data was
stored but then fails to retrieve it, because the server always responds with ‘OK’,
as you can see here:&lt;/p&gt;

&lt;console&gt;
&amp;gt; python tests/protocol.py
F
======================================================================
FAIL: test_basic_set_and_get (__main__.ProtocolTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File &quot;tests/protocol.py&quot;, line 45, in test_basic_set_and_get
    assert resp == '100', 'expected 100 got %s' % resp
AssertionError: expected 100 got OK

----------------------------------------------------------------------
Ran 1 test in 0.002s

FAILED (failures=1)
&lt;/console&gt;

&lt;p&gt;Let’s take our current implementation and make the &lt;strong&gt;CommandReader&lt;/strong&gt; smart
enough to look up the correct handler based on the command name supplied and
return the response that the handling function returns.&lt;/p&gt;

&lt;p&gt;After some debugging and restructuring of the code as I made changes and
re-ran the tests, I realized that the check for a complete command should
always be done after processing each line. I also figured out that
&lt;strong&gt;LineReceiver.sendLine&lt;/strong&gt; already appends the line ending to each
response. I fixed the code up like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CommandReader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LineReceiver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    state machine command reader
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmdhandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmdhandler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmdhandler&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmd_names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmdhandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reading_argument_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
       
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;lineReceived&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'*'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# new command starting&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-unexpected start of new command'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; 
        
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# we're still reading the arguments&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# lookup the method handler&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

            &lt;span class=&quot;c&quot;&gt;# remove the command name from the arguments&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:]&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;command&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmd_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmdhandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'+&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-unknown command &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Our unit test now passes successfully, like this:&lt;/p&gt;

&lt;console&gt;
&amp;gt; python tests/protocol.py
.
----------------------------------------------------------------------
Ran 1 test in 0.002s

OK
&lt;/console&gt;
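
&lt;p&gt;The handler object passed to &lt;strong&gt;CommandReader&lt;/strong&gt; is not shown above, so here is
a minimal sketch of what it might look like, together with the lookup step pulled
out into a plain function. The in-memory &lt;strong&gt;CommandHandler&lt;/strong&gt; body, the
&lt;strong&gt;dispatch&lt;/strong&gt; function, the &lt;strong&gt;ALLOWED&lt;/strong&gt; tuple and the lowercasing of command
names are all assumptions made for illustration. The explicit whitelist is a
deliberate variation: &lt;strong&gt;dir()&lt;/strong&gt; also lists attributes that are not commands,
such as &lt;strong&gt;__init__&lt;/strong&gt;.&lt;/p&gt;

```python
class CommandHandler(object):
    """Hypothetical in-memory handler; the real one is not shown in this post."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return 'OK'

    def get(self, key):
        # Assumption: a missing key yields an empty string.
        return self._data.get(key, '')


# Only these method names may be reached from the wire.
ALLOWED = ('set', 'get')


def dispatch(handler, args):
    """Return the reply line for a parsed command (name first, then arguments)."""
    name = args[0].lower()
    if name not in ALLOWED:
        return '-unknown command %s' % args[0]
    try:
        result = getattr(handler, name)(*args[1:])
    except TypeError:
        # Most likely a wrong argument count; note that this also hides
        # TypeErrors raised inside the handler itself.
        return '-wrong number of arguments for %s' % args[0]
    return '+%s' % result


handler = CommandHandler()
replies = [
    dispatch(handler, ['SET', 'A', '100']),
    dispatch(handler, ['GET', 'A']),
    dispatch(handler, ['__class__']),
]
print(replies)
```

&lt;p&gt;Keeping the lookup in a plain function like this also makes it testable without
opening a socket.&lt;/p&gt;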

&lt;p&gt;Now I have a working prototype that can actually handle set and get requests and
keep that information in memory at runtime. At this point it’s 16:30 on
October 14th, 2012, and between writing the unit test and writing the code I’ve
spent just a little over an hour and a half to have a working prototype that
a dependent service could start integrating against.&lt;/p&gt;

&lt;p&gt;What we’ll focus on next is using tools such as &lt;strong&gt;pylint&lt;/strong&gt; to identify problems
in the code as well as using &lt;strong&gt;setuptools&lt;/strong&gt; to create a setup script that can 
be used by anyone to easily install this service and start it for others to 
integrate with while features are being added to the base source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;STOP 16:32 on October 14th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;START 18:59 on October 14th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So in order to share our code with others we’ll have to create a setup script
that can be used to easily install and start up the service, as well as to
upgrade an existing installation as further updates are made to the
code base. For this specific code we’ll create a simple
&lt;strong&gt;setup.py&lt;/strong&gt; file like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;setuptools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;setup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;find_packages&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;setup&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'kvs'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;version&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'0.0.1'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;author&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Rodney Gomes'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;author_email&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'rodneygomes@gmail.com'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;''&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;test_suite&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;tests&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;keywords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'keyvalue'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'storage'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;py_modules&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;scripts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'kvsserver.py'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;license&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Apache 2.0 License'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;description&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'simple key value store'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;long_description&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'README.md'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;packages&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;find_packages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exclude&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tests'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;install_requires&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; 
                        &lt;span class=&quot;s&quot;&gt;'twisted'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;entry_points&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;'console_scripts'&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;'kvs_start = kvsserver:main'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;                                                                      
    &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;                
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With that we can now check this code in and anyone who wants to run your service
can easily do the following on the command line with Python 2.7 installed:&lt;/p&gt;

&lt;console&gt;
&amp;gt; git clone https://github.com/rlgomes/kvs.git
&amp;gt; cd kvs
&amp;gt; git checkout v0.0.1
&amp;gt; python setup.py install
running install
...
Finished processing dependencies for kvs==0.0.1
&lt;/console&gt;

&lt;p&gt;Now your service can easily be started by using the &lt;strong&gt;‘kvs_start’&lt;/strong&gt; script that 
should now be in your path.&lt;/p&gt;

&lt;p&gt;Before we proceed to add more features and tests to our code, let’s
introduce a code style checker and static code analyzer called
&lt;strong&gt;pylint&lt;/strong&gt; and see how we can use it to make sure our code is clean, lean and a little
less prone to errors. Using pylint is super easy: you can install the python
package with a quick &lt;strong&gt;‘pip install pylint’&lt;/strong&gt; and then call it on any
code base like so:&lt;/p&gt;

&lt;console&gt;
&amp;gt; pylint kvsserver.py
No config file found, using default configuration
************* Module kvsserver
C:  1,0: Missing docstring
W:  7,0:CommandReader: Method 'rawDataReceived' is abstract in class 'LineReceiver' but is not overridden
C: 21,4:CommandReader.lineReceived: Invalid name &quot;lineReceived&quot; for type method (should match [a-z_][a-z0-9_]{2,30}$)
C: 54,0:CommandReaderFactory: Missing docstring
C: 59,4:CommandReaderFactory.buildProtocol: Invalid name &quot;buildProtocol&quot; for type method (should match [a-z_][a-z0-9_]{2,30}$)
C: 62,0:CommandHandler: Missing docstring
C: 67,4:CommandHandler.set: Missing docstring
C: 71,4:CommandHandler.get: Missing docstring
C: 76,0:main: Missing docstring
E: 78,4:main: Module 'twisted.internet.reactor' has no 'listenTCP' member
E: 79,4:main: Module 'twisted.internet.reactor' has no 'run' member

... other output omitted ...

Your code has been rated at 6.27/10
&lt;/console&gt;

&lt;p&gt;&lt;strong&gt;pylint&lt;/strong&gt; supplies quite a bit of output; the most important parts
are shown above. We can quickly see that we’re missing quite a few docstrings,
and there are a few things we can ignore, such as the missing members, which
is just &lt;strong&gt;pylint&lt;/strong&gt; failing to resolve those members statically. As with any
tool, &lt;strong&gt;pylint&lt;/strong&gt; is intended to point you in the direction of a problem; you
ultimately have to decide whether to fix something or leave it be and, as
in this case, use an inline comment to tell &lt;strong&gt;pylint&lt;/strong&gt; to be quiet. The score
given to your code is an interesting way of showing developers whether their code is
up to par with how &lt;strong&gt;Python&lt;/strong&gt; code should be written and maintained.&lt;/p&gt;

&lt;p&gt;Let’s add those docstrings and silence the missing member errors for functions that we know
are in fact there. With a subsequent rerun of &lt;strong&gt;pylint&lt;/strong&gt; I now have a score of
7.45, which is a pretty decent score. Something like pylint should be used to
make sure that the code quality doesn’t degrade drastically with time
and that certain levels of code quality and proper code writing are maintained
across the team.&lt;/p&gt;
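
&lt;p&gt;As a rough illustration of both fixes, the sketch below shows a module
docstring, a class docstring and an inline &lt;strong&gt;# pylint: disable=E1101&lt;/strong&gt; comment
(E1101 is the “has no member” error reported above). The &lt;strong&gt;DynamicReactor&lt;/strong&gt;
class is a hypothetical stand-in so the example runs without twisted installed;
with this stand-in pylint may not report the error at all, so the comment only
shows where the pragma would go:&lt;/p&gt;

```python
"""Toy module showing docstrings and an inline pylint pragma."""


class DynamicReactor(object):
    """Stand-in for an object whose members only exist at runtime."""

    def __getattr__(self, name):
        # Called only for attributes that are not found normally; returns a
        # function that echoes the attribute name and its arguments.
        return lambda *args: (name, args)


reactor = DynamicReactor()

# The trailing comment disables the E1101 check for this line only.
result = reactor.listenTCP(8080, None)  # pylint: disable=E1101
print(result)
```

&lt;p&gt;Scoping the pragma to a single line keeps the check active everywhere else in
the module.&lt;/p&gt;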

&lt;p&gt;We’re now at a stage where others are already able to install and use our code,
and we need to continue adding features to our existing service while making sure
that with each checkin we don’t break any of the older functionality and yet
are able to quickly introduce new features or bug fixes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;STOP 20:01 on October 14th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;START 20:22 on October 20th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At this point I decided to restructure the code a bit by creating a kvs package
and moving the &lt;strong&gt;CommandHandler&lt;/strong&gt; logic into its own module. That way I can
continue development on the way we’re storing/retrieving data without having to
muck around in the kvsserver module. While doing this I also created a few more
tests to verify that the set, get and new del operations all work correctly.&lt;/p&gt;

&lt;p&gt;We now have the full API available with a few additional unit tests that verify
the various use cases for the set, get and delete operations. I also spent some time
creating a very simple set of performance tests to get an idea of how
well this whole thing performs. To create the basic performance tests I first
created a &lt;strong&gt;BaseTest&lt;/strong&gt; test case to build on, which provides the &lt;strong&gt;send&lt;/strong&gt;
command used earlier to easily send data to and receive data from the server, and then I
built the following very simplistic performance test:&lt;/p&gt;

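&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;time&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# number of requests sent by each performance test&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;20000&lt;/span&gt;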

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PerformanceTest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BaseTest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_1st_set_small_key_performance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;measure how many small SET operations complete per second&quot;&quot;&quot;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'SET'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'key-&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;d'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'tiny little value'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'SET &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;f ops/sec'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_2nd_get_small_key_performance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;measure how many small GET operations complete per second&quot;&quot;&quot;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'GET'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'key-&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;d'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'GET &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;f ops/sec'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_3rd_del_small_key_performance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;measure how many small DEL operations complete per second&quot;&quot;&quot;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'DEL'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'key-&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;d'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'DEL &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;f ops/sec'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ITERATIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The performance numbers were above my expectations, as I was expecting a couple
of thousand operations per second but got:&lt;/p&gt;

&lt;console&gt;
SET 7829.786726 ops/sec
GET 6993.390858 ops/sec
DEL 7976.453482 ops/sec
&lt;/console&gt;

&lt;p&gt;I was satisfied with the single-threaded performance of the kvs store at this
point and went on to make the code easier to read &amp;amp; write, so I spent a little
time restructuring the &lt;strong&gt;CommandReader&lt;/strong&gt; class to be a bit smarter about
how we parse each command by switching the &lt;strong&gt;lineReceived&lt;/strong&gt; method
implementation at run time. I also fixed up the base test class to be more
specific about the SET, GET &amp;amp; DEL methods being used to talk to the server. Here’s
what the new &lt;strong&gt;CommandReader&lt;/strong&gt; looks like:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CommandReader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LineReceiver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    state machine command reader
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmdhandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmdhandler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmdhandler&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmd_names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmdhandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reading_argument_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lineReceived&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_start_command&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_start_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'*'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'got &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;

        &lt;span class=&quot;c&quot;&gt;# new command starting&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-unexpected start of new command&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\r&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:])&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lineReceived&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_read_command&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_read_command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lineReceived&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_read_arguments&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_read_arguments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;c&quot;&gt;# strip surrounding whitespace and collect the argument&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expected_arguments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c&quot;&gt;# lookup the method handler&lt;/span&gt;

            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;command&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmd_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmdhandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'+&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendLine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-unknown command &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_command&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lineReceived&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_start_command&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The nice thing at this point is that I’m constantly able to change code quite 
drastically without having to worry if I broke something because the unit tests
are able to give me some confidence in the changes I’m making.&lt;/p&gt;
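
&lt;p&gt;To make the framing that the reader expects concrete, here is a minimal client-side
sketch inferred only from the &lt;strong&gt;CommandReader&lt;/strong&gt; code above: a line holding * and the
number of lines that follow, then the command name, then one line per argument. The helper
name encode_command is hypothetical (it is not part of the kvs project), and the line
ending assumes the server keeps LineReceiver’s default delimiter.&lt;/p&gt;

```python
def encode_command(name, *args):
    """Frame one command the way the CommandReader above consumes it.

    Hypothetical helper for illustration; not part of the kvs project.
    """
    parts = [name] + [str(arg) for arg in args]
    # '*N' announces how many lines follow: the command name plus its arguments
    lines = ['*%d' % len(parts)] + parts
    # assumes the server keeps LineReceiver's default '\r\n' delimiter
    return ''.join(line + '\r\n' for line in lines)


if __name__ == '__main__':
    # repr() makes the line endings visible
    print(repr(encode_command('SET', 'key-1', 'tiny little value')))
```

&lt;p&gt;Because the count includes the command name, a GET for a single key is announced
as *2 and a SET with a key and a value as *3.&lt;/p&gt;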

&lt;p&gt;&lt;strong&gt;STOP 21:15 on October 20th, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;START 15:39 on October 21st, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At this point I’d like to take some time to analyze how much test code I’ve
written vs how much real product code I’ve written. I’ll do this in the simplest
way possible by just comparing line counts:&lt;/p&gt;

&lt;console&gt;
&amp;gt; wc -l tests/*.py
  55 tests/base.py
   0 tests/__init__.py
  41 tests/performance.py
  45 tests/protocol.py
 141 total
&amp;gt; wc -l kvs/*.py  
  41 kvs/cmdhandler.py
   0 kvs/__init__.py
  91 kvs/kvsserver.py
 132 total
&lt;/console&gt;

&lt;p&gt;So right now we actually have a few more lines of test code than we have of
actual product code. The thing to realize though is that as we add more API
calls to the service, the amount of test code won’t grow by as much as it has up to
this point. Let’s show how this works by adding a few new APIs:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;SHUTDOWN&lt;/li&gt;
  &lt;li&gt;RESET&lt;/li&gt;
  &lt;li&gt;KEYS key_reg_ex&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;SHUTDOWN&lt;/strong&gt; command is used to shut down the server and the
&lt;strong&gt;RESET&lt;/strong&gt; command is used to reset the store back to empty. The &lt;strong&gt;KEYS&lt;/strong&gt; command
is a little trickier since it involves returning all of the keys that match the
regular expression specified. This forces us to introduce a new return type
to the protocol. With that addition (the last entry below), the protocol’s return
specification is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;+&lt;/strong&gt; means the operation succeeded and is followed by OK or the value of 
      what you wanted to return&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;-&lt;/strong&gt; is used before the error message of an operation that failed&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;*&lt;/strong&gt; is used before starting a multi-value response in which the integer
       after the * is the number of lines to read&lt;/li&gt;
&lt;/ul&gt;
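
&lt;p&gt;The three prefixes above are enough to write a small client-side decoder. The sketch
below is an illustration rather than code from the kvs project: the names parse_response
and KvsError are hypothetical, and it assumes that each line following a * header carries
one raw value with no prefix of its own, which the specification above does not spell out.&lt;/p&gt;

```python
class KvsError(Exception):
    """Raised for a '-' response (hypothetical name, not from the kvs project)."""


def parse_response(lines):
    """Decode one response from an iterable of lines with delimiters removed."""
    lines = iter(lines)
    first = next(lines)
    prefix, payload = first[:1], first[1:]

    if prefix == '+':
        # success: the payload is OK or the value that was asked for
        return payload
    if prefix == '-':
        # failure: the payload is the error message
        raise KvsError(payload)
    if prefix == '*':
        # multi-value: the payload is the number of lines still to read;
        # assumes each of those lines is one raw value (e.g. a key)
        return [next(lines) for _ in range(int(payload))]
    raise ValueError('unexpected response line: %r' % first)


if __name__ == '__main__':
    print(parse_response(['+OK']))
    print(parse_response(['*2', 'key-1', 'key-2']))
```

&lt;p&gt;Under that assumption a multi-value reply decodes to a plain Python list, while an
error reply surfaces as an exception that a test can assert on.&lt;/p&gt;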

&lt;p&gt;With the multi-line response we can now implement &lt;strong&gt;KEYS&lt;/strong&gt; correctly. Having
implemented all of those features, we now have slightly more lines of product code
than of test code and have a pretty well working key/value store that is being used
by others while we make changes and easily reverify our code as we make those
changes. Here is the updated comparison of lines of test code vs lines of product code:&lt;/p&gt;

&lt;console&gt;
&amp;gt; wc -l tests/*.py       
  65 tests/base.py
   0 tests/__init__.py
  41 tests/performance.py
  67 tests/protocol.py
 173 total
&amp;gt; wc -l kvs/*.py         
  70 kvs/cmdhandler.py
   0 kvs/__init__.py
 112 kvs/kvsserver.py
 182 total
&lt;/console&gt;

&lt;p&gt;Now I personally don’t care if I have more product code than test code because
to me test code is valuable code that allows me as a developer to actually 
write code that can be used by others and guarantees my code at least does what
I was originally intending.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;STOP 16:46 on October 21st, 2012&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The interesting thing that I’d like to analyze now is roughly how much time was 
spent on this little project till now and of that time how much time was spent
writing tests vs writing product code.&lt;/p&gt;

&lt;p&gt;So the start and stop times tally looks like so:&lt;/p&gt;

&lt;table&gt;
 &lt;tr&gt;
   &lt;th&gt;START&lt;/th&gt;&lt;th&gt;STOP&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Duration (minutes)&lt;/th&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;12:01&lt;/td&gt;&lt;td&gt;12:11&lt;/td&gt;&lt;td&gt;Initial Project Specification&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;14:57&lt;/td&gt;&lt;td&gt;16:32&lt;/td&gt;&lt;td&gt;Prototype Development&lt;/td&gt;&lt;td&gt;95&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;18:59&lt;/td&gt;&lt;td&gt;20:01&lt;/td&gt;&lt;td&gt;Making beta Version Available&lt;/td&gt;&lt;td&gt;62&lt;/td&gt; 
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;20:22&lt;/td&gt;&lt;td&gt;21:15&lt;/td&gt;&lt;td&gt;Restructuring &amp;amp; Performance Testing&lt;/td&gt;&lt;td&gt;53&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;15:39&lt;/td&gt;&lt;td&gt;16:46&lt;/td&gt;&lt;td&gt;Refactoring Code &amp;amp; Adding new APIs&lt;/td&gt;&lt;td&gt;67&lt;/td&gt;
 &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;So after just 4h and 47m of working on this project we have a working
service that can be used by others and we’re able to easily and quickly extend
this service while making sure to test existing and new features before each
and every checkin.&lt;/p&gt;
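
&lt;p&gt;As a quick sanity check on the tally, the durations can be recomputed from the START
and STOP times rather than added up by hand. This is a small sketch that uses only the
clock times from the table; the names SESSIONS and minutes_between are made up for the
example.&lt;/p&gt;

```python
from datetime import datetime

# (start, stop) clock times copied from the table above; each session
# starts and stops on the same day, so no date handling is needed
SESSIONS = [
    ('12:01', '12:11'),
    ('14:57', '16:32'),
    ('18:59', '20:01'),
    ('20:22', '21:15'),
    ('15:39', '16:46'),
]


def minutes_between(start, stop):
    """Return the whole minutes elapsed between two HH:MM clock times."""
    fmt = '%H:%M'
    delta = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


durations = [minutes_between(start, stop) for start, stop in SESSIONS]
hours, minutes = divmod(sum(durations), 60)
print(durations)
print('%dh %02dm' % (hours, minutes))
```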

&lt;p&gt;Now there are a few things I should have also worked on but didn’t feel
would fit within the length of this post. The
few things I would have focused on next include:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Making sure to document the protocol specification with the code in a format
that would allow others to easily and quickly write their own clients. This 
would also make updating the API documentation easier since it resides with 
the code that implements the API.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Adding more tests that verify the limits of the API usage, such as the
max key and value lengths, without forgetting to test all of the negative scenarios
of using the protocol, such as invalid integer values, invalid operation names,
etc.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I hope that after reading this post you’ll see that you can easily apply the 
same ideas to any of your projects and allow yourself to be a more efficient 
engineer and also allow you to produce better code.&lt;/p&gt;

&lt;p&gt;The code written during this writing can be cloned from &lt;a href=&quot;https://github.com/rlgomes/kvs&quot;&gt;here&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The Origin of The Modern Software Engineer</title>
   <link href="http://rlgomes.github.com/work/writings/rant/engineering/personal/2012/10/13/13.38-The-Origin-of-The-Modern-Software-Engineer.html"/>
   <updated>2012-10-13T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/writings/rant/engineering/personal/2012/10/13/13.38-The-Origin-of-The-Modern-Software-Engineer</id>
   <content type="html">&lt;p&gt;In software development these days, what most call “software engineers”, could 
be better defined as code monkeys. I myself was quite a good code monkey back 
in the day and can’t blame most for finding themselves in the same situation.
What I hope to achieve with this writing is to give you an impression of how
we’ve evolved as software engineers over the last 15 years and where this
evolution may lead us to. I hope to also introduce a possible future approach
on how software development should be done and just maybe save some fellow
engineers hours or days of torturous bug hunting.&lt;/p&gt;

&lt;h2 id=&quot;the-stone-age-engineer&quot;&gt;The Stone Age Engineer&lt;/h2&gt;

&lt;p&gt;One of my first jobs never had a QA team or quality engineers to work on
qualifying and verifying products before we put them out the door. The
developers (or stone age engineers) were responsible for taking the features
requested by their bosses, developing code to fulfill those
requirements and testing said code on the devices at hand. The person
writing the code, who had no formal training in quality assurance nor
necessarily any understanding of how to make things more testable, was expected to write code and
make sure it was ready for production. They were simply expected to produce
good quality software in a timely fashion. In this scenario, the daily routine
of the stone age engineer consisted of:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;writing new code that had to be tested manually by said developer or by
another developer with whom they were partnering.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;working on any bugs reported by customers in the code that had been released
into the customers’ hands.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;attempting to verify that, in the code being released, the new features
work correctly and all older features still work as they did when previously
tested by hand.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I found myself many times questioning how well I could test an application like
that by hand and how consistent I was at even reproducing the same steps from
previous “test runs” so that I could at least catch previous issues as well.
The answer is of course what we today call automation, a word that quite a few
people use but very few actually practice correctly. Time for automation was
never scheduled and I had no other choice than to get my code written and
tested by hand in order to make deadlines. The unfortunate side effect of such
a development process is that days or weeks later we had yet again produced bug
3214 for the 5th time in 3 months because we had no way to easily verify that
we were at least not breaking any existing features that were working for
customers.&lt;/p&gt;

&lt;p&gt;This forces developers to context switch a lot between features they’re
currently writing and ones that they wrote quite a while ago. Since the code
that was written a while ago never received the “quality” attention that it
should have, making changes in that code can be difficult and bug-ridden. This
results in very long nights of coding to fix production issues and also makes
your code hard to maintain, as most developers never attempt to refactor code
when they don’t have tests that would give them confidence in making any major
changes to the code base.&lt;/p&gt;

&lt;h2 id=&quot;the-medieval-engineer&quot;&gt;The Medieval Engineer&lt;/h2&gt;

&lt;p&gt;I’d consider the past 5 years to be the medieval times of development, where
developers are now familiar with the idea of writing unit tests and defending
themselves from late night coding battles by having the right “armor” at hand
when they’re developing code. This new confidence in making changes is still a
bit misguided, as some of you may already know, because unit tests leave a few
things untested that developers are prone to breaking and that then need to be
fixed after pushes to the integration environment.&lt;/p&gt;

&lt;p&gt;During my medieval engineer phase I was introduced to the notion of having one
team that develops code while another team is entirely responsible for
qualifying and verifying said code. This didn’t mean that writing unit tests
was no longer required, because of course nothing can replace good unit tests.
It really meant that there was now a team dedicated to all of the other phases
of testing, namely integration testing across services and performance testing
of the services.&lt;/p&gt;

&lt;p&gt;This of course works well when your QA team is a team of engineers that
actually understand what their role is in the software development process. I
say this because in my experience most quality engineers don’t understand that
their job is not to block developers but to help said developers find their
problems earlier and to quickly reverify that changes have not broken any
existing features.&lt;/p&gt;

&lt;p&gt;So in the medieval times of development we have developers who write code and
unit test it and then ship the code to the quality assurance engineers who may
simply do manual black box testing which has to be rerun on each new code push
from developers. Once this code has reached a level of quality that the
testers feel is sufficient for it to be pushed to production or made part of an
upcoming release, the QA team gives the thumbs up. Of course the notion of
automation was well understood by me at this point, and I avoided manual testing
by using the right tools and frameworks to automate it, so that on each new push
of code we would push a button and wait for the results. With this in place the
testers can actually spend time writing new tests for the new features and
helping developers diagnose/debug existing bugs in the code.&lt;/p&gt;

&lt;h2 id=&quot;the-renaissance-engineer&quot;&gt;The Renaissance Engineer&lt;/h2&gt;

&lt;p&gt;We live in what I like to call the age of renaissance engineering, because
we have acquired a lot of knowledge on how to do testing well and we’ve also
learned processes for tracking code development in a way that allows individual
developers to feel like they are making a difference on the team while
management can easily know when to expect things to ship. You may be aware of
some of these processes, which include the likes of Agile, Scrum, Waterfall, etc.
I’m not a fan of taking one of these methods and following it to a tee since
I believe that the process by which many engineers can work together to produce
software depends greatly on the quality of the engineers on that team.&lt;/p&gt;

&lt;p&gt;Frameworks and tools that empower developers to get their job done are
everywhere these days, from tools that analyze your code for bad practices or
memory leaks to frameworks that easily graph the performance behavior of your
REST API. We have also started to understand the notion of testing early, using
continuous integration environments to find problems as soon as possible so
they can be fixed sooner rather than later.&lt;/p&gt;

&lt;p&gt;The renaissance engineer is capable of writing code and tests at the same time
without feeling that they’re “wasting” time. They use well known tools all the
time and experiment with new tools when they hear about them. Some of them even
prefer to write their tests before they write their code (i.e. test driven
development) and find their tests to be more valuable than the code they
write. This of course is only the case for a few Leonardo da Vinci types who
have actually learned how to make themselves better engineers throughout their
career and will continue to do so moving forward.&lt;/p&gt;

&lt;p&gt;In the majority of software development shops though, the renaissance engineer
usually ends up on a team with various other developers at different stages of
their evolution, and sometimes finds that the testing responsibility is
totally in the hands of a QA team that may or may not be doing the right thing.
This really irks the renaissance engineer, who would like to see the QA team do
things better and at the same time wants to make sure that the development team
isn’t a blocker for the QA team, whether by giving them the required testing
hooks or by fixing blocking problems first. The renaissance engineer living
in this day and age is quite frustrated with most of the processes being used
and with how difficult it is to get others to see that if everyone follows a
similar process of writing, testing and pushing code then the whole system
works better.&lt;/p&gt;

&lt;h2 id=&quot;the-modern-age-engineer&quot;&gt;The Modern Age Engineer&lt;/h2&gt;

&lt;p&gt;If we put ourselves in the shoes of a software engineer from the future,
looking back at the various eras of software engineering, we are able to
evaluate from all of the failures and successes which approaches really worked in
the software development process. The question now is how we each interpret
these results and what we feel would put us on a faster path towards becoming a
modern age engineer sooner rather than later in our careers. The following is
my interpretation of what I believe the future of software engineering would
look like.&lt;/p&gt;

&lt;p&gt;In a future with modern age engineers there would be no separation
between developers and testers. There wouldn’t even be such a distinction because
all members of the team are software engineers capable of writing and
understanding code. There would be a single software engineering team
responsible for producing any given product. This team consists of engineers
who write code to implement features in the product and also supply the
tests used to verify that the code does in fact behave as
expected. The team consists of experts in their own areas of interest, each
having expertise in areas such as build &amp;amp; release processes, automation,
distributed systems, etc. Of course engineers would be allowed to do cross team
work if they had frameworks, tools or expertise that other teams in the company
could benefit from. The notion of someone exclusively writing tests would be
dead and instead engineers would be responsible for writing their own tests as
well as writing integration tests with other engineers that work on dependent
services or APIs.&lt;/p&gt;

&lt;p&gt;The following would hold true for the whole organization:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Continuous integration environments are set up and used by everyone to build,
deploy and test their code on a regular basis, and everyone understands how
their code is deployed as well as how their tests are executed against their
code.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Tests are written before the code and the code is written to make those tests
pass, as they are the contract or specification by which the code should
behave.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;These same tests should really serve as documentation on how the code is to
be used by others. If the tests are written using documentation macros,
decorators or annotations, documentation can easily be generated from the
tests themselves.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Everyone uses the same build/deploy and execution scripts, so that building
and deploying someone else’s code is as easy as building and deploying your own
code.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The team uses all of the available tools, such as findbugs, valgrind and
coverity, to make finding problems in code a regular and routine practice, while
also using tools such as pylint or javascript lint that recommend better
practices for writing code in specific languages, so that the code is kept
clean of easy to detect problems.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Code style checkers are enforced at check-in time to verify that everyone
follows the same conventions, be it tabs vs. spaces or whatever magical
concoction makes code easier to read and maintain for all.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Reviewing code is just a natural part of the process of getting your changes
into the trunk because you don’t want to end up having everyone frown at you
when that change breaks something just because you couldn’t wait another hour
for someone to review your changes before you check them in.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Individual unit tests can be marked as performance enabled tests that can
then be used by a performance framework to stress test each unit of
available functionality and come up with performance data within a few hours
of code landing in the trunk.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The frameworks used are all well understood and maintained in languages that
most of the team members want to use for writing code and tests. The
frameworks are part of the code that is written or maintained in order to help
drive the development process and get things ready for pushing to production
quicker. These same tools and frameworks are built, deployed and executed just
like the rest of the code base and are treated the same way production quality
code is treated.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Everyone feels ownership of the whole code base, from the production code to
the build and release scripts and the tests that make it possible for their
production code to be stable and ready for deployment.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
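
&lt;p&gt;To make the idea of performance enabled tests from the list above a bit more
concrete, here is a minimal sketch in Python. The &lt;code&gt;performance&lt;/code&gt; decorator, the
&lt;code&gt;stress&lt;/code&gt; helper and the &lt;code&gt;KeyValueTest&lt;/code&gt; class are hypothetical names made up
for illustration and are not part of any existing framework; only the standard
&lt;code&gt;unittest&lt;/code&gt; and &lt;code&gt;time&lt;/code&gt; modules are used.&lt;/p&gt;

```python
import time
import unittest


def performance(test_func):
    """Hypothetical marker: flag a unit test as reusable for stress runs."""
    test_func.performance_enabled = True
    return test_func


class KeyValueTest(unittest.TestCase):
    """Made-up tests against a plain dict standing in for real functionality."""

    @performance
    def test_set_then_get(self):
        store = {}
        store["key"] = "value"
        self.assertEqual(store["key"], "value")

    def test_missing_key(self):
        # Not marked, so the stress helper below skips it.
        self.assertIsNone({}.get("absent"))


def stress(case_class, repeat=1000):
    """Run every marked test `repeat` times; return total seconds per test name."""
    timings = {}
    for name in unittest.TestLoader().getTestCaseNames(case_class):
        method = getattr(case_class, name)
        if not getattr(method, "performance_enabled", False):
            continue
        case = case_class(name)
        start = time.perf_counter()
        for _ in range(repeat):
            getattr(case, name)()
        timings[name] = time.perf_counter() - start
    return timings


timings = stress(KeyValueTest, repeat=100)
for name, seconds in timings.items():
    print("%s: %.6f seconds for 100 runs" % (name, seconds))
```

&lt;p&gt;The point of the sketch is only that the same test that verifies behavior can
double as a performance probe once it carries a marker; a real performance
framework would also want warm-up runs, percentiles and history.&lt;/p&gt;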

&lt;p&gt;Yes, most of the previous statements may seem like a Utopian view of the
software development process but I see it as a very possible reality which
doesn’t get in the way of software development but does in fact make software
development a well understood, scientific process.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;I started this writing by describing how I feel we’ve evolved in terms of
software engineering and software development processes, working up to the
phase of our evolution that most of us find ourselves in today. I believe we
could all do much better at being software engineers than we currently
do, and I myself am still learning new approaches to software development
every day. I didn’t spend time on the exact tools and frameworks to use,
since I believe these types of things change too often; instead I focused on
identifying the boundaries created within a traditional software development
process.&lt;/p&gt;

&lt;p&gt;One of the main points of my writing is how carefully we distinguish the
developer role from the quality engineer role, and how much of a waste of time
and money that distinction is for companies and for the whole development
process. The real focus should be on how to make all of your engineering power
work together as a cohesive software development group with everyone
understanding how things are built, deployed and executed and all engineers
feeling like they can move around and help others while still being able to
count on others for help when necessary.&lt;/p&gt;

&lt;p&gt;I do hope that after reading this you at least feel that there
is one thing you can work on immediately to make yourself a better software
engineer and possibly another 2-3 things that you can set as short term goals
to get started on at your current position.&lt;/p&gt;

&lt;p&gt;I also hope that you realize that this writing is in no way a precise account
of the situation at all companies but certainly reflects my own experiences and 
the experiences of a few other colleagues in the business and hopefully can 
help others feel they’re not alone or even generate a lot of interesting 
conversations.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Speeding up Python with Numba</title>
   <link href="http://rlgomes.github.com/work/python/numba/speed/jit/2012/08/26/15.24-Speeding-up-Python-with-Numba.html"/>
   <updated>2012-08-26T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/numba/speed/jit/2012/08/26/15.24-Speeding-up-Python-with-Numba</id>
   <content type="html">&lt;p&gt;Just a quick post on numba, a sweet dynamic compiler for Python which is numpy
aware and does JIT compiling to LLVM bit-code to speed up your Python loops by
a couple of orders of magnitude.&lt;/p&gt;

&lt;p&gt;It’s still quite beta software and is under heavy development, but the
direction this project is heading in is extremely promising. Take for example
the following simple code that just sums a large matrix of numbers:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;matrix_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;                                                            
    &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;N&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;                                                            
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;                                                                
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;                                                          
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;                                                      
            &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;                                                  
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;                &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Of course the example is using numpy arrays and in this case we’d generate the 
array by using the numpy module like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;random_matrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numpy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;randn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;That generates a random matrix of 5000x5000 elements and now we can use that
to measure how long our function takes to calculate the sum of all of the 
elements in the matrix:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;time&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;matrix_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'time to use matrix_sum &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The time it takes to run this little sum of elements is:&lt;/p&gt;

&lt;console&gt;
time to use matrix_sum 18.0450868607
&lt;/console&gt;

&lt;p&gt;That’s 18 seconds, which illustrates how slow Python is at running
loops over even a simple data type such as a float.&lt;/p&gt;

&lt;p&gt;Using numba involves decorating the function we want numba to
replace with LLVM bit-code, which you can currently do like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numba&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;double&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numba.decorators&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;jit&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@jit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg_types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;double&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[:,:]])&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;matrix_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;                                                            
    &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;N&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;                                                            
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;                                                                
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;                                                          
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;                                                      
            &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;                                                  
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;                                                               &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;So very simply we tell numba where to apply its magic and what our argument
types are. The developers of numba are already looking into how to use
introspection to figure out the argument and return types on their own, so that
you’d just have to place the @jit decorator on the function you want numba to
JIT compile.&lt;/p&gt;
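
&lt;p&gt;For reference, numba releases that came out after this post was written do infer
the types on their own the first time the function is called. The following is
only a sketch under assumptions: it presumes numpy and a recent numba are
installed, it uses the &lt;code&gt;njit&lt;/code&gt; decorator from those later releases rather than
the &lt;code&gt;arg_types&lt;/code&gt; form shown above, and the &lt;code&gt;ImportError&lt;/code&gt; fallback is my own
addition so the snippet still runs (uncompiled) without numba.&lt;/p&gt;

```python
import numpy as np

try:
    # Assumption: a recent numba release is installed.
    from numba import njit
except ImportError:
    def njit(func):
        """Fallback so the sketch still runs, just without compilation."""
        return func


@njit
def matrix_sum(arr):
    # Same loop as in the post; no type signature is given, the types are
    # inferred from the arguments on the first call.
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i, j]
    return result


# Small, deterministic input so the result is easy to check by hand.
total = matrix_sum(np.ones((200, 300)))
print("sum of a 200x300 matrix of ones: %s" % total)
```

&lt;p&gt;Note that with this style the first call includes the compilation time, so any
timing comparison should measure a second call.&lt;/p&gt;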

&lt;p&gt;With that very small addition our same loop now executes in:&lt;/p&gt;

&lt;console&gt;
time to use jit_matrix_sum 0.0553958415985
&lt;/console&gt;

&lt;p&gt;That is a speedup of roughly 325x without having to make the code harder
to read or use some elaborate, hard to implement and hard to comprehend algorithm.
For kicks I increased the array size by 9x, making it a 15000x15000 array, and
still didn’t need more than half a second to calculate the sum of this new
array with the JIT’ed method:&lt;/p&gt;

&lt;console&gt;
time to use jit_matrix_sum 0.494293928146
&lt;/console&gt;
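
&lt;p&gt;As a quick sanity check on those numbers, the arithmetic can be redone from the
three timings printed above. The variable names are just labels I picked for
this sketch; the values are copied from the console output in this post.&lt;/p&gt;

```python
# Timings in seconds, copied from the console output above.
baseline = 18.0450868607      # pure Python loop, 5000x5000
jit_small = 0.0553958415985   # JIT'ed loop, 5000x5000
jit_large = 0.494293928146    # JIT'ed loop, 15000x15000

# How much faster the JIT'ed version is on the same input.
speedup = baseline / jit_small

# How much the input grew versus how much the run time grew.
element_ratio = (15000 * 15000) / (5000 * 5000)
time_ratio = jit_large / jit_small

print("speedup: %.1fx" % speedup)
print("elements grew %.1fx, time grew %.1fx" % (element_ratio, time_ratio))
```

&lt;p&gt;The run time of the JIT’ed loop grows at about the same rate as the number of
elements, which is what you would expect from a single pass over the array.&lt;/p&gt;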

&lt;p&gt;To understand how this is possible you have to realize that numba is taking
your array oriented code (i.e. the for loops) and using LLVM to compile it into
native code that runs as quickly as possible on your hardware, which has
various elements that are actually designed for array oriented programs, and
this is what numba is taking advantage of.&lt;/p&gt;

&lt;p&gt;I just wanted to note this library for future reference as I believe they’re on
the right path and the ideas in the numba library could be integrated into
Python core and allow for these types of optimizations to be done on all code
paths, boosting the awesomeness of Python.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Using SST and Freshen</title>
   <link href="http://rlgomes.github.com/work/testing/python/cucumber/selenium/web/2012/05/27/12.00-using-sst-and-freshen.html"/>
   <updated>2012-05-27T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/testing/python/cucumber/selenium/web/2012/05/27/12.00-using-sst-and-freshen</id>
   <content type="html">&lt;p&gt;I work in the quality assurance field and I’ve built and used many tools in my
career to accomplish my everyday tasks. I recently had to do some web testing
(i.e. a job for &lt;strong&gt;Selenium&lt;/strong&gt;) and had a few requirements along the lines of not
just having a development language for testing but a testing language that
could be used by people who are not coders (i.e. black box testers or even
designers as they put together mocks for UIs).&lt;/p&gt;

&lt;p&gt;For the web testing aspect I got familiar with &lt;strong&gt;&lt;a href=&quot;http://testutils.org/sst/&quot;&gt;SST&lt;/a&gt;&lt;/strong&gt;
and since it’s written in &lt;strong&gt;Python&lt;/strong&gt; and couldn’t be easier to set up and get
started with, I found it to be a great way to drive &lt;strong&gt;Selenium&lt;/strong&gt; tests. You should
check the site out and you’ll see how quickly and easily you can start writing tests
and have a full blown acceptance suite up and running within an afternoon.&lt;/p&gt;

&lt;p&gt;Now the bigger challenge was what language I would use to allow non coder types
to write tests quickly and effectively and yet be able to automate some of their
work so that their tests could be integrated into the acceptance testing suite
and catch issues before shipping without having tons of black box testing
done. I started digging around and the first big obvious choice was &lt;strong&gt;Cucumber&lt;/strong&gt;,
but I found that it’s written in &lt;strong&gt;Ruby&lt;/strong&gt;, and wanting to stick to &lt;strong&gt;Python&lt;/strong&gt; and
its familiar tool chain I searched around for something that would give me the
same language and be written in &lt;strong&gt;Python&lt;/strong&gt;. What I found was
&lt;strong&gt;&lt;a href=&quot;https://github.com/rlisagor/freshen&quot;&gt;Freshen&lt;/a&gt;&lt;/strong&gt;, a very well done clone of
the &lt;strong&gt;Ruby Cucumber&lt;/strong&gt; language which even has a few special additions to the
&lt;strong&gt;Cucumber&lt;/strong&gt; language that make it more interesting to use for this type
of testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Freshen&lt;/strong&gt; is very easy to extend and what I wanted was to have
&lt;strong&gt;Freshen&lt;/strong&gt; give you a nice and natural way of writing tests that drive
the &lt;strong&gt;SST&lt;/strong&gt; framework, so that a web site can easily be tested by people who are
not strong coders. As a starting point, &lt;strong&gt;Cucumber&lt;/strong&gt; tests look like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cucumber&quot; data-lang=&quot;cucumber&quot;&gt;&lt;span class=&quot;kn&quot;&gt;Scenario&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Divide regular numbers
  &lt;span class=&quot;nf&quot;&gt;Given&lt;/span&gt; I have entered 3 into the calculator
  &lt;span class=&quot;nf&quot;&gt;And&lt;/span&gt; I have entered 2 into the calculator
  &lt;span class=&quot;nf&quot;&gt;When&lt;/span&gt; I press divide
  &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; the result should be 1.5 on the screen&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We need to figure out the exact wording to use for
certain actions through &lt;strong&gt;SST&lt;/strong&gt; so that you as the test writer can
trigger the right events and validate that the right things have happened.
While doing this you do want to “hide” certain aspects of the &lt;strong&gt;HTML&lt;/strong&gt; content
such as the exact path to certain elements on the page (maintaining XPath or
CSS selector expressions is hard enough for developers). So let’s pick the main
&lt;strong&gt;&lt;a href=&quot;http://testutils.org/sst/actions.html&quot;&gt;actions&lt;/a&gt;&lt;/strong&gt; from &lt;strong&gt;SST&lt;/strong&gt; that we’d like
to expose in the &lt;strong&gt;Freshen&lt;/strong&gt; language:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;go_to&lt;/strong&gt; - open a specific page in the currently configured browser.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;go_back&lt;/strong&gt; - hit the back button on the browser.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;click_element&lt;/strong&gt; - click on the various interactable elements on the page.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;assert_title&lt;/strong&gt; - validate the title of the page is correct.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;assert_link&lt;/strong&gt; - validate there is a link to another page on the page.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;assert_text&lt;/strong&gt; - validate that certain textual elements are on the page.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s keep the list to just those for now. As for what we’ll test, I’ll
pick my blog site, which I’d like to make sure is working correctly at any given
moment and can be correctly navigated by any person using it. We could start
defining the language for our tests like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cucumber&quot; data-lang=&quot;cucumber&quot;&gt;&lt;span class=&quot;kd&quot;&gt;Feature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Personal Blog

  &lt;span class=&quot;kn&quot;&gt;Scenario&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Visit the main page
   &lt;span class=&quot;err&quot;&gt;Given I am at http&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//localhost&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;4000&lt;/span&gt;
   &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; I should see the title Rodney's Corner
    &lt;span class=&quot;nf&quot;&gt;And&lt;/span&gt; I should see the link Blog, Archive, About
    &lt;span class=&quot;nf&quot;&gt;And&lt;/span&gt; I should see the headers Blog, Recent Posts, Coding&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;That was a first pass, and from it we can see that we need a few different
ways of validating content on the page. There’s still an issue with the way we
get to the blog by referring to the &lt;strong&gt;url&lt;/strong&gt; directly; I’d really prefer if we
could refer to it by a name (i.e. an alias) that could be maintained in the
&lt;strong&gt;Python&lt;/strong&gt; configuration file and easily modified to point to different testing
environments. To be able to run the &lt;strong&gt;Freshen&lt;/strong&gt; tests you’ll first need to make
yourself a virtualenv (or install on your base system if you prefer) and install
the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;pip install -U sst&lt;/li&gt;
  &lt;li&gt;pip install 'git+git://github.com/rlisagor/freshen.git#egg=freshen'&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once you have that installed we can now start defining the &lt;strong&gt;steps&lt;/strong&gt; &lt;strong&gt;Python&lt;/strong&gt;
module required to start executing our &lt;strong&gt;Freshen&lt;/strong&gt; tests against the blog. We’ll
start by defining the &lt;strong&gt;Given&lt;/strong&gt; statement and the first &lt;strong&gt;Then&lt;/strong&gt; which are the
simplest to begin with:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sst.actions&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;freshen&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Given&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I am at (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;i_am_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;go_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I should see the title (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;should_see_the_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The previous &lt;strong&gt;steps&lt;/strong&gt; implementation allows us to execute a reduced version of
the previous navigation tests, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cucumber&quot; data-lang=&quot;cucumber&quot;&gt;&lt;span class=&quot;kd&quot;&gt;Feature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Personal Blog

  &lt;span class=&quot;kn&quot;&gt;Scenario&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Visit the main page
   &lt;span class=&quot;err&quot;&gt;Given I am at http&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//localhost&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;4000&lt;/span&gt;
   &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; I should see the title Rodney's Corner&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;To run this, use the &lt;strong&gt;nosetests&lt;/strong&gt; command-line tool like so:&lt;/p&gt;

&lt;console&gt;
&amp;gt; nosetests --with-freshen -v
Personal Blog: Visit the main page ... ok

----------------------------------------------------------------------
Ran 1 test in 2.894s

OK
&lt;/console&gt;
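
&lt;p&gt;To make the mechanics behind the &lt;strong&gt;Given&lt;/strong&gt; and &lt;strong&gt;Then&lt;/strong&gt; decorators a little more
concrete, here is a minimal sketch of regex-based step dispatch in plain &lt;strong&gt;Python&lt;/strong&gt;. It is
not how &lt;strong&gt;Freshen&lt;/strong&gt; is actually implemented; the names &lt;strong&gt;STEPS&lt;/strong&gt;, &lt;strong&gt;step&lt;/strong&gt; and
&lt;strong&gt;run_step&lt;/strong&gt; are made up for illustration, and the step bodies return strings instead of
driving a browser:&lt;/p&gt;

```python
import re

# Hypothetical registry of (compiled pattern, handler) pairs.
# This only illustrates the idea; it is not Freshen's real internals.
STEPS = []


def step(pattern):
    """Register the decorated function for step text matching `pattern`."""
    def decorator(func):
        STEPS.append((re.compile(pattern + '$'), func))
        return func
    return decorator


@step('I am at (.*)')
def i_am_at(url):
    # Stand-in for sst.actions.go_to(url)
    return 'went to ' + url


@step('I should see the title (.*)')
def should_see_the_title(title):
    # Stand-in for sst.actions.assert_title(title)
    return 'checked title ' + title


def run_step(text):
    """Drop the leading keyword (Given/When/Then/And) and call the first match."""
    _keyword, _, rest = text.partition(' ')
    for pattern, func in STEPS:
        match = pattern.match(rest)
        if match:
            # Each regex group becomes one positional argument.
            return func(*match.groups())
    raise LookupError('No step definition matches: ' + text)


print(run_step('Given I am at http://localhost:4000'))
print(run_step("Then I should see the title Rodney's Corner"))
```

&lt;p&gt;The point of the sketch is that each capture group in the pattern becomes an argument of
the step function, which is why &lt;strong&gt;i_am_at&lt;/strong&gt; receives the url as a plain string.&lt;/p&gt;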

&lt;p&gt;If you’d like to get the output from &lt;strong&gt;SST&lt;/strong&gt;, just make sure to pass the argument
&lt;strong&gt;--nocapture&lt;/strong&gt; and you’ll get the &lt;strong&gt;stdout&lt;/strong&gt; output. Let’s look into making the
current steps a bit smarter and easier to maintain in the long run. The first
thing is how to replace the exact url with a location alias instead. An easy
way of doing this could be like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sst.actions&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;freshen&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'test'&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'ENV'&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'ENV'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'test'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;'main blog site'&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'http://localhost:4000'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'prod'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;'main blog site'&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'http://rlgomes.github.com'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Unknown environment &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Given&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I am at (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;i_am_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;go_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I should see the title (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;should_see_the_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
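
&lt;p&gt;The environment switch in the listing above could also be expressed as a lookup table,
which keeps every environment’s aliases in one place. This is only a sketch under a few
assumptions: the names &lt;strong&gt;URL_ALIASES&lt;/strong&gt;, &lt;strong&gt;load_aliases&lt;/strong&gt; and &lt;strong&gt;resolve&lt;/strong&gt; are
hypothetical, and the environment mapping is passed in as an argument so the behaviour can be
tried without exporting anything:&lt;/p&gt;

```python
import os

# Hypothetical alias tables, one per environment, mirroring the listing above.
URL_ALIASES = {
    'test': {'main blog site': 'http://localhost:4000'},
    'prod': {'main blog site': 'http://rlgomes.github.com'},
}


def load_aliases(environ=None):
    """Pick the alias table for the ENV variable, defaulting to 'test'."""
    if environ is None:
        environ = os.environ
    env = environ.get('ENV', 'test')
    if env not in URL_ALIASES:
        raise ValueError('Unknown environment %s' % env)
    return URL_ALIASES[env]


def resolve(url, aliases):
    """Return the aliased url if one is defined, otherwise the text unchanged."""
    return aliases.get(url, url)


print(resolve('main blog site', load_aliases({'ENV': 'prod'})))
# prints http://rlgomes.github.com
```

&lt;p&gt;Anything that is not a known alias falls through untouched, so a step can still name a
full url directly.&lt;/p&gt;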

&lt;p&gt;I also incorporated the notion of &lt;strong&gt;test&lt;/strong&gt; and &lt;strong&gt;prod&lt;/strong&gt; environment
configurations, which you can easily drive from the command line by exporting the
&lt;strong&gt;ENV&lt;/strong&gt; variable (it defaults to &lt;strong&gt;test&lt;/strong&gt; when unset). We now need to implement the other steps that are required to
verify links and headers on the page. Here’s a complete solution that handles
the lists of links and headers:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sst.actions&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;freshen&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@After&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;teardown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scenario_ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;close_window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'test'&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'ENV'&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'ENV'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'test'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;'main blog site'&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'http://localhost:4000'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'prod'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;'main blog site'&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'http://rlgomes.github.com'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Unknown environment &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;find_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    Attempt to find an element by using the specified search string to find the
    element in the following order of searching:

        1. by id
        2. by css_class
        3. by text
        4. by text_regex

    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exists_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exists_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;css_class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;css_class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exists_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exists_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text_regex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text_regex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Can't find the element &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;search_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Given&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I am at (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;i_am_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url_alias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;go_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I should see the title (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;should_see_the_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@NamedTransform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'{list}'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'([&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;, ]+)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'([&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;, ]+)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transform_user_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;','&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I should see the links? {list}'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;should_see_the_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;links&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;link&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;links&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;find_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;assert_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Then&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I should see the headers? {list}'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;should_see_the_headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;header&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;aux&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'text()=&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;get_element_by_xpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'//h1[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s] | //h2[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s] | //h3[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s] | //h4[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s]'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; \
                             &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aux&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aux&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aux&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aux&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
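
&lt;p&gt;The &lt;strong&gt;{list}&lt;/strong&gt; placeholder in the listing above is worth a closer look. As a rough
sketch of the idea, using only the standard &lt;strong&gt;re&lt;/strong&gt; module rather than &lt;strong&gt;Freshen&lt;/strong&gt;
itself: the placeholder is swapped for the list regex, the step text is matched, and the captured
text is split into clean names. The helper &lt;strong&gt;match_list_step&lt;/strong&gt; is a made-up name for
this illustration:&lt;/p&gt;

```python
import re

# The same character class used for {list} in the listing above:
# word characters, commas and spaces.
LIST_PATTERN = r'([\w\, ]+)'


def transform_user_list(raw):
    """Split 'Blog, Archive, About' into ['Blog', 'Archive', 'About']."""
    return [name.strip() for name in raw.split(',')]


def match_list_step(step_pattern, step_text):
    """Hypothetical helper: expand {list}, match, and transform the capture."""
    regex = step_pattern.replace('{list}', LIST_PATTERN) + '$'
    match = re.match(regex, step_text)
    if match is None:
        return None
    return transform_user_list(match.group(1))


print(match_list_step('I should see the links? {list}',
                      'I should see the links Blog, Archive, About'))
# prints ['Blog', 'Archive', 'About']
```

&lt;p&gt;Because the pattern says &lt;strong&gt;links?&lt;/strong&gt;, the same step definition accepts both “the link”
and “the links”, so the feature file can read naturally for one item or several.&lt;/p&gt;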

&lt;p&gt;I added a &lt;strong&gt;teardown&lt;/strong&gt; method to close the browser window and stop the browser after
each scenario, so we have a nice clean state when we start the next scenario. I
also created the &lt;strong&gt;find_element&lt;/strong&gt; function, which does a pretty good job of
finding your element by trying a few different methods in order. You could easily
define your own &lt;strong&gt;find_element&lt;/strong&gt; with different rules on how you look for
elements based on the names that are passed. Now, in order for us to write some
more useful tests we actually need to be able to click on those links and move
back and forth through the site. We can do this by using the
&lt;strong&gt;SST&lt;/strong&gt; commands &lt;strong&gt;go_back&lt;/strong&gt; and &lt;strong&gt;click_element&lt;/strong&gt;. Here’s how we’d
integrate those actions into the available &lt;strong&gt;Freshen&lt;/strong&gt; steps:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sst.actions&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;freshen&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@When&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I click back'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;click_back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;go_back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@When&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'I click on (.*)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;click_on&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;find_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;click_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
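
&lt;p&gt;The &lt;strong&gt;click_on&lt;/strong&gt; step leans on &lt;strong&gt;find_element&lt;/strong&gt; to turn a friendly name into an
element. The ordered-fallback idea behind it can be tried without a browser. In this sketch the
&lt;strong&gt;PAGE&lt;/strong&gt; dictionary is a made-up stand-in for a rendered page and
&lt;strong&gt;find_by_first_match&lt;/strong&gt; is a hypothetical name; neither is part of &lt;strong&gt;SST&lt;/strong&gt;:&lt;/p&gt;

```python
# A fake "page" standing in for the browser: for each lookup strategy,
# the names that would be found and a label for the matching element.
PAGE = {
    'id': {'nav': 'element-nav'},
    'css_class': {'post': 'element-post'},
    'text': {'Archive': 'element-archive-link'},
}


def find_by_first_match(search_string, order=('id', 'css_class', 'text')):
    """Try each strategy in turn and return the first element found."""
    for strategy in order:
        found = PAGE[strategy].get(search_string)
        if found is not None:
            return found
    raise LookupError("Can't find the element %s" % search_string)


print(find_by_first_match('Archive'))  # found by text
print(find_by_first_match('nav'))      # found by id
```

&lt;p&gt;Changing the &lt;strong&gt;order&lt;/strong&gt; tuple is all it takes to prefer, say, visible text over ids,
which is the kind of rule change you might want for your own site.&lt;/p&gt;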

&lt;p&gt;We can now write tests like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cucumber&quot; data-lang=&quot;cucumber&quot;&gt;&lt;span class=&quot;kd&quot;&gt;Feature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Personal Blog

  &lt;span class=&quot;kn&quot;&gt;Scenario&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Visit the main page
   &lt;span class=&quot;nf&quot;&gt;Given&lt;/span&gt; I am at main blog site
    &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; I should see the title Rodney's Corner
     &lt;span class=&quot;nf&quot;&gt;And&lt;/span&gt; I should see the links Blog, Archive, About
     &lt;span class=&quot;nf&quot;&gt;And&lt;/span&gt; I should see the headers Blog, Recent Posts, Coding

  &lt;span class=&quot;kn&quot;&gt;Scenario Outline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; Navigate from the main page
   &lt;span class=&quot;nf&quot;&gt;Given&lt;/span&gt; I am at main blog site
    &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; I should see the title Rodney's Corner
   &lt;span class=&quot;nf&quot;&gt;When&lt;/span&gt; I click on &lt;span class=&quot;nv&quot;&gt;&amp;lt;link&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; I should see the links &lt;span class=&quot;nv&quot;&gt;&amp;lt;links_to_validate&amp;gt;&lt;/span&gt;
   &lt;span class=&quot;nf&quot;&gt;When&lt;/span&gt; I click back
    &lt;span class=&quot;nf&quot;&gt;Then&lt;/span&gt; I should see the title Rodney's Corner

&lt;span class=&quot;nn&quot;&gt;Examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;  &lt;span class=&quot;nv&quot;&gt;link&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;  &lt;span class=&quot;nv&quot;&gt;links_to_validate&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;Blog&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Blog,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Archive,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;About&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Archive&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Blog,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Archive,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;About&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;About&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Blog,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Archive,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;About&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I really like how you can define a &lt;strong&gt;Scenario Outline&lt;/strong&gt; and then add entries to
an &lt;strong&gt;ASCII&lt;/strong&gt; table which drives that test with additional data points. With this
I can now write quite an extensive suite of tests for my blog site that would
at least validate that I can move between the various links on my site without
any navigation issues and that the content on the site is showing the right
headers and links in each of the various pages available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Freshen&lt;/strong&gt; also installs a command line tool called &lt;strong&gt;freshen-list&lt;/strong&gt;,
which lists the commands available to your &lt;strong&gt;Freshen&lt;/strong&gt; tests in any
subdirectory that has &lt;strong&gt;Freshen&lt;/strong&gt; steps defined. You can call it like so:&lt;/p&gt;

&lt;console&gt;
&amp;gt; freshen-list tests
GIVEN
  tests/steps.py
    I am at (.*)
WHEN
  tests/steps.py
    I click back
    I click on (.*)
THEN
  tests/steps.py
    I should see the headers? ([\w\, ]+)
    I should see the links? ([\w\, ]+)
    I should see the title (.*)
&lt;/console&gt;

&lt;p&gt;It’s very simple and could use a little tweaking to make it more useful, but it
gives you an immediate sense of what commands are available for writing tests.
The last thing I’ll say about &lt;strong&gt;Freshen&lt;/strong&gt; is that it lets you hide a lot
of the complexity of how you interact with a given system and express those
actions in simple English, which can be maintained by someone who doesn’t have
strong coding skills. Another benefit is that you can always change the driving
language underneath (e.g. switch to Ruby’s Cucumber) without having to change
your tests.&lt;/p&gt;

&lt;p&gt;I hope this write up has helped you get acquainted with &lt;strong&gt;Freshen&lt;/strong&gt; and
learn a bit about &lt;strong&gt;SST&lt;/strong&gt;, which I feel is really great for web UI testing
because it hides a lot of the complexities of working with &lt;strong&gt;Selenium&lt;/strong&gt;.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Prolog and Graphs</title>
   <link href="http://rlgomes.github.com/work/prolog/2012/05/22/19.00-prolog-and-graphs.html"/>
   <updated>2012-05-22T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/prolog/2012/05/22/19.00-prolog-and-graphs</id>
   <content type="html">&lt;p&gt;There are a few things we’ve shown that &lt;strong&gt;Prolog&lt;/strong&gt; can do better than other
languages and now we’re going to show you a data structure that can be very
easily represented in &lt;strong&gt;Prolog&lt;/strong&gt; and for which you can very easily define
traversal methods that do things that in other languages would take hundreds
of lines of code and a lot of testing.&lt;/p&gt;

&lt;p&gt;So let’s say we wanted to represent the following graph:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;  &lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;---&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;---&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;   &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;         &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;    &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;     &lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;---&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;---&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;            &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;             &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;              &lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We could represent the above using the following facts in &lt;strong&gt;Prolog&lt;/strong&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Now all we need is to define a simple predicate that can calculate paths by
composition of existing paths. For example, if X is connected to Z and Z is
connected to Y then there is a path between X and Y, which in &lt;strong&gt;Prolog&lt;/strong&gt; is
very similar to the sentence we just wrote:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With that predicate we can now ask &lt;strong&gt;Prolog&lt;/strong&gt; a few different questions:&lt;/p&gt;

&lt;console&gt;
| ?- [graphs_v1].
...

yes
| ?- path(a,b).

true ?

yes
| ?- path(a,e).

true ?

yes
| ?- path(a,x).

Fatal Error: local stack overflow (size: 8192 Kb, environment variable used: LOCALSZ)
&lt;/console&gt;

&lt;p&gt;Oh shoot, it seems that when we ask for a path that can’t be solved we run out
of stack space. This is because we don’t have a termination rule: the rule can
call itself with a fresh, unbound Z over and over, so &lt;strong&gt;Prolog&lt;/strong&gt; keeps
instantiating variables for Z that it can then match on a longer sequence of
&lt;strong&gt;path&lt;/strong&gt; facts until it runs out of stack space. The easiest way to solve this
is to validate, before we start looking for a Z that connects X and Y, that both
X and Y are &lt;strong&gt;atoms&lt;/strong&gt; and not variables, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;atom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;atom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This now works as desired but presents a small problem which is that we can’t
query &lt;strong&gt;Prolog&lt;/strong&gt; for all of the nodes we can reach from a, like so:&lt;/p&gt;

&lt;console&gt;
| ?- findall(X, path(a,X), List).

List = [b,c]

yes
| ?-
&lt;/console&gt;

&lt;p&gt;We’d really like the answer to be &lt;strong&gt;[b,c,d,e,f,g]&lt;/strong&gt;, and for that we’re going
to have to change our solution so that it doesn’t overflow the stack and can
still calculate all of the reachable nodes for a given node. Part of the problem
is that the predicate that validates a path has the same name as the fact that
represents the edge between two nodes. So we really need to fix this by naming
these two things differently, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Now the &lt;strong&gt;path&lt;/strong&gt; predicate can be defined as finding an edge between X and Y
directly, or finding an edge between X and Z and a path between Z and Y,
something like the following:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The nice thing is that now our predicate behaves beautifully and even avoids
overflowing the stack, but it results in a few duplicate entries when requesting
all the paths from &lt;strong&gt;a&lt;/strong&gt; to all other nodes:&lt;/p&gt;

&lt;console&gt;
| ?- findall(X, path(a,X), List).

List = [b,c,e,d,f,g,d,f,g]

yes
&lt;/console&gt;

&lt;p&gt;Removing the duplicates is quite easy with the &lt;strong&gt;setof/3&lt;/strong&gt; predicate, which
collects solutions like the &lt;strong&gt;findall&lt;/strong&gt; predicate but returns them as a sorted
list without duplicates, like so:&lt;/p&gt;

&lt;console&gt;
| ?- setof(X, path(a,X), List).

List = [b,c,d,e,f,g]

yes
&lt;/console&gt;
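
&lt;p&gt;If &lt;strong&gt;Prolog&lt;/strong&gt; is new to you, here’s a rough &lt;strong&gt;Python&lt;/strong&gt; sketch of what the two
&lt;strong&gt;path&lt;/strong&gt; clauses are doing on the same graph. The names &lt;strong&gt;EDGES&lt;/strong&gt;,
&lt;strong&gt;successors&lt;/strong&gt; and &lt;strong&gt;reachable&lt;/strong&gt; are made up for this illustration, and a
generator only approximates how &lt;strong&gt;Prolog&lt;/strong&gt; backtracks through solutions:&lt;/p&gt;

```python
# Same edges, in the same order, as the edge/2 facts above.
EDGES = [('a', 'b'), ('b', 'e'), ('a', 'c'), ('c', 'd'),
         ('e', 'd'), ('d', 'f'), ('d', 'g')]


def successors(node):
    """Nodes that have a direct edge coming from node."""
    return [dst for src, dst in EDGES if src == node]


def reachable(start):
    """Yield every node reachable from start, duplicates included."""
    # First clause: path(X,Y) :- edge(X,Y).
    for nxt in successors(start):
        yield nxt
    # Second clause: path(X,Y) :- edge(X,Z), path(Z,Y).
    for mid in successors(start):
        yield from reachable(mid)


found = list(reachable('a'))   # roughly what findall collects
unique = sorted(set(found))    # roughly what setof collects

print(found)   # ['b', 'c', 'e', 'd', 'f', 'g', 'd', 'f', 'g']
print(unique)  # ['b', 'c', 'd', 'e', 'f', 'g']
```

&lt;p&gt;The plain list plays the role of &lt;strong&gt;findall&lt;/strong&gt; and the sorted set plays the
role of &lt;strong&gt;setof&lt;/strong&gt;. Like the &lt;strong&gt;Prolog&lt;/strong&gt; version, this sketch only terminates
because the graph has no cycles.&lt;/p&gt;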

&lt;p&gt;Everything so far is working well because the paths are unidirectional and we
don’t have any cycles that would give us trouble. Let’s create a simple
edge from &lt;strong&gt;g&lt;/strong&gt; back to &lt;strong&gt;d&lt;/strong&gt;. Now when we run a simple &lt;strong&gt;path(g,f)&lt;/strong&gt; you’ll
notice it never runs out of solutions, because it keeps identifying a new path
that involves going through &lt;strong&gt;g&lt;/strong&gt; and &lt;strong&gt;d&lt;/strong&gt; again. To
fix this issue we’ll have to use a &lt;strong&gt;visited&lt;/strong&gt; list and keep track of visited
nodes, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[]).&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;member&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
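
&lt;p&gt;For comparison, here’s a minimal &lt;strong&gt;Python&lt;/strong&gt; sketch of the same visited list idea,
using the graph with the extra edge from &lt;strong&gt;g&lt;/strong&gt; back to &lt;strong&gt;d&lt;/strong&gt;. The names
&lt;strong&gt;EDGES&lt;/strong&gt;, &lt;strong&gt;successors&lt;/strong&gt; and &lt;strong&gt;reachable&lt;/strong&gt; are hypothetical and the generator
is only an approximation of &lt;strong&gt;Prolog&lt;/strong&gt;’s backtracking:&lt;/p&gt;

```python
# The original edges plus the edge from g back to d that creates a cycle.
EDGES = [('a', 'b'), ('b', 'e'), ('a', 'c'), ('c', 'd'),
         ('e', 'd'), ('d', 'f'), ('d', 'g'), ('g', 'd')]


def successors(node):
    """Nodes that have a direct edge coming from node."""
    return [dst for src, dst in EDGES if src == node]


def reachable(start, visited=()):
    """Yield nodes reachable from start without expanding a node twice."""
    # path(X, Y, _) :- edge(X,Y).
    for nxt in successors(start):
        yield nxt
    # path(X, Y, V) :- \+ member(X, V), edge(X, Z), path(Z, Y, [X|V]).
    if start not in visited:
        for mid in successors(start):
            yield from reachable(mid, visited + (start,))


from_g = list(reachable('g'))
print(from_g)               # ['d', 'f', 'g', 'd']
print(sorted(set(from_g)))  # ['d', 'f', 'g']
```

&lt;p&gt;Without the &lt;strong&gt;visited&lt;/strong&gt; check the generator would keep going around the cycle
between &lt;strong&gt;g&lt;/strong&gt; and &lt;strong&gt;d&lt;/strong&gt; until &lt;strong&gt;Python&lt;/strong&gt; hit its recursion limit.&lt;/p&gt;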

&lt;p&gt;The previous solution now handles cycles just fine and can still be used with
the &lt;strong&gt;findall&lt;/strong&gt; and &lt;strong&gt;setof&lt;/strong&gt; predicates. Let’s complicate the graph by adding
weights to each of the connections and then having &lt;strong&gt;Prolog&lt;/strong&gt; calculate which
is the quickest path between two nodes. We’ll first show how to represent the
weights in the graph, and then how to find a path between two nodes and
calculate the weight of traveling that path as well as the path traveled.
Let’s start with representing weights on the edges between the nodes, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;  &lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;   &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;         &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;   &lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;        &lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;    &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;     &lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;            &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;            &lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;             &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;              &lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;
&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I’ve also represented our test graph using an ASCII format that can be easily
understood with the weights of traveling those paths. If we now want to calculate
all of the possible paths and then evaluate which is the best path then we need
to think of solving two problems separately. First is how do we find all the
paths between node &lt;strong&gt;X&lt;/strong&gt; and node &lt;strong&gt;Y&lt;/strong&gt; and the second is how do we pick the
minimal path. The first question we’ll solve using the predicate &lt;strong&gt;findapath&lt;/strong&gt;
which can be expressed as such:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;findapath between &lt;strong&gt;X&lt;/strong&gt; and &lt;strong&gt;Y&lt;/strong&gt; has weight &lt;strong&gt;W&lt;/strong&gt; if there is an &lt;strong&gt;edge&lt;/strong&gt;
between &lt;strong&gt;X&lt;/strong&gt; and &lt;strong&gt;Y&lt;/strong&gt; of weight &lt;strong&gt;W&lt;/strong&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;else findapath between &lt;strong&gt;X&lt;/strong&gt; and &lt;strong&gt;Y&lt;/strong&gt; of weight &lt;strong&gt;W&lt;/strong&gt; is true if
there is an &lt;strong&gt;edge&lt;/strong&gt; between &lt;strong&gt;X&lt;/strong&gt; and &lt;strong&gt;Z&lt;/strong&gt; of weight &lt;strong&gt;W1&lt;/strong&gt; and there is a &lt;strong&gt;findapath&lt;/strong&gt;
between &lt;strong&gt;Z&lt;/strong&gt; and &lt;strong&gt;Y&lt;/strong&gt; of weight &lt;strong&gt;W2&lt;/strong&gt; where &lt;strong&gt;W&lt;/strong&gt; is &lt;strong&gt;W1&lt;/strong&gt; + &lt;strong&gt;W2&lt;/strong&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt; the above is missing a check that we haven’t already visited &lt;strong&gt;X&lt;/strong&gt;
while doing subsequent matches on the 2nd rule of the &lt;strong&gt;findapath&lt;/strong&gt; predicate,
which we need in order to avoid running forever on a cycle.&lt;/p&gt;
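
&lt;p&gt;To make the two rules and the visited check a bit more concrete, here’s a rough
&lt;strong&gt;Python&lt;/strong&gt; sketch that runs them against the weighted graph. The names
&lt;strong&gt;WEIGHTED_EDGES&lt;/strong&gt; and &lt;strong&gt;find_paths&lt;/strong&gt; are made up for this illustration, and
picking the smallest result with &lt;strong&gt;min&lt;/strong&gt; is just one simple way to choose the
minimal path:&lt;/p&gt;

```python
# Same weighted edges as the edge/3 facts above: (from, to, weight).
WEIGHTED_EDGES = [('a', 'b', 1), ('b', 'e', 2), ('a', 'c', 3), ('c', 'd', 2),
                  ('e', 'd', 3), ('d', 'f', 2), ('d', 'g', 3)]


def find_paths(start, goal, visited=()):
    """Yield (weight, path) for every path found from start to goal."""
    # Rule 1: a direct edge from start to goal of weight W.
    for src, dst, weight in WEIGHTED_EDGES:
        if src == start and dst == goal:
            yield weight, [start, goal]
    # Visited check from the note: never expand the same node twice.
    if start in visited:
        return
    # Rule 2: an edge of weight W1 to some Z, then a path from Z to goal
    # of weight W2, giving a total weight of W1 + W2.
    for src, mid, w1 in WEIGHTED_EDGES:
        if src == start:
            for w2, rest in find_paths(mid, goal, visited + (start,)):
                yield w1 + w2, [start] + rest


paths = list(find_paths('a', 'f'))
print(paths)       # [(8, ['a', 'b', 'e', 'd', 'f']), (7, ['a', 'c', 'd', 'f'])]
print(min(paths))  # (7, ['a', 'c', 'd', 'f'])
```

&lt;p&gt;Each result pairs the total weight with the list of nodes traveled, so comparing
the tuples compares the weights first.&lt;/p&gt;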

&lt;p&gt;With the above rules our &lt;strong&gt;findapath&lt;/strong&gt; predicate may look like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;findapath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;findapath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;member&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                                 &lt;span class=&quot;ss&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                                 &lt;span class=&quot;ss&quot;&gt;findapath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
                                 &lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above predicate is a bit more complicated since we’re calculating the weight
and also tracking the exact path we followed until we find the route all the way
to the &lt;strong&gt;Y&lt;/strong&gt; node. The last argument is the list of nodes already visited, which
starts out empty. With the above predicate you can query for, let’s say, the path
between &lt;strong&gt;a&lt;/strong&gt; and &lt;strong&gt;g&lt;/strong&gt;, and if you ask &lt;strong&gt;Prolog&lt;/strong&gt; to give you other solutions
with the &lt;strong&gt;;&lt;/strong&gt; character you can get the following:&lt;/p&gt;

&lt;console&gt;
| ?- findapath(a,g,Weight,Path,[]).

Path = [a,b,e,d,g]
Weight = 9 ? ;

Path = [a,c,d,g]
Weight = 8 ? ;

no
&lt;/console&gt;

&lt;p&gt;So &lt;strong&gt;Prolog&lt;/strong&gt; is capable of finding the two paths that lead from &lt;strong&gt;a&lt;/strong&gt; to &lt;strong&gt;g&lt;/strong&gt;
and of telling us that there are no other possible paths aside from the two already
calculated.&lt;/p&gt;

&lt;p&gt;We now need to write a predicate that follows a design pattern similar to that of the
&lt;strong&gt;findall&lt;/strong&gt; and &lt;strong&gt;setof&lt;/strong&gt; predicates we’ve already used: we attempt to find
all the solutions for a given &lt;strong&gt;Goal&lt;/strong&gt;, but in the process we compare each new
solution with the best one so far and keep only the better of the two to return at the end. The design
pattern used for writing a &lt;strong&gt;findall&lt;/strong&gt; is something similar to the following:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;findall&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Goal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Xlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Goal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;assertz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;fail&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;findall&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Xlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;assertz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;bottom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
                        &lt;span class=&quot;ss&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Xlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;% Pop the queued facts in insertion order and stop at the bottom marker.&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;L&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;retract&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;!,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;L&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;bottom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;!.&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This is a slightly modified version of what you may find online. I’ve avoided
using the &lt;strong&gt;;&lt;/strong&gt; operator (disjunction, which reads like an &lt;strong&gt;else&lt;/strong&gt;); it condenses the
writing of multiple rules but makes the code much harder to read. So the &lt;strong&gt;findall&lt;/strong&gt;
predicate above can be read as follows:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;For the goal &lt;strong&gt;Goal&lt;/strong&gt;, we first call &lt;strong&gt;Goal&lt;/strong&gt;, which instantiates the
term &lt;strong&gt;X&lt;/strong&gt; with a value; we then &lt;strong&gt;assertz&lt;/strong&gt; that value into the &lt;strong&gt;Prolog&lt;/strong&gt; database
and &lt;strong&gt;fail&lt;/strong&gt;, so that &lt;strong&gt;Prolog&lt;/strong&gt; will backtrack and find other solutions
for our &lt;strong&gt;Goal&lt;/strong&gt; (which in turn get asserted into the &lt;strong&gt;Prolog&lt;/strong&gt; database).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Once we’ve finished satisfying all of the possible solutions for our &lt;strong&gt;Goal&lt;/strong&gt;
we’ll assert one last element into the &lt;strong&gt;Prolog&lt;/strong&gt; database that will allow us to
&lt;strong&gt;collect&lt;/strong&gt; the various results with the &lt;strong&gt;collect&lt;/strong&gt; predicate.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;collect&lt;/strong&gt; predicate is also quite involved, but you just need to understand
one simple thing about how it works: it retracts facts from the &lt;strong&gt;Prolog&lt;/strong&gt;
database in order, from the very first &lt;strong&gt;assertz&lt;/strong&gt; to the last fact that we
asserted, &lt;strong&gt;queue(bottom)&lt;/strong&gt;, at which point it’s done collecting all of the
results required to form the resulting list.&lt;/p&gt;
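
&lt;p&gt;To see the same pattern from the caller’s side, here is a minimal sketch that uses the
built-in &lt;strong&gt;findall/3&lt;/strong&gt; to gather every solution of &lt;strong&gt;findapath&lt;/strong&gt; as a
&lt;strong&gt;Weight-Path&lt;/strong&gt; pair. The helper name &lt;strong&gt;allpaths/3&lt;/strong&gt; is hypothetical, and the
sketch assumes the &lt;strong&gt;edge/3&lt;/strong&gt; facts and the &lt;strong&gt;findapath&lt;/strong&gt; predicate from above
are already loaded:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% allpaths/3 is a hypothetical helper, not part of the original code.
% It assumes edge/3 and findapath/5 from above are loaded.
% Solutions is the list of Weight-Path pairs, in the order Prolog finds them.
allpaths(X, Y, Solutions) :-
    findall(W-P, findapath(X, Y, W, P, []), Solutions).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With the two paths found earlier, a query such as &lt;strong&gt;allpaths(a,g,S)&lt;/strong&gt; should bind
&lt;strong&gt;S&lt;/strong&gt; to a list with one pair per path. Every solution is held in the list at once,
which is fine for a small graph but costs memory on a large one.&lt;/p&gt;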

&lt;p&gt;What we’re trying to write is a similar predicate, but on every newly
found solution we want to compare it with the best solution so far and immediately decide which
one to keep in the &lt;strong&gt;Prolog&lt;/strong&gt; database, so that we only keep track of one solution
at any given time while the predicate runs (which is very memory efficient).
Given the definition of the &lt;strong&gt;findall&lt;/strong&gt; predicate it isn’t that hard
to get to a solution like so for our &lt;strong&gt;findminpath&lt;/strong&gt; predicate:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;dynamic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;findminpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;findapath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;assertz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
                           &lt;span class=&quot;p&quot;&gt;!,&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;findminpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;findminpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;findapath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                           &lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;retract&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;asserta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;
                           &lt;span class=&quot;ss&quot;&gt;fail&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;findminpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;retract&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;solution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This is without a doubt one of the most complex predicates we’ve written to date,
so don’t be too worried if you don’t get it at first glance. The first
rule is there just to populate the &lt;strong&gt;Prolog&lt;/strong&gt; database with the first solution.
The second rule then gathers the remaining solutions and compares each one with the stored one,
so that the best solution is always left in the &lt;strong&gt;Prolog&lt;/strong&gt; database. The
last rule queries that solution, retracts it and returns it to the user requesting the
solution.&lt;/p&gt;

&lt;p&gt;You could have written this with two rules by having the current 2nd rule create
a dummy solution like &lt;strong&gt;assertz(solution(100000,[]))&lt;/strong&gt;, which would
immediately lose to the first found path. That would be a
lame solution, though, prone to issues such as returning an empty path when there is no
path between the two specified nodes, or breaking when the sum of an existing
path is more than the magical &lt;strong&gt;100000&lt;/strong&gt; that you thought was a big enough weight
that you would never exceed.&lt;/p&gt;

&lt;p&gt;You can now use your newly created &lt;strong&gt;findminpath&lt;/strong&gt; to easily calculate the best
path between &lt;strong&gt;a&lt;/strong&gt; and &lt;strong&gt;g&lt;/strong&gt; and you’d get the expected result of:&lt;/p&gt;

&lt;console&gt;
| ?- findminpath(a,g,W,P).

P = [a,c,d,g]
W = 8
&lt;/console&gt;
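
&lt;p&gt;For comparison, here is a shorter sketch of the same query built on the built-in
&lt;strong&gt;setof/3&lt;/strong&gt;. This is a different technique from the one above, not a rewrite of it:
&lt;strong&gt;setof&lt;/strong&gt; collects all the &lt;strong&gt;Weight-Path&lt;/strong&gt; pairs and sorts them, so the first
element of the result is a path of minimum weight. The name &lt;strong&gt;findminpath_setof/4&lt;/strong&gt;
is hypothetical, and the sketch assumes &lt;strong&gt;findapath&lt;/strong&gt; and the &lt;strong&gt;edge/3&lt;/strong&gt; facts
are loaded and that it is called with &lt;strong&gt;X&lt;/strong&gt; and &lt;strong&gt;Y&lt;/strong&gt; bound:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% findminpath_setof/4 is a hypothetical name, not part of the original code.
% setof/3 sorts the Weight-Path pairs in standard order, so the pair with
% the smallest weight comes first. It fails when there is no path at all.
findminpath_setof(X, Y, W, P) :-
    setof(W0-P0, findapath(X, Y, W0, P0, []), [W-P|_]).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The trade-off is memory: this version holds every path in a list before picking one,
while &lt;strong&gt;findminpath&lt;/strong&gt; only ever keeps a single solution in the database.&lt;/p&gt;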

&lt;p&gt;This post has covered quite a few things, but I hope the one thing you take
away from it is that &lt;strong&gt;Prolog&lt;/strong&gt; makes it extremely easy to express certain
data structures and to write certain functions needed in everyday coding
that would otherwise take days of careful writing and testing to
come up with.&lt;/p&gt;

&lt;p&gt;An interesting thing I came across that was written using &lt;strong&gt;Prolog&lt;/strong&gt; is an
article on how &lt;strong&gt;IBM&lt;/strong&gt; used &lt;strong&gt;Prolog&lt;/strong&gt; to do natural language processing for
their &lt;strong&gt;Jeopardy&lt;/strong&gt;-winning computer program called &lt;strong&gt;Watson&lt;/strong&gt;; you can read more
&lt;a href=&quot;http://www.cs.nmsu.edu/ALP/2011/03/natural-language-processing-with-prolog-in-the-ibm-watson-system/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Taming Prolog</title>
   <link href="http://rlgomes.github.com/work/prolog/2012/05/20/17.00-taming-Prolog.html"/>
   <updated>2012-05-20T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/prolog/2012/05/20/17.00-taming-Prolog</id>
   <content type="html">&lt;p&gt;In this post I’d really like to go over how &lt;strong&gt;Prolog&lt;/strong&gt; executes your predicates
and how you can control how this execution is handled. Let’s start with another
predicate whose flow we can trace to see how &lt;strong&gt;Prolog&lt;/strong&gt; executes a predicate
with multiple rules. Let’s pick a simple predicate that, given
&lt;strong&gt;Term1, Term2 and Term3&lt;/strong&gt;, validates that &lt;strong&gt;Term3&lt;/strong&gt; is the minimum of the first two.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;lt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Now when we trace the execution of, let’s say, &lt;strong&gt;minimum(1,2,X)&lt;/strong&gt; we are expecting
the answer &lt;strong&gt;X=1&lt;/strong&gt; and then we’d be done, but here is what we get when
tracing and asking the system for all the available solutions:&lt;/p&gt;

&lt;console&gt;
| ?- minimum(1,2,X).
      1    1  Call: minimum(1,2,_16) ?
      2    2  Call: 1=&amp;lt;2 ?
      2    2  Exit: 1=&amp;lt;2 ?
      1    1  Exit: minimum(1,2,1) ?

X = 1 ? ;
      1    1  Redo: minimum(1,2,1) ?
      2    2  Call: 1&amp;gt;2 ?
      2    2  Fail: 1&amp;gt;2 ?
      1    1  Fail: minimum(1,2,_16) ?

no
&lt;/console&gt;

&lt;p&gt;We know for a fact that as soon as you validate that 1 is the minimum of those
terms there is no reason to &lt;strong&gt;backtrack&lt;/strong&gt; and try the other rule, since it
would logically be false and could not produce any other valid
solution. In &lt;strong&gt;Prolog&lt;/strong&gt; you are able to tell the engine to stop &lt;strong&gt;backtracking&lt;/strong&gt;
by using a &lt;strong&gt;cut&lt;/strong&gt;, which is the operator &lt;strong&gt;!&lt;/strong&gt;, and for the previous
predicate it would be used like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;lt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;!.&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;minimum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Now if you trace this you’ll see that as soon as &lt;strong&gt;Prolog&lt;/strong&gt; reaches the &lt;strong&gt;cut&lt;/strong&gt;
it will stop searching for other solutions, like so:&lt;/p&gt;

&lt;console&gt;
| ?- minimum(1,2,X).
      1    1  Call: minimum(1,2,_17) ?
      2    2  Call: 1=&amp;lt;2 ?
      2    2  Exit: 1=&amp;lt;2 ?
      1    1  Exit: minimum(1,2,1) ?

X = 1

yes
&lt;/console&gt;

&lt;p&gt;This may not seem like a big deal right now, but as we show more complicated
predicates you’ll find that this could mean the difference between executing a
rule in linear time vs exponential time, due to the fact that &lt;strong&gt;Prolog&lt;/strong&gt; attempts
to exhaust all possible solutions for a given predicate.&lt;/p&gt;

&lt;p&gt;You’ll find that cut can be hard to understand but as you trace your code and
see situations where &lt;strong&gt;Prolog&lt;/strong&gt; is wasting time by doing backtracking and
attempting to match on other rules that would never work you’ll realize that a
simple placement of the appropriate &lt;strong&gt;cut&lt;/strong&gt; can speed up the execution of your
&lt;strong&gt;Prolog&lt;/strong&gt; predicates.&lt;/p&gt;
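
&lt;p&gt;One thing to watch for when placing a &lt;strong&gt;cut&lt;/strong&gt;: it is tempting to drop the guard from
the second rule and let the &lt;strong&gt;cut&lt;/strong&gt; do all the work. The sketch below, using the
hypothetical name &lt;strong&gt;minimum_red/3&lt;/strong&gt;, shows why the version above keeps the guard:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% minimum_red/3 is a hypothetical variant, shown only to illustrate a pitfall.
% The guard X &amp;gt; Y was dropped from the second rule, relying on the cut alone.
minimum_red(X, Y, X) :- X =&amp;lt; Y, !.
minimum_red(_, Y, Y).

% minimum_red(1, 2, M) still answers M = 1 and stops there.
% minimum_red(1, 2, 2) wrongly succeeds: the head of the first rule does not
% unify (the third argument would have to be 1), so the cut is never reached
% and the unguarded second rule matches.&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Keeping the &lt;strong&gt;X &amp;gt; Y&lt;/strong&gt; guard, as the &lt;strong&gt;minimum&lt;/strong&gt; predicate above does, means the
&lt;strong&gt;cut&lt;/strong&gt; only saves work and never changes which answers are valid.&lt;/p&gt;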

&lt;p&gt;Let’s try to cover another interesting feature in &lt;strong&gt;Prolog&lt;/strong&gt;, which is the ability
to add new clauses to your running database or remove existing ones. This
means you can now make your predicates dynamic and really get some interesting
things done in &lt;strong&gt;Prolog&lt;/strong&gt;. The 2 predicates used are &lt;strong&gt;asserta/1&lt;/strong&gt; and
&lt;strong&gt;retract/1&lt;/strong&gt;, which respectively add and remove the specified term from the
&lt;strong&gt;Prolog&lt;/strong&gt; database. Let’s look at how you’d define the Fibonacci function in
&lt;strong&gt;Prolog&lt;/strong&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;nv&quot;&gt;N2&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;F1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;F2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Now to execute the above with large numbers you may need to tweak your global
and local stack to something higher. The simplest way is by setting the
environment variables: LOCALSZ and GLOBALSZ to something higher like so:&lt;/p&gt;

&lt;console&gt;
export LOCALSZ=131072
export GLOBALSZ=131072
&lt;/console&gt;

&lt;p&gt;With the above settings, if you start executing our Fibonacci example with
numbers from 1 to 25 you’ll notice that the execution time starts to increase
quite quickly, and the stack size would need to be increased further to
be able to calculate the Fibonacci of 100. The good thing is that we just
learned about &lt;strong&gt;asserta/1&lt;/strong&gt; and can use the &lt;strong&gt;Prolog&lt;/strong&gt; database to
&lt;strong&gt;memoize&lt;/strong&gt; previous results. This is what the new solution could look like, and
this solution can now calculate Fibonacci of 100 without any issues:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;dynamic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;nv&quot;&gt;N2&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;F1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;F2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;asserta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;fib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The &lt;strong&gt;dynamic&lt;/strong&gt; predicate call is required in order to tell &lt;strong&gt;Prolog&lt;/strong&gt; that the
&lt;strong&gt;fib/2&lt;/strong&gt; predicate can be dynamically modified. The other small change was to
basically &lt;strong&gt;memoize&lt;/strong&gt; the &lt;strong&gt;fib(N,F)&lt;/strong&gt; so it wouldn’t be recalculated on every
subsequent call.&lt;/p&gt;

&lt;p&gt;There’s a lot you can do with &lt;strong&gt;cut&lt;/strong&gt; and being able to &lt;strong&gt;assert&lt;/strong&gt; and
&lt;strong&gt;retract&lt;/strong&gt; from the &lt;strong&gt;Prolog&lt;/strong&gt; database and I believe you should go and
experiment with these newly found features of &lt;strong&gt;Prolog&lt;/strong&gt; so you can learn
better how they work.&lt;/p&gt;
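
&lt;p&gt;As a starting point for that experimentation, here is a minimal sketch of
&lt;strong&gt;retract&lt;/strong&gt; and &lt;strong&gt;asserta&lt;/strong&gt; working together on a dynamic predicate. The names
&lt;strong&gt;counter/1&lt;/strong&gt; and &lt;strong&gt;increment/0&lt;/strong&gt; are hypothetical and only serve the illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% counter/1 and increment/0 are hypothetical names used for illustration.
:- dynamic(counter/1).

counter(0).

% Remove the stored value, add one to it and store the new value.
increment :-
    retract(counter(N)),
    N1 is N + 1,
    asserta(counter(N1)).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;After loading this, calling &lt;strong&gt;increment&lt;/strong&gt; twice and then querying &lt;strong&gt;counter(X)&lt;/strong&gt;
should bind X to 2.&lt;/p&gt;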
</content>
 </entry>
 
 <entry>
   <title>Learning more Prolog</title>
   <link href="http://rlgomes.github.com/work/prolog/2012/05/17/14.00-learning-more-Prolog.html"/>
   <updated>2012-05-17T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/prolog/2012/05/17/14.00-learning-more-Prolog</id>
   <content type="html">&lt;p&gt;In this post we’ll cover the built-in functions that Prolog has that are useful
and also get into more advanced topics such as using &lt;strong&gt;assert&lt;/strong&gt; and &lt;strong&gt;retract&lt;/strong&gt;
methods and talk a bit about how &lt;strong&gt;Prolog&lt;/strong&gt; executes the code you write and how
to trace through this execution to better understand how &lt;strong&gt;Prolog&lt;/strong&gt; works.&lt;/p&gt;

&lt;p&gt;Let’s start by listing a few of the most important built-in functions that you’ll
find yourself using on a regular basis:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Type Testing&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;var/1&lt;/strong&gt; - succeeds if term is currently not instantiated.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;nonvar/1&lt;/strong&gt; - succeeds if term is instantiated.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;atom/1&lt;/strong&gt; - succeeds if term is an atom.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;integer/1&lt;/strong&gt;, &lt;strong&gt;float/1&lt;/strong&gt;, etc.&lt;/li&gt;
&lt;/ul&gt;
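
&lt;p&gt;To get a feel for the type testing predicates, here is a small sketch that
strings a few of them together. The predicate name &lt;strong&gt;type_demo/0&lt;/strong&gt; is made up for
this example, and every goal in its body is expected to succeed:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% type_demo/0 is a hypothetical name used only for illustration.
type_demo :-
    var(_Unbound),        % a fresh variable is not instantiated
    X = 1, nonvar(X),     % X was bound by the unification
    atom(rodney),         % a lower case name is an atom
    \+ atom(42),          % a number is not an atom
    integer(42),
    float(3.14).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Querying &lt;strong&gt;type_demo.&lt;/strong&gt; after loading the file should answer &lt;strong&gt;yes&lt;/strong&gt;.&lt;/p&gt;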

&lt;p&gt;&lt;strong&gt;Term Unification&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;=/2&lt;/strong&gt; (unification) - tests if Term1 and Term2 can be unified,
                        example: A is 2 + 1, A = 3.  % will succeed.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;\=/2&lt;/strong&gt; (not unifiable) - tests that Term1 and Term2 can not be unified,
                           example: 2 + 1 \= 3.  % will succeed because 2 + 1 and 3 do not unify.&lt;/li&gt;
&lt;/ul&gt;
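
&lt;p&gt;The difference between evaluating and unifying is easy to trip over, so here
is a hedged sketch that puts both side by side. The not unifiable operator is
written &lt;strong&gt;\=&lt;/strong&gt;, and the names &lt;strong&gt;unify_demo/0&lt;/strong&gt; and &lt;strong&gt;point/2&lt;/strong&gt; are made up for this example:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% unify_demo/0 and point/2 are hypothetical names used for illustration.
unify_demo :-
    A is 2 + 1, A = 3,            % is/2 evaluates, so A is the integer 3
    2 + 1 \= 3,                   % the term 2 + 1 is not evaluated by \=
    point(X, 2) = point(1, Y),    % unification binds in both directions
    X == 1,
    Y == 2.&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Every goal in the body is expected to succeed, so &lt;strong&gt;unify_demo.&lt;/strong&gt; should answer
&lt;strong&gt;yes&lt;/strong&gt;.&lt;/p&gt;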

&lt;p&gt;&lt;strong&gt;Term Comparison&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;==/2&lt;/strong&gt; (equals) - straight up equality of the terms being compared.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;\==/2&lt;/strong&gt; (not equals) - straight up inequality of the terms being compared.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;@&amp;lt;/2&lt;/strong&gt; (less than), etc - the various other comparison operators.&lt;/li&gt;
&lt;/ul&gt;
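
&lt;p&gt;Term comparison does not bind anything, which is what sets it apart from
unification. A small sketch of that difference, with the not equals operator
written &lt;strong&gt;\==&lt;/strong&gt; and &lt;strong&gt;compare_demo/0&lt;/strong&gt; being a made up name:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% compare_demo/0 is a hypothetical name used only for illustration.
compare_demo :-
    a == a,
    a \== b,
    X \== Y,      % two distinct unbound variables are not identical
    X = Y,        % although they do unify
    X == Y.       % and once unified they are identical&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;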

&lt;p&gt;&lt;strong&gt;Lists&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;append/3&lt;/strong&gt; - when it succeeds you’ll have appended the first two terms to
               each other and the resulting list will be contained in the
               third term.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;member/2&lt;/strong&gt; - succeeds if term1 is found within the list that term2 is
               referencing.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;delete/3&lt;/strong&gt;, &lt;strong&gt;permutation/2&lt;/strong&gt;, &lt;strong&gt;sublist/2&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
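
&lt;p&gt;As a quick illustration of the two list predicates described above, the
following sketch only uses &lt;strong&gt;append/3&lt;/strong&gt; and &lt;strong&gt;member/2&lt;/strong&gt;. The name &lt;strong&gt;list_demo/0&lt;/strong&gt; is
made up, and each goal in the body is expected to succeed:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% list_demo/0 is a hypothetical name used only for illustration.
list_demo :-
    append([1,2], [3], L),      % L is bound to the appended list
    L == [1,2,3],
    member(2, L),               % 2 is found within L
    \+ member(4, L).            % 4 is not&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;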

&lt;p&gt;Above we’ve also introduced the predicate notation used to indicate how many
arguments the predicate has which is &lt;strong&gt;/n&lt;/strong&gt; where n is the number of terms that the
predicate requires.&lt;/p&gt;

&lt;p&gt;You can always look up more of the predicates that are available in default
installations on your own time. The interesting thing now is going to be showing how
powerful some of these harmless looking predicates really are. Let’s start with the &lt;strong&gt;append/3&lt;/strong&gt;
predicate and actually first show how this predicate is implemented:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([],&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Ys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Ys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Ys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Zs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Ys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Zs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above is already built in so don’t try loading a file with that, otherwise
you’ll be greeted with a message stating “native code procedure append/3 cannot
be redefined (ignored)”. But let’s use the predicate to show how easy it is to
append lists to each other, like the following shows:&lt;/p&gt;

&lt;console&gt;
&amp;gt; prolog
GNU Prolog 1.3.0
By Daniel Diaz
Copyright (C) 1999-2007 Daniel Diaz
| ?- append([1,2,3],[4,5,6],L).

L = [1,2,3,4,5,6]

yes
&lt;/console&gt;

&lt;p&gt;That works as expected and gives us the response we’re expecting. Here is where
we can show one of the most powerful features of &lt;strong&gt;Prolog&lt;/strong&gt; and that is the
ability to do inference of your predicates in order to logically fulfill them
and do things that in other languages would require quite a lot of code writing.&lt;/p&gt;

&lt;p&gt;Before we show how inference works lets instead do a trace of the previous usage
of &lt;strong&gt;append&lt;/strong&gt; so we can better understand how inference works. So to trace a
&lt;strong&gt;Prolog&lt;/strong&gt; execution in &lt;strong&gt;GNU Prolog&lt;/strong&gt; once the interpreter is up hit &lt;strong&gt;Ctrl+C&lt;/strong&gt;
and pick &lt;strong&gt;t&lt;/strong&gt; to enable trace. Then type the predicate as before and now you’ll
be stepping through each step in the &lt;strong&gt;Prolog&lt;/strong&gt; engine on the screen, like so:&lt;/p&gt;

&lt;console&gt;
| ?-
Prolog interruption (h for help) ? t
The debugger will first creep -- showing everything (trace)
| ?- append([1,2,3],[4,5,6],L).
      1    1  Call: append([1,2,3],[4,5,6],_29) ?
      1    1  Exit: append([1,2,3],[4,5,6],[1,2,3,4,5,6]) ?

L = [1,2,3,4,5,6]

yes
{trace}
| ?-
&lt;/console&gt;

&lt;p&gt;Humm that was useless because the &lt;strong&gt;native&lt;/strong&gt; method doesn’t trace through the
&lt;strong&gt;Prolog&lt;/strong&gt; implementation the same way. For this example load the previous
&lt;strong&gt;append&lt;/strong&gt; definition with a different name such as &lt;strong&gt;append1&lt;/strong&gt; and then you
should be able to get a similar trace to the following:&lt;/p&gt;

&lt;console&gt;
| ?-
Prolog interruption (h for help) ? t
The debugger will first creep -- showing everything (trace)
| ?- append1([1,2,3],[4,5,6],L).
      1    1  Call: append1([1,2,3],[4,5,6],_29) ?
      2    2  Call: append1([2,3],[4,5,6],_62) ?
      3    3  Call: append1([3],[4,5,6],_89) ?
      4    4  Call: append1([],[4,5,6],_116) ?
      4    4  Exit: append1([],[4,5,6],[4,5,6]) ?
      3    3  Exit: append1([3],[4,5,6],[3,4,5,6]) ?
      2    2  Exit: append1([2,3],[4,5,6],[2,3,4,5,6]) ?
      1    1  Exit: append1([1,2,3],[4,5,6],[1,2,3,4,5,6]) ?

L = [1,2,3,4,5,6]

yes
{trace}
&lt;/console&gt;
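
&lt;p&gt;For reference, the &lt;strong&gt;append1&lt;/strong&gt; used in the trace above is assumed to be nothing
more than the earlier &lt;strong&gt;append&lt;/strong&gt; definition under a new name, something like:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% Same two clauses as append/3, renamed so the debugger can step through them.
append1([], Ys, Ys).
append1([X|Xs], Ys, [X|Zs]) :- append1(Xs, Ys, Zs).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;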

&lt;p&gt;Now the trace can be a bit hard to understand at first but as you trace through
more and more predicate execution you’ll get the hang of how things work. So for
our &lt;strong&gt;append&lt;/strong&gt; implementation we defined &lt;strong&gt;append&lt;/strong&gt; in a special way because we
were taking advantage of tail recursion in &lt;strong&gt;Prolog&lt;/strong&gt; and we defined the
&lt;strong&gt;append&lt;/strong&gt; predicate like so:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;If you have an empty list and a list Ys then the result of appending them is
Ys.&lt;/li&gt;
  &lt;li&gt;If you have a list with head X and tail Xs and another list Ys, then the
resulting list is going to consist of putting X at the head of the list with
Zs being the concatenation of Xs and Ys.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Because of that special definition the &lt;strong&gt;Prolog&lt;/strong&gt; engine will create your
resulting list as it exits from each of the calls and not while it’s proceeding
to match the termination clause at &lt;strong&gt;append1([],[4,5,6],_116)&lt;/strong&gt;. So knowing
this and looking at the trace you now know that the &lt;strong&gt;Prolog&lt;/strong&gt; engine was trying
to reduce your initial request of &lt;strong&gt;append1([1,2,3],[4,5,6],L)&lt;/strong&gt; into one that
terminated with &lt;strong&gt;append1([],[4,5,6],_XX)&lt;/strong&gt; and then worked its way
backwards to create the resulting appended list.&lt;/p&gt;

&lt;p&gt;The really interesting thing about this backward inference and the ability to
deduce which elements were moved around based on how you defined a predicate
allows &lt;strong&gt;Prolog&lt;/strong&gt; to do even more interesting things such as:&lt;/p&gt;

&lt;console&gt;
| ?- append(A,[4,5],[1,2,4,5]).

A = [1,2] ?

yes
&lt;/console&gt;

&lt;p&gt;We just used our &lt;strong&gt;append&lt;/strong&gt; predicate to infer what list appended to [4,5] gives
us the list [1,2,4,5]. This may not seem groundbreaking but think about how
hard this would be to implement in another language and you’ll soon see the
power of inference. If you want to see a quick way that inference can be used,
let’s take the problem:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;given a list of elements enumerate all of the possible combinations of 2 lists
that can be appended to create the resulting list.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Getting &lt;strong&gt;Prolog&lt;/strong&gt; to do this for us is as simple as:&lt;/p&gt;

&lt;console&gt;
| ?- append(A,B,[1,2,3,4]).

A = []
B = [1,2,3,4] ? ;

A = [1]
B = [2,3,4] ? ;

A = [1,2]
B = [3,4] ? ;

A = [1,2,3]
B = [4] ? ;

A = [1,2,3,4]
B = [] ? ;

no
&lt;/console&gt;

&lt;p&gt;Instead of having the interpreter iterate through the possible solutions let’s introduce
the usage of the &lt;strong&gt;findall&lt;/strong&gt; predicate which can be used to create a &lt;strong&gt;List&lt;/strong&gt; of
solutions. The predicate itself is defined as &lt;strong&gt;findall(Object,Goal,List)&lt;/strong&gt;
where the &lt;strong&gt;Object&lt;/strong&gt; is the term built from the &lt;strong&gt;Goal&lt;/strong&gt; variables that you wish to put in the
&lt;strong&gt;List&lt;/strong&gt; for each solution found. So let’s just show how to use &lt;strong&gt;findall&lt;/strong&gt; to
get all the solutions in a nice neat little list.&lt;/p&gt;

&lt;console&gt;
| ?- findall((A,B),append(A,B,[1,2,3,4]),List).

List = [([],[1,2,3,4]),([1],[2,3,4]),([1,2],[3,4]),([1,2,3],[4]),([1,2,3,4],[])]
yes
&lt;/console&gt;

&lt;p&gt;While on the subject what if you wanted to not include any solutions with empty
lists as part of the concatenation ? Here’s one possible solution:&lt;/p&gt;

&lt;console&gt;
| ?- findall((A,B),(append(A,B,[1,2,3,4]), A \= [], B \= []),List).

List = [([1],[2,3,4]),([1,2],[3,4]),([1,2,3],[4])]

yes
&lt;/console&gt;

&lt;p&gt;A simple non unification test for A and B to the empty list term and we’ve
fixed our problem.&lt;/p&gt;
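
&lt;p&gt;If you find yourself typing that query often, it can be wrapped in a
predicate of its own. The sketch below is one possible way to do it, with
&lt;strong&gt;nonempty_splits/2&lt;/strong&gt; being a made up name. Instead of testing for non unification
with the empty list it requires each part to match &lt;strong&gt;[_|_]&lt;/strong&gt;, that is a list with
at least one element, which amounts to the same thing here:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% nonempty_splits/2 is a hypothetical name used only for illustration.
% Splits is the list of all (A,B) pairs of non empty lists that append to List.
nonempty_splits(List, Splits) :-
    findall((A,B),
            ( append(A, B, List), A = [_|_], B = [_|_] ),
            Splits).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Querying &lt;strong&gt;nonempty_splits([1,2,3,4], S)&lt;/strong&gt; should give the same three pairs as
the query above.&lt;/p&gt;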

&lt;p&gt;We’ve covered quite a few things in this post and I am probably running through
a lot of things without explaining too many details and showing more
implementation and usage scenarios. I really don’t want to spend too much time
writing up theory and explanations you can find online and instead would like to
focus on how to use the various features to complete tasks that would be very
difficult in other languages.&lt;/p&gt;

&lt;p&gt;The next post will cover more details of tracing and I will try to also
introduce the notion of &lt;strong&gt;“cutting”&lt;/strong&gt;, which involves controlling how &lt;strong&gt;Prolog&lt;/strong&gt;
does backtracking through your predicates.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Learning Prolog basics</title>
   <link href="http://rlgomes.github.com/work/prolog/2012/05/16/22.00-learning-prolog-basics.html"/>
   <updated>2012-05-16T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/prolog/2012/05/16/22.00-learning-prolog-basics</id>
   <content type="html">&lt;p&gt;I wanted to review my Prolog skills and at the same time write up a quick set of
posts on how to use Prolog to get certain tasks done in a more efficient manner.
This first post is about going over how you write a basic Prolog program and
how to wrap your mind around this “different” programming paradigm.&lt;/p&gt;

&lt;p&gt;Prolog is a logic programming language that uses facts and rules to
evaluate if what you are trying to compute is true or false logically. These
facts and rules are loaded into what is usually called the &lt;strong&gt;Prolog&lt;/strong&gt; database
and then you can query them in order to get the answers to the questions that
you want to solve. The most basic thing you can define in Prolog is a fact and
a fact has the form:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;fact&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;atom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;A fact can be just a simple name or it can include arguments between
parentheses, which in the simplest case are atoms. To understand this better,
an atom is a general purpose name used to identify elements and is represented
by a sequence of characters starting with a lower case letter. For example let’s
say we wanted to express that rodney is a human being. The fact for such a thing in &lt;strong&gt;Prolog&lt;/strong&gt;
could be written like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;human&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;rodney&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above is also equivalent to “human(rodney) :- true.”. Now, let’s fire up the
GNU Prolog interpreter and load the file that contains the above fact, like so:&lt;/p&gt;

&lt;console&gt;
&amp;gt; prolog
GNU Prolog 1.3.0
By Daniel Diaz
Copyright (C) 1999-2007 Daniel Diaz
| ?- [basics].
compiling /home/rlgomes/workspace/prolog/prolog_basics/basics.pl for byte code...
/home/rlgomes/workspace/prolog/prolog_basics/basics.pl compiled, 5 lines read - 413 bytes written, 4 ms

(4 ms) yes
| ?-
&lt;/console&gt;

&lt;p&gt;Loading the file is done by putting square brackets around the name of the
file, leaving off its &lt;strong&gt;.pl&lt;/strong&gt; extension. Once loaded you can query the
&lt;strong&gt;Prolog&lt;/strong&gt; engine by writing a query and verifying if it matches something in the
&lt;strong&gt;Prolog&lt;/strong&gt; database, like so:&lt;/p&gt;

&lt;console&gt;
| ?- human(rick).

no
&lt;/console&gt;

&lt;p&gt;Here you can see for the first time how the &lt;strong&gt;Prolog&lt;/strong&gt; interpreter responds with
&lt;strong&gt;no&lt;/strong&gt; in order to tell you that it does not know if ‘rick’ is human. So we can
also query the engine for truthful facts like so:&lt;/p&gt;

&lt;console&gt;
| ?- human(rodney).

yes
&lt;/console&gt;

&lt;p&gt;Facts are interesting and the basis of everything that is known to be true
within a &lt;strong&gt;Prolog&lt;/strong&gt; engine but the really interesting part is when you start
writing rules. A rule is a very similar predicate construction that uses
variables to match other values in order to evaluate to a truthful statement. So
for example lets define a rule that says that all humans are mortals, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;mortal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;human&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above rule is read “X is mortal if X is human” and is a very simple rule
that you can use just as before to validate that “mortal(rodney).” and you’ll
get the response &lt;strong&gt;yes&lt;/strong&gt;. We can make a more interesting program with the few
things we’ve learned so far and lets just jump into the program below:&lt;/p&gt;

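&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;c1&quot;&gt;% samantha is alice's mother&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;mother&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;samantha&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;alice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;mother&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;samantha&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;bob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;% joe is alice's father&lt;/span&gt;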
&lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;joe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;alice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;joe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;bob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;joseph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;joe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;err&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;sibling&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;has&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;has&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;siblings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;

&lt;span class=&quot;ss&quot;&gt;grandparent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;grandparent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;mother&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;grandparent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;father&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;mother&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;grandparent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;mother&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;mother&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This short program can now very easily validate if people are siblings or
if someone is someone’s grandparent. You can see how to create multiple rules to
validate the same general predicate by defining the various variations that
would make that predicate true. Let’s see how quickly we can verify if the
various people are related.&lt;/p&gt;

&lt;console&gt;
| ?- siblings(alice, joe).

no
| ?- siblings(alice, bob).

true ?

yes
&lt;/console&gt;

&lt;p&gt;You may have noticed that when verifying that Alice and Bob were siblings you
got a slightly different prompt. This new prompt indicates that one way of
verifying that Alice and Bob are siblings was found and that there may be
others. Hitting just enter tells the interpreter you don’t have any interest in
additional solutions, while hitting ‘;’ followed by enter makes the
&lt;strong&gt;Prolog&lt;/strong&gt; engine try to find alternate ways of evaluating that
Alice and Bob are siblings. We’ll go into more details on the alternate routes
to verify the same predicate in future posts.&lt;/p&gt;
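
&lt;p&gt;As a side note, the four &lt;strong&gt;grandparent&lt;/strong&gt; clauses from the program above could
also be collapsed with a helper rule. The sketch below repeats the facts so it
can be loaded on its own; &lt;strong&gt;parent/2&lt;/strong&gt;, &lt;strong&gt;grandparent2/2&lt;/strong&gt; and &lt;strong&gt;siblings2/2&lt;/strong&gt; are made up
names and not part of the original program. The extra &lt;strong&gt;\==&lt;/strong&gt; check is an assumption
about what you’d usually want, since the original rule also says that alice is
her own sibling:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;mother(samantha, alice).
mother(samantha, bob).

father(joe, alice).
father(joe, bob).
father(joseph, joe).

% parent/2 is a hypothetical helper: X is a parent of Y.
parent(X, Y) :- father(X, Y).
parent(X, Y) :- mother(X, Y).

% Covers the same four combinations as the original grandparent/2 clauses.
grandparent2(X, Y) :- parent(X, Z), parent(Z, Y).

% Same as siblings/2 but a person is not considered their own sibling.
siblings2(X, Y) :- father(Z, X), father(Z, Y), X \== Y.&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With this loaded, &lt;strong&gt;grandparent2(joseph, alice)&lt;/strong&gt; should succeed while
&lt;strong&gt;siblings2(alice, alice)&lt;/strong&gt; should fail.&lt;/p&gt;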

&lt;p&gt;Let’s make a quick introduction into lists and how &lt;strong&gt;Prolog&lt;/strong&gt; represents and uses
them and then we’ll leave the more complex parts of
&lt;strong&gt;Prolog&lt;/strong&gt; until the next post. Now lists are represented in a very easy to understand manner like
so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;nv&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above will evaluate on the interpreter to truth but doesn’t serve any
purpose in a &lt;strong&gt;Prolog&lt;/strong&gt; file. We can now talk about how to use lists within
&lt;strong&gt;Prolog&lt;/strong&gt; predicates and basically take a list apart. When specifying a list in
a new rule we can separate the current head of the list from the rest of the list
like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;some_rule_over_lists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;H&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Once again the above doesn’t serve any purpose but just introduces you to the
notion of processing a list with a &lt;strong&gt;Prolog&lt;/strong&gt; rule. Now if we wanted to write
a function to calculate the length of a list we can’t really return a value so
what needs to be done is you need your “length” function to actually have a
second argument which would house the result. So we’d define length like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;&lt;span class=&quot;ss&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([],&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:-&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;NT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;N&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;NT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This last implementation has quite a few new things in it so lets start by
explaining that when defining any rules that operate over lists you usually have
the predicate that handles the empty list &lt;strong&gt;[]&lt;/strong&gt; and the predicate that handles a
list with a head and tail, like so &lt;strong&gt;[H|T]&lt;/strong&gt;. This is a very common pattern for
recursive functions over lists and is used all the time when handling lists in
&lt;strong&gt;Prolog&lt;/strong&gt;. We are also introducing how to do arithmetic in &lt;strong&gt;Prolog&lt;/strong&gt; using the
&lt;strong&gt;is&lt;/strong&gt; statement to attribute to &lt;strong&gt;N&lt;/strong&gt; the value of calculating the length of
the tail &lt;strong&gt;T&lt;/strong&gt; plus 1. I also decided to introduce the anonymous variable &lt;strong&gt;_&lt;/strong&gt;
because you will get a warning about &lt;strong&gt;H&lt;/strong&gt; not being used whenever you have
things that are matched but not used to calculate anything of importance.&lt;/p&gt;
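
&lt;p&gt;To see the head and tail pattern and the &lt;strong&gt;is&lt;/strong&gt; statement in isolation, here are
two tiny predicates. Both &lt;strong&gt;head_tail/3&lt;/strong&gt; and &lt;strong&gt;add_one/2&lt;/strong&gt; are made up names that exist
only for this sketch:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% head_tail/3 is a hypothetical helper used only for illustration.
% It succeeds when the first argument is a list with head H and tail T.
head_tail([H|T], H, T).

% add_one/2 (also hypothetical) shows is/2 evaluating its right hand side.
add_one(N, M) :- M is N + 1.&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Querying &lt;strong&gt;head_tail([1,2,3], H, T)&lt;/strong&gt; should give H = 1 and T = [2,3], while
&lt;strong&gt;head_tail([], H, T)&lt;/strong&gt; fails because an empty list has no head.&lt;/p&gt;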

&lt;p&gt;You can use the previously defined &lt;strong&gt;len&lt;/strong&gt; function like so:&lt;/p&gt;

&lt;console&gt;&amp;gt; prolog
GNU Prolog 1.3.0
By Daniel Diaz
Copyright (C) 1999-2007 Daniel Diaz
| ?- consult(basics).
compiling /home/rlgomes/workspace/prolog/prolog_basics/basics.pl for byte code...
/home/rlgomes/workspace/prolog/prolog_basics/basics.pl:7: warning: singleton variables [H,T] for blah/1
/home/rlgomes/workspace/prolog/prolog_basics/basics.pl compiled, 10 lines read - 1033 bytes written, 10 ms

(4 ms) yes
| ?- len([a,b,c,d,e,f,g,h,i,j], L).

L = 10

yes
| ?-
&lt;/console&gt;
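
&lt;p&gt;A common variation, shown here only as a sketch, is to carry the running
count along in an extra argument (usually called an accumulator) so that the
addition happens before the recursive call instead of after it. The names
&lt;strong&gt;len_acc/2&lt;/strong&gt; and &lt;strong&gt;len_acc/3&lt;/strong&gt; are made up for this example:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-prolog&quot; data-lang=&quot;prolog&quot;&gt;% len_acc/2 and len_acc/3 are hypothetical names used for illustration.
% len_acc/2 starts the count at 0 and hands over to len_acc/3.
len_acc(List, N) :- len_acc(List, 0, N).

% When the list is empty the accumulator holds the length.
len_acc([], Acc, Acc).
% Otherwise count the head and continue with the tail.
len_acc([_|T], Acc, N) :-
    Acc1 is Acc + 1,
    len_acc(T, Acc1, N).&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Querying &lt;strong&gt;len_acc([a,b,c], N)&lt;/strong&gt; should give N = 3, the same answer &lt;strong&gt;len&lt;/strong&gt; would
give for that list.&lt;/p&gt;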

&lt;p&gt;I would advise you to go and play with the interpreter and define other
&lt;strong&gt;Prolog&lt;/strong&gt; functions such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;A &lt;strong&gt;sum&lt;/strong&gt; function that adds up the elements in a list of numbers,
called as &lt;strong&gt;sum([1,2,3,4,5], S)&lt;/strong&gt;, which returns the sum in the variable
&lt;strong&gt;S&lt;/strong&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A &lt;strong&gt;replace&lt;/strong&gt; function which, given a list of elements, replaces a
specific element in the list with another specified element. As before, the
result should be the last argument of your function. Start with a definition
like &lt;strong&gt;replace(List, Element, Replacement, NewList)&lt;/strong&gt; and you
should be able to write that up with a little effort.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the next post I will be looking at the built-in functions and how to
assert and retract facts from the &lt;strong&gt;Prolog&lt;/strong&gt; database.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Now powered by bootstrap</title>
   <link href="http://rlgomes.github.com/work/personal/2012/05/16/12.00-now-powered-by-boostrap-now.html"/>
   <updated>2012-05-16T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/personal/2012/05/16/12.00-now-powered-by-boostrap-now</id>
   <content type="html">&lt;p&gt;Finally decided the blog required a more professional and better look &amp;amp; feel to
it and at the same time wanted to learn a bit more about this &lt;a href=&quot;http://twitter.github.com/bootstrap/&quot;&gt;&lt;strong&gt;Bootstrap&lt;/strong&gt;&lt;/a&gt;
framework that some engineers at &lt;strong&gt;Twitter&lt;/strong&gt; had released.&lt;/p&gt;

&lt;p&gt;I started by simply having a look at the available examples in the source code
and the basic templates and was quickly able to get a grasp on how to use this
along with &lt;a href=&quot;http://jekyllrb.com/&quot;&gt;&lt;strong&gt;Jekyll&lt;/strong&gt;&lt;/a&gt; to make my site look way more
professional.&lt;/p&gt;

&lt;p&gt;Having spent a total of about 8 hours on the task I am completely satisfied
with the new look and will make further changes in the weeks to come as I
find small visual issues that I haven’t already fixed. To give you an idea, here is
a screenshot of what the site used to look like before using Bootstrap:&lt;/p&gt;

&lt;center&gt;
&lt;img width=&quot;640&quot; src=&quot;/images/2012/may/old_site_look.png&quot; /&gt;
&lt;/center&gt;

&lt;p&gt;And now powered by &lt;strong&gt;Bootstrap&lt;/strong&gt;:&lt;/p&gt;

&lt;center&gt;
&lt;img width=&quot;640&quot; src=&quot;/images/2012/may/new_site_look.png&quot; /&gt;
&lt;/center&gt;
</content>
 </entry>
 
 <entry>
   <title>Tuning the Java Garbage Collector</title>
   <link href="http://rlgomes.github.com/work/java/language/2011/12/30/17.12-Tuning-the-Java-garbage-collector.html"/>
   <updated>2011-12-30T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/java/language/2011/12/30/17.12-Tuning-the-Java-garbage-collector</id>
   <content type="html">&lt;p&gt;Recently, I’ve found a few situations where when writing a prototype for
multi-node storage software that I want to be able to guarantee the stability
of my performance numbers as the number of concurrent requests grows and the JVM
is under high stress where the GC kicks in more often and has the unfortunate
effect of adding higher latency to your requests.&lt;/p&gt;

&lt;p&gt;So I wanted to learn a bit more about the Java GC and how an application’s
performance can be “fixed” by tweaking the garbage collector being used. We’ll
start by creating an application that forces the JVM to garbage collect more
often: it calls a function in a loop, and that function creates a large object,
scans it for its maximum value and then throws the object away.&lt;/p&gt;

&lt;p&gt;We will always run the application with the JVM option &lt;strong&gt;“-Xmx64M”&lt;/strong&gt;, which
caps the maximum heap size at 64MB during our test runs and so forces the GC to
kick in more often. Our slow/bad application can be as simple as:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot; data-lang=&quot;java&quot;&gt;&lt;span class=&quot;cm&quot;&gt;/**
 * Call a function in a loop; the function creates a large object, scans it
 * for its maximum value and then throws the object away.
 */&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;BadApp1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;cm&quot;&gt;/**
     * Generate a large array of integers and then calculate which is the
     * maximum value and returns that releasing the large object previously
     * created.
     */&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;maximum_random&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random_integers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;];&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random_integers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;];&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random_integers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;

        &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;durations&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;currentTimeMillis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;maximum_random&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;currentTimeMillis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;durations&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;calculation took &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;ms&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;average duration was &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;durations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;ms&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;To analyze how much time the program takes, each iteration prints how long we
spent calculating the maximum value, and we run it with the following
options:&lt;/p&gt;

&lt;p&gt;java -Xmx64M -XX:+PrintGCApplicationStoppedTime BadApp1&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;-XX:+PrintGCApplicationStoppedTime&lt;/strong&gt; option tells the JVM to print out
exactly how long the application threads were stopped, which here is effectively
the time the GC takes away from the application doing its work.
Here is a quick sample of the output I’m currently getting on my system:&lt;/p&gt;

&lt;console&gt;
&amp;gt;java -Xmx64M -XX:+PrintGCApplicationStoppedTime BadApp1
...
Total time for which application threads were stopped: 0.0117340 seconds
calculation took 68ms
Total time for which application threads were stopped: 0.0116940 seconds
calculation took 69ms
average duration was 69ms
&lt;/console&gt;

&lt;p&gt;Now we can see that we’re spending a good 0.012s in garbage collection code on
every single iteration, which is basically 12ms of time in each of those 69ms
of average time spent calling that “bad” function.&lt;/p&gt;

&lt;p&gt;Before we start trying out different garbage collectors and tweaking their
settings, the following links are worth reading if you want a better
explanation of how the various garbage collectors work:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;http://www.oracle.com/technetwork/systems/index-156457.html&lt;/li&gt;
  &lt;li&gt;http://www.petefreitag.com/articles/gctuning/&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So getting back to our situation, we’re suffering a 12ms overhead from the GC in
every iteration and would like to reduce that by tuning the GC. The
first idea might be to try the &lt;strong&gt;Parallel Copy Collector&lt;/strong&gt; and see if
this reduces the overhead slightly:&lt;/p&gt;

&lt;console&gt;
&amp;gt; java -Xmx64M -XX:+UseParNewGC -XX:+PrintGCApplicationStoppedTime BadApp1
...
Total time for which application threads were stopped: 0.0110810 seconds
calculation took 64ms
Total time for which application threads were stopped: 0.0075140 seconds
calculation took 62ms
average duration was 65ms
&lt;/console&gt;

&lt;p&gt;A very minimal improvement, which isn’t surprising since the only thing this
garbage collector does differently is use a thread per core when collecting the
young generation garbage. Since my machine is dual core and there really is
only 1 object to garbage collect in my code, I wouldn’t expect much of an
improvement.&lt;/p&gt;

&lt;p&gt;The particular application that we wrote could benefit from a garbage
collector designed to deal with a large young generation, and as the
documentation states:&lt;/p&gt;

&lt;p&gt;“Use the throughput collector when you want to improve the performance of your
application with larger numbers of processors. In the default collector garbage
collection is done by one thread, and therefore garbage collection adds to the
serial execution time of the application. The throughput collector uses
multiple threads to execute a minor collection and so reduces the serial
execution time of the application.”&lt;/p&gt;

&lt;p&gt;So let’s see what the benefit is like:&lt;/p&gt;

&lt;console&gt;
&amp;gt;java -Xmx64M -XX:+UseParallelGC -XX:+PrintGCApplicationStoppedTime BadApp1
...
Total time for which application threads were stopped: 0.0042150 seconds
calculation took 46ms
Total time for which application threads were stopped: 0.0053080 seconds
calculation took 45ms
average duration was 48ms
&lt;/console&gt;

&lt;p&gt;Now we’re 30% faster just by making a better choice in the GC being used. We’ve
gotten this “additional boost in performance” by now spending just 4ms per GC.&lt;/p&gt;
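
&lt;p&gt;To put those numbers in perspective, here is a small Python sketch of the
arithmetic. The helper names are hypothetical and the millisecond values are
rounded from the console output above:&lt;/p&gt;

```python
def gc_share(pause_ms, total_ms):
    """Fraction of one iteration spent with application threads stopped."""
    return pause_ms / total_ms


def time_saved(before_ms, after_ms):
    """Fraction of the original duration that the change removed."""
    return (before_ms - after_ms) / before_ms


# Default collector: roughly 12ms stopped out of a 69ms average iteration.
print(f"{gc_share(12, 69):.1%}")    # prints 17.4%
# -XX:+UseParallelGC: average iteration dropped from 69ms to 48ms.
print(f"{time_saved(69, 48):.1%}")  # prints 30.4%
```

&lt;p&gt;Judging by the lines shown, the reported pause shrank by about 8ms while the
average iteration dropped by 21ms, so the stopped time printed by the JVM may
not account for the whole gain.&lt;/p&gt;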

&lt;p&gt;In general you don’t go fiddling with the garbage collector unless you’ve
noticed that your application isn’t behaving as you would expect, you’re
unable to guarantee stable performance and, after some investigation, you
have found that the GC is in fact to “blame”. There are
plenty of other situations in which tuning the GC can lead to a better
performing Java application, but I just wanted to show that even a badly written
application can be “fixed” by a small change to the GC that is being used.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Adding comments to your blog with disqus</title>
   <link href="http://rlgomes.github.com/work/default/2011/12/25/12.10-Adding-comments-to-your-blog-with-disqus.html"/>
   <updated>2011-12-25T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/default/2011/12/25/12.10-Adding-comments-to-your-blog-with-disqus</id>
   <content type="html">&lt;p&gt;This is just a quick write up on how easy it is to add comments to your blog or
site with &lt;a href=&quot;http://disqus.com&quot;&gt;Disqus&lt;/a&gt;. You’ll need to create an account at
disqus. Once that is done you can go to your dashboard and add a new site for
the specific site you want to track comments on. The last step is to simple
follow the instructions at &lt;strong&gt;Install&lt;/strong&gt; section within your new sites admin page.&lt;/p&gt;

&lt;p&gt;In my case all I had to add was:&lt;/p&gt;

&lt;console&gt;
&amp;lt;div id=&quot;disqus_thread&quot;&amp;gt;&amp;lt;/div&amp;gt;
&amp;lt;script type=&quot;text/javascript&quot;&amp;gt;
    /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
    // required: replace example with your forum shortname
    var disqus_shortname = 'example';

    /* * * DON'T EDIT BELOW THIS LINE * * */
    (function() {
        var dsq = document.createElement('script');
        dsq.type = 'text/javascript'; dsq.async = true;
        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
        (document.getElementsByTagName('head')[0] ||
         document.getElementsByTagName('body')[0]).appendChild(dsq);
    })();
&amp;lt;/script&amp;gt;
&amp;lt;noscript&amp;gt;Please enable JavaScript to view the
&amp;lt;a href=&quot;http://disqus.com/?ref_noscript&quot;
&amp;gt;comments powered by Disqus.&amp;lt;/a&amp;gt;&amp;lt;/noscript&amp;gt;
&amp;lt;a href=&quot;http://disqus.com&quot; class=&quot;dsq-brlink&quot;&amp;gt;blog comments powered by
&amp;lt;span class=&quot;logo-disqus&quot;&amp;gt;Disqus&amp;lt;/span&amp;gt;&amp;lt;/a&amp;gt;
&lt;/console&gt;

&lt;p&gt;Once I replaced the ‘example’ Disqus shortname with the shortname of my site
everything worked perfectly. The only catch is that if you have
a static blog with Jekyll as I do, you won’t be able to see the comments on the
static site when you start up Jekyll locally with the “jekyll --server” command.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Memoize decorator for python</title>
   <link href="http://rlgomes.github.com/work/python/memoize/decorator/2011/12/21/20.00-memoize-decorator-for-python.html"/>
   <updated>2011-12-21T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/memoize/decorator/2011/12/21/20.00-memoize-decorator-for-python</id>
   <content type="html">&lt;p&gt;I ran into a situation when writing some code recently where I had to add 
caching to an existing function in order to speed up the execution of my 
function when it was called multiple times with the same arguments. I know 
from experience that this involves a technique called memoization in which you 
basically cache the previous result and when the same arguments are passed to 
your function you pull from the cache instead of re-executed the same code
that takes a while and would render the exact same result.&lt;/p&gt;

&lt;p&gt;Adding the memoization/caching feature to your function isn’t hard but 
immediately I found it made the whole function look ugly and cluttered, and I 
was not satisfied with that. Luckily I was using python which has a feature 
called decorators which allow you to easily add more “functionality” to an 
existing function without cluttering its “code space”.&lt;/p&gt;

&lt;p&gt;Like any good engineer, the first thing I did was look around and see if someone
had already created a memoize decorator for python, and I quickly found there
was a decent example in the python documentation, but it lacked quite a few
features, including the ability to handle functions with lists or other objects
as arguments. So I set out to create my own memoize module that could be easily
used in other projects without having to clutter my code with “memoization” code.&lt;/p&gt;

&lt;p&gt;I built the memoize module that is now available &lt;a href=&quot;https://github.com/rlgomes/memoize&quot;&gt;here&lt;/a&gt;
and the README included has enough information on how to use this in your own 
project.&lt;/p&gt;
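
&lt;p&gt;To give an idea of the technique, here is a minimal sketch of a memoize
decorator. It is not the code from the module linked above; the pickle-based
cache key is just one assumed way of coping with list arguments, and the names
memoize, total and calls are made up for this example:&lt;/p&gt;

```python
import functools
import pickle


def memoize(func):
    """Cache results of func, keyed on a pickled copy of its arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Lists and dicts are not hashable, so serialize the arguments
        # to bytes and use those bytes as the dictionary key instead.
        key = pickle.dumps((args, sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


calls = []  # records every real execution of the decorated function


@memoize
def total(numbers):
    calls.append(1)
    return sum(numbers)


print(total([1, 2, 3]))  # prints 6, computed
print(total([1, 2, 3]))  # prints 6, served from the cache
print(len(calls))        # prints 1
```

&lt;p&gt;This sketch assumes the arguments can be pickled and that equal arguments
pickle to the same bytes, which holds for simple lists of numbers or strings
but is not guaranteed for every object. The cache also grows without bound.&lt;/p&gt;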
</content>
 </entry>
 
 <entry>
   <title>Optimizing Haskell Programs</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/11/14/20.30-Optimizing-Haskell-Programs.html"/>
   <updated>2011-11-14T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/11/14/20.30-Optimizing-Haskell-Programs</id>
   <content type="html">&lt;p&gt;Before we start optimizing the &lt;strong&gt;wc&lt;/strong&gt; command line tool we wrote lets first find
a simple way to compare this implementation with the wc command tool available
on my linux (written in C). I created a file with 197,000 dummy lines (with
each line just over 80 characters long) and measured how long it takes to count
the number of lines, with each tool:&lt;/p&gt;

&lt;console&gt;
time bash -c &quot;cat test | wc -l&quot;
197000
bash -c &quot;cat test | wc -l&quot;  0.00s user 0.02s system 65% cpu 0.043 total

time bash -c &quot;cat test | ./wc.hs -l&quot;
197000
bash -c &quot;cat test | ./wc.hs -l&quot;  1.46s user 0.07s system 95% cpu 1.590 total
&lt;/console&gt;
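
&lt;p&gt;As a quick sanity check on how the two runs compare, the slowdown factor is
simply the ratio of the wall-clock totals reported by time. A minimal sketch in
Python, where the helper name slowdown is hypothetical and the two totals are
copied from the output above:&lt;/p&gt;

```python
def slowdown(slow_seconds, fast_seconds):
    """Return how many times longer the slow run took than the fast one."""
    return slow_seconds / fast_seconds


# Wall-clock totals copied from the two time runs above.
c_total = 0.043        # cat test | wc -l
haskell_total = 1.590  # cat test | ./wc.hs -l

print(round(slowdown(haskell_total, c_total)))  # prints 37
```

&lt;p&gt;With a baseline as short as 0.043s the ratio is quite sensitive to noise, so
treat it as a rough figure rather than a precise measurement.&lt;/p&gt;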

&lt;p&gt;So the current &lt;strong&gt;Haskell&lt;/strong&gt; implementation is 37x slower than the native C
version. The first thing to note is that running the Haskell program without
compiling it is not efficient at all. So let’s put together a simple makefile and
use &lt;strong&gt;ghc&lt;/strong&gt; to compile the &lt;strong&gt;.hs&lt;/strong&gt; file to a native executable. Here’s a
possible makefile:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-makefile&quot; data-lang=&quot;makefile&quot;&gt;&lt;span class=&quot;nl&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;err&quot;&gt;mkdir&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;build&lt;/span&gt;
    &lt;span class=&quot;err&quot;&gt;cp&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;*.hs&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;build&lt;/span&gt;

&lt;span class=&quot;nl&quot;&gt;wc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;wc.hs init&lt;/span&gt;
    &lt;span class=&quot;err&quot;&gt;cd&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;build&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;ghc&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;--make&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;wc.hs&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-o&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;wc&lt;/span&gt;

&lt;span class=&quot;nl&quot;&gt;clean&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;err&quot;&gt;rm&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-fr&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;build&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;*.o&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;*.hi&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;So after a simple compilation the results we’re now getting are:&lt;/p&gt;

&lt;console&gt;
time bash -c &quot;cat test | build/wc -l&quot;
197000
bash -c &quot;cat test | build/wc -l&quot;  0.96s user 0.04s system 95% cpu 1.049 total
&lt;/console&gt;

&lt;p&gt;Which puts us at 24x slower, which is already some progress with absolutely no
code changes. Now the &lt;strong&gt;ghc&lt;/strong&gt; compiler also allows you to use a few optimization
flags that can help make the generated code quicker. Let’s use the basic &lt;strong&gt;-O2&lt;/strong&gt;
optimization and see how much we can gain. We’ll actually add the compilation
directive to the &lt;strong&gt;.hs&lt;/strong&gt; file with the following line:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;o&quot;&gt;#!&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runhaskell&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;{-# LANGUAGE DeriveDataTypeable #-}&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;{-# OPTIONS_GHC -O2 #-}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_removed&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;here&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Now after recompiling, here is the current performance of our tool:&lt;/p&gt;

&lt;console&gt;
time bash -c &quot;cat test | build/wc -l&quot;
197000
bash -c &quot;cat test | build/wc -l&quot;  0.96s user 0.04s system 95% cpu 1.049 total
&lt;/console&gt;

&lt;p&gt;Not much of a boost really, and it’s not a surprise: since our program isn’t very
complicated we can’t expect the compiler to save time that easily.
There are a few things to look at before we start profiling, and here’s a list
of the usual suspects when it comes to bad performance in &lt;strong&gt;Haskell&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;String&lt;/strong&gt; is painfully slow and is known for being 20x slower than a similar
C implementation. The fix is to use the &lt;strong&gt;ByteString&lt;/strong&gt; type which is known to
only be 2x slower than a similar C implementation.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;read and show are also known to perform badly and you should use the
equivalent functions that operate on the &lt;strong&gt;ByteString&lt;/strong&gt; datatype.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So let’s start by importing the required library to use the &lt;strong&gt;ByteString&lt;/strong&gt;
datatype and using all of the functions that handle &lt;strong&gt;ByteString&lt;/strong&gt;. Be aware
that it can be a bit of a hassle to get your functions working with the
&lt;strong&gt;ByteString&lt;/strong&gt; datatype and quite a bit of work to get all the types lined
up just right, but here’s what the &lt;strong&gt;wc&lt;/strong&gt; command implementation looks like
now that it uses &lt;strong&gt;ByteString&lt;/strong&gt;s:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;o&quot;&gt;#!&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runhaskell&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;{-# LANGUAGE DeriveDataTypeable #-}&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;{-# OPTIONS_GHC -O2 #-}&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;module&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Main&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;System....&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Control.Arrow&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;qualified&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Data.ByteString.Char8&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kr&quot;&gt;deriving&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Typeable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;m&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the byte counts&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the newline counts&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the word counts&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Print newline, word, and byte counts for each FILE, &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;and a total line if more than one FILE is specified.&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot; With no FILE, or when FILE is -, read standard &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;input.  A word is a non-zero-length sequence of &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;characters delimited by white space.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;wc v0.0.1, (C) Rodney Gomes&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;space&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; &quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;flat&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;concat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;space&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;space&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;addnl&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;concat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addnl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addnl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addnl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addnl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmdArgs&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interact&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Just a few tweaks, really, to the functions being used and to how we handle
the new &lt;strong&gt;ByteString&lt;/strong&gt; datatype. I also realized that the tool wasn’t ending its
output with a newline, so I added that. After this small change, surprisingly
enough, we’re now really close to the speed of the C implementation:&lt;/p&gt;

&lt;console&gt;
time bash -c &quot;cat test | build/wc -l&quot;
197000
bash -c &quot;cat test | build/wc -l&quot;  0.02s user 0.04s system 77% cpu 0.073 total
&lt;/console&gt;

&lt;p&gt;We’re still 69% slower than the C implementation when counting lines, which
means we have room for improvement. The interesting part is that we’re actually
72% faster than the C implementation when we have to calculate the number of
lines, words and characters at the same time:&lt;/p&gt;

&lt;console&gt;
time bash -c &quot;cat test | wc&quot;
 197000 3743000 19306000
bash -c &quot;cat test | wc&quot;  0.57s user 0.02s system 95% cpu 0.613 total

time bash -c &quot;cat test | build/wc&quot;
197000 3743000 19306000
bash -c &quot;cat test | build/wc&quot;  0.30s user 0.04s system 94% cpu 0.357 total
&lt;/console&gt;

&lt;p&gt;Now we’ve reached the point where, if we want our line counting to perform as
well as the C implementation, we’re going to have to profile our
&lt;strong&gt;Haskell&lt;/strong&gt; program. To profile we need to compile with a few additional flags:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-makefile&quot; data-lang=&quot;makefile&quot;&gt;&lt;span class=&quot;nl&quot;&gt;wc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;wc.hs init&lt;/span&gt;
    &lt;span class=&quot;err&quot;&gt;cd&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;build&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;ghc&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-prof&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-auto-all&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-rtsopts&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;--make&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;wc.hs&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;-o&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;wc&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We added &lt;strong&gt;-prof -auto-all&lt;/strong&gt; to build with profiling enabled. The -auto-all flag
generates cost centres for all top-level functions, and -rtsopts lets us pass
runtime options such as -p to the resulting binary. You can read more about
that in the &lt;strong&gt;Haskell&lt;/strong&gt; documentation on profiling. If you run
&lt;strong&gt;make wc&lt;/strong&gt; again and get something like this:&lt;/p&gt;

&lt;console&gt;
wc.hs:7:8:
    Could not find module `System....':
      Perhaps you haven't installed the profiling libraries for package
                                                                  `cmdargs-0.9'?
      Use -v to see a list of the files searched for.
&lt;/console&gt;

&lt;p&gt;Just reinstall each affected package with profiling enabled, like so:&lt;/p&gt;

&lt;console&gt;
cabal install --reinstall -p cmdargs
&lt;/console&gt;

&lt;p&gt;That reinstalls the package and builds its profiling libraries. Once the build
succeeds, you can run the tool with profiling turned on like so:&lt;/p&gt;

&lt;console&gt;
wc -l +RTS -p -RTS
&lt;/console&gt;

&lt;p&gt;You’ll now have a nice &lt;strong&gt;wc.prof&lt;/strong&gt; file to look at, which contains information
like this (slightly reformatted to fit):&lt;/p&gt;

&lt;console&gt;
    Mon Nov 14 19:05 2011 Time and Allocation Profiling Report  (Final)

       wc +RTS -p -RTS -l

    total time  =        0.04 secs   (2 ticks @ 20 ms)
    total alloc =  13,593,584 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE               %time %alloc
main                           Main                  50.0    1.3
countlines                     Main                  50.0   98.5

                                                    individual    inherited
COST CENTRE              MODULE            no. entries %time %alloc %time %alloc

MAIN            MAIN                         1   0    0.0    0.0   100.0   100.0
 main           Main                       358   3   50.0    1.3   100.0    99.9
  optionHandler Main                       361   1    0.0    0.0    50.0    98.5
   addnl        Main                       363   1    0.0    0.0     0.0     0.0
   countlines   Main                       362   2   50.0   98.5    50.0    98.5
  wc            Main                       360   0    0.0    0.0     0.0     0.0
 CAF            Main                       352  33    0.0    0.0     0.0     0.0
  addnl         Main                       364   0    0.0    0.0     0.0     0.0
  wc            Main                       359   1    0.0    0.0     0.0     0.0
 CAF            Data.Typeable              350   5    0.0    0.0     0.0     0.0
 CAF            GHC.Show                   348   1    0.0    0.0     0.0     0.0
 CAF            Data.HashTable             290   3    0.0    0.0     0.0     0.0
 CAF            GHC.IO.Handle.FD           288   3    0.0    0.0     0.0     0.0
 CAF            GHC.IO.FD                  272   4    0.0    0.0     0.0     0.0
 CAF            GHC.IO.Handle.Internals    252   1    0.0    0.0     0.0     0.0
 CAF            GHC.IO.Encoding.Iconv      246   2    0.0    0.0     0.0     0.0
 CAF            GHC.Conc.Signal            243   1    0.0    0.0     0.0     0.0
 CAF            Data.Data                  227   3    0.0    0.0     0.0     0.0
 CAF            System.....Implicit.Global 226   3    0.0    0.0     0.0     0.0
 CAF            System.....Implicit.Reader 223   1    0.0    0.0     0.0     0.0
 CAF            Data.Generics.Any.Prelude  222   2    0.0    0.0     0.0     0.0
 CAF            System.....Explicit        209   6    0.0    0.0     0.0     0.0
 CAF            System.....Explicit.Type   202   1    0.0    0.0     0.0     0.0
 CAF            System.....Explicit.Help   189   1    0.0    0.0     0.0     0.0
 CAF            System.....Implicit.Ann    187   4    0.0    0.0     0.0     0.0
 CAF            System.....Annotate        186   1    0.0    0.0     0.0     0.0
 CAF            Data.ByteString.Char8      184   1    0.0    0.0     0.0     0.0
&lt;/console&gt;

&lt;p&gt;With the above we can now see that we’re spending 50% of our time in
&lt;strong&gt;countlines&lt;/strong&gt;, parsing and counting, and the other 50% in the main function,
most likely inside the &lt;strong&gt;interact&lt;/strong&gt; function reading the input. With only 2
ticks sampled these percentages are very coarse, so treat them as a rough guide.
Since we don’t have access to the &lt;strong&gt;interact&lt;/strong&gt; function, what
we can do is create &lt;strong&gt;cost centres&lt;/strong&gt; inside the &lt;strong&gt;countlines&lt;/strong&gt; function and see if
they identify something we weren’t expecting. So we’ll add this to our source:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;{-# SCC &quot;C.pack&quot; #-}&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;{-# SCC &quot;show&quot; #-}&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;
             &lt;span class=&quot;cp&quot;&gt;{-# SCC &quot;length&quot; #-}&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;{-# SCC &quot;C.lines&quot; #-}&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;So when we recompile and run with the above &lt;strong&gt;cost centres&lt;/strong&gt; we can then get
more detail from profiling:&lt;/p&gt;

&lt;console&gt;
COST CENTRE  MODULE        no.    entries  %time %alloc   %time %alloc

MAIN         MAIN            1           0  75.0    0.5   100.0  100.0
 C.pack      Main          358           1   0.0    0.0    25.0   99.4
  show       Main          359           1   0.0    0.0    25.0   99.4
   length    Main          360           1   0.0    0.0    25.0   99.4
    C.lines  Main          361           1  25.0   99.4    25.0   99.4
&lt;/console&gt;

&lt;p&gt;The inherited columns show 25% of the time at each of the places we set those
&lt;strong&gt;cost centres&lt;/strong&gt;, but the individual columns attribute all of that time, and
99.4% of the allocation, to &lt;strong&gt;C.lines&lt;/strong&gt;. The pack, show and length calls register
nothing themselves, which is reasonable given that pack and show are just
converting an &lt;strong&gt;Int&lt;/strong&gt; to a &lt;strong&gt;String&lt;/strong&gt; and then to a &lt;strong&gt;ByteString&lt;/strong&gt;. The remaining
75% is charged to MAIN, outside our cost centres. The thing to do now would be
to understand why splitting the input into lines allocates so much. I just wanted
to show how to profile an existing tool and that &lt;strong&gt;Haskell&lt;/strong&gt; can in fact be as
quick as &lt;strong&gt;C&lt;/strong&gt; without having to make any major changes, just by using the right types.&lt;/p&gt;
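
&lt;p&gt;Since nearly all of the allocation lands in &lt;strong&gt;C.lines&lt;/strong&gt;, one experiment worth
trying is to count newline characters directly rather than building a list of
lines first. This is only an unmeasured sketch: the name &lt;strong&gt;countNewlines&lt;/strong&gt; is
made up for illustration, and it relies on &lt;strong&gt;C.count&lt;/strong&gt; from
Data.ByteString.Char8:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;import qualified Data.ByteString.Char8 as C

-- Hypothetical alternative to countlines: count the newline characters
-- directly instead of materialising every line with C.lines.
countNewlines :: C.ByteString -&amp;gt; C.ByteString
countNewlines = C.pack . show . C.count '\n'

-- Same shape as addnl in the post: append a trailing newline to the result.
main :: IO ()
main = C.interact (\input -&amp;gt; C.append (countNewlines input) (C.pack &quot;\n&quot;))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Counting newlines can differ by one from counting the elements of
&lt;strong&gt;C.lines&lt;/strong&gt; when the last line has no trailing newline. Whether it actually
helps is something the profiler would have to confirm.&lt;/p&gt;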

&lt;p&gt;So we’ve pretty much optimized our &lt;strong&gt;wc&lt;/strong&gt; implementation and the only thing
left to do is a detailed comparison of the performance of our implementation vs
the C implementation:&lt;/p&gt;

&lt;p&gt;Counting lines in a file with 500,000 lines where all lines have 80 columns:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ref Implementation: 0.084s&lt;/li&gt;
  &lt;li&gt;Our Implementation: 0.149s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Counting words in a file with 500,000 lines where all lines have 80 columns:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ref Implementation: 1.243s&lt;/li&gt;
  &lt;li&gt;Our Implementation: 0.742s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Counting chars in a file with 500,000 lines where all lines have 80 columns:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ref Implementation: 1.224s&lt;/li&gt;
  &lt;li&gt;Our Implementation: 0.116s&lt;/li&gt;
&lt;/ul&gt;
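
&lt;p&gt;To put those timings side by side, here is a small sketch that turns each pair
into a ratio. The names &lt;strong&gt;timings&lt;/strong&gt; and &lt;strong&gt;speedup&lt;/strong&gt; are made up for this
example, and the numbers are simply copied from the lists above:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;import Text.Printf (printf)

-- (label, reference seconds, our seconds), copied from the lists above.
timings :: [(String, Double, Double)]
timings =
  [ (&quot;lines&quot;, 0.084, 0.149)
  , (&quot;words&quot;, 1.243, 0.742)
  , (&quot;chars&quot;, 1.224, 0.116)
  ]

-- How many times faster our implementation is than the reference.
-- A value below 1 means ours is the slower one.
speedup :: Double -&amp;gt; Double -&amp;gt; Double
speedup ref ours = ref / ours

main :: IO ()
main = mapM_ report timings
  where
    report :: (String, Double, Double) -&amp;gt; IO ()
    report (label, ref, ours) = printf &quot;%-5s %.2fx\n&quot; label (speedup ref ours)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;That works out to roughly 0.56x for lines, 1.68x for words and 10.6x for
chars.&lt;/p&gt;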
</content>
 </entry>
 
 <entry>
   <title>Writing wc command line tool in Haskell</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/11/13/13.00-Writing-wc-command-line-tool-in-Haskell.html"/>
   <updated>2011-11-13T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/11/13/13.00-Writing-wc-command-line-tool-in-Haskell</id>
   <content type="html">&lt;p&gt;Now after having gotten a pretty good overview of how to use Haskell’s most
important features we can now use this newly learned skills to write a clone
of the ‘wc’ command line tool completely in Haskell and compare this to what
the current code for the wc command line tool is written in.&lt;/p&gt;

&lt;p&gt;The ‘wc’ command has the following help menu on my Linux box:&lt;/p&gt;

&lt;console&gt;
$ wc --help
Usage: wc [OPTION]... [FILE]...
  or:  wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  With no FILE, or when FILE is -,
read standard input.  A word is a non-zero-length sequence of characters
delimited by white space.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
                           If F is - then read names from standard input
  -L, --max-line-length  print the length of the longest line
  -w, --words            print the word counts
      --help     display this help and exit
      --version  output version information and exit

Report wc bugs to bug-coreutils@gnu.org
GNU coreutils home page: http://www.gnu.org/software/coreutils/
General help using GNU software: http://www.gnu.org/gethelp/
For complete documentation, run: info coreutils 'wc invocation'
&lt;/console&gt;

&lt;p&gt;So we’re going to try and implement the counting of characters, lines and words
and also make sure that our newly created command line tool can handle a file
piped to its stdin.&lt;/p&gt;

&lt;p&gt;The first thing required to write a command line tool is a command line
parsing library. I chose CmdArgs since I found it worked really well and
let me define the help menu and summary quite easily. You can
read up on this library &lt;a href=&quot;http://community.haskell.org/~ndm/cmdargs/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;CmdArgs&lt;/strong&gt; library is really great at expressing what arguments are
available and what each of them does. You first have to define the datatype
that identifies the various arguments and their types, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kr&quot;&gt;deriving&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Typeable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;So the above just defines that we have 3 flags that can be used and each of them
is either present or not (i.e. a Bool type). The underscore following the name is
used whenever you have a name collision, such as the fact that &lt;strong&gt;lines&lt;/strong&gt;
and &lt;strong&gt;words&lt;/strong&gt; are both functions that already exist in &lt;strong&gt;Haskell&lt;/strong&gt;. CmdArgs
will automatically strip the underscore when parsing the command line.&lt;/p&gt;
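
&lt;p&gt;Deriving &lt;strong&gt;Data&lt;/strong&gt; needs the DeriveDataTypeable extension in GHC. The sketch
below uses only the base library to show what that deriving clause makes
available: the field names can be read back from a value at runtime. My
assumption is that this kind of generic inspection is what lets a library like
CmdArgs turn field names into flags. The name &lt;strong&gt;fieldNames&lt;/strong&gt; is made up for this
example:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, Typeable, constrFields, toConstr)

data WC = WC { chars :: Bool, lines_ :: Bool, words_ :: Bool }
    deriving (Show, Data, Typeable)

-- Field names recovered generically from a value of the record.
fieldNames :: [String]
fieldNames = constrFields (toConstr (WC False False False))

main :: IO ()
main = do
  print fieldNames                      -- [&quot;chars&quot;,&quot;lines_&quot;,&quot;words_&quot;]
  print (lines_ (WC False True False))  -- True&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Each field is also an ordinary accessor function, which is why the names
&lt;strong&gt;lines&lt;/strong&gt; and &lt;strong&gt;words&lt;/strong&gt; would clash with the Prelude without the underscore.&lt;/p&gt;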

&lt;p&gt;We now have to instantiate our WC data type and fill in the required information
on what each option does as well as give some additional information on what
the tool does and how to use it. Here is how this is done in &lt;strong&gt;Haskell&lt;/strong&gt; when
using &lt;strong&gt;CmdArgs&lt;/strong&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;m&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the byte counts&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the character counts&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the word counts&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Print newline, word, and byte counts for each FILE, &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;and a total line if more than one FILE is specified.&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot; With no FILE, or when FILE is -, read standard &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;input.  A word is a non-zero-length sequence of &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;characters delimited by white space.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;wc v0.0.1, (C) Rodney Gomes&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above uses the &lt;strong&gt;&amp;amp;=&lt;/strong&gt; operator to annotate the chars field
with additional information such as the name of the option and the help to
show when displaying this option. The whole &lt;strong&gt;WC&lt;/strong&gt; record is annotated with
the &lt;strong&gt;help&lt;/strong&gt; and &lt;strong&gt;summary&lt;/strong&gt;, which are then used to generate the help menu
like so:&lt;/p&gt;

&lt;console&gt;
wc v0.0.1, (C) Rodney Gomes

wc [OPTIONS]
  Print newline, word, and byte counts for each FILE, and a total line if more
  than one FILE is specified. With no FILE, or when FILE is -, read standard
  input.  A word is a non-zero-length sequence of characters delimited by white
  space.

Common flags:
  -m --chars    print the byte counts
  -l --lines    print the character counts
  -w --words    print the word counts
  -? --help     Display help message
  -V --version  Print version information
&lt;/console&gt;

&lt;p&gt;There you have your argument parsing and menu printing all in less than 15
lines of &lt;strong&gt;Haskell&lt;/strong&gt;. The next bit we’re going to focus on is how we actually
count lines, words and characters using &lt;strong&gt;Haskell&lt;/strong&gt;. Instead of writing functions
that calculate the number of lines/words in a &lt;strong&gt;String&lt;/strong&gt;, we can look
through the &lt;strong&gt;Prelude&lt;/strong&gt; module and find two functions that do the
trick: &lt;strong&gt;lines&lt;/strong&gt; and &lt;strong&gt;words&lt;/strong&gt;. Using the already familiar function
composition we can write:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
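
&lt;p&gt;To see the two compositions at work, here is a small self-contained sketch.
The type signatures and the &lt;strong&gt;sample&lt;/strong&gt; string are additions for illustration
only:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;countlines :: String -&amp;gt; Int
countlines = length . lines

countwords :: String -&amp;gt; Int
countwords = length . words

-- Hypothetical input: two lines, six words.
sample :: String
sample = &quot;the quick brown fox\njumps over\n&quot;

main :: IO ()
main = do
  print (countlines sample)  -- 2
  print (countwords sample)  -- 6&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Reading right to left, &lt;strong&gt;lines&lt;/strong&gt; splits the string into a list and
&lt;strong&gt;length&lt;/strong&gt; counts the elements of that list.&lt;/p&gt;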

&lt;p&gt;Now creating a main function in Haskell requires you to understand a few things
about the way &lt;strong&gt;IO&lt;/strong&gt; is handled. As mentioned earlier in my posts, &lt;strong&gt;Haskell&lt;/strong&gt;
is a purely functional language, and things such as &lt;strong&gt;IO&lt;/strong&gt; are
basically side effects within your function. &lt;strong&gt;Haskell&lt;/strong&gt; handles this by
expressing side effects as a &lt;strong&gt;Monad&lt;/strong&gt;. Now I won’t go into all of
the details of a &lt;strong&gt;Monad&lt;/strong&gt; in this post and I suggest you read up on Monads with
these few links:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;http://mvanier.livejournal.com/3917.html&quot;&gt;Yet Another Monad Tutorial&lt;/a&gt; is a
great source of information, but it’s very detailed and runs over the course of a
few posts.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf&quot;&gt;Monads for functional programming&lt;/a&gt;,
Philip Wadler’s influential paper on how to use &lt;strong&gt;Monads&lt;/strong&gt; within a
functional language so you can do impure actions within a pure function.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The quick and dirty introduction to &lt;strong&gt;Monads&lt;/strong&gt; is that they’re used to do a few
things that we take for granted in imperative languages, such as
exceptions, state and output. The &lt;strong&gt;main&lt;/strong&gt; function has the type &lt;strong&gt;IO ()&lt;/strong&gt;,
which just means that this function performs IO and returns the unit type
&lt;strong&gt;()&lt;/strong&gt;, which I usually view as the &lt;strong&gt;void&lt;/strong&gt; of Haskell. Let’s look at a simple
program that prints ‘Hello World’:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Hello World&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The function &lt;strong&gt;putStrLn&lt;/strong&gt; of course has a type of&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;IO&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;()&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The other thing to introduce is the &lt;strong&gt;do&lt;/strong&gt; notation, which is heavily used
alongside &lt;strong&gt;Monads&lt;/strong&gt; because it better expresses the imperative sense of monadic
actions. It’s not the only way to express monadic actions but it seems to be the
easiest to start with for developers coming from the imperative world. I would
read up on the other available options for chaining monadic actions
because you’ll find that &lt;strong&gt;Haskell&lt;/strong&gt; has very elegant ways of handling this.
The notation itself allows you to execute separate statements whose results are not
necessarily used when generating the output of your function. For example:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;do&lt;/span&gt;
         &lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Input name:&quot;&lt;/span&gt;
         &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getLine&lt;/span&gt;
         &lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Hello there &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;There you can see some Haskell code that looks very much like an imperative
program. You don’t have to write things that way, though: you can be more
functional when writing code in &lt;strong&gt;Haskell&lt;/strong&gt; and do the same like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Input name:&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getLine&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Hello there &quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;There are a few new functions being used here, which you may already know if you read
the tutorials on &lt;strong&gt;Monads&lt;/strong&gt; I previously mentioned, but they’re not hard to
understand. The first is &lt;strong&gt;&amp;gt;&amp;gt;&lt;/strong&gt;, which has the signature:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Monad&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;It simply sequences two monadic actions and returns the result of the second,
ignoring the result of the first. This was necessary since
we wanted to print the “Input name:” string but didn’t care for its return value. The
&lt;strong&gt;&amp;gt;&amp;gt;=&lt;/strong&gt; function (usually called bind), on the other hand, feeds the result of one
monadic action into a function that produces the next action. Its signature is:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Monad&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The signature shows what the function does: it takes an action producing an &lt;strong&gt;a&lt;/strong&gt; and a
function from &lt;strong&gt;a&lt;/strong&gt; to the next action, and chains them together.
The only other magic done in that one line was to use the &lt;strong&gt;++&lt;/strong&gt; operator in prefix
form, partially applied to the “Hello there ” string, so that the greeting is prepended to the
line returned by &lt;strong&gt;getLine&lt;/strong&gt;. I really prefer not using the &lt;strong&gt;do&lt;/strong&gt; notation when possible just because
it’s not as functionally elegant as other available options. I think it’s a matter
of taste and you’ll find what makes more sense to use in different situations.&lt;/p&gt;
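
&lt;p&gt;To make the link between the two styles concrete, here is a minimal sketch of the same program with the name binding written as an explicit lambda. The names &lt;strong&gt;greet&lt;/strong&gt; and &lt;strong&gt;askName&lt;/strong&gt; are made up for this illustration and are not part of the program above. The point is only that a name bound in &lt;strong&gt;do&lt;/strong&gt; notation becomes the argument of the function on the right of &lt;strong&gt;&amp;gt;&amp;gt;=&lt;/strong&gt;:&lt;/p&gt;

```haskell
-- Pure part (hypothetical helper): build the greeting. No IO is
-- involved, so it can be checked on its own.
greet :: String -> String
greet name = "Hello there " ++ name

-- Sequence the prompt with (>>), then hand the line that was read to a
-- lambda with (>>=). A do block that binds `name` boils down to this.
askName :: IO ()
askName =
    putStrLn "Input name:" >>
    (getLine >>= \name -> putStrLn (greet name))
```

&lt;p&gt;Loading this into ghci and evaluating &lt;strong&gt;askName&lt;/strong&gt; should prompt for a name and then greet it. Keeping &lt;strong&gt;greet&lt;/strong&gt; pure is a design choice for the sketch, so the string handling can be tried without any input.&lt;/p&gt;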

&lt;p&gt;So the only thing missing is the ability to handle standard
input in our program. In the &lt;strong&gt;Prelude&lt;/strong&gt; there is a function
that lets us read all of stdin and write directly to stdout. This is
the &lt;strong&gt;interact&lt;/strong&gt; function:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;interact&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;IO&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;()&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This function will take your &lt;strong&gt;String-&amp;gt;String&lt;/strong&gt; function and do the required
&lt;strong&gt;IO ()&lt;/strong&gt; work. The input to your function is the whole of the standard input
and what you return is the whole of what you want printed on
the screen (i.e. standard output). So you can apply the &lt;strong&gt;countlines&lt;/strong&gt;
function like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;module&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Main&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interact&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I just introduced the module declaration line to also show how to correctly
define your module and tell the GHC compiler which function is your main
entry point. We had to use the &lt;strong&gt;show&lt;/strong&gt; function to convert the number
of lines counted by our countlines function into a &lt;strong&gt;String&lt;/strong&gt;. With the above you
should be able to run commands such as:&lt;/p&gt;

&lt;console&gt;
$ cat test.hs | runhaskell test.hs
5
&lt;/console&gt;
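
&lt;p&gt;Because &lt;strong&gt;interact&lt;/strong&gt; only needs a pure &lt;strong&gt;String-&amp;gt;String&lt;/strong&gt; function, the counting logic can be tried on sample strings without touching stdin at all. The sketch below assumes two hypothetical names, &lt;strong&gt;countlinesLn&lt;/strong&gt; and &lt;strong&gt;run&lt;/strong&gt;, that are not in the program above. The first appends a newline, since &lt;strong&gt;interact&lt;/strong&gt; prints exactly the string it is given:&lt;/p&gt;

```haskell
-- The function handed to interact is an ordinary pure function,
-- so it can be exercised directly on sample input.
countlines :: String -> String
countlines = show . length . lines

-- Hypothetical variant: interact prints exactly the returned String,
-- so append a newline to keep the shell prompt on its own line.
countlinesLn :: String -> String
countlinesLn input = countlines input ++ "\n"

-- Hypothetical entry point using the newline-terminated version.
run :: IO ()
run = interact countlinesLn
```

&lt;p&gt;In ghci, evaluating &lt;strong&gt;countlines&lt;/strong&gt; on a literal string is a quick way to check the logic before wiring it up to &lt;strong&gt;interact&lt;/strong&gt;.&lt;/p&gt;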

&lt;p&gt;With all that we’ve gone over at this point you should be able to write up the
&lt;strong&gt;wc&lt;/strong&gt; command line tool and it may look something like this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;o&quot;&gt;#!&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runhaskell&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;{-# LANGUAGE DeriveDataTypeable #-}&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;module&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Main&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;System.Console.CmdArgs&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Control.Arrow&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kr&quot;&gt;deriving&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Typeable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;m&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the character counts&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the newline counts&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;print the word counts&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Print newline, word, and byte counts for each FILE, &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;and a total line if more than one FILE is specified.&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot; With no FILE, or when FILE is -, read standard &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;input.  A word is a non-zero-length sequence of &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                 &lt;span class=&quot;s&quot;&gt;&quot;characters delimited by white space.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;wc v0.0.1, (C) Rodney Gomes&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;flat&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chars&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lines_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;WC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countlines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countwords&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;countchars&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmdArgs&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interact&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optionHandler&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;That is our implementation of the &lt;strong&gt;wc&lt;/strong&gt; command line tool (minus the
counting of bytes and some of the extended options). There are a few more things
introduced in this implementation that weren’t covered before, such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/strong&gt; operator, also called the ‘fanout’ operator, which has the
signature &lt;strong&gt;&lt;em&gt;(&amp;amp;&amp;amp;&amp;amp;) :: Arrow a =&amp;gt; a b c -&amp;gt; a b c' -&amp;gt; a b (c, c')&lt;/em&gt;&lt;/strong&gt;. For plain functions it
applies two functions to the same argument and returns a tuple of
their results. Chaining it twice gives a nested tuple, so we created a
&lt;strong&gt;flat&lt;/strong&gt; function to flatten the result of the &lt;strong&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/strong&gt;
operators into a single string.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;records pattern matching&lt;/strong&gt; - we had introduced the records notation, which
gives you the chars, lines_ and words_ accessor
functions for the &lt;strong&gt;WC&lt;/strong&gt; datatype. When pattern
matching you can use the same field names to match on exactly the field you’re looking
for, and that’s what we’ve done above in the &lt;strong&gt;optionHandler&lt;/strong&gt; function.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
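
&lt;p&gt;The two ideas in the list above can be sketched in isolation. This is only an illustration under a few assumptions: &lt;strong&gt;fanout&lt;/strong&gt; is a hand-written stand-in for what &lt;strong&gt;&amp;amp;&amp;amp;&amp;amp;&lt;/strong&gt; does when its arguments are plain functions, and &lt;strong&gt;Options&lt;/strong&gt;, &lt;strong&gt;counts&lt;/strong&gt; and &lt;strong&gt;describe&lt;/strong&gt; are hypothetical names that do not appear in the wc program:&lt;/p&gt;

```haskell
-- Hand-written stand-in for the fanout operator, specialised to plain
-- functions: feed one argument to both functions and pair the results.
fanout :: (a -> b) -> (a -> c) -> a -> (b, c)
fanout f g x = (f x, g x)

-- Nesting one fanout inside another yields a nested pair, which is
-- why a helper is needed to flatten it back into one String.
counts :: String -> (Int, (Int, Int))
counts = fanout (length . lines) (fanout (length . words) length)

flat :: (Int, (Int, Int)) -> String
flat (l, (w, c)) = unwords (map show [l, w, c])

-- Record pattern matching: mention only the field you care about.
-- Options is a hypothetical record mirroring the shape of WC.
data Options = Options { chars :: Bool, lines_ :: Bool, words_ :: Bool }

describe :: Options -> String
describe Options{chars = True}  = "chars"
describe Options{lines_ = True} = "lines"
describe Options{words_ = True} = "words"
describe _                      = "all"
```

&lt;p&gt;Note that the clauses of &lt;strong&gt;describe&lt;/strong&gt; are tried top to bottom, so if several fields are set only the first matching clause wins.&lt;/p&gt;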

&lt;p&gt;The most astounding thing about the above piece of code is how many lines we’ve
actually had to write. The above code is exactly 34 lines of code (in its
current incarnation) and the source for the current C implementation of the
&lt;strong&gt;wc&lt;/strong&gt; command line tool in the source of my current &lt;strong&gt;Ubuntu Oneiric&lt;/strong&gt;
installation is over 700 lines of code.&lt;/p&gt;

&lt;p&gt;Of course lines of code is not a true way to compare software quality between
different languages, but it is a measure of complexity. The biggest difference
here is how easy it is to read this code vs reading the same program written
in C. Just looking at the code above you can see how easy it is to add more
functionality and also how easy it is to read the program.&lt;/p&gt;

&lt;p&gt;In the next post we’re going to actually analyze the performance of the &lt;strong&gt;wc&lt;/strong&gt;
command we wrote and see how close we can get to the performance of the C
implementation of the wc command.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Advanced Typeclasses</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/11/08/22.30-Advanced-Typeclasses.html"/>
   <updated>2011-11-08T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/11/08/22.30-Advanced-Typeclasses</id>
   <content type="html">&lt;p&gt;Let’s go back a bit to the typeclass topic and take a closer look at a few
important typeclasses that allow you to do a few interesting things. We’ll start
with the &lt;strong&gt;Functor&lt;/strong&gt; typeclass, which has the following definition:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Functor&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fmap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We’ve introduced a new notation in our signature with the &lt;strong&gt;f a&lt;/strong&gt;: in this
simple signature we’re using a type constructor variable instead of a
concrete type constructor such as &lt;strong&gt;Maybe&lt;/strong&gt; or, from our previous post, &lt;strong&gt;Tree&lt;/strong&gt;. So
when we create an instance of &lt;strong&gt;Functor&lt;/strong&gt; for our type we’re basically telling any function
that wants to use the &lt;strong&gt;Functor&lt;/strong&gt; how to apply a function to the values inside our type and rebuild it. We can
see how easy it is to implement a &lt;strong&gt;Functor&lt;/strong&gt; instance for lists, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Functor&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fmap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Let’s define the instance of &lt;strong&gt;Functor&lt;/strong&gt; for the &lt;strong&gt;Tree&lt;/strong&gt; data type we defined
in the previous post:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Functor&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fmap&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fmap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fmap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fmap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With the above we can now easily apply higher order functions to our &lt;strong&gt;Tree&lt;/strong&gt;
instances like so:&lt;/p&gt;

&lt;console&gt;
Main&amp;gt; fmap (+2) sampleTree
3
    4
        .
        .
    5
        6
            .
            .
        .
&lt;/console&gt;
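
&lt;p&gt;For readers without the previous post at hand, here is a self-contained sketch of the same idea. The &lt;strong&gt;Tree&lt;/strong&gt; definition is an assumption reconstructed from the constructors used in the instance, it derives &lt;strong&gt;Show&lt;/strong&gt; rather than using the indented printer seen in the console output, and &lt;strong&gt;small&lt;/strong&gt; and &lt;strong&gt;bumped&lt;/strong&gt; are hypothetical names:&lt;/p&gt;

```haskell
-- Assumed shape of the Tree type from the previous post.
data Tree a = Leaf | Node a (Tree a) (Tree a)
    deriving (Show, Eq)

-- The instance names the type constructor Tree, not Tree a, because
-- Functor ranges over types that still expect one type argument.
instance Functor Tree where
    fmap _ Leaf         = Leaf
    fmap f (Node a l r) = Node (f a) (fmap f l) (fmap f r)

-- A small hypothetical tree to try the instance on.
small :: Tree Int
small = Node 1 (Node 2 Leaf Leaf) Leaf

-- fmap changes the values but keeps the shape of the tree.
bumped :: Tree Int
bumped = fmap (+2) small
```

&lt;p&gt;Mapping &lt;strong&gt;id&lt;/strong&gt; over a tree should give back the same tree, which is a handy sanity check for any &lt;strong&gt;Functor&lt;/strong&gt; instance you write.&lt;/p&gt;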

&lt;p&gt;This is where I believe you should really start to see the beauty in the way
that Haskell works with your data types, letting you easily modify and
transform them. There are a few other typeclasses to look at, which include
&lt;strong&gt;Foldable&lt;/strong&gt;, &lt;strong&gt;Traversable&lt;/strong&gt;, etc.&lt;/p&gt;

&lt;p&gt;We’re now going to have a look at the &lt;strong&gt;Show&lt;/strong&gt; and &lt;strong&gt;Read&lt;/strong&gt; typeclasses. We’ve
already seen the &lt;strong&gt;Show&lt;/strong&gt; typeclass but we want to show its complete definition
here and understand a bit better how it works with the &lt;strong&gt;Read&lt;/strong&gt; typeclass.
So here’s the full definition of the Show typeclass:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;class&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;  &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;showsPrec&lt;/span&gt;        &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ShowS&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;showList&lt;/span&gt;         &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ShowS&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- Minimal complete definition:&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- show or showsPrec&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;showsPrec&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;        &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showsPrec&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;showList&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showString&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;[]&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;showList&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showChar&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'['&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shows&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showl&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;
                        &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showl&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showChar&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;']'&lt;/span&gt;
                              &lt;span class=&quot;n&quot;&gt;showl&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showChar&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;','&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shows&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;
                                             &lt;span class=&quot;n&quot;&gt;showl&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As we did before, we only have to implement the &lt;strong&gt;show&lt;/strong&gt; function (I don’t want to
get into the &lt;strong&gt;showsPrec&lt;/strong&gt; function at this stage). We’ll create a new binary
tree data type and define a very simple &lt;strong&gt;Show&lt;/strong&gt; instance for it. Here’s
what we’re looking at:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Branch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;showBTree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;showBTree&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;showBTree&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Branch&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;(&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showBTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;|&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showBTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;)&quot;&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showBTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;sampleTree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Branch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Branch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;So when we do a show of the sampleTree we get the following output:&lt;/p&gt;

&lt;console&gt;
((3|2)|1)
&lt;/console&gt;

&lt;p&gt;Now we have to write the &lt;strong&gt;Read&lt;/strong&gt; instance so that we can easily parse the
string form of a &lt;strong&gt;BTree&lt;/strong&gt; back into a &lt;strong&gt;BTree&lt;/strong&gt; value that we can then process
with the various functions that we’ll eventually write. The &lt;strong&gt;Read&lt;/strong&gt; class
itself has the following definition:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;class&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;Read&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;  &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;readsPrec&lt;/span&gt;        &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ReadS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;readList&lt;/span&gt;         &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ReadS&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- Minimal complete definition:&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- readsPrec&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;readList&lt;/span&gt;         &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readParen&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;False&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;[&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lex&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                                    &lt;span class=&quot;n&quot;&gt;pr&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readl&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
                       &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readl&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;]&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lex&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                                        &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;u&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                                    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;u&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readl'&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
                             &lt;span class=&quot;n&quot;&gt;readl'&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;]&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lex&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
                                        &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;,&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lex&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                                    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;u&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                                    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readl'&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;u&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As before, we only need to implement the &lt;strong&gt;readsPrec&lt;/strong&gt; function to have a working
&lt;strong&gt;Read&lt;/strong&gt; implementation. So let’s also have a look at the &lt;strong&gt;ReadS&lt;/strong&gt; type:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ReadS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This type describes what a parser does: given a string, it returns a list of
pairs, each holding a parsed value and the rest of the string that wasn’t
consumed. This lets your &lt;strong&gt;Read&lt;/strong&gt; implementation parse only the part it
understands and hand the remainder back to the caller. An empty list means that
nothing could be parsed.&lt;/p&gt;
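
&lt;p&gt;To make the shape of &lt;strong&gt;ReadS&lt;/strong&gt; concrete, here is a minimal sketch that uses the
Prelude’s &lt;strong&gt;reads&lt;/strong&gt; on plain integers. The names &lt;strong&gt;parseInt&lt;/strong&gt;, &lt;strong&gt;okExample&lt;/strong&gt; and
&lt;strong&gt;badExample&lt;/strong&gt; are made up for this illustration:&lt;/p&gt;

```haskell
-- reads has type Read a => ReadS a, i.e. String -> [(a, String)].
-- parseInt is a hypothetical helper that pins the result type to Int.
parseInt :: String -> [(Int, String)]
parseInt = reads

-- One successful parse: the Int plus the text that was left over.
okExample :: [(Int, String)]
okExample = parseInt "42 rest"   -- [(42," rest")]

-- No valid parse at the start of the input: the result list is empty.
badExample :: [(Int, String)]
badExample = parseInt "rest 42"  -- []
```

&lt;p&gt;The leftover string is what allows parsers to be chained: the next parser simply
continues from wherever the previous one stopped.&lt;/p&gt;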

&lt;p&gt;Before we implement the &lt;strong&gt;Read&lt;/strong&gt; instance for &lt;strong&gt;BTree&lt;/strong&gt;s, let’s talk about
list comprehension and how to use it. List comprehension is another feature in
Haskell that is implemented quite elegantly. List comprehension has the
following syntax:&lt;/p&gt;

&lt;console&gt;
[ expr | qualifier0 , ... , qualifierN ]
&lt;/console&gt;

&lt;p&gt;In which the expr defines the way the elements are composed within the list
being generated and the qualifiers identify which elements are to be in the
resulting list. So lets for example construct a list of all of the odd numbers
from 1 to 100:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;odd100&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;odd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
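
&lt;p&gt;A comprehension with one generator and one guard behaves roughly like a
&lt;strong&gt;filter&lt;/strong&gt; followed by a &lt;strong&gt;map&lt;/strong&gt;. As a small sketch (the names &lt;strong&gt;odd100'&lt;/strong&gt; and
&lt;strong&gt;oddSquares&lt;/strong&gt; are made up for this illustration), the same kind of list can be
built with plain Prelude functions:&lt;/p&gt;

```haskell
-- Same list as the odd100 comprehension, built with filter:
-- the generator is [1..100] and the guard is odd.
odd100' :: [Integer]
odd100' = filter odd [1 .. 100]

-- Here the expr part does some work too: square every odd number
-- drawn from [1..10].
oddSquares :: [Integer]
oddSquares = map (\i -> i * i) (filter odd [1 .. 10])  -- [1,9,25,49,81]
```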

&lt;p&gt;Now here’s how we could start putting together our &lt;strong&gt;readsTree&lt;/strong&gt; function:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;readsTree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Read&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;ReadS&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;readsTree&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;'('&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Branch&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;u&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;l&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'|'&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readsTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;')'&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;u&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readsTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;readsTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Read&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Read&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;BTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;readsPrec&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readsTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above function can be a lot to take in all at once, but if you take a
closer look it’s really not doing much more than stating that if a string starts
with the &lt;strong&gt;'('&lt;/strong&gt; symbol then a &lt;strong&gt;Branch l r&lt;/strong&gt; can be parsed from it.
The &lt;strong&gt;l&lt;/strong&gt; comes from &lt;strong&gt;readsTree s&lt;/strong&gt;, whose leftover string must start
with &lt;strong&gt;'|'&lt;/strong&gt; and then continue with &lt;strong&gt;t&lt;/strong&gt;. Parsing &lt;strong&gt;t&lt;/strong&gt; in turn returns
the &lt;strong&gt;r&lt;/strong&gt; side of the &lt;strong&gt;Branch&lt;/strong&gt; and must leave you with at least the &lt;strong&gt;')'&lt;/strong&gt;
symbol followed by the rest of the string, &lt;strong&gt;u&lt;/strong&gt; (which may be empty). The last pattern
of the function assumes everything else is a &lt;strong&gt;Leaf x&lt;/strong&gt;, where &lt;strong&gt;x&lt;/strong&gt; comes
from &lt;strong&gt;reads&lt;/strong&gt; of &lt;strong&gt;s&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This can now easily be used to parse a string back into the &lt;strong&gt;BTree&lt;/strong&gt; type,
like so:&lt;/p&gt;

&lt;console&gt;
Main&amp;gt; (reads &quot;(1|(2|3))blah&quot;)::[(BTree Int,String)]
[((1|(2|3)),&quot;blah&quot;)]
&lt;/console&gt;

&lt;p&gt;We do have to tell &lt;strong&gt;Haskell&lt;/strong&gt; what the type of the result is because otherwise
Haskell can’t deduce whether it’s a binary tree of integers or a binary tree of
strings.&lt;/p&gt;

&lt;p&gt;By now I would hope you’d be able to write &lt;strong&gt;Read&lt;/strong&gt; and &lt;strong&gt;Show&lt;/strong&gt; instances for
any of your own new datatypes and be able to have Haskell really work for you.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Higher order functions in Haskell</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/11/01/23.00-Higher-order-functions-in-Haskell.html"/>
   <updated>2011-11-01T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/11/01/23.00-Higher-order-functions-in-Haskell</id>
   <content type="html">&lt;p&gt;From Haskell’s documentation: “A higher-order function is a function that takes
other functions as arguments or returns a function as result.” With this in
mind, let’s start by writing a few functions over lists:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;addone&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Num&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;addone&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;addone&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addone&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;divlist&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Fractional&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;divlist&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;divlist&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;divlist&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We could add a few others, but the idea here is that we have to write a new
function every time we want to traverse a list of elements and apply a function
to these elements while recreating the list in the same order. What we really
need is a function with the following signature:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;applyf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;You can see that the function takes a function that transforms an &lt;strong&gt;a&lt;/strong&gt; into a &lt;strong&gt;b&lt;/strong&gt;, and
a list of &lt;strong&gt;a&lt;/strong&gt;’s, to create a list of &lt;strong&gt;b&lt;/strong&gt;’s. We can even define the &lt;strong&gt;applyf&lt;/strong&gt;
function quite easily as:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;applyf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;applyf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;applyf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;applyf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
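
&lt;p&gt;As a quick illustration, here is &lt;strong&gt;applyf&lt;/strong&gt; used once with a function that keeps
the element type and once with one that changes it. The names &lt;strong&gt;doubled&lt;/strong&gt; and
&lt;strong&gt;lengths&lt;/strong&gt; are made up for this sketch:&lt;/p&gt;

```haskell
applyf :: (a -> b) -> [a] -> [b]
applyf _ [] = []
applyf f (x:xs) = f x : applyf f xs

-- a and b are both Int here.
doubled :: [Int]
doubled = applyf (*2) [1, 2, 3]           -- [2,4,6]

-- a is String and b is Int: the element type changes.
lengths :: [Int]
lengths = applyf length ["a", "bcd", ""]  -- [1,3,0]
```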

&lt;p&gt;There actually already exists an &lt;strong&gt;applyf&lt;/strong&gt; function, called &lt;strong&gt;map&lt;/strong&gt;, in the
Haskell Prelude library. Let’s show how to quickly redefine the &lt;strong&gt;addone&lt;/strong&gt; and
&lt;strong&gt;divlist&lt;/strong&gt; functions using &lt;strong&gt;map&lt;/strong&gt; (note that this &lt;strong&gt;divlist&lt;/strong&gt; takes the divisor first and the list second):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;addone&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;divlist&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
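
&lt;p&gt;Here is a short sketch of the &lt;strong&gt;map&lt;/strong&gt; based definitions in use. The type
signatures are my own addition (they are not required, but they document the
argument order), and &lt;strong&gt;incremented&lt;/strong&gt; and &lt;strong&gt;halved&lt;/strong&gt; are made-up names:&lt;/p&gt;

```haskell
addone :: Num a => [a] -> [a]
addone = map (+1)

-- The divisor comes first and the list second, which is what makes
-- the partial application of map possible.
divlist :: Fractional a => a -> [a] -> [a]
divlist n = map (/n)

incremented :: [Int]
incremented = addone [1, 2, 3]  -- [2,3,4]

halved :: [Double]
halved = divlist 2 [2, 4, 6]    -- [1.0,2.0,3.0]
```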

&lt;p&gt;You can quickly see how to use &lt;strong&gt;map&lt;/strong&gt; to apply any function to a
list of items and get back the list of the results. We can now start to
introduce a few other higher order functions that are in the Prelude library:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;filter&lt;/strong&gt; - the filter function takes a boolean function as an argument
and removes all elements from the input list that do not satisfy the boolean
function. This is very useful when you want to quickly filter out elements
based on a simple boolean function, such as filtering a list of tuples in which
the second element is the age and we want the list to contain only the teenagers:&lt;/li&gt;
&lt;/ul&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;filterTeens&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;snd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;snd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
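
&lt;p&gt;The sketch below is a variant of &lt;strong&gt;filterTeens&lt;/strong&gt; that pattern matches on the tuple
instead of calling &lt;strong&gt;snd&lt;/strong&gt;. It assumes the tuples are (name, age) pairs with whole
number ages, which the original does not spell out; &lt;strong&gt;people&lt;/strong&gt; and &lt;strong&gt;teens&lt;/strong&gt; are
made-up sample values:&lt;/p&gt;

```haskell
-- With whole-number ages, "greater than 12 and under 20" is the
-- same as being one of 13..19.
filterTeens :: [(String, Int)] -> [(String, Int)]
filterTeens = filter (\(_, age) -> age `elem` [13 .. 19])

people :: [(String, Int)]
people = [("Ann", 12), ("Bob", 13), ("Cy", 19), ("Dee", 20)]

-- Only the entries whose age passes the test survive, in order.
teens :: [(String, Int)]
teens = filterTeens people  -- [("Bob",13),("Cy",19)]
```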

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;foldl&lt;/strong&gt; - reduces a list of elements down to a single value by repeatedly applying a
function you supply, beginning with a starting value. You can use this to quickly sum up
a list of values or even concatenate a list of strings. Here are a few
examples:&lt;/li&gt;
&lt;/ul&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;sumlist&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;foldl&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;concatlist&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;foldl&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
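
&lt;p&gt;Here is a small sketch of both folds in use. The explicit type signatures are my
own addition: without them GHC’s monomorphism restriction can reject point-free
definitions like these. &lt;strong&gt;total&lt;/strong&gt; and &lt;strong&gt;joined&lt;/strong&gt; are made-up names:&lt;/p&gt;

```haskell
sumlist :: Num a => [a] -> a
sumlist = foldl (+) 0

concatlist :: [String] -> String
concatlist = foldl (++) ""

-- (((0 + 1) + 2) + 3) + 4
total :: Int
total = sumlist [1, 2, 3, 4]            -- 10

-- (("" ++ "fold") ++ "l") ++ "!"
joined :: String
joined = concatlist ["fold", "l", "!"]  -- "foldl!"
```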

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;foldr&lt;/strong&gt; - same as foldl but starts the “folding” from the end of the list,
which can produce a completely different result when the folding operation is
not associative (subtraction, for example).&lt;/li&gt;
&lt;/ul&gt;
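
&lt;p&gt;To see how the direction of the fold changes the result, here is a minimal
sketch using subtraction, which is not associative. The names &lt;strong&gt;leftFold&lt;/strong&gt; and
&lt;strong&gt;rightFold&lt;/strong&gt; are just illustrative:&lt;/p&gt;

```haskell
-- foldl groups the operations from the left of the list.
leftFold :: Int
leftFold = foldl (-) 0 [1, 2, 3]   -- ((0 - 1) - 2) - 3 = -6

-- foldr groups the operations from the right of the list.
rightFold :: Int
rightFold = foldr (-) 0 [1, 2, 3]  -- 1 - (2 - (3 - 0)) = 2
```

&lt;p&gt;With an associative operation such as (+) both folds would give the same
answer, which is why the sum example works with either one.&lt;/p&gt;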

&lt;p&gt;There are many other higher order functions that are extremely useful when
you want to apply transformations to lists. These functions can also be
generalized to other structures such as the &lt;strong&gt;Tree&lt;/strong&gt; data type that we introduced in
the previous post. We’ll leave the creation of a map and a filter for &lt;strong&gt;Tree&lt;/strong&gt; as
an exercise for those who are interested in testing out their skills.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Custom data types in Haskell</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/10/31/20.00-Custom-data-types-in-Haskell.html"/>
   <updated>2011-10-31T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/10/31/20.00-Custom-data-types-in-Haskell</id>
   <content type="html">&lt;p&gt;Hopefully after the first post in this series you can quickly write up simple
functions over existing Haskell types and also understand basic polymorphism as
well as function composition. We’ll now start digging into creating custom types
and how to handle these in our functions. Let’s start by creating a Tree type:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;When defining elements of the type above you can do so in the following manner:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;sampleTree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Which represents the Tree:&lt;/p&gt;

&lt;pre&gt;
      2
     / \
    /   \
   1     3
  / \   / \
 *   * *   *
&lt;/pre&gt;

&lt;p&gt;We now want to write a function that calculates the maximum depth of a Tree.
When we write this function we must not forget to handle every constructor of
the data type Tree, which means handling both the Leaf case and the
Node x y z case, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;maxdepth&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;maxdepth&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;maxdepth&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxdepth&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxdepth&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As you can see, pattern matching on your custom data type is extremely easy to
read and is just like doing so with any built-in type. We introduced the use of
the _ wildcard, which is how you match a value that you don’t need to use
in your function’s calculations. We’re using the &lt;strong&gt;max&lt;/strong&gt; function from
&lt;strong&gt;Prelude&lt;/strong&gt; to pick the larger of the depths of the left and right subtrees of
a &lt;strong&gt;Tree&lt;/strong&gt;.&lt;/p&gt;
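
&lt;p&gt;As a rough illustration, here is how the evaluation unfolds for the sample tree
drawn earlier. The name &lt;strong&gt;sampleDepth&lt;/strong&gt; and the &lt;strong&gt;Tree Int&lt;/strong&gt; annotation are my
own additions so that the snippet stands on its own:&lt;/p&gt;

```haskell
data Tree a = Leaf | Node a (Tree a) (Tree a)

sampleTree :: Tree Int
sampleTree = Node 2 (Node 1 Leaf Leaf) (Node 3 Leaf Leaf)

maxdepth :: Tree a -> Int
maxdepth Leaf = 0
maxdepth (Node _ l r) = 1 + max (maxdepth l) (maxdepth r)

-- maxdepth sampleTree
--   = 1 + max (maxdepth (Node 1 Leaf Leaf)) (maxdepth (Node 3 Leaf Leaf))
--   = 1 + max (1 + max 0 0) (1 + max 0 0)
--   = 1 + max 1 1
--   = 2
sampleDepth :: Int
sampleDepth = maxdepth sampleTree
```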

&lt;p&gt;Let’s look at how to print an existing tree in Haskell. If you
try to just show the current &lt;strong&gt;sampleTree&lt;/strong&gt; you’ll find that the ghci shell will
actually complain that it can’t show this new element:&lt;/p&gt;

&lt;console&gt;
No instance for (Show (Tree Integer))
    arising from a use of `print'
Possible fix: add an instance declaration for (Show (Tree Integer))
In a stmt of an interactive GHCi command: print it
&lt;/console&gt;

&lt;p&gt;The above error is basically trying to tell you that Haskell doesn’t know
how to “show” the data type you just created. One easy thing to do is to just
let Haskell derive a basic representation for your type by adding the following
to the declaration of the type:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;kr&quot;&gt;deriving&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With that you can now get a simple representation of your data like so:&lt;/p&gt;

&lt;console&gt;
Main&amp;gt; sampleTree
Node 2 (Node 1 Leaf Leaf) (Node 3 Leaf Leaf)
&lt;/console&gt;

&lt;p&gt;But you can also define exactly how you’d like to represent your trees. This is
done by defining an instance of the class Show for your datatype and then
defining the function &lt;strong&gt;show&lt;/strong&gt; for your type, something like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;c1&quot;&gt;-- Eq is needed to match on the literal 0 (GHC 7.4 and later)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Eq&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Num&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;showTree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Eq&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Num&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;showTree&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;.&quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;showTree&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;in&lt;/span&gt;
                           &lt;span class=&quot;kr&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;in&lt;/span&gt;
                           &lt;span class=&quot;kr&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;in&lt;/span&gt;
                           &lt;span class=&quot;n&quot;&gt;showc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showr&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;instance&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Show&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;showTree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I’ve introduced here the concept of typeclasses. They are heavily used in
conjunction with polymorphism to constrain the types that can be used with a
given function. Typeclasses may look like class definitions from object-oriented
languages, but they’re more general: a typeclass declares abstract operations,
and every type that wants to be usable by functions requiring that typeclass
must supply its own definitions of them. In the code above you’ll notice how the
&lt;strong&gt;padding&lt;/strong&gt; function expresses that it can accept any &lt;strong&gt;a&lt;/strong&gt; as long as it’s an
instance of the typeclass &lt;strong&gt;Num&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Basically you’re telling Haskell that to &lt;strong&gt;show&lt;/strong&gt; a &lt;strong&gt;Tree a&lt;/strong&gt; type you want
Haskell to represent it in a specific manner, by giving Haskell the
definition of the &lt;strong&gt;show&lt;/strong&gt; function you’d rather use. Now when we
reference our &lt;strong&gt;sampleTree&lt;/strong&gt; we get a more readable representation like so:&lt;/p&gt;

&lt;console&gt;
Main&amp;gt; sampleTree
2
    1
        .
        .
    3
        .
        .
&lt;/console&gt;

&lt;p&gt;If we wanted we could even extend this &lt;strong&gt;show&lt;/strong&gt; implementation to draw a few
ASCII lines and make the tree a bit easier to read. We’ll leave that as an
exercise for the reader.&lt;/p&gt;

&lt;p&gt;Let’s write a more interesting function. We’ll start with
an &lt;strong&gt;insert&lt;/strong&gt; function that takes an element and a Tree and returns the
Tree with the new element inserted, while making sure that the nodes
stay ordered, so that if we traverse the tree in order it will visit the
elements in sorted order:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Ord&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Tree&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Leaf&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
                            &lt;span class=&quot;kr&quot;&gt;then&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;
                            &lt;span class=&quot;kr&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
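
&lt;p&gt;To check that the ordering really holds, here is a small sketch that builds a
tree from a list and reads it back in order. &lt;strong&gt;fromList&lt;/strong&gt;, &lt;strong&gt;toList&lt;/strong&gt; and
&lt;strong&gt;sorted&lt;/strong&gt; are hypothetical helpers that are not part of the original code, and
insert is written with &lt;strong&gt;compare&lt;/strong&gt; here, which makes the same left or right
decision as the if expression above:&lt;/p&gt;

```haskell
data Tree a = Leaf | Node a (Tree a) (Tree a)

-- Smaller elements go left, everything else (including duplicates) goes right.
insert :: (Ord a) => a -> Tree a -> Tree a
insert a Leaf = Node a Leaf Leaf
insert a (Node b l r) = case compare a b of
    LT -> Node b (insert a l) r
    _  -> Node b l (insert a r)

-- Hypothetical helper: insert every element of a list into an empty tree.
fromList :: (Ord a) => [a] -> Tree a
fromList = foldr insert Leaf

-- Hypothetical helper: in-order traversal (left subtree, node, right subtree).
toList :: Tree a -> [a]
toList Leaf = []
toList (Node a l r) = toList l ++ [a] ++ toList r

sorted :: [Int]
sorted = toList (fromList [3, 1, 4, 1, 5])   -- [1,1,3,4,5]
```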

&lt;p&gt;Again we used a typeclass, this time Ord, which is defined as:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;class&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Eq&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Ord&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;  &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compare&lt;/span&gt;              &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Ordering&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Bool&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;min&lt;/span&gt;             &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;-- Minimal complete definition:&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;--      (&amp;lt;=) or compare&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;-- Using compare can be more efficient for complex types.&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compare&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;EQ&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;LT&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherwise&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;GT&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;           &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;compare&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;GT&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;           &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;compare&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;LT&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;           &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;compare&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;LT&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;           &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;compare&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;GT&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- note that (min x y, max x y) = (x,y) or (y,x)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherwise&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;min&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
         &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherwise&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Basically the definition tells you that defining the &lt;strong&gt;compare&lt;/strong&gt; function (or
the (&amp;lt;=) operator) is enough for all of the other functions to be derived from it.&lt;/p&gt;
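
&lt;p&gt;As a minimal sketch of that idea, the snippet below defines only &lt;strong&gt;compare&lt;/strong&gt;
for a made-up &lt;strong&gt;Priority&lt;/strong&gt; type and then relies on the default definitions for
everything else. &lt;strong&gt;Priority&lt;/strong&gt;, &lt;strong&gt;rank&lt;/strong&gt;, &lt;strong&gt;highest&lt;/strong&gt; and &lt;strong&gt;mediumBeatsLow&lt;/strong&gt;
are hypothetical names used only for this illustration:&lt;/p&gt;

```haskell
data Priority = Low | Medium | High
    deriving (Eq, Show)

-- Hypothetical helper that maps each constructor to a number.
rank :: Priority -> Int
rank Low    = 0
rank Medium = 1
rank High   = 2

-- Only compare is defined; the other Ord functions use their defaults.
instance Ord Priority where
    compare x y = compare (rank x) (rank y)

highest :: Priority
highest = max Low High          -- High

mediumBeatsLow :: Bool
mediumBeatsLow = Medium > Low   -- True
```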

&lt;p&gt;One last thing about defining data types is the ability to create synonyms for
existing types. This is done using the &lt;strong&gt;type&lt;/strong&gt; keyword and allows you to make
your functions more readable. Here are a few examples:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;kr&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Address&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;None&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Addr&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Person&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Address&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;StringList&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
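
&lt;p&gt;To show how the synonyms make a signature easier to read, here is a small
sketch with a hypothetical &lt;strong&gt;describe&lt;/strong&gt; function. It leaves out the String
synonym because Prelude already defines it:&lt;/p&gt;

```haskell
type Name = String
data Address = None | Addr String
type Person = (Name, Address)

-- The signature reads as "Person to String" rather than
-- "(String, Address) to String".
describe :: Person -> String
describe (name, None)   = name ++ " has no known address"
describe (name, Addr a) = name ++ " lives at " ++ a
```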

</content>
 </entry>
 
 <entry>
   <title>Haskell Basics</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/10/30/20.00-haskell-basics.html"/>
   <updated>2011-10-30T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/10/30/20.00-haskell-basics</id>
   <content type="html">&lt;p&gt;I will start my learning of Haskell by first going over how to write a simple
program in Haskell and have it execute within the Haskell interpreter. Let’s
start with writing a simple “Hello World” application in Haskell:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;o&quot;&gt;#!/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runhaskell&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;module&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;where&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;putStrLn&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Hello, World!&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;That is a pretty small program, even compared to writing hello world in
an imperative language such as C. I added the shebang so that you can
easily run this on a Unix system. To run this you’ll need to install the
GHC compiler and interpreter, which are supported on all major operating systems.
For more information on getting GHC on your system have a look at:
http://www.haskell.org/ghc/ .&lt;/p&gt;

&lt;p&gt;Let’s get into what makes up a piece of code in Haskell. Unlike most mainstream
languages, which are imperative or object-oriented, a pure functional programming
language like Haskell is not built around a sequence of commands/instructions to
execute. A program consists of a collection of functions that each take their
inputs and return a result without any side-effects. I’m bringing
up the topic of side-effects so we can understand early on that you can’t do
any of the following in a function (by default):&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;throw an exception, raise an error&lt;/li&gt;
  &lt;li&gt;read/write to any file (including stdin/stdout)&lt;/li&gt;
  &lt;li&gt;change global state (global variables)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Side-effects are not “easily” represented as a pure function in a functional
programming language.&lt;/p&gt;
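
&lt;p&gt;A rough sketch of the difference: the first function below is pure, while the
second one writes to stdout, and Haskell makes that visible in its type through
&lt;strong&gt;IO&lt;/strong&gt;. The names &lt;strong&gt;greeting&lt;/strong&gt; and &lt;strong&gt;greet&lt;/strong&gt; are only illustrative:&lt;/p&gt;

```haskell
-- Pure: the result depends only on the argument.
greeting :: String -> String
greeting name = "Hello, " ++ name ++ "!"

-- Not pure: writing to stdout shows up in the type as IO.
greet :: String -> IO ()
greet name = putStrLn (greeting name)
```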

&lt;p&gt;Let’s write our first function, which takes a number and returns that number plus
one:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;plus1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;When writing your functions make sure to always write the signature of the
function. Writing the signature helps you understand what you’re trying to
write, and it lets the compiler verify that your function cannot return a
value of an unexpected type:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;plus1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;plus1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As you look at this function you’ll find the notation is very similar to the
mathematical expressions you wrote in school. Now that we have this function,
let’s load it into the ghci shell and apply it to a few different values:&lt;/p&gt;

&lt;console&gt;
&amp;gt; ghci
GHCi, version 7.0.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude&amp;gt; :l test.hs
[1 of 1] Compiling Main             ( test.hs, interpreted )
Ok, modules loaded: Main.
*Main&amp;gt; plus1 0
1
*Main&amp;gt; plus1 2
3
*Main&amp;gt; plus1 (plus1 2)
4
*Main&amp;gt;
&lt;/console&gt;

&lt;p&gt;Let’s jump into dealing with lists of data and how to write a few useful
functions that work on lists. The first thing to realize is that a list in
Haskell is very simply represented as:&lt;/p&gt;

&lt;pre&gt;
[1,2,3,4]
&lt;/pre&gt;

&lt;p&gt;A string is a list of Char and therefore has the type [Char] and can be
represented as:&lt;/p&gt;

&lt;pre&gt;
['a','b','c']
&lt;/pre&gt;

&lt;p&gt;which is in fact the string “abc”. Lists can also be expressed using the
operator (:), which takes an element and adds it to the front of another list,
so the list [1,2,3,4] can also be written as:&lt;/p&gt;

&lt;pre&gt;
1:(2:(3:(4:[])))
&lt;/pre&gt;

&lt;p&gt;With this operator we can also introduce pattern matching and how to write
functions that can handle lists. Let’s start by writing our very own length
function that can calculate the length of a list. The first thing to write is
the type of this function:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The above signature already says that len takes a list of Ints and returns an
Int. We can easily take this a step further and make this function polymorphic,
which allows it to be applied to a list of any type. The signature would look
like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The type a now represents any type that can be put in a list and with this we
can write our length function, like this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This definition reads very simply: the length of an empty list is 0 and the
length of a list with a first element x and a tail xs is equal to the length of
the tail plus one. We’ve introduced here how to do pattern matching on lists and
also how polymorphism works when you want to write functions that can be used
against various data types that share a common structure. The len function shown
can be used against a list of integers as well as a string, which is a list
of Chars. Here’s an example:&lt;/p&gt;

&lt;console&gt;
...
Prelude&amp;gt; :l test.hs
[1 of 1] Compiling Main             ( test.hs, interpreted )
Ok, modules loaded: Main.
*Main&amp;gt; len [3,2,1,3,4,5,5]
7
*Main&amp;gt; len &quot;Hello, World!&quot;
13
&lt;/console&gt;

&lt;p&gt;Polymorphism is one of the other things I find that Haskell does really well
when compared to other languages. Many other languages offer this kind of
polymorphism as templates or generics, and in my experience those are rarely as
elegant or as simple to understand as polymorphism is in Haskell.&lt;/p&gt;

&lt;p&gt;Most of the operations on lists you’ll ever need are already implemented in the
Prelude library. Just search for “haskell prelude” and you’ll find all of the
available functions, but here are a few of the everyday useful ones:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;head&lt;/strong&gt; - returns the first element of a list&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;last&lt;/strong&gt; - returns the last element of a list&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;tail&lt;/strong&gt; - returns the same list without the first element&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;init&lt;/strong&gt; - returns the same list without the last element&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With these functions we can now start to talk about function composition. You
may remember it from algebra class: when you had a function &lt;em&gt;f&lt;/em&gt; and a function
&lt;em&gt;g&lt;/em&gt; and wanted to apply one after the other to a single argument you’d write
something like so:&lt;/p&gt;

&lt;pre&gt;
(f o g) x
&lt;/pre&gt;

&lt;p&gt;In Haskell there is the composition operator (.), which takes two functions as
its arguments and composes them together. This is how the definition of such a
function might look:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We’ve suddenly introduced 2 other concepts with this definition:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;infix functions&lt;/strong&gt; - By default functions are defined in a prefix manner
which means the function name appears before the arguments. When you want to
declare functions that appear in the middle of their arguments, such as
the +,/,- operators, you need to declare them as you see above, wrapped
in parentheses, and then you can define them with the function name in the
middle of the arguments. You can also turn any regular prefix function into
an infix function by just putting backticks around it like so: &lt;em&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;`div`&lt;/code&gt;&lt;/em&gt; is
now an infix version of the prefix &lt;em&gt;div&lt;/em&gt; function.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;lambda expression&lt;/strong&gt; - A lambda expression allows you to define an in
place function in a more mathematical friendly format. You basically describe
for each x what the function will do. So for example:&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;
\x -&amp;gt; x + 2
&lt;/pre&gt;
&lt;p&gt;declares a function that, for each x passed as an argument, returns x plus 2.
Lambda expressions are great for saving space and not having to declare
additional named functions when you just need a function to get the job done
there and then.&lt;/p&gt;

&lt;p&gt;I’ll end this post with a few examples of how function composition works and
how it can be useful to write functions composed of other well known functions.
To start, let’s take a few of the useful Prelude list functions shown above and
try to write a few other useful list functions by using composition:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Let’s write a function that can give you the penultimate element of a list. We
can simply call it &lt;strong&gt;penult&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Let’s write a function that can give you the same list without the first and
last element and we’ll call it &lt;strong&gt;middle&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So for the &lt;strong&gt;penult&lt;/strong&gt; function we can compose the functions &lt;strong&gt;init&lt;/strong&gt; and
&lt;strong&gt;last&lt;/strong&gt;. The &lt;strong&gt;init&lt;/strong&gt; call will give us the list without the last element and the
&lt;strong&gt;last&lt;/strong&gt; call will give us the last element of that list, which is the
penultimate element, like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;penult&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;penult&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;last&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;and the &lt;strong&gt;middle&lt;/strong&gt; function is the composition of the &lt;strong&gt;init&lt;/strong&gt; and &lt;strong&gt;tail&lt;/strong&gt;
functions, like this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-hs&quot; data-lang=&quot;hs&quot;&gt;&lt;span class=&quot;n&quot;&gt;middle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;middle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tail&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;By now you can see that the Haskell language is extremely powerful and allows
you to take existing functions and put them together in a very simple way that
allows you to create new and useful functions quickly and cleanly.&lt;/p&gt;

&lt;p&gt;In my next post I’ll start to dive into custom data types and how to pattern
match those when writing your own functions and hopefully get into representing
Trees, Graphs and other data types and how writing functions for those in a
functional language is extremely clean and simple.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back to Haskell</title>
   <link href="http://rlgomes.github.com/work/haskell/2011/10/29/21.00-why-haskell-rocks.html"/>
   <updated>2011-10-29T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/haskell/2011/10/29/21.00-why-haskell-rocks</id>
   <content type="html">&lt;p&gt;So I recently decided to get back on the functional programming bandwagon and
get re-acquainted with Haskell. Haskell is one of very few purely functional
languages that really expresses programming in a very beautiful mathematical
manner. I learnt Haskell early in university and was always fascinated by how
simple and yet powerful the language was.&lt;/p&gt;

&lt;p&gt;Of course, shortly after leaving school I found that functional programming
languages were in no way mainstream or being used by software companies at large.
This was of course in 2003 when I finished my masters and things have changed
quite a bit in the past 8 years. In today’s programming scene there are a few
functional programming languages that are being used quite extensively. Erlang
and Scala seem to have found mainstream adoption in a few places and even my
beloved Haskell has some usage behind the curtains at Facebook and Google.&lt;/p&gt;

&lt;p&gt;Now I haven’t touched Haskell in ages but I thought I’d try to relearn it
and write a few simple tools in Haskell. I will be documenting my
progress in the form of a few blog posts and I hope to pique the interest of
others in the language or at least make it easier for myself in the future when
I want to review the language quickly.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Python performance module</title>
   <link href="http://rlgomes.github.com/work/python/programming/pyperf/2011/06/03/00.00-python-performance-module.html"/>
   <updated>2011-06-03T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/programming/pyperf/2011/06/03/00.00-python-performance-module</id>
   <content type="html">&lt;p&gt;I just created a new Python performance module to easily give you performance
information about important functions within your existing Python code without
having to litter your code with test code and checks here and there. The new
library is called pyperf and it uses Python decorators to mark the methods you
want to track performance statistics on.&lt;/p&gt;

&lt;p&gt;I wrote this because I needed to do some performance measurement on another
project that I’m going to put on my github account. It is an old school project
on l-systems and generating/rendering those l-systems into nice 3D images that
look like organic objects (i.e. trees, bushes, shrubs in a game). I’ll write up
another post on that library a little later. For now I wanted to release this
module in case anyone else finds it useful and wants to help extend it and make
it useful for themselves and others.&lt;/p&gt;

&lt;p&gt;The pyperf library is very simple to use and you can start measuring the 
performance of your critical methods quickly. The first thing to do is decorate
the functions you want with the pyperf.measure decorator. By default the pyperf
library is not enabled and to enable it you have two choices:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Set pyperf.PYPERF to True, which makes the pyperf library monitor the
functions that have been previously decorated.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Send the SIGUSR1 signal to the Python application whose functions you have
decorated with the pyperf.measure decorator. This enables/disables the pyperf
library, which prints the current state to the logs so you know if it’s turned
on or off.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The performance report will be printed at the end of your code’s execution, when
your Python application exits. If you’d like to get a report at runtime then
send the SIGUSR2 signal to the Python application and the pyperf library will
print the current report information.&lt;/p&gt;
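
&lt;p&gt;To make the enable/report mechanism described above more concrete, here is a
minimal sketch of how a signal-driven measurement decorator could be put
together. This is not the pyperf source: the names ENABLED, durations, toggle,
report and install_handlers are all hypothetical, and the sketch assumes a Unix
system where SIGUSR1 and SIGUSR2 exist.&lt;/p&gt;

```python
import atexit
import signal
import time
from collections import defaultdict
from functools import wraps

ENABLED = False                  # hypothetical stand-in for pyperf.PYPERF
durations = defaultdict(list)    # function name -> list of durations in seconds


def measure(func):
    """Record how long each call takes, but only while ENABLED is True."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if not ENABLED:
            return func(*args, **kwargs)
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            durations[func.__name__].append(time.perf_counter() - start)
    return wrapper


def toggle(signum=None, frame=None):
    """Flip measurement on or off; has the shape of a signal handler."""
    global ENABLED
    ENABLED = not ENABLED
    print("measurement enabled: %s" % ENABLED)


def report(signum=None, frame=None):
    """Print the call count and average duration per measured function."""
    for name, samples in sorted(durations.items()):
        avg_ms = 1000.0 * sum(samples) / len(samples)
        print("%s: %d calls, avg %.3fms" % (name, len(samples), avg_ms))


def install_handlers():
    """Wire the handlers up (Unix only, and only from the main thread)."""
    signal.signal(signal.SIGUSR1, toggle)   # enable/disable at runtime
    signal.signal(signal.SIGUSR2, report)   # report on demand
    atexit.register(report)                 # report when the program exits
```

&lt;p&gt;With something like this in place, calling install_handlers() once at startup
would let you toggle measurement with kill -USR1 and request a report with
kill -USR2 against the process id of the running application.&lt;/p&gt;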

&lt;p&gt;There are a few configuration options that you have that can be set by changing
the global variables:&lt;/p&gt;

&lt;p&gt;pyperf.PYPERF_TRACKARGUMENTS - when set to True the pyperf report will track
the function calls by the arguments passed to the function, separating the
results by those arguments.&lt;/p&gt;

&lt;p&gt;pyperf.PYPERF_TRACKCALLER - when set to True the pyperf report will track the
function calls by the caller of the function you decorated.&lt;/p&gt;
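
&lt;p&gt;As an illustration of what these two options mean, the sketch below groups
timings under a key built from the function name, optionally its arguments, and
optionally its caller. It only shows the idea and is not how pyperf itself is
implemented: TRACK_ARGUMENTS, TRACK_CALLER, stats and describe_call are
hypothetical names chosen for this example.&lt;/p&gt;

```python
import inspect
import time
from collections import defaultdict
from functools import wraps

TRACK_ARGUMENTS = True     # hypothetical stand-in for pyperf.PYPERF_TRACKARGUMENTS
TRACK_CALLER = True        # hypothetical stand-in for pyperf.PYPERF_TRACKCALLER
stats = defaultdict(list)  # report key -> list of durations in seconds


def describe_call(args, kwargs):
    """Render the arguments as text so unhashable values can be part of a key."""
    parts = [repr(a) for a in args]
    parts += ["%s=%r" % (k, v) for k, v in sorted(kwargs.items())]
    return ", ".join(parts)


def measure(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        key = func.__name__
        if TRACK_ARGUMENTS:
            key += "(%s)" % describe_call(args, kwargs)
        if TRACK_CALLER:
            # stack()[0] is this wrapper, stack()[1] is whoever called it
            key += " from %s" % inspect.stack()[1].function
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            stats[key].append(time.perf_counter() - start)
    return wrapper


@measure
def add(a, b=0):
    return a + b


def caller_one():
    return add(1, b=2)


caller_one()
print(sorted(stats))
```

&lt;p&gt;With both flags on, each distinct combination of arguments and caller gets its
own entry in the report. With both off, all calls to add would be grouped under
a single entry.&lt;/p&gt;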

&lt;p&gt;There are still a few features I’d like to implement, which I’ve added to the
github issues of this repository, and I think that configuring the library could
be done in a nicer way. For now this is a good first step in the direction of
creating a nice and clean performance library for any Python code.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Android Development and Ant</title>
   <link href="http://rlgomes.github.com/work/java/android/2011/03/09/21.00-android-development-and-ant.html"/>
   <updated>2011-03-09T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/java/android/2011/03/09/21.00-android-development-and-ant</id>
   <content type="html">&lt;p&gt;When thinking about automating the building/testing of an Android project I ran
across the necessity to do so from the command line and not using Eclipse. After
some searching I found that it’s quite easy to get this done with the tools that
Android already has available, which can generate a simple build.xml file to be
used for building/installing/packaging an Android project.&lt;/p&gt;

&lt;p&gt;To generate your Ant build file, simply run the following from the root
of your project:&lt;/p&gt;

&lt;pre&gt;
android update project -p .
&lt;/pre&gt;

&lt;p&gt;Now if you need to link additional Android projects to this one you can do so
with the following command:&lt;/p&gt;

&lt;pre&gt;
android update project -p . -l path/to/other/android/project
&lt;/pre&gt;

&lt;p&gt;To see the available targets just issue the usual &lt;em&gt;ant -p&lt;/em&gt; and you’ll get
the full list:&lt;/p&gt;

&lt;pre&gt;
Main targets:

 clean      Removes output files created by other targets.
 compile    Compiles project's .java files into .class files
 debug      Builds the application and signs it with a debug key.
 install    Installs/reinstalls the debug package onto a running emulator or
            device. If the application was previously installed, the signatures
            must match.
 release    Builds the application. The generated apk file must be signed before
            it is published.
 uninstall  Uninstalls the application from a running emulator or device.
Default target: help
&lt;/pre&gt;
</content>
 </entry>
 
 <entry>
   <title>Fixing Eclipse Content Assist</title>
   <link href="http://rlgomes.github.com/work/java/eclipse/2011/03/06/13.00-fixing-eclipse-content-assist.html"/>
   <updated>2011-03-06T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/java/eclipse/2011/03/06/13.00-fixing-eclipse-content-assist</id>
   <content type="html">&lt;p&gt;If you use Eclipse like I do for developing anything from Python scripts to
Android projects you may have noticed that recently your content assist feature
is either really slow or literally freezes up Eclipse. Well, you’re in luck,
because the most likely reason it’s doing this is the ADT Plugin that
you’re using to do Android development.&lt;/p&gt;

&lt;p&gt;So it seems there are a few bugs open on the subject but put simply the content
assist tries to look for the Android source code under each of the Android
folders at &lt;em&gt;sdk_location/platforms/android-x&lt;/em&gt; (where x is the Android API level).
When the content assist can’t find the &lt;em&gt;sources&lt;/em&gt; folder it just goes nuts
and wastes a lot of CPU cycles. The easy fix is to create an empty &lt;em&gt;sources&lt;/em&gt;
folder under each of those &lt;em&gt;android-x&lt;/em&gt; folders and your content assist will be
back to behaving as usual.&lt;/p&gt;

&lt;p&gt;The longer and less easy fix is to download the source for each of those builds
and place it there, but I really don’t think the majority of developers care for
all of the source code.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>New look and feel for DTF documentation</title>
   <link href="http://rlgomes.github.com/work/java/dtf/2011/01/28/23.00-new-generated-documentation-look-for-dtf.html"/>
   <updated>2011-01-28T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/java/dtf/2011/01/28/23.00-new-generated-documentation-look-for-dtf</id>
   <content type="html">&lt;p&gt;I spent some time cleaning up the way I generate DTF documentation so that it
outputs the HTML elements with the required class and id attributes. With this
I am now able to correctly style the DTF generated documentation to make it a
little easier on the eyes. As you can see from the comparative screenshots
below there are a few good improvements. I’ll definitely be working on
cleaning this up further but I’m pretty happy with the current results.&lt;/p&gt;

&lt;figure&gt;
    &lt;img src=&quot;/images/2011/jan/old_dtf_index_doc.png&quot; /&gt;
    &lt;figcaption&gt;old index page&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;img src=&quot;/images/2011/jan/new_dtf_index_doc.png&quot; /&gt;
    &lt;figcaption&gt;new index page&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;img src=&quot;/images/2011/jan/old_dtf_tag_doc.png&quot; /&gt;
    &lt;figcaption&gt;old tag documentation page&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;img src=&quot;/images/2011/jan/new_dtf_tag_doc.png&quot; /&gt;
    &lt;figcaption&gt;new tag documentation page&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;img src=&quot;/images/2011/jan/old_dtf_feature_doc.png&quot; /&gt;
    &lt;figcaption&gt;old feature documentation page&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;img src=&quot;/images/2011/jan/new_dtf_feature_doc.png&quot; /&gt;
    &lt;figcaption&gt;new feature documentation page&lt;/figcaption&gt;
&lt;/figure&gt;
</content>
 </entry>
 
 <entry>
   <title>Python decorators</title>
   <link href="http://rlgomes.github.com/work/python/programming/2011/01/22/12.00-python-decorators.html"/>
   <updated>2011-01-22T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/programming/2011/01/22/12.00-python-decorators</id>
   <content type="html">&lt;p&gt;I recently discovered decorators in Python and have been pretty impressed with
how they work. A decorator allows you to change the behavior of a function or
method without having to alter the code of that function. Instead, you annotate
the function with the @ symbol followed by the name of the decorator to apply
to it.&lt;/p&gt;

&lt;p&gt;Let’s look at a simple example: you have created a bunch of functions that
do different tasks and, when you’re running in DEBUG mode, you’d like to be
able to get the amount of time spent on each of these calls. Let’s first piece
together a simple Python module with a few functions that do some simple tasks.
For this example I’ve decided to create a simple logger module:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;I: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;E: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;D: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;W: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;There are four simple functions that log info, error, debug and warning messages.
Now we can design our perf module that will be used to decorate these functions
so we can do things such as counting the occurrences of calls or even
just calculating the average time spent on these calls at runtime. Let’s start by
putting together the decorator that can calculate how much time we spent on each
individual call.&lt;/p&gt;

&lt;p&gt;So a decorator is nothing more than a function that accepts another function as
an argument. The very basic decorator definition looks like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;time&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;track&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;new_f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dur&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s executed in &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;dms&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dur&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_f&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This is a decorator defined as a function, so its contract is simple: it
receives the original function and must return a function. In our case we create
a simple function that wraps the existing one and return that. This new function
makes the same call your code intended but also measures the time spent. The
&lt;em&gt;args&lt;/em&gt; are the positional arguments passed to the original function and
the &lt;em&gt;kwargs&lt;/em&gt; are the keyword arguments passed to the original function.&lt;/p&gt;
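
&lt;p&gt;To see the forwarding of &lt;em&gt;args&lt;/em&gt; and &lt;em&gt;kwargs&lt;/em&gt; in isolation, here is a
minimal sketch assuming Python 3. The &lt;em&gt;add&lt;/em&gt; function is hypothetical and only
there for illustration, and the use of functools.wraps is an addition that is
not part of the decorator above. It simply keeps the wrapped function’s name
and docstring on the wrapper.&lt;/p&gt;

```python
import functools
import time


def track(f):
    """Timing decorator: same idea as above, plus functools.wraps."""
    @functools.wraps(f)  # keep f.__name__ and f.__doc__ on the wrapper
    def new_f(*args, **kwargs):
        t = time.time() * 1000
        ret = f(*args, **kwargs)  # forward every argument unchanged
        dur = (time.time() * 1000) - t
        print("%s executed in %dms" % (f.__name__, dur))
        return ret  # hand the original return value back to the caller
    return new_f


@track
def add(a, b=0):  # hypothetical function, only for illustration
    return a + b


# The positional 1 arrives in args, b=2 arrives in kwargs.
result = add(1, b=2)
```

&lt;p&gt;Because the wrapper returns whatever the original function returned, callers
of &lt;em&gt;add&lt;/em&gt; should not notice any difference apart from the extra log line.&lt;/p&gt;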

&lt;p&gt;Now let’s put together a silly test that simply logs a bunch of lines with our
test logger. Here is the test:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;logger&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;logger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;just a message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;logger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;oh crap!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;logger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;debug some stuff&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;logger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;you should have a look a this&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We’re ready to go back to the original module and add the @track decorator to
each of our calls. Our module will look like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;perf&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;track&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@track&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;I: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@track&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;E: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@track&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;D: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@track&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;W: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;When we execute the same test module we’ll now see the following in the logs:&lt;/p&gt;

&lt;pre&gt;
&amp;gt; python test.py
I: just a message
i executed in 0ms
E: oh crap!
e executed in 0ms
D: debug some stuff
d executed in 0ms
W: you should have a look a this
w executed in 0ms
&lt;/pre&gt;

&lt;p&gt;Of course each call takes less than 1ms, so the integer format prints 0ms, and
that’s not a surprise. What you can see here is that we now have the ability to
track the performance of these calls by simply marking them with the @track
decorator.&lt;/p&gt;
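
&lt;p&gt;If the 0ms readings are not useful enough, one option is to measure with a
higher resolution clock. The following is only a sketch, assuming Python 3.3 or
later where time.perf_counter is available. The name &lt;em&gt;track_us&lt;/em&gt; is made up for
this example and is not part of the original module.&lt;/p&gt;

```python
import functools
import time


def track_us(f):
    """Hypothetical variant of track that reports microseconds."""
    @functools.wraps(f)
    def new_f(*args, **kwargs):
        start = time.perf_counter()  # high resolution clock, in seconds
        ret = f(*args, **kwargs)
        elapsed_us = (time.perf_counter() - start) * 1e6
        print("%s executed in %.1fus" % (f.__name__, elapsed_us))
        return ret
    return new_f


@track_us
def i(msg):
    print("I: %s" % msg)


i("just a message")
```

&lt;p&gt;The exact numbers will vary from run to run and from machine to machine, so
treat them as a rough indication rather than a benchmark.&lt;/p&gt;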

&lt;p&gt;Let’s take some time to make that decorator extra smart and have it turned on
or off based on a global DEBUG flag. The end result should look like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;time&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;global&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DEBUG&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DEBUG&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;track&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;global&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DEBUG&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DEBUG&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;new_f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;dur&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s executed in &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;dms&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dur&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_f&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This actually shows that you can make a decorator that adds no overhead when it
is turned off, because it just returns the same function that was already in
place before the decorator was used. Note that the flag is checked once, when
the decorator is applied, and not on every call. Basically we can create a set
of performance measuring decorators that have a negligible impact on our code
when it is running in production.&lt;/p&gt;
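
&lt;p&gt;A quick way to convince yourself of the “no overhead” claim is to check object
identity. This is a small sketch, assuming Python 3, with a simplified &lt;em&gt;track&lt;/em&gt;
and a hypothetical &lt;em&gt;w&lt;/em&gt; function that returns its message instead of printing
it:&lt;/p&gt;

```python
import time

DEBUG = False  # read by track() at the moment a function is decorated


def track(f):
    if not DEBUG:
        return f  # no wrapper at all, so calls cost exactly what they did before

    def new_f(*args, **kwargs):
        t = time.time() * 1000
        ret = f(*args, **kwargs)
        dur = (time.time() * 1000) - t
        print("%s executed in %dms" % (f.__name__, dur))
        return ret

    return new_f


def w(msg):  # hypothetical function, only for illustration
    return "W: %s" % msg


# With DEBUG off the decorator hands back the very same function object.
same_object = track(w) is w

# With DEBUG on, decorating again produces a separate timing wrapper.
DEBUG = True
wrapped = track(w)
```

&lt;p&gt;One consequence of this design is that flipping DEBUG after a function has
already been decorated has no effect on that function. The flag needs to be set
before the decorated module is imported.&lt;/p&gt;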

&lt;p&gt;Of course there are more things that can be done with Python decorators but I
just wanted to write up the starting point for future reference and hopefully
inspire others to come up with some brilliant uses of Python decorators.&lt;/p&gt;

&lt;p&gt;I created a small Eclipse project while writing this entry and you can get it
from &lt;a href=&quot;/images/decorators.tar.gz&quot;&gt;here&lt;/a&gt; if you’d like.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Timetracker with toggl support</title>
   <link href="http://rlgomes.github.com/work/python/tracking/monitor/toggl/2011/01/21/09.00-timetracker-with-toggl.html"/>
   <updated>2011-01-21T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/tracking/monitor/toggl/2011/01/21/09.00-timetracker-with-toggl</id>
   <content type="html">&lt;p&gt;The timetracker tool is now using the online time tracking service toggl as 
the default tracker and will automatically create new projects by the name of 
the activity you’ve given your current task and file all of those tasks under
the workspace timetracker. You can change the workspace name in your 
configuration file.&lt;/p&gt;

&lt;p&gt;Once you’ve started tracking your tasks you’ll be able to open up toggl and see 
some pretty awesome pie charts of where you’re spending your time.&lt;/p&gt;

&lt;center&gt;&lt;img src=&quot;/images/toggl_pie_chart.png&quot; /&gt;&lt;/center&gt;

&lt;p&gt;Of course it will be nicer to make more interesting charts/graphs once toggl
matures a bit further but for now at least you can use timetracker to really 
figure out where you spend most of your time when you’re on your laptop.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Working on timetracker</title>
   <link href="http://rlgomes.github.com/work/python/tracking/monitor/2011/01/17/14.00-timetracker.html"/>
   <updated>2011-01-17T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/python/tracking/monitor/2011/01/17/14.00-timetracker</id>
   <content type="html">&lt;p&gt;Been working on a time tracking utility for a while now and made &lt;a href=&quot;http://github.com/rlgomes/timetracker&quot;&gt;timetracker&lt;/a&gt; to
basically track what I do on a daily basis on my gnome desktop. From being able
to tell when I’m chatting with a friend to on the web doing social interactions
on facebook or twitter. The application works very well for me and has been
built so you could easily extend it to monitor your Mac or Windows desktop by
creating your own &lt;em&gt;windowmanager&lt;/em&gt; instance.&lt;/p&gt;

&lt;p&gt;The reporting output has also changed significantly. It started out using the
hamster time tracking utility by default and has moved to the online toggl
service, which can handle the amount of data generated on a daily basis by my
usage patterns. The hamster applet wasn’t designed to handle a few thousand task
changes per day and was really having issues keeping up. I filed a bug
&lt;a href=&quot;https://bugs.launchpad.net/ubuntu/+source/hamster-applet/+bug/685001&quot;&gt;here&lt;/a&gt; in
the hope that it gets fixed in the future. Meanwhile the default tracker is the
toggl tracker.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.toggl.com&quot;&gt;toggl&lt;/a&gt; offers a free account for tracking your activities
and seems to have a simple and clean interface. I still need to figure out
exactly how to inject the tasks into each project/workspace in a way that makes
toggl’s stats and graphs interesting to the end user.&lt;/p&gt;

&lt;p&gt;If you’re interested in trying out &lt;a href=&quot;http://github.com/rlgomes/timetracker&quot;&gt;timetracker&lt;/a&gt;, then start by reading the README
at the base of the project and it will let you know how to install it on a Linux
system. For those of you interested in using this on another OS please contact
me so we can work out how to achieve this.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>My First Post</title>
   <link href="http://rlgomes.github.com/work/personal/2011/01/17/12.00-first-post.html"/>
   <updated>2011-01-17T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/personal/2011/01/17/12.00-first-post</id>
   <content type="html">&lt;p&gt;Just got the blog up and running with Jekyll and thought I’d post a quick note
to see how things look. This is nothing fancy and just makes tracking my time
and interest in projects very easy since adding a new post is creating a simple
text file and checking it in.&lt;/p&gt;

&lt;p&gt;Hopefully I will be able to write more on the things I’m currently working on
in terms of technology as well as some of my hobbies which include music and
piano playing. With time I’ll make things look a little nicer but I really want
something simple and easy on the eyes.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Continuous Integration Environment for DTF</title>
   <link href="http://rlgomes.github.com/work/java/dtf/2011/01/17/16.00-continuous-integration-environment-for-dtf.html"/>
   <updated>2011-01-17T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/java/dtf/2011/01/17/16.00-continuous-integration-environment-for-dtf</id>
   <content type="html">&lt;p&gt;I’ve been working on building a CI environment for DTF for a while now and I’m
at the stage where the CI environment I put together work quite well for me. I
am using Amazon’s EC2 to host my CI server which has a few simple scripts that
when the EC2 instance is started will do the following:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;checkout my DTF source tree&lt;/li&gt;
  &lt;li&gt;build DTF&lt;/li&gt;
  &lt;li&gt;generate the DTF documentation (JavaDoc based documentation for all tags in DTF) and publish it &lt;a href=&quot;http://rlgomes.github.com/dtf/&quot;&gt;here&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;run the JUnit tests &amp;amp; DTF unit test suite and post the results &lt;a href=&quot;http://rlgomes.github.com/dtf/results/html/&quot;&gt;here&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;run the DTF performance verification tests and post the results &lt;a href=&quot;https://github.com/rlgomes/dtf/wiki/Performance-test-results&quot;&gt;here&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;shutdown the EC2 instance&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Takes about 20 minutes to run through everything and the instance itself stays
up and running for an additional 10 minutes (in case I need to ssh into the
instance for any maintenance).&lt;/p&gt;

&lt;p&gt;There are quite a few tools/technologies that were used to achieve the above and
I’ll now try to give a very simple explanation of each one used:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;I chose to use Amazon’s EC2 because it was really easy to get an instance up
and running on EC2, and my only issue was how to get that instance to start
and stop on demand. After some investigation I wrote up a simple tool that would
generate the HTTP POST/GET calls that I could use to start and stop my instances.
This tool is available &lt;a href=&quot;https://github.com/rlgomes/ec2-tools&quot;&gt;here&lt;/a&gt; and can be
used by anyone else wanting to remotely administer their EC2 instances.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The result tracking of the JUnit tests and DTF’s unit test suite are all done
by outputting results into a JUnit XML format and then using the JUnit report
task in Ant to generate the simple HTML reports.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The graphs generated from executing DTF’s performance verification tests are
generated using &lt;a href=&quot;http://code.google.com/apis/chart/&quot;&gt;Google’s Charting API&lt;/a&gt;.
After poking around at the API for a while I basically made the DTF performance
tests record their performance results by appending to existing event files and
then outputting at the end of the PVT execution all of the performance results
gathered to a file that could be executed to download all of the chart PNG files
and then push those to github to be shown in a wiki page.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are still a few things that need some polish, and I would also like to
include the DVT (Deployment Verification Tests) in this automated testing, but
for now I am quite happy with how easily I can validate that my changes to DTF
do not break anything in an unpredictable manner.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Introducing DTF</title>
   <link href="http://rlgomes.github.com/work/java/dtf/2011/01/17/13.00-introducing-dtf.html"/>
   <updated>2011-01-17T00:00:00+00:00</updated>
   <id>http://rlgomes.github.com/work/java/dtf/2011/01/17/13.00-introducing-dtf</id>
   <content type="html">&lt;p&gt;DTF is my testing framework that has been in development for 4+ years and has
survived my transition from Sun Microsystems to Yahoo. It was open sourced since
May 2010. Now that I have my blog up and running I thought I’d make one of my
first posts be an introduction to DTF and what it can do.&lt;/p&gt;

&lt;p&gt;To start, DTF stands for Distributed Testing Framework and it is written in 100%
Java so that it can be easily executed on any environment where Java is
supported. The tests in DTF are written in XML and, unlike some of the XML
languages you’ve seen, it’s sort of a mash-up of Ant + Jelly + some additional
features you’ve never seen in XML.&lt;/p&gt;

&lt;p&gt;The biggest advantage of DTF vs any other Java based distributed testing tool is
that you can write a test that interacts with N different machines and the test
explicitly describes all of the interactions of the components in your system
in one easy to read XML file. Most other frameworks rely on multiple
configuration files and even different files to describe the behavior of each
piece. Aside from that there is also the feature to deploy your setup to your
host machines from a single machine using an XML format to describe your setup.&lt;/p&gt;

&lt;p&gt;With all of this the best thing to do is to start by reading the documentation
written for those interested in DTF and to start really digging in to what DTF
can do for you. DTF’s documentation can be found here: https://github.com/rlgomes/dtf/wiki&lt;/p&gt;
</content>
 </entry>
 

</feed>