<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
 
  <title>Patrick O. Perry</title>
  
  <link href="http://ptrckprry.com/" />
  <id>tag:ptrckprry.com,2009-10-31:atom.xml</id>
  <updated>2013-01-29T14:08:26-08:00</updated>
  <author>
    <name>Patrick O. Perry</name>
    <email>patperry@seas.harvard.edu</email>
  </author>
 
  
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/ptrckprry" /><feedburner:info uri="ptrckprry" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
    <title>Network Science Reading List</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/VMWZY4cr_Sw/" />
    <updated>2010-11-08T00:00:00-08:00</updated>
    <id>tag:ptrckprry.com,2010-11-08:/blog/2010/11/08/network-science-reading-list/</id>
    <content type="html">&lt;p&gt;Network science is a huge field spanning many disciplines; for newcomers,
it is to know where to start.  What follows is an incomplete list of network
science papers I found to be interesting, organized by topic.&lt;/p&gt;

&lt;h2&gt;Exponential Random Graph Models&lt;/h2&gt;

&lt;p&gt;ERGMs are the most widely-used network models in the social sciences.
They model relational data through statistics like the numbers of triangles
and k-star subgraphs.  Unfortunately, they are difficult to fit and interpret.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Holland, P. W., and Leinhardt, S. (1981),
&lt;a href="http://www.jstor.org/stable/2287037"&gt;&amp;ldquo;An Exponential Family of Probability Distributions for Directed Graphs,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;J. Am. Stat. Assoc.&lt;/em&gt;, &lt;strong&gt;76&lt;/strong&gt;, 33-50&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Anderson, C. J., Wasserman S., and Crouch, B. (1999),
&lt;a href="http://linkinghub.elsevier.com/retrieve/pii/S0378873398000124"&gt;&amp;ldquo;A p* Primer: Logit Models for Social Networks,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;21&lt;/strong&gt;, 37-66&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Snijders, T. A. B. (2002),
&lt;a href="http://www.cmu.edu/joss/content/articles/volume3/Snijders.pdf"&gt;“Markov Chain Monte Carlo Estimation of Exponential Random Graph Models,”&lt;/a&gt;
&lt;em&gt;J. Soc. Struct.&lt;/em&gt;, &lt;strong&gt;3&lt;/strong&gt;, 1-40&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handcock, M. S. (2003),
&lt;a href="http://www.csss.washington.edu/Papers/wp39.pdf"&gt;“Assessing Degeneracy in Statistical Models of Social Networks,”&lt;/a&gt;
Working paper no. 39, Center for Statistics and the Social Sciences,
University of Washington-Seattle&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Latent Space Models&lt;/h2&gt;

&lt;p&gt;Latent space models are an alternative to ERGMs which get around dyadic
dependence by positing existence of latent covariates.  Since their
introduction in 2002, they have been extended to include clustering and degree
heterogeneity.  Beware that these models impose a triangle inequality on
social space, which may not be appropriate.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Hoff, P. D., Raftery, A. E., and Handcock, M. S. (2002),
&lt;a href="http://www.stat.washington.edu/raftery/Research/PDF/hoff2002.pdf"&gt;“Latent Space Approaches to Social Network Analysis,”&lt;/a&gt;
&lt;em&gt;J. Am. Stat. Assoc.&lt;/em&gt;, &lt;strong&gt;97&lt;/strong&gt;, 1090–1098&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handcock, M. S., Raftery, A. E., and Tantrum, J. M. (2007),
&lt;a href="http://www.stat.washington.edu/raftery/Research/PDF/Handcock2007.pdf"&gt;&amp;ldquo;Model-Based Clustering for Social Networks,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;J. R. Statist. Soc. A&lt;/em&gt;, &lt;strong&gt;170&lt;/strong&gt;, 301-354&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Krivitsky, P. N., Handcock, M. S.,  Raftery, A. E., and Hoff, P. D. (2009),
&lt;a href="http://www.stat.washington.edu/raftery/Research/PDF/Krivitsky2009.pdf"&gt;&amp;ldquo;Representing Degree Distributions, Clustering, and Homophily in Social Networks with Latent Cluster Random Effects Models,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;31&lt;/strong&gt;, 204-213&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Block Models&lt;/h2&gt;

&lt;p&gt;Block models are another class of network models involving latent variables.
While work in the 80s assumed the block structure to be known, the current
approach is to assume each node belongs to an unknown class, and the node&amp;rsquo;s
behavior is determined by its class membership.  Bickell and Chen have
shown it is possible to recover the unknown class labels if the network is
big enough.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Holland, P. W., Laskey, K. B., and Leinhardt, S. (1983),
&lt;a href="http://dx.doi.org/10.1016/0378-8733(83)90021-7"&gt;“Stochastic Blockmodels: First Steps,”&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;5&lt;/strong&gt;, 109–137&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Airoldi, E. M., Blei, D. M., Feinberg, S. E., and Xing, E. P. (2008),
&lt;a href="http://jmlr.csail.mit.edu/papers/volume9/airoldi08a/airoldi08a.pdf"&gt;&amp;ldquo;Mixed Membership Stochastic Blockmodels,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;J. Mach. Learn. Res.&lt;/em&gt;, &lt;strong&gt;9&lt;/strong&gt;, 1981-2014&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bickell, P. and Chen, A. (2009),
&lt;a href="http://www.stat.berkeley.edu/~bickel/Bickel%20Chen%2021068.full.pdf"&gt;“A Nonparametric View of Network Models and Newman-Girvan and Other Modularities,”&lt;/a&gt;
&lt;em&gt;P. Natl. Acad. Sci.&lt;/em&gt;, &lt;strong&gt;106&lt;/strong&gt;, 21068–21073&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Agent-Based Models&lt;/h2&gt;

&lt;p&gt;Agent-based models are similar in spirit to latent space models (network
dynamics arise from pairwise behavior) while still keeping some of the
attractive features of ERGMs (explicit transitivity or hub/spoke behavior).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Jackson, M. O. and Wolinsky, A. (1996),
&lt;a href="http://dx.doi.org/10.1006/jeth.1996.0108"&gt;&amp;ldquo;A Strategic Model of Social and Economic Networks,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;J. Econ. Theory.&lt;/em&gt;, &lt;strong&gt;71&lt;/strong&gt;, 44-74&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Snijders, T. A. B., Van de Bunt, G. V., and Steglich, C. E. G. (2010),
&lt;a href="http://stat.gamma.rug.nl/SnijdersSteglichVdBunt2009.pdf"&gt;&amp;ldquo;Introduction to Stochastic Actor-Based Models for Network Dynamics,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;32&lt;/strong&gt;, 44–60&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Community Detection&lt;/h2&gt;

&lt;p&gt;Community detection in networks is like clustering in traditional data
analysis.  For some reason, this has received a lot of attention, especially
in the physics community.  This seems like a fad, but it&amp;rsquo;s worth knowing
about.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Newman, M. E. J. (2006),
&lt;a href="http://arxiv.org/abs/physics/0602124"&gt;&amp;ldquo;Modularity and Community Structure in Networks,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;P. Natl. Acad. Sci.&lt;/em&gt;, &lt;strong&gt;103&lt;/strong&gt;, 8577-8582&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Leskovec, J., Lang, K. J., Dasgupta A., and Mahoney, M. W. (2009),
&lt;a href="http://arxiv.org/abs/0810.1355"&gt;&amp;ldquo;Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;Internet Mathematics&lt;/em&gt;, &lt;strong&gt;6&lt;/strong&gt;, 29-123&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Sampling&lt;/h2&gt;

&lt;p&gt;Sampling and missing data issues are extremely important, but they largely
get ignored.  Mostly, this is because they give rise to really
hard problems.  Often theoretical results are negative&amp;mdash;in particular, many
have attacked respondent-driven sampling&amp;mdash;but without constructive
alternatives, it will be hard to advance the field.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Heckathorn, D. D. (1997),
&lt;a href="http://www.respondentdrivensampling.org/reports/RDS1.pdf"&gt;&amp;ldquo;Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations,&amp;rdquo;&lt;/a&gt;
 &lt;em&gt;Social Problems&lt;/em&gt;, &lt;strong&gt;44&lt;/strong&gt;, 174-199&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Achlioptas, D., Clauset, A., Kempe, D., and Moore, C. (2009),
&lt;a href="http://users.soe.ucsc.edu/~optas/papers/traceroute.pdf"&gt;&amp;ldquo;On the Bias of Traceroute Sampling: Or, Power-Law Degree Distributions in Regular Graphs,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;J. ACM&lt;/em&gt;, &lt;strong&gt;56&lt;/strong&gt;, 1-28&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handcock, M. S. and Gile, K. J. (2010),
&lt;a href="http://imstat.org/aoas/AOAS221.pdf"&gt;&amp;ldquo;Modeling Social Networks from Sampled Data,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;Ann. Appl. Stat.&lt;/em&gt;, &lt;strong&gt;4&lt;/strong&gt;, 5-25&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Applications&lt;/h2&gt;

&lt;p&gt;The dirty secret of network science is that the hype is disproportionate
to the scientific impact.  Below are two of the more important
application-driven results.  The Christakis and Fowler (2007) paper in
particular generated significant attention, both positive and negative.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Morris, M. (1997),
&lt;a href="http://journals.lww.com/aidsonline/Fulltext/1997/05000/Concurrent_partnerships_and_the_spread_of_HIV.12.aspx"&gt;&amp;ldquo;Concurrent Partnerships and the Spread of HIV,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;AIDS&lt;/em&gt;, &lt;strong&gt;11&lt;/strong&gt;, 641-648&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Christakis, N. A. and Fowler, J. H. (2007),
&lt;a href="http://www.nejm.org/doi/full/10.1056/NEJMsa066082"&gt;&amp;ldquo;The Spread of Obesity in a Large Social Network over 32 Years,&amp;rdquo;&lt;/a&gt;
&lt;em&gt;New Engl. J. Med.&lt;/em&gt;, &lt;strong&gt;357&lt;/strong&gt;, 370-379&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/VMWZY4cr_Sw" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2010/11/08/network-science-reading-list/</feedburner:origLink></entry>
  
  <entry>
    <title>Blog Move</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/5mDua6nyv1o/" />
    <updated>2009-10-29T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2009-10-29:/blog/2009/10/29/blog-move/</id>
    <content type="html">&lt;p&gt;Here’s a quick heads for anyone who is still following this blog. You may have noticed that posting has become stagnant lately. The main reason for this is have been busy finishing &lt;a href="http://arxiv.org/abs/0909.3052"&gt;my PhD dissertation&lt;/a&gt;, and didn’t work at all on side projects. I’ve also been deterred by WordPress (my blog software) being a bit unwieldy. The first problem has been solved; I am a doctor now. The second problem will be solved, when I move my site over to &lt;a href="http://wiki.github.com/mojombo/jekyll"&gt;Jekyll&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The downside of the transition to Jekyll is that the RSS feed might break. Also, I might be lazy and break some of the links to the blog posts. I apologize to all 20 of you subscribers.&lt;/p&gt;

&lt;p&gt;The upside of all of this is that I hope to be updating more frequently in the future.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/5mDua6nyv1o" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2009/10/29/blog-move/</feedburner:origLink></entry>
  
  <entry>
    <title>Poetic Writing</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/IXRebFWJtT0/" />
    <updated>2009-07-22T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2009-07-22:/blog/2009/07/22/poetic-writing/</id>
    <content type="html">&lt;p&gt;Technical writing in 1974 was a lot more poetic than it is now.  From F. Downton&amp;rsquo;s discussion of Stone&amp;rsquo;s &amp;ldquo;Cross-Validatory Choice and Assessment of Statistical Predictions&amp;rdquo;:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;A current nine-day wonder in the press concerns the exploits of a Mr. Uri
Geller who appears to be able to bend metal objects without touching them;
Professor Stone seems to be attempting to bend statistics without touching
them. My attitude to both of these phenomena is one of open-minded
scepticism; I do not believe in either of these prestigious activities, on
the other hand they both deserve serious scientific examination.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Also enjoyable is Stone&amp;rsquo;s extended analogy:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;[I]t is reasonable to enquire how one arrives at a prescription in any
particular problem.  A tentative answer is that, like a doctor with his
patient, the statistician with his client must write his prescription only
after careful consideration of the reasonable choices&amp;hellip; Just as the
doctor should be prepared for side-effects, so the statistician should
monitor and check the execution of the prescription for any unexpected
complications&amp;hellip; A prescription is neither true nor false; it is better
to say that, in a broad sense, it either succeeds or fails.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;You just don&amp;rsquo;t see this kind of writing very often in current Statistics
writing.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/IXRebFWJtT0" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2009/07/22/poetic-writing/</feedburner:origLink></entry>
  
  <entry>
    <title>New Haskell BLAS bindings!</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/QEalmorlFRE/" />
    <updated>2009-01-09T00:00:00-08:00</updated>
    <id>tag:ptrckprry.com,2009-01-09:/blog/2009/01/09/new-haskell-blas-bindings/</id>
    <content type="html">&lt;p&gt;I just uploaded &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas"&gt;version 0.7 of the Haskell BLAS
bindings&lt;/a&gt;! This
is a major milestone&amp;mdash; it is finally the library with all of the features that I
want.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s what the library is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It provides basic data types necessary for doing linear algebra. There are
dense vectors and matrices, banded matrices, triangular and Hermitian dense
matrices, and triangular and Hermitian banded matrices.&lt;/li&gt;
&lt;li&gt;It provides mutable types and operations that mutate them in either the ST
or the IO monad.&lt;/li&gt;
&lt;li&gt;It maintains a clean distinction in the monadic functions between arguments
that get mutated and arguments that get read.  This allows passing immutable
objects without calls to &amp;ldquo;unsafeThaw&amp;rdquo; everywhere.&lt;/li&gt;
&lt;li&gt;It gives a convenient interface to the Fortran BLAS functions.  The emphasis
here is on convenient.  When using the Fortran functions directly, &amp;ldquo;dgemm&amp;rdquo;,
the function that multiplies a dense matrix by another dense matrix, takes
no fewer than 13 arguments.  In the GNU Scientific Library, the binding
for the function takes 7 arguments.  The Haskell version takes 5.&lt;/li&gt;
&lt;li&gt;It uses BLAS calls internally to perform elementwise vector addition,
subtraction, multiplication, and division, so these operations should be
very efficient.&lt;/li&gt;
&lt;li&gt;It gives nearly complete access to all of the functionality in the Fortran
libraries.  The only missing functions for the dense matrix types are the
are the matrix updating functions (ger, syr, etc.).  Support for packed
storage of triangular matrices is absent.  Either of these would be a good
project for anyone interested.  Neither would be too difficult.&lt;/li&gt;
&lt;li&gt;Other than the matrix updated and the packed storage (and one other small
thing, which I will talk about below), anything you can do in Fortran can
be done in Haskell.  Since most of the computation time in a numerical
routine is in the floating point operations, this means that in principle
you can write code in Haskell that will be just as fast the equivalent
Fortran, at least for moderately-sized inputs.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Here&amp;rsquo;s what the library is not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It will never provide support for multiplication by the transpose of a
complex matrix without making a copy.  This is the missing feature hinted
at above.  Even though Fortran BLAS supports this, it is fundamentally
impossible for the Haskell bindings.  It is debatable how important this
is, since multiplying by the conjugate of the transpose &lt;em&gt;is&lt;/em&gt; supported.&lt;/li&gt;
&lt;li&gt;It is not a general array library.  The only supported element types are
&lt;code&gt;Double&lt;/code&gt; and &lt;code&gt;Complex Double&lt;/code&gt;.  This will likely never change.  If you want
something general, use &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/array"&gt;array&lt;/a&gt;, &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/carray"&gt;carray&lt;/a&gt;, &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uvector"&gt;uvector&lt;/a&gt;, or one of the
thousand other Haskell array libraries.&lt;/li&gt;
&lt;li&gt;It is not a full-featured linear algebra library.  You cannot compute an
eigenvalue.  You cannot perform a matrix decomposition.  You cannot solve a
linear system.  This is a library for &lt;em&gt;writing&lt;/em&gt; a full-featured linear
algebra library.  It is no good alone for doing anything substantial.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Interested in hacking? Please do. There is a list of project ideas
&lt;a href="http://github.com/patperry/blas/tree/master/TODO"&gt;in the TODO FILE&lt;/a&gt;.  If
any of these sounds worthwhile and you would like to work on it, let me
know and I will give you some guidance.&lt;/p&gt;

&lt;p&gt;Now that that&amp;rsquo;s out of the way, I can (finally) start LAPACK bindings.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/QEalmorlFRE" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2009/01/09/new-haskell-blas-bindings/</feedburner:origLink></entry>
  
  <entry>
    <title>Monte Carlo Poker Odds</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/L9qvujFY7Uc/" />
    <updated>2008-12-31T00:00:00-08:00</updated>
    <id>tag:ptrckprry.com,2008-12-31:/blog/2008/12/31/monte-carlo-poker-odds/</id>
    <content type="html">&lt;p&gt;There&amp;rsquo;s a new version of
&lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/monte-carlo"&gt;monte-carlo&lt;/a&gt;,
the Monte Carlo monad and transformer I wrote for Haskell. The highlights of the
release are a &lt;code&gt;MonadMC&lt;/code&gt; typeclass and functions for sampling from general
discrete distributions. This post gives a demonstration of the library by
showing how to estimate poker odds via Monte Carlo simulation.&lt;/p&gt;

&lt;p&gt;The goal of the program will be to estimate the distribution of poker
hands from dealing five cards out of a well-shuffled deck of fifty-two
cards.  For reference, &lt;a href="http://en.wikipedia.org/wiki/Poker_probability"&gt;Wikipedia&lt;/a&gt; gives the probabilities
of the poker hands.  I&amp;rsquo;m not going to bother distinguishing between a
royal flush and a straight flush.&lt;/p&gt;

&lt;p&gt;In the program below, we will need to import the following headers&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import Control.Monad
import Control.Monad.MC
import Data.List
import Data.Map( Map )
import qualified Data.Map as Map
import System.Environment
import Text.Printf
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second of these is part of the &lt;code&gt;monte-carlo&lt;/code&gt; package.&lt;/p&gt;

&lt;h2&gt;Poker Functions&lt;/h2&gt;

&lt;p&gt;In Haskell, we need to define types for cards and functions for
classifying hands.  First, we define a card:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;data Suit = Club | Diamond  | Heart | Spade deriving (Eq, Show)
data Card = Card { number :: Int 
                 , suit   :: Suit
                 }
          deriving (Eq, Show)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here are the numerical values for the face cards,&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ace, jack, queen, king :: Int
ace   = 1
jack  = 11
queen = 12
king  = 13
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and here is how we get a complete deck of cards&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;deck :: [Card]
deck = [ Card i s | i &amp;lt;- [ 1..13 ],
                    s &amp;lt;- [ Club, Diamond, Heart, Spade ] ]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Next, we enumerate the different hands, and define a function that takes a list
of five cards and tells us what hand it is&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;data Hand = HighCard  | Pair | TwoPair | ThreeOfAKind | Straight
          | Flush | FullHouse | FourOfAKind | StraightFlush 
          deriving (Eq, Show, Ord)

hand :: [Card] -&amp;gt; Hand
hand cs = 
  case matches of 
    [1,1,1,1,1] -&amp;gt; case undefined of
                     _ | isStraight &amp;amp;&amp;amp; isFlush -&amp;gt; StraightFlush
                     _ | isFlush               -&amp;gt; Flush
                     _ | isStraight            -&amp;gt; Straight
                     _ | otherwise             -&amp;gt; HighCard
    [1,1,1,2]                                  -&amp;gt; Pair
    [1,2,2]                                    -&amp;gt; TwoPair
    [1,1,3]                                    -&amp;gt; ThreeOfAKind
    [2,3]                                      -&amp;gt; FullHouse
    [1,4]                                      -&amp;gt; FourOfAKind
  where
    (x:xs) = (sort . map number) cs
    (s:ss) = map suit cs

    isStraight | x == ace &amp;amp;&amp;amp; xs == [ 10..king ] = True
               | otherwise                      = xs == [ x+1..x+4 ]

    isFlush = all (== s) ss

    matches = (sort . map length . group) (x:xs)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The only tricky part is the special handling of the ace in testing for a
straight.&lt;/p&gt;

&lt;h2&gt;Monte Carlo Functions&lt;/h2&gt;

&lt;p&gt;To choose a random five-card hand, we use the &lt;code&gt;sampleSubset&lt;/code&gt; function from
&lt;code&gt;Control.Monad.MC&lt;/code&gt;, which has type&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sampleSubset :: (MonadMC m) =&amp;gt; Int -&amp;gt; Int -&amp;gt; [a] -&amp;gt; m [a]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We give as parameters the subset size, the collection size, and a list of the
collection elements.  So, to get a five-card hand from a deck of fifty-two
cards, we define a function &lt;code&gt;deal&lt;/code&gt; as&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;deal :: (MonadMC m) =&amp;gt; m [Card]
deal = sampleSubset 5 52 deck
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The signature of the function looks a little strange, because it is polymorphic
in the monad type.  We could have written the signature as &lt;code&gt;deal :: MC [Card]&lt;/code&gt;.
However, by using the more general signature, we can use the function with
either the &lt;code&gt;MC&lt;/code&gt; monad or with &lt;code&gt;MCT&lt;/code&gt;, the monad transormer version.&lt;/p&gt;

&lt;p&gt;The bulk of the work in the simulation gets performed by the &lt;code&gt;repeatMCWith&lt;/code&gt;
function, which has signature&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;repeatMCWith :: (MonadMC m)
             =&amp;gt; (a -&amp;gt; b -&amp;gt; a) -- ^ accumulator
             -&amp;gt; a             -- ^ initial value
             -&amp;gt; Int           -- ^ number of repetitions
             -&amp;gt; m b           -- ^ generator
             -&amp;gt; m a
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This function is an analogue of &lt;code&gt;foldl&lt;/code&gt;.  It repeats a Monte Carlo action a
specified number of times and accumulates the results.  To tally up the
counts of all of the hands, we define&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;type HandCounts = Map Hand Int

emptyCounts :: HandCounts
emptyCounts = Map.empty

updateCounts :: HandCounts -&amp;gt; [Card] -&amp;gt; HandCounts
updateCounts counts cs = Map.insertWith' (+) (hand cs) 1 counts
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then, we use these functions in combination with &lt;code&gt;deal&lt;/code&gt; and &lt;code&gt;reapeatMCWith&lt;/code&gt;
to estimate the probabilities of all of the hands.  Here is the &lt;code&gt;main&lt;/code&gt; function
we use&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;main = do
  [reps] &amp;lt;- map read `fmap` getArgs
  main' reps

main' reps =
  let seed   = 0
      counts = repeatMCWith updateCounts emptyCounts reps deal
               `evalMC` mt19937 seed in do
  printf "\n"
  printf "    Hand       Count    Probability     99%% Interval   \n"
  printf "-------------------------------------------------------\n"
  forM_ ((reverse . Map.toAscList) counts) $ \(h,c) -&amp;gt;
      let n     = fromIntegral reps :: Double
          p     = fromIntegral c / n 
          se    = sqrt (p * (1 - p) / n)
          delta = 2.575829 * se
          (l,u) = (p-delta, p+delta) in
      printf "%-13s %7d    %.6f   (%.6f,%.6f)\n" (show h) c p l u
  printf "\n"
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Results &amp;amp; Discussion&lt;/h2&gt;

&lt;p&gt;Here are the results from running the simulation with one million repititions:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    Hand       Count    Probability     99% Interval   
-------------------------------------------------------
StraightFlush      12    0.000012   (0.000003,0.000021)
FourOfAKind       224    0.000224   (0.000185,0.000263)
FullHouse        1452    0.001452   (0.001354,0.001550)
Flush            1908    0.001908   (0.001796,0.002020)
Straight         3980    0.003980   (0.003818,0.004142)
ThreeOfAKind    21341    0.021341   (0.020969,0.021713)
TwoPair         47480    0.047480   (0.046932,0.048028)
Pair           423785    0.423785   (0.422512,0.425058)
HighCard       499818    0.499818   (0.498530,0.501106)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On my machine it takes about six seconds for the simulation to run.  We can
see that all of the intervals contain the true answer.  However, the relative
accuracty of the uncommon hands is not very good.  In a later post, I&amp;rsquo;ll show
how to use &lt;a href="http://en.wikipedia.org/wiki/Importance_sampling"&gt;Importance Sampling&lt;/a&gt; to get better estimates of the probabilities
for the rare hands.&lt;/p&gt;

&lt;p&gt;Thank you to Aditya Majahan for suggesting the inclusion of &lt;code&gt;reapeatMC&lt;/code&gt; in
the library.  Please send any more usage reports or feature requests my way.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/L9qvujFY7Uc" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/12/31/monte-carlo-poker-odds/</feedburner:origLink></entry>
  
  <entry>
    <title>ANN: BLAS Bindings for Haskell, version 0.6</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/-Jt27ttO2Ok/" />
    <updated>2008-10-31T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-10-31:/blog/2008/10/31/ann-blas-bindings-for-haskell-version-06/</id>
    <content type="html">&lt;p&gt;There&amp;rsquo;s a &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas"&gt;new version of the BLAS bindings out&lt;/a&gt;. I put a lot of work
into it and did a pretty massive overhaul of the code. The highlight of the new
release is that now you can do operations in the &lt;code&gt;ST&lt;/code&gt; monad. Also, I fixed a lot
of organizational issues (no more orphan instances!), and cleaned up the
interface a bit. There are a few performance improvements, too (I shaved half
second off &lt;a href="/blog/2008/07/24/addressing-haskell-blas-performance-issues/"&gt;that old
benchmark&lt;/a&gt;). The
downside is that I completely broke backwards compatibility, but since as far as
I can tell I only have two users, I&amp;rsquo;m not too worried about that.&lt;/p&gt;

&lt;p&gt;People have been clamoring for a tutorial, but unfortunately I still don&amp;rsquo;t have
time. My Orals are in ten days, and this stuff is not really core to my
research.  Maybe in a few months I&amp;rsquo;ll do something.&lt;/p&gt;

&lt;p&gt;In the mean time, I did manage to come up with some sample code. Here&amp;rsquo;s a
Fortan90 routine for recursively computing an &lt;a href="http://en.wikipedia.org/wiki/LU_decomposition"&gt;LU decomposition&lt;/a&gt; with row
pivoting, taken from &lt;a href="http://www.netlib.org/netlib/utk/people/JackDongarra/PAPERS/beautiful-code.pdf"&gt;Jack Dongarra and Piotr Luszczek&amp;rsquo;s chapter (PDF)&lt;/a&gt; in &lt;em&gt;Beautiful Code&lt;/em&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;recursive subroutine rdgetrf(m, n, a, lda, ipiv, info) 
 implicit none 

 integer, intent(in) :: m, n, lda 
 double precision, intent(inout) :: a(lda,*) 
 integer, intent(out) :: ipiv(*) 
 integer, intent(out) :: info 

 integer :: mn, nleft, nright, i 
 double precision :: tmp 

 double precision :: pone, negone, zero 
 parameter (pone=1.0d0) 
 parameter (negone=-1.0d0) 
 parameter (zero=0.0d0) 

 intrinsic min
 integer idamax 
 external dgemm, dtrsm, dlaswp, idamax, dscal 

 mn = min(m, n) 

 if (mn .gt. 1) then 
    nleft = mn / 2 
    nright = n - nleft 

    call rdgetrf(m, nleft, a, lda, ipiv, info) 

    if (info .ne. 0) return 
    call dlaswp(nright, a(1, nleft+1), lda, 1, nleft, ipiv, 1) 

    call dtrsm('L', 'L', 'N', 'U', nleft, nright, pone, a, lda, 
$        a(1, nleft+1), lda) 

    call dgemm('N', 'N', m-nleft, nright, nleft, negone, 
$        a(nleft+1,1) , lda, a(1, nleft+1), lda, pone, 
$        a(nleft+1, nleft+1), lda) 

    call rdgetrf(m - nleft, nright, a(nleft+1, nleft+1), lda, 
$        ipiv(nleft+1), info)
    if (info .ne. 0) then 
       info = info + nleft 
       return 
    end if 

    do i = nleft+1, m 
       ipiv(i) = ipiv(i) + nleft 
    end do 

    call dlaswp(nleft, a, lda, nleft+1, mn, ipiv, 1) 

 else if (mn .eq. 1) then 
    i = idamax(m, a, 1) 
    ipiv(1) = i 
    tmp = a(i, 1) 

    if (tmp .ne. zero .and. tmp .ne. -zero) then 
       call dscal(m, pone/tmp, a, 1) 

       a(i,1) = a(1,1) 
       a(1,1) = tmp 
    else 
       info = 1 
    end if 

 end if 

 return 
 end 
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here&amp;rsquo;s the same program in Haskell, using version 0.6 of the blas bindings:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;module LU ( luFactorize ) where

import BLAS.Elem( BLAS3 )
import Control.Monad
import Control.Monad.ST
import Data.Matrix.Dense
import Data.Matrix.Dense.ST
import Data.Matrix.Tri
import Data.Vector.Dense.ST

luFactorize :: (BLAS3 e) =&amp;gt; STMatrix s (m,n) e -&amp;gt; ST s (Either Int [Int])
luFactorize a
    | mn &amp;gt; 1 =
        let nleft = mn `div` 2
            (a_1, a_2) = splitColsAt nleft a
            (a11, a21) = splitRowsAt nleft a_1
            (a12, a22) = splitRowsAt nleft a_2
        in luFactorize a_1 &amp;gt;&amp;gt;=
               either (return . Left) (\pivots -&amp;gt; do
                   zipWithM_ (swapRows a_2) [0..] pivots
                   doSolveMat_ (lowerU a11) a12
                   doSApplyAddMat (-1) a21 a12 1 a22
                   luFactorize a22 &amp;gt;&amp;gt;=
                       either (return . Left . (nleft+)) (\pivots' -&amp;gt; do
                           zipWithM_ (swapRows a21) [0..] pivots'
                           return (Right $ pivots ++ map (nleft+) pivots')
                       )
               )
    | mn == 1 = 
        let x = colView a 0
        in getWhichMaxAbs x &amp;gt;&amp;gt;= \(i,e) -&amp;gt;
            if (e /= 0) 
                then do
                    scaleBy (1/e) x
                    readElem x 0 &amp;gt;&amp;gt;= writeElem x i
                    writeElem x 0 e
                    return $ Right [i]
                else
                    return $ Left 0
    | otherwise =
        return (Right [])
  where
    (m,n) = shape a
    mn    = min m n
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The Haskell version returns &lt;code&gt;Left&lt;/code&gt; with a column index in the event of failure,
and &lt;code&gt;Right&lt;/code&gt; with a list of the pivot swaps in the case of success. It makes
exactly the same BLAS calls as the Fortran90 version. The performance of the two
versions should be close, especially for large inputs. (Anyone want to verify
this?)&lt;/p&gt;

&lt;p&gt;Some day it will be possible to do serious scientific computing in Haskell without much effort.  This is one small step towards that goal.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/-Jt27ttO2Ok" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/10/31/ann-blas-bindings-for-haskell-version-06/</feedburner:origLink></entry>
  
  <entry>
    <title>A Monte Carlo Monad for Haskell</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/gWJfxdi5Yz0/" />
    <updated>2008-08-26T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-08-26:/blog/2008/08/26/a-monte-carlo-monad-for-haskell/</id>
    <content type="html">&lt;p&gt;I&amp;rsquo;ve just uploaded two new packages to Hackage. The first, &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/gsl-random"&gt;gsl-random&lt;/a&gt;, is a
set of bindings to the random number generators and random number distributions
that come as part of the &lt;a href="http://www.gnu.org/software/gsl/"&gt;GNU Scientific Library&lt;/a&gt;. The next package,
&lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/monte-carlo"&gt;monte-carlo&lt;/a&gt;, is a monad and monad transformer for performing computations
that require a random number generator, and is based on &lt;code&gt;gsl-random&lt;/code&gt;. This post
will give you a taste of what the &lt;code&gt;MC&lt;/code&gt; Monte Carlo monad can do.&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;For those unfamiliar with Monte Carlo, the basic idea is to use randomness to
compute a nonrandom quantity. You generate a sequence of random variables &lt;code&gt;X&lt;/code&gt;
with mean equal to some quantity you care about, and then form a confidence
interval for the quantity based on the mean and standard deviation of the
simulated values.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s an example that will make things a little more concrete. Suppose you want
to compute &lt;code&gt;pi&lt;/code&gt;. Say you have the ability to generate random points in the unit
box &lt;code&gt;[-1,1]^2&lt;/code&gt;. The box has an area of &lt;code&gt;2^2 = 4&lt;/code&gt;. The unit disc, which is
contained completely in the box, has an area of &lt;code&gt;pi&lt;/code&gt;. Therefore, the probability
of &lt;code&gt;X&lt;/code&gt; landing inside the unit circle is &lt;code&gt;pi/4&lt;/code&gt;. So, if we generate a whole
bunch of &lt;code&gt;X&lt;/code&gt;s, we should expect about &lt;code&gt;pi/4&lt;/code&gt; of them to land inside the unit
circle. This means we can get an estimate of &lt;code&gt;pi&lt;/code&gt; by generating a bunch of
random points, counting how many land inside the circle, and then multiplying by
&lt;code&gt;4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Of course, the more points we take, the more accurate our answer will be. We can
even get an approximate confidence interval for the true value by computing the
standard deviation of the values.&lt;/p&gt;

&lt;h2&gt;Using the &lt;code&gt;MC&lt;/code&gt; Monad&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;monte-carlo&lt;/code&gt; package provides a monad, called &lt;code&gt;MC&lt;/code&gt;, for performing Monte
Carlo computations. The package exports a function for generating values
uniformly over an interval:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;uniform :: Double -&amp;gt; Double -&amp;gt; MC Double
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This function takes the lower and upper bounds, and produces a value in that
range.  We can use this function to generate a value in the unit box:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;unitBox :: MC (Double,Double)
unitBox = liftM2 (,) (uniform (-1) 1) 
                     (uniform (-1) 1)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We will need a function to test if a point lies inside the unit circle:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;inUnitCircle :: (Double,Double) -&amp;gt; Bool
inUnitCircle (x,y) = x*x + y*y &amp;lt;= 1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here is a function to generate &lt;code&gt;n&lt;/code&gt; points and count how many fall inside the
circle, and then to compute an estimate and standard error for &lt;code&gt;pi&lt;/code&gt;
based on n samples:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;computePi :: Int -&amp;gt; MC (Double,Double)
computePi n = do
    xs &amp;lt;- replicateM n unitBox
    let m  = length $ filter inUnitCircle xs
        p  = toDouble m / toDouble n
        se = sqrt (p * (1 - p) / toDouble n)
    return (4*p, 4*se)

  where
    toDouble = realToFrac . toInteger
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Running the Simulation&lt;/h2&gt;

&lt;p&gt;To get a value &lt;em&gt;out&lt;/em&gt; of the &lt;code&gt;MC&lt;/code&gt; monad, we must provide it with a random number
generator. To get a Mersenne Twister generator, we use the &lt;code&gt;mt19937&lt;/code&gt; function.
Then, evaluate the result with &lt;code&gt;evalMC&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s an example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;main = let
    n       = 10000000
    seed    = 0
    (mu,se) = evalMC (computePi n) $ mt19937 seed
    delta   = 2.575*se
    (l,u)   = (mu-delta, mu+delta)
    in do
        printf "Estimate:                   %g\n" mu
        printf "99%% Confidence Interval:    (%g,%g)\n" l u
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It takes my 2GHz Macbook a little under five seconds to run the simulation with
ten million samples. Here are the results I get:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Estimate:                   3.1414584
99% Confidence Interval:    (3.1401211162675353,3.1427956837324644)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Pretty cool, eh?&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;You might be wondering why I didn&amp;rsquo;t just use &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/MonadRandom"&gt;MonadRandom&lt;/a&gt;? There are two
reasons: first, I didn&amp;rsquo;t want to write routines to generate random variables
from different distributions (Normal, Exponential, Poisson, etc.). The GSL
provides these for me. Second, internally the &lt;code&gt;MC&lt;/code&gt; monad is &lt;em&gt;not&lt;/em&gt; using a pure
random number generator. It only keeps one copy of the generator state, and
modifies it every time it samples a value.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/gWJfxdi5Yz0" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/08/26/a-monte-carlo-monad-for-haskell/</feedburner:origLink></entry>
  
  <entry>
    <title>ANN: BLAS Bindings for Haskell, version 0.5</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/26yIaDDdXuU/" />
    <updated>2008-08-10T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-08-10:/blog/2008/08/10/ann-blas-bindings-for-haskell-version-05/</id>
    <content type="html">&lt;p&gt;I&amp;rsquo;ve put together a new release of the Haskell BLAS bindings, now
&lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas"&gt;available on hackage&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here are the new features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add &lt;code&gt;Banded&lt;/code&gt; matrix data type, as well as &lt;code&gt;Tri Banded&lt;/code&gt; and &lt;code&gt;Herm Banded&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Add support for trapezoidal dense matrices (&lt;code&gt;Tri Matrix (m,n) e&lt;/code&gt;, where
&lt;code&gt;m&lt;/code&gt; is not the same as &lt;code&gt;n&lt;/code&gt;).  Note that trapezoidal banded matrices are
&lt;em&gt;NOT&lt;/em&gt; supported.&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;Diag&lt;/code&gt; matrix data type for diagonal matrices.&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;Perm&lt;/code&gt; matrix data type, for permutation matrices.&lt;/li&gt;
&lt;li&gt;Enhance the &lt;code&gt;RMatrix&lt;/code&gt; and &lt;code&gt;RSolve&lt;/code&gt; type classes with an API that allows
specifying where to store the result of a computation.&lt;/li&gt;
&lt;li&gt;Enhance the &lt;code&gt;IMatrix&lt;/code&gt;, &lt;code&gt;RMatrix&lt;/code&gt;, &lt;code&gt;ISolve&lt;/code&gt;, and &lt;code&gt;RSolve&lt;/code&gt; type classes to add
&amp;ldquo;scale and multiply&amp;rdquo; operations.&lt;/li&gt;
&lt;li&gt;Remove the scale parameter for &lt;code&gt;Tri&lt;/code&gt; and &lt;code&gt;Herm&lt;/code&gt; matrix data types.&lt;/li&gt;
&lt;li&gt;Flatten the data types for &lt;code&gt;DVector&lt;/code&gt; and &lt;code&gt;DMatrix&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Some inlining and unpacking performance improvements.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;As far as what to expect in version 0.6, I plan to add support for operations
in the ST monad.  This is going to require a pretty big code reorganization and
will cause quite a few API breakages, but I think it&amp;rsquo;s worth the pain.  The
next release will also come with a tutorial and examples.  After the big
code reorganization, I&amp;rsquo;ll get started on LAPACK bindings.&lt;/p&gt;

&lt;p&gt;Please let me know if you are using the library.  I&amp;rsquo;m really interested in
what people like and don&amp;rsquo;t like.  If you think that some functionality is
missing, let me know.  If you think the API is awkward in certain places,
let me know that, too.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/26yIaDDdXuU" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/08/10/ann-blas-bindings-for-haskell-version-05/</feedburner:origLink></entry>
  
  <entry>
    <title>Addressing Haskell BLAS Performance Issues</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/O6HqwFr_2gU/" />
    <updated>2008-07-24T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-07-24:/blog/2008/07/24/addressing-haskell-blas-performance-issues/</id>
    <content type="html">&lt;p&gt;Last month Anatoly Yakovenko started &lt;a href="http://www.haskell.org/pipermail/haskell-cafe/2008-June/044401.html"&gt;a thread on haskell-cafe&lt;/a&gt; about
the Haskell blas bindings being much slower than using raw C.  I was in Denver
at the time on a drive across the United States, so I didn&amp;rsquo;t participate in much
of the conversation.  The conclusion seemed to be that the bindings were about
thirty times slower than C.  Ouch.&lt;/p&gt;

&lt;p&gt;Here is a C program that computes ten million dot product between two vectors of
doubles:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;cblas.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;

int main() 
{
   int size  = 10;
   int times = 10*1000*1000;
   int i = 0;

   double *x = malloc( size*sizeof( double ) );
   double *y = malloc( size*sizeof( double ) );

   for( i = 0; i &amp;lt; times; ++i ) 
   {
      cblas_ddot( size, x, 1, y, 1 );
   }

   free( x );
   free( y );

   return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There is no overhead for initializing the vector&amp;mdash; just for allocating and
freeing it.  The equivalent Haskell program, using mutable vectors, is:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;module Main where

import Control.Monad
import Data.Vector.Dense.IO

main = do
   let size  = 10
   let times = 10*1000*1000

   x &amp;lt;- newVector_ size :: IO (IOVector n Double)
   y &amp;lt;- newVector_ size 
   replicateM_ times $ x `getDot` y
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The Haskell program also checks that the lengths of x and y match, but the
overhead from this is only about a tenth of a second.  There was some grumbling
on haskell-cafe about the compiler annotations allowing the removal of the for
loop, but this doesn&amp;rsquo;t seem to be happening for me when I compile with -O2.&lt;/p&gt;

&lt;p&gt;The runtime from the C version is about 0.380 seconds, and from the Haskell
version it is about 4.345 seconds.  The discrepancy is 11.5 times worse
instead of 30, but still bad.  I claimed that the overhead does not grow with
the size of the vector, and indeed this seems to be the case.  When I increase
the size to 1024, the C version runs in about 15.883 seconds and the Haskell
version runs in about 19.900 seconds.&lt;/p&gt;

&lt;p&gt;Depending on the size of vectors you&amp;rsquo;re dealing with the Haskell performance is
either terrible or acceptable.  Still, I wondered if I could do better.&lt;/p&gt;

&lt;p&gt;The root of the inefficiency is the vector data type I used. &lt;a href="/blog/2008/06/12/blas-data-types"&gt;In my last post&lt;/a&gt;
I argued that it&amp;rsquo;s useful to make conjugating a vector be an O(1) operation. One
way to do this is to store a boolean flag &amp;ldquo;isConj&amp;rdquo; that indicates whether or not
the vector is conjugated.  If so, the values stored in memory are the complex
conjugates of the values in the vector.  Another way to do this is to define the
vector data type as&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;data DVector t n e =
       DV { fptr   :: !(ForeignPtr e)
          , offset :: !Int
          , len    :: !Int
          , stride :: !Int
          }
     | C !(DVector t n e)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In this representation, a vector of the form &amp;ldquo;DV f o l s&amp;rdquo; is a raw vector, and
&amp;ldquo;C x&amp;rdquo; is the conjugate of the vector x.  In the first version of the library, I
went with the second representation.  Originally, it was for aesthetic reasons,
but now I&amp;rsquo;m not so sure of the relative pretty-ness of the two approaches.  One
thing Wren Ng Thorton pointed out to me is that with the two representations are
not equivalent, since in the second you could have C(C(C(&amp;hellip;C(DV(&amp;hellip;))&amp;hellip;))) as
a legitamate value.&lt;/p&gt;

&lt;p&gt;When you take performance considerations into account, a boolean flag is a
clear winner over an algebraic data type.  When I switched to the first
representation and re-ran the benchmarks, I got 1.264 seconds for vectors of
size 10 and 16.839 seconds for vectors of size 1024.  The comparison I&amp;rsquo;ve done
between the two data representation isn&amp;rsquo;t perfect, because in the boolean flag
code I also incorporated some unboxing and inlining changes suggested by Don
Stewart.  Still, the biggest performance gains came from the data types.&lt;/p&gt;

&lt;p&gt;When you remove the length checking (by using &lt;code&gt;unsafeGetDot&lt;/code&gt; instead of
&lt;code&gt;getDot&lt;/code&gt;), the times for the update Haskell benchmarks are 1.095 seconds
and 16.682 seconds.  So, for ten million dot products, there is an overhead of
about 0.6 seconds for using Haskell instead of C, regardless of the vector
size.  This is pretty good.&lt;/p&gt;

&lt;p&gt;The next release of the bindings will incorporate these changes.  Thanks go to
everyone on haskell-cafe for their help, especially Anatoly for pointing out the
problem.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/O6HqwFr_2gU" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/07/24/addressing-haskell-blas-performance-issues/</feedburner:origLink></entry>
  
  <entry>
    <title>BLAS Data Types</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/VNfVLnORYIs/" />
    <updated>2008-06-12T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-06-12:/blog/2008/06/12/blas-data-types/</id>
    <content type="html">&lt;p&gt;I&amp;rsquo;m going to shed a little light on what I meant when &lt;a href="/blog/2008/06/06/ann-blas-bindings-for-haskell-version-04/"&gt;in my last post&lt;/a&gt; I
said that &amp;ldquo;BLAS and LAPACK (mostly) support&amp;rdquo; an O(1) &lt;code&gt;herm&lt;/code&gt; operation for
matrices.&lt;/p&gt;

&lt;p&gt;First I need to tell you about the BLAS data types.  There are two fundamental
data types in BLAS: dense vectors and dense matrices.&lt;/p&gt;

&lt;h2&gt;Vectors&lt;/h2&gt;

&lt;p&gt;A &lt;em&gt;dense vector&lt;/em&gt; is represented as an array of non-contiguous values, with a
fixed stride between values.  In C, you need three things to represent a vector:
the length of the vector, a pointer to the first element in the vector, and an
integer stride.  The stride is required to be at least 1, and the length has to
be non-negative.  Length is usually represented by a variable called &lt;code&gt;n&lt;/code&gt;, the
pointer is usually called &lt;code&gt;x&lt;/code&gt; or &lt;code&gt;y&lt;/code&gt;, and the stride (or &amp;ldquo;increment&amp;rdquo;) is
named &lt;code&gt;incx&lt;/code&gt; for &lt;code&gt;x&lt;/code&gt; or &lt;code&gt;incy&lt;/code&gt; for &lt;code&gt;y&lt;/code&gt;.  So, you could have&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;double *data = { 1.6, 1.7, -3.1, -0.2, 2.6, 1.1 };
int n = 3;
double *x = data;
int incx = 2;
double *y = data + 1;
int incy = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then, x would be a vector with elements (1.6, -3.1, 2.6), and y would be a
vector with elements (1.7, -3.1, -0.2).  Being able to have a non-unit stride
for vectors turns out to be incredibly useful.&lt;/p&gt;

&lt;h2&gt;Matrices&lt;/h2&gt;

&lt;p&gt;Matrices in BLAS are slightly more complicated.  There are two variants:
&lt;em&gt;row-major&lt;/em&gt;, and &lt;em&gt;column-major&lt;/em&gt;.  For CBLAS both are supported equally well,
but most Fortran code (including LAPACK) assumes column-major.  Unfortunately,
a lot of C code stores matrices in row-major order.  The inconsistency turns
out not to be a very big deal for real-valued matrices, but it can cause
trouble for complex-valued ones.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m going to stick with the convention that when I say &amp;ldquo;matrix&amp;rdquo; I mean
&amp;ldquo;column-major matrix&amp;rdquo;.  The important thing to remember is that elements in
the same column are stored contiguously, but elements in the same row are not.
(For row-major, the reverse is true.)&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;dense matrix&lt;/em&gt; is represented by four numbers: the number of rows, the
number of columns, a pointer to the upper-left element, and the
&lt;em&gt;leading dimension&lt;/em&gt;.  The numbers of rows and columns are usually given by
&lt;code&gt;m&lt;/code&gt; and &lt;code&gt;n&lt;/code&gt;.  The pointer to the element is usually named &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;, or &lt;code&gt;c&lt;/code&gt;.
The leading dimension is named for the pointer it is associated with, and
would be &lt;code&gt;lda&lt;/code&gt; to go with &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;ldb&lt;/code&gt; to go with &lt;code&gt;b&lt;/code&gt;.  What&amp;rsquo;s &lt;code&gt;lda&lt;/code&gt;?  This
is the stride between consecutive elements of the same row.  It must be
greater than or equal to &lt;code&gt;min(1,m)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;An example will clarify things a bit.  Let&amp;rsquo;s say we want to represent the
matrix &lt;code&gt;a = [ 1 4; 2 5; 3 6 ]&lt;/code&gt;.  In MATLAB notation, this is a 3-by-2 matrix;
the first column is (1, 2, 3), and the second column is (4, 5, 6).  In C, we
would have&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;double *data = { 1, 2, 3, 4, 5, 6 };
int m = 3, n = 2;
double *a = data
int lda = 3;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Since there is no gap in memory between the last element of the first column
and the first element of the second column, &lt;code&gt;lda&lt;/code&gt; is equal to &lt;code&gt;m&lt;/code&gt;.  Now, what
if we want to represent the submatrix &lt;code&gt;b = a(1:2,1:2)&lt;/code&gt;?  This is easy:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;int mb = 2, nb = 2;
double *b = a;
int ldb = 3;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We can also similarly represent &lt;code&gt;c = a(2:3,1:2)&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;int mc = 2, nc = 2;
double *c = a + 1;
int ldc = 3;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In general we can represent any submatrix whose row and column indices are
contiguous.&lt;/p&gt;

&lt;p&gt;What else can we do with a matrix?  Well, because of the stride parameter, we
can easily represent a row of the matrix &lt;code&gt;a&lt;/code&gt; (it has stride &lt;code&gt;lda&lt;/code&gt;), or a
column (it has stride &lt;code&gt;1&lt;/code&gt;).  We can also represent a diagonal of &lt;code&gt;a&lt;/code&gt; using
stride &lt;code&gt;lda+1&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Data Types&lt;/h2&gt;

&lt;p&gt;Now, what did I mean when I said that BLAS &amp;ldquo;supported&amp;rdquo; an O(1) herm operation?
First of all, notice that I have been a little vague when I&amp;rsquo;ve given the
definitions for the vector and matrix data types.  Probably most of you were
expecting me to introduce a &lt;code&gt;struct&lt;/code&gt; at some point.  Sadly, BLAS does not
define any new formal types.  BLAS only defines functions.  The &amp;ldquo;types&amp;rdquo; I
talked about above are really just conventions that all of the functions
adhere to.  So, here&amp;rsquo;s the function signature for &lt;code&gt;daxpy&lt;/code&gt;, the function that
performs the operation &lt;code&gt;y := alpha * x + y&lt;/code&gt;, where &lt;code&gt;alpha&lt;/code&gt; is a scalar and
&lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; are vectors:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;void cblas_daxpy (int n, double alpha, 
                  const double *x, int incx, 
                  double *y, int incy);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Only primitive types appear in the argument list.  The only way to tell that
the function operates on vectors is by looking at the names of the arguments
and reading the documentation.  This is annoying.  Let&amp;rsquo;s fix that by defining
our own vector type:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;typedef struct 
{
    double *data;
    int size;
    int stride;
    int is_conj;
} vector_t;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(Ignore &lt;code&gt;is_conj&lt;/code&gt; for now; the rest should be self-explanatory).  Now, we can
simplify &lt;code&gt;daxpy&lt;/code&gt; a bit:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;void my_daxpy (double alpha, const vector_t *x, 
               vector_t *y);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can imagine that this will clean up a lot of our code.  Let&amp;rsquo;s see if we can
use the same trick for matrices using &lt;code&gt;dgemv&lt;/code&gt;, the function to multiply a matrix
by a vector, as an example.  Specifically, the function performs the operation
&lt;code&gt;y := alpha * op(A) * x + beta * y&lt;/code&gt; where &lt;code&gt;op&lt;/code&gt; is either &amp;ldquo;transpose&amp;rdquo;, &amp;ldquo;herm&amp;rdquo;, or &amp;ldquo;identity&amp;rdquo; (&amp;ldquo;herm&amp;rdquo; means conjugate transpose).&lt;/p&gt;

&lt;p&gt;Here is the signature:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;void cblas_dgemv (enum CBLAS_ORDER order,
                  enum CBLAS_TRANSPOSE transa, 
                  int m, int n,
                  double alpha, const double *a, int lda,
                  const double *x, int incx,
                  double beta, double *y, int incy);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There are twelve parameters.  Barf.  Can we clean this up?  Absolutely.  Here&amp;rsquo;s
the vector type we are going to use:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;typedef struct
{
    double *data;
    int size1;
    int size2;
    int lda;
    int is_herm;
} matrix_t;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I have introduced a boolean field &lt;code&gt;is_herm&lt;/code&gt; to indicate whether the matrix has
been transposed and conjugated.  This eliminates &lt;code&gt;transa&lt;/code&gt; from the call
signature:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;void my_dgemv (double alpha, 
               const matrix_t *a, const vector_t *x,
               double beta, 
               const vector_t *y);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We have eliminated seven parameters from the call, and we have only sacrificed
a little bit of functionality.  We have gained the ability to do run-time
dimension checking for the arguments.&lt;/p&gt;

&lt;p&gt;The feature we have lost is the ability to use &amp;ldquo;transpose&amp;rdquo;.  We can only do
&amp;ldquo;identity&amp;rdquo; and &amp;ldquo;herm&amp;rdquo;.  Why did I use a boolean for &lt;code&gt;is_herm&lt;/code&gt; rather than the
more general &lt;code&gt;CBLAS_TRANSPOSE&lt;/code&gt; type?  Because now we can have a function&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;void make_herm (matrix_t *a);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;that takes the &lt;code&gt;herm&lt;/code&gt; of a matrix as on O(1) operation.  It would be nice to
have &lt;code&gt;make_trans&lt;/code&gt; be an O(1) operation, too.  Sadly, because BLAS does not support &amp;ldquo;conjugate&amp;rdquo; as the &lt;code&gt;op&lt;/code&gt; type in a multiplication, we can either have
&lt;code&gt;trans&lt;/code&gt; be O(1) or we can have &lt;code&gt;herm&lt;/code&gt; be O(1), but we cannot have both.  I think
giving up &lt;code&gt;trans&lt;/code&gt; is worth simplifying the interface.&lt;/p&gt;

&lt;p&gt;(Astute readers will notice that for &lt;code&gt;gemv&lt;/code&gt;, it is possible to get &lt;code&gt;conj(a)&lt;/code&gt; by using &amp;ldquo;row-major&amp;rdquo; as the order and &amp;ldquo;herm&amp;rdquo; as the transpose type.  This trick does not extend to &lt;code&gt;dgemm&lt;/code&gt;, the function that multiplies two matrices.)&lt;/p&gt;

&lt;p&gt;Now I have to come back to why &lt;code&gt;vector_t&lt;/code&gt; has an &lt;code&gt;is_conj&lt;/code&gt; field.  This is
necessary if we want getting a row or a column of a matrix to be an O(1)
operation for &amp;ldquo;herm-ed&amp;rdquo; matrices.&lt;/p&gt;

&lt;h2&gt;Extending BLAS&lt;/h2&gt;

&lt;p&gt;BLAS &lt;em&gt;almost&lt;/em&gt; supports the simplifications in the API we&amp;rsquo;ve made by introducing
&lt;code&gt;vector_t&lt;/code&gt; and &lt;code&gt;matrix_t&lt;/code&gt;.  We need the following functionality ourselves to
make the simplifications work.  Here are the functions we need to add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;y := alpha * conj(x) + beta * y&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;y := alpha * op(A) * conj(x) + beta * y&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;conj(y) := alpha * op(A) * x + beta * conj(y)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The second function is missing when &lt;code&gt;y&lt;/code&gt; has non-unit stride, and the third is
missing when &lt;code&gt;x&lt;/code&gt; has non-unit stride.  This is because we can always cast a
vector as a matrix when it is conjugated.  If the vector is not conjugated, we
can only perform the cast if the stride is one. (Columns are stored contiguously
for normal matrices, but not for herm-ed matrices.)  Once the vectors have been
casted to matrices, we can call &lt;code&gt;gemm&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;The BLAS API is painfully verbose.  By adding our own data types and giving up
the ability to take the transpose of the matrix, we can get a far simpler
interface with nearly all of the power.  The approach I describe here is the one
I use in the &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas"&gt;blas bindings for Haskell&lt;/a&gt;.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/VNfVLnORYIs" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/06/12/blas-data-types/</feedburner:origLink></entry>
  
  <entry>
    <title>ANN: BLAS bindings for Haskell, version 0.4</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/lpnLaB_CXf8/" />
    <updated>2008-06-06T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-06-06:/blog/2008/06/06/ann-blas-bindings-for-haskell-version-04/</id>
    <content type="html">&lt;p&gt;I&amp;rsquo;ve written a set of bindings for the &lt;a href="http://www.netlib.org/blas/"&gt;BLAS&lt;/a&gt; linear algebra library, and I finally uploaded them to &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas"&gt;Hackage&lt;/a&gt; last night.  That was kind of the impetus for starting this blog: so there would be a place for me to make a formal announcement.&lt;/p&gt;

&lt;p&gt;Well, before I got a chance to make that announcement, I received the following e-mail:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;From: Alberto Ruiz
To: Patrick Perry
CC: haskell-cafe@haskell.org
Subject: Patrick Perry's BLAS package

Hello all,
I have just noticed that yesterday this fantastic package has been uploaded
to hackage.  We finally have a high quality library for numeric linear
algebra. This is very good news for the Haskell community.

Patrick, many thanks for your excellent work. Do you have similar plans
for LAPACK?
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I&amp;rsquo;m really happy that people seem to be interested in the library.  Alberto, in particular, is the primary author of &lt;a href="http://alberrto.googlepages.com/gslhaskell"&gt;hmatrix&lt;/a&gt;, another haskell linear algebra library (which I stole a few ideas from), so if he endorses it, that means a lot to me.&lt;/p&gt;

&lt;p&gt;So, Yet Another Linear Algebra Library?  I&amp;rsquo;ve already mentioned hmatrix.  There&amp;rsquo;s also another one called &lt;a href="http://www.cs.utah.edu/~hal/HBlas/index.html"&gt;HBlas&lt;/a&gt;.  Why would anyone want a third?  Here are my reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Support for both immutable and mutable types.&lt;/strong&gt;
Haskell tries to make you use immutable types as much as possible, and
indeed there is a very good reason for this, but sometimes you have a 100MB
matrix, and it just isn&amp;rsquo;t very practical to make a copy of it every time
you modify it.  &lt;em&gt;hmatrix&lt;/em&gt; only supports immutable types, and HBlas only
supports mutable ones.  I wanted both.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Access control via phantom types.&lt;/strong&gt;
When you have immutable and mutable types, it&amp;rsquo;s very annoying to have
separate functions for each type.  Do I want to have to call &amp;ldquo;numCols&amp;rdquo; for
immutable matrices and &amp;ldquo;getNumCols&amp;rdquo; for mutable ones, even though both
functions are pure, and both do exactly the same thing?  No.  If I want to
add an immutable matrix to a mutable one, to I want to first call
&lt;code&gt;unsafeThaw&lt;/code&gt; on the immutable one to cast it to be mutable?  No.  With the
phantom type trick, you can get around this insanity.  Jane Street Capital
has &lt;a href="http://ocaml.janestcapital.com/?q=node/11"&gt;a very good description&lt;/a&gt; of how this works.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Phantom types for matrix and vector shapes.&lt;/strong&gt;
This is a trick I learned from &lt;a href="http://www.haskell.org/haskellwiki/Darcs"&gt;darcs&lt;/a&gt;.  It means that the compiler can catch many dimension-mismatch mistakes.  So, for instance, a function like the following will not type-check.  (&lt;code&gt;&amp;lt;*&amp;gt;&lt;/code&gt; is the function to multiply a matrix by a vector.  Everything is ok if you replace &lt;code&gt;row&lt;/code&gt; by &lt;code&gt;col&lt;/code&gt;.)  This feature has caught a few bugs in my code.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt; foo :: (BLAS3 e) =&amp;gt; Matrix (m,n) e -&amp;gt; Matrix (n,k) e -&amp;gt; Int -&amp;gt; Vector m e
 foo a b i = let x = row b i in a &amp;lt;*&amp;gt; x
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Taking the conjugate transpose (&lt;code&gt;herm&lt;/code&gt;) of a matrix is an O(1)
operation.&lt;/strong&gt;
This is similar to &lt;em&gt;hmatrix&lt;/em&gt;, where taking the transpose is O(1).  As BLAS
and LAPACK (mostly) support this, it makes no sense to copy a matrix just
to work with the conjugate transpose.  Why conjugate transpose instead of
just transpose?  Because the former is a far more common operation.  This
is why the &lt;code&gt;'&lt;/code&gt; operator in MATLAB is conjugate transpose.  The drawback for
this feature is that BLAS and LAPACK do not support it everywhere.  In
particular, QR decomposition with pivoting is going to be a huge pain in
the ass to support for herm-ed matrices.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Support for triangular and hermitian views of matrices.&lt;/strong&gt;
This is a feature of BLAS that no one seems to support (not even MATLAB).
In addition to the &lt;code&gt;Matrix&lt;/code&gt; type, there are &lt;code&gt;Tri Matrix&lt;/code&gt; and &lt;code&gt;Herm Matrix&lt;/code&gt;
types that only refer to the upper- or lower-triangular part of the matrix.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Hopefully the features above are compelling enough to make people want to use the library.  These bindings have been a lot of work.  For me to come up with the feature list above, I&amp;rsquo;ve already gone through a few iterations of dramatic re-writes (hence the version number).  Of course, I always welcome suggestions for how to make it better.&lt;/p&gt;

&lt;p&gt;What&amp;rsquo;s next?  In the immediate future, I plan to add banded matrices.  I&amp;rsquo;ve already written a good chunk of code for this, but it isn&amp;rsquo;t very well tested, so I decided to leave it out of the release.  I&amp;rsquo;m also going to add permutation matrices.  I don&amp;rsquo;t have plans to add support for packed triangular matrices, but if someone else wanted to do that, I would be happy to include it.  The same goes for symmetric complex matrices.&lt;/p&gt;

&lt;p&gt;LAPACK support is on the horizon, but that may take awhile.  Also, I probably won&amp;rsquo;t do more than SVD, QR, and Cholesky, since those are all I need.  Expect a preliminary announcement by the end of the summer.&lt;/p&gt;

&lt;p&gt;This work would not have been possible without looking at the other excellent linear algebra libraries out there.  In particular the &lt;a href="http://www.gnu.org/software/gsl/"&gt;GNU Scientific Library&lt;/a&gt; was the basis for much of the design.  I also drew inspiration from hmatrix and the haskell array libraries.&lt;/p&gt;

&lt;p&gt;Thanks also to the folks at &lt;code&gt;#haskell&lt;/code&gt;.  You guys have been a lot of help.&lt;/p&gt;

&lt;p&gt;Please let me know if you have any success in using the library, and if you have any suggestions for how to make it better.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/lpnLaB_CXf8" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/06/06/ann-blas-bindings-for-haskell-version-04/</feedburner:origLink></entry>
  
  <entry>
    <title>First Post</title>
    <link href="http://feedproxy.google.com/~r/ptrckprry/~3/N1o-kLf71cw/" />
    <updated>2008-06-05T00:00:00-07:00</updated>
    <id>tag:ptrckprry.com,2008-06-05:/blog/2008/06/05/first-post/</id>
    <content type="html">&lt;p&gt;Look out blog-o-sphere, here I come.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/ptrckprry/~4/N1o-kLf71cw" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://ptrckprry.com/blog/2008/06/05/first-post/</feedburner:origLink></entry>
  
 
</feed>
