<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
  <title>Patrick O. Perry</title>
  <link href="http://ptrckprry.com/atom.xml" rel="self" />
  <link href="http://ptrckprry.com/" />
  <id>tag:ptrckprry.com,2009-10-31:atom.xml</id>
  <updated>2019-06-23T17:30:33+00:00</updated>
  <author>
    <name>Patrick O. Perry</name>
    <email>patperry@seas.harvard.edu</email>
  </author>
 
  
  <entry>
    <title>Network Science Reading List</title>
    <link href="http://ptrckprry.com/blog/networks/2010/11/08/network-science-reading-list/" />
    <updated>2010-11-08T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2010-11-08:/blog/networks/2010/11/08/network-science-reading-list/</id>
    <content type="html">&lt;p&gt;Network science is a huge field spanning many disciplines; for newcomers,
it is to know where to start.  What follows is an incomplete list of network
science papers I found to be interesting, organized by topic.&lt;/p&gt;

&lt;h2 id=&quot;exponential-random-graph-models&quot;&gt;Exponential Random Graph Models&lt;/h2&gt;

&lt;p&gt;ERGMs are the most widely used network models in the social sciences.
They model relational data through statistics like the numbers of triangles
and k-star subgraphs.  Unfortunately, they are difficult to fit and interpret.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Holland, P. W., and Leinhardt, S. (1981),
&lt;a href=&quot;http://www.jstor.org/stable/2287037&quot;&gt;“An Exponential Family of Probability Distributions for Directed Graphs,”&lt;/a&gt;
&lt;em&gt;J. Am. Stat. Assoc.&lt;/em&gt;, &lt;strong&gt;76&lt;/strong&gt;, 33-50&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Anderson, C. J., Wasserman S., and Crouch, B. (1999),
&lt;a href=&quot;http://linkinghub.elsevier.com/retrieve/pii/S0378873398000124&quot;&gt;“A p* Primer: Logit Models for Social Networks,”&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;21&lt;/strong&gt;, 37-66&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Snijders, T. A. B. (2002),
&lt;a href=&quot;http://www.cmu.edu/joss/content/articles/volume3/Snijders.pdf&quot;&gt;“Markov Chain Monte Carlo Estimation of Exponential Random Graph Models,”&lt;/a&gt;
&lt;em&gt;J. Soc. Struct.&lt;/em&gt;, &lt;strong&gt;3&lt;/strong&gt;, 1-40&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Handcock, M. S. (2003),
&lt;a href=&quot;http://www.csss.washington.edu/Papers/wp39.pdf&quot;&gt;“Assessing Degeneracy in Statistical Models of Social Networks,”&lt;/a&gt;
Working paper no. 39, Center for Statistics and the Social Sciences,
University of Washington-Seattle&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
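
To make the subgraph-statistic idea concrete, here is a small sketch (my own illustration, not code from any of the papers above) that computes two typical ERGM sufficient statistics, the edge count and the triangle count, for a toy undirected graph:

```haskell
import Data.List (subsequences)

type Edge = (Int, Int)

nodes :: [Int]
nodes = [1, 2, 3, 4]

graph :: [Edge]
graph = [(1, 2), (2, 3), (1, 3), (3, 4)]

-- undirected adjacency test
connected :: Int -> Int -> Bool
connected a b = elem (a, b) graph || elem (b, a) graph

-- all unordered triples of nodes
triples :: [a] -> [[a]]
triples = filter ((== 3) . length) . subsequences

edgeCount :: Int
edgeCount = length graph

triangleCount :: Int
triangleCount = length (filter isTriangle (triples nodes))
  where
    isTriangle [a, b, c] = and [connected a b, connected b c, connected a c]
    isTriangle _         = False

main :: IO ()
main = print (edgeCount, triangleCount)  -- (4,1)
```

An ERGM then puts probability on the whole graph proportional to the exponential of a weighted sum of such statistics; the intractable normalizing constant over all graphs is exactly what makes these models hard to fit.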

&lt;h2 id=&quot;latent-space-models&quot;&gt;Latent Space Models&lt;/h2&gt;

&lt;p&gt;Latent space models are an alternative to ERGMs; they get around dyadic
dependence by positing the existence of latent covariates.  Since their
introduction in 2002, they have been extended to include clustering and degree
heterogeneity.  Beware that these models impose a triangle inequality on
social space, which may not be appropriate.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Hoff, P. D., Raftery, A. E., and Handcock, M. S. (2002),
&lt;a href=&quot;http://www.stat.washington.edu/raftery/Research/PDF/hoff2002.pdf&quot;&gt;“Latent Space Approaches to Social Network Analysis,”&lt;/a&gt;
&lt;em&gt;J. Am. Stat. Assoc.&lt;/em&gt;, &lt;strong&gt;97&lt;/strong&gt;, 1090–1098&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Handcock, M. S., Raftery, A. E., and Tantrum, J. M. (2007),
&lt;a href=&quot;http://www.stat.washington.edu/raftery/Research/PDF/Handcock2007.pdf&quot;&gt;“Model-Based Clustering for Social Networks,”&lt;/a&gt;
&lt;em&gt;J. R. Statist. Soc. A&lt;/em&gt;, &lt;strong&gt;170&lt;/strong&gt;, 301-354&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Krivitsky, P. N., Handcock, M. S.,  Raftery, A. E., and Hoff, P. D. (2009),
&lt;a href=&quot;http://www.stat.washington.edu/raftery/Research/PDF/Krivitsky2009.pdf&quot;&gt;“Representing Degree Distributions, Clustering, and Homophily in Social Networks with Latent Cluster Random Effects Models,”&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;31&lt;/strong&gt;, 204-213&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
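
The core mechanism is easy to state.  In the sketch below (mine, with a made-up intercept and positions), each node has a latent position, ties are conditionally independent given the positions, and the log-odds of a tie decreases with latent distance:

```haskell
type Position = (Double, Double)

-- Euclidean distance in the latent social space
dist :: Position -> Position -> Double
dist (x1, y1) (x2, y2) = sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2)

-- logit of the tie probability is alpha minus the latent distance
edgeProb :: Double -> Position -> Position -> Double
edgeProb alpha zi zj = 1 / (1 + exp (dist zi zj - alpha))

main :: IO ()
main = do
  print (edgeProb 1.0 (0, 0) (0.5, 0))  -- nearby pair: high probability
  print (edgeProb 1.0 (0, 0) (4, 0))    -- distant pair: low probability
```

The triangle-inequality caveat is visible here: if i is close to j and j is close to k, then i and k cannot be far apart, so some tie patterns simply cannot be represented.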

&lt;h2 id=&quot;block-models&quot;&gt;Block Models&lt;/h2&gt;

&lt;p&gt;Block models are another class of network models involving latent variables.
While work in the 80s assumed the block structure to be known, the current
approach is to assume each node belongs to an unknown class, and the node’s
behavior is determined by its class membership.  Bickel and Chen have
shown that it is possible to recover the unknown class labels if the network is
large enough.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Holland, P. W., Laskey, K. B., and Leinhardt, S. (1983),
&lt;a href=&quot;http://dx.doi.org/10.1016/0378-8733(83)90021-7&quot;&gt;“Stochastic Blockmodels: First Steps,”&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;5&lt;/strong&gt;, 109–137&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Airoldi, E. M., Blei, D. M., Fienberg, S. E., and Xing, E. P. (2008),
&lt;a href=&quot;http://jmlr.csail.mit.edu/papers/volume9/airoldi08a/airoldi08a.pdf&quot;&gt;“Mixed Membership Stochastic Blockmodels,”&lt;/a&gt;
&lt;em&gt;J. Mach. Learn. Res.&lt;/em&gt;, &lt;strong&gt;9&lt;/strong&gt;, 1981-2014&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Bickel, P. J. and Chen, A. (2009),
&lt;a href=&quot;http://www.stat.berkeley.edu/~bickel/Bickel%20Chen%2021068.full.pdf&quot;&gt;“A Nonparametric View of Network Models and Newman-Girvan and Other Modularities,”&lt;/a&gt;
&lt;em&gt;P. Natl. Acad. Sci.&lt;/em&gt;, &lt;strong&gt;106&lt;/strong&gt;, 21068–21073&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
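
As a toy illustration (mine, with hypothetical classes and block probabilities), conditional on the class memberships an edge probability depends only on the classes of its two endpoints:

```haskell
data Class = A | B deriving (Eq, Show)

-- hypothetical known memberships: even nodes in class A, odd nodes in class B
classOf :: Int -> Class
classOf i = if even i then A else B

-- hypothetical block probabilities: dense within a class, sparse between
blockProb :: Class -> Class -> Double
blockProb c d = if c == d then 0.8 else 0.05

-- conditional on memberships, edges are independent Bernoulli draws
edgeProb :: Int -> Int -> Double
edgeProb i j = blockProb (classOf i) (classOf j)

main :: IO ()
main = do
  print (edgeProb 2 4)  -- same class
  print (edgeProb 2 3)  -- different classes
```

The statistical problem is the reverse direction: observe the graph, and infer classOf and blockProb.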

&lt;h2 id=&quot;agent-based-models&quot;&gt;Agent-Based Models&lt;/h2&gt;

&lt;p&gt;Agent-based models are similar in spirit to latent space models (network
dynamics arise from pairwise behavior) while still keeping some of the
attractive features of ERGMs (explicit transitivity or hub/spoke behavior).&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Jackson, M. O. and Wolinsky, A. (1996),
&lt;a href=&quot;http://dx.doi.org/10.1006/jeth.1996.0108&quot;&gt;“A Strategic Model of Social and Economic Networks,”&lt;/a&gt;
&lt;em&gt;J. Econ. Theory.&lt;/em&gt;, &lt;strong&gt;71&lt;/strong&gt;, 44-74&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Snijders, T. A. B., Van de Bunt, G. V., and Steglich, C. E. G. (2010),
&lt;a href=&quot;http://stat.gamma.rug.nl/SnijdersSteglichVdBunt2009.pdf&quot;&gt;“Introduction to Stochastic Actor-Based Models for Network Dynamics,”&lt;/a&gt;
&lt;em&gt;Soc. Networks&lt;/em&gt;, &lt;strong&gt;32&lt;/strong&gt;, 44–60&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;community-detection&quot;&gt;Community Detection&lt;/h2&gt;

&lt;p&gt;Community detection in networks is like clustering in traditional data
analysis.  For some reason, this has received a lot of attention, especially
in the physics community.  This seems like a fad, but it’s worth knowing
about.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Newman, M. E. J. (2006),
&lt;a href=&quot;http://arxiv.org/abs/physics/0602124&quot;&gt;“Modularity and Community Structure in Networks,”&lt;/a&gt;
&lt;em&gt;P. Natl. Acad. Sci.&lt;/em&gt;, &lt;strong&gt;103&lt;/strong&gt;, 8577-8582&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Leskovec, J., Lang, K. J., Dasgupta A., and Mahoney, M. W. (2009),
&lt;a href=&quot;http://arxiv.org/abs/0810.1355&quot;&gt;“Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,”&lt;/a&gt;
&lt;em&gt;Internet Mathematics&lt;/em&gt;, &lt;strong&gt;6&lt;/strong&gt;, 29-123&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
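
Newman's modularity, the objective behind much of this literature, is easy to compute directly: Q is one over 2m times the sum, over pairs of nodes in the same community, of the adjacency entry minus the expected entry k_i k_j over 2m.  Here is a small sketch (my own) for a four-node graph; a good community assignment scores high and a bad one scores low:

```haskell
nodes :: [Int]
nodes = [1, 2, 3, 4]

-- adjacency for a toy graph with edges 1-2 and 3-4
adj :: Int -> Int -> Double
adj i j = if elem (i, j) es || elem (j, i) es then 1 else 0
  where es = [(1, 2), (3, 4)]

degree :: Int -> Double
degree i = sum (map (adj i) nodes)

-- Newman's modularity for a given community assignment
modularity :: (Int -> Int) -> Double
modularity community = sum (concatMap row nodes) / (2 * m)
  where
    m = sum (map degree nodes) / 2
    row i = map (term i) nodes
    term i j =
      if community i == community j
        then adj i j - degree i * degree j / (2 * m)
        else 0

main :: IO ()
main = do
  print (modularity (\i -> if i > 2 then 1 else 0))  -- split 12 vs 34: 0.5
  print (modularity (\i -> mod i 2))                 -- split 13 vs 24: -0.5
```

Maximizing Q over all assignments is the hard part; Newman's paper does it via a spectral relaxation.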

&lt;h2 id=&quot;sampling&quot;&gt;Sampling&lt;/h2&gt;

&lt;p&gt;Sampling and missing data issues are extremely important, but they are
largely ignored, mostly because they give rise to really hard problems.  Often
the theoretical results are negative (in particular, many have attacked
respondent-driven sampling), but without constructive alternatives, it will be
hard to advance the field.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Heckathorn, D. D. (1997),
&lt;a href=&quot;http://www.respondentdrivensampling.org/reports/RDS1.pdf&quot;&gt;“Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations,”&lt;/a&gt;
 &lt;em&gt;Social Problems&lt;/em&gt;, &lt;strong&gt;44&lt;/strong&gt;, 174-199&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Achlioptas, D., Clauset, A., Kempe, D., and Moore, C. (2009),
&lt;a href=&quot;http://users.soe.ucsc.edu/~optas/papers/traceroute.pdf&quot;&gt;“On the Bias of Traceroute Sampling: Or, Power-Law Degree Distributions in Regular Graphs,”&lt;/a&gt;
&lt;em&gt;J. ACM&lt;/em&gt;, &lt;strong&gt;56&lt;/strong&gt;, 1-28&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Handcock, M. S. and Gile, K. J. (2010),
&lt;a href=&quot;http://imstat.org/aoas/AOAS221.pdf&quot;&gt;“Modeling Social Networks from Sampled Data,”&lt;/a&gt;
&lt;em&gt;Ann. Appl. Stat.&lt;/em&gt;, &lt;strong&gt;4&lt;/strong&gt;, 5-25&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;applications&quot;&gt;Applications&lt;/h2&gt;

&lt;p&gt;The dirty secret of network science is that the hype is disproportionate
to the scientific impact.  Below are two of the more important
application-driven results.  The Christakis and Fowler (2007) paper in
particular generated significant attention, both positive and negative.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Morris, M. (1997),
&lt;a href=&quot;http://journals.lww.com/aidsonline/Fulltext/1997/05000/Concurrent_partnerships_and_the_spread_of_HIV.12.aspx&quot;&gt;“Concurrent Partnerships and the Spread of HIV,”&lt;/a&gt;
&lt;em&gt;AIDS&lt;/em&gt;, &lt;strong&gt;11&lt;/strong&gt;, 641-648&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Christakis, N. A. and Fowler, J. H. (2007),
&lt;a href=&quot;http://www.nejm.org/doi/full/10.1056/NEJMsa066082&quot;&gt;“The Spread of Obesity in a Large Social Network over 32 Years,”&lt;/a&gt;
&lt;em&gt;New Engl. J. Med.&lt;/em&gt;, &lt;strong&gt;357&lt;/strong&gt;, 370-379&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

</content>
  </entry>
  
  <entry>
    <title>Blog Move</title>
    <link href="http://ptrckprry.com/blog/meta/2009/10/29/blog-move/" />
    <updated>2009-10-29T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2009-10-29:/blog/meta/2009/10/29/blog-move/</id>
    <content type="html">&lt;p&gt;Here’s a quick heads for anyone who is still following this blog. You may have noticed that posting has become stagnant lately. The main reason for this is have been busy finishing &lt;a href=&quot;http://arxiv.org/abs/0909.3052&quot;&gt;my PhD dissertation&lt;/a&gt;, and didn’t work at all on side projects. I’ve also been deterred by WordPress (my blog software) being a bit unwieldy. The first problem has been solved; I am a doctor now. The second problem will be solved, when I move my site over to &lt;a href=&quot;http://wiki.github.com/mojombo/jekyll&quot;&gt;Jekyll&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The downside of the transition to Jekyll is that the RSS feed might break. Also, I might be lazy and break some of the links to the blog posts. I apologize to all 20 of you subscribers.&lt;/p&gt;

&lt;p&gt;The upside of all of this is that I hope to be updating more frequently in the future.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <title>Poetic Writing</title>
    <link href="http://ptrckprry.com/blog/uncategorized/2009/07/22/poetic-writing/" />
    <updated>2009-07-22T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2009-07-22:/blog/uncategorized/2009/07/22/poetic-writing/</id>
    <content type="html">&lt;p&gt;Technical writing in 1974 was a lot more poetic than it is now.  From F. Downton’s discussion of Stone’s “Cross-Validatory Choice and Assessment of Statistical Predictions”:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A current nine-day wonder in the press concerns the exploits of a Mr. Uri
Geller who appears to be able to bend metal objects without touching them;
Professor Stone seems to be attempting to bend statistics without touching
them. My attitude to both of these phenomena is one of open-minded 
scepticism; I do not believe in either of these prestigious activities, on 
the other hand they both deserve serious scientific examination.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Also enjoyable is Stone’s extended analogy:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[I]t is reasonable to enquire how one arrives at a prescription in any
particular problem.  A tentative answer is that, like a doctor with his
patient, the statistician with his client must write his prescription only
after careful consideration of the reasonable choices… Just as the
doctor should be prepared for side-effects, so the statistician should
monitor and check the execution of the prescription for any unexpected
complications… A prescription is neither true nor false; it is better
to say that, in a broad sense, it either succeeds or fails.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You just don’t see this kind of prose very often in current statistics
writing.&lt;/p&gt;
</content>
  </entry>
  
  <entry>
    <title>New Haskell BLAS bindings!</title>
    <link href="http://ptrckprry.com/blog/programming/2009/01/09/new-haskell-blas-bindings/" />
    <updated>2009-01-09T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2009-01-09:/blog/programming/2009/01/09/new-haskell-blas-bindings/</id>
    <content type="html">&lt;p&gt;I just uploaded &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas&quot;&gt;version 0.7 of the Haskell BLAS
bindings&lt;/a&gt;! This
is a major milestone: it is finally the library with all of the features that I
want.&lt;/p&gt;

&lt;p&gt;Here’s what the library is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It provides basic data types necessary for doing linear algebra. There are
dense vectors and matrices, banded matrices, triangular and Hermitian dense 
matrices, and triangular and Hermitian banded matrices.&lt;/li&gt;
  &lt;li&gt;It provides mutable types and operations that mutate them in either the ST
or the IO monad.&lt;/li&gt;
  &lt;li&gt;It maintains a clean distinction in the monadic functions between arguments
that get mutated and arguments that get read.  This allows passing immutable
objects without calls to “unsafeThaw” everywhere.&lt;/li&gt;
  &lt;li&gt;It gives a convenient interface to the Fortran BLAS functions.  The emphasis
here is on convenient.  When using the Fortran functions directly, “dgemm”,
the function that multiplies a dense matrix by another dense matrix, takes
no fewer than 13 arguments.  In the GNU Scientific Library, the binding
for the function takes 7 arguments.  The Haskell version takes 5.&lt;/li&gt;
  &lt;li&gt;It uses BLAS calls internally to perform elementwise vector addition,
subtraction, multiplication, and division, so these operations should be
very efficient.&lt;/li&gt;
  &lt;li&gt;It gives nearly complete access to all of the functionality in the Fortran
libraries.  The only missing functions for the dense matrix types are the
matrix updating functions (ger, syr, etc.).  Support for packed
storage of triangular matrices is absent.  Either of these would be a good
project for anyone interested.  Neither would be too difficult.&lt;/li&gt;
  &lt;li&gt;Other than the matrix update functions and the packed storage (and one other small
thing, which I will talk about below), anything you can do in Fortran can
be done in Haskell.  Since most of the computation time in a numerical
routine is in the floating point operations, this means that in principle
you can write code in Haskell that will be just as fast as the equivalent
Fortran, at least for moderately-sized inputs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s what the library is not:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It will never provide support for multiplication by the transpose of a 
complex matrix without making a copy.  This is the missing feature hinted
at above.  Even though Fortran BLAS supports this, it is fundamentally
impossible for the Haskell bindings.  It is debatable how important this
is, since multiplying by the conjugate of the transpose &lt;em&gt;is&lt;/em&gt; supported.&lt;/li&gt;
  &lt;li&gt;It is not a general array library.  The only supported element types are
&lt;code class=&quot;highlighter-rouge&quot;&gt;Double&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Complex Double&lt;/code&gt;.  This will likely never change.  If you want
something general, use &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/array&quot;&gt;array&lt;/a&gt;, &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/carray&quot;&gt;carray&lt;/a&gt;, &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uvector&quot;&gt;uvector&lt;/a&gt;, or one of the
thousand other Haskell array libraries.&lt;/li&gt;
  &lt;li&gt;It is not a full-featured linear algebra library.  You cannot compute an 
eigenvalue.  You cannot perform a matrix decomposition.  You cannot solve a
linear system.  This is a library for &lt;em&gt;writing&lt;/em&gt; a full-featured linear
algebra library.  It is no good alone for doing anything substantial.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Interested in hacking? Please do. There is a list of project ideas
&lt;a href=&quot;http://github.com/patperry/blas/tree/master/TODO&quot;&gt;in the TODO file&lt;/a&gt;.  If
any of these sounds worthwhile and you would like to work on it, let me
know and I will give you some guidance.&lt;/p&gt;

&lt;p&gt;Now that that’s out of the way, I can (finally) start LAPACK bindings.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <title>Monte Carlo Poker Odds</title>
    <link href="http://ptrckprry.com/blog/programming/2008/12/31/monte-carlo-poker-odds/" />
    <updated>2008-12-31T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-12-31:/blog/programming/2008/12/31/monte-carlo-poker-odds/</id>
    <content type="html">&lt;p&gt;There’s a new version of
&lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/monte-carlo&quot;&gt;monte-carlo&lt;/a&gt;,
the Monte Carlo monad and transformer I wrote for Haskell. The highlights of the
release are a &lt;code class=&quot;highlighter-rouge&quot;&gt;MonadMC&lt;/code&gt; typeclass and functions for sampling from general
discrete distributions. This post gives a demonstration of the library by
showing how to estimate poker odds via Monte Carlo simulation.&lt;/p&gt;

&lt;p&gt;The goal of the program will be to estimate the distribution of poker
hands from dealing five cards out of a well-shuffled deck of fifty-two
cards.  For reference, &lt;a href=&quot;http://en.wikipedia.org/wiki/Poker_probability&quot;&gt;Wikipedia&lt;/a&gt; gives the probabilities
of the poker hands.  I’m not going to bother distinguishing between a
royal flush and a straight flush.&lt;/p&gt;

&lt;p&gt;In the program below, we will need to import the following modules&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import Control.Monad
import Control.Monad.MC
import Data.List
import Data.Map( Map )
import qualified Data.Map as Map
import System.Environment
import Text.Printf
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The second of these is part of the &lt;code class=&quot;highlighter-rouge&quot;&gt;monte-carlo&lt;/code&gt; package.&lt;/p&gt;

&lt;h2 id=&quot;poker-functions&quot;&gt;Poker Functions&lt;/h2&gt;

&lt;p&gt;In Haskell, we need to define types for cards and functions for 
classifying hands.  First, we define a card:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;data Suit = Club | Diamond  | Heart | Spade deriving (Eq, Show)
data Card = Card { number :: Int 
                 , suit   :: Suit
                 }
          deriving (Eq, Show)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here are the numerical values for the face cards,&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ace, jack, queen, king :: Int
ace   = 1
jack  = 11
queen = 12
king  = 13
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and here is how we get a complete deck of cards&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;deck :: [Card]
deck = [ Card i s | i &amp;lt;- [ 1..13 ],
                    s &amp;lt;- [ Club, Diamond, Heart, Spade ] ]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Next, we enumerate the different hands, and define a function that takes a list
of five cards and tells us what hand it is&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;data Hand = HighCard  | Pair | TwoPair | ThreeOfAKind | Straight
          | Flush | FullHouse | FourOfAKind | StraightFlush 
          deriving (Eq, Show, Ord)

hand :: [Card] -&amp;gt; Hand
hand cs = 
  case matches of 
    [1,1,1,1,1] -&amp;gt; case undefined of
                     _ | isStraight &amp;amp;&amp;amp; isFlush -&amp;gt; StraightFlush
                     _ | isFlush               -&amp;gt; Flush
                     _ | isStraight            -&amp;gt; Straight
                     _ | otherwise             -&amp;gt; HighCard
    [1,1,1,2]                                  -&amp;gt; Pair
    [1,2,2]                                    -&amp;gt; TwoPair
    [1,1,3]                                    -&amp;gt; ThreeOfAKind
    [2,3]                                      -&amp;gt; FullHouse
    [1,4]                                      -&amp;gt; FourOfAKind
  where
    (x:xs) = (sort . map number) cs
    (s:ss) = map suit cs
    
    isStraight | x == ace &amp;amp;&amp;amp; xs == [ 10..king ] = True
               | otherwise                      = xs == [ x+1..x+4 ]

    isFlush = all (== s) ss

    matches = (sort . map length . group) (x:xs)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The only tricky part is the special handling of the ace in testing for a
straight.&lt;/p&gt;

&lt;h2 id=&quot;monte-carlo-functions&quot;&gt;Monte Carlo Functions&lt;/h2&gt;

&lt;p&gt;To choose a random five-card hand, we use the &lt;code class=&quot;highlighter-rouge&quot;&gt;sampleSubset&lt;/code&gt; function from
&lt;code class=&quot;highlighter-rouge&quot;&gt;Control.Monad.MC&lt;/code&gt;, which has type&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sampleSubset :: (MonadMC m) =&amp;gt; Int -&amp;gt; Int -&amp;gt; [a] -&amp;gt; m [a]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We give as parameters the subset size, the collection size, and a list of the
collection elements.  So, to get a five-card hand from a deck of fifty-two
cards, we define a function &lt;code class=&quot;highlighter-rouge&quot;&gt;deal&lt;/code&gt; as&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;deal :: (MonadMC m) =&amp;gt; m [Card]
deal = sampleSubset 5 52 deck
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The signature of the function looks a little strange, because it is polymorphic
in the monad type.  We could have written the signature as &lt;code class=&quot;highlighter-rouge&quot;&gt;deal :: MC [Card]&lt;/code&gt;.
However, by using the more general signature, we can use the function with 
either the &lt;code class=&quot;highlighter-rouge&quot;&gt;MC&lt;/code&gt; monad or &lt;code class=&quot;highlighter-rouge&quot;&gt;MCT&lt;/code&gt;, the monad transformer version.&lt;/p&gt;

&lt;p&gt;The bulk of the work in the simulation gets performed by the &lt;code class=&quot;highlighter-rouge&quot;&gt;repeatMCWith&lt;/code&gt;
function, which has signature&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;repeatMCWith :: (MonadMC m)
             =&amp;gt; (a -&amp;gt; b -&amp;gt; a) -- ^ accumulator
             -&amp;gt; a             -- ^ initial value
             -&amp;gt; Int           -- ^ number of repetitions
             -&amp;gt; m b           -- ^ generator
             -&amp;gt; m a
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This function is an analogue of &lt;code class=&quot;highlighter-rouge&quot;&gt;foldl&lt;/code&gt;.  It repeats a Monte Carlo action a
specified number of times and accumulates the results.  To tally up the
counts of all of the hands, we define&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type HandCounts = Map Hand Int

emptyCounts :: HandCounts
emptyCounts = Map.empty

updateCounts :: HandCounts -&amp;gt; [Card] -&amp;gt; HandCounts
updateCounts counts cs = Map.insertWith' (+) (hand cs) 1 counts
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
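
To see why this is a foldl analogue, here is a sketch (mine, not the library's actual implementation) of how such a driver can be written for any monad using foldM:

```haskell
import Control.Monad (foldM)

-- A repeatMCWith-style driver, written for any monad: run the generator
-- n times and fold the results into an accumulator.
repeatWith :: Monad m => (a -> b -> a) -> a -> Int -> m b -> m a
repeatWith f a0 n gen = foldM step a0 [1 .. n]
  where
    step acc _ = fmap (f acc) gen

main :: IO ()
main =
  -- accumulate five draws from a trivial generator that always returns 1
  repeatWith (+) (0 :: Int) 5 (return 1) >>= print  -- prints 5
```

Presumably the library version is similar in shape; the point is just the foldl-like accumulation, which avoids holding all of the samples in memory at once.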

&lt;p&gt;Then, we use these functions in combination with &lt;code class=&quot;highlighter-rouge&quot;&gt;deal&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;repeatMCWith&lt;/code&gt;
to estimate the probabilities of all of the hands.  Here is the &lt;code class=&quot;highlighter-rouge&quot;&gt;main&lt;/code&gt; function
we use&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;main = do
  [reps] &amp;lt;- map read `fmap` getArgs
  main' reps

main' reps =
  let seed   = 0
      counts = repeatMCWith updateCounts emptyCounts reps deal
               `evalMC` mt19937 seed in do
  printf &quot;\n&quot;
  printf &quot;    Hand       Count    Probability     99%% Interval   \n&quot;
  printf &quot;-------------------------------------------------------\n&quot;
  forM_ ((reverse . Map.toAscList) counts) $ \(h,c) -&amp;gt;
      let n     = fromIntegral reps :: Double
          p     = fromIntegral c / n 
          se    = sqrt (p * (1 - p) / n)
          delta = 2.575829 * se
          (l,u) = (p-delta, p+delta) in
      printf &quot;%-13s %7d    %.6f   (%.6f,%.6f)\n&quot; (show h) c p l u
  printf &quot;\n&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;results--discussion&quot;&gt;Results &amp;amp; Discussion&lt;/h2&gt;

&lt;p&gt;Here are the results from running the simulation with one million repetitions:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;    Hand       Count    Probability     99% Interval   
-------------------------------------------------------
StraightFlush      12    0.000012   (0.000003,0.000021)
FourOfAKind       224    0.000224   (0.000185,0.000263)
FullHouse        1452    0.001452   (0.001354,0.001550)
Flush            1908    0.001908   (0.001796,0.002020)
Straight         3980    0.003980   (0.003818,0.004142)
ThreeOfAKind    21341    0.021341   (0.020969,0.021713)
TwoPair         47480    0.047480   (0.046932,0.048028)
Pair           423785    0.423785   (0.422512,0.425058)
HighCard       499818    0.499818   (0.498530,0.501106)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On my machine it takes about six seconds for the simulation to run.  We can 
see that all of the intervals contain the true answer.  However, the relative
accuracy for the uncommon hands is not very good.  In a later post, I’ll show
how to use &lt;a href=&quot;http://en.wikipedia.org/wiki/Importance_sampling&quot;&gt;Importance Sampling&lt;/a&gt; to get better estimates of the probabilities 
for the rare hands.&lt;/p&gt;

&lt;p&gt;Thank you to Aditya Mahajan for suggesting the inclusion of &lt;code class=&quot;highlighter-rouge&quot;&gt;repeatMC&lt;/code&gt; in
the library.  Please send any more usage reports or feature requests my way.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <title>ANN: BLAS Bindings for Haskell, version 0.6</title>
    <link href="http://ptrckprry.com/blog/programming/2008/10/31/ann-blas-bindings-for-haskell-version-06/" />
    <updated>2008-10-31T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-10-31:/blog/programming/2008/10/31/ann-blas-bindings-for-haskell-version-06/</id>
    <content type="html">&lt;p&gt;There’s a &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas&quot;&gt;new version of the BLAS bindings out&lt;/a&gt;. I put a lot of work
into it and did a pretty massive overhaul of the code. The highlight of the new
release is that now you can do operations in the &lt;code class=&quot;highlighter-rouge&quot;&gt;ST&lt;/code&gt; monad. Also, I fixed a lot
of organizational issues (no more orphan instances!), and cleaned up the
interface a bit. There are a few performance improvements, too (I shaved half a
second off &lt;a href=&quot;/blog/2008/07/24/addressing-haskell-blas-performance-issues/&quot;&gt;that old
benchmark&lt;/a&gt;). The
downside is that I completely broke backwards compatibility, but since as far as
I can tell I only have two users, I’m not too worried about that.&lt;/p&gt;

&lt;p&gt;People have been clamoring for a tutorial, but unfortunately I still don’t have
time. My Orals are in ten days, and this stuff is not really core to my
research.  Maybe in a few months I’ll do something.&lt;/p&gt;

&lt;p&gt;In the mean time, I did manage to come up with some sample code. Here’s a
Fortran 90 routine for recursively computing an &lt;a href=&quot;http://en.wikipedia.org/wiki/LU_decomposition&quot;&gt;LU decomposition&lt;/a&gt; with row
pivoting, taken from &lt;a href=&quot;http://www.netlib.org/netlib/utk/people/JackDongarra/PAPERS/beautiful-code.pdf&quot;&gt;Jack Dongarra and Piotr Luszczek’s chapter (PDF)&lt;/a&gt; in &lt;em&gt;Beautiful Code&lt;/em&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;recursive subroutine rdgetrf(m, n, a, lda, ipiv, info) 
 implicit none 
 
 integer, intent(in) :: m, n, lda 
 double precision, intent(inout) :: a(lda,*) 
 integer, intent(out) :: ipiv(*) 
 integer, intent(out) :: info 
 
 integer :: mn, nleft, nright, i 
 double precision :: tmp 
 
 double precision :: pone, negone, zero 
 parameter (pone=1.0d0) 
 parameter (negone=-1.0d0) 
 parameter (zero=0.0d0) 
 
 intrinsic min
 integer idamax 
 external dgemm, dtrsm, dlaswp, idamax, dscal 
 
 mn = min(m, n) 
  
 if (mn .gt. 1) then 
    nleft = mn / 2 
    nright = n - nleft 
   
    call rdgetrf(m, nleft, a, lda, ipiv, info) 
   
    if (info .ne. 0) return 
    call dlaswp(nright, a(1, nleft+1), lda, 1, nleft, ipiv, 1) 
   
    call dtrsm('L', 'L', 'N', 'U', nleft, nright, pone, a, lda, 
$        a(1, nleft+1), lda) 

    call dgemm('N', 'N', m-nleft, nright, nleft, negone, 
$        a(nleft+1,1) , lda, a(1, nleft+1), lda, pone, 
$        a(nleft+1, nleft+1), lda) 
    
    call rdgetrf(m - nleft, nright, a(nleft+1, nleft+1), lda, 
$        ipiv(nleft+1), info)
    if (info .ne. 0) then 
       info = info + nleft 
       return 
    end if 

    do i = nleft+1, m 
       ipiv(i) = ipiv(i) + nleft 
    end do 
   
    call dlaswp(nleft, a, lda, nleft+1, mn, ipiv, 1) 

 else if (mn .eq. 1) then 
    i = idamax(m, a, 1) 
    ipiv(1) = i 
    tmp = a(i, 1) 

    if (tmp .ne. zero .and. tmp .ne. -zero) then 
       call dscal(m, pone/tmp, a, 1) 
   
       a(i,1) = a(1,1) 
       a(1,1) = tmp 
    else 
       info = 1 
    end if 
   
 end if 

 return 
 end 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here’s the same program in Haskell, using version 0.6 of the blas bindings:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;module LU ( luFactorize ) where

import BLAS.Elem( BLAS3 )
import Control.Monad
import Control.Monad.ST
import Data.Matrix.Dense
import Data.Matrix.Dense.ST
import Data.Matrix.Tri
import Data.Vector.Dense.ST

luFactorize :: (BLAS3 e) =&amp;gt; STMatrix s (m,n) e -&amp;gt; ST s (Either Int [Int])
luFactorize a
    | mn &amp;gt; 1 =
        let nleft = mn `div` 2
            (a_1, a_2) = splitColsAt nleft a
            (a11, a21) = splitRowsAt nleft a_1
            (a12, a22) = splitRowsAt nleft a_2
        in luFactorize a_1 &amp;gt;&amp;gt;=
               either (return . Left) (\pivots -&amp;gt; do
                   zipWithM_ (swapRows a_2) [0..] pivots
                   doSolveMat_ (lowerU a11) a12
                   doSApplyAddMat (-1) a21 a12 1 a22
                   luFactorize a22 &amp;gt;&amp;gt;=
                       either (return . Left . (nleft+)) (\pivots' -&amp;gt; do
                           zipWithM_ (swapRows a21) [0..] pivots'
                           return (Right $ pivots ++ map (nleft+) pivots')
                       )
               )
    | mn == 1 = 
        let x = colView a 0
        in getWhichMaxAbs x &amp;gt;&amp;gt;= \(i,e) -&amp;gt;
            if (e /= 0) 
                then do
                    scaleBy (1/e) x
                    readElem x 0 &amp;gt;&amp;gt;= writeElem x i
                    writeElem x 0 e
                    return $ Right [i]
                else
                    return $ Left 0
    | otherwise =
        return (Right [])
  where
    (m,n) = shape a
    mn    = min m n
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The Haskell version returns &lt;code class=&quot;highlighter-rouge&quot;&gt;Left&lt;/code&gt; with a column index in the event of failure,
and &lt;code class=&quot;highlighter-rouge&quot;&gt;Right&lt;/code&gt; with a list of the pivot swaps in the case of success. It makes
exactly the same BLAS calls as the Fortran90 version. The performance of the two
versions should be close, especially for large inputs. (Anyone want to verify
this?)&lt;/p&gt;

&lt;p&gt;Some day it will be possible to do serious scientific computing in Haskell without much effort.  This is one small step towards that goal.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <title>A Monte Carlo Monad for Haskell</title>
    <link href="http://ptrckprry.com/blog/programming/2008/08/26/a-monte-carlo-monad-for-haskell/" />
    <updated>2008-08-26T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-08-26:/blog/programming/2008/08/26/a-monte-carlo-monad-for-haskell/</id>
    <content type="html">&lt;p&gt;I’ve just uploaded two new packages to Hackage. The first, &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/gsl-random&quot;&gt;gsl-random&lt;/a&gt;, is a
set of bindings to the random number generators and random number distributions
that come as part of the &lt;a href=&quot;http://www.gnu.org/software/gsl/&quot;&gt;GNU Scientific Library&lt;/a&gt;. The next package,
&lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/monte-carlo&quot;&gt;monte-carlo&lt;/a&gt;, is a monad and monad transformer for performing computations
that require a random number generator, and is based on &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl-random&lt;/code&gt;. This post
will give you a taste of what the &lt;code class=&quot;highlighter-rouge&quot;&gt;MC&lt;/code&gt; Monte Carlo monad can do.&lt;/p&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;For those unfamiliar with Monte Carlo, the basic idea is to use randomness to
compute a nonrandom quantity. You generate a sequence of random variables &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;
with mean equal to some quantity you care about, and then form a confidence
interval for the quantity based on the mean and standard deviation of the
simulated values.&lt;/p&gt;

&lt;p&gt;Here’s an example that will make things a little more concrete. Suppose you want
to compute &lt;code class=&quot;highlighter-rouge&quot;&gt;pi&lt;/code&gt;. Say you have the ability to generate random points in the unit
box &lt;code class=&quot;highlighter-rouge&quot;&gt;[-1,1]^2&lt;/code&gt;. The box has an area of &lt;code class=&quot;highlighter-rouge&quot;&gt;2^2 = 4&lt;/code&gt;. The unit disc, which is
contained completely in the box, has an area of &lt;code class=&quot;highlighter-rouge&quot;&gt;pi&lt;/code&gt;. Therefore, the probability
of &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; landing inside the unit circle is &lt;code class=&quot;highlighter-rouge&quot;&gt;pi/4&lt;/code&gt;. So, if we generate a whole
bunch of &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;s, we should expect about &lt;code class=&quot;highlighter-rouge&quot;&gt;pi/4&lt;/code&gt; of them to land inside the unit
circle. This means we can get an estimate of &lt;code class=&quot;highlighter-rouge&quot;&gt;pi&lt;/code&gt; by generating a bunch of
random points, counting how many land inside the circle, and then multiplying by
&lt;code class=&quot;highlighter-rouge&quot;&gt;4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Of course, the more points we take, the more accurate our answer will be. We can
even get an approximate confidence interval for the true value by computing the
standard deviation of the values.&lt;/p&gt;

&lt;h2 id=&quot;using-the-mc-monad&quot;&gt;Using the &lt;code class=&quot;highlighter-rouge&quot;&gt;MC&lt;/code&gt; Monad&lt;/h2&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;monte-carlo&lt;/code&gt; package provides a monad, called &lt;code class=&quot;highlighter-rouge&quot;&gt;MC&lt;/code&gt;, for performing Monte
Carlo computations. The package exports a function for generating values
uniformly over an interval:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;uniform :: Double -&amp;gt; Double -&amp;gt; MC Double
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This function takes the lower and upper bounds, and produces a value in that
range.  We can use this function to generate a value in the unit box:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;unitBox :: MC (Double,Double)
unitBox = liftM2 (,) (uniform (-1) 1) 
                     (uniform (-1) 1)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We will need a function to test if a point lies inside the unit circle:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;inUnitCircle :: (Double,Double) -&amp;gt; Bool
inUnitCircle (x,y) = x*x + y*y &amp;lt;= 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here is a function that generates &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt; points, counts how many fall inside the
circle, and computes an estimate and standard error for &lt;code class=&quot;highlighter-rouge&quot;&gt;pi&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;computePi :: Int -&amp;gt; MC (Double,Double)
computePi n = do
    xs &amp;lt;- replicateM n unitBox
    let m  = length $ filter inUnitCircle xs
        p  = toDouble m / toDouble n
        se = sqrt (p * (1 - p) / toDouble n)
    return (4*p, 4*se)

  where
    toDouble = realToFrac . toInteger
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;running-the-simulation&quot;&gt;Running the Simulation&lt;/h2&gt;

&lt;p&gt;To get a value &lt;em&gt;out&lt;/em&gt; of the &lt;code class=&quot;highlighter-rouge&quot;&gt;MC&lt;/code&gt; monad, we must provide it with a random number
generator. To get a Mersenne Twister generator, we use the &lt;code class=&quot;highlighter-rouge&quot;&gt;mt19937&lt;/code&gt; function.
Then, evaluate the result with &lt;code class=&quot;highlighter-rouge&quot;&gt;evalMC&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here’s an example:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;main = let
    n       = 10000000
    seed    = 0
    (mu,se) = evalMC (computePi n) $ mt19937 seed
    delta   = 2.575*se
    (l,u)   = (mu-delta, mu+delta)
    in do
        printf &quot;Estimate:                   %g\n&quot; mu
        printf &quot;99%% Confidence Interval:    (%g,%g)\n&quot; l u
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It takes my 2GHz Macbook a little under five seconds to run the simulation with
ten million samples. Here are the results I get:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Estimate:                   3.1414584
99% Confidence Interval:    (3.1401211162675353,3.1427956837324644)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Pretty cool, eh?&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;You might be wondering why I didn’t just use &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/MonadRandom&quot;&gt;MonadRandom&lt;/a&gt;. There are two
reasons: first, I didn’t want to write routines to generate random variables
from different distributions (Normal, Exponential, Poisson, etc.). The GSL
provides these for me. Second, internally the &lt;code class=&quot;highlighter-rouge&quot;&gt;MC&lt;/code&gt; monad is &lt;em&gt;not&lt;/em&gt; using a pure
random number generator. It only keeps one copy of the generator state, and
modifies it every time it samples a value.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <title>ANN: BLAS Bindings for Haskell, version 0.5</title>
    <link href="http://ptrckprry.com/blog/programming/2008/08/10/ann-blas-bindings-for-haskell-version-05/" />
    <updated>2008-08-10T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-08-10:/blog/programming/2008/08/10/ann-blas-bindings-for-haskell-version-05/</id>
    <content type="html">&lt;p&gt;I’ve put together a new release of the Haskell BLAS bindings, now
&lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas&quot;&gt;available on hackage&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here are the new features:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add &lt;code class=&quot;highlighter-rouge&quot;&gt;Banded&lt;/code&gt; matrix data type, as well as &lt;code class=&quot;highlighter-rouge&quot;&gt;Tri Banded&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Herm Banded&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add support for trapezoidal dense matrices (&lt;code class=&quot;highlighter-rouge&quot;&gt;Tri Matrix (m,n) e&lt;/code&gt;, where
&lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt; is not the same as &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt;).  Note that trapezoidal banded matrices are
&lt;em&gt;NOT&lt;/em&gt; supported.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;highlighter-rouge&quot;&gt;Diag&lt;/code&gt; matrix data type for diagonal matrices.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;highlighter-rouge&quot;&gt;Perm&lt;/code&gt; matrix data type, for permutation matrices.&lt;/li&gt;
  &lt;li&gt;Enhance the &lt;code class=&quot;highlighter-rouge&quot;&gt;RMatrix&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;RSolve&lt;/code&gt; type classes with an API that allows 
specifying where to store the result of a computation.&lt;/li&gt;
  &lt;li&gt;Enhance the &lt;code class=&quot;highlighter-rouge&quot;&gt;IMatrix&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;RMatrix&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;ISolve&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;RSolve&lt;/code&gt; type classes to add
“scale and multiply” operations.&lt;/li&gt;
  &lt;li&gt;Remove the scale parameter for &lt;code class=&quot;highlighter-rouge&quot;&gt;Tri&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Herm&lt;/code&gt; matrix data types.&lt;/li&gt;
  &lt;li&gt;Flatten the data types for &lt;code class=&quot;highlighter-rouge&quot;&gt;DVector&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;DMatrix&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Some inlining and unpacking performance improvements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As far as what to expect in version 0.6, I plan to add support for operations 
in the ST monad.  This is going to require a pretty big code reorganization and
will cause quite a few API breakages, but I think it’s worth the pain.  The 
next release will also come with a tutorial and examples.  After the big
code reorganization, I’ll get started on LAPACK bindings.&lt;/p&gt;

&lt;p&gt;Please let me know if you are using the library.  I’m really interested in 
what people like and don’t like.  If you think that some functionality is 
missing, let me know.  If you think the API is awkward in certain places,
let me know that, too.&lt;/p&gt;
</content>
  </entry>
  
  <entry>
    <title>Addressing Haskell BLAS Performance Issues</title>
    <link href="http://ptrckprry.com/blog/programming/2008/07/24/addressing-haskell-blas-performance-issues/" />
    <updated>2008-07-24T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-07-24:/blog/programming/2008/07/24/addressing-haskell-blas-performance-issues/</id>
    <content type="html">&lt;p&gt;Last month Anatoly Yakovenko started &lt;a href=&quot;http://www.haskell.org/pipermail/haskell-cafe/2008-June/044401.html&quot;&gt;a thread on haskell-cafe&lt;/a&gt; about
the Haskell blas bindings being much slower than using raw C.  I was in Denver
at the time on a drive across the United States, so I didn’t participate in much
of the conversation.  The conclusion seemed to be that the bindings were about
thirty times slower than C.  Ouch.&lt;/p&gt;

&lt;p&gt;Here is a C program that computes ten million dot product between two vectors of
doubles:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;cblas.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;

int main() 
{
   int size  = 10;
   int times = 10*1000*1000;
   int i = 0;

   double *x = malloc( size*sizeof( double ) );
   double *y = malloc( size*sizeof( double ) );

   for( i = 0; i &amp;lt; times; ++i ) 
   {
      cblas_ddot( size, x, 1, y, 1 );
   }

   free( x );
   free( y );
   
   return 0;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There is no overhead for initializing the vectors, just for allocating and
freeing them.  The equivalent Haskell program, using mutable vectors, is:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;module Main where

import Control.Monad
import Data.Vector.Dense.IO

main = do
   let size  = 10
   let times = 10*1000*1000
   
   x &amp;lt;- newVector_ size :: IO (IOVector n Double)
   y &amp;lt;- newVector_ size 
   replicateM_ times $ x `getDot` y
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The Haskell program also checks that the lengths of x and y match, but the
overhead from this is only about a tenth of a second.  There was some grumbling
on haskell-cafe that the compiler might be allowed to optimize the for loop
away entirely, but this doesn’t seem to be happening for me when I compile with -O2.&lt;/p&gt;

&lt;p&gt;The runtime from the C version is about 0.380 seconds, and from the Haskell
version it is about 4.345 seconds.  That makes the Haskell version about 11.5
times slower rather than 30, but this is still bad.  I claimed that the overhead does not grow with
the size of the vector, and indeed this seems to be the case.  When I increase
the size to 1024, the C version runs in about 15.883 seconds and the Haskell
version runs in about 19.900 seconds.&lt;/p&gt;

&lt;p&gt;Depending on the size of the vectors you’re dealing with, the Haskell performance is
either terrible or acceptable.  Still, I wondered if I could do better.&lt;/p&gt;

&lt;p&gt;The root of the inefficiency is the vector data type I used. &lt;a href=&quot;/blog/2008/06/12/blas-data-types&quot;&gt;In my last post&lt;/a&gt;
I argued that it’s useful to make conjugating a vector be an O(1) operation. One
way to do this is to store a boolean flag “isConj” that indicates whether or not
the vector is conjugated.  If so, the values stored in memory are the complex
conjugates of the values in the vector.  Another way to do this is to define the
vector data type as&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;data DVector t n e =
       DV { fptr   :: !(ForeignPtr e)
          , offset :: !Int
          , len    :: !Int
          , stride :: !Int
          }
     | C !(DVector t n e)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this representation, a vector of the form “DV f o l s” is a raw vector, and
“C x” is the conjugate of the vector x.  In the first version of the library, I
went with the second representation.  Originally, it was for aesthetic reasons,
but now I’m not so sure of the relative pretty-ness of the two approaches.  One
thing Wren Ng Thorton pointed out to me is that the two representations are
not equivalent, since in the second you could have C(C(C(…C(DV(…))…))) as
a legitimate value.&lt;/p&gt;

&lt;p&gt;When you take performance considerations into account, a boolean flag is a
clear winner over an algebraic data type.  When I switched to the first
representation and re-ran the benchmarks, I got 1.264 seconds for vectors of
size 10 and 16.839 seconds for vectors of size 1024.  The comparison I’ve done
between the two data representations isn’t perfect, because in the boolean flag
code I also incorporated some unboxing and inlining changes suggested by Don
Stewart.  Still, the biggest performance gains came from the data types.&lt;/p&gt;

&lt;p&gt;When you remove the length checking (by using &lt;code class=&quot;highlighter-rouge&quot;&gt;unsafeGetDot&lt;/code&gt; instead of
&lt;code class=&quot;highlighter-rouge&quot;&gt;getDot&lt;/code&gt;), the times for the updated Haskell benchmarks are 1.095 seconds
and 16.682 seconds.  So, for ten million dot products, there is an overhead of
about 0.6 seconds (roughly 60 nanoseconds per call) for using Haskell instead of C, regardless of the vector
size.  This is pretty good.&lt;/p&gt;

&lt;p&gt;The next release of the bindings will incorporate these changes.  Thanks go to 
everyone on haskell-cafe for their help, especially Anatoly for pointing out the
problem.&lt;/p&gt;
</content>
  </entry>
  
  <entry>
    <title>BLAS Data Types</title>
    <link href="http://ptrckprry.com/blog/programming/2008/06/12/blas-data-types/" />
    <updated>2008-06-12T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-06-12:/blog/programming/2008/06/12/blas-data-types/</id>
    <content type="html">&lt;p&gt;I’m going to shed a little light on what I meant when &lt;a href=&quot;/blog/2008/06/06/ann-blas-bindings-for-haskell-version-04/&quot;&gt;in my last post&lt;/a&gt; I 
said that “BLAS and LAPACK (mostly) support” an O(1) &lt;code class=&quot;highlighter-rouge&quot;&gt;herm&lt;/code&gt; operation for
matrices.&lt;/p&gt;

&lt;p&gt;First I need to tell you about the BLAS data types.  There are two fundamental
data types in BLAS: dense vectors and dense matrices.&lt;/p&gt;

&lt;h2 id=&quot;vectors&quot;&gt;Vectors&lt;/h2&gt;

&lt;p&gt;A &lt;em&gt;dense vector&lt;/em&gt; is represented as an array of values that need not be contiguous, with a
fixed stride between consecutive values.  In C, you need three things to represent a vector:
the length of the vector, a pointer to the first element in the vector, and an
integer stride.  The stride is required to be at least 1, and the length has to
be non-negative.  Length is usually represented by a variable called &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt;, the
pointer is usually called &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt;, and the stride (or “increment”) is
named &lt;code class=&quot;highlighter-rouge&quot;&gt;incx&lt;/code&gt; for &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;incy&lt;/code&gt; for &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt;.  So, you could have&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;double data[] = { 1.6, 1.7, -3.1, -0.2, 2.6, 1.1 };
int n = 3;
double *x = data;
int incx = 2;
double *y = data + 1;
int incy = 1;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, x would be a vector with elements (1.6, -3.1, 2.6), and y would be a
vector with elements (1.7, -3.1, -0.2).  Being able to have a non-unit stride
for vectors turns out to be incredibly useful.&lt;/p&gt;

&lt;h2 id=&quot;matrices&quot;&gt;Matrices&lt;/h2&gt;

&lt;p&gt;Matrices in BLAS are slightly more complicated.  There are two variants: 
&lt;em&gt;row-major&lt;/em&gt;, and &lt;em&gt;column-major&lt;/em&gt;.  For CBLAS both are supported equally well,
but most Fortran code (including LAPACK) assumes column-major.  Unfortunately,
a lot of C code stores matrices in row-major order.  The inconsistency turns
out not to be a very big deal for real-valued matrices, but it can cause
trouble for complex-valued ones.&lt;/p&gt;

&lt;p&gt;I’m going to stick with the convention that when I say “matrix” I mean
“column-major matrix”.  The important thing to remember is that elements in
the same column are stored contiguously, but elements in the same row are not.
(For row-major, the reverse is true.)&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;dense matrix&lt;/em&gt; is represented by four numbers: the number of rows, the
number of columns, a pointer to the upper-left element, and the 
&lt;em&gt;leading dimension&lt;/em&gt;.  The numbers of rows and columns are usually given by 
&lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt;.  The pointer to the element is usually named &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt;, or &lt;code class=&quot;highlighter-rouge&quot;&gt;c&lt;/code&gt;.
The leading dimension is named for the pointer it is associated with, and 
would be &lt;code class=&quot;highlighter-rouge&quot;&gt;lda&lt;/code&gt; to go with &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;ldb&lt;/code&gt; to go with &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt;.  What’s &lt;code class=&quot;highlighter-rouge&quot;&gt;lda&lt;/code&gt;?  This
is the stride between consecutive elements of the same row.  It must be
greater than or equal to &lt;code class=&quot;highlighter-rouge&quot;&gt;max(1,m)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;An example will clarify things a bit.  Let’s say we want to represent the
matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;a = [ 1 4; 2 5; 3 6 ]&lt;/code&gt;.  In MATLAB notation, this is a 3-by-2 matrix;
the first column is (1, 2, 3), and the second column is (4, 5, 6).  In C, we
would have&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;double data[] = { 1, 2, 3, 4, 5, 6 };
int m = 3, n = 2;
double *a = data;
int lda = 3;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Since there is no gap in memory between the last element of the first column
and the first element of the second column, &lt;code class=&quot;highlighter-rouge&quot;&gt;lda&lt;/code&gt; is equal to &lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt;.  Now, what
if we want to represent the submatrix &lt;code class=&quot;highlighter-rouge&quot;&gt;b = a(1:2,1:2)&lt;/code&gt;?  This is easy:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;int mb = 2, nb = 2;
double *b = a;
int ldb = 3;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can also similarly represent &lt;code class=&quot;highlighter-rouge&quot;&gt;c = a(2:3,1:2)&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;int mc = 2, nc = 2;
double *c = a + 1;
int ldc = 3;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In general we can represent any submatrix whose row and column indices are
contiguous.&lt;/p&gt;

&lt;p&gt;What else can we do with a matrix?  Well, because of the stride parameter, we
can easily represent a row of the matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; (it has stride &lt;code class=&quot;highlighter-rouge&quot;&gt;lda&lt;/code&gt;), or a
column (it has stride &lt;code class=&quot;highlighter-rouge&quot;&gt;1&lt;/code&gt;).  We can also represent a diagonal of &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; using 
stride &lt;code class=&quot;highlighter-rouge&quot;&gt;lda+1&lt;/code&gt;.&lt;/p&gt;
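
&lt;p&gt;To make these stride tricks concrete, here is a small standalone C program (my own illustration, not from the BLAS interface) that walks a row, a column, and the diagonal of the 3-by-2 matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; from above:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;stdio.h&amp;gt;

int main(void)
{
    /* the 3-by-2 column-major matrix a = [ 1 4; 2 5; 3 6 ] */
    double data[] = { 1, 2, 3, 4, 5, 6 };
    int lda = 3;
    double *a = data;
    int j;

    for (j = 0; j != 2; j++)           /* row 1: pointer a+1, stride lda    */
        printf(&quot;%g &quot;, a[1 + j*lda]);
    printf(&quot;\n&quot;);                      /* prints: 2 5 */

    for (j = 0; j != 3; j++)           /* column 1: pointer a+lda, stride 1 */
        printf(&quot;%g &quot;, a[lda + j]);
    printf(&quot;\n&quot;);                      /* prints: 4 5 6 */

    for (j = 0; j != 2; j++)           /* diagonal: pointer a, stride lda+1 */
        printf(&quot;%g &quot;, a[j*(lda + 1)]);
    printf(&quot;\n&quot;);                      /* prints: 1 5 */

    return 0;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;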

&lt;h2 id=&quot;data-types&quot;&gt;Data Types&lt;/h2&gt;

&lt;p&gt;Now, what did I mean when I said that BLAS “supported” an O(1) herm operation?
First of all, notice that I have been a little vague when I’ve given the 
definitions for the vector and matrix data types.  Probably most of you were
expecting me to introduce a &lt;code class=&quot;highlighter-rouge&quot;&gt;struct&lt;/code&gt; at some point.  Sadly, BLAS does not
define any new formal types.  BLAS only defines functions.  The “types” I
talked about above are really just conventions that all of the functions 
adhere to.  So, here’s the function signature for &lt;code class=&quot;highlighter-rouge&quot;&gt;daxpy&lt;/code&gt;, the function that 
performs the operation &lt;code class=&quot;highlighter-rouge&quot;&gt;y := alpha * x + y&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;alpha&lt;/code&gt; is a scalar and 
&lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt; are vectors:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void cblas_daxpy (int n, double alpha, 
                  const double *x, int incx, 
                  double *y, int incy);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Only primitive types appear in the argument list.  The only way to tell that
the function operates on vectors is by looking at the names of the arguments
and reading the documentation.  This is annoying.  Let’s fix that by defining
our own vector type:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;typedef struct 
{
    double *data;
    int size;
    int stride;
    int is_conj;
} vector_t;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(Ignore &lt;code class=&quot;highlighter-rouge&quot;&gt;is_conj&lt;/code&gt; for now; the rest should be self-explanatory).  Now, we can
simplify &lt;code class=&quot;highlighter-rouge&quot;&gt;daxpy&lt;/code&gt; a bit:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void my_daxpy (double alpha, const vector_t *x, 
               vector_t *y);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
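
&lt;p&gt;To show how the struct fields map onto the operation, here is a plain-C reference version of &lt;code class=&quot;highlighter-rouge&quot;&gt;my_daxpy&lt;/code&gt; (a sketch of mine; a real implementation would dispatch to &lt;code class=&quot;highlighter-rouge&quot;&gt;cblas_daxpy&lt;/code&gt; instead of looping):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void my_daxpy (double alpha, const vector_t *x, vector_t *y)
{
    int i;
    /* y := alpha * x + y, honoring both strides */
    for (i = 0; i != x-&amp;gt;size; i++)
        y-&amp;gt;data[i * y-&amp;gt;stride] += alpha * x-&amp;gt;data[i * x-&amp;gt;stride];
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;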

&lt;p&gt;You can imagine that this will clean up a lot of our code.  Let’s see if we can
use the same trick for matrices using &lt;code class=&quot;highlighter-rouge&quot;&gt;dgemv&lt;/code&gt;, the function to multiply a matrix
by a vector, as an example.  Specifically, the function performs the operation
&lt;code class=&quot;highlighter-rouge&quot;&gt;y := alpha * op(A) * x + beta * y&lt;/code&gt; where &lt;code class=&quot;highlighter-rouge&quot;&gt;op&lt;/code&gt; is either “transpose”, “herm”, or “identity” (“herm” means conjugate transpose).&lt;/p&gt;

&lt;p&gt;Here is the signature:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void cblas_dgemv (enum CBLAS_ORDER order,
                  enum CBLAS_TRANSPOSE transa, 
                  int m, int n,
                  double alpha, const double *a, int lda,
                  const double *x, int incx,
                  double beta, double *y, int incy);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are twelve parameters.  Barf.  Can we clean this up?  Absolutely.  Here’s
the vector type we are going to use:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;typedef struct
{
    double *data;
    int size1;
    int size2;
    int lda;
    int is_herm;
} matrix_t;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I have introduced a boolean field &lt;code class=&quot;highlighter-rouge&quot;&gt;is_herm&lt;/code&gt; to indicate whether the matrix has
been transposed and conjugated.  This eliminates &lt;code class=&quot;highlighter-rouge&quot;&gt;transa&lt;/code&gt; from the call 
signature:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void my_dgemv (double alpha, 
               const matrix_t *a, const vector_t *x,
               double beta, 
               vector_t *y);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We have eliminated seven parameters from the call, and we have only sacrificed
a little bit of functionality.  We have gained the ability to do run-time
dimension checking for the arguments.&lt;/p&gt;
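
&lt;p&gt;As a sketch of that dimension checking (again my own illustration, using the structs defined above), the shape test inside &lt;code class=&quot;highlighter-rouge&quot;&gt;my_dgemv&lt;/code&gt; only has to remember that &lt;code class=&quot;highlighter-rouge&quot;&gt;is_herm&lt;/code&gt; swaps the roles of &lt;code class=&quot;highlighter-rouge&quot;&gt;size1&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;size2&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/* Return 1 when the shapes in y := alpha * op(A) * x + beta * y agree. */
int dgemv_shapes_ok (const matrix_t *a, const vector_t *x, const vector_t *y)
{
    int rows = a-&amp;gt;is_herm ? a-&amp;gt;size2 : a-&amp;gt;size1;
    int cols = a-&amp;gt;is_herm ? a-&amp;gt;size1 : a-&amp;gt;size2;
    if (x-&amp;gt;size != cols) return 0;
    if (y-&amp;gt;size != rows) return 0;
    return 1;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;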

&lt;p&gt;The feature we have lost is the ability to use “transpose”.  We can only do
“identity” and “herm”.  Why did I use a boolean for &lt;code class=&quot;highlighter-rouge&quot;&gt;is_herm&lt;/code&gt; rather than the
more general &lt;code class=&quot;highlighter-rouge&quot;&gt;CBLAS_TRANSPOSE&lt;/code&gt; type?  Because now we can have a function&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void make_herm (matrix_t *a);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;that takes the &lt;code class=&quot;highlighter-rouge&quot;&gt;herm&lt;/code&gt; of a matrix as an O(1) operation.  It would be nice to
have &lt;code class=&quot;highlighter-rouge&quot;&gt;make_trans&lt;/code&gt; be an O(1) operation, too.  Sadly, because BLAS does not support “conjugate” as the &lt;code class=&quot;highlighter-rouge&quot;&gt;op&lt;/code&gt; type in a multiplication, we can either have
&lt;code class=&quot;highlighter-rouge&quot;&gt;trans&lt;/code&gt; be O(1) or we can have &lt;code class=&quot;highlighter-rouge&quot;&gt;herm&lt;/code&gt; be O(1), but we cannot have both.  I think
giving up &lt;code class=&quot;highlighter-rouge&quot;&gt;trans&lt;/code&gt; is a price worth paying for the simpler interface.&lt;/p&gt;

&lt;p&gt;(Astute readers will notice that for &lt;code class=&quot;highlighter-rouge&quot;&gt;gemv&lt;/code&gt;, it is possible to get &lt;code class=&quot;highlighter-rouge&quot;&gt;conj(a)&lt;/code&gt; by using “row-major” as the order and “herm” as the transpose type.  This trick does not extend to &lt;code class=&quot;highlighter-rouge&quot;&gt;dgemm&lt;/code&gt;, the function that multiplies two matrices.)&lt;/p&gt;

&lt;p&gt;Now I have to come back to why &lt;code class=&quot;highlighter-rouge&quot;&gt;vector_t&lt;/code&gt; has an &lt;code class=&quot;highlighter-rouge&quot;&gt;is_conj&lt;/code&gt; field.  It is
necessary if we want taking a row or a column of a “herm-ed” matrix to be an
O(1) operation.&lt;/p&gt;
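&lt;p&gt;Here is a sketch of how the flag-based representation might fit together
(all type and field names are hypothetical): &lt;code class=&quot;highlighter-rouge&quot;&gt;make_herm&lt;/code&gt; just flips a bit and
swaps the logical dimensions, and a row of a herm-ed matrix comes back as a
conjugated view, which is where &lt;code class=&quot;highlighter-rouge&quot;&gt;is_conj&lt;/code&gt; earns its keep.&lt;/p&gt;

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types: a column-major buffer plus an is_herm flag, and
   vector views that carry a stride and an is_conj flag. */
typedef struct { size_t m, n; double complex *data; bool is_herm; } matrix_t;
typedef struct {
    size_t n;
    const double complex *data;
    ptrdiff_t stride;
    bool is_conj;
} vector_t;

/* O(1): no data moves; flip the flag and swap the logical dimensions. */
void make_herm(matrix_t *a)
{
    size_t t = a->m; a->m = a->n; a->n = t;
    a->is_herm = !a->is_herm;
}

/* Row i as an O(1) view.  A plain matrix gives a strided, unconjugated
   row; a herm-ed matrix gives a contiguous but conjugated one (it is a
   column of the underlying buffer). */
vector_t matrix_row(const matrix_t *a, size_t i)
{
    vector_t r = { a->n, NULL, 0, false };
    if (!a->is_herm) {
        r.data = a->data + i;            /* element (i,j) at i + j*m */
        r.stride = (ptrdiff_t)a->m;
    } else {
        r.data = a->data + i * a->n;     /* buffer column i */
        r.stride = 1;
        r.is_conj = true;
    }
    return r;
}

/* Read element j, honoring the conjugation flag. */
double complex vector_get(const vector_t *x, size_t j)
{
    double complex v = x->data[(ptrdiff_t)j * x->stride];
    return x->is_conj ? conj(v) : v;
}
```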

&lt;h2 id=&quot;extending-blas&quot;&gt;Extending BLAS&lt;/h2&gt;

&lt;p&gt;BLAS &lt;em&gt;almost&lt;/em&gt; supports the simplifications we have made to the API by
introducing &lt;code class=&quot;highlighter-rouge&quot;&gt;vector_t&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;matrix_t&lt;/code&gt;.  To make them work, we need to
implement the following functions ourselves:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;y := alpha * conj(x) + beta * y&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;y := alpha * op(A) * conj(x) + beta * y&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;conj(y) := alpha * op(A) * x + beta * conj(y)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The second function is missing when &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt; has non-unit stride, and the third is
missing when &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; has non-unit stride.  This is because we can always cast a
vector as a matrix when it is conjugated.  If the vector is not conjugated, we
can only perform the cast if the stride is one. (Columns are stored contiguously
for normal matrices, but not for herm-ed matrices.)  Once the vectors have been
cast to matrices, we can call &lt;code class=&quot;highlighter-rouge&quot;&gt;gemm&lt;/code&gt;.&lt;/p&gt;
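&lt;p&gt;The casting argument can be sketched as follows (hypothetical types again):
a conjugated vector is always the herm of a 1-by-n row matrix whose leading
dimension is the stride, whereas an unconjugated vector is only a valid
column-major n-by-1 matrix when its stride is one.&lt;/p&gt;

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view types, as in the rest of the post. */
typedef struct {
    size_t n;
    double complex *data;
    ptrdiff_t stride;
    bool is_conj;
} vector_t;
typedef struct { size_t m, n; double complex *data; size_t lda; bool is_herm; } matrix_t;

/* Try to view an n-vector as an n-by-1 matrix without copying.
   Returns false when no O(1) cast exists. */
bool vector_as_matrix(const vector_t *x, matrix_t *out)
{
    if (x->is_conj) {
        /* herm of a 1-by-n row matrix with lda = stride */
        out->m = x->n; out->n = 1;
        out->data = x->data;
        out->lda = (size_t)x->stride;
        out->is_herm = true;
        return true;
    }
    if (x->stride == 1) {
        /* contiguous: already a column-major n-by-1 matrix */
        out->m = x->n; out->n = 1;
        out->data = x->data;
        out->lda = x->n;
        out->is_herm = false;
        return true;
    }
    return false;  /* strided and unconjugated: a copy is unavoidable */
}

/* Read element (i,j), honoring the herm flag. */
double complex matrix_get(const matrix_t *a, size_t i, size_t j)
{
    return a->is_herm ? conj(a->data[j + i * a->lda])
                      : a->data[i + j * a->lda];
}
```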

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;The BLAS API is painfully verbose.  By adding our own data types and giving up
the ability to take the transpose of the matrix, we can get a far simpler
interface with nearly all of the power.  The approach I describe here is the one
I use in the &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas&quot;&gt;blas bindings for Haskell&lt;/a&gt;.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <title>ANN: BLAS bindings for Haskell, version 0.4</title>
    <link href="http://ptrckprry.com/blog/programming/2008/06/06/ann-blas-bindings-for-haskell-version-04/" />
    <updated>2008-06-06T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-06-06:/blog/programming/2008/06/06/ann-blas-bindings-for-haskell-version-04/</id>
    <content type="html">&lt;p&gt;I’ve written a set of bindings for the &lt;a href=&quot;http://www.netlib.org/blas/&quot;&gt;BLAS&lt;/a&gt; linear algebra library, and I finally uploaded them to &lt;a href=&quot;http://hackage.haskell.org/cgi-bin/hackage-scripts/package/blas&quot;&gt;Hackage&lt;/a&gt; last night.  That was kind of the impetus for starting this blog: so there would be a place for me to make a formal announcement.&lt;/p&gt;

&lt;p&gt;Well, before I got a chance to make that announcement, I received the following e-mail:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;From: Alberto Ruiz
To: Patrick Perry
CC: haskell-cafe@haskell.org
Subject: Patrick Perry's BLAS package

Hello all,
I have just noticed that yesterday this fantastic package has been uploaded
to hackage.  We finally have a high quality library for numeric linear
algebra. This is very good news for the Haskell community.

Patrick, many thanks for your excellent work. Do you have similar plans
for LAPACK?
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I’m really happy that people seem to be interested in the library.  Alberto, in particular, is the primary author of &lt;a href=&quot;http://alberrto.googlepages.com/gslhaskell&quot;&gt;hmatrix&lt;/a&gt;, another haskell linear algebra library (which I stole a few ideas from), so if he endorses it, that means a lot to me.&lt;/p&gt;

&lt;p&gt;So, Yet Another Linear Algebra Library?  I’ve already mentioned hmatrix.  There’s also another one called &lt;a href=&quot;http://www.cs.utah.edu/~hal/HBlas/index.html&quot;&gt;HBlas&lt;/a&gt;.  Why would anyone want a third?  Here are my reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Support for both immutable and mutable types.&lt;/strong&gt; 
Haskell tries to make you use immutable types as much as possible, and 
indeed there is a very good reason for this, but sometimes you have a 100MB 
matrix, and it just isn’t very practical to make a copy of it every time 
you modify it.  &lt;em&gt;hmatrix&lt;/em&gt; only supports immutable types, and HBlas only 
supports mutable ones.  I wanted both.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Access control via phantom types.&lt;/strong&gt;
When you have immutable and mutable types, it’s very annoying to have
separate functions for each type.  Do I want to have to call “numCols” for
immutable matrices and “getNumCols” for mutable ones, even though both
functions are pure, and both do exactly the same thing?  No.  If I want to
add an immutable matrix to a mutable one, do I want to first call 
&lt;code class=&quot;highlighter-rouge&quot;&gt;unsafeThaw&lt;/code&gt; on the immutable one to cast it to be mutable?  No.  With the
phantom type trick, you can get around this insanity.  Jane Street Capital
has &lt;a href=&quot;http://ocaml.janestcapital.com/?q=node/11&quot;&gt;a very good description&lt;/a&gt; of how this works.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Phantom types for matrix and vector shapes.&lt;/strong&gt;
This is a trick I learned from &lt;a href=&quot;http://www.haskell.org/haskellwiki/Darcs&quot;&gt;darcs&lt;/a&gt;.  It means that the compiler can catch many dimension-mismatch mistakes.  So, for instance, a function like the following will not type-check.  (&lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;*&amp;gt;&lt;/code&gt; is the function to multiply a matrix by a vector.  Everything is ok if you replace &lt;code class=&quot;highlighter-rouge&quot;&gt;row&lt;/code&gt; by &lt;code class=&quot;highlighter-rouge&quot;&gt;col&lt;/code&gt;.)  This feature has caught a few bugs in my code.&lt;/p&gt;

    &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; foo :: (BLAS3 e) =&amp;gt; Matrix (m,n) e -&amp;gt; Matrix (n,k) e -&amp;gt; Int -&amp;gt; Vector m e
 foo a b i = let x = row b i in a &amp;lt;*&amp;gt; x
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Taking the conjugate transpose (&lt;code class=&quot;highlighter-rouge&quot;&gt;herm&lt;/code&gt;) of a matrix is an O(1)
operation.&lt;/strong&gt;
This is similar to &lt;em&gt;hmatrix&lt;/em&gt;, where taking the transpose is O(1).  As BLAS
and LAPACK (mostly) support this, it makes no sense to copy a matrix just
to work with the conjugate transpose.  Why conjugate transpose instead of
just transpose?  Because the former is a far more common operation.  This
is why the &lt;code class=&quot;highlighter-rouge&quot;&gt;'&lt;/code&gt; operator in MATLAB is conjugate transpose.  The drawback for
this feature is that BLAS and LAPACK do not support it everywhere.  In
particular, QR decomposition with pivoting is going to be a huge pain in
the ass to support for herm-ed matrices.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Support for triangular and hermitian views of matrices.&lt;/strong&gt;
This is a feature of BLAS that no one seems to support (not even MATLAB).
In addition to the &lt;code class=&quot;highlighter-rouge&quot;&gt;Matrix&lt;/code&gt; type, there are &lt;code class=&quot;highlighter-rouge&quot;&gt;Tri Matrix&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Herm Matrix&lt;/code&gt;
types that only refer to the upper- or lower-triangular part of the matrix.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hopefully the features above are compelling enough to make people want to use the library.  These bindings have been a lot of work.  For me to come up with the feature list above, I’ve already gone through a few iterations of dramatic re-writes (hence the version number).  Of course, I always welcome suggestions for how to make it better.&lt;/p&gt;

&lt;p&gt;What’s next?  In the immediate future, I plan to add banded matrices.  I’ve already written a good chunk of code for this, but it isn’t very well tested, so I decided to leave it out of the release.  I’m also going to add permutation matrices.  I don’t have plans to add support for packed triangular matrices, but if someone else wanted to do that, I would be happy to include it.  The same goes for symmetric complex matrices.&lt;/p&gt;

&lt;p&gt;LAPACK support is on the horizon, but that may take a while.  Also, I probably won’t do more than SVD, QR, and Cholesky, since those are all I need.  Expect a preliminary announcement by the end of the summer.&lt;/p&gt;

&lt;p&gt;This work would not have been possible without looking at the other excellent linear algebra libraries out there.  In particular the &lt;a href=&quot;http://www.gnu.org/software/gsl/&quot;&gt;GNU Scientific Library&lt;/a&gt; was the basis for much of the design.  I also drew inspiration from hmatrix and the haskell array libraries.&lt;/p&gt;

&lt;p&gt;Thanks also to the folks at &lt;code class=&quot;highlighter-rouge&quot;&gt;#haskell&lt;/code&gt;.  You guys have been a lot of help.&lt;/p&gt;

&lt;p&gt;Please let me know if you have any success in using the library, and if you have any suggestions for how to make it better.&lt;/p&gt;
</content>
  </entry>
  
  <entry>
    <title>First Post</title>
    <link href="http://ptrckprry.com/blog/meta/2008/06/05/first-post/" />
    <updated>2008-06-05T00:00:00+00:00</updated>
    <id>tag:ptrckprry.com,2008-06-05:/blog/meta/2008/06/05/first-post/</id>
    <content type="html">&lt;p&gt;Look out blog-o-sphere, here I come.&lt;/p&gt;
</content>
  </entry>
  
 
</feed>
