Wednesday, March 30, 2011

The New Parallel Monad

So there's a new parallel monad recently out from some very major players in the Haskell scene.

Man, is it slick!  Using the factors code from the last post, it was so easy to write a program that parallelized the factorizations of different numbers.  My code took two command line arguments: the first number to factor, and the number of consecutive numbers to factor. The base version of the code was
getFactorStr :: Integer -> String
getFactorStr x = concat $ intersperse "," $
                     map show $ factors x

getFactorStrRange :: Integer -> Integer -> [String]
getFactorStrRange x y = map getFactorStr [x..y]

main = do
   args <- getArgs let start = read $ args !! 0
   let end = (start + (read $ args !! 1))
   mapM putStrLn $ getFactorStrRange start end
The parallel version isn't a lot more complicated:
getFactorStr' :: Integer -> Par String
getFactorStr' x = do
  calc <- spawn $ return $ concat $ intersperse "," $ 
             map show $ factors x
  v <- get calc
  return v 

getFactorStrRange' :: Integer -> Integer -> Par [String]
getFactorStrRange' x y = parMapM getFactorStr' [x..y]

main = do
  args <- getArgs let start = read $ args !! 0
  let end = (start + (read $ args !! 1))
  mapM putStrLn $ runPar $ getFactorStrRange' start end
Now I'm sure it took me longer than it would have taken a more experienced Haskell programmer, but whenever I got into trouble I just had to look at the types of things to find my way out. Like the baseball coach says, you've just got to keep your eye on the type.

And how did the timing work out?  On my quad-core Athlon II machine, the parallel version ran about 3.2x as fast as the sequential version (15.1 seconds vs 48.6 seconds).  This was a crude timing test -- I made no effort to control for what other processes on the system were doing. 

Playing Around with QuickCheck

So I saw the new parallel monad and I wanted to give it a spin.  A simple package that factors numbers seemed to be just the thing. It parallelizes nicely. More about that in the companion post -- this post is about my first steps with QuickCheck  I wanted to test this factor code:
factors :: Integer -> [Integer]
factors x
  | x < 2 = [1] | otherwise = sort $ factors' x 2 [1,x]

factors' :: Integer -> Integer -> [Integer] -> [Integer]
factors' x y zs
  | fromInteger y > (sqrt $ fromInteger x) = zs
  | x `mod` y == 0 && y*y /= x =
                 factors' x (y+1) (y : x `div` y : zs)
  | x `mod` y == 0 && y*y == x =
                 factors' x (y+1) (y : zs)
  | otherwise = factors' x (y+1) zs
The first lesson I learned along the way is that the result of a QuickCheck test is a Property, not a Bool; just a heads-up for other newbies. BTW, one of my favorite things about Real World Haskell is that it gives more than a passing thought to stuff like Cabal and QuickCheck. 

The test machinery was rather more involved than the code itself.  First I had to put together a list of prime numbers, then I could write code that would test factorization over primes and over products of two primes. 
sieve :: Integral a => [a] -> [a]
sieve (p:xs) = p : sieve [x | x <- xs, 0 /= x `mod` p] primes :: Integral a => [a]
primes = sieve [2..]

primesInteger = primes :: [Integer]
getPrime x = primesInteger !! (x `mod` 100)

buildAssocList :: Integral a => [a] -> [a] -> [a]
buildAssocList xs ys = sort $ nub [a*b|a <- xs,b <- ys]

propAssoc x y = x > 0 ==> y > 0 ==>
  factors (x*y) == buildAssocList (factors x)
                                  (factors y)

propPrime x = x > 0 ==>
  let prime = getPrime x in factors prime == [1,prime]

propSimple x y = x > 0 ==> y > 0 ==>
  let primeX = getPrime x
      primeY = getPrime y
  in (factors $ primeX * primeY) ==
          (sort $ nub [1,primeX,primeY,primeX*primeY])
This is all straightforward.  Getting the final piece of test code took rather more time, but it refactored down to
testDepth = 8          -- actually N-1

check :: Integer -> [Integer] -> Integer -> [Integer] -> Bool
check 0 _ _ _ = True
check y xs n ns =
  let p = getPrime (1 + abs(fromIntegral $ head xs))
      ps = factors p
      p_list = build_assoc_list ns ps
  in factors (n*p) == p_list &&
     check (y-1) (tail xs) (n*p) (p_list)

prop_complex z zs = z > 0 &&
     (fromIntegral $ length zs) >=
         (fromIntegral (z `mod` testDepth))   ==>
  check (z `mod` testDepth) zs 1 [1]
This works, though I've got to say that the prop_complex check on the length of zs is not exactly elegant.   I'd love to hear comments on how to make it more idiomatic Haskell (and that applies to all the code).

Friday, March 18, 2011

Real World Haskell on Functors

I've just hit the "Introducing Functors" section.  Though my initial love affair with the book has worn off just a little bit, I'm still liking it a lot.

The following excerpt on Functors is a really nice explanation:

class Functor f where
   fmap :: (a -> b) -> f a -> f b

We can think of fmap as a kind of lifting function, as we introduced in "Avoiding Boilerplate with Lifting" on page 223. It takes a function over ordinary values a -> b, and lifts it to become a function over containers f a -> f b, where f is the container type.

I think that viewing f as a container type is a great introduction to Functors.  I find it more accessible than the other explanation I've seen, which I talk about in this post.  Not that the other way isn't useful in it's own right, but I think I would have picked things up faster if I had seen this explanation first.

Tuesday, March 1, 2011

Two days into Real World Haskell . . .

and I'm loving it already!

Topics that others explain incompletely, this book covers well the first time.  Example: the indentation rule.  LYAH, much as I love it, lets you know that indentation is important and what common practice is, but leaves you without a firm grasp on all the "what ifs."  Real World Haskell lays it all out there the first time but without taking a lot of space to do it.

Having already worked my way completely through LYAH and "Write Yourself a Scheme", and written a few scripts in Haskell, I was a little tempted to skip or skim through the first few chapters of RWH.  I'm glad I didn't; there's lots of great insights in that material. 

So I've just reached the first chapter where they're really digging in to some "real-world" code: a simple JSON library.  My biggest gripe so far has been that exercises have been a little scarce, and they come at the end of the end of a section, textbook-style.  I prefer exercises integrated into the text:  they strike while the iron is hot.  See this lambda calculus introduction for an example.