Memoization in Haskell
When I was first learning Haskell, there were some imperative design patterns that I had trouble converting to functional style Haskell. Memoization is the first thing that comes to mind. I use memoization so much that it was pretty much trivial do in an imperative language like Java. However, I could not figure out how to do it in Haskell. In this blog post, I will discuss how you can memoize your functions in Haskell to boost performance in case you're ever in a situation where you want to use it. It turns out that memoization in Haskell is just as easy, if not easier, than memoization in an imperative language.
Before I talk about memoization in Haskell, I will discuss the traditional approach to memoizing a recursive function in an imperative language. It essentially boils down to these steps:
$$g(n, k) =
\left\{
\begin{array}{ll}
1 \quad &if \quad k = 0\\
g(n, k-1) \quad &if \quad n = k\\
g(n-1, k) + g(n, k-1) \quad &otherwise
\end{array}
\right. $$ Consider an implementation of this function in Java taken as a direct translation of the mathematical formulation:
We can still memoize function in Haskell, we just have to approach the problem differently. Here is a non-memoized implementation in Haskell:
Before I talk about memoization in Haskell, I will discuss the traditional approach to memoizing a recursive function in an imperative language. It essentially boils down to these steps:
- Query lookup table to see if result is computed; If so, return it
- Compute result recursively
- Store result in lookup table
- Return result
$$g(n, k) =
\left\{
\begin{array}{ll}
1 \quad &if \quad k = 0\\
g(n, k-1) \quad &if \quad n = k\\
g(n-1, k) + g(n, k-1) \quad &otherwise
\end{array}
\right. $$ Consider an implementation of this function in Java taken as a direct translation of the mathematical formulation:
public static long g(int n, int k) {
if(k == n+1) return 0;
if(k == 0) return 1;
return g(n-1, k) + g(k, n-1);
}
We could easily memoize this function using the steps described above:
public static Long[][] = new Long[100][100];
public static long g(int n, int k) {
if(k == n+1) return 0;
if(k == 0) return 1;
if(lookup[n][k] != null) return lookup[n][k];
lookup[n][k] = g(n-1, k) + g(k, n-1);
return lookup[n][k];
}
In Haskell, we have to approach the problem a little differently. There a few things that makes it difficult to directly translate from Java to Haskell:- There is no global state in Haskell, so it would be invalid to have a lookup table that has global visibility as it does in Java.
- In imperative languages, functions are defined as a sequence of steps, but in functional languages a function is more like a sequence of transformations on some input. The way we wrote memoization above is extremely imperative and doesn't translate well to Haskell.
We can still memoize function in Haskell, we just have to approach the problem differently. Here is a non-memoized implementation in Haskell:
g :: Int -> Int -> Integer
g _ 0 = 1
g n k
| n == k = g n (k-1)
| otherwise = g (n-1) k + g n (k-1)
We can memoize this function as follows:
import Data.Array
g :: Int -> Int -> Integer
g n k = table ! (n,k) where
bnds = ((0,0), (n,k))
table = listArray bnds . map g' $ range bnds
g' (_,0) = 1
g' (n,k)
| n == k = table ! (n,k-1)
| otherwise = table ! (n,k-1) + table ! (n-1,k)
The basic idea is that we define a lookup table to store all the values of the function. Thanks to Lazy Evaluation, it won't actually evaluate any elements of the lookup table until they are explicitly asked for with an indexing operation. The definition of g' then uses the values in the lookup table directly in it's definition. When table is first defined, the program allocates thunks for every element of the array. A thunk is basically a computation - either evaluated or unevaluated. If a computation is never used, it is never evaluated. And after a computation has been evaluated once, all successive references to the thunk are done without re-evaluating the computation.
This discussion wouldn't be complete without saying something about the Control.Monad.Memo
module on Hackage. This module contains a set of higher order functions that allow you to tag the parts of your code that you want to memoize. It's very easy to use and very powerful as well. It's nice because you can memoize just about any function in the same way, regardless of their inputs. The library is available on hackage. Here is a memoized implementation using the memoization module:
import Control.Monad.Memo
g :: Int -> Int -> Integer
g n k = startEvalMemo $ g' (n,k) wheree
g' (_,0) = return 1
g' (n,k)
| n == k = memo g' (n,k-1)
| otherwise = do a <- a="" b="" code="" g="" k-1="" k="" memo="" n-1="" n="" return="">->
If you are new to Haskell and curious how to do use memoization, hopefully this blog post cleared up some of your questions. For a more extensive overview of memoization in Haskell, you might want to visit the Haskell wiki page.
Comments
Post a Comment