Non-Numeric Matrices 2

In my last blog post, I showed how matrices and matrix multiplication in particular can make sense for more than just numbers. In fact, I used a boolean matrix construct to find the connected components of a graph. In this blog post I will extend that idea to two more types of matrices:
  • "Distance" matrices to find the shortest distances between any two nodes
  • Path matrices to find the (shortest) list of actions that lead between any two states
The first type of matrix that I will talk about is what I call a 'Distance' matrix. This is not to be confused with how distance matrices are usually defined in mathematics and computer science (pairwise distance between points). Instead, my definition is more closely related to an adjacency matrix, where connected nodes have non-zero entries (given by the weight, or distance between the nodes), and unconnected nodes have a distance of 'X' to denote that there is no edge between the nodes. Using this newly defined construct, we can use matrix exponentiation to find all pairs of shortest paths. I will use Haskell to explore this idea.
data Distance = NA | Distance Int deriving(Eq)

instance Show Distance where
  show NA = "X"
  show (Distance x) = show x
In order to get matrix exponentiation to work for our new data type, we need to provide at least 3 definitions: fromInteger, (+), and (*). We only need IromInteger to be defined for '0' and '1' so the underlying matrix framework can construct a valid identity matrix. Futher, we are going to overwrite what it means to add and multiply to fit our new problem domain (all pairs shortest path). The elements of our Distance matrix are the distances between nodes in our graph (or 'X' if there is no edge between them). If we square our distance matrix, we want the resulting matrix to contain elements with the shortest distance between any two nodes (in two steps). If we can go from node 'A' to node 'B' in \( x \) distance and we can go from node 'B' to node 'C' in \( y \) distance, then clearly we can get from 'A' to 'C' in \( x + y \) distance through 'B'. Further, if we have a bunch of possible distances from 'A' to 'C' through arbitrary intermediate nodes, the actual distance is the minimum of that list. Thus, we want to define (*) to be addition, and (+) to be minimum. We also need some sensible notion for an identity matrix. What does it mean to be '0'? What does it mean to be '1'? Well, in math \( 0*t = 0 \) and \( 0 + t = t \). The element that satisfies this for us is 'X'; if there is no path between 'A' and 'B', then no matter what the distance between 'B' and 'C' is, there will never be a path between 'A' and 'C' through 'B'. Further, we will always favor the existence of a path over the non-existence of one, so 'X' + t = t no matter what t is. '1' has to obey a similar property: \( 1 * t = t \). What element obeys that law for us? Well, since (*) is actually addition for us, '1' is actually '0'. Here is the Haskell Num instance for this data type.
instance Num Distance where
  fromInteger 0 = NA
  fromInteger 1 = Distance 0
  NA + d = d
  d + NA = d
  Distance x + Distance y = Distance (min x y)
  NA * _ = NA
  _ * NA = NA
  Distance x * Distance y = Distance (x + y)
  abs = undefined
  signum = undefined
  negate = undefined
Now, we can easily create a distance matrix, iteratively multiply until convergence (or exponentiate it), and get a matrix containing the shortest distance between any two vertices.
Ryan> let matrix = ...
Ryan> matrix
(  0  X  X  X  1  X 93 97 )
(  X  0  7  X 79 14  X 27 )
(  X  X  0  X 36  X  X 43 )
(  X 91 39  0 98  X 95 35 )
( 71 13  4 98  0 32  X  X )
( 30  X 19  X 66  0  X 99 )
(  X 96 12 35 71  X  0  X )
(  X  X 69  X  X  X  X  0 )                            
Now we can simply exponentiate this matrix, and haskell will automatically know to use our custom defined (+) and (*) operators for the matrix multiplication.
Ryan> matrix*matrix
(   0  14   5  99   1  33  93  97 )
(  44   0   7 177  43  14   X  27 )
( 107  49   0 134  36  68   X  43 )
( 169  91  39   0  75 105  95  35 )
(  62  13   4  98   0  27 164  40 )
(  30  79  19 164  31   0 123  62 )
( 142  84  12  35  48 103   0  55 )
(   X   X  69   X 105   X   X   0 )

Ryan> matrix^8
(   0  14   5  99   1  28  93  41 )
(  44   0   7 141  43  14 137  27 )
(  93  49   0 134  36  63 186  43 )
( 132  88  39   0  75 102  95  35 )
(  57  13   4  98   0  27 150  40 )
(  30  44  19 129  31   0 123  62 )
( 105  61  12  35  48  75   0  55 )
( 162 118  69 203 105 132 255   0 )                           
So matrix \( ^8 \) yields the matrix containing the minimum distance between a any two vertices. Matrix multiplication is an \( O(n^3) \) algorithm, and matrix exponentiation requires a logarithmic number of matrix multiplication (also in terms of \( n \)), so the time complexity of this algorithm is \( O(n^3 log (n) \), which is slightly worse than the \( O(n^3) \) Floyd-Warshall algorithm.

Path Matrices

The last type of matrix I will talk about are what I call 'Path' matrices. This is the most complex of the three variations I've found, but we can reuse many of the same ideas from Boolean matrices and Distance matrices. The motivation for Path matrices is to be able to calculate a (shortest) path (a list of actions or transitions) between any two vertices in a graph. For simplicity, I will denote an action by the state that you end up in. Thus, the action '3' is an action that goes to state '3' regardless of the starting state. This can be changed depending on the context of the problem but this is the most obvious choice for this example. The elements of the matrix will contain the sequence of actions required to go from from one vertex to another. If no such sequence exists, then that element will contain some special value, which I will call 'None'. This leads to the following Haskell data definition:
type Action = Int
data Path = None | Path [Action] deriving (Show, Eq)
The elements of the matrix will be singleton lists (or 'None') where the elements of the list indicates the sequence of actions required to go from one vertex to another. This might seem confusing, so here is an example to clear things up:
(  Path [] Path [2] Path [3]     None     None )
(     None  Path [] Path [3] Path [4]     None )
(     None Path [2]  Path [] Path [4] Path [5] )
( Path [1]     None Path [3]  Path [] Path [5] )
(     None     None     None Path [4]  Path [] )
We want to define multiplication and addition in such a way that when exponentiated, the resulting matrix yields the shortest list of actions required to move between any two states. If \( [a_1, a_2 .. a_n] \) is a list of actions that result in moving from 'A' to 'B' and \( [b_1, b_2 .. b_m] \) is a list of actions that result in moving from 'B' to 'C', then \( [a_1, a_2 .. a_n, b_1, b_2 .. b_m] \) is a list of actions that would go from 'A' to 'C'. Thus, multiplication should be defined as list concatenation. Further, given two potential paths from 'A' to 'C' we will always favor the shorter one, so we can define addition of two lists to be the one shorter in length. The last thing we need to specify is what our '0' and '1' elements are so we can build an identity matrix. In this case, '0' = 'None' since None * x = None and None + x = x. Further, '1' = 'Path []' since Path [] * x = x. This leads to the following Num instance in Haskell:
instance Num Path where
  fromInteger 0 = None
  fromInteger 1 = Path []
  None + p = p
  p + None = p
  Path xs + Path ys = if length xs <= length ys then Path xs else Path ys
  None * _ = None
  _ * None = None
  Path xs * Path ys = Path $ xs ++ ys
  abs = undefined
  signum = undefined
  negate = undefined
Now we are ready to find the shortest list of actions between any two vertices.
Ryan> matrix
(  Path [] Path [2] Path [3]     None     None )
(     None  Path [] Path [3] Path [4]     None )
(     None Path [2]  Path [] Path [4] Path [5] )
( Path [1]     None Path [3]  Path [] Path [5] )
(     None     None     None Path [4]  Path [] )

Ryan> matrix^2
(    Path []   Path [2]   Path [3] Path [3,4] Path [3,5] )
( Path [4,1]    Path []   Path [3]   Path [4] Path [4,5] )
( Path [4,1]   Path [2]    Path []   Path [4]   Path [5] )
(   Path [1] Path [3,2]   Path [3]    Path []   Path [5] )
( Path [4,1]       None Path [4,3]   Path [4]    Path [] )

Ryan> matrix^5  
(      Path []     Path [2]     Path [3]   Path [3,4]   Path [3,5] )
(   Path [4,1]      Path []     Path [3]     Path [4]   Path [4,5] )
(   Path [4,1]     Path [2]      Path []     Path [4]     Path [5] )
(     Path [1]   Path [3,2]     Path [3]      Path []     Path [5] )
(   Path [4,1] Path [4,3,2]   Path [4,3]     Path [4]      Path [] )
Hopefully the matrices above make sense. 'matrix' is the original matrix. 'matrix \( ^2 \)' is the matrix where the elements are lists of actions that can be completed in \( 2 \) or less steps. 'matrix \( ^5 \)' is the final matrix that is guaranteed to have every path filled in if it exists. For example, to go from vertex \( 1 \) to vertex \( 5 \) (top right corner of matrix), we will follow the path: \( 1 \rightarrow 3 \rightarrow 5 \). Similarly, to go from vertex \( 5 \) to \( 1 \), we will take the path: \( 5 \rightarrow 4 \rightarrow 1 \).

Concluding Remarks

These algorithms aren't very useful as there are more efficient ways to solve all of these problems. That being said, I still think it is pretty neat that matrix multiplication can be applied in some many different and unexpected ways. I also think it's awesome how easily it was to implement these ideas in Haskell. Haskell is the only programming language I've used that has the ecosystem that can support this level of abstraction. Doing something like this would be unimaginable in a language like java unless I were to manually write out the matrix multiplication code for all 3 types of matrices. This sort of thing might be possible to do in python as well, but I think it would definitely be difficult to beat the Haskell solution. If you would like to tinker with the code or take a closer look, I've made it available here.

Comments

Post a Comment

Popular posts from this blog

Multi-Core Programming with Java

Beat the Streak: Day Three

Efficiently Remove Duplicate Rows from a 2D Numpy Array