An Efficient Communication Scheme

In this blog post, I'll be talking about an efficient communication scheme that can be leveraged in parallel programs for broadcast or reduce operations.  The idea is very simple, but I thought it was pretty cool when I learned it, so I thought I would pass along the information to you.  Let's abstract away the parallel programming aspect and analyze the following question:
You have a message that you want to share with a large group of people, but you can only share the message with one person at a time. How should you distribute this message?

Setting up the Problem

To clear up some possible ambiguities, I want to emphasize that it takes one time step to share a message regardless of who you share it with, and anybody can share the message with anybody else (granted they've received it first).  I will identify each person in the group by a unique integer in ${0,1,...,n}$, and I will assign myself the id $0$. This is similar to how we might structure a distributed program running on $n$ nodes.  The most naive solution to this problem is to just share the message with all $n−1$ people one by one. There is much better way to do it, however.


A Better Scheme

A better scheme is to take advantage of the opportunity to parallelize the work of sharing the message. On time step one, I can share the message with one other person. On time step two, we can both share the message with two new people. Then the four of us can share the message with four new people. In general, we can double the number of people with the message on each time step, which implies that we can distribute this message to everybody in logarithmic time using this scheme.

I've shown how it's theoretically possible to achieve logarithmic time, but I left out some important details. Namely, how do you know who you should share the message with at any given time? I'd like to synthesize a function $f(i,t)$, where $f(i,t)$ tells you who person ii should send the message to at time $t$. For example, at time $0$, person $0$ can send the message to person $1$. At time $1$, person $0$ can send the message to person $2$ and person $1$ can send the message to person $3$. In general, $f(i,t)=2t+i$ where $0 \leq i < 2t $. Here is an animation of this process in tree form:


Another way to think about this communication pattern is as a k-dimensional hypercube where the vertices of the hypercube represent the people in the group. The vertices of a 4-D hypercube are given by the set of points ${(0,0,0,0),(0,0,0,1),(0,0,1,0),...,(1,1,1,0),(1,1,1,1)}$. These points in 4D space map very naturally to 4 digit binary numbers, or numbers in the range $ 0 \leq n < 2^4$. At each time step, we add a dimension to the hypercube until everybody has received the message. Here is the animation for this interpretation going up to the 3-D cube:


Conclusion

This methodology is used in parallel programming to broadcast a message to the nodes in a cluster. It performs much better than the naive solution I originally came up with, and serves as a great example of how you should start thinking about extracting parallelism from your problems. Since computer engineers are approaching the limit of single processor speed, parallel programming is becoming increasingly important, and being able to identify opportunities for parallelism within your code is a great skill to have.

Comments

Post a Comment

Popular posts from this blog

Multi-Core Programming with Java

Beat the Streak: Day Three

Efficiently Remove Duplicate Rows from a 2D Numpy Array