Here I have captured a thought process I had while reading about algorithms for hard graph problems. The thoughts are inspired by MapReduce, distributed merge sort and the more colorful newspapers of the world.
Summary of thoughts
Given an instance of a problem (think Max Clique, Traveling Salesman or another hard graph problem)…
Compute an instance that is “easier” but has the same optimal solution. This is done by a “reducer algorithm”.
Reducer algorithms may run in parallel.
Reducer algorithms may be different.
Reducer algorithms can “gossip” with each other during execution. Gossip helps an algorithm by yielding new information about the problem being solved.
Gossip is either a suboptimal solution or a reduced problem instance. This information can be used as a lower bound, or in other ways.
“Merger algorithms” can combine problem instances from different reducer algorithms into one.
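To make the gossip idea a bit more concrete, here is a minimal Python sketch of a shared place where reducers could post the best clique found so far. The name GossipBoard and its methods are my own invention for illustration, and it assumes all reducers run as threads in one process. A truly distributed setup would need some kind of messaging layer instead.

```python
import threading


class GossipBoard:
    """Hypothetical shared board where reducers post the best clique so far."""

    def __init__(self):
        self._lock = threading.Lock()
        self._best = frozenset()

    def publish(self, clique):
        """Post a clique. It is kept only if it is larger than the current best.

        Returns True if the gossip was accepted, False otherwise.
        """
        with self._lock:
            if len(clique) > len(self._best):
                self._best = frozenset(clique)
                return True
            return False

    def lower_bound(self):
        """Size of the largest clique any reducer has reported so far."""
        with self._lock:
            return len(self._best)
```

A reducer would call publish whenever it stumbles on a clique, and lower_bound whenever it wants to know what the others have found.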
A full example of reducing and merging: Maximum Clique Problem.
Here is an instance of the Maximum Clique Problem, in this case a non-planar graph. By the way, planar graphs are boring because they can only contain cliques of size 4 or smaller.
Let’s see what could happen when running two different reducers (reducer 1 and reducer 2) on this problem instance, and then merging the returned instances.
Reducer 1 works by finding a clique in the graph at random, and then repeatedly deleting nodes whose degree is less than the size of that clique. The clique found is emitted as a gossip message (reducer 2 will use this as a lower bound).
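A rough sketch of how reducer 1 might look in Python. It assumes the graph is stored as a dict mapping each node to the set of its neighbours (no self-loops), and that the random clique is grown greedily from a random start node. The function names are made up for this post. Note that with the "degree less than the clique size" rule, only nodes that could sit in a strictly larger clique survive, so the clique itself is returned alongside the reduced graph in case nothing bigger exists.

```python
import random


def greedy_clique(graph, rng):
    """Grow a clique from a random start node by adding compatible neighbours."""
    start = rng.choice(sorted(graph))
    clique = {start}
    candidates = set(graph[start])
    while candidates:
        node = rng.choice(sorted(candidates))
        clique.add(node)
        # Only nodes adjacent to every clique member remain candidates.
        candidates &= graph[node]
    return clique


def prune_low_degree(graph, k):
    """Repeatedly delete nodes whose degree is less than k."""
    graph = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    queue = [v for v in graph if len(graph[v]) < k]
    while queue:
        v = queue.pop()
        if v not in graph:
            continue  # already deleted via an earlier queue entry
        for u in graph.pop(v):
            graph[u].discard(v)
            if len(graph[u]) < k:
                queue.append(u)  # deleting v may push u below the bound
    return graph


def reducer_1(graph, seed=0):
    """Return (clique to gossip, reduced graph)."""
    rng = random.Random(seed)
    clique = greedy_clique(graph, rng)
    return clique, prune_low_degree(graph, len(clique))
```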
Here is the result of running reducer 1:
Let’s look at reducer 2. While running, reducer 2 could receive a gossip message from reducer 1 saying that a clique of size 4 has been found. Reducer 2 could use this as a lower bound. Reducer 2 targets nodes of degree around the lower bound. It works (slowly) by examining each targeted node to find out if it is part of a clique. If not, the node is deleted from the graph.
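Here is one way reducer 2 could be sketched in Python, again with the graph as a dict of neighbour sets. I am assuming two things that the description above leaves open: "around the lower bound" means a degree within slack of the bound, and "part of a clique" means part of a clique one node larger than the bound. The check is plain brute force over neighbour subsets, which is why this reducer is slow.

```python
from itertools import combinations


def in_clique_of_size(graph, node, size):
    """Brute-force check: does `node` belong to some clique with `size` nodes?"""
    neighbours = sorted(graph[node])
    for group in combinations(neighbours, size - 1):
        if all(b in graph[a] for a, b in combinations(group, 2)):
            return True
    return False


def reducer_2(graph, lower_bound, slack=1):
    """Delete targeted nodes that cannot be in a clique larger than lower_bound."""
    graph = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    target_size = lower_bound + 1  # assumption: we only care about beating the bound
    for v in sorted(graph):
        if abs(len(graph[v]) - lower_bound) > slack:
            continue  # degree is not "around" the lower bound, leave it alone
        if not in_clique_of_size(graph, v, target_size):
            for u in graph.pop(v):
                graph[u].discard(v)
    return graph
```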
This could be the result of running reducer 2 (and accepting gossip from reducer 1):
In this made-up example reducer 1 managed to remove more nodes than reducer 2, but the point is that they removed different nodes.
Running the merger (which computes the intersection) on the two reduced instances yields this:
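The merger is the simplest piece. A minimal sketch in Python, assuming both reduced instances are dicts of neighbour sets that came from the same original graph, and that the reducers only ever delete nodes. Under those assumptions a node that any reducer has ruled out can safely stay out, so we keep only the nodes (and edges) that survive in both.

```python
def merge(instance_a, instance_b):
    """Keep only the nodes and edges present in both reduced instances."""
    nodes = instance_a.keys() & instance_b.keys()
    return {v: instance_a[v] & instance_b[v] & nodes for v in nodes}
```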
Yay, an even smaller instance. But while we have the reducers up and running, why not restart reducer 1 with this instance as input? Let’s see what we get.
This looks pretty good. The graph now contains only 23 nodes, which is approximately half of the original graph, and we got there by discovering a relatively small clique of size 4 (compared to the big one of size 7).
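Restarting the reducers on the merged instance can be repeated until nothing shrinks any more. A small sketch of such a loop in Python. The function run_rounds and its parameters are hypothetical: reducers is a list of functions that each take an instance and return a reduced instance, and merge combines two instances into one. For simplicity the reducers run one after another here instead of in parallel.

```python
def run_rounds(instance, reducers, merge, max_rounds=10):
    """Reduce and merge repeatedly until the instance stops shrinking."""
    for _ in range(max_rounds):
        reduced = [reduce_fn(instance) for reduce_fn in reducers]
        merged = reduced[0]
        for other in reduced[1:]:
            merged = merge(merged, other)
        if len(merged) == len(instance):
            return merged  # no node was removed in this round
        instance = merged
    return instance
```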
Conclusion and a small disclaimer
Most people who deal with such problems call this sort of thing preprocessing. I call it a “reducer network”, mainly because it sounds cooler, but also because I think there might be a novel idea here: running a host of algorithms in a distributed environment to perform the preprocessing while emitting and accepting gossip. Of course this is very similar to the ideas behind Google MapReduce and similar services, and might be exactly the same thing. I just felt the need to document my thought process, and this post was created 🙂
This blog post is based on ideas and thoughts I had while reading “The Algorithm Design Manual” by Skiena (great book). The thoughts are just that, thoughts.