## How to read computer science papers

Situation: You have a large pile of computer science papers in front of you. You want to read them all. What to do?

My suggestion is that you read the two guides below. They are really short and helpful. I’m one year into my CS PhD, and I still find reading a large pile of papers to be quite hard, especially if the papers explore problems in a field that I’m not super familiar with.

Here’s a game I like to play. Select two Wikipedia pages at random, and find a route from one to the other. I once stated a theorem that:

you can get from any page on Wikipedia to the page on pollination in 7 steps or fewer. (It was actually another page, but let’s say it was pollination.)

I devised a method for doing this using Google search. Let’s call the random page s, and the page you want to reach t, e.g. pollination. A given page p on Wikipedia has a set of incoming links (other pages that link to p), and a set of outgoing links (other pages that p links to). Let’s call these two sets in[p] and out[p]. They contain the direct ancestors and the direct descendants of p, respectively.
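To make the two sets concrete, here is a minimal sketch that builds in[p] and out[p] from a list of (source, target) links. The function name `link_sets` and the three-page toy graph are made up for illustration; this is not real Wikipedia data, and it is not the Google-based method itself.

```python
from collections import defaultdict

def link_sets(links):
    """Build in[p] and out[p] from (source, target) link pairs."""
    in_links = defaultdict(set)
    out_links = defaultdict(set)
    for src, dst in links:
        out_links[src].add(dst)  # src links to dst
        in_links[dst].add(src)   # dst is linked to by src
    return in_links, out_links

# Hypothetical toy link graph (illustration only).
links = [("Bee", "Pollination"), ("Flower", "Pollination"), ("Bee", "Flower")]
in_links, out_links = link_sets(links)
```

In this toy graph, `in_links["Pollination"]` holds the direct ancestors of the target page, and `out_links["Bee"]` holds the direct descendants of a possible start page.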

## If only more people wrote like Lamport

I’m halfway through “The Part-Time Parliament”, Leslie Lamport’s article on Paxos. It describes a three-phase consensus protocol by telling the story of a (fictional) parliament on the Greek island of Paxos, complete with fat priests, busy businessmen, and a vivid description of the parliament hall. The story illustrates how the members of the parliament, who would walk in and out at any time, could pass decrees and agree on what had been agreed upon, decree-wise.

## Good Indian computer science videos on YouTube

While browsing the web for good videos to help me land a cool job at a high-profile tech firm, I came across this series from an Indian university.

#### Lecture – 16 Disk Based Data Structures

You should be able to easily find the other videos in the series through this one. Generally the subjects that are covered relate to data structures and algorithms:

• Trees (Red-Black, B, AVL)
• Hashing
• Heaps
• Sorting

The videos are very practical and relate the data structures to scenarios where they would be used, like for bank transactions etc.

## Summary of thoughts

Given an instance of a problem (think Max Clique, Traveling Salesman or another hard graph problem)…

Thought 1:

Compute an instance that is “easier” but has the same optimal solution. This is done by a “reducer algorithm”.

Thought 2:

Reducer algorithms may run in parallel.

Thought 3:

Reducer algorithms may be different.

Thought 4:

Reducer algorithms can “gossip” with each other during execution. Gossip helps an algorithm by yielding new information about the problem being solved.

Thought 5:

Gossip is either a suboptimal solution or a reduced problem instance. This information can be used as a lower bound, or in other ways.

Thought 6:

“Merger algorithms” can combine problem instances from different reducer algorithms into one.

A full example of reducing and merging: Maximum Clique Problem.

Here is an instance of the Maximum Clique Problem, in this case a non-planar graph. By the way, planar graphs are boring because they can only contain cliques of size 4 or smaller.

Let’s see what could happen when running two different reducers (reducer 1 and reducer 2) on this problem instance, and then merging the returned instances.

Reducer 1 works by randomly finding a clique in the graph, and repeatedly deleting nodes that have degree less than the size of the clique. The clique found is emitted as a gossip message (reducer 2 will use this as a lower bound).
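The deletion step of reducer 1 can be sketched roughly as follows. This is only an illustration under a few assumptions: the graph is an undirected adjacency dict without self-loops, the clique size is passed in rather than found at random, and the names `make_graph` and `prune_by_degree` are made up for this sketch.

```python
def make_graph(edges):
    """Build an undirected adjacency dict from a list of edges."""
    g = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g

def prune_by_degree(graph, clique_size):
    """Repeatedly delete nodes whose degree is less than clique_size."""
    g = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    queue = [v for v in g if len(g[v]) < clique_size]
    while queue:
        v = queue.pop()
        if v not in g:
            continue  # already deleted earlier
        for u in g.pop(v):
            g[u].discard(v)
            if len(g[u]) < clique_size:
                queue.append(u)  # deleting v may expose u
    return g

# Toy instance: a 4-clique a-b-c-d with a tail a-e-f hanging off it.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("a", "e"), ("e", "f")]
reduced = prune_by_degree(make_graph(edges), clique_size=3)
```

Deleting a node lowers the degree of its neighbours, which is why the deletion has to be repeated until nothing changes. A node of degree less than k cannot sit in a clique with more than k nodes, so any clique bigger than the one already found survives.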

Here is the result of running reducer 1:

Let’s look at reducer 2. While running, reducer 2 could receive a gossip message from reducer 1 saying that a clique of size 4 has been found. Reducer 2 could use this as a lower bound. Reducer 2 targets nodes of degree around the lower bound. It works (slowly) by examining the targeted node to find out if it is part of a clique. If not, it is deleted from the graph.
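One possible reading of reducer 2, as a brute-force sketch. The assumptions here are mine: “part of a clique” is taken to mean part of a clique with more nodes than the lower bound, “around the lower bound” is modelled with a hypothetical `slack` parameter, and the function names are invented for the example.

```python
from itertools import combinations

def in_clique_of_size(graph, v, k):
    """True if v belongs to some clique with k nodes (brute force, slow)."""
    for others in combinations(graph[v], k - 1):
        if all(b in graph[a] for a, b in combinations(others, 2)):
            return True
    return False

def examine_and_delete(graph, lower_bound, slack=1):
    """Delete targeted nodes that cannot beat the gossiped lower bound."""
    g = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    targets = [v for v in g if abs(len(g[v]) - lower_bound) <= slack]
    for v in targets:
        if not in_clique_of_size(g, v, lower_bound + 1):
            for u in g.pop(v):
                g[u].discard(v)
    return g

# Toy instance: a 4-clique a-b-c-d sharing node d with a triangle d-e-f.
toy = {
    "a": {"b", "c", "d"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
# Suppose the gossip says a clique of size 3 has been found.
smaller = examine_and_delete(toy, lower_bound=3)
```

The check is exponential in the degree of the examined node, which fits the “slowly” above and is one reason to only target nodes whose degree is close to the lower bound.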

This could be the result of running reducer 2 (and accepting gossip from reducer 1):

In this made-up example, reducer 1 managed to remove more nodes than reducer 2, but the point is that they removed different nodes.

Running the merger (which computes the intersection) on the two reduced instances yields this:
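A merger that computes the intersection could look roughly like this, assuming both reduced instances are adjacency dicts over the same node names. The name `merge_instances` and the two tiny graphs are hypothetical.

```python
def merge_instances(g1, g2):
    """Keep only the nodes and edges present in both reduced instances."""
    common = g1.keys() & g2.keys()
    return {v: g1[v] & g2[v] & common for v in common}

# Two made-up reduced instances that disagree on which nodes to keep.
g1 = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
g2 = {"b": {"c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
merged = merge_instances(g1, g2)
```

The intersection is safe when every reducer only deletes nodes that cannot be part of an optimal solution: a node that either reducer ruled out can be dropped from the merged instance as well.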

Yay, an even smaller instance. But while we have the reducers up and running, why not restart reducer 1 with this instance as input? Let’s see what we get.

This looks pretty good. The graph contains only 23 nodes, which is approximately half of the original graph, and we got there by discovering a relatively small clique of size 4 (compared to the big one of size 7).

Conclusion and a small disclaimer

Most people who deal with such problems call this sort of thing preprocessing. I call it a “reducer network”, mainly because it sounds cooler, but also because I think there might be a novel idea here: running a host of algorithms in a distributed environment to perform the preprocessing while emitting and accepting gossip. Of course this is very similar to the ideas behind Google MapReduce and similar services, and might be exactly the same thing. I just felt the need to document my thought process, and this post was created 🙂

This blog post is based on ideas and thoughts I had while reading “The Algorithm Design Manual” by Skiena (great book). The thoughts are just that, thoughts.