Coding Project
Author: Matt Kenney
To find the longest path in a directed, cyclic graph, $G = (V, E)$ (as asked in Coding Assignment 4 part e), I implemented an algorithm which generates a number of permutations of DAG reductions, $G{\pi}$, of the graph $G$, and finds the longest path in each of these DAGs. As described in these MIT lecture notes on using Random DAGs to compute the NP-hard longest path problem we've been assigned, there is a $\frac{1}{k!}$ chance that a path of length $k$ in $G$ is also present in a given random topological reduction of $G$, $G{\pi}$. If we convert $G$ to a DAG using $k!$ different permutations of the topological ordering $\pi$, then we have a strong chance that we will identify whether or not a k-length path exists in $G$. Therefore, below, we attempt up to $n!$ ($n = |V|$) DAG reductions of G and find the longest path in each of these reductions, $G{\pi}$. We stop when we reach the time limit (10 mins), since, in reality, computing the longest path in $n!$ DAG reductions of $G$ is not feasible in the case of the FaceBook graph, which has $n > 4000$. To compute random DAG reductions, we simply call $\textbf{DFS}$ with different random orderings, $\sigma$, of the vertices $v \in V$, to obtain different topological orderings, $\pi$, in this way. To obtain topological reductions, as described in the lecture notes linked above, we convert $G$ into a DAG, $G{\pi}$, by running $\textbf{DFS}(G, \sigma)$, where $\sigma$ is a random ordering of the vertices in $G$. to optain a topological ordering, $\pi$. We then remove all edges from $G$ which do not satisfy the topological order. Lets denote the index of vertex $i$ in the ordering of $\pi$ as $\pi(i)$ (so if $\sigma = a, b, c$, then $\pi(a) = 1$, $\pi(b) = 2$, and $\pi(c) = 3$). Specically, then, to create the random DAG reduction $G_{\pi}$, we retain only the edges $(i, j) \in G$ such that $\pi(i) < \pi(j)$.
To compute the longest path in each our our random DAG permutations, I used the lowest cost path in a DAG algorithm, but interpreted edge weights as $c(e) = -1 ~~ \forall e \in E$ so that in fact the longest path in the DAG is returned. Since we have obtained the topological ordering $\pi$ already for each of these DAGs, we simply search for the longest path starting at $\pi0$, or the first known source of the DAG. In order to implement this algorithm, I used a GPU accelerated approach using the cuGraph
library, which implements the longest path in a DAG problem in a time complexity that is a function of $\log n$ by leveraging parallelism. This is in contrast to the $O(|V| + |E|)$ time complexity of the standard serial algorithm. See more information on the time complexity of Parallel Single-Source Shortest Path here. This parallelism adaptation makes finding the longest path in the Facebook graph much more feasible. See the implementation below.
Despite the speedup obtained in the parallel shortest path algorithm, I do recognize that this algorithm remains bounded by DFS speed. If I get the time, I'd like to either 1) Implement the algorithm for Monte-Carlo type randomized graph "coloring", as described Here, which runs in around $4^n$ time, or 2) work on implementing a parallel approach to finding a topological ordering, so that I can remove the bottleneck posed by that step. In any case, though, getting to use GPU libraries (and install them, which is much trickier than expected) and working to implement this algorithm were both valuable learning experiences.
DFS
Below is the DFS iterative implementation I wrote for my coding group. It is just a 'helper' function in this larger project
Get Random DAG algorithm
As described in the introduction, this algorithm computes random DAG reductions of the passed graph, G
Longest Path algorithm from cuGraph
Here, we use the algorithm implemented in cuGraph to find the shortest path in a DAG. More precisely, though, this algorithm finds the lowest cost path in a DAG, where the cost of a path is the sum of the cost of the edges in the path i.e. $c(p) = \sum{e \in p} c(e)$. If we set all edges in some DAG, $G{\pi}=(V, E)$, such that $c(e) = -1 ~~ \forall e \in E $, then the lowest cost path becomes the longest path, and the cost of this path is exactly the length of the longest path in $G_{\pi}$.