A simple algorithm to solve small instances of the Dominating Set problem on GPUs

The Dominating Set Problem is an NP-complete problem. The problem has been studied extensively and efforts have been conducted to explore the performance gains that can be obtained by solving it on parallel computers.

Graphical Processing Units (GPUs) have been increasing in speed, capacity and number of processing units at very impressive rates and it seems reasonable to assume that they will continue to do so in the near future. GPUs can be programmed for General Purpose (GP GPU) computing.

This site describes an algorithm to solve small instances of the Dominating Set Problem on GPUs.

A dominating set for a graph G = (V, E) is a subset D of V such that every vertex not in D is joined to at least one member of D by one edge. One is interested in finding the dominating set with the minimum number of elements.

In the description of this algorithm it will be assumed that the graph will be represented using an adjacency matrix.

Given a graph with a set of n vertices, one way to find the Dominating Set with the minimum number of elements is to examine each and every one of the 2n subsets and to choose, among the dominating sets, the one (or ones) with the minimum number of elements. See the "Exact Algorithms" section in the Wikipedia entry about the Dominating Set .

When programming a GPU, an assumption that is frequently made is that one is operating on an array of elements. Usually this array contains a large number of elements. When writing the code that will be executed in each of the processors on the GPU, one uses a function that allows the code to find the ``virtual'' identity of the processor and this identity is then used as an index on the large array upon which the computation is taking place. This approach can be used even if one actually does not allocate an array. One identifies the size of the grid used by the algorithm and one thread will be started for every element of that grid. Each thread will have a unique id.

Given a graph with n vertices, one initial approach is to state that one will execute 2n different threads. Clearly, this imposes a limitation on the size of the graphs for which this algorithm will be applicable.

The code that will be executed on each thread will find its identity, processorId, and will treat its identity as an encoding of a subset of the n vertices: The ones in the binary encoding of processorId will correspond to the nodes in a particular subset. One uses then the adjacency matrix to verify if this particular subset is a dominating set. If it is, an atomicMin operation is performed between the number of elements in this subset and a global variable that contains the smallest size of a dominating set that has been found so far.

If this particular subset is not a dominating set, the algorithm does not attempt to update the global variable. Once all the threads have finished executing, the size of the smallest dominating set is in the global variable.

Consider the following example. The graph is illustrated in Figure 1. The graph has three nodes: 0,1 and 2. Nodes 0 and 1 are connected. Node 2 is not. In this case n, the number of nodes, in the graph is 3. If one calculates all the possible subsets, there are 2 3 = 8 possible distinct subsets. The following table shows the eight subsets and indicates which of those subsets are dominating sets and their corresponding sizes.

 Index Binary Encoding Elements Dominating Set Size of the Dominating Set 0 000 {} No N.A. 1 001 {0} No N.A. 2 010 {1} No N.A. 3 011 {0,1} No N.A. 4 100 {2} No N.A. 5 101 {0,2} Yes 2 6 110 {1,2} Yes 2 7 111 {0,1,2} Yes 3

From the table, one observes that there are three subsets that are dominating sets, and two of them have a size of 2. Every thread in the code will evaluate a different row in the table. It will find if this particular subset is a dominating set. If it is, it will perform an atomic operation against a global variable that contains the size of the smallest dominating set found so far. At the end of the execution of all the threads, that global variable will contain the size of the smallest dominating set.

The files that implement this algorithm are available here, along with several test files. The file that reads the adjacency matrix reads from the standard input. Use input redirection if you want to use the test files.

• calcDominatingSet.cu The code in CUDA
• cutil.h A header file for the CUDA code.