In this post, I like to share a simple idea on how to solve to optimality some hard instances of the Graph Coloring problem. This simple idea yields a “new time” record for a couple of hard instances.
To date, the best exact approach to solve Graph Coloring is based on BranchandPrice [1, 2, 3]. The branchandprice method is completely different from the Constraint Programming approach I discussed in a previous post. A key component of BranchandPrice is the column generation phase, which is intuitively quite simple, but mathematically rather involved for a short blog post.
Here, I want to show you that a modern Mixed Integer Programming (MIP) solver, such as Gurobi or CPLEX, can solve a few hard instances of graph coloring with the following “null implementation effort”:
 Enumerate all possible columns
 Build a .mps instance with those columns
 Use a MIP solver to solve the .mps instance
Indeed, in this post we try to answer to the following question:
Is there any hope to solve any hard graph coloring instances with this naive approach?
Formulation
Given an undirected graph and a set of colors , the minimum (vertex) graph coloring problem consists of assigning a color to each vertex, while every pair of adjacent vertices gets a different color. The objective is to minimize the number of colors used in a solution.
The branchandprice approach to graph coloring is based on a set covering formulation. Let be the collection of all the maximal stable sets of , and let be the maximal stable sets that contain the vertex . Let be a 01 variable equal to 1 if all the vertices in the maximal stable set get assigned the same color. Hence, the set covering model is:
Indeed, we “cover” every vertex of with the minimal number of maximal stable sets. The issue with this model is the total number of maximal stable sets in , which is exponential in the number of vertices of G.
Column Generation is a “mathematically elegant” method to bypass this issue: it lets you to solve the set covering model by generating a very small subset of the elements in . This happens by repeatedly solving an auxiliary problem, called the pricing subproblem. For graph coloring, the pricing subproblem consists of a Maximum Weighted Stable Set problem. If you are interested in Column Generation, I recommend you to look at the first chapter of the Column Generation book, which contains a nice tutorial on the topic, and I would strongly recommend reading the nice survey “Selected Topics in Column Generation”, [4].
How many maximal stable sets are in a hard graph coloring instance?
If this number were not so high, we could enumerate all the stable sets in and attempt to directly solve the set covering model without resorting to column generation. However, “high” is a subjective measure, so let me do some computations on my laptop and give you some precise numbers.
Hard instances
Among the DIMACS instances of Graph Coloring, there are a few instances proposed by David Johnson, which are still unsolved (in the sense that we have not a computational proof of optimality of the best known upper bounds).
The table below shows the dimensions of these instances. The name of instances are DSJC{n}.{d}, where {n} is the number of vertices and {d} gives the density of the graph (e.g., DSJC125.9 has 125 vertices and 0.9 of density).
Graph  Nodes  Edges  Max stable sets  Enumeration Time 

DSJC125.9  125  6,961  524  0.00 
DSJC250.9  250  27,897  2,580  0.01 
DSJC500.9  500  112,437  14,560  0.12 
DSJC1000.9  1,000  449,449  100,389  2.20 
DSJC125.5  125  3,891  43,268  0.53 
DSJC250.5  250  15,668  1,470,363  43.16 
DSJC500.5  500  62,624  ?  out of memory 
DSJC1000.5  1,000  249,826  ?  out of memory 
DSJC125.1  125  736  ?  out of memory 
DSJC250.1  250  3,218  ?  out of memory 
DSJC500.1  500  12,458  ?  out of memory 
DSJC1000.1  1,000  49,629  ?  out of memory 
As you can see the number of maximal stable sets (i.e. the cardinality of )
of several instances is not so high, above all for very dense graphs, where the number of stables set is less than the number of edges. However, for sparse graphs, the number of maximal stable sets is too large for the memory available in my laptop.
Now, let me restate the main question of this post:
Can we enumerate all the maximal stable sets of and use a MIP solver such as Gurobi or CPLEX to solve any Johnson’s instance of Graph Coloring?
Results
I have written a small script which uses Cliquer to enumerate all the maximal stable sets of a graph, and then I generate an .mps instance for each of the DSJC instance where I was able to store all maximal stable sets. The .mps file are on my public GitHub repository for this post.
The table below shows some numbers for the sparse instances obtained using Gurobi (v6.0.0) with a timeout of 10 minutes on my laptop. If you compare these numbers with the results published in the literature, you can see that they are not bad at all.
Believe me, these number are not bad at all, and establish a new TIME RECORD.
For example, the instance DSJC250.9 was solved to optimality only recently in 11094 seconds by [3], while the column enumeration approach solves the same instance on a similar hardware in only 23 seconds (!), and, honestly, our work in [2] did not solve this instance to optimality at all.
Graph  Best known  Enum. Time  Run time  LB  UB  Time [2]  LB[2]  UB [2] 

DSJC125.9  44  0.00  0.44  44  44  44  44  44 
DSJC250.9  72  0.01  23  72  72  timeout  71  72 
DSJC500.9  128  0.12  timeout  123  128  timeout  123  136 
DSJC1000.9  222  2.20  timeout  215  229  timeout  215  245 
DSJC125.5  17  0.53  70.6  17  17  19033  17  17 
DSJC250.5  28  43.16  timeout  26  33  timeout  26  31 
Can we ever solve to optimality DSJC500.9 and DSJC1000.9 via Column Enumeration?
I would say:
“Yes, we can!”
… but likely we need to be smarter while branching on the decision variables, since the default branching strategy of a generic MIP solver does not exploit the structure of the problem. If I had the time to work again on Graph Coloring, I would likely use the same branching scheme used in [2], where we combined a Zykov’s branching rule with a randomized iterative deepening depthfirst search (randomised because at each restart we were using a different initial pool of columns). Another interesting option would be to tighten the set covering formulation with valid inequalities, by starting with those studied in [5].
In conclusion, I believe that enumerating all columns can be a simple but good starting point to attempt to solve to optimality at least the instances DSJC500.9 and DSJC1000.9.
Do you have some spare time and are you willing to take up the challenge?
References

A Mehrotra, MA Trick. A column generation approach for graph coloring. INFORMS Journal on Computing. Fall 1996 vol. 8(4), pp.344354. [pdf]

S. Gualandi and F. Malucelli. Exact Solution of Graph Coloring Problems via Constraint Programming and Column Generation. INFORMS Journal on Computing. Winter 2012 vol. 24(1), pp.81100. [pdf] [preprint]

S. Held, W. Cook, E.C. Sewell. Maximumweight stable sets and safe lower bounds for graph coloring. Mathematical Programming Computation. December 2012, Volume 4, Issue 4, pp 363381. [pdf]

M. Lubbecke and J. Desrosiers. Selected topics in column generation. Operations Research. 2005, Volume 53, Issue 6, pp 10071023. [pdf]

Set covering and packing formulations of graph coloring: algorithms and first polyhedral results. Discrete Optimization. 2009, Volume 6, Issue 2, pp 135147. [pdf]