As a benchmark, I grabbed a large text file from P. Norvig’s website, which is 6,488,666 bytes long.
The final answer? Yes, mispredicted branches have a huge impact in Python too.
The hidden answer? Python dictionaries never cease to surprise me: they are REALLY efficient.
NOTE: The following code snippets were executed in a Python 3.5 notebook on a Windows 10 machine running 64-bit Anaconda Python 3.5. You can find my notebook on my Blog GitHub repo. Don’t ask me why, but this blog entry is better visualized directly on GitHub.
UPDATE: Well, most of the time I would use my first implementation based on the filter built-in function, and I would try alternative implementations only after a profiler has shown that removing blanks is a true bottleneck of my whole program. As written in the title, this post is meant as a basic exercise in Python.
In Python, I prefer to write as much code in functional style as possible, relying on the 3 basic functions:
Therefore, after a few preliminaries, here is my first code snippet:
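A minimal sketch of such a filter-based implementation. The function name is the one that appears in the profiler output; the set of blanks (space, newline, carriage return) is my assumption, suggested by the three `replace` calls in the last profile of this post.

```python
# Assumption: a "blank" is a space, a newline, or a carriage return.
BLANKS = {' ', '\n', '\r'}


def RemoveBlanksFilter(text):
    """Keep only the characters for which the lambda returns True."""
    # The lambda is called once per input character.
    return ''.join(filter(lambda c: c not in BLANKS, text))
```

The lambda is invoked for every character of the input, which is exactly what shows up in the `ncalls` column of the profile.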
6488671 function calls in 1.956 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.955 1.955 <ipythoninput3eeb7d3495697>:1(RemoveBlanksFilter)
6488666 0.870 0.000 0.870 0.000 <ipythoninput3eeb7d3495697>:2(<lambda>)
1 0.000 0.000 1.956 1.956 <string>:1(<module>)
1 0.000 0.000 1.956 1.956 {builtin method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 1.085 1.085 1.955 1.955 {method 'join' of 'str' objects}
Wow, I didn’t realize that I would have to call the lambda function for every single byte of my input file. This is clearly too much overhead.
Let me drop my functional style, and write a plain old for-loop:
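A sketch of the loop-based version, under the same assumption on what counts as a blank. The name comes from the profiler output, which also shows one `append` call per kept character.

```python
# Assumption: a "blank" is a space, a newline, or a carriage return.
BLANKS = {' ', '\n', '\r'}


def RemoveBlanks(text):
    """Copy every non-blank character into a list, then join the list."""
    kept = []
    for c in text:
        if c not in BLANKS:
            kept.append(c)  # one method call per kept character
    return ''.join(kept)
```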
Is test passed: True
5452148 function calls in 1.566 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.210 1.210 1.553 1.553 <ipythoninput65e45e3056bc2>:1(RemoveBlanks)
1 0.012 0.012 1.566 1.566 <string>:1(<module>)
1 0.000 0.000 1.566 1.566 {builtin method builtins.exec}
5452143 0.310 0.000 0.310 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.033 0.033 0.033 0.033 {method 'join' of 'str' objects}
Mmm… we have just shifted the problem to the list append function calls. Maybe we can do better by working in place.
Well, almost in place: Python strings are immutable; therefore, we first copy the string into a list, and then we work in place over the copied list.
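A sketch of the "almost in place" idea: kept characters are written back into the front of the copied list, so no `append` is needed. The function name is taken from the profiler output; the blank set is again my assumption.

```python
# Assumption: a "blank" is a space, a newline, or a carriage return.
BLANKS = {' ', '\n', '\r'}


def RemoveBlanksInPlace(text):
    """Compact the non-blank characters at the front of a list copy."""
    buf = list(text)  # strings are immutable, so work on a copy
    pos = 0           # next free slot; never ahead of the read position
    for c in buf:
        if c not in BLANKS:
            buf[pos] = c
            pos += 1
    return ''.join(buf[:pos])
```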
Is test passed: True
5 function calls in 1.158 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.113 1.113 1.145 1.145 <ipythoninput999d36ae6359e>:1(RemoveBlanksInPlace)
1 0.013 0.013 1.158 1.158 <string>:1(<module>)
1 0.000 0.000 1.158 1.158 {builtin method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.032 0.032 0.032 0.032 {method 'join' of 'str' objects}
OK, working in place does have an impact. Let me get to the real point: avoiding mispredicted branches.
As in the original blog post:
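A sketch of the branch-free variant: every character is written unconditionally, and a lookup table decides whether the write position advances. The table layout, the 256-entry size (so all characters are assumed to have code points below 256) and the blank set are my assumptions; the function name and the 256 `chr` and `append` calls match the profiler output.

```python
# Assumption: a "blank" is a space, a newline, or a carriage return.
BLANKS = {' ', '\n', '\r'}


def RemoveBlanksNoBranch(text):
    """Remove blanks without an if-statement in the main loop."""
    # jump[i] is 0 if code point i is a blank, 1 otherwise.
    jump = []
    for i in range(256):
        jump.append(0 if chr(i) in BLANKS else 1)

    buf = list(text)
    pos = 0
    for c in buf:
        buf[pos] = c          # always write ...
        pos += jump[ord(c)]   # ... but advance only for non-blanks
    return ''.join(buf[:pos])
```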
Is test passed: True
6489183 function calls in 1.474 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.235 1.235 1.460 1.460 <ipythoninput121bd75a3de21d>:1(RemoveBlanksNoBranch)
1 0.014 0.014 1.474 1.474 <string>:1(<module>)
256 0.000 0.000 0.000 0.000 {builtin method builtins.chr}
1 0.000 0.000 1.474 1.474 {builtin method builtins.exec}
6488666 0.192 0.000 0.192 0.000 {builtin method builtins.ord}
256 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.033 0.033 0.033 0.033 {method 'join' of 'str' objects}
Ouch!!! The numbers are getting even worse! Why? Well, ‘ord’ is a function, so we are back to paying the overhead of one function call per character. Can we do better by using a dictionary instead of an array?
Let me use a dictionary in order to avoid the ‘ord’ function calls.
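A sketch of the dictionary-based variant: the table is keyed directly by the character, so the main loop needs no `ord` call. As before, the 256-entry table (characters with code points below 256 only) and the blank set are my assumptions; the function name comes from the profiler output.

```python
# Assumption: a "blank" is a space, a newline, or a carriage return.
BLANKS = {' ', '\n', '\r'}


def RemoveBlanksNoBranchDict(text):
    """Branch-free removal, with a dict lookup instead of ord()."""
    # jump[c] is 0 if character c is a blank, 1 otherwise.
    jump = {}
    for i in range(256):
        c = chr(i)
        jump[c] = 0 if c in BLANKS else 1

    buf = list(text)
    pos = 0
    for c in buf:
        buf[pos] = c
        pos += jump[c]  # plain subscript: no Python-level function call
    return ''.join(buf[:pos])
```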
Is test passed: True
261 function calls in 0.771 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.724 0.724 0.758 0.758 <ipythoninput1546ad4c3f0b26>:1(RemoveBlanksNoBranchDict)
1 0.013 0.013 0.771 0.771 <string>:1(<module>)
256 0.000 0.000 0.000 0.000 {builtin method builtins.chr}
1 0.000 0.000 0.771 0.771 {builtin method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.034 0.034 0.034 0.034 {method 'join' of 'str' objects}
Oooh, yes! Now we can see that without mispredicted branches we can really speed up our algorithm.
Is this the best pythonic solution? No, surely not, but still it is an interesting remark to keep in mind when coding.
Likely, the simplest pythonic solution is just to use the ‘replace’ string function as follows:
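A sketch of the built-in approach. The profile reports three calls to `replace`, so I assume one call per blank character (space, newline, carriage return); the function name is taken from the profiler output.

```python
def RemoveBlanksBuiltin(text):
    """Remove blanks with three chained str.replace calls."""
    # Assumption: a "blank" is a space, a newline, or a carriage return.
    return text.replace(' ', '').replace('\n', '').replace('\r', '')
```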
Is test passed: True
7 function calls in 0.065 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 0.064 0.064 <ipythoninput1858fd6655cfba>:1(RemoveBlanksBuiltin)
1 0.001 0.001 0.065 0.065 <string>:1(<module>)
1 0.000 0.000 0.065 0.065 {builtin method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
3 0.063 0.021 0.063 0.021 {method 'replace' of 'str' objects}
Here we are: the best solution is indeed to use a built-in function whenever possible, even if this was not the real aim of this exercise.
Please, let me know if you have some comments or a different solution in Python.
In this post, I would like to share a simple idea on how to solve to optimality some hard instances of the Graph Coloring problem. This simple idea yields a “new time” record for a couple of hard instances.
To date, the best exact approach to solve Graph Coloring is based on Branch-and-Price [1, 2, 3]. The branch-and-price method is completely different from the Constraint Programming approach I discussed in a previous post. A key component of Branch-and-Price is the column generation phase, which is intuitively quite simple, but mathematically rather involved for a short blog post.
Here, I want to show you that a modern Mixed Integer Programming (MIP) solver, such as Gurobi or CPLEX, can solve a few hard instances of graph coloring with the following “null implementation effort”:
Indeed, in this post we try to answer the following question:
Is there any hope to solve any hard graph coloring instances with this naive approach?
Given an undirected graph $G=(V,E)$ and a set of colors $K$, the minimum (vertex) graph coloring problem consists of assigning a color to each vertex, while every pair of adjacent vertices gets a different color. The objective is to minimize the number of colors used in a solution.
The branch-and-price approach to graph coloring is based on a set covering formulation. Let $\mathcal{S}$ be the collection of all the maximal stable sets of $G$, and let $\mathcal{S}_i \subseteq \mathcal{S}$ be the maximal stable sets that contain the vertex $i$. Let $\lambda_s$ be a 0-1 variable equal to 1 if all the vertices in the maximal stable set $s$ get assigned the same color. Hence, the set covering model is:

$$
\begin{aligned}
\min \quad & \sum_{s \in \mathcal{S}} \lambda_s \\
\text{s.t.} \quad & \sum_{s \in \mathcal{S}_i} \lambda_s \geq 1, \quad \forall i \in V, \\
& \lambda_s \in \{0,1\}, \quad \forall s \in \mathcal{S}.
\end{aligned}
$$

Indeed, we “cover” every vertex of $V$ with the minimal number of maximal stable sets. The issue with this model is the total number of maximal stable sets in $G$, which is exponential in the number of vertices of $G$.
Column Generation is a “mathematically elegant” method to bypass this issue: it lets you solve the set covering model by generating a very small subset of the elements in $\mathcal{S}$. This happens by repeatedly solving an auxiliary problem, called the pricing subproblem. For graph coloring, the pricing subproblem consists of a Maximum Weighted Stable Set problem. If you are interested in Column Generation, I recommend the first chapter of the Column Generation book, which contains a nice tutorial on the topic, and I strongly recommend the nice survey “Selected Topics in Column Generation” [4].
How many maximal stable sets are in a hard graph coloring instance?
If this number were not so high, we could enumerate all the maximal stable sets in $G$ and attempt to directly solve the set covering model without resorting to column generation. However, “high” is a subjective measure, so let me do some computations on my laptop and give you some precise numbers.
Among the DIMACS instances of Graph Coloring, there are a few instances proposed by David Johnson which are still unsolved (in the sense that we do not have a computational proof of optimality of the best known upper bounds).
The table below shows the dimensions of these instances. The names of the instances are DSJC{n}.{d}, where {n} is the number of vertices and {d} gives the density of the graph (e.g., DSJC125.9 has 125 vertices and a density of 0.9).
| Graph | Nodes | Edges | Max stable sets | Enumeration Time |
|---|---|---|---|---|
| DSJC125.9 | 125 | 6,961 | 524 | 0.00 |
| DSJC250.9 | 250 | 27,897 | 2,580 | 0.01 |
| DSJC500.9 | 500 | 112,437 | 14,560 | 0.12 |
| DSJC1000.9 | 1,000 | 449,449 | 100,389 | 2.20 |
| DSJC125.5 | 125 | 3,891 | 43,268 | 0.53 |
| DSJC250.5 | 250 | 15,668 | 1,470,363 | 43.16 |
| DSJC500.5 | 500 | 62,624 | ? | out of memory |
| DSJC1000.5 | 1,000 | 249,826 | ? | out of memory |
| DSJC125.1 | 125 | 736 | ? | out of memory |
| DSJC250.1 | 250 | 3,218 | ? | out of memory |
| DSJC500.1 | 500 | 12,458 | ? | out of memory |
| DSJC1000.1 | 1,000 | 49,629 | ? | out of memory |
As you can see, the number of maximal stable sets (i.e. the cardinality of $\mathcal{S}$) of several instances is not so high, above all for very dense graphs, where the number of stable sets is less than the number of edges. However, for sparse graphs, the number of maximal stable sets is too large for the memory available on my laptop.
Now, let me restate the main question of this post:
Can we enumerate all the maximal stable sets of $G$ and use a MIP solver such as Gurobi or CPLEX to solve any of Johnson’s instances of Graph Coloring?
I have written a small script which uses Cliquer to enumerate all the maximal stable sets of a graph, and then I generate an .mps instance for each of the DSJC instances for which I was able to store all the maximal stable sets. The .mps files are on my public GitHub repository for this post.
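To make the column enumeration idea concrete, here is a toy sketch in pure Python. It is not the script mentioned above: instead of Cliquer it uses a plain Bron-Kerbosch recursion on the complement graph, and instead of a MIP solver it finds a minimum cover by brute force, so it only works on tiny graphs. All function names are hypothetical.

```python
from itertools import combinations


def maximal_stable_sets(n, edges):
    """Enumerate the maximal stable sets of a graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Non-neighbours of v, i.e. its neighbours in the complement graph.
    co = {v: set(range(n)) - adj[v] - {v} for v in range(n)}
    found = []

    def expand(r, p, x):
        # r: current stable set, p: candidates, x: already explored.
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & co[v], x & co[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return found


def min_set_cover(n, sets):
    """Smallest sub-collection of sets covering 0..n-1 (brute force)."""
    universe = set(range(n))
    for k in range(1, len(sets) + 1):
        for combo in combinations(sets, k):
            if set().union(*combo) == universe:
                return list(combo)
    return []


# Toy instance: a cycle on 5 vertices.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
columns = maximal_stable_sets(5, c5)
cover = min_set_cover(5, columns)
```

Each stable set in the cover corresponds to one color class; if a vertex is covered more than once, it can keep any one of the colors.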
The table below shows some results obtained using Gurobi (v6.0.0) with a timeout of 10 minutes on my laptop, for the instances whose maximal stable sets I was able to enumerate. If you compare these numbers with the results published in the literature, you can see that they are not bad at all: for a couple of instances they establish a new TIME RECORD.
For example, the instance DSJC250.9 was solved to optimality only recently in 11094 seconds by [3], while the column enumeration approach solves the same instance on similar hardware in only 23 seconds (!). Honestly, our work in [2] did not solve this instance to optimality at all.
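| Graph | Best known | Enum. Time | Run time | LB | UB | Time [2] | LB [2] | UB [2] |
|---|---|---|---|---|---|---|---|---|
| DSJC125.9 | 44 | 0.00 | 0.44 | 44 | 44 | 44 | 44 | 44 |
| DSJC250.9 | 72 | 0.01 | 23 | 72 | 72 | timeout | 71 | 72 |
| DSJC500.9 | 128 | 0.12 | timeout | 123 | 128 | timeout | 123 | 136 |
| DSJC1000.9 | 222 | 2.20 | timeout | 215 | 229 | timeout | 215 | 245 |
| DSJC125.5 | 17 | 0.53 | 70.6 | 17 | 17 | 19033 | 17 | 17 |
| DSJC250.5 | 28 | 43.16 | timeout | 26 | 33 | timeout | 26 | 31 |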
Can we ever solve to optimality DSJC500.9 and DSJC1000.9 via Column Enumeration?
I would say:
“Yes, we can!”
… but likely we need to be smarter while branching on the decision variables, since the default branching strategy of a generic MIP solver does not exploit the structure of the problem. If I had the time to work again on Graph Coloring, I would likely use the same branching scheme used in [2], where we combined Zykov’s branching rule with a randomized iterative deepening depth-first search (randomized because at each restart we were using a different initial pool of columns). Another interesting option would be to tighten the set covering formulation with valid inequalities, starting with those studied in [5].
In conclusion, I believe that enumerating all columns can be a simple but good starting point to attempt to solve to optimality at least the instances DSJC500.9 and DSJC1000.9.
Do you have some spare time and are you willing to take up the challenge?
[1] A. Mehrotra and M.A. Trick. A column generation approach for graph coloring. INFORMS Journal on Computing. Fall 1996, vol. 8(4), pp. 344-354. [pdf]
[2] S. Gualandi and F. Malucelli. Exact Solution of Graph Coloring Problems via Constraint Programming and Column Generation. INFORMS Journal on Computing. Winter 2012, vol. 24(1), pp. 81-100. [pdf] [preprint]
[3] S. Held, W. Cook, and E.C. Sewell. Maximum-weight stable sets and safe lower bounds for graph coloring. Mathematical Programming Computation. December 2012, vol. 4(4), pp. 363-381. [pdf]
[4] M. Lübbecke and J. Desrosiers. Selected topics in column generation. Operations Research. 2005, vol. 53(6), pp. 1007-1023. [pdf]
[5] P. Hansen, M. Labbé, and D. Schindl. Set covering and packing formulations of graph coloring: algorithms and first polyhedral results. Discrete Optimization. 2009, vol. 6(2), pp. 135-147. [pdf]
By sheer serendipity, this morning I came across three paragraphs clearly stating the importance of Big Data from a scientific standpoint, which I would like to cross-post here (the following paragraphs appear in the introduction of [1]):
In all applied fields, it is now commonplace to attack problems through data analysis, particularly through the use of statistical and machine learning algorithms on what are often large datasets. In industry, this trend has been referred to as ‘Big Data’, and it has had a significant impact in areas as varied as artificial intelligence, internet applications, computational biology, medicine, finance, marketing, journalism, network analysis, and logistics.
Though these problems arise in diverse application domains, they share some key characteristics. First, the datasets are often extremely large, consisting of hundreds of millions or billions of training examples; second, the data is often very high-dimensional, because it is now possible to measure and store very detailed information about each example; and third, because of the large scale of many applications, the data is often stored or even collected in a distributed manner. As a result, it has become of central importance to develop algorithms that are both rich enough to capture the complexity of modern data, and scalable enough to process huge datasets in a parallelized or fully decentralized fashion. Indeed, some researchers have suggested that even highly complex and structured problems may succumb most easily to relatively simple models trained on vast datasets.
Many such problems can be posed in the framework of Convex Optimization.
Given the significant work on decomposition methods and decentralized algorithms in the optimization community, it is natural to look to parallel optimization algorithms as a mechanism for solving large-scale statistical tasks. This approach also has the benefit that one algorithm could be flexible enough to solve many problems.
Even if I am not an expert of Convex Optimization [2], I do have my own mathematical optimization bias. Likely, you may have a different opinion (that I am always happy to hear), but, honestly, the above paragraphs are the best content that I have read so far about Big Data.
[1] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning. Vol. 3, No. 1 (2010) 1–122. [pdf]
[2] If you like to have a speedy overview of Convex Optimization, you may read a J.F. Puget’s blog post.
After a nice chat with Bo Jensen, CEO, founder, and co-owner (really, he is a Rocket Scientist!) at Sulum Optimization, I realised that I know barely anything.
By definition, we have that:
“Presolving is a way to transform the given problem instance into an equivalent instance that is (hopefully) easier to solve.” (see, chap. 10 in Tobias Achterberg’s Thesis)
All I know is that every MIP solver has a Presolve parameter, which can take different values. For instance, Gurobi has three possible values for that parameter (you can find more details on the Gurobi online manual):
However, I can’t tell you the real impact of that parameter on the overall solution process of a MIP instance. Thus, here we go: let me write a new post that addresses this basic question!
To measure the impact of preprocessing we need four ingredients:
Changing one of the ingredients could give you different results, but, hopefully, the big picture will not change too much.
As a solver, I have selected the current release of Gurobi (i.e., version 5.6.2). For the data set, likely the most critical ingredient, I have used the MIPLIB2003, basically because I already had all the 60 instances on my server. For running the tests I have used an old cluster from the Math Department of the University of Pavia.
The measure of impact I have decided to use (after considering other alternatives) is quite conservative: the fraction of closed instances as a function of runtime.
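As a sketch of how this measure can be computed from a list of run times (the function name and the toy numbers are hypothetical, and unsolved instances are represented by `None`):

```python
from bisect import bisect_right


def closed_fraction(runtimes, time_points):
    """Fraction of instances solved within each time point.

    runtimes: one entry per instance, the solve time in seconds,
              or None if the instance hit the time limit.
    """
    n = len(runtimes)
    solved = sorted(t for t in runtimes if t is not None)
    # bisect_right counts how many solve times are <= the time point.
    return [bisect_right(solved, tp) / n for tp in time_points]


# Toy data: four instances, one of them never solved.
fractions = closed_fraction([1.0, 5.0, None, 20.0], [0, 1, 10, 100])
```

Plotting these fractions against the time points, one curve per presolve setting, gives a cumulative plot in which a higher curve means more instances closed within the same time.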
During the last weekend, I collected a bunch of logs for the 60 instances of the MIPLIB2003 and then, using RStudio, I drew the following cumulative plot:
The picture is as simple as it is clear:
Preprocessing always pays off, and it allows solving around 10% more instances within the same time limit!
In this post, I will not discuss additional technical details, but I just want to add two observations:
Likely, the aggressive presolve setting has been decided by Gurobi using a different, much larger, and customeroriented dataset.
Indeed, preprocessing is a very important feature of a modern MIP solver such as Gurobi. Investing a few seconds before starting the branch-and-bound MIP search can save a significant amount of runtime. However, a more aggressive preprocessing strategy does not seem to pay off, on average, on the MIPLIB2003.
Unfortunately, preprocessing is somewhat neglected by the research community. There are few recent papers dealing with preprocessing (“hey! if you do have one, please, let me know about it, ok?”). Most of the papers are from the 90s and are about Linear Programming, i.e., without integer variables, which mess up everything.
Here is a list of basic questions I have in mind:
If you want to share your idea, experience, or opinion, with respect to these questions, you could comment below or send me an email.
Now, to conclude, my bonus question:
Do you have any new smart idea for improving preprocessing?
Well, if you had one, I guess you would at least write a paper about it, but please do not go for a patent!
Egon talks intersection cuts at #aussois. Still the man. pic.twitter.com/7KMcNyJYV0
— Jeff Linderoth (@JeffLinderoth) January 8, 2014
The Captain gave an inspiring talk by questioning the recursive paradigm of cutting plane algorithms. With a very basic example, Balas has shown how a non-basic vertex (solution) can produce a much deeper cut than a cut generated by an optimal basis. Around this intuition, Balas has presented a very nice generalization of Intersection Cuts… a new paper enters my “PAPERS-TO-BE-READ” folder.
To stay on the subject of cutting planes, the talk by Marco Molinaro on the first day of the workshop was really nice. He raised the fundamental question of how important sparse cuts are versus dense cuts. The importance of sparse cuts comes from linear algebra: when running the simplex method, it is better to have small determinants in the coefficient matrix of the Linear Programming relaxation in order to avoid numerical issues; sparse cuts implicitly help in keeping the determinants small (intuitively, you have more zeros in the matrix). Dense cuts play the opposite role, but they can be really important to improve the bound of the LP relaxation. In his talk, Molinaro has shown and proved, for three particular cases, when sparse cuts are enough, and when they are not. Another paper goes into the “PAPERS-TO-BE-READ” folder.
On the same day as Molinaro, the talk by Sebastian Pokutta was really inspiring: he gave a completely new (for me) perspective on Extended Formulations by using Information Theory. Sebastian is the author of a blog, and I hope he will post about his talk.
Andrea Lodi has discussed an Optimization problem that arises in Supervised Learning. For this problem, the COIN-OR solver Couenne, developed by Pietro Belotti, significantly outperforms CPLEX. The issues seem to come from a number of basic big-M (indicator) constraints. To make a long story short, if you have to solve a hard problem, it does pay off to try different solvers, since there is no “win-all” solver.
Do you have an original new idea for developing solvers? Do not be intimidated by CPLEX or Gurobi and go for it!
The presentation by Marco Senatore was brilliant and his work looks very interesting. I have particularly enjoyed the application in Public Transport that he has mentioned at the end of his talk.
I recommend having a look at the presentation by Stephan Held about the Reach-aware Steiner Tree Problem. He has an interesting Steiner tree-like problem with a very important application in chip design. The presentation has impressive pictures of what optimal solutions look like in chip design.
At the end of talk, Stephan announced the 11th DIMACS challenge on Steiner Tree Problems.
Eduardo Uchoa gave another impressive presentation on recent progress on the classical Capacitated Vehicle Routing Problem (CVRP). He has a very sophisticated branch-and-price-and-cut algorithm, which comes with a very efficient implementation of every possible idea developed for CVRP, plus new ideas on efficiently solving the pricing subproblems (my understanding, but I might be wrong, is that they have a very efficient dominance rule for solving a shortest path subproblem). +1 item in the “PAPERS-TO-BE-READ” folder.
On the last day of the workshop, I enjoyed the two talks by Simge Kucukyavuz and Jim Luedtke on Stochastic Integer Programming: for me it is a completely new topic, but the two presentations were really inspiring.
To conclude, Domenico Salvagnin has shown how far it is possible to go by carefully using MIP technologies such as cutting planes, symmetry handling, and problem decomposition. Unfortunately, it happens too often that when someone (typically a non-OR expert) has a difficult application problem, he writes down a more or less complicated Integer Programming model, tries a solver, sees it takes too much time, and gives up on exact methods. Domenico, by solving the largest unsolved instance for the 3-dimensional assignment problem, has shown that
there are potentially no limits for MIP solvers!
In this post, I have only mentioned a few talks, which somehow overlap with my research interests. However, every talk was really interesting. Fortunately, Francois Margot has strongly encouraged all of the speakers to upload their slides and/or papers, so you can find (almost) all of them on the program web page of the workshop. Visit the website and have a nice reading!
To conclude, let me steal another nice picture from twitter:
— Matteo Fischetti (@MFischetti) January 10, 2014
Public Transport is not really a buzzword, but still on Google you get almost the same number of results as with “Big Data”: 26,400,000.
Because many of us use Public Transport every day, but most of us still use our own car to go to work, to take the children to school, and to go shopping. This has a negative impact on everyone’s quality of life and is clearly inefficient, since it costs more:
(Well, for time, it is not always true, but it happens more often than commonly perceived).
Thus, an important challenge is to improve the quality of Public Transport while keeping its cost competitive. The ultimate goal should be to increase the number of people that trust and use Public Transport.
How is it possible to achieve this goal?
Modern transport operators have installed so-called Automatic Vehicle Monitoring (AVM) systems that use several technologies to monitor the fleet of vehicles that operates the service (e.g., coaches, buses, metro trains, trains, …).
The stream of data produced by an AVM might be considered as Big Data because of its volume and velocity (see Big Data For Dummies, by J.F. Puget). Each vehicle produces, at regular intervals (measured in seconds), data concerning its position and status. This information is stored in remote data centers. The data for a single day might not be considered as “Big”; however, once you start to analyze the historical data, the volume increases significantly. For instance, a public transport operator could easily have around 2,000 vehicles that operate 24 hours a day, producing data potentially every second.
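A back-of-the-envelope computation of that volume, assuming a fleet of 2,000 vehicles that each emit exactly one record per second around the clock (both figures are illustrative assumptions, an upper bound rather than measured data):

```python
# Assumptions: 2,000 vehicles, one record per vehicle per second.
vehicles = 2000
seconds_per_day = 24 * 60 * 60               # 86,400

records_per_day = vehicles * seconds_per_day  # 172,800,000
records_per_year = records_per_day * 365      # 63,072,000,000
```

So a single day already yields on the order of a hundred million records, and one year of history is in the tens of billions.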
At the moment, this stream of data misses the third dimension of Big Data, that is, variety. However, new projects that aim at integrating this information with the stream of data coming from social networks are quickly reaching maturity. One such project is SuperHub, an FP7 project that has recently won the best exhibit award in Cluster 2 “Smart and sustainable cities for 2020+”, at the ICT2013 Conference in Vilnius.
I don’t know whether transport operators are really Big Data producers or they are merely Small Data producers, but data collected using AVMs are nowadays mainly used to report and monitor the daily activities.
In my own opinion, the data produced by transport operators, integrated with input coming from social networks, should be used to improve the quality of the public transport, for instance, by trying to better tackle Disruption Management issues.
So, I am curious:
Do you know any project that uses AVM data, combined with Social Network inputs (e.g., from Twitter), to elaborate Disruption Management strategies for Public Transport? If yes, do they use Mathematical Optimization at all?
I love reading about everything and I am glad that part of my work consists in reading.
Unfortunately, for researchers, reading is not always that easy, as clearly explained in The Researcher’s Bible:
Reading is difficult: The difficulty seems to depend on the stage of academic development. Initially it is hard to know what to read (many documents are unpublished), later reading becomes seductive and is used as an excuse to avoid research. Finally one lacks the time and patience to keep up with reading (and fears to find evidence that one’s own work is second rate or that one is slipping behind)
For my stage of academic development, reading is extremely seductive, and the situation became even worse after reading the answers to the following question raised by Michael Trick on OR-Exchange:
If you are looking for excuses to avoid research, go through those answers and select any paper you like, you will have outstanding and authoritative excuses!
This post is about solving the classical Graph Coloring problem by using a simple solver, named here GeCol, that is built on top of the Constraint Programming (CP) solver Gecode. The approach of GeCol is based on the CP model described in [1]. Here, we want to explore some of the new features of the latest version of Gecode (version 4.0.0), namely:
We are going to present computational results using these features to solve the instances of the Graph Coloring DIMACS Challenge. However, this post is not going to describe in great detail what these features are: for this purpose, please refer to the Modeling and Programming with Gecode book.
As usual, all the sources used to write this post are publicly available on my GitHub repository.
Given an undirected graph $G=(V,E)$ and a set of colors $K$, the minimum (vertex) graph coloring problem consists of assigning a color to each vertex, while every pair of adjacent vertices gets a different color. The objective is to minimize the number of colors.
To model this problem with CP, we can use for each vertex $i$ an integer variable $x_i$ with domain equal to $K$: if $x_i = k$, then color $k$ is assigned to vertex $i$.
Using (inclusion-wise) maximal cliques, it is possible to post constraints on subsets of adjacent vertices: all the vertices belonging to the same clique must get different colors. In CP, we can use the well-known alldifferent constraint for posting these constraints.
In practice, to build our CP model, first, we find a collection of maximal cliques $\mathcal{C}$, such that for every edge $(i,j) \in E$ there exists at least one clique $C \in \mathcal{C}$ that contains both vertices $i$ and $j$. Second, we post the following constraints:

$$\text{alldifferent}(X(C)), \quad \forall C \in \mathcal{C},$$

where $X(C)$ denotes the subset of variables corresponding to the vertices that belong to the clique $C$.
In order to minimize the number of colors, we use a simple iterative procedure. Every time we find a coloring with $k$ colors, we restart the search by restricting the cardinality of $K$ to $k-1$. If no feasible coloring exists with $k-1$ colors, we have proved optimality for the last feasible coloring found, i.e. $\chi(G) = k$.
In addition, we apply a few basic preprocessing steps that are described in [1]. The maximal cliques are computed using Cliquer v1.21 [5].
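The iterative procedure can be sketched in a few lines of pure Python. This is not GeCol: a naive backtracking search stands in for Gecode, it checks edges directly instead of posting alldifferent over cliques, and it skips the preprocessing. It does branch on the vertex with the fewest available colors, and it opens at most one new color at each node, which is a simple static way to avoid exploring permutations of the colors. All names are hypothetical.

```python
def color_with(n, adj, k):
    """Return a coloring of vertices 0..n-1 with colors 0..k-1, or None."""
    colors = [None] * n

    def solve(used):
        # used = number of distinct colors assigned so far.
        best, best_dom = None, None
        for v in range(n):
            if colors[v] is None:
                forbidden = {colors[u] for u in adj[v]
                             if colors[u] is not None}
                # Allow the colors already in use plus one new color.
                dom = [c for c in range(min(used + 1, k))
                       if c not in forbidden]
                if best is None or len(dom) < len(best_dom):
                    best, best_dom = v, dom
        if best is None:
            return True  # every vertex is colored
        for c in best_dom:
            colors[best] = c
            if solve(max(used, c + 1)):
                return True
        colors[best] = None
        return False

    return colors if solve(0) else None


def chromatic_number(n, edges):
    """Decrease k until no coloring exists; assumes n >= 1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    k, best = n, list(range(n))  # trivial coloring: one color per vertex
    while k > 1:
        sol = color_with(n, adj, k - 1)
        if sol is None:
            break  # no (k-1)-coloring: k is optimal
        best, k = sol, len(set(sol))
    return k, best
```

For instance, `chromatic_number(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)])` proves that the 5-cycle needs three colors by failing to find a 2-coloring.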
The Graph Coloring problem is an optimization problem that has several equivalent optimum solutions: for instance, given an optimal assignment of colors to vertices, any permutation of the colors, gives a solution with the same optimum value.
While this property is implicitly considered in Column Generation approaches to Graph Coloring (e.g., see [3], [1], and [4]), the CP model we have just presented, suffers from symmetries issues: the values of the domains of the integer variables are symmetric.
The Lightweight Dynamic Symmetry Breaking is a strategy for dealing with this issue [2]. In Gecode, you can define a set of values that are symmetric as follows:
Symmetries syms;
syms << ValueSymmetry(IntArgs::create(k,1));
and then, when posting the branching strategy, you just write (note the use of the object syms):
branch(*this, x, INT_VAR_SIZE_MIN(), INT_VAL_MIN(), syms);
With three lines of code, you have solved (some of) the symmetry issues.
How efficient is Lightweight Dynamic Symmetry Breaking for Graph Coloring?
We try to answer this question with the plot below, which shows the results for two versions of GeCol:
Both versions select for branching the variable with the smallest domain size. The plot reports the empirical cumulative distribution as a function of run time (in log-scale). The tests were run with a timeout of 300 seconds on a quite old server. Note that at the timeout, the version with LDSB has solved around 55% of the instances, while the version without LDSB has solved only around 48% of the instances.
The second new feature of Gecode that we explore here is the Accumulated Failure Count and the Activity-based branching strategies.
While solving any CP model, the strategy used to select the next variable to branch over is very important. The Accumulated Failure Count strategy stores the cumulative number of failures for each variable (for details see Section 8.5 in MPG). The Activity-based search does something similar, but instead of counting failures, it measures the activity of each variable. In a sense, these two strategies try to learn from failures and activities as they occur during the search process.
These two branching strategies are more effective when combined with Restart Based Search: the solver performs the search with increasing cutoff values on the number of failures. Gecode offers several optional strategies to improve the cutoff. In our tests, we have used a geometric cutoff sequence (Section 9.4 in MPG).
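To give an idea of what a geometric cutoff sequence looks like, here is a small generator sketch. It assumes the i-th cutoff is `scale * base**i`, which is my reading of the geometric sequence; the parameter values are made up and are not those used in the tests.

```python
from itertools import islice


def geometric_cutoffs(scale, base):
    """Yield the failure limits scale * base**i for i = 0, 1, 2, ..."""
    i = 0
    while True:
        yield int(scale * base ** i)
        i += 1


# First five failure limits with hypothetical parameters.
first = list(islice(geometric_cutoffs(100, 2), 5))
```

Each restart is allowed more failures than the previous one, so the search eventually becomes complete while the early, short runs let the failure and activity counters warm up.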
How effective are the Accumulated Failure Count and the Activity-based strategies for Graph Coloring when combined with Restart Based Search?
The second plot below shows a comparison of 3 versions of GeCol, with 3 different branching strategies:
The last strategy is tremendously efficient: it dominates the other two strategies, and it is able to solve more than 60% of the considered instances within the timeout of 300 seconds.
However, it is possible to do slightly better still. Likely, at the beginning of the search phase, several variables have the same value of AFC. Therefore, it is possible to improve the branching strategy by breaking ties: we can divide the ACT or the AFC value of a variable by its domain size. The next plot shows the results with these other branching strategies:
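The tie-breaking rule can be sketched in a few lines of Python. This is only an illustration of the selection criterion, not Gecode code: the dictionaries `afc` and `domains` and the function name are hypothetical, and a real solver maintains these values incrementally during propagation.

```python
def select_variable(afc, domains):
    """Pick the unassigned variable with the largest AFC / domain-size ratio."""
    # A variable is still unassigned while its domain has more than one value.
    candidates = [v for v, dom in domains.items() if len(dom) > 1]
    return max(candidates, key=lambda v: afc[v] / len(domains[v]), default=None)


# All AFC values are equal, so the smaller domain decides.
afc = {"x": 1.0, "y": 1.0, "z": 1.0}
domains = {"x": {1, 2, 3}, "y": {1, 2}, "z": {1}}
print(select_variable(afc, domains))
```

When all AFC values are equal, as at the start of the search, the ratio reduces to the smallest-domain rule, which is exactly the tie-breaking effect described above.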
The new features of Gecode are very interesting and offer plenty of options. LDSB is very general, and it could easily be applied to several other combinatorial optimization problems. The new branching strategies also give important enhancements, above all when combined with restart-based search.
“…with great power there must also come – great responsibility!” (Uncle Ben, The Amazing Spider-Man, n.660, Marvel Comics)
As a drawback, it is becoming harder and harder to find the best parameter configuration for solvers such as Gecode (but this is also true for other types of solvers, e.g. Gurobi and Cplex).
Can you find or suggest a better parameter configuration for GeCol?
S. Gualandi and F. Malucelli. Exact Solution of Graph Coloring Problems via Constraint Programming and Column Generation. INFORMS Journal on Computing. Winter 2012, vol. 24(1), pp. 81-100. [pdf] [preprint]
C. Mears, M.G. de la Banda, B. Demoen, M. Wallace. Lightweight dynamic symmetry breaking. In Eighth International Workshop on Symmetry in Constraint Satisfaction Problems, SymCon’08, 2008. [pdf]
A. Mehrotra, M.A. Trick. A column generation approach for graph coloring. INFORMS Journal on Computing. Fall 1996, vol. 8(4), pp. 344-354. [pdf]
S. Held, W. Cook, E.C. Sewell. Maximum-weight stable sets and safe lower bounds for graph coloring. Mathematical Programming Computation. December 2012, vol. 4(4), pp. 363-381. [pdf]
Patric R.J. Ostergard. A fast algorithm for the maximum clique problem. Discrete Applied Mathematics, vol. 120(1-3), pp. 197-207, 2002. [pdf]
Recently, I have discovered a nice tiny library (1 file!) that supports Backtrack Programming in standard C. The library is called CBack and is developed by Keld Helsgaun, who is known in the Operations Research and Computer Science communities for his efficient implementation of the Lin-Kernighan heuristic for the Travelling Salesman Problem.
CBack basically offers two functions, which are described in [1] as follows:
Choice(N): “is used when a choice is to be made among a number of alternatives, where N is a positive integer denoting the number of alternatives”.
Backtrack(): “causes the program to backtrack, that is to say, return to the most recent call of Choice, which has not yet returned all its values”.
With these two functions, it is pretty simple to develop exact enumeration algorithms. The CBack library comes with several examples, such as algorithms for the N-queens problem and the 15-puzzle. Below, I will show you how to use CBack to implement a simple algorithm that finds a Maximum Clique in an undirected graph.
As usual, the source code used to write this post is publicly available on my GitHub repository.
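The CBack documentation gives the following code snippet as its first example:
    1  int i, j;
    2  i = Choice(3);
    3  j = Choice(2);
    4  printf("i = %d, j = %d\n", i, j);
    5  Backtrack();

The output produced by the snippet is:
    i = 1, j = 1
    i = 1, j = 2
    i = 2, j = 1
    i = 2, j = 2
    i = 3, j = 1
    i = 3, j = 2
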
If you are familiar with backtrack programming (e.g., Prolog), you should not be surprised by the output, and you can jump to the next section. Otherwise, the Figure below sketches the program execution.
When the program executes the Choice(N=3) statement, that is, the first call to Choice (line 2), value 1 is assigned to variable i. Behind the scenes, the Choice function stores the current execution state of the program in its own stack, and records the next possible choices (i.e., the other possible program branches), which are values 2 and 3. Next, the second call, Choice(N=2), assigns value 1 to j (line 3), and again the state of the program is stored for later use. Then, printf outputs i = 1, j = 1 (line 4 and first line of output). Now, it is time to backtrack (line 5).
What is happening here?
Look again at the figure above: when the Backtrack() function is invoked, the algorithm backtracks and continues the execution from the most recent Choice stored in its stack, i.e., it assigns value 2 to variable j, and printf outputs i = 1, j = 2. Later, Backtrack() is invoked again, and this time the algorithm backtracks to the previous open choice, which corresponds to the assignment of value 2 to variable i, and it executes i = 2. Once the second choice for variable i is performed, there are again two possible choices for variable j, since the program has backtracked to a point that precedes that statement. Thus, the program executes j = 1, and printf outputs i = 2, j = 1. At this point, the program backtracks again, and considers the next possible choice, j = 2. This is repeated until all possible choices for Choice(3) and Choice(2) are exhausted, yielding the 6 possible combinations of i and j that the program gave as output.
Indeed, during the execution, the program has implicitly visited the search tree of the previous figure in a depth-first manner. CBack also supports other search strategies, such as best-first, but I will not cover that topic here.
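For readers who do not have a C compiler at hand, the depth-first enumeration described above can be mimicked in Python. The sketch below is not how CBack works internally: instead of saving and restoring the calling environment, it simply re-runs the program from the start and replays the choices recorded in a trail. The names `enumerate_choices`, `choice`, and `example` are mine, introduced only for this illustration.

```python
def enumerate_choices(program):
    """Run program(choice) once per leaf of its choice tree, depth-first."""
    trail = []    # one [current value, number of alternatives] per open choice
    results = []
    while True:
        depth = 0

        def choice(n):
            nonlocal depth
            if depth == len(trail):    # first time this choice point is reached
                trail.append([1, n])
            value = trail[depth][0]
            depth += 1
            return value

        results.append(program(choice))
        # "Backtrack": drop exhausted choice points, then advance the most recent open one.
        while trail and trail[-1][0] == trail[-1][1]:
            trail.pop()
        if not trail:
            return results
        trail[-1][0] += 1


def example(choice):
    i = choice(3)
    j = choice(2)
    return (i, j)


for i, j in enumerate_choices(example):
    print("i = %d, j = %d" % (i, j))
```

Replaying is wasteful compared with restoring a saved state, since every leaf costs a full re-execution, but it visits the combinations of i and j in the same depth-first order as the walkthrough above.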
In order to store and restore the program execution state (well, more precisely, the calling environment), Choice(N) and Backtrack use two threatening C standard functions, setjmp and longjmp. For the details of their use in CBack, see [1].
The reason why I like this library, apart from reminding me of the time I was programming with Mozart, is that it makes it possible to quickly implement exact algorithms based on enumeration. While enumeration is usually disregarded as inefficient (“hey, it is just brute force!”), it is still one of the best methods to solve small instances of almost any combinatorial optimization problem. In addition, many sophisticated exact algorithms use plain enumeration as a subroutine, when during the search process the size of the problem becomes small enough.
Consider now the Maximum Clique Problem: given an undirected graph $G=(V,E)$, the problem is to find the largest complete subgraph of $G$. More formally, you look for the largest subset $C$ of the vertex set $V$ such that for any pair of nodes $i, j$ in $C$ there exists an arc $\{i,j\} \in E$.
The well-known branch-and-bound algorithm of Carraghan and Pardalos [2] is based on enumeration. The implementation of Applegate and Johnson, called dfmax.c, is a very efficient implementation of that algorithm. Next, I show a basic implementation of the same algorithm that uses CBack for backtracking.
The Carraghan and Pardalos algorithm uses three sets: the current clique $C$, the largest clique found so far $C^*$, and the set of candidate vertices $P$. The pseudocode of the algorithm is as follows (as described in [3]):
[Pseudocode listing, 13 lines, not preserved.]

As you can see, the backtracking is here described in terms of a recursive function. However, using CBack, we can implement the same algorithm without using recursion.
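As a point of comparison, the recursive scheme can be sketched in Python as follows. This is my own minimal rendering of the idea (current clique, best clique so far, candidate set, and the simple size-based pruning), not the pseudocode of [3] and not the CBack program; the function names and the adjacency-set representation are assumptions made for the example.

```python
def max_clique(adj):
    """Return a maximum clique of the graph given as {vertex: set of neighbours}."""
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        while candidates:
            # Prune: even taking every remaining candidate cannot beat the incumbent.
            if len(clique) + len(candidates) <= len(best):
                return
            v = candidates.pop()
            clique.append(v)
            # Only the neighbours of v remain candidates for the enlarged clique.
            expand(clique, [u for u in candidates if u in adj[v]])
            clique.pop()

    expand([], sorted(adj))
    return best


# A triangle {0, 1, 2} with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(max_clique(graph)))
```

The recursion stack plays the role that the Choice stack plays in CBack: each recursive call corresponds to one choice point, and returning from it corresponds to a backtrack.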
We use an array S of integers, one for each vertex of the graph. If S[v]=0, then vertex v belongs to the candidate set $P$; if S[v]=1, then vertex v is in the current clique $C$; if S[v]=2, then vertex v can be neither in $P$ nor in $C$. The variable s stores the size of the current clique.
Let me show you directly the C code:
[C listing, 27 lines, not preserved.]

Well, I like this code pretty much, despite its being a “plain old” C program. The algorithm and code can be improved in several ways (ordering the vertices, improving the pruning, using upper bounds from heuristic vertex coloring, using induced degree as in [2]), but still, the main loop and the backtrack machinery are all there, in a few lines of code!
Maybe you wonder about the efficiency of this code, but at the moment I do not have a precise answer. For sure, the ordering of the vertices is crucial, and can make a huge difference in solving the max-clique DIMACS instances. I have used CBack to implement my own version of Ostergard’s max-clique algorithm [4], but my implementation is somewhat slower. I suspect that the difference is due to the data structure used to store the graph (Ostergard’s implementation relies on bitsets), and not to the way the backtracking is achieved. Answering this question, however, could be the subject of another post.
In conclusion, if you need to implement an exact enumerative algorithm, CBack could be an option to consider.
[1] Keld Helsgaun. CBack: A Simple Tool for Backtrack Programming in C. Software: Practice and Experience, vol. 25(8), pp. 905-934, 1995. [doi]
[2] Carraghan and Pardalos. An exact algorithm for the maximum clique problem. Operations Research Letters, vol. 9(6), pp. 375-382, 1990. [pdf]
[3] Torsten Fahle. Simple and Fast: Improving a Branch-and-Bound Algorithm. In Proc. ESA 2002, LNCS 2461, pp. 485-498. [doi]
[4] Patric R.J. Ostergard. A fast algorithm for the maximum clique problem. Discrete Applied Mathematics, vol. 120(1-3), pp. 197-207, 2002. [pdf]
On the blackboard, solving small Integer Linear Programs with two variables and less-than-or-equal constraints is easy, since they can be plotted in the plane and the linear relaxation can be solved geometrically. You can draw the lattice of integer points, and once you have found a new cutting plane, you show that it cuts off the optimum solution of the LP relaxation.
This post presents a naive (textbook) implementation of Fractional Gomory Cuts that uses the basic solution computed by CPLEX, the commercial Linear Programming solver used in our lab sessions. In practice, this post is an online supplement to one of my last exercise sessions.
In order to solve the “blackboard” examples with CPLEX, it is necessary to use a couple of functions that a few years ago were undocumented. GUROBI has very similar functions, but they are currently undocumented. (Edited May 16th, 2013: From version 5.5, Gurobi has documented its advanced simplex routines)
As usual, all the sources used to write this post are publicly available on my GitHub repository.
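Given an Integer Linear Program $(P)$ with a linear objective $cx$ and constraints in the form:

$$Ax \le b, \quad x \in \mathbb{Z}^n_+$$

it is possible to rewrite the problem in standard form by adding slack variables:

$$Ax + Is = b, \quad x \in \mathbb{Z}^n_+, \quad s \in \mathbb{Z}^m_+$$

where $I$ is the identity matrix and $s$ is a vector of slack variables, one for each constraint in $(P)$ (the slacks are integer whenever $A$ and $b$ are integer). Let us denote by $(LP)$ the linear relaxation of $(P)$ obtained by relaxing the integrality constraint.
The optimum solution vector of $(LP)$, if it exists and is finite, is used to derive a basis (for a formal definition of basis, see [1] or [3]). Indeed, the basis partitions the columns of the constraint matrix $[A \; I]$ into two submatrices $B$ and $N$, where $B$ is given by the columns corresponding to the basic variables, and $N$ by the columns corresponding to variables out of the basis (they are equal to zero in the optimal solution vector).
Remember that, by definition, $B$ is nonsingular and therefore invertible. Using the matrices $B$ and $N$, it is easy to derive the following inequalities (for details, see any OR textbook, e.g., [1]):

$$x_B + B^{-1}N x_N = B^{-1}b$$
$$x_B + \lfloor B^{-1}N \rfloor x_N \le \lfloor B^{-1}b \rfloor$$
$$(B^{-1}N - \lfloor B^{-1}N \rfloor)\, x_N \ge B^{-1}b - \lfloor B^{-1}b \rfloor$$

where the operator $\lfloor \cdot \rfloor$ is applied component-wise to the matrix elements. In practice, for each fractional basic variable, it is possible to generate a valid Gomory cut.
The key step to generate Gomory cuts is to get an optimal basis or, even better, the inverse of the basis matrix $B^{-1}$ multiplied by $A$ and by $b$. Once we have that matrix, in order to generate a Gomory cut from a fractional basic variable, we just use the last inequality in the previous derivation, applying it to each row of the system $x_B + B^{-1}N x_N = B^{-1}b$.
Given the optimal basis, the optimal basic vector is $x_B = B^{-1}b$, since the nonbasic variables are equal to zero. Let $j$ be the index of a fractional basic variable, and let $i$ be the index of the constraint corresponding to variable $j$ in the equations $x_B + B^{-1}N x_N = B^{-1}b$; then the Gomory cut for variable $j$ is:

$$\sum_{k} \left(\bar{a}_{ik} - \lfloor \bar{a}_{ik} \rfloor\right) x_k \ge \bar{b}_i - \lfloor \bar{b}_i \rfloor$$

where the sum runs over the nonbasic variables, $\bar{a}_{ik}$ is the entry of $B^{-1}N$ in row $i$ and column $k$, and $\bar{b}_i$ is the $i$-th entry of $B^{-1}b$.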
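As a small worked example of the last step, the Python sketch below derives a fractional Gomory cut from a single row of the optimal tableau by taking the fractional part of every coefficient and of the right-hand side. It uses exact rational arithmetic to sidestep rounding; the function names and the sample row are hypothetical and have nothing to do with the CPLEX code discussed next.

```python
from fractions import Fraction
from math import floor


def fractional_part(q):
    """Return q - floor(q), which always lies in [0, 1)."""
    return q - floor(q)


def gomory_cut(row, rhs):
    """Return (coefficients, right-hand side) of the cut  sum_k f_k x_k >= f_0."""
    coeffs = [fractional_part(Fraction(a)) for a in row]
    return coeffs, fractional_part(Fraction(rhs))


# Hypothetical tableau row:  x1 + 1/4 s1 - 3/4 s2 = 7/4  over variables (x1, x2, s1, s2).
row = [1, 0, Fraction(1, 4), Fraction(-3, 4)]
coeffs, f0 = gomory_cut(row, Fraction(7, 4))
print(coeffs, f0)
```

For this row the cut reads 1/4 s1 + 1/4 s2 >= 3/4: basic columns have integer coefficients and drop out, and a negative coefficient such as -3/4 contributes its fractional part 1/4.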
The CPLEX callable library (written in C) has the following advanced functions:
Using the first two functions, Gomory cuts from an optimal basis can be generated as follows:
[C listing, 27 lines, not preserved.]

The code reads row by row (index i) the inverse basis matrix multiplied by $A$ (line 7), which is temporarily stored in vector z, and then the code stores the corresponding Gomory cut in the compact matrix given by vectors rmatbeg, rmatind, and rmatval (lines 8-15). The array b_bar contains the vector $B^{-1}b$ (line 21). In lines 26-27, all the cuts are added at once to the current LP data structure.
On GitHub you can find a small program that I wrote to generate Gomory cuts for problems written in the inequality form given above. The repository has an example of execution of my program.
The code is simple only because it is designed for small IPs in that form. Otherwise, the code must consider the effects of preprocessing, different senses of the constraints, and additional constraints introduced because of range constraints.
If you are interested in a real implementation of Mixed-Integer Gomory cuts, which are a generalization of Fractional Gomory cuts to mixed integer linear programs, please look at the SCIP source code.
The introduction of Mixed Integer Gomory cuts in CPLEX was the major breakthrough of CPLEX 6.5 and produced the version-to-version speedup given by the blue bars in the chart below (source: Bixby’s slides available on the web):
Gomory cuts are still a subject of research, since they pose a number of implementation challenges. These cuts suffer from severe numerical issues, mainly because the computation of the inverse matrix requires the division by its determinant.
“In 1959, […] We started to experience the unpredictability of the computational results rather steadily” (Gomory, see [4]).
A recent paper by Cornuejols, Margot, and Nannicini deals with some of these issues [2].
If you would like to learn more about how bases are computed in the CPLEX LP solver, there is a very nice paper by Bixby [3]. The paper explains different approaches to get the first basic feasible solution and gives some hints about the CPLEX implementation of that time, i.e., 1992. Though the paper does not deal with Gomory cuts directly, it is a very pleasant read.
To conclude, for those of you interested in Optimization Stories there is a nice chapter by G. Cornuejols about the Ongoing Story of Gomory Cuts [4].
[1] C.H. Papadimitriou, K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. 1998. [book]
[2] G. Cornuejols, F. Margot and G. Nannicini. On the safety of Gomory cut generators. Submitted in 2012. Mathematical Programming Computation, under review. [preprint]
[3] R.E. Bixby. Implementing the Simplex Method: The Initial Basis. ORSA Journal on Computing, vol. 4(3), pp. 267-284, 1992. [abstract]
[4] G. Cornuejols. The Ongoing Story of Gomory Cuts. Documenta Mathematica, Optimization Stories, pp. 221-226, 2012. [preprint]
Trenord officially said that the software that planned the crew schedule is faulty. The software was bought last year from Goal Systems, a Spanish company. Rumors say that Trenord paid Goal Systems around 1,500,000 Euro. Likely, the system is not faulty, but it “only” had bad input data.
Before the Goal Systems software, Trenord was using different software, produced by Management Artificial Intelligence Operations Research srl (MAIOR), which is used by several public transportation companies in Italy, including ATM, which operates the subway and buses in Milan. In addition, MAIOR collaborates with the Politecnico di Milano and the University of Pisa to continuously improve its software. Honestly, I am biased, since I collaborate with MAIOR. However, Trenord dismissed the MAIOR software without any specific complaint, since the management had decided to buy the Goal Systems software.
Newspapers do not ask the following question:
Why change a piece of software, if the previous one was working correctly?
In Italy, soccer players have a motto: “squadra che vince non si cambia” (“never change a winning team”). Maybe at Trenord nobody plays soccer.
Likely, next week will be better for the 700,000 commuters, since OR experts from MAIOR are traveling to Milan to help Trenord to improve the situation.
The MIP instances I propose come from my formulation of the Machine Reassignment Problem proposed for the Roadef Challenge sponsored by Google last year. As I wrote in a previous post, the Challenge had huge instances and a micro time limit of 300 seconds. I said micro because I have in mind exact methods: there is little you can do in 300 seconds when you have a problem with potentially as many as binary variables. If you want to use math programming and start with the solution of a linear programming relaxation of the problem, you have to be careful: it might happen that you cannot even solve the LP relaxation at the root node within 300 seconds.
That is why most of the participants tackled the Challenge mainly with heuristic algorithms. The only general purpose solver that qualified for the challenge is Local Solver, which has a nice abstraction (“somehow” similar to AMPL) to well-known local search algorithms and move operators. The Local Solver script used in the qualification phase is available here.
However, in my own opinion, it is interesting to try to solve at least the instances of the qualification phase with Integer Linear Programming (ILP) solvers such as Gurobi and CPLEX. Can these branch-and-cut commercial solvers be competitive on such problems?
Suppose you are given a set of processes $P$, a set of machines $M$, and an initial mapping $\pi$ of each process to a single machine (i.e., $\pi(p) = m$ if process $p$ is initially assigned to machine $m$). Each process consumes several resources, e.g., CPU, memory, and bandwidth. In the challenge, some resources were defined to be transient: they are consumed both on the machine where the process is initially located, and on the machine where it is going to be after the reassignment. The problem asks to find a new assignment of processes to machines that minimizes a rather involved cost function.
A basic ILP model will have a 0-1 variable $x_{pm}$ equal to 1 if you (re)assign process $p$ to machine $m$. The number of processes $|P|$ and the number of machines $|M|$ give a first clue on the size of the problem. The constraints on the resource capacities yield a multi-dimensional knapsack subproblem for each machine. The Machine Reassignment Problem has other constraints (kind of logical 0-1 constraints), but I do not want to bore you here with a full problem description. If you would like to see my model, please read the AMPL model file.
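To give a feeling for the knapsack-like structure, here is a minimal Python sketch that checks the capacity constraints of a candidate assignment, one inequality per machine and per resource. It is a deliberately simplified illustration: it ignores transient usage and all the other constraints of the challenge, and the data layout and the function name are assumptions of mine, not part of the AMPL model.

```python
def respects_capacities(assignment, usage, capacity):
    """Check that, on every machine, the total usage of each resource fits its capacity.

    assignment: {process: machine}
    usage:      {process: [amount of resource 0, amount of resource 1, ...]}
    capacity:   {machine: [capacity of resource 0, capacity of resource 1, ...]}
    """
    load = {m: [0] * len(cap) for m, cap in capacity.items()}
    for p, m in assignment.items():
        for r, amount in enumerate(usage[p]):
            load[m][r] += amount
    return all(load[m][r] <= cap[r]
               for m, cap in capacity.items()
               for r in range(len(cap)))


# Two machines, two resources (say CPU and memory), three processes.
usage = {"p1": [2, 4], "p2": [3, 1], "p3": [1, 1]}
capacity = {"m1": [4, 6], "m2": [4, 4]}
print(respects_capacities({"p1": "m1", "p2": "m2", "p3": "m1"}, usage, capacity))
print(respects_capacities({"p1": "m1", "p2": "m1", "p3": "m2"}, usage, capacity))
```

In the ILP model the same check becomes one linear inequality in the assignment variables for each machine and resource, which is where the multi-dimensional knapsack subproblems come from.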
In order to convince you that the proposed instances are challenging, I report some computational results.
The table below reports for each instance the best result obtained by the participants of the challenge (second column). The remaining four columns give the upper bound (UB), the lower bound (LB), the number of branch-and-bound nodes, and the computation time in seconds obtained with Gurobi 5.0.1, a timeout of 300 seconds, and the default parameter setting on a rather old desktop (single core, 2 GB of RAM).

| Instance | Best Known UB | Upper Bound | Lower Bound | Nodes | Time (s) |
|---|---|---|---|---|---|
| a11 | 44,306,501 | 44,306,501 | 44,306,501 | 0 | 0.05 |
| a12 | 777,532,896 | 780,511,277 | 777,530,829 | 537 | - |
| a13 | 583,005,717 | 583,005,720 | 583,005,715 | 15 | 48.76 |
| a14 | 252,728,589 | 320,104,617 | 242,404,632 | 24 | - |
| a15 | 727,578,309 | 727,578,316 | 727,578,296 | 221 | 2.43 |
| a21 | 198 | 54,350,836 | 110 | 0 | - |
| a22 | 816,523,983 | 1,876,768,120 | 559,888,659 | 0 | - |
| a23 | 1,306,868,761 | 2,272,487,840 | 1,007,955,933 | 0 | - |
| a24 | 1,681,353,943 | 3,223,516,130 | 1,680,231,407 | 0 | - |
| a25 | 336,170,182 | 787,355,300 | 307,041,984 | 0 | - |

Instances a11, a13, and a15 are solved to optimality within 300 seconds and hence they are not considered further.
The remaining seven instances are the challenging instances mentioned at the beginning of this post. The instances a2x are embarrassing: they have a UB that is far away from both the best known UB and the computed LB. Specifically, look at the instance a21: the best result of the challenge has value 198, while Gurobi (using my model) finds a solution with cost 54,350,836: you may agree that this is “slightly” more than 198. At the same time the LB is only 110.
Note that for all the a2x instances the number of branch-and-bound nodes is zero. After 300 seconds the solver is still at the root node trying to generate cutting planes and/or running its primal heuristics. Using CPLEX 12.5 we got pretty similar results.
This is why I think these instances are challenging for branch-and-cut solvers.
Commercial solvers usually have a meta-parameter that controls the search focus by setting other parameters (how they are precisely set is undocumented: do you know more about it?). The two basic options of this parameter are (1) to focus on looking for feasible solutions or (2) to focus on proving optimality. The name of this parameter is MipEmphasis in CPLEX and MipFocus in Gurobi. Since the LPs are quite time consuming and after 300 seconds the solver is still at the root node, we can wonder whether generating cuts is of any help on these instances.
If we set the MipFocus to feasibility and we explicitly disable all cut generators, would we get better results?
Look at the table below: the values of the upper bounds of instances a12, a14, and a23 are better than before: this is good news. However, for instance a21 the upper bound is worse, and for the other three instances there is no difference. Moreover, the LBs are never stronger than before: as expected, there is no free lunch!

| Instance | Upper Bound | Lower Bound | Gap | Nodes |
|---|---|---|---|---|
| a12 | 779,876,897 | 777,530,808 | 0.30% | 324 |
| a14 | 317,802,133 | 242,398,325 | 23.72% | 48 |
| a21 | 65,866,574 | 66 | 99.99% | 81 |
| a22 | 1,876,768,120 | 505,443,999 | 73.06% | 0 |
| a23 | 1,428,873,892 | 1,007,955,933 | 29.45% | 0 |
| a24 | 3,223,516,130 | 1,680,230,915 | 47.87% | 0 |
| a25 | 787,355,300 | 307,040,989 | 61.00% | 0 |

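The Gap column appears to be the relative difference between the two bounds, measured with respect to the upper bound. The following Python sketch makes that reading explicit; the formula is my inference from the numbers in the table, not a quote of the solver's definition, and the function name is illustrative.

```python
def relative_gap(upper, lower):
    """Relative optimality gap in percent, measured with respect to the upper bound."""
    return 100.0 * (upper - lower) / upper


# Bounds of instances a12 and a25 taken from the table above.
print("%.2f%%" % relative_gap(779876897, 777530808))
print("%.2f%%" % relative_gap(787355300, 307040989))
```

A small gap means the bounds nearly meet, so the incumbent is provably close to optimal; a gap close to 100% means that either the primal solution or the lower bound (or both) is very weak.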
If we want to keep a timeout of 300 seconds, there is little we can do, unless we develop an ad-hoc decomposition approach.
Can we improve those results with a branch-and-cut solver using a longer timeout?
Most of the papers that use branch-and-cut to solve hard problems have a timeout of at least one hour, and they start by running a heuristic for around 5 minutes. Therefore, we can think of using the best results obtained by the participants of the challenge as starting solutions.
So, let us take a step back: we enable all cut generators and we set all parameters to their default values. In addition, we set the time limit to one hour. The table below gives the new results. With this setting we are able to “prove” near-optimality of instance a12, and we significantly reduce the gap of instance a24. However, the solver never improves the primal solutions: this means that we have not improved the results obtained in the qualification phase of the challenge. Note also that the number of nodes explored is still rather small despite the longer timeout.

| Instance | Upper Bound | Lower Bound | Gap | Nodes |
|---|---|---|---|---|
| a12 | 777,532,896 | 777,530,807 | ~0.001% | 0 |
| a14 | 252,728,589 | 242,404,642 | 4.09% | 427 |
| a21 | 198 | 120 | 39.39% | 2113 |
| a22 | 816,523,983 | 572,213,976 | 29.92% | 18 |
| a23 | 1,306,868,761 | 1,068,028,987 | 18.27% | 69 |
| a24 | 1,681,353,943 | 1,680,231,594 | 0.06% | 133 |
| a25 | 336,170,182 | 307,042,542 | 8.66% | 187 |

What if we disable all cuts and set the MipFocus to feasibility again?

| Instance | Upper Bound | Lower Bound | Gap | Nodes |
|---|---|---|---|---|
| a12 | 777,532,896 | 777,530,807 | ~0.001% | 0 |
| a14 | 252,728,589 | 242,398,708 | 4.09% | 1359 |
| a21 | 196 | 70 | 64.28% | 818 |
| a22 | 816,523,983 | 505,467,074 | 38.09% | 81 |
| a23 | 1,303,662,728 | 1,008,286,290 | 22.66% | 56 |
| a24 | 1,681,353,943 | 1,680,230,918 | 0.07% | 108 |
| a25 | 336,158,091 | 307,040,989 | 8.67% | 135 |

With this parameter setting, we improve the UB for 3 instances: a21, a23, and a25. However, the lower bounds are again much weaker. Look at instance a21: the lower bound is now 70 while before it was 120. If you look at instance a23 you can see that even if we got a better primal solution, the gap is larger, since the lower bound is worse.
With the focus on feasibility you get better results, but you might miss the ability to prove optimality. With the focus on optimality you get better lower bounds, but you might not improve the primal bounds.
1) How to balance feasibility with optimality?
Using a branch-and-cut solver with the cut generators disabled is counterintuitive, but if you do, you get better primal bounds.
2) Why should I use a branch-and-cut solver then?
Does anyone out there have any ideas?
While writing this post, we got 3 solutions that are better than those obtained by the participants of the qualification phase: a21, a23, and a25 (the three links give the certificates of the solutions). We are almost there in proving optimality of a23, and we get better lower bounds than those published in [1].
[1] Deepak Mehta, Barry O’Sullivan, Helmut Simonis. Comparing Solution Methods for the Machine Reassignment Problem. In Proc. of CP 2012, Québec City, Canada, October 8-12, 2012.
Thanks to Stefano Coniglio and to Marco Chiarandini for their passionate discussions about the posts in this blog.
During the conference, the weather outside was pretty cold, but at the conference site the discussions were warm and the presentations were intriguing.
In this post, I share an informal report of the conference as “Je me souviens”.
The invited talks were excellent and my favorite one was given by Miguel F. Anjos on Optimization Challenges in Smart Grid Operations. Miguel is not exactly a CP programmer, he works more on discrete nonlinear optimization, but his talk was a perfect mix of applications, modeling, and solution techniques. Please, read and enjoy his slides.
I would like to mention just one of his observations. Nowadays, electric cars are becoming more and more common. What will happen when each of us has an electric car? Likely, during the night, while sleeping, we will connect our car to the grid to recharge the car batteries. This will lead to high variability in night peaks of energy demand.
How to manage these peaks?
Well, what Miguel reported as a possible challenging option is to think of the collection of cars connected to the grid as a kind of huge battery. This sort of collective battery could be used to better handle the peaks of energy demand. Each car would play a double role: if there is no energy demand peak, you can recharge the car battery; otherwise, the car battery could be used as a power source and it could supply energy to the grid. This is an oversimplification, but as you can imagine there would be great challenges and opportunities for any constraint optimizer in terms of modeling and solution techniques.
I am curious to read more about it; are you?
This year CP had the thickest conference proceedings ever. Traditionally, the papers are presented in two parallel sessions. Two is not that much when you think that this year at ISMP there were 40 parallel sessions… but still, you always regret that you could not attend the talk in the other session. Argh!
Here I would like to mention just two works. However, the program chair is trying to make all the slides available. Have a look at the program and at the slides: there are many good papers.
In the application track, Deepak Mehta gave a nice talk about a joint work with Barry O’Sullivan and Helmut Simonis on Comparing Solution Methods for the Machine Reassignment Problem, a problem that Google has to solve every day in its data centers and that was the subject of the Google/Roadef Challenge 2012. The true challenge is given by the HUGE size of the instances and the very short timeout (300 seconds). The work presented by Deepak is really interesting and they got excellent results using CP-based Large Neighborhood Search: they ranked second at the challenge.
Related to the Machine Reassignment Problem there was a second interesting talk entitled Weibull-based Benchmarks for Bin Packing, by Ignacio Castineiras, Milan De Cauwer and Barry O’Sullivan. They have designed a parametric instance generator for bin packing problems based on the Weibull distribution. Having a parametric generator is crucial to perform exhaustive computational experiments and to identify those instances that are challenging for a particular solution technique. For instance, they have considered a CP approach to bin packing problems and they have identified those Weibull shape values that yield challenging instances for such an approach. A nice feature is that their generator is able to create instances similar to those of the Google challenge… I hope they will release their generator soon!
Unlike other conferences (for instance, IPCO), CP gives PhD students the opportunity to present their ongoing work within a Doctoral Program. The sponsors cover part of the costs for attending the conference. During the conference each student has a mentor who is supposed to help them. This year there were around 24 students and only very few of them had a paper accepted at the main conference. This means that without the Doctoral Program, most of these students would not have had the opportunity to attend the conference.
Geoffrey Chu was awarded the 2012 ACP Doctoral Research Award for his thesis Improving Combinatorial Optimization. To give you an idea about the amount of his contributions, consider that after his thesis presentation, someone in the audience asked:
“And you got only one PhD for all this work?”
Chapeau! Among other things, Chu has implemented Chuffed, one of the most efficient CP solvers, which uses lazy clause generation and ranked very well at the last MiniZinc Challenge, even if it was not one of the official competitors.
For the record, the winner of this year’s MiniZinc challenge is (again) the Gecode team. Congratulations!
Next year CP will be held in Sweden, at Uppsala University, on 16-20 September 2013. Will you be there? I hope so…
In the meantime, if you were at the conference, which was your favorite talk and/or paper?
Here we go, my first blog entry, ever. Let’s start with two short quizzes.
1. The well known Dijkstra’s algorithm is:
[a] A greedy algorithm
[b] A dynamic programming algorithm
[c] A primal-dual algorithm
[d] It was discovered by Dantzig
2. Which is the best C++ implementation of Dijkstra’s algorithm among the following?
[a] The Boost Graph Library (BGL)
[b] The COIN-OR Lemon Graph Library
[c] The Google Or-Tools
[d] Hey dude! We can do better!!!
What is your answer for the first question? … well, the answers are all correct! And for the second question? To know the correct answer, sorry, you have to read this post to the end…
If you are curious to learn more about the classification of Dijkstra’s algorithm proposed in the first three answers, please consider reading [1] and [2]. Honestly, I did not know that the algorithm was independently discovered by Dantzig [3] as a special case of Linear Programming. However, Dantzig is credited with the first version of the bidirectional Dijkstra’s algorithm (should we call it Dantzig’s algorithm?), which is nowadays the best performing algorithm on general graphs. The bidirectional Dijkstra’s algorithm is used as a benchmark to measure the speed-up of modern specialized shortest path algorithms for road networks [4,5], those algorithms that are implemented, for instance, in our GPS navigation systems, in your smartphones (I don’t have one, argh!), in Google Maps Directions, and Microsoft Bing Maps.
Why a first blog entry on Dijkstra’s algorithm? That’s simple.
I did while programming in C++, and I want to share with you my experience.
The algorithm is quite simple. First, partition the nodes of the input graph G=(N,A) into three sets: the sets of (1) scanned, (2) reachable, and (3) unvisited nodes. Every node $i$ has a distance label $d_i$ and a predecessor vertex $p_i$. Initially, set the label of the source node $d_s = 0$, while setting $d_i = +\infty$ for all other nodes. Moreover, the node s is placed in the set of reachable nodes, while all the other nodes are unvisited.
The algorithm proceeds as follows: select a reachable node i with minimum distance label, and move it into the set of scanned nodes; it will never be selected again. For each arc (i,j) in the forward star of node i, check if node j has distance label $d_j > d_i + c_{ij}$; if this is the case, update the label $d_j = d_i + c_{ij}$ and the predecessor vertex $p_j = i$. In addition, if the node was unvisited, move it into the set of reachable nodes. If the selected node i is the destination node t, stop the algorithm. Otherwise, continue by selecting the next node i with minimum distance label.
The algorithm stops either when it scans the destination node t or the set of reachable nodes is empty. For the nice properties of the algorithm consult any textbook in computer science or operations research.
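The steps above can be sketched in a few lines of C++. This is only a minimal illustration, not the code used for the experiments in this post: the names `Arc`, `Graph`, `State`, and `dijkstra` are made up for the example, arc lengths are assumed to be non-negative integers, and a `std::set` stands in for the priority queue (an update is done by erasing and reinserting the node).

```cpp
#include <cassert>
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Hypothetical types for this sketch (not the code benchmarked in the post).
struct Arc {
    int head;  // node j of arc (i, j)
    int cost;  // non-negative arc length
};
using Graph = std::vector<std::vector<Arc>>;  // g[i] is the forward star of node i

enum class State { Unvisited, Reachable, Scanned };

const int INF = std::numeric_limits<int>::max();

// Returns the distance labels; pred[j] is the predecessor of j (-1 if none).
// The search stops as soon as t is scanned, so labels of nodes that were
// not scanned yet are only tentative.
std::vector<int> dijkstra(const Graph& g, int s, int t, std::vector<int>& pred) {
    const int n = static_cast<int>(g.size());
    std::vector<int> dist(n, INF);
    std::vector<State> state(n, State::Unvisited);
    pred.assign(n, -1);

    // Ordered set of (label, node): begin() is a reachable node with minimum label.
    std::set<std::pair<int, int>> reachable;
    dist[s] = 0;
    state[s] = State::Reachable;
    reachable.insert({0, s});

    while (!reachable.empty()) {
        const int i = reachable.begin()->second;  // extract-min
        reachable.erase(reachable.begin());
        state[i] = State::Scanned;
        if (i == t) break;  // destination scanned: stop

        for (const Arc& a : g[i]) {
            assert(a.cost >= 0);  // assumption of the algorithm
            const int j = a.head;
            if (dist[j] > dist[i] + a.cost) {
                if (state[j] == State::Reachable)
                    reachable.erase({dist[j], j});  // update = erase + reinsert
                dist[j] = dist[i] + a.cost;
                pred[j] = i;
                state[j] = State::Reachable;
                reachable.insert({dist[j], j});
            }
        }
    }
    return dist;
}
```

On a four-node graph with arcs 0→1 (length 7), 0→2 (2), 2→1 (3), and 1→3 (1), a query from node 0 to node 3 follows the path 0, 2, 1, 3 of length 6.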
At this point it should be clear why Dijkstra’s algorithm is greedy: it always selects a reachable node with minimum distance label. It is a dynamic programming algorithm because it maintains, over the arc lengths $c_{ij}$, the recursive relation $d_j = \min \{ d_i + c_{ij} \mid (i,j) \in A \}$ for all $j \in N \setminus \{s\}$. If you are familiar with Linear Programming, you should recognize that the distance labels play the role of the dual variables of a flow-based formulation of the shortest path problem, and that Dijkstra’s algorithm constructs a primal solution (i.e., a path) that satisfies the dual constraints $d_j - d_i \leq c_{ij}$.
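To make the Linear Programming view a bit more concrete, here is a small sketch that checks the dual constraints on a vector of distance labels. I assume the standard form of the constraints, that is, the label of the head of an arc exceeds the label of its tail by at most the arc length. The names `Arc`, `Graph`, and `satisfies_dual_constraints` are hypothetical, and arcs leaving a node with an infinite label are treated as trivially satisfied.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical types for this sketch.
struct Arc {
    int head;  // node j of arc (i, j)
    int cost;  // arc length
};
using Graph = std::vector<std::vector<Arc>>;  // g[i] is the forward star of node i

const int INF = std::numeric_limits<int>::max();  // label of an unreached node

// True if dist[j] - dist[i] <= cost(i, j) holds for every arc (i, j)
// whose tail i has a finite label.
bool satisfies_dual_constraints(const Graph& g, const std::vector<int>& dist) {
    assert(dist.size() == g.size());
    for (std::size_t i = 0; i < g.size(); ++i) {
        if (dist[i] == INF) continue;  // unreached tail: nothing to check
        for (const Arc& a : g[i]) {
            // Compare in 64 bits so that an INF head label cannot overflow.
            const long long lhs = dist[a.head];
            const long long rhs = static_cast<long long>(dist[i]) + a.cost;
            if (lhs > rhs) return false;
        }
    }
    return true;
}
```

After a complete run (one that does not stop early at the destination), the labels should pass this check; labels that overestimate some distance fail it.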
The algorithm uses two data structures: the input graph G and the set of reachable nodes Q. The graph G can be stored with an adjacency list, but be sure that the arcs are stored in contiguous memory, in order to reduce the chance of cache misses when scanning the forward stars. In my implementation, I have used a std::vector to store the forward star of each node.
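One possible way to keep all the arcs in contiguous memory is a compact forward star layout, where a single array holds every arc, sorted by tail node, and an index array marks where each forward star begins. The sketch below is an illustration of that idea, not the layout used in my implementation (which keeps one std::vector per node); the names `InputArc` and `ForwardStarGraph` are invented for the example.

```cpp
#include <cassert>
#include <vector>

struct Arc {
    int head;  // node j of arc (i, j)
    int cost;  // arc length
};

struct InputArc {
    int tail;  // node i of arc (i, j)
    int head;
    int cost;
};

// Hypothetical compact forward star: all arcs live in one contiguous array.
class ForwardStarGraph {
public:
    ForwardStarGraph(int n, const std::vector<InputArc>& arcs)
        : first_(n + 1, 0), arcs_(arcs.size()) {
        // Count the out-degree of every node (shifted by one position).
        for (const InputArc& a : arcs) {
            assert(a.tail >= 0 && a.tail < n);
            ++first_[a.tail + 1];
        }
        // Prefix sums: first_[i] becomes the index of the first arc leaving i.
        for (int i = 0; i < n; ++i) first_[i + 1] += first_[i];
        // Place every arc in the next free slot of its tail's forward star.
        std::vector<int> next(first_.begin(), first_.end() - 1);
        for (const InputArc& a : arcs) arcs_[next[a.tail]++] = Arc{a.head, a.cost};
    }

    // The forward star of node i is the half-open range [begin(i), end(i)).
    const Arc* begin(int i) const { return arcs_.data() + first_[i]; }
    const Arc* end(int i) const { return arcs_.data() + first_[i + 1]; }
    int out_degree(int i) const { return first_[i + 1] - first_[i]; }

private:
    std::vector<int> first_;  // n + 1 offsets into arcs_
    std::vector<Arc> arcs_;   // arcs grouped by tail node
};
```

Scanning the forward star of node i is then a linear walk from `begin(i)` to `end(i)`, and the forward star of node i+1 starts right where that of node i ends.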
The second data structure, and the most important one, is the priority queue Q. The queue has to support three operations: push, update, and extract-min. The type of priority queue used determines the worst-case complexity of Dijkstra’s algorithm. Theoretically, the best strongly polynomial worst-case complexity is achieved via a Fibonacci heap. On road networks, the Multi Bucket heap yields a weakly polynomial worst-case complexity that is more efficient in practice [4,5]. Unfortunately, the Fibonacci heap is a rather complex data structure, and lazy implementations end up using a simpler Binomial heap.
The good news is that the Boost Library, from version 1.49, has a Heap library. This library contains several types of heaps that share a common interface: d_ary_heap, binomial_heap, fibonacci_heap, pairing_heap, and skew_heap. The worst-case complexities of the basic operations are summarized in a nice table. Contrary to textbooks, these heaps are ordered in non-increasing order (they are max-heaps instead of min-heaps), which means that the top of the heap is always the element with the highest priority. For implementing Dijkstra, where all arc lengths are non-negative, this is not a problem: we can store the elements with the distance changed in sign (sorry for the rough explanation, but if you are really interested it is better to read the source code directly).
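The sign trick can be illustrated without Boost, because std::priority_queue is a max-heap by default as well. The sketch below is only an illustration under that substitution (it does not use boost::heap), and the function name `extract_in_distance_order` is made up: each node is pushed with its negated distance label as key, so the top of the max-heap is always the node with the smallest distance.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// (key, node) pairs; the key is the NEGATED distance label.
using Item = std::pair<int, int>;

// Returns the nodes in order of increasing distance label,
// using a max-heap on the negated labels.
std::vector<int> extract_in_distance_order(const std::vector<int>& dist) {
    std::priority_queue<Item> heap;  // max-heap: top() has the largest key
    for (int node = 0; node < static_cast<int>(dist.size()); ++node) {
        assert(dist[node] >= 0);          // labels are non-negative in Dijkstra
        heap.push({-dist[node], node});   // store the distance changed in sign
    }
    std::vector<int> order;
    while (!heap.empty()) {
        order.push_back(heap.top().second);  // largest key = smallest distance
        heap.pop();
    }
    return order;
}
```

With labels 5, 0, 9, 2 for nodes 0 to 3, the keys are -5, 0, -9, -2, and the nodes come out in the order 1, 3, 0, 2.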
The big advantage of boost::heap is that it lets you program Dijkstra once and compile it with different heaps via templates. If you wonder why the Boost Graph Library does not use boost::heap, well, the reason is that BGL was implemented a few years ago, while boost::heap appeared this year.
Here is the point that maybe interests you the most: can we do better than well-reputed C++ graph libraries?
I have tried three graph libraries: Boost Graph Library (BGL) v1.51, COIN-OR Lemon v1.2.3, and Google Or-Tools checked out from svn on Sep 7th, 2012. They all have a Dijkstra implementation, even if I don’t know the implementation details. As a plus, the three libraries have Python wrappers (but I have not tested them). The BGL is a header-only library. Lemon came after BGL. BGL, Lemon, and my implementation use (different) Fibonacci heaps, while it is not clear to me which type of priority queue is used by Or-Tools.
Disclaimer: Google Or-Tools is much more than a graph library: among other things, it has a Constraint Programming solver with very nice features for Large Neighborhood Search; however, we are interested here only in its Dijkstra implementation. Constraint Programming will be the subject of a future post.
A few tests on instances taken from the last DIMACS challenge on Shortest Path problems show the pros and cons of each implementation. Three instances are generated using the rand graph generator, while 10 instances are road networks. The tests were run on my late 2008 MacBook Pro using the Apple gcc-4.2 compiler. All the source code, scripts, and even this post text, are available on github.
The first test compares the four implementations on 3 graphs with different density d that is the ratio . The graphs are:
For each graph, 50 queries between different pairs of source and destination nodes are performed. The table below reports the average query time (total time divided by the number of queries). The entries in bold highlight the shortest time per row.

| Graph | MyGraph | BGL | Lemon | Or-Tools |
|---|---|---|---|---|
| Rand 1 | **0.0052** | 0.0059 | 0.0074 | 1.2722 |
| Rand 2 | **0.0134** | 0.0535 | 0.0706 | 1.6128 |
| Rand 3 | **0.0705** | 0.5276 | 0.7247 | 4.2535 |

In these tests, it looks like my implementation is the winner… wow! However, the true winner is the boost::heap library, since the nasty implementation details are delegated to that library.
… but come on! These are artificial graphs: who is really interested in shortest paths on random graphs?
The second test uses road networks, which are very sparse graphs. We report only the average computation time in seconds over 50 different pairs of source-destination nodes. We decided to leave out Or-Tools since it does not perform well on very sparse graphs.
The table below shows the average query time for the standard implementations that use Fibonacci heaps.

| Area | nodes | arcs | MyGraph | BGL | Lemon |
|---|---|---|---|---|---|
| Western USA | 6,262,104 | 15,248,146 | 2.7215 | 2.7804 | 3.8181 |
| Eastern USA | 3,598,623 | 8,778,114 | 1.9425 | 1.4255 | 2.7147 |
| Great Lakes | 2,758,119 | 6,885,658 | 0.1808 | 0.8946 | 0.2602 |
| California and Nevada | 1,890,815 | 4,657,742 | 0.5078 | 0.5808 | 0.7083 |
| Northeast USA | 1,524,453 | 3,897,636 | 0.6061 | 0.5662 | 0.8335 |
| Northwest USA | 1,207,945 | 2,840,208 | 0.3652 | 0.3506 | 0.5152 |
| Florida | 1,070,376 | 2,712,798 | 0.1141 | 0.2753 | 0.1574 |
| Colorado | 435,666 | 1,057,066 | 0.1423 | 0.1117 | 0.1965 |
| San Francisco Bay | 321,270 | 800,172 | 0.1721 | 0.0836 | 0.2399 |
| New York City | 264,346 | 733,846 | 0.0121 | 0.0677 | 0.0176 |

From this table, BGL and my implementation are roughly on par, while Lemon trails behind on most instances.
What would happen if we used a different type of heap?
This second table shows the average query time for the Lemon graph library with a specialized Binary Heap implementation, and for my own implementation with generic 2-Heap and 3-Heap (binary and ternary heaps) and with a Skew Heap. Note that in order to use a different heap I just modify a single line of code.

| Area | nodes | arcs | 2-Heap | 3-Heap | Skew Heap | Lemon 2-Heap |
|---|---|---|---|---|---|---|
| Western USA | 6,262,104 | 15,248,146 | 1.977 | 1.934 | 2.104 | 1.359 |
| Eastern USA | 3,598,623 | 8,778,114 | 1.406 | 1.372 | 1.492 | 0.938 |
| Great Lakes | 2,758,119 | 6,885,658 | 0.132 | 0.130 | 0.135 | 0.109 |
| California and Nevada | 1,890,815 | 4,657,742 | 0.361 | 0.353 | 0.372 | 0.241 |
| Northeast USA | 1,524,453 | 3,897,636 | 0.433 | 0.421 | 0.457 | 0.287 |
| Northwest USA | 1,207,945 | 2,840,208 | 0.257 | 0.252 | 0.256 | 0.166 |
| Florida | 1,070,376 | 2,712,798 | 0.083 | 0.081 | 0.080 | 0.059 |
| Colorado | 435,666 | 1,057,066 | 0.100 | 0.098 | 0.100 | 0.064 |
| San Francisco Bay | 321,270 | 800,172 | 0.121 | 0.117 | 0.122 | 0.075 |
| New York City | 264,346 | 733,846 | 0.009 | 0.009 | 0.009 | 0.007 |

Mmmm… I am no longer the winner: COIN-OR Lemon is!
This is likely due to the specialized binary heap implementation of the Lemon library. The boost::heap library, instead, has a d_ary_heap, which for d=2 is a generic binary heap.
Dijkstra’s algorithm is so beautiful because it has the elegance of simplicity.
Using an existing efficient heap data structure, it is easy to implement an “efficient” version of the algorithm.
However, if you have spare time, or you need to solve shortest path problems on a specific type of graph (e.g., road networks), you might give existing graph libraries a try before investing development time in your own implementation. In addition, be sure to read [4] and the references therein.
All the code I have used to write this post is available on github. If you have any comment or criticism, do not hesitate to comment below.
[1] Pohl, I. Bidirectional and heuristic search in path problems. Department of Computer Science, Stanford University, 1969. [pdf]
[2] Sniedovich, M. Dijkstra’s algorithm revisited: the dynamic programming connexion. Control and Cybernetics, vol. 35(3), pages 599-620, 2006. [pdf]
[3] Dantzig, G.B. Linear Programming and Extensions. Princeton University Press, Princeton, NJ, 1962.
[4] Delling, D., Sanders, P., Schultes, D., and Wagner, D. Engineering route planning algorithms. Algorithmics of Large and Complex Networks, Lecture Notes in Computer Science, vol. 5515, pages 117-139, 2009. [doi]
[5] Goldberg, A.V. and Harrelson, C. Computing the shortest path: A* search meets graph theory. Proc. of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 156-165, 2005. [pdf]