
Branch and Bound Methods

Stephen Boyd and Jacob Mattingley


Notes for EE364b, Stanford University, Winter 2006-07
April 1, 2018

Branch and bound algorithms are methods for global optimization in nonconvex problems. They are nonheuristic, in the sense that they maintain a provable upper and lower bound on the (globally) optimal objective value; they terminate with a certificate proving that the suboptimal point found is ε-suboptimal. Branch and bound algorithms can be (and often are) slow, however. In the worst case they require effort that grows exponentially with problem size, but in some cases we are lucky, and the methods converge with much less effort. In these notes we describe two typical and simple examples of branch and bound methods, and show some typical results for a minimum cardinality problem.

1 Unconstrained nonconvex minimization


The branch and bound algorithm we describe here finds the global minimum of a function f : R^m → R over an m-dimensional rectangle Qinit, to within some prescribed accuracy ε. We let f⋆ denote the optimal value, i.e., f⋆ = inf_{x∈Qinit} f(x). For a rectangle Q ⊆ Qinit we define

Φmin(Q) = inf_{x∈Q} f(x),

so f⋆ = Φmin(Qinit).
The algorithm will use two functions Φlb (Q) and Φub (Q) defined for any rectangle Q ⊆
Qinit . These functions must satisfy the following conditions. First, they are lower and upper
bounds on Φmin (Q), respectively: for any Q ⊆ Qinit ,

Φlb (Q) ≤ Φmin (Q) ≤ Φub (Q).

The second condition is (roughly) that the bounds become tight as the rectangle shrinks to a point. More precisely, as the maximum half-length of the sides of Q, denoted by size(Q), goes to zero, the difference between the upper and lower bounds converges uniformly to zero, i.e.,

∀ε > 0 ∃δ > 0 ∀Q ⊆ Qinit: size(Q) ≤ δ =⇒ Φub(Q) − Φlb(Q) ≤ ε.

Figure 1: Branch and bound example in R2 , after 3 iterations. The partition of
the original rectangle is shown at left; the associated binary tree is shown at right.

We mention a third condition, which is not needed to prove convergence of the branch
and bound algorithm, but is needed in practice: The functions Φub (Q) and Φlb (Q) should
be cheap to compute.
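As a concrete example of bound functions satisfying these conditions (an illustration we add here, not part of the original notes), suppose f is Lipschitz continuous with a known constant G, i.e., |f(x) − f(y)| ≤ G‖x − y‖₂ for all x, y ∈ Qinit. Evaluating f at the center of Q gives an upper bound on Φmin(Q); subtracting the largest possible variation of f over Q gives a lower bound. The gap is G‖(u − l)/2‖₂ ≤ G√m size(Q), which goes to zero uniformly as size(Q) → 0, and each bound costs a single function evaluation.

import numpy as np

def phi_bounds(f, l, u, G):
    """Lower and upper bounds on the minimum of f over the rectangle
    Q = [l, u] (l, u are numpy arrays of corner coordinates), assuming f is
    G-Lipschitz in the 2-norm. One illustrative choice of (phi_lb, phi_ub)."""
    xc = (l + u) / 2.0                  # center of the rectangle
    r = np.linalg.norm((u - l) / 2.0)   # max distance from center to any x in Q
    fc = f(xc)                          # f(xc) >= min over Q: an upper bound
    return fc - G * r, fc               # fc - G*r <= f(x) for every x in Q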
We now describe the algorithm. We start by computing
L1 = Φlb(Qinit), U1 = Φub(Qinit),

which are lower and upper bounds on f⋆, respectively. If U1 − L1 ≤ ε, the algorithm terminates. Otherwise we partition Qinit into two rectangles Qinit = Q1 ∪ Q2, and compute Φlb(Qi) and Φub(Qi), i = 1, 2. Then we have new lower and upper bounds on f⋆,

L2 = min{Φlb(Q1), Φlb(Q2)} ≤ Φmin(Qinit) ≤ U2 = min{Φub(Q1), Φub(Q2)}.

If U2 − L2 ≤ ε, the algorithm terminates. Otherwise, we partition one of Q1 and Q2 into two rectangles, to obtain a new partition of Qinit into three rectangles, and compute Φlb and Φub for these new rectangles. We update the lower bound L3 as the minimum of the lower bounds over the partition of Qinit, and similarly for the upper bound U3.
At each iteration we split one rectangle into two, so after k iterations (counting the evaluation of Φlb(Qinit) and Φub(Qinit) as the first one), we have a partition of Qinit into k rectangles, Qinit = ∪_{i=1}^k Qi. This partition can be represented by a partially filled binary
tree, with the root node corresponding to the original rectangle, and the children of a node
corresponding to the subrectangles obtained by partitioning the parent rectangle.
This is illustrated in figure 1. The original rectangle corresponds to the root of the
tree. We first partition Qinit vertically. At the next iteration, we horizontally partition the
lefthand child; at the third iteration, we vertically partition the bottom child. After three
iterations, we have a partition of the original rectangle into 4 rectangles, which correspond
to the leaves in the tree.
The associated lower and upper bounds on f⋆ are

Lk = min_{i=1,...,k} Φlb(Qi),    Uk = min_{i=1,...,k} Φub(Qi).

We can assume that the lower and upper bounds of the pair of rectangles obtained by
splitting are no worse than the lower and upper bounds of the rectangle they were formed
from. This implies that Lk is nondecreasing, and Uk is nonincreasing.
To fully specify the algorithm, we need to give the rule for choosing which rectangle to split at each step, and we have to specify the edge along which the rectangle is to be split.
One standard method for choosing the rectangle in the current partition to be split is to
choose one with the smallest lower bound, i.e., a rectangle that satisfies Φlb (Q) = Lk . Once
we choose the rectangle to split, we split it along any of its longest edges.
The algorithm is summarized below.

k = 0;
𝓛0 = {Qinit};   (𝓛k denotes the list of rectangles in the current partition)
L0 = Φlb(Qinit);
U0 = Φub(Qinit);
while Uk − Lk > ε, {
    pick Q ∈ 𝓛k for which Φlb(Q) = Lk;
    split Q along one of its longest edges into QI and QII;
    form 𝓛k+1 from 𝓛k by removing Q and adding QI and QII;
    Lk+1 := min_{Q∈𝓛k+1} Φlb(Q);
    Uk+1 := min_{Q∈𝓛k+1} Φub(Q);
    k := k + 1;
}

The requirement that we split the chosen rectangle along a longest edge controls the
condition number of the rectangles in the partition, which in turn allows us to prove conver-
gence.
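The following is a minimal runnable Python sketch of the algorithm (our illustration, built on the Lipschitz-based phi_bounds above, which is an assumption rather than the only choice). A heap keeps the rectangle with the smallest lower bound at the front, and rectangles whose lower bound exceeds the global upper bound are discarded, which is the pruning step discussed below.

import heapq
import numpy as np

def branch_and_bound(f, l, u, G, eps=1e-3):
    """Minimize a G-Lipschitz f over the rectangle [l, u] to accuracy eps,
    returning (L, U) with U - L <= eps. Sketch only; uses phi_bounds above."""
    lb, ub = phi_bounds(f, l, u, G)
    heap = [(lb, l.tolist(), u.tolist())]    # partition, keyed by lower bound
    L, U = lb, ub
    while U - L > eps:
        _, lo, hi = heapq.heappop(heap)      # rectangle with smallest lower bound
        lo, hi = np.array(lo), np.array(hi)
        i = int(np.argmax(hi - lo))          # split along a longest edge
        mid = (lo[i] + hi[i]) / 2.0
        hi1 = hi.copy(); hi1[i] = mid        # first half
        lo2 = lo.copy(); lo2[i] = mid        # second half
        for ll, uu in ((lo, hi1), (lo2, hi)):
            lbi, ubi = phi_bounds(f, ll, uu, G)
            U = min(U, ubi)                  # global upper bound is nonincreasing
            if lbi <= U:                     # prune: lb > U cannot hold the optimum
                heapq.heappush(heap, (lbi, ll.tolist(), uu.tolist()))
        L = heap[0][0]                       # smallest lower bound in the partition
    return L, U

For example, branch_and_bound(lambda x: np.sin(4 * x[0]) + x[0] ** 2, np.array([-2.0]), np.array([2.0]), G=8.0) brackets the global minimum of a simple nonconvex function of one variable (here G = 8 bounds the derivative 4 cos 4x + 2x on [−2, 2]).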
As the algorithm proceeds, we can eliminate some rectangles from consideration; they
can be pruned, since Φmin (Qinit ) cannot be achieved in them. This is done as follows. At
each iteration, we eliminate from the list 𝓛k any rectangles that satisfy Φlb(Q) > Uk, since
every point in such a rectangle is worse than the current upper bound on f ⋆ .
Pruning does not affect the algorithm, since any rectangle pruned would never be selected
later for further splitting. But pruning can reduce storage requirements. With pruning, the
union of the list of current rectangles can be regarded as the set of possible minimizers of
f . The number of active rectangles, and their total volume, give some measure of algorithm
progress.
The term pruning comes from the following. The algorithm can be viewed as developing a
partially filled binary tree of rectangles representing the current partition 𝓛k, with the nodes
corresponding to rectangles and the children of a given node representing the two halves
obtained by splitting it. The leaves of the tree give the current partition. By removing a
rectangle from consideration, we prune the tree.

1.1 Convergence analysis
We first show that after a large enough number of iterations, the partition 𝓛k must contain
a rectangle of small volume. We then show that this rectangle has small size, and this in
turn implies that Uk − Lk is small.
The number of rectangles in the partition 𝓛k is k, and the total volume of these rectangles is vol(Qinit), so

min_{Q∈𝓛k} vol(Q) ≤ vol(Qinit)/k.    (1)
Thus, after a large number of iterations, at least one rectangle in the partition has small volume. Next, we show that small volume implies small size for a rectangle in any partition. We define the condition number of a rectangle Q = ∏_i [l_i, u_i] as

cond(Q) = max_i (u_i − l_i) / min_i (u_i − l_i).
Our splitting rule, which requires that we split rectangles along a longest edge, results in
an upper bound on the condition number of rectangles in our partition: for any rectangle Q ∈ 𝓛k,

cond(Q) ≤ max{cond(Qinit), 2}.    (2)
To prove this it is enough to show that when a rectangle Q is split into rectangles Q1 and
Q2 , we have
cond(Q1 ) ≤ max{cond(Q), 2}, cond(Q2 ) ≤ max{cond(Q), 2}.
Let νmax be the maximum edge length of Q, and let νmin be the minimum edge length of Q,
so cond(Q) = νmax /νmin . When Q is split into Q1 and Q2 , the maximum edge length of the
children is no more than νmax . The minimum edge length of the children, by our splitting
rule, is no smaller than the minimum of νmax /2 and νmin . It follows that the condition
number of the children is no more than the maximum of 2 and cond(Q).
We note that there are other splitting rules that also result in a uniform bound on the
condition number of the rectangles in any partition generated. One such rule is to cycle
through the index on which we split the rectangle. If Q was formed by splitting its parent
along the ith coordinate, then when we split Q, we split it along the (i + 1) modulo m
coordinate.
We can bound the size of a rectangle Q in terms of its volume and condition number, since

vol(Q) = ∏_i (u_i − l_i)
       ≥ max_i (u_i − l_i) · (min_i (u_i − l_i))^{m−1}
       = (2 size(Q))^m / cond(Q)^{m−1}
       ≥ (2 size(Q) / cond(Q))^m.

Thus,

size(Q) ≤ (1/2) cond(Q) vol(Q)^{1/m}.    (3)
Combining equations (1), (2), and (3) we get

min_{Q∈𝓛k} size(Q) ≤ (1/2) max{cond(Qinit), 2} (vol(Qinit)/k)^{1/m}.    (4)
Thus the minimum size of the rectangles in the partition 𝓛k converges to zero. By the second condition on the bound functions, there exists a δ such that any rectangle of size δ or smaller satisfies Φub(Q) − Φlb(Q) ≤ ε. Take k large enough that at least one rectangle in the partition, say Q, has size not exceeding δ, so that Φub(Q) − Φlb(Q) ≤ ε. Since a rectangle is split only when its lower bound equals the current global lower bound, if Q were chosen for splitting at some iteration j we would have Uj − Lj ≤ Φub(Q) − Φlb(Q) ≤ ε; hence the algorithm terminates no later than the iteration at which Q would be chosen.

2 Mixed Boolean-convex problems


In this section we consider another simple example of branch and bound, applied to a com-
binatorial problem. We consider the following problem:

minimize f0 (x, z)
subject to fi (x, z) ≤ 0, i = 1, . . . , m (5)
zj ∈ {0, 1}, j = 1, . . . , n,

where x and z are the optimization variables. The variables z1, . . . , zn are, naturally, called Boolean variables. We assume the functions fi, i = 0, . . . , m, are convex. This problem is called a mixed Boolean-convex problem. We denote the optimal value of this problem as p⋆.
One way to solve this problem is by exhaustive search. That is, we solve the 2^n convex optimization problems obtained by fixing z to each of its possible (Boolean) values, and then choose the smallest of the resulting optimal values. This involves solving a number of convex problems that is exponential in the size of the variable z. For n more than 30 or so, this is clearly not practical.
We will use a branch and bound method to solve this problem. In the worst case, we end up solving all 2^n convex problems, i.e., carrying out an exhaustive search. But with luck, this does not occur.
The convex relaxation
minimize f0 (x, z)
subject to fi (x, z) ≤ 0, i = 1, . . . , m (6)
0 ≤ zj ≤ 1, j = 1, . . . , n,

with (continuous) variables x and z, is convex, and therefore easily solved. Its optimal value,
which we denote L1 , is a lower bound on the optimal value of (5). This lower bound can be
+∞ (in which case the original problem is surely infeasible) or −∞.

We can also get an upper bound on p⋆ using the relaxation. The simplest method is to
round each of the relaxed Boolean variables zi to 0 or 1. A more sophisticated approach is
to first round each of the relaxed Boolean variables zi to 0 or 1, and then, with these values
of zi fixed, solve the resulting convex problem in the variable x.
The upper bound obtained from either of these methods can be +∞, if the method fails to produce a feasible point. (The upper bound can also be −∞, in which case the original problem is surely unbounded below.) We'll denote this upper bound by U1. Of course, if we have U1 − L1 ≤ ε, the required tolerance, we can quit.
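For the special case of a Boolean LP (the form used in the examples below), these two bounds are easy to compute with a modeling tool such as CVXPY. The sketch below is our illustration, not code from the original notes; it returns the relaxation lower bound and the rounding-based upper bound.

import cvxpy as cp
import numpy as np

def relaxation_bounds(c, A, b):
    """Bounds for the Boolean LP: minimize c'z s.t. Az <= b, z in {0,1}^n.
    Lower bound from the LP relaxation 0 <= z <= 1; upper bound by rounding
    the relaxed solution and keeping it only if it is feasible."""
    n = len(c)
    z = cp.Variable(n)
    L = cp.Problem(cp.Minimize(c @ z), [A @ z <= b, z >= 0, z <= 1]).solve()
    U = np.inf                            # +inf if rounding fails to be feasible
    if z.value is not None:
        zr = np.round(z.value)            # round each relaxed Boolean variable
        if np.all(A @ zr <= b + 1e-9):
            U = float(c @ zr)
    return L, U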
Now we are going to branch. Pick any index k, and form two problems: The first problem
is
minimize f0 (x, z)
subject to fi (x, z) ≤ 0, i = 1, . . . , m
zj ∈ {0, 1}, j = 1, . . . , n
zk = 0
and the second problem is

minimize f0 (x, z)
subject to fi (x, z) ≤ 0, i = 1, . . . , m
zj ∈ {0, 1}, j = 1, . . . , n
zk = 1.

In other words, we fix the value of zk to 0 in the first problem, and 1 in the second. We
call these subproblems of the original, since they can be thought of as the same problem,
with one variable eliminated or fixed. Each of these subproblems is also a mixed Boolean
convex problem, with n − 1 Boolean variables. The optimal value of the original problem is
the smaller of the optimal values of the two subproblems.
We now solve the two convex relaxations of these subproblems to obtain a lower and upper
bound on the optimal value of each subproblem. We’ll denote these as L̃, Ũ (for zk = 0)
and L̄, Ū (for zk = 1). Each of these two lower bounds must be at least as large as L1 , i.e.,
min{L̃, L̄} ≥ L1 . We can also assume, without loss of generality, that min{Ũ , Ū } ≤ U1 .
From these two sets of bounds, we obtain the following bounds on p⋆ :

L2 = min{L̃, L̄} ≤ p⋆ ≤ U2 = min{Ũ , Ū }.

By the inequalities above, we have U2 − L2 ≤ U1 − L1 .
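In code, forming a subproblem's relaxation just adds constraints fixing the chosen Boolean variables; here is a sketch continuing the hypothetical relaxation_bounds helper above.

import cvxpy as cp
import numpy as np

def subproblem_bounds(c, A, b, fixed):
    """Relaxation bounds for a subproblem of the Boolean LP, where `fixed`
    maps variable index -> value in {0, 1} for the already-fixed variables."""
    n = len(c)
    z = cp.Variable(n)
    cons = [A @ z <= b, z >= 0, z <= 1]
    cons += [z[k] == v for k, v in fixed.items()]   # fix chosen Booleans
    L = cp.Problem(cp.Minimize(c @ z), cons).solve()
    U = np.inf
    if z.value is not None:
        zr = np.round(z.value)                      # rounding-based upper bound
        if np.all(A @ zr <= b + 1e-9):
            U = float(c @ zr)
    return L, U

For example, the two subproblems obtained by branching on zk have bounds subproblem_bounds(c, A, b, {k: 0}) and subproblem_bounds(c, A, b, {k: 1}).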


At the next step, we can choose to split either of the subproblems. We need to split on an
index (not equal to k, the index used to split the original problem). We solve the relaxations
for the split subproblems (which have n − 2 Boolean variables), and obtain lower and upper
bounds for each.
At this point we have formed a partial binary tree of subproblems. The root is the
original problem; the first split yields two child subproblems, one with zk = 0 and the other
with zk = 1. The second iteration yields another two children of one of the original children.
We continue in this way. At each iteration, we choose a leaf node (which corresponds to a
subproblem, with some of the Boolean variables fixed to particular values), and split it, by

fixing a variable that is not fixed in the parent problem. An edge in the tree corresponds to
the value 0 or 1 of a particular variable zi . At the root node of the tree, none of the values of
the zi are specified. A node at depth k in the tree corresponds to a subproblem in which k of
the Boolean variables have fixed values. For each node, we have an upper and lower bound
on the optimal value of the subproblem. After k iterations, we have a tree with exactly k
leaves.
The minimum of the lower bounds, over all the leaf nodes, gives a lower bound on the
optimal value p⋆ ; similarly, the minimum of the upper bounds, over all the leaf nodes, gives
an upper bound on the optimal value p⋆ . We refer to these lower and upper bounds as Lk
and Uk , respectively. The global lower bound Lk is nondecreasing, and the global upper
bound Uk is nonincreasing, so we always have Uk+1 − Lk+1 ≤ Uk − Lk . We can terminate
the algorithm when Uk − Lk ≤ ε.
Proving convergence of the algorithm is trivial: it must terminate, with Uk = Lk, in fewer than 2^n iterations. To see this, note that if a leaf has depth n, then all the Boolean variables are fixed in its subproblem, so solving the convex relaxation gives the exact solution; in other words, for any leaf of depth n we have U = L. The worst that can happen is that we develop a complete binary tree, to depth n, which requires 2^n iterations, at which point every subproblem lower and upper bound is equal, and therefore the algorithm terminates. This is nothing more than exhaustive search.
We can carry out pruning, to save memory (but not time), as the algorithm progresses. At iteration k we have the upper bound Uk on p⋆. Any node in the tree whose lower bound is larger than Uk can be pruned, i.e., removed from the tree, since all points corresponding to that node are worse than the current best point found.
The choice of which leaf to split at a given stage in the algorithm is somewhat arbitrary.
The primary strategy is to split the leaf corresponding to the smallest lower bound. This
means we always get an improvement in the global lower bound. The drawback of this
scheme is that in some situations (such as the cardinality example in the next section) we
may achieve a tight lower bound before getting any optimal feasible points.
The same can be said for the choice of index to use to split a leaf. It can be any index
corresponding to a Boolean variable that has not yet been fixed. One heuristic is to split
along a variable k for which |zk⋆ − 1/2| is smallest, where zk⋆ is the optimal point found in
the relaxation. This corresponds to splitting a variable that is most ‘undecided’ (between
0 and 1) in the relaxation. In some problems, the opposite heuristic works well: we split a
variable that is most ‘decided’. To do this, we choose k for which |zk⋆ − 1/2| is maximum.
In most cases, many of the variables zk⋆ will be 0 or 1; among these, we can choose the one with the largest associated Lagrange multiplier. (The idea here is that this is the relaxed variable that is 0 or 1 and has the highest pressure to stay there.)
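A sketch of the 'most undecided' selection rule (the helper and its argument names are ours, not from the notes):

import numpy as np

def split_index(z_relaxed, fixed):
    """Return the index of the unfixed Boolean variable whose relaxed value
    is closest to 1/2 (the 'most undecided' variable)."""
    gap = np.abs(np.asarray(z_relaxed, dtype=float) - 0.5)
    for k in fixed:                    # never split an already-fixed variable
        gap[k] = np.inf
    return int(np.argmin(gap))

The 'most decided' rule is the same with argmax in place of argmin, again restricted to the unfixed variables.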

Figure 2: Simple three variable example. Each node shows its [lower bound, upper bound]: the root is [−0.143, ∞); the z1 = 0 child is [∞, ∞] (infeasible) and the z1 = 1 child is [0.2, ∞]; under the latter, the z2 = 0 child is [∞, ∞] (infeasible) and the z2 = 1 child is [1, 1].

2.1 Small example


We illustrate the algorithm with a small example, the Boolean LP

minimize 2z1 + z2 − 2z3


subject to 0.7z1 + 0.5z2 + z3 ≥ 1.8
zi ∈ {0, 1}, i = 1, 2, 3,

with variables z1, z2, and z3. This problem is small enough to solve by exhaustive enumeration: of the 8 possible Boolean vectors, exactly one, z = (1, 1, 1), is feasible, so it is the optimal point.
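A brute-force check of this claim (our illustration):

import itertools

best = None
for z in itertools.product([0, 1], repeat=3):        # all 8 Boolean vectors
    if 0.7 * z[0] + 0.5 * z[1] + 1.0 * z[2] >= 1.8:  # feasibility
        obj = 2 * z[0] + z[1] - 2 * z[2]
        if best is None or obj < best[0]:
            best = (obj, z)
print(best)   # (1, (1, 1, 1)): only z = (1, 1, 1) is feasible, objective 1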
Figure 2 shows the binary tree built by the branch and bound algorithm, which terminates after three iterations (counting the root relaxation as the first), having solved a total of 5 LP relaxations. Each node shows the lower and
upper bound, and each edge shows the variable that is fixed, and its value. At the first node
we solve the relaxed problem and obtain the lower bound L1 = −0.143. Rounding the zi⋆
found in the relaxation does not, however, yield a feasible solution, so the upper bound is
U1 = ∞.
Setting z1 = 0 we find that the relaxed problem is infeasible, so we have L = ∞ (and
therefore U = ∞). At this point we have determined that, if there is any solution, it must
satisfy z1 = 1. Now we explore the z1 = 1 branch. Solving the associated relaxed LP
we obtain the new lower bound 0.2, an improvement over the previous global lower bound
−0.143. Once again, when we round zi⋆ , we get an infeasible point, so the upper bound at
this node is still ∞. At this point, two iterations into the algorithm, having solved a total
of 3 LPs, we find that L2 = 0.2 and U2 = ∞. (This means that so far, we are not even sure
the problem is feasible.)
Now we split on the variable z2 . Setting z2 = 1 we get a relaxed optimal value of 1. In
the relaxed problem, we have z3⋆ = 1, so (even without rounding) we obtain a feasible point
with objective 1. Thus, the lower and upper bounds at this node are both 1. When we set
z2 = 0, the relaxed problem is infeasible. For this node, then, we have L = U = ∞.

After this third iteration we have
L3 = min{∞, ∞, 1} = 1, U3 = min{∞, ∞, 1} = 1,
which means we have solved the problem globally.
Of course for a simple problem like this we could have simply checked the 8 possible
Boolean vectors for feasibility, and evaluated the objective function of those that are feasible.
This example was meant only to illustrate the branch and bound method.

3 Minimum cardinality example


We consider the problem of finding the sparsest solution of a set of linear inequalities,

minimize card(x)
subject to Ax ⪯ b,    (7)

where card(x) denotes the number of nonzero entries of x. This problem is equivalent to the mixed Boolean-LP

minimize 1^T z
subject to Li zi ≤ xi ≤ Ui zi, i = 1, . . . , n
          Ax ⪯ b    (8)
          zi ∈ {0, 1}, i = 1, . . . , n,

with variables x and z, and where L and U are, respectively, lower and upper bounds on x.
We can find the lower bounds Li by solving the n LPs

minimize xi
subject to Ax ⪯ b,

for i = 1, . . . , n. (The optimal values of these LPs give L1, . . . , Ln.) We can find the upper bounds U1, . . . , Un by solving the n LPs

maximize xi
subject to Ax ⪯ b,

for i = 1, . . . , n.
If any Li is positive, then any feasible x satisfies xi > 0, and it follows that zi = 1. In this case, we can remove the variable zi (since we already know its value). In a similar way, if any Ui is negative, then we must have zi = 1, and we can simply remove zi. So we'll assume that Li ≤ 0 and Ui ≥ 0.
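The 2n bound LPs are straightforward to set up; here is a sketch using scipy.optimize.linprog (our illustration, assuming the polyhedron {x : Ax ⪯ b} is nonempty and bounded, so every LP has a finite optimal value):

import numpy as np
from scipy.optimize import linprog

def variable_bounds(A, b):
    """Compute Li = min xi and Ui = max xi over {x : Ax <= b} by solving 2n
    LPs. Assumes the polyhedron is nonempty and bounded (finite optima)."""
    m, n = A.shape
    L, U = np.zeros(n), np.zeros(n)
    for i in range(n):
        c = np.zeros(n)
        c[i] = 1.0
        # linprog minimizes, and its default bounds are x >= 0, so the
        # variables must be declared free explicitly
        lo = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * n)
        hi = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None)] * n)
        L[i], U[i] = lo.fun, -hi.fun
    return L, U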
We'll solve the mixed Boolean-convex problem (8) using the general method described in §2. The relaxation is

minimize 1^T z
subject to Li zi ≤ xi ≤ Ui zi, i = 1, . . . , n
          Ax ⪯ b    (9)
          0 ≤ zi ≤ 1, i = 1, . . . , n,

which is equivalent to

minimize Σ_{i=1}^n ((1/Ui)(xi)+ + (−1/Li)(xi)−)
subject to Ax ⪯ b

(assuming Li < 0 and Ui > 0 for i = 1, . . . , n; here (xi)+ = max{xi, 0} and (xi)− = max{−xi, 0}). The objective here is an asymmetric weighted ℓ1-norm. Minimizing the ℓ1-norm over a convex set is a well known heuristic for finding a sparse solution. We can improve on the lower bound found by the relaxation: since card(x) is integer valued, if we know card(x) ≥ l, we know that card(x) ≥ ⌈l⌉, the smallest integer greater than or equal to l. (For example, if we know that card(x) ≥ 23.4, then in fact we know that card(x) ≥ 24.)
We find an upper bound on the mixed Boolean-LP (8) by simply recording the cardinality
of the x obtained in the relaxed problem (9).
At each iteration we split a node with the lowest lower bound. This is the best strategy when, as in the examples below, an optimal point is found early and most of the remaining effort goes into raising the lower bound. Less clear is the choice of variable to split on; empirical results suggest that splitting along the 'most ambiguous' variable in the relaxation, i.e., zj with j = argmin_i |zi − 1/2|, is an efficient strategy.
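A sketch of the root-node bound computation in CVXPY (our illustration; it assumes problem (7) is feasible with Li < 0 < Ui for all i, and uses a small tolerance to decide which entries count as nonzero):

import cvxpy as cp
import numpy as np

def cardinality_bounds(A, b, L, U, tol=1e-6):
    """Root-node bounds for problem (8): lower bound from the weighted l1
    relaxation (9), rounded up since card(x) is integer valued; upper bound
    from the cardinality of the relaxed x. Assumes feasibility and L < 0 < U."""
    n = A.shape[1]
    x = cp.Variable(n)
    w_pos, w_neg = 1.0 / U, -1.0 / L            # valid when L < 0 < U
    obj = cp.sum(cp.multiply(w_pos, cp.pos(x)) + cp.multiply(w_neg, cp.neg(x)))
    val = cp.Problem(cp.Minimize(obj), [A @ x <= b]).solve()
    lower = int(np.ceil(val - tol))             # card(x*) >= ceil(relaxed value)
    upper = int(np.sum(np.abs(x.value) > tol))  # card of the relaxed point
    return lower, upper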

3.1 Numerical results


A small example
We first give the results of the branch and bound algorithm for a randomly generated problem
of the form (7), with x ∈ R30 and 100 constraints. It takes 124 iterations, and the solution
of a total of 309 LPs (including the 60 LPs required to compute the lower and upper bounds
on each variable) to find that the minimum cardinality is 19.
Figure 3 shows the tree generated by the algorithm at four different iterations. At top
left is the tree after three iterations, with lower bound 12.6 and upper bound 21. At top
right is the tree after five iterations, with lower bound 13.1 and upper bound 20. Note that
there is a pruned node, shown with a dotted line. It had a lower bound above 20, the global
upper bound at that point, and so could be eliminated from the tree. At bottom left, after
10 iterations, the lower bound is 13.9 and upper bound 19. Finally, the plot at bottom right
shows the tree at the end of the algorithm, with lower bound 18.04 and upper bound 19.
Figure 4 shows the evolution of the global lower and upper bounds. We can see, for
example, that a point with globally minimum cardinality is obtained after only 8 iterations;
however, it takes around 100 more iterations to guarantee that this point has minimum
cardinality.
Figure 5 shows the portion of non-eliminated (non-pruned) sparsity patterns, versus iteration number. Each node at level i, whether pruned or active, corresponds to 2^{n+1−i} possible values of the Boolean variable z. Near termination, the fraction of non-pruned Boolean values is around 1%, which means we have eliminated around 99% of the possible Boolean values. However, the total number of Boolean values is 2^30 ≈ 10^9, so 1% still represents around 10^7 possible values for z. This means that, right before the algorithm terminates, there are still

Figure 3: Tree for the 30 variable problem, after 3 iterations (top left), 5 iterations (top right), 10 iterations (bottom left), and 124 iterations (bottom right).

Figure 4: Bounds with 30 variable cardinality example: upper bound, lower bound, and ceil(lower bound) versus iteration.

Figure 5: Portion of solutions remaining possible (not eliminated) versus iteration, 30 variables.

Figure 6: Number of leaves on tree versus iteration, 30 variables.

10^7 possible values of z ‘in play’.
Figure 6 shows the total number of active (non-pruned) leaves in the tree, which is the size of the partition, versus iteration number. After k iterations there can be at most k leaves (which occurs only if no pruning takes place). So we can see that in this example, we have typically pruned around half of the leaves.
For this example, direct enumeration of all sparsity patterns is impractical (or at least, very slow compared to branch and bound), since it would require the solution of 2^30 ≈ 10^9 LPs. We must count ourselves lucky to have solved the problem with (provable) global optimality by solving only a few hundred LPs.

Figure 7: Bounds with 50 variable example: upper bound, lower bound, and ceil(lower bound) versus iteration.

Larger example
We show the same plots for a larger example with 50 variables. It takes 3665 iterations to
find the optimal cardinality of 32. This took approximately 20 minutes on a Linux system
with an Intel Core 2 Duo 1.67 GHz and 2 GB of RAM. Solving the problem by brute
force would, in comparison, involve, as a minimum, eliminating all sparsity patterns with
cardinality 31. In this example, we know that six of the variables cannot possibly be zero,
44

so we need to eliminate 31 > 1010 sparsity patterns by solving an LP for each. This is
obviously effectively impossible.
This is the same example used in the lecture on ℓ1 -norm methods for cardinality problems.
The simple ℓ1 -norm heuristic yields a point with cardinality 44 (by solving one LP); using
the iterated ℓ1 -norm heuristic, we obtain a point with cardinality 36 (by solving 4 LPs).
The branch and bound method does not produce a better point than this until around 480
iterations (which requires the solution of 1061 LPs), when a point with cardinality 35 is
found. It takes around 1200 iterations before an optimal point is found, and another 2500
iterations after that to prove that it is optimal.

Figure 8: Portion of solutions remaining possible (not eliminated) versus iteration, 50 variables.

Figure 9: Number of leaves on tree versus iteration, 50 variables.

Figure 10: Bounds with 200 variable cardinality example: upper bound, lower bound, and ceil(lower bound) versus iteration.

An even larger example


In the interest of honesty, we show an example where branch and bound fails (in a practical
sense). The problem to be solved here is a randomly generated problem instance of the
form (7) with 200 variables. Figures 10, 11, and 12 show partial results from running this
algorithm for 50 hours on a Linux system with an Intel Pentium M 1.7 GHz and 1 GB of
RAM. After solving around 10000 LPs, we do not know what the optimal cardinality is; we only know that it lies between 135 and 179.
You could, however, report terrific progress to your boss, by pointing out that in 50 hours, you have successfully reduced the total number of possible sparsity patterns (i.e., values of the Boolean variable z) by a factor of 10^12.
On the other hand, 2^200 ≈ 1.6 · 10^60, so that still leaves a rather substantial number, 1.6 · 10^48, of sparsity patterns in play. If your boss points this out, you can plead the NP-hard defense.

Figure 11: Portion of solutions remaining possible (not eliminated) versus iteration, 200 variables.

Figure 12: Number of leaves on tree versus iteration, 200 variables.

Acknowledgments
Arpita Ghosh and Alessandro Magnani helped with an earlier version of this set of notes.
