
Branch and Bound Methods

• basic ideas and attributes

• unconstrained nonconvex optimization

• mixed convex-Boolean optimization



Methods for nonconvex optimization problems

• convex optimization methods are (roughly) always global, always fast

• for general nonconvex problems, we have to give up one

• local optimization methods are fast, but need not find global solution
(and even when they do, cannot certify it)

• global optimization methods find global solution (and certify it), but
are not always fast (indeed, are often slow)



Branch and bound algorithms

• methods for global optimization for nonconvex problems

• nonheuristic
– maintain provable lower and upper bounds on global objective value
– terminate with certificate proving ε-suboptimality

• often slow; exponential worst case performance

• but (with luck) can (sometimes) work well



Basic idea

• rely on two subroutines that (efficiently) compute a lower and an upper bound on the optimal value over a given region
– upper bound can be found by choosing any point in the region, or by
a local optimization method
– lower bound can be found from convex relaxation, duality, Lipschitz
or other bounds, . . .

• basic idea:
– partition feasible set into convex sets, and find lower/upper bounds
for each
– form global lower and upper bounds; quit if close enough
– else, refine partition and repeat



Unconstrained nonconvex minimization

goal: find global minimum of a function f : R^m → R over an m-dimensional rectangle Qinit, to within some prescribed accuracy ε

• for any rectangle Q ⊆ Qinit, we define Φmin(Q) = inf_{x∈Q} f(x)

• global optimal value is f⋆ = Φmin(Qinit)



Lower and upper bound functions

• we’ll use lower and upper bound functions Φlb and Φub, that satisfy, for
any rectangle Q ⊆ Qinit,

Φlb(Q) ≤ Φmin(Q) ≤ Φub(Q)

• bounds must become tight as rectangles shrink:

∀ε > 0 ∃δ > 0 ∀Q ⊆ Qinit, size(Q) ≤ δ =⇒ Φub(Q) − Φlb(Q) ≤ ε

where size(Q) is diameter (length of longest edge of Q)

• to be practical, Φub(Q) and Φlb(Q) should be cheap to compute
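as a concrete illustration (not from the slides), one classical choice uses a known Lipschitz constant of f on Qinit, here a hypothetical parameter lip: take f at the center of Q as the upper bound, and subtract the Lipschitz term over Q's half-diagonal for the lower bound — a minimal sketch in Python, with rectangles given by corner vectors lo, hi:

```python
import numpy as np

def phi_ub(f, lo, hi):
    # any point of Q gives an upper bound on the inf over Q; use the center
    center = (lo + hi) / 2.0
    return f(center)

def phi_lb(f, lo, hi, lip):
    # Lipschitz bound: f(x) >= f(c) - lip * ||x - c||_2 for all x in Q,
    # and ||x - c||_2 <= ||hi - lo||_2 / 2 (half the diagonal of Q)
    center = (lo + hi) / 2.0
    return f(center) - lip * np.linalg.norm(hi - lo) / 2.0
```

these satisfy Φlb(Q) ≤ Φmin(Q) ≤ Φub(Q), and their gap, lip · ||hi − lo||/2, goes to zero as the rectangle shrinks, as required above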



Branch and bound algorithm

1. compute lower and upper bounds on f⋆
• set L1 = Φlb(Qinit) and U1 = Φub(Qinit)
• terminate if U1 − L1 ≤ ε
2. partition (split) Qinit into two rectangles Qinit = Q1 ∪ Q2
3. compute Φlb(Qi) and Φub(Qi), i = 1, 2
4. update lower and upper bounds on f⋆
• update lower bound: L2 = min{Φlb(Q1), Φlb(Q2)}
• update upper bound: U2 = min{Φub(Q1), Φub(Q2)}
• terminate if U2 − L2 ≤ ε
5. refine partition, by splitting Q1 or Q2, and repeat steps 3 and 4



• can assume w.l.o.g. Ui is nonincreasing, Li is nondecreasing

• at each step we have a partially developed binary tree; children correspond to the subrectangles formed by splitting the parent rectangle

• leaves give the current partition of Qinit

• need rules for choosing, at each step
– which rectangle to split
– which edge (variable) to split
– where to split (what value of variable)

• some good rules: split rectangle with smallest lower bound, along longest edge, in half (see the sketch below)
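putting the algorithm and these rules together, a minimal sketch, assuming the hypothetical phi_lb / phi_ub subroutines sketched earlier; children whose lower bound exceeds the current best upper bound are discarded (the pruning step discussed below):

```python
import heapq
import itertools
import numpy as np

def branch_and_bound(f, lo, hi, lip, eps=1e-3, max_iter=10000):
    # leaves of the partition kept in a heap keyed on their lower bound
    tie = itertools.count()   # unique tiebreaker so the heap never compares arrays
    U = phi_ub(f, lo, hi)     # best objective value found so far
    heap = [(phi_lb(f, lo, hi, lip), next(tie), lo, hi)]
    for _ in range(max_iter):
        if not heap:
            break                              # every leaf pruned: U is optimal
        L = heap[0][0]                         # global lower bound (smallest leaf lb)
        if U - L <= eps:
            break
        _, _, qlo, qhi = heapq.heappop(heap)   # rectangle with smallest lower bound
        i = int(np.argmax(qhi - qlo))          # split along its longest edge
        mid = (qlo[i] + qhi[i]) / 2.0          # in half
        for a, b in ((qlo[i], mid), (mid, qhi[i])):
            clo, chi = qlo.copy(), qhi.copy()
            clo[i], chi[i] = a, b
            U = min(U, phi_ub(f, clo, chi))
            lb = phi_lb(f, clo, chi, lip)
            if lb <= U:                        # prune children that cannot be optimal
                heapq.heappush(heap, (lb, next(tie), clo, chi))
    return (heap[0][0] if heap else U), U      # L <= f* <= U
```

keying the heap on Φlb makes both the 'smallest lower bound' splitting rule and the global lower bound a single heap lookup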



Example

partitioned rectangle in R^2, and associated binary tree, after 3 iterations



Pruning

• can eliminate or prune any rectangle Q in tree with Φlb(Q) > Uk
– every point in rectangle is worse than current upper bound
– hence cannot be optimal

• does not affect algorithm, but does reduce storage requirements

• can track progress of algorithm via
– total pruned (or unpruned) volume
– number of pruned (or unpruned) leaves in partition



Convergence analysis
• number of rectangles in partition Lk is k (without pruning)

• total volume of these rectangles is vol(Qinit), so

min_{Q∈Lk} vol(Q) ≤ vol(Qinit)/k

• so for k large, at least one rectangle has small volume

• need to show that small volume implies small size

• this will imply that one rectangle has U − L small

• hence Uk − Lk is small



Bounding condition number

• condition number of rectangle Q = [l1, u1] × · · · × [lm, um] is

cond(Q) = max_i(ui − li) / min_i(ui − li)

• if we split rectangle along longest edge, we have

cond(Q) ≤ max{cond(Qinit), 2}

for any rectangle in partition

• other rules (e.g., cycling over variables) also guarantee a bound on cond(Q)



Small volume implies small size

vol(Q) = ∏_i (ui − li) ≥ max_i(ui − li) · (min_i(ui − li))^(m−1)
       = (2 size(Q))^m / cond(Q)^(m−1) ≥ (2 size(Q) / cond(Q))^m

and so size(Q) ≤ (1/2) vol(Q)^(1/m) cond(Q)

therefore if cond(Q) is bounded and vol(Q) is small, size(Q) is small



Mixed Boolean-convex problem

minimize f0(x, z)
subject to fi(x, z) ≤ 0, i = 1, . . . , m
zj ∈ {0, 1}, j = 1, . . . , n

• x ∈ R^p is called the continuous variable

• z ∈ {0, 1}^n is called the Boolean variable

• f0, . . . , fm are convex in x and z

• optimal value denoted p⋆

• for each fixed z ∈ {0, 1}^n, the reduced problem (with variable x) is convex
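as an illustration (not the slides' code), for a mixed Boolean LP — f0, fi affine, with hypothetical data c, d, A, B, b — the reduced problem for fixed z is an LP in x alone, easily written in cvxpy:

```python
import cvxpy as cp

def solve_reduced(z, c, d, A, B, b):
    # with z fixed, minimize c^T x + d^T z  s.t.  A x + B z <= b  is convex in x
    x = cp.Variable(A.shape[1])
    prob = cp.Problem(cp.Minimize(c @ x + d @ z), [A @ x + B @ z <= b])
    return prob.solve()   # +inf if infeasible for this choice of z
```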



Solution methods

• brute force: solve convex problem for each of the 2^n possible values of z ∈ {0, 1}^n
– possible for n ≤ 15 or so, but not n ≥ 20

• branch and bound
– in worst case, we end up solving all 2^n convex problems
– hope that branch and bound will actually work much better



Lower bound via convex relaxation
convex relaxation

minimize f0(x, z)
subject to fi(x, z) ≤ 0, i = 1, . . . , m
0 ≤ zj ≤ 1, j = 1, . . . , n

• convex with (continuous) variables x and z, so easily solved

• optimal value (denoted L1) is a lower bound on p⋆, the optimal value of the original problem

• L1 can be +∞ (which implies original problem infeasible)
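continuing the same hypothetical mixed Boolean LP data, the relaxation just promotes z to a continuous variable in [0, 1]^n — a sketch:

```python
import cvxpy as cp

def relaxation_bound(c, d, A, B, b, n):
    # relax z in {0,1}^n to 0 <= z <= 1; the result is an ordinary LP in (x, z)
    x = cp.Variable(A.shape[1])
    z = cp.Variable(n)
    cons = [A @ x + B @ z <= b, z >= 0, z <= 1]
    L1 = cp.Problem(cp.Minimize(c @ x + d @ z), cons).solve()
    return L1, z.value   # L1 = +inf certifies the original problem infeasible
```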



Upper bounds

• can find an upper bound (denoted U1) on p⋆ several ways

• simplest method: round each relaxed Boolean variable zi⋆ to 0 or 1
• more sophisticated method: round each Boolean variable, then solve the
resulting convex problem in x
• randomized method:
– generate random zi ∈ {0, 1}, with Prob(zi = 1) = zi⋆
– (optionally, solve for x again)
– take best result out of some number of samples
• upper bound can be +∞ (method failed to produce a feasible point)
• if U1 − L1 ≤ ε we can quit
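a sketch of the rounding and randomized upper bounds, with solve_reduced_z a hypothetical closure over the problem data (e.g., the solve_reduced sketch above with everything but z bound), and zstar the relaxed solution:

```python
import numpy as np

def upper_bound(zstar, solve_reduced_z, num_samples=10, seed=0):
    # solve_reduced_z(z) returns the optimal value of the convex problem in x
    # with z fixed, or +inf if that problem is infeasible
    rng = np.random.default_rng(seed)
    U1 = solve_reduced_z(np.round(zstar))            # round each z_i* to 0 or 1
    for _ in range(num_samples):                     # randomized rounding:
        z = (rng.random(zstar.shape) < zstar) * 1.0  # Prob(z_i = 1) = z_i*
        U1 = min(U1, solve_reduced_z(z))             # keep best result over samples
    return U1                                        # +inf if nothing was feasible
```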



Branching
• pick any index k, and form two subproblems

• first problem:

minimize f0(x, z)
subject to fi(x, z) ≤ 0, i = 1, . . . , m
zj ∈ {0, 1}, j = 1, . . . , n
zk = 0

• second problem:

minimize f0(x, z)
subject to fi(x, z) ≤ 0, i = 1, . . . , m
zj ∈ {0, 1}, j = 1, . . . , n
zk = 1



• each of these is a Boolean-convex problem, with n − 1 Boolean variables

• optimal value of original problem is the smaller of the optimal values of the two subproblems

• can solve convex relaxations of subproblems to obtain lower and upper bounds on optimal values



New bounds from subproblems

• let L̃, Ũ be lower, upper bounds for zk = 0

• let L̄, Ū be lower, upper bounds for zk = 1

• min{L̃, L̄} ≥ L1

• can assume w.l.o.g. that min{Ũ , Ū } ≤ U1

• thus, we have new bounds on p⋆:

L2 = min{L̃, L̄} ≤ p⋆ ≤ U2 = min{Ũ , Ū }



Branch and bound algorithm

• continue to form binary tree by splitting, relaxing, calculating bounds on subproblems

• convergence proof is trivial: cannot go more than 2^n steps before U = L

• can prune nodes with L exceeding current Uk

• common strategy is to pick a node with smallest L

• can pick variable to split several ways (a sketch follows below)
– ‘least ambivalent’: choose k for which zk⋆ is 0 or 1, with largest Lagrange multiplier
– ‘most ambivalent’: choose k for which |zk⋆ − 1/2| is minimum
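a minimal sketch of this search, assuming a hypothetical bounds(fixed) routine that solves the relaxation with the z entries listed in fixed pinned to 0 or 1 and returns (L, U, zstar):

```python
import heapq
import itertools

def boolean_bnb(bounds, n, eps=1e-6, max_iter=100000):
    tie = itertools.count()
    L0, U0, z0 = bounds({})                        # root node: nothing fixed
    if L0 == float('inf'):
        return L0, U0                              # relaxation infeasible: problem infeasible
    U, heap = U0, [(L0, next(tie), {}, z0)]
    for _ in range(max_iter):
        if not heap:
            break
        L, _, fixed, zstar = heapq.heappop(heap)   # node with smallest L
        if U - L <= eps:
            return L, U
        if L > U or len(fixed) == n:               # pruned, or all variables fixed
            continue
        # 'most ambivalent' rule: unfixed k with zstar_k closest to 1/2
        k = min((j for j in range(n) if j not in fixed),
                key=lambda j: abs(zstar[j] - 0.5))
        for val in (0.0, 1.0):                     # two subproblems: z_k = 0 and z_k = 1
            Lc, Uc, zc = bounds({**fixed, k: val})
            U = min(U, Uc)                         # global upper bound is nonincreasing
            if Lc <= U and Lc < float('inf'):      # drop pruned / infeasible children
                heapq.heappush(heap, (Lc, next(tie), {**fixed, k: val}, zc))
    return (heap[0][0] if heap else U), U          # L <= p* <= U
```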



Small example

nodes show lower and upper bounds for three-variable Boolean LP

[tree: root bounds [−0.143, ∞]; branching on z1 gives [∞, ∞] (z1 = 0) and [0.2, ∞] (z1 = 1); branching on z2 gives [∞, ∞] (z2 = 0) and [1, 1] (z2 = 1)]


Minimum cardinality example
find sparsest x satisfying linear inequalities

minimize card(x)
subject to Ax ⪯ b

equivalent to mixed Boolean-LP:

minimize 1^T z
subject to Li zi ≤ xi ≤ Ui zi, i = 1, . . . , n
Ax ⪯ b
zi ∈ {0, 1}, i = 1, . . . , n

with variables x and z and lower and upper bounds on x, L and U
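written directly in cvxpy (an illustration; with boolean=True cvxpy hands the problem to a mixed-integer solver rather than running the branch and bound developed in these slides):

```python
import cvxpy as cp

def min_cardinality(A, b, L, U):
    # mixed Boolean-LP:  minimize 1^T z  s.t.  L_i z_i <= x_i <= U_i z_i,  A x <= b
    n = A.shape[1]
    x = cp.Variable(n)
    z = cp.Variable(n, boolean=True)
    cons = [cp.multiply(L, z) <= x, x <= cp.multiply(U, z), A @ x <= b]
    prob = cp.Problem(cp.Minimize(cp.sum(z)), cons)
    prob.solve()   # needs a MIP-capable solver installed, e.g. GLPK_MI or SCIP
    return prob.value, x.value
```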



Bounding x
• Li is optimal value of LP

minimize xi
subject to Ax ⪯ b

• Ui is optimal value of LP

maximize xi
subject to Ax ⪯ b

• solve 2n LPs to get all bounds

• if Li > 0 or Ui < 0, we can just set zi = 1
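a sketch of the 2n bounding LPs in cvxpy (assuming {x : Ax ⪯ b} is bounded, so every value is finite):

```python
import cvxpy as cp
import numpy as np

def variable_bounds(A, b):
    # L_i = min x_i and U_i = max x_i over {x : A x <= b} -- 2n LP solves
    n = A.shape[1]
    x = cp.Variable(n)
    L, U = np.empty(n), np.empty(n)
    for i in range(n):
        L[i] = cp.Problem(cp.Minimize(x[i]), [A @ x <= b]).solve()
        U[i] = cp.Problem(cp.Maximize(x[i]), [A @ x <= b]).solve()
    return L, U
```

per the bullet above, any index with L[i] > 0 or U[i] < 0 can have zi fixed to 1 before any branching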



Relaxation problem

• relaxed problem is

minimize 1^T z
subject to Li zi ≤ xi ≤ Ui zi, i = 1, . . . , n
Ax ⪯ b
0 ≤ zi ≤ 1, i = 1, . . . , n

• (assuming Li < 0, Ui > 0) equivalent to

minimize ∑_{i=1}^n ((1/Ui)(xi)+ + (−1/Li)(xi)−)
subject to Ax ⪯ b

• objective is asymmetric weighted ℓ1-norm
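since the equivalent form eliminates z, each node's lower bound is a single LP in x; a sketch (assuming Li < 0 < Ui, as above):

```python
import cvxpy as cp

def relaxed_value(A, b, L, U):
    # minimize sum_i ( (1/U_i)(x_i)_+ + (-1/L_i)(x_i)_- )  s.t.  A x <= b,
    # where (x)_+ = max(x, 0) and (x)_- = max(-x, 0) elementwise
    x = cp.Variable(A.shape[1])
    obj = cp.sum(cp.multiply(1.0 / U, cp.pos(x)) + cp.multiply(-1.0 / L, cp.neg(x)))
    prob = cp.Problem(cp.Minimize(obj), [A @ x <= b])
    return prob.solve(), x.value
```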



A few more details

• relaxed problem is a well-known heuristic for finding a sparse solution, so we take card(x⋆) as our upper bound

• for lower bound, we can replace L from LP with ⌈L⌉, since card(x) is
integer valued

• at each iteration, split node with lowest lower bound

• split most ambivalent variable



Small example

• random problem with 30 variables, 100 constraints

• 2^30 ≈ 10^9

• takes 8 iterations to find a point with globally minimum cardinality (19)

• but, takes 124 iterations to prove minimum cardinality is 19

• requires 309 LP solves (including 60 to calculate lower and upper bounds on each variable)



Algorithm progress
tree after 3 iterations (top left), 5 iterations (top right), 10 iterations
(bottom left), and 124 iterations (bottom right)



Global lower and upper bounds

[plot: cardinality (12–20) vs. iteration (0–120), showing upper bound, lower bound, and ceil(lower bound)]


Portion of non-pruned sparsity patterns

[plot: portion of non-pruned Boolean values (10^0 down to 10^-2, log scale) vs. iteration (0–120)]


Number of active leaves in tree

[plot: number of leaves on tree (0–60) vs. iteration (0–120)]


Larger example

• random problem with 50 variables, 100 constraints

• 2^50 ≈ 10^15

• took 3665 iterations (1300 to find an optimal point)

• minimum cardinality 31

• same example as used in ℓ1-norm methods lecture
– basic ℓ1-norm relaxation (1 LP) gives x with card(x) = 44
– iterated weighted ℓ1-norm heuristic (4 LPs) gives x with card(x) = 36



Global lower and upper bounds

[plot: cardinality (15–40) vs. iteration (0–4000), showing upper bound, lower bound, and ceil(lower bound)]


Portion of non-pruned sparsity patterns

[plot: portion of non-pruned Boolean values (10^0 down to 10^-5, log scale) vs. iteration (0–4000)]


Number of active leaves in tree

[plot: number of leaves on tree (0–1600) vs. iteration (0–4000)]


Even larger example

• random problem with 200 variables, 400 constraints

• 2^200 ≈ 1.6 · 10^60

• we quit after 10000 iterations (50 hours on a single processor machine with 1 GB of RAM)

• only know that optimal cardinality is between 135 and 179

• but have reduced number of possible sparsity patterns by factor of 10^12



Global lower and upper bounds

[plot: cardinality (100–190) vs. iteration (0–5000), showing upper bound, lower bound, and ceil(lower bound)]


Portion of non-pruned sparsity patterns

[plot: portion of non-pruned Boolean values (10^0 down to 10^-12, log scale) vs. iteration (0–5000)]


Number of active leaves in tree

[plot: number of leaves on tree (0–1000) vs. iteration (0–5000)]
