
Discussiones Mathematicae

Differential Inclusions, Control and Optimization 32 (2012) 5–44


doi:10.7151/dmdico.1140

"ON THE SHOULDERS OF GIANTS"


A BRIEF EXCURSION INTO THE HISTORY OF
MATHEMATICAL PROGRAMMING1

Rainer Tichatschke

Department of Mathematics
University of Trier, 54286 Trier, Germany

Abstract

Like many mathematical fields, the topic of mathematical programming has its origin in applied problems. But, in contrast to other branches of mathematics, we don't have to dig too deeply into the past centuries to find its roots. The historical tree of mathematical programming, starting from its conceptual roots to its present shape, is remarkably short, and to quote Isaac Newton, we can say:

"We are standing on the shoulders of giants".

The goal of this paper is to describe briefly the historical growth of mathematical programming from its beginnings to the seventies of the last century and to review its basic ideas for a broad audience. In the process we will demonstrate that optimization is a natural way of thinking which follows certain extremal principles.
Keywords: history, mathematical programming.
2010 Mathematics Subject Classification: 01A99, 90C25.

1. The Giants

Let us start with Leonhard Euler.

1 Part of a lecture held at the University of Trier on the occasion of the Year of Mathematics 2008 in Germany.

Leonhard Euler (1707–1783)

1727: Euler was appointed professor at the University of Saint Petersburg, Russia (on the recommendation of Daniel Bernoulli).

Member of St. Petersburg's Academy of Science and, from 1741, member of the Prussian Academy of Science, Berlin.

1744: Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes, sive solutio problematis isoperimetrici latissimo sensu accepti.
(Method to find curves possessing some property in the largest or smallest degree, or the resolution of the isoperimetric problem considered in the broadest sense.)
In this work he established variational analysis in a systematic way.

He is the most productive mathematician of all time (his collected works comprise 72 volumes), and he was one of the first to grasp the importance of optimization. He wrote [33]: "Whatever human paradigm is manifest, it usually reflects the behavior of maximization or minimization. Hence, there is no doubt at all that natural phenomena can be explained by means of the methods of maximization or minimization."
It is not surprising that optimization appears as a natural thought pattern. For thousands of years human beings have sought solutions to problems which require minimal effort and/or maximal revenue. This approach has contributed to the growth of all branches of mathematics. Moreover, the thought of optimizing something has nowadays entered many disciplines of science.
Back to Euler. He made important contributions to the field of optimization in both theory and methods. His characterization of optimal solutions, i.e., the description of necessary optimality conditions, founded variational analysis. This topic treats problems where one or more unknown functions are sought such that some definite integral, depending on the chosen functions, attains its largest or smallest value:
(1)   ∫_{t_0}^{t_1} L(y(t), y′(t), t) dt → min!,   y(t_0) = a,  y(t_1) = b.

A famous example is the Brachistochrone problem:

Problem: Find the path (curve) along which a mass point moves in shortest time under the influence of gravity from point A = (0, 0) to point B = (a, b):

J(y) := ∫_0^a √( (1 + y′²(x)) / (2g y(x)) ) dx → min!,   y(0) = 0,  y(a) = b.

”On the Shoulders of Giants” A brief excursion into the ... 7

This problem had already been formulated in 1696 by Johann Bernoulli, and it is known that he constantly quarreled with his brother Jacob Bernoulli, who found the correct solution to this problem but was unable to prove it. In 1744 Euler answered this question by proving the following theorem.

Theorem. Suppose y = y(t), t_0 ≤ t ≤ t_1, is a C²-solution of the minimization problem (1); then the (Euler) equation holds:

d/dt L_{y′} − L_y = 0.

In the case of the Brachistochrone this equation takes the particular form (because L does not depend explicitly on t):

d/dt ( y′ L_{y′} − L ) = 0.

Solving this differential equation, one obtains the sought solution as an arc of a cycloid.

Solution: Cycloid
x(t) = c_1 + c(t − sin t);  y(t) = c(1 − cos t),   0 ≤ t ≤ t*.
The constants c, c_1 and t* are determined by the boundary conditions.

The cycloid describes the behavior of a tautochrone, meaning that a mass point
(x(t), y(t)) sliding down a tautochrone-shaped frictionless wire will take the same
amount of time to reach the bottom no matter how high or low the release point
is. In fact, since a tautochrone is also a brachistochrone, the mass point will take
the shortest possible time to reach the bottom out of all possible shapes of the
wire.
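As a hedged numerical aside (not part of the original exposition, all parameter values invented), problem (1) can be attacked by exactly the discrete-approximation idea attributed to Euler below: replace the unknown curve by its values on a grid and minimize the discretized brachistochrone functional.

import numpy as np
from scipy.optimize import minimize

# Minimal sketch: discretize J(y) = int_0^a sqrt((1 + y'^2)/(2 g y)) dx
# on a uniform grid and minimize over the interior node values.
g, a, b, n = 9.81, 1.0, 1.0, 50                  # endpoint B = (a, b), n interior nodes
x = np.linspace(0.0, a, n + 2)
h = x[1] - x[0]

def J(y_inner):
    y = np.concatenate(([1e-6], y_inner, [b]))   # boundary values (y(0) regularized)
    dy = np.diff(y) / h                          # difference quotient for y'
    ymid = 0.5 * (y[:-1] + y[1:])                # midpoint values of y
    return np.sum(np.sqrt((1.0 + dy**2) / (2.0 * g * ymid)) * h)

y0 = np.linspace(1e-3, b, n + 2)[1:-1]           # straight line as starting guess
res = minimize(J, y0, method='L-BFGS-B', bounds=[(1e-6, None)] * n)
print("approximate travel time:", res.fun)       # close to the cycloid's optimal time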
Euler was also one of the first to use methods of discrete approximation for solving variational problems. With this method he solved, for instance, the well-known Isoperimetric problem:
Problem: Among all closed curves K of length L, find the one which maximizes the enclosed area F.

Solution: K – the circle with circumference L.

In today's language of optimization this problem can be considered as a maximization problem subject to a constraint, because the length L is understood as a restriction.
More than 200 years later C. Carathéodory (1873–1950) described Euler's variational analysis as "one of the most beautiful mathematical works ever written" [13].

Joseph Louis Lagrange (1736–1813)

1755: Professor of mathematics at the Royal Artillery School in Turin.

1757: One of the founders of the Academy of Science in Turin.

1766: Director of the Prussian Academy of Science in Berlin and successor of Euler.

Completed the edifice of Newton's mechanics; worked also in celestial mechanics, algebra and number theory.

1762: Multivariable variational analysis.

1788: Méchanique analytique.

In 1762 Lagrange simplified Euler's derivation of the necessary optimality conditions and was able to generalize these conditions (the so-called Euler-Lagrange equation) to multivariate functions [70, 71]. His starting point was the equations of motion in mechanics. Dealing with the movement of mass points on curves or surfaces, one has to add to Newton's equation so-called constraint forces to keep the points on the curve or surface. This apparatus is rather clumsy. Following the ingenious idea of Lagrange it became much more elegant – by inserting a suitable system of coordinates – to eliminate all the constraints completely. Newton's equation of mechanics (second law: a = F/m, i.e., the acceleration a of a body is parallel and directly proportional to the net force F and inversely proportional to the mass m) cannot be translated to more sophisticated physical theories like electrodynamics, general relativity, the theory of elementary particles etc. But the Lagrange approach can be generalized to all field theories in physics. The corresponding variational description is Hamilton's principle of stationarity, named after William Rowan Hamilton (1805–1865). It proves to be an extremal principle and describes a generalization of different physical observations. In 1746 Pierre Louis Maupertuis was the first to discuss a universally valid principle of nature behaving extremally or optimally. For instance, a rolling ball locally always follows the steepest descent; a temperature difference in a body creates a thermal flow in the direction of the lowest temperature; and a ray of light shining through different media always takes the path of shortest time (Fermat's principle).
Euler and Lagrange contributed essentially to the mathematical formulation of these thoughts. Carl Gustav Jacob Jacobi (1804–1851) wrote in this respect: "While Lagrange was going to generalize Euler's method of variational analysis, he observed how one can describe in one line the basic equation for all problems of analytical mechanics." [49].
Lagrange principle:

min {f(x) : g(x) = 0}   ⇝   L(x, λ) := f(x) + λ g(x) → min over (x, λ) ∈ IR^{n+1}

[Figure: level lines of f and the constraint curve g(x) = 0, touching tangentially at x*, where ∇f(x*) and ∇g(x*) are collinear.]

By crossing the level lines f(x) = const the value of the objective function f changes; it becomes (locally) extremal if the constraint curve touches such a level line tangentially at x*, i.e., the tangents of both curves coincide at x*; hence their normal vectors ∇f and ∇g are collinear at x*.

Euler-Lagrange formalism: x* is a solution ⇒ ∃ λ* such that

L_x(x*, λ*) = 0 ⇔ ∇f(x*) + λ* ∇g(x*) = 0,
L_λ(x*, λ*) = 0 ⇔ g(x*) = 0.
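A hedged toy computation (my own example, not from the paper) shows the formalism at work: minimizing f(x, y) = x² + y² subject to g(x, y) = x + y − 1 = 0, the stationarity conditions of the Lagrange function reproduce exactly the two equations above.

import sympy as sp

# Minimal sketch of the Euler-Lagrange formalism for an equality constraint.
x, y, lam = sp.symbols('x y lambda', real=True)
f = x**2 + y**2
g = x + y - 1
L = f + lam * g                          # Lagrange function L(x, y, lambda)

# L_x = 0, L_y = 0 and L_lambda = 0 (the latter recovers g = 0).
sols = sp.solve([sp.diff(L, x), sp.diff(L, y), sp.diff(L, lam)],
                [x, y, lam], dict=True)
print(sols)                              # [{x: 1/2, y: 1/2, lambda: -1}]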

The description of many physical problems has been simplified by Lagrange's formalism. Today it is a classical tool in optimization and finds application wherever extrema subject to equality constraints have to be calculated.
Probably this is the right place to mention that the Euler-Lagrange equations are necessary conditions for a curve or a point to be optimal. However, in using these conditions, historically many errors were made, which gave rise to mistakes for decades.
It is as in Perron's paradox:

Let N be the largest positive integer. Then for N ≠ 1 it holds that N² > N, contradicting that N is the largest integer.
Conclusion: N = 1 is the largest integer.


Implications like the above are devastating; nonetheless they were often made. For instance, in elementary algebra in ancient Greece, problems were solved beginning with the phrase: "Let x be the sought quantity".
In variational analysis the Euler equation belongs to the so-called necessary conditions. It has been obtained by the same pattern of argumentation as in Perron's paradox. The basic assumption that there exists a solution is used for calculating a solution whose existence is only postulated. However, in the class of problems where this basic assumption holds true, no harm is done. But how do we know that a concrete problem belongs exactly to this class? The so-called necessary condition does not answer this question. Therefore, a "solution" obtained by these necessary Euler conditions is still not a solution, but only a candidate for being a solution.
It is surprising that such an elementary point of logic went unnoticed for a long time. The first to criticize the Euler-Lagrange method was Karl Weierstrass (1815–1897), almost one century later. Even Georg Friedrich Bernhard Riemann (1826–1866) made the same unjustified assumption in his famous Dirichlet principle (cf. [39]).
While at that time the resolution of several types of equations was a central topic in mathematics, one was mainly interested in finding unique solutions. Solving inequalities aroused only marginal interest; in particular, algorithmic methods for solving inequalities played almost no role.
Fourier [38] was one of the first to describe a systematic elimination method for solving linear inequalities, similar – but in its realization much more complicated – to Gaussian elimination, which had already been known to the Chinese many centuries earlier, of course without Carl Friedrich Gauss's (1777–1855) knowledge.
Jean Baptiste Joseph de Fourier (1768–1830)

1797: Professor of analysis and mechanics at the École Polytechnique, Paris, successor of Lagrange.

1822: Théorie analytique de la chaleur.
(Analytical theory of heat.)

First systematic foundation of (Fourier) series and (Fourier) integrals for solving differential equations.
His name is among those engraved on the Eiffel Tower in Paris.

He was a very practical-minded man. In 1802 Napoleon appointed him prefect of the department of Isère in the south of France. In this position he had to drain the marshes near Lyon. In 1815 Napoleon (after his return from the island of Elba) installed him as prefect of the department of Rhône. He later served as permanent secretary of the French Academy of Science.
Among the few who worked with inequality systems was Farkas, born near Klausenburg (nowadays Cluj-Napoca, Romania). He investigated linear inequalities in mechanics and studied theorems of the alternative [34].
Some 40 years later these results proved to be very helpful in the geometry of polyhedra and in the duality theory of linear programming.

Julius Farkas (1847–1930)

1887: Professor in Kolozsvár (Romania).

1902: Grundsatz der einfachen Ungleichungen, J. f. Reine und Angew. Math. 124, 1–27.

Theorem: Given A ∈ IR^{m×n}, b ∈ IR^m. Then

{x ∈ IR^n : Ax ≤ b, x ≥ 0} ≠ ∅ ⇔ {u ∈ IR^m : u ≥ 0, A^T u ≥ 0, u^T b < 0} = ∅,

i.e., of these two linear inequality systems exactly one is solvable.
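As a hedged numerical illustration (data invented), one can test the two alternatives with an LP solver: the first system is a plain feasibility problem, while a solution of the second system exists exactly when minimizing u^T b over the cone {u ≥ 0, A^T u ≥ 0} is unbounded below.

import numpy as np
from scipy.optimize import linprog

# Example data; by Farkas' theorem exactly one test below succeeds.
A = np.array([[1.0, 2.0], [-1.0, 1.0]])
b = np.array([4.0, 1.0])

# System 1: feasibility of Ax <= b, x >= 0 (zero objective suffices).
p = linprog(c=np.zeros(2), A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
print("primal system solvable:", p.status == 0)

# System 2: minimize b^T u over {u >= 0, A^T u >= 0}; if a u with
# b^T u < 0 existed, this LP would be unbounded (status 3).
d = linprog(c=b, A_ub=-A.T, b_ub=np.zeros(2), bounds=[(0, None)] * 2)
print("alternative system solvable:", d.status == 3)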

In connection with linear inequality systems Minkowski has also to be named, who used linear inequalities in his remarkable geometry of numbers and, together with Hermann Weyl (1885–1955), developed the structural description of polyhedra [81].
Hermann Minkowski (1864–1909)

1892: Assistant professor at the University of Bonn.
1894: Professor at the University of Königsberg, and from 1896 at the Polytechnikum Zürich, where Albert Einstein was one of his students.

• Geometry of numbers.
• Geometrization of the special relativity theory.
• Theory of convex bodies.

Theorem: Let P be a polyhedral set, L_P its lineality space and P′ = P ∩ L_P^⊥. Denote by S = {x^1, ..., x^q} the extremal points and by T = {y^1, ..., y^r} the extremal rays of P′. Then

P = L_P + conv(S) + cone(T).

To the roots of the theory of optimization belong also the works of Chebyshev, better known for his contributions to approximation theory.
Pafnuti Lvovich Chebyshev (1821–1894)

1850: Associate professor at St. Petersburg University.

1860: Professor there.

Basic contributions to probability theory, number theory and approximation theory.

Chebyshev problem:

min_x max_{t ∈ T} | a(t) − Σ_i x_i f_i(t) |.

In the simplest version of such a continuous approximation problem one looks for the uniform approximation of a given continuous curve a(t) by a system of linearly independent functions f_i(t). In today's terminology one would say we are dealing with a non-smooth convex minimization problem, or more exactly with a semi-infinite problem. Hence Chebyshev can be regarded as one of the first to consider this kind of optimization problem. For some special cases he found analytic solutions, known as Chebyshev polynomials.
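A hedged sketch (discretization, basis and data are my own choices) exposes the semi-infinite structure: restricting t to a finite grid turns the Chebyshev problem into a linear program in (x, s), namely min s subject to −s ≤ a(t_j) − Σ_i x_i f_i(t_j) ≤ s.

import numpy as np
from scipy.optimize import linprog

t = np.linspace(-1.0, 1.0, 200)
a = np.exp(t)                                  # curve to approximate
F = np.vander(t, 4, increasing=True)           # basis f_i(t) = t^i, i = 0..3

n = F.shape[1]
c = np.zeros(n + 1); c[-1] = 1.0               # minimize s (the last variable)
A_ub = np.block([[ F, -np.ones((len(t), 1))],  #  F x - s <= a
                 [-F, -np.ones((len(t), 1))]]) # -F x - s <= -a
b_ub = np.concatenate([a, -a])
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * n + [(0, None)])
print("best uniform error on the grid:", res.fun)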
Similar to Euler, he also understood the significance of extremal problems. He wrote [115]: "In all practical human activities we find the same problem: how to allocate our resources such that as much profit as possible is attained?"
In Russia two students of Chebyshev, namely Markov and Lyapunov, carried on with the investigation of extremal problems.
Markov is mainly known for the theory of stochastic processes. In 1913 he studied sequences of letters in novels to probe the necessity of independence in the law of large numbers. According to that law, the average of the results obtained from a large number of trials should be close to the expected value, and tends to become closer as more trials are performed.
The so-called Markov process became a general statistical tool, by which future developments can be determined from current knowledge. But Markov also studied so-called moment problems, optimizing the moments of a distribution function or of stochastic variables [1, 67]. This kind of problem can be formulated as a constrained optimization problem with integral functionals, where, in distinction to a variational problem, no derivatives appear.

Andrey Andreyevich Markov (1856–1922)

1886: Assistant professor at St. Petersburg University, member of the Russian Academy of Science.

Famous for his works on number theory and probability theory (Markov chains, Markov processes etc.).

Problem of moments:

min ∫_a^b t^n f(t) dt,
s.t. 0 ≤ f(t) ≤ t for all t ∈ [a, b],
∫_a^b t^i f(t) dt = c_i, i = 1, ..., n−1.

At first glance Lyapunov's investigations are not connected with optimization, because he studied stability theory for differential equations [99].
Aleksandr Mikhailovich Lyapunov (1857–1918)

1895: Associate professor at the University of Kharkov; founder of stability theory for differential equations.

Theorem: A solution x(t) of the equation ẋ = f(x) is stable if there exists a function V(x) such that

⟨∇V(x), f(x)⟩ < 0.

We can take an inverse point of view and interpret the result as follows: the differential equation in Lyapunov's theorem is a time-continuous method for minimizing the (Lyapunov) function V(x). Today the Lyapunov method is a systematic tool for investigating convergence and stability of numerical methods in optimization.
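A hedged sketch of this point of view (choice of V and step size my own): for f(x) = −∇V(x) one has ⟨∇V(x), f(x)⟩ = −‖∇V(x)‖² < 0 away from stationary points, and the forward-Euler discretization of the flow is nothing but gradient descent.

import numpy as np

# Minimal sketch: forward-Euler discretization of the gradient flow
# x' = f(x) = -grad V(x), with Lyapunov function V(x) = |x|^2 / 2.
def grad_V(x):
    return x                                 # gradient of V(x) = |x|^2 / 2

x = np.array([2.0, -1.5])
dt = 0.1                                     # step size of the Euler scheme
for _ in range(100):
    x = x - dt * grad_V(x)                   # one gradient-descent step
print(x, 0.5 * x @ x)                        # x tends to 0; V decreases monotonically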

2. The Pioneers in Linear Optimization

There exist two isolated roots of linear optimization, which can be traced back
to Gaspard Monge [82] and Charles-Jean de La Vallée Poussin [95].

Gaspard Monge (1746–1818)

1765: Professor of mathematics, and from 1771 of physics, in Mézières.
1780: Professor of hydrodynamics in Paris.
1794: Founder of the École Polytechnique in Paris.

1782: Continuous mass transport under minimal costs, Application de l'analyse à la géométrie, Paris.

In 1780 Monge became a member of the French Academy of Science. When the French Revolution began in 1789 he was a supporter of it, and upon the proclamation of the French Republic in 1792 he was appointed Minister of the Navy. In this position he was jointly responsible for the death sentence of King Louis XVI.
Among several physical discoveries, for instance the theory of mirage, he rendered outstanding services to the creation of descriptive geometry, to which his work on continuous mass transport also belongs. His idea is seen as an early contribution to the linear transportation problem, a particular case of the linear programming problem.

The second root is attributed to Vallée Poussin.

Charles-Jean de La Vallée Poussin (1866–1962)

1892: Professor of mathematics at the Université de Louvain.
1911: Sur la méthode de l'approximation minimum, Annales de la Société Scientifique de Bruxelles, No 35, pp. 1–16.

1920: First president of the International Mathematical Union.

In the years 1892–1894 he attended lectures of Camille Jordan, Henri Poincaré and Émile Picard in Paris, and of Amandus Schwarz and Ferdinand Frobenius in Berlin. With his paper, published in the Annales of the Brussels Society of Science, he is rated as one of the founders of linear optimization.
By the way, concerning the contributions of Monge and Vallée Poussin, Dantzig wrote disparagingly in 1991 [75] (page 19): "Their works had as much influence on the development of Linear Programming in the forties as would the discovery, in an Egyptian pyramid, of an electronic computer built in 3000 BC."
In the forties of the last century optimization – as we understand this topic today – was indeed seriously developed, and again practical problems influenced its directions. Doubtless the time was ripe for such a rapid development.
In the community of optimizers three forceful pioneers are to be named: L.V. Kantorovich, T.C. Koopmans and G.B. Dantzig.

In 1939 Kantorovich solved, for the first time, a problem of linear optimization. Shortly afterwards F.L. Hitchcock published a paper about a transportation problem. However, at that time the importance of these papers was not entirely recognized.
From 1926 to 1930 Kantorovich studied mathematics at the University of Leningrad. At the age of 18 he completed a doctorate in mathematics; however, the doctoral degree was awarded to him only in 1935, when academic titles had been re-introduced in Soviet society [73]. In the forties a rapid development of functional analysis set in. Here we have to mention the names of Hilbert, Banach, Steinhaus and Mazur, but also Kantorovich.
Before he came of age at twenty-one he had published fifteen papers in major mathematical journals, and he became a full professor at Leningrad University. He was a mathematician in the classical mold whose contributions were mainly centered on functional analysis, descriptive and constructive function theory and set theory, as well as on computational and approximate methods and mathematical programming. Thus he made significant contributions to building bridges between functional analysis, optimization and numerical methods.

Leonid Vitalevich Kantorovich (1912–1986)

1934: Professor at the University of Leningrad.

• Linear optimization (1939).
• Optimality conditions for extremal problems in topological vector spaces (1940).
• Functional-analytic foundation of descent methods; convergence of Newton's method for functional equations (1939–1948).

1939: Mathematical Methods for Production Organization and Planning, Leningrad, 66 pages.
1940: On an efficient method for solving some classes of extremum problems, DAN SSSR 28.
1959: Functional Analysis, Moscow, Nauka.

At the end of the thirties he was concerned with the mathematical modeling of production in a timber company and developed a method which later was recognized as equivalent to the dual simplex method. In 1939 he published a small paperback (only 66 pages) [52] with the exact title (in English translation): "A mathematical method of production planning and organization and the best use of economic operating funds". Neither the notion Linejnaja Optimizacija (Linear Optimization) nor the simplex method was ever mentioned in this booklet.
In contrast to the publicity of Dantzig's results in the Western countries, Kantorovich's booklet received only a small echo among mathematicians and economists in the East. The Western world, owing to the Iron Curtain, had no knowledge of that publication, and in the Soviet Union there were probably two reasons for ignoring it. First, there was no real need for mathematical methods in a totalitarian system: although the central planning of the national economy theoretically stood in the foreground of all social processes, the system was founded essentially on administration. Second, it should be mentioned that this booklet was not written in the usual mathematical language; therefore mathematicians had no reason to read it.
What is really known is his book on the Economical Calculation of the Best Utilization of Resources [56], published in 1960 (with an appendix by G.S. Rubinstein), but at that time the essential developments in the West were almost finished. In this monograph one can find two appendices about the mathematical theory of Linear Programming and its numerical methods. Curiously, in deference to Marxist terminology, the dual variables are denoted therein as objectively substantiated estimates and not as prices, because in Soviet thinking prices were to be imposed not by the market but by the Politburo.

As already mentioned, Kantorovich contributed significantly to functional analysis [54, 55]. His functional-analytic methods in optimization are well known and contain ideas and techniques which were under development thirty years before the theory of convex analysis was worked out.
In 1975 he was honored, together with Koopmans, with the Nobel prize for economics. Quotation of the Nobel committee: "For their contributions to the theory of optimum allocation of resources".
Koopmans was a US-American economist and physicist with roots in the Netherlands, who tackled problems of resource allocation.
Tjalling Charles Koopmans (1910–1985)

1948: Professor at Yale University.

1968: Professor at Stanford University.

1942: Exchange Ratios between Cargoes on Various Routes (Non-Refrigerated Dry Cargoes), Memorandum for the Combined Shipping Adjustment Board, Washington, D.C.

1951: Activity Analysis of Production and Allocation, Wiley, New York.

1971: On the Description and Comparison of Economic Systems (with J. Michael Montias), in: Comparison of Economic Systems, Univ. of California Press, 1971, pp. 27–78.

Koopmans was highly annoyed that Dantzig could not share in that prize.
In the mid-forties Dantzig became aware of the fact that in many practical modeling problems the economic restrictions could be described by linear inequalities. Moreover, replacing the "rule of thumb" by a goal function, for the first time he deliberately formulated a problem consisting explicitly of a (linear) objective function and (linear) restrictions in the form of equalities and/or inequalities. In particular, he hereby established a clear separation between the goal of the optimization, the set of feasible solutions and, by suggesting the simplex method, the method of solving such problems.
[Figure: feasible region of a linear program in the (x(1), x(2))-plane.]

Dantzig studied mathematics at the universities of Maryland and Michigan, because his parents could not afford studies at a more distinguished university. In 1936 he received his B.A. in mathematics and physics and switched to the University of Michigan in order to earn a doctorate. In 1937 he finished advanced studies there, receiving an M.A. in mathematics.
George B. Dantzig (1914–2005)

1960: Professor at the University of California, Berkeley.
1966: Professor at Stanford University.
1966: Linear Programming and Extensions, Springer-Verlag, Berlin.

1966: Linear Inequalities and Related Systems, Princeton Univ. Press, Princeton, NJ.
1969: Lectures in Differential Equations, Van Nostrand, Reinhold Co., New York.

After working two years as a statistician at the U.S. Bureau of Labor Statistics in Washington, in 1939 he started his PhD studies at the University of California, Berkeley, which he interrupted when the USA entered the Second World War. He joined the Air Force and became (1941–1946) head of the Combat Analysis Branch at the headquarters of the US Air Force. In 1946 he resumed his PhD studies and obtained a doctorate under the supervision of Jerzy Neyman. Thereafter he worked as a mathematical consultant at the Department of Defense.
The breakthrough in Linear Programming came in 1947 with Dantzig's paper Programming in a Linear Structure. One can read in Dantzig [75], page 29, that during the summer of 1948 Koopmans suggested to him the use of a shorter title, namely Linear Programming. The notion simplex method goes back to a discussion between Dantzig and Motzkin, the latter holding the opinion that simplex method most excellently describes the geometry of moving from one vertex of the polyhedron of feasible solutions to another.
In 1949, exactly two years after the first publication of the simplex algorithm, Koopmans organized the first Conference on Mathematical Programming in Chicago, which was later counted as the "zero" conference in the sequence of Mathematical Programming Symposia taking place up to today. Besides well-known economists like Arrow, Samuelson, Hurwicz and Dorfman, mathematicians like Albert Tucker, Harold Kuhn, David Gale, Johann von Neumann, Theodore S. Motzkin and others attended this event.

In 1960 Dantzig became professor at the University of California at Berkeley, and in 1966 he moved to a chair of Operations Research and Computer Science at Stanford University. In 1973 he was one of the founders and the first president of the Mathematical Programming Society (MPS). The MPS also created the Dantzig prize, awarded to colleagues for outstanding contributions to mathematical programming. In 1991 the first issue of the SIAM Journal on Optimization was dedicated to George B. Dantzig.

Back to the Nobel prize awarded to Koopmans and Kantorovich. In 1975 the Royal Swedish Academy of Science granted the prize for economics in equal parts to Kantorovich and to Koopmans for their contributions to optimal resource allocation. The prize money in that year amounted to 240,000 US dollars. Immediately after the ceremony Koopmans traveled to IIASA (the International Institute for Applied Systems Analysis) in Laxenburg, Austria. One can read in Michel Balinski [75], page 12, at that time director of the IIASA, that on the occasion of a ceremonial meeting Koopmans donated 40,000 dollars of the prize money to the IIASA. Thus he effectively accepted only one third of the whole prize money for himself.

By the way, for a long time it was unclear whether Kantorovich would be permitted to accept the Nobel prize in economics, because some years earlier, when Boris Pasternak (known for his novel "Doctor Zhivago") had the honor of receiving the Nobel prize for literature, the Soviet authorities forced him to reject it. Also the Nobel prize awarded to the physicist Andrei Sakharov, one of the leading Soviet dissidents at that time, had been seen as an unfriendly act. But because Sakharov could not be persuaded to decline this recognition, he was refused the journey to Stockholm and expelled as a member from the Soviet Academy of Sciences. It is known that Kantorovich, together with some physicists, voted against this expulsion.

It is worth mentioning that within the framework of "Linear Programming" and "Operations Research" in its widest sense five more scientists have been awarded the Nobel prize: in 1976 Wassily Leontief (input-output analysis), in 1990 Harry Markowitz (development of the theory of portfolio selection), in 1994 Reinhard Selten and John Nash (analysis of equilibria in non-cooperative game theory) and in 2007 Leonid Hurwicz (foundations of mechanism design theory). Hurwicz turned 90 in the year of his award and has been the oldest prize winner so far.

A historical overview of the development of Operations Research can be found in [40].

[Photos: Wassily Leontief (1976), Harry Markowitz (1990), Reinhard Selten (1994), John F. Nash (1994), Leonid Hurwicz (2007).]
Now let us sidestep to game theory and its founder Johann von Neumann. He achieved outstanding results in several mathematical fields.
Johann von Neumann (1903–1957)

1926–1929: Associate professor at the Humboldt University, Berlin.
1933–1957: Professor at the Institute for Advanced Study, Princeton.
1943: Collaborator on the Manhattan project in Los Alamos.

• Quantum mechanics.
• Theory of linear operators in Hilbert spaces.
• Theory of shock waves.
• Computer architecture.

1928: Zur Theorie der Gesellschaftsspiele, Math. Ann. 100, 295–320.

1944: The Theory of Games and Economic Behavior, Princeton Univ. Press.
Already in 1928 a paper of the mathematician Émile Borel (1871–1956) on min-max properties inspired him to ideas which later led to one of his most original designs, game theory. In the same year he proved the min-max theorem [85] on the existence of optimal strategies in zero-sum games. Together with the economist Oskar Morgenstern he wrote in 1944 the famous book "The Theory of Games and Economic Behavior" [86], dealing also with n-person games (n > 2), which are important generalizations for economics. These contributions made him the founder of game theory, which he applied less to classical parlor games than to situations of conflict and decision with incomplete knowledge of the intentions of the opposing players.
When on December 7, 1941, the Japanese Air Force destroyed a large part of the American Pacific Fleet at Pearl Harbor, the application of game theory for military purposes was born. Later, analysis of the debacle made clear that recommendations given by experts and founded on game-theoretic considerations had been dismissed by the Pentagon. This gave game theory an extraordinary impetus, and to this day a great part of mathematical research on game theory is subject to secrecy.

3. The Beginnings of Nonlinear Optimization

In the fifties, apart from Linear Programming, several new research directions in the area of extremal problems were developed, which today are summarized under the keyword Mathematical Programming.
Concerning optimization problems subject to equality constraints, we have already mentioned that the optimality conditions go back to Euler and Lagrange. Nonlinear inequality constraints were first considered in 1914 by Bolza [11] and in 1939 by Karush [59]. Unfortunately, these results were forgotten for a long time.
O. Bolza:
1914: Über Variationsprobleme mit Ungleichungen als Nebenbedingungen, Mathem. Abhandlungen, 1–18.
W. Karush:
1939: Minima of functions of several variables with inequalities as side conditions, MSc Thesis, Univ. of Chicago.
F. John:
1948: Extremum problems with inequalities as subsidiary conditions, Studies and Essays, Presented to R. Courant on his 60th Birthday, Jan. 1948, Interscience, New York, 187–204.
M. Slater:
1950: Lagrange Multipliers Revisited, Cowles Commission Discussion Paper, No 403.

In 1948 Fritz John [50] also considered problems with inequality constraints. He did not assume any constraint qualification, apart from the requirement that all functions be continuously differentiable.
22 R. Tichatschke

The term constraint qualification can be traced back to Kuhn and Tucker [69]; it requires (and this is what makes the treatment of problems with nonlinear inequalities so difficult) that a suitable local approximation of the feasible domain is guaranteed (mathematically speaking: the linearization cone and the tangent cone have to coincide).
A discussion of the historical development of the constraint qualification would lead too far here. But we refer to an earlier paper of Morton Slater [103]. He found a useful sufficient condition for the existence of a saddle point without assuming that the saddle function is differentiable.

In developing the theory of nonlinear optimization Albert W. Tucker played an outstanding role.
Albert W. Tucker (1905–1995)

1933: Professor at Princeton University.

• Topology.
• Mathematical programming (duality theory).
• Game theory (prisoner's dilemma).

Afternoon tea club:
J. Alexander, A. Church, A. Einstein, L. Eisenhart, S. Lefschetz, J. v. Neumann, O. Veblen, H. Weyl, E. Wigner, A. Turing, among others.

Tucker graduated in 1932, and from 1933 he was a fellow of the Mathematics Department at Princeton University. In the fifties and sixties he became a successful chairman of the department. He is known for his work in duality theory for linear and nonlinear optimization, but also in game theory: he introduced the well-known prisoner's dilemma, a bi-matrix game with non-constant profit sum.
His famous students were Michel Balinski, David Gale, John Nash (Nobel prize 1994), Lloyd Shapley (Nobel prize 2012), Alan Goldman and John Isbell. Every year the Mathematical Programming Society awards the Tucker prize for outstanding student achievements.
In the thirties the department in Princeton was famous for its afternoon tea sessions, bringing together scientists and giving rise to inspiring discussions. Members of this club were Albert Einstein, Johann v. Neumann, Hermann Weyl and, at the beginning, also Alan Turing, a student of the logician Alonzo Church. During the Second World War Turing developed the ideas of Polish mathematicians and was able to crack the highly complicated German radio code ENIGMA2; the abstract Turing machine he had already conceived in 1936, and later he worked at the University of Manchester on early computers.
From 1948 until about 1972 a project sponsored by the Office of Naval Research ran at Princeton under the leadership of Tucker, drawing up optimality conditions for different classes of nonlinear optimization problems and formulating the duality theory for convex problems. Among others, Harold Kuhn (a student of Ralph Fox), David Gale and Lloyd Shapley were involved in this project.
Due to Albert Tucker and Harold Kuhn, Lagrange's multiplier rule has been generalized to problems with inequality constraints [68].
Harold W. Kuhn and Albert W. Tucker:
1951: Nonlinear programming, Proceedings of the Second Berkeley Symposium on Mathem. Statistics and Probability, Univ. of California Press, Berkeley, 481–492.

(P)   f(x) → min, x ∈ IR^n,   g_i(x) ≤ 0 (i = 1, ..., m)

(S)   ∃ x̃ with g_i(x̃) < 0 ∀ i = 1, ..., m

Theorem (necessary and sufficient optimality conditions):
In (P) let f, g_i (i = 1, ..., m) be convex functions and let Slater's condition (S) be satisfied. Then:
x* is a global minimizer of (P) ⇔ ∃ λ_i* ≥ 0 (i = 1, ..., m) such that

λ_i* g_i(x*) = 0 (i = 1, ..., m),
L(x, λ*) ≥ L(x*, λ*) ∀ x ∈ IR^n,

where λ* ∈ IR_+^m – (Lagrange multiplier to x*);
g(x) = [g_1(x), ..., g_m(x)]^T;
L(x, λ) = f(x) + ⟨λ, g(x)⟩ – (Lagrange function of (P)).

1957: Linear and nonlinear programming, Oper. Res. 5, 244–257.
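A hedged toy verification (problem data my own): for f(x) = x_1² + x_2² and g(x) = 1 − x_1 − x_2, Slater's condition holds (take x̃ = (1, 1)), and solving the stationarity and complementarity conditions symbolically recovers the global minimizer.

import sympy as sp

# Minimal sketch: Kuhn-Tucker conditions for one convex inequality constraint.
x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)
f = x1**2 + x2**2
g = 1 - x1 - x2
L = f + lam * g                          # Lagrange function of (P)

# Stationarity in x plus complementarity lambda * g = 0.
eqs = [sp.diff(L, x1), sp.diff(L, x2), lam * g]
for sol in sp.solve(eqs, [x1, x2, lam], dict=True):
    if sol[lam] >= 0 and g.subs(sol) <= 0:   # keep feasible solutions with lambda >= 0
        print(sol)                           # {x1: 1/2, x2: 1/2, lambda: 1}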

What about Linear Programming at that time? Particular attention was paid to its commercial applications, although no efficient computers were at hand. One of the first documented applications of the simplex method was a diet problem by G.J. Stigler [105], with the goal of modeling the most economical food composition for the US Army, guaranteeing certain minimum and maximum quantities of vitamins and other ingredients. The solution of this linear program with 9 inequalities and 77 variables kept 9 persons busy and required computational work of approximately 120 man-days.

2 Marian Rejewski, together with two fellow Poznań University mathematics graduates, Henryk Zygalski and Jerzy Różycki, solved the logical structure of one of the first military versions of Enigma in 1932.
Historic overview of LP computations by means of simplex algorithms on computers (cf. [88]):

Year   Constraints   Remarks
1951   10
1954   30
1957   120
1960   600
1963   2,500
1966   10,000        structured
1970   30,000        structured

Nowadays structured problems with millions of constraints are solved, for instance in the aviation industry.
In this context one can read the following curiosity in Lilly Lancaster [72]: it is well known that the simplex algorithm carries out its pivoting depending on the reduced costs. However, the prices of spices are, as a rule, higher than those of other commodities. Therefore the spice variables mostly appeared as nonbasic variables, hence they got the value zero. The result was that the optimized food was terribly tasteless. The paper describes how the LP model was changed stepwise, by altering the constraints, in order to get tasty food.
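For illustration, here is a hedged miniature diet LP (all foods, prices and nutrient values invented) in the spirit of Stigler's problem, solved with an off-the-shelf LP routine.

import numpy as np
from scipy.optimize import linprog

cost = np.array([2.0, 3.5, 8.0])               # price per unit of each food
nutrients = np.array([[0.1, 0.0, 0.4],         # vitamin content per unit
                      [0.2, 0.3, 0.1]])        # protein content per unit
need = np.array([0.5, 1.0])                    # minimum daily requirements

# linprog minimizes c^T x with A_ub x <= b_ub, so N x >= need becomes -N x <= -need.
res = linprog(cost, A_ub=-nutrients, b_ub=-need, bounds=[(0, None)] * 3)
print("cheapest diet:", res.x, "cost:", res.fun)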
In 1952 Charnes, Cooper and Mellon [14] successfully used the simplex method in the oil industry for the optimal cracking of crude oil into petrol and high-quality oil.
The first publication on solving linearly constrained systems iteratively can be
traced back to Hestenes and Stiefel [46]. In 1952 they suggested a conjugate
gradient method to determine a feasible point of a system of linear equations and
inequalities.
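A minimal sketch of the conjugate gradient iteration in the Hestenes-Stiefel spirit, applied to a symmetric positive definite system Ax = b (test data my own), looks as follows.

import numpy as np

def conjugate_gradient(A, b, tol=1e-10):
    x = np.zeros_like(b)
    r = b - A @ x                        # residual
    p = r.copy()                         # first search direction
    while np.linalg.norm(r) > tol:
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)       # exact line search along p
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r) # conjugacy coefficient
        p = r_new + beta * p
        r = r_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))          # approx [0.0909, 0.6364] = (1/11, 7/11)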
In the fifties the development of network flow problems started in the US. The contribution of Ford and Fulkerson [37] consisted in connecting flow problems with graph theory. Combinatorial optimization benefits from this approach to this day.
In 1959–60 Dantzig and Wolfe [23] worked on decomposition principles, i.e., the decomposition of structured large-scale LPs into a master problem and (several) subproblems. Nowadays these ideas allow parallelization of the computational work on the level of the subproblems and enable a fast resolution of such problems with hundreds of thousands of variables and constraints.
The dual variant of this decomposition method was used in 1962 by Benders [5] for solving mixed-integer problems.

M.R. Hestenes; E. Stiefel:
1952: Methods of conjugate gradients for solving linear systems, J. Res. Natl. Bur. Stand. 49, 409–436.
R. Gomory:
1958: Outline of an algorithm for integer solutions to linear programs, Bull. Amer. Math. Soc. 64, 275–278.
G.B. Dantzig; Ph. Wolfe:
1960: Decomposition principle for linear programs, Oper. Res. 8, 101–111.
J.F. Benders:
1962: Partitioning procedures for solving mixed-variables programming problems, Numer. Math. 4, 238–252.
L.R. Ford; D.R. Fulkerson:
1962: Flows in Networks, Princeton University Press, Princeton, N.J., 194 p.

The study of integer programming problems was started in 1958 by Ralph Gomory [44]. Unlike the earlier work of Fulkerson, Johnson and Dantzig on the usage of cutting planes for cutting off non-optimal tours in the traveling salesman problem, Gomory showed how to generate "cutting planes" systematically. These are extra conditions which, when added to an existing system of inequalities, guarantee that the optimal solution consists of integers. Today such techniques, combining cutting planes with branch-and-bound methods, belong to the most efficient algorithms for solving applied problems in integer programming.
In the Soviet Union first results on matrix games were published by Ventzel [111] and Vorobyev [112], and a very popular Russian textbook on Linear Programming was written by Yudin and Golstein [113].
S.I. Zukhovitzkij:
1956: On the approximation of real functions in the sense of Chebyshev (in Russian), Uspekhi Matem. Nauk 11(2), 125–159.
E.S. Ventzel; N.N. Vorobyev:
1959: Elements of Game Theory (in Russian), Moscow, Fizmatgiz.
D.B. Yudin; E.G. Golstein:
1961: Problems and Methods of Linear Programming (in Russian), Moscow, Sov. Radio.
E.Ya. Remez:
1969: Foundation of Numerical Methods for Chebyshev Approximation (in Russian), Kiev, Naukova Dumka.

At approximately the same time, papers of two Ukrainian mathematicians, Zukhovitskij [116] and Remez [96], became known. They suggested numerical methods for solving best-approximation problems in the sense of Chebyshev. These are simplex-like algorithms for solving the underlying linear semi-infinite optimization problem.
The initial works on this topic are essentially older than indicated here. For instance, Remez's algorithm for the numerical solution of Chebyshev's approximation problem was presented by Remez already in 1935 on the occasion of a meeting of the Mathematical Department of the Ukrainian Academy of Sciences.
Important results in the field of control theory, i.e., optimization under constraints described by differential equations, were initiated with the beginnings of space travel. 1957 was the year when the Soviets launched the first satellite, the Sputnik, into outer space.
One of the basic problems of optimal control consists in the transfer of the state x(t) of some system, described by differential equations, from a given start into some target domain T. Hereby the controlling is carried out by some control function u(t), which belongs to a certain class of functions and minimizes a given objective functional.
A typical problem of that kind can be found in space travel: transfer of a controllable object to some planet in shortest time or with smallest costs.
Terminal control problem:

min ∫_0^T f^0(x(t), u(t)) dt;

dx/dt = f(x, u),  x(0) = x_0,  x(T) ∈ T(T);

u(t) ∈ U_ad = {u(·) : measurable, u(t) ∈ Ψ for 0 ≤ t ≤ T}.

(x – state vector, u – control vector, f = (f^1, ..., f^n)).

Hence again the question about necessary optimality conditions arises, which have to be satisfied by an optimal control function; i.e., we are dealing with a strong analogue of variational analysis. In the latter theory the Euler-Lagrange equations were the necessary conditions showing that certain functions are candidates for optimal functions; here in control theory the analogous necessary optimality conditions for a control problem are described by Pontryagin's maximum principle.

Pontryagin lost his eyesight at the age of 14. Thanks to his mother, who read mathematical books to him, he became a mathematician despite his blindness.

Lev Semenovich Pontryagin (1908–1988)

1934: Steklov Institute, Moscow.

1935: Head of the Department of Topology and Functional Analysis at the Steklov Institute.

• Geometric aspects of topology.
• Duality theory of homology.
• Co-cycle theory in topology (Pontryagin classes).
• Maximum principle.

1956: L.S. Pontryagin, V.G. Boltyanskij, R.V. Gamkrelidze: Mathematical Theory of Optimal Processes, Moscow, Nauka.

We have to thank him for a series of basic results, first of all in topology. His excursion into applied mathematics, investigating control problems, has to be valued merely as a side effect of his research program.
In 1954 the maximum principle was formulated as a thesis by Pontryagin and proved in 1956 in a joint monograph with Boltyanskij and Gamkrelidze [91]. It still proves fundamental for modern control theory.
Gamkrelidze, one of the coauthors of the mentioned monograph, later described Pontryagin's proof of the maximum principle as "in some sense sensational". Exceptional in this proof is the usage of topological arguments, namely the cut theory, which goes back to the American S. Lefschetz.
Boltyanskij [9], in his extensive widening of the maximum principle to other classes of control problems, has shown that topological methods, in particular homotopy results, are very useful in control theory. Meanwhile there also exist proof techniques belonging to convex analysis which establish the maximum principle. One of the first authors in this area was Pshenichnyj [93].
The transfer of the maximum principle into a discrete setting caused some difficulties. Among the numerous papers dealing with these questions at that time, more than a few are incorrect (see references in [10]).

Maximum principle for the terminal control problem:

Consider dual state variables (Lagrange multipliers) λ_0(t), ..., λ_n(t), the Hamiltonian function

H(λ, x, u) = Σ_{i=0}^n λ_i(t) f^i(x(t), u(t))

and the adjoint system (for some feasible process (x(t), u(t)), 0 ≤ t ≤ T)

dλ_j/dt = −∂H/∂x^j = −Σ_{i=0}^n λ_i ∂f^i(x, u)/∂x^j,  j = 1, ..., n,

λ_0 = −1,  λ_1(T) = λ_2(T) = ... = λ_n(T) = 0.

Theorem: If the process (x*(t), u*(t)) is optimal, then there exist absolutely continuous functions λ_i*(t), solving the adjoint system almost everywhere on [0, T], such that at each time τ ∈ [0, T] the following maximum condition is satisfied:

H(λ*(τ), x*(τ), u*(τ)) = max_{u ∈ U_ad} H(λ*(τ), x*(τ), u).
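A hedged worked miniature (my own example, using the sign convention λ_0 = −1 from the box; since the terminal state is fixed here, the transversality condition λ_1(T) = 0 is replaced by the boundary condition on x): steer dx/dt = u from x(0) = 0 to x(T) = 1 with minimal energy ∫_0^T (1/2) u² dt. The Hamiltonian is

H(λ, x, u) = −(1/2) u² + λ_1 u,

the adjoint equation dλ_1/dt = −∂H/∂x = 0 makes λ_1 constant, and maximizing H over u gives u*(t) ≡ λ_1. The boundary conditions then force u*(t) = 1/T and x*(t) = t/T: the optimal control is constant.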
Another important principle, which found its application mainly in the theory of discrete optimal processes, is the principle of dynamic programming; it can be traced back to Bellman [6].
Richard Bellman (1920–1984)

1947: Collaborator in the area of theoretical physics in Los Alamos.
1952: Switch to the RAND Corporation; development of dynamic programming.
1965: Professor of mathematics, electrical engineering and medicine at the University of Southern California.

• Bellman's optimality principle.
• Algorithm of Bellman and Ford (shortest paths in graphs).
• Bio-informatics.

1957: Dynamic Programming, Princeton Univ. Press.
In physics this principle had long been known under another name: the Legendre transformation. There, the transition from a global (all times simultaneously) to a time-dependent (dynamical) way of looking at things corresponds to the transition from the Lagrange functional to the Hamilton functional by means of the Legendre transformation.
In control theory and similar areas this approach can be used, for instance, to derive an equation (the Hamilton-Jacobi-Bellman equation) whose solution yields the optimal objective value of the optimal process.

Hereby the argumentation is more or less as follows: if a problem is time-dependent, one can consider the optimal value of the objective functional at a certain time. Then one asks which equation has to be fulfilled at the optimal solution so that the objective functional stays optimal also at a later date. This consideration leads to the Hamilton-Jacobi-Bellman equation. In this way one can divide the problem into time steps instead of solving it as a whole.
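For the terminal control problem above this reasoning leads, under smoothness assumptions, to an equation for the value function V(t, x) (the optimal cost-to-go from state x at time t); the display is my own addition in the notation of the problem box:

−∂V/∂t (t, x) = min_{u ∈ Ψ} { f^0(x, u) + ⟨∇_x V(t, x), f(x, u)⟩ },  with V(T, x) = 0 for x in the target domain.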
Bellman's optimality principle for a discrete process:

The value of the objective functional at the k-th level is optimal if, for each x_k chosen at the k-th level, the objective value of the (k−1)-th level is optimal.

Denote by
ξ – the state vector, describing the state of the k-level process,
Λ_k(ξ) – the optimal value of the k-level process, in dependence on the state ξ,
x_k – the variables (or vectors of variables) which have to be determined at the k-th level.

Assumption: After choosing x_k in state ξ, let the vector of state variables corresponding to the (k−1)-th level be given by some transformation T(ξ, x_k). Then

Λ_k(ξ) = max_{x_k} { f_k(ξ, x_k) + Λ_{k−1}[T(ξ, x_k)] },  k = 1, ..., n.

This is a system of recurrence formulas in which Λ_k(ξ) can be determined for k = 1, ..., n, provided Λ_{k−1}(η) is known for the problem at the (k−1)-th level.
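A hedged toy implementation of this recursion (profit tables and budget invented): allocate an integer budget ξ among n stages, with transformation T(ξ, x_k) = ξ − x_k.

# Bellman recursion: Lambda_k(xi) = max_x { f_k(xi, x) + Lambda_{k-1}(xi - x) }.
def stage_profit(k, x):
    return [0, 3, 5, 6][x] if k % 2 == 0 else [0, 2, 6, 7][x]  # invented tables

def solve(n_stages, budget):
    value = [0] * (budget + 1)                  # Lambda_0(xi) = 0
    for k in range(1, n_stages + 1):
        value = [max(stage_profit(k, x) + value[xi - x]
                     for x in range(min(xi, 3) + 1))
                 for xi in range(budget + 1)]   # Lambda_k from Lambda_{k-1}
    return value[budget]

print(solve(n_stages=3, budget=5))              # optimal total profit: 15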

About five years later an intensive study of control problems started, beginning with the papers of Roxin [100], Neustadt [87], Balakrishnan [3], Hestenes [47], Halkin [45], Butkovskij [12], Berkovitz [7] and others.
E. Roxin:
1962: The existence of optimal controls, Michigan Math. J. 9, 109–119.
L.W. Neustadt:
1963: The existence of the optimal control in the absence of convexity conditions, J. Math. Anal. Appl. 7, 110–117.
A.V. Balakrishnan:
1965: Optimal control problem in Banach spaces, J. Soc. Ind. Appl. Math., Ser. A, Control 3, 152–180.
M.R. Hestenes:
1966: Calculus of Variations and Optimal Control Theory, Wiley, New York.
H. Halkin:
1966: A maximum principle of Pontryagin's type for nonlinear differential equations, SIAM J. Control 4, 90–112.
A.G. Butkovskij:
1969: Distributed Control Systems, Isd. Nauka, Moscow.
L.D. Berkovitz:
1969: An existence theorem for optimal control, JOTA 4, 77–86.

The beginnings of stochastic optimization can be found in the literature from 1955 on. At that time the treatment of randomness observed in (parts of) the data of LPs was discussed, for instance by Dantzig [21] and G. Tintner [110].
Several problem classes were investigated: compensation problems (recourse), distribution problems (among others, the distribution of the optimal value under a given joint distribution of the LP data) and problems with probability restrictions (chance constraints).
Under special structural assumptions (with respect to the data and their given distribution), first applications were considered by Van de Panne and Popp [89] and Tintner [110]. There one can also find first solution techniques for stochastic problems by means of quadratic optimization, for instance by Beale [4]. Dantzig and Madansky [22] described techniques which use two-stage stochastic programs, and in Charnes and Cooper [15] one can find programs with constraints which have to be satisfied with certain probabilities.
The state of the art in stochastic programming until the mid-seventies has been described in the monographs by Kall [51] and Ermoliev [32].
G.B. Dantzig:
1955: Linear programming under uncertainty, Management Sci. 1, 197–206.
G. Tintner:
1955: Stochastic linear programming with applications to agricultural economics, in: 2nd Symp. Linear Programming, vol. 2, 197–228.
A. Charnes and W.W. Cooper:
1959: Chance-constrained programming, Management Sci. 5, 73–79.
E.M.L. Beale:
1961: The use of quadratic programming in stochastic linear programming, RAND Report P-2404, The RAND Corporation.
G.B. Dantzig and A. Madansky:
1961: On the solution of two-stage linear programs under uncertainty, in: Proc. 4th Berkeley Symp. Math. Stat. Prob., Berkeley, pp. 165–176.
C. van de Panne and W. Popp:
1963: Minimum-cost cattle feed under probabilistic protein constraints, Management Sci. 9, 405–430.
P. Kall:
1976: Stochastic Linear Programming, Springer-Verlag, Berlin.
Y.M. Ermoliev:
1976: Stochastic Programming Methods, Nauka, Moscow.

As we can see, it was a time of great activity, but the results were in essence still isolated and could not yet be understood as parts of a single unified branch. The situation changed dramatically in the sixties and seventies. Time was ripe to create a complete picture of Mathematical Programming, which immediately led to a kaleidoscope of new contributions.

4. The 60s and 70s

The main directions of investigation in these years were: general theory of nonlinear optimization, numerical methods for nonlinear optimization problems, non-smooth optimization, global optimization, discrete optimization, optimization on graphs, stochastic optimization, dynamic optimization, and variational inequalities.
The understanding of the common nature of different optimization problems was the first breakthrough of this period. Although different papers offered different approaches for analyzing specific nonlinear problems, these did not yet lead to a common technique for obtaining optimality criteria. Moreover, the mentioned papers dealt exclusively with the finite-dimensional case, with the exception of the paper of Bolza [11].
The transition to infinite-dimensional settings was forced essentially by the papers of Dubovitzkij and Milyutin [31] and Pshenichnyj [94]. At the latest at that time it became clear that the functional-analytic foundation of duality theory in mathematical programming in general spaces can be deduced from the geometric form of the Hahn-Banach theorem.
Abram Ya. Dubovitzkij, Aleksej A. Milyutin:
1963: Extremum problems under constraints, Doklady AN SSSR, 149(4), 759–762.

Theorem: Let $C_1,\dots,C_m$ be convex cones in a Hilbert space $H$, let $C_i$ $(i=1,\dots,k\le m)$ be polyhedral and $C=C_1\cap\dots\cap C_m$. If
$$\bigcap_{i=1}^{k} C_i \;\cap\; \Big(\bigcap_{i=k+1}^{m} \operatorname{int} C_i\Big) \neq \emptyset,$$
then the dual cone of $C$ satisfies
$$C^{*} = C_1^{*} + \cdots + C_m^{*}.$$
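A minimal finite-dimensional illustration (my own, not from the original paper; the dual cone is taken as $C^{*}=\{y:\langle y,x\rangle\ge 0\ \forall x\in C\}$): in $H=\mathbb{R}^2$ consider the two polyhedral half-planes
$$C_1=\{x: x_1\ge 0\},\qquad C_2=\{x: x_2\ge 0\},\qquad C=C_1\cap C_2=\mathbb{R}^2_{+}.$$
Then $C_1^{*}=\{(y_1,0): y_1\ge 0\}$ and $C_2^{*}=\{(0,y_2): y_2\ge 0\}$, so that indeed $C^{*}=\mathbb{R}^2_{+}=C_1^{*}+C_2^{*}$; since here $k=m=2$, only the non-emptiness of $C$ is required.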
The Dubovitzkij-Milyutin formalism delivered the breakthrough: the characteriza-
tion of the dual cone of the intersection of finitely many cones is an efficient tool
for a unified approach to necessary optimality conditions. To this day this formal-
ism is commonly used for treating different classes of optimization problems,
in finite-dimensional as well as infinite-dimensional spaces. In particular, it
delivered new criteria for some difficult problems, for instance control problems
with phase constraints [8].
In the process of working out the theory of convex analysis, see for instance
the monographs of Rockafellar [97] (for finite-dimensional spaces) and Ioffe
and Tichomirov [48] (for Banach spaces), these investigations were pushed
further and deepened.
R. Tyrrell Rockafellar:
1970: Convex Analysis, Princeton Univ. Press, Princeton.

Alexander D. Ioffe, Vladimir M. Tichomirov:
1974: Theory of Extremum Problems, Nauka, Moscow.
Parallel to the development of a general theory of nonlinear optimization it
became clear that numerical methods can be handled in a unified framework,
too.
Publications of Fletcher and Reeves [36] on conjugate gradient
methods and papers of Levitin and Polyak [77] describe several gradient- and
Newton-like methods for unconstrained optimization problems and their extension
to constrained problems. Fiacco and McCormick [35] published first results
for penalty methods. In the seventies and eighties the books and papers of Den-
nis and Schnabel [29], Goldfarb [42], Powell [92] and Davidon [24], to
mention only a few, built on these numerical developments.
General theorems about convergence and rates of convergence of numerical
algorithms in finite-dimensional and infinite-dimensional spaces have been proved,
and a great number of applications to nonlinear (especially global) optimization
problems, control problems, semi-infinite problems, etc. have been considered.
These analytic-numerical developments continued successfully over many years,
and numerous monographs appeared.
Smooth optimization problems, involving differentiable functions, allow the
application of descent methods based on gradient and Newton approximations.
The Ukrainian mathematician Naum Shor [102] was the first to transfer
this approach to non-smooth problems. In his PhD thesis he suggested a subgra-
dient method for non-differentiable functions and used it for the numerical solution
of a program dual to a transport-like problem. Later this approach
was developed further into the so-called bundle methods by Schramm and Zowe [101],
Lemarechal [74] and Kiwiel [62].
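The underlying idea fits into a few lines; the test function $\|x\|_1$, the diminishing steps $1/k$ and the bookkeeping of the best iterate are standard textbook ingredients of mine, not details of Shor's original transport application:

```python
import numpy as np

def subgradient_method(f, subgrad, x, iters=500):
    """Minimize a convex, possibly non-differentiable function f."""
    x_best, f_best = x.copy(), f(x)
    for k in range(1, iters + 1):
        x = x - (1.0 / k) * subgrad(x)       # diminishing steps ensure convergence
        if f(x) < f_best:                    # a subgradient step need not decrease f,
            x_best, f_best = x.copy(), f(x)  # so the best iterate is recorded
    return x_best

# Example: f(x) = ||x||_1; sign(x) is a valid subgradient at every x.
print(subgradient_method(lambda x: np.abs(x).sum(), np.sign,
                         np.array([2.0, -3.0])))
```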
Robert Fletcher, C.M. Reeves:
1964: Function minimization by conjugate gradients, Computer Journal, 7, 149–154.

Evgenij S. Levitin, Boris T. Polyak:
1966: Minimization methods under constraints, Zhurn. Vychisl. Matem. i Matem. Fiz. 6, 787–823.

In the seventies non-smooth analysis (this notion is due to F. Clarke [16]) be-
came a well-developed branch of analysis. Now some parts of this theory are
more or less complete. Subdifferential calculus for a class of convex functions and
' $
Naum Z. Shor:
1962: Application of the subgradient method for the solution of network transport problems,
Notes of Sc. Seminar, Ukrainian Acad. of Science, Kiew.

Vladimir F. Demyanov, Alexander M. Rubinov:


1968: Approximate Methods for Solving Extremum Problems, Leningrad, State University

& %
34 R. Tichatschke

minimax-theory belong to these parts and the latter was decisively developed by
Clarke, Demyanov and Gianessi [17] and Demyanov and Rubinov (see,
for instance, [25]–[28]).
For a long time it was unknown whether linear programs belong to the class
of problems which are difficult to solve (in non-polynomial time) or to the class
of more easily solvable problems (in polynomial time).
Victor Klee (1925–2007)

1957: Professor at the University of Washington, Seattle.
1995: Honorary doctorate from the University of Trier.

• Convex sets.
• Functional analysis (Kadec-Klee theorem).
• Analysis of algorithms, optimization, combinatorics.
In 1970 Klee, who is also well known in functional analysis, constructed, together
with George Minty [63], examples showing that the classical simplex
algorithm needs an exponential number of steps in the worst case. Since the
number of vertices of a certain distorted standard cube grows exponentially with
its dimension, in the worst case all vertices of the cube must be visited on the
way to the optimal vertex.
Construction of the Klee-Minty cube for n = 3: there exists a unique path along the
edges of the cube which visits all vertices and decreases monotonically with respect to the
objective function $\langle c, x\rangle$ with $c^T = (0, 0, -1)$.
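For concreteness, the following sketch builds a common textbook variant of the distorted cube: maximize $x_n$ subject to $0\le x_1\le 1$ and $\varepsilon x_{j-1}\le x_j\le 1-\varepsilon x_{j-1}$, which is equivalent to minimizing $\langle c,x\rangle$ with $c^T=(0,\dots,0,-1)$ as in the figure; it is not necessarily the exact formulation of [63]:

```python
import numpy as np

def klee_minty(n, eps=0.1):
    """Return A, b, c for: maximize c @ x subject to A @ x <= b, x >= 0."""
    A, b = [], []
    row = np.zeros(n); row[0] = 1.0
    A.append(row); b.append(1.0)                 # x_1 <= 1
    for j in range(1, n):
        lo = np.zeros(n); lo[j - 1], lo[j] = eps, -1.0
        A.append(lo); b.append(0.0)              # eps * x_{j-1} <= x_j
        hi = np.zeros(n); hi[j - 1], hi[j] = eps, 1.0
        A.append(hi); b.append(1.0)              # x_j <= 1 - eps * x_{j-1}
    c = np.zeros(n); c[-1] = 1.0                 # objective: maximize x_n
    return np.array(A), np.array(b), c

A, b, c = klee_minty(3)   # a distorted cube with 2^3 = 8 vertices
```

With Dantzig's classical pivoting rule the simplex method, started at the origin, runs through all $2^n$ vertices of this cube.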
In 1979 Leonid Khachiyan [60] published the ellipsoid method, a method for
determining a feasible point of a polyhedron.
Leonid Khachiyan:
1979: A polynomial algorithm in linear programming, Dokl. Akad. Nauk SSSR 244, 1093–1096.
This method had been proposed initially, in 1976 and 1977, by Yudin and Nemirovskij
[114] and independently by Naum Shor for solving convex optimization
problems.
In 1979 Khachiyan modified this method and thereby developed the
first polynomial algorithm for solving linear programs. Indeed, by means
of the optimality conditions a linear program can be transformed into a system
of linear equations and inequalities, so one is dealing with finding a feasible
point of a polyhedral set. For practical purposes, however, this algorithm was
not suitable.
The basic idea of the algorithm is the following: construct an ellipsoid
containing the whole polyhedron. Then check whether the center of the
ellipsoid belongs to the polyhedron. If so, a point of the polyhedron has been
found and the algorithm stops. Otherwise, determine the half-ellipsoid in which
the polyhedron must be contained and circumscribe a new, smaller ellipsoid
around this half-ellipsoid. After a number of steps depending polynomially on the
code length of the linear program, one has either found a feasible point of the
polyhedron or established that the polyhedron is empty.
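A sketch of one central-cut update follows; the shrinking formulas are the standard ones (for dimension $n\ge 2$), while the driver loop and the example data are illustrative and ignore the careful volume bookkeeping behind the polynomial bound:

```python
import numpy as np

def ellipsoid_step(x, P, a):
    """Shrink E = {y: (y-x)' inv(P) (y-x) <= 1} by the central cut a @ y <= a @ x."""
    n = len(x)                                 # the formulas require n >= 2
    g = P @ a / np.sqrt(a @ P @ a)
    x_new = x - g / (n + 1)                    # shift the center into the half-ellipsoid
    P_new = (n**2 / (n**2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(g, g))
    return x_new, P_new

def find_feasible(A, b, x, P, iters=500):
    """Search for a point with A @ x <= b inside the start ellipsoid (x, P)."""
    for _ in range(iters):
        i = int(np.argmax(A @ x - b))          # most violated inequality
        if A[i] @ x <= b[i]:
            return x                           # the center is feasible: stop
        x, P = ellipsoid_step(x, P, A[i])      # cut away the violating half
    return None

# Example: the box [1, 2]^2, started from a ball of radius 10 around the origin.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([2.0, -1.0, 2.0, -1.0])
print(find_feasible(A, b, np.zeros(2), 100.0 * np.eye(2)))
```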
In the mid-eighties, more precisely in 1984, Narendra Karmarkar [58] and
others started developing interior point methods for solving linear programs.
In this connection we should mention the fate of a paper by Dikin [30].
He was a student of Kantorovich, and in his PhD thesis, at the advice of his
supervisor, he suggested a procedure for the numerical solution of linear programs,
although he failed to prove convergence estimates. This work found no
interest and was forgotten until the late eighties. At that time it became clear
that Karmarkar's algorithm was very similar to Dikin's method.
Narendra Karmarkar:
1984: A new polynomial-time algorithm for linear programming, Combinatorica 4, 373–395.

[Figure: path of the simplex method along the edges vs. path of an interior point method through the interior of the polyhedron.]
Interior point methods approach an optimal vertex right through the interior of
the polyhedron, whereas the simplex method runs along its edges and ver-
tices. The significance of the interior point approach consisted mainly in the
fact that it was the first polynomial algorithm for solving linear programs with
the potential to be useful in practice as well. But the essential breakthroughs,
making interior point methods competitive with the simplex algorithm, took
place in the nineties.
In contrast to simplex methods, the advantages of these methods are their
easy adaptation to quadratic and certain nonlinear programs, so-called
semi-definite programs, and their applicability to large-scale, sparse problems. One
disadvantage is that, after adding a constraint or a variable to the linear program,
a so-called "warm start" cannot be carried out as efficiently as in simplex methods.
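To make the geometric idea concrete, here is a sketch of one iteration of an affine-scaling method in the spirit of Dikin's algorithm mentioned above, for min $c^T x$ subject to $Ax=b$, $x>0$; the damping factor and the stopping test are simplifications of mine:

```python
import numpy as np

def affine_scaling_step(A, c, x, gamma=0.9):
    """One interior step for: min c @ x subject to A @ x = b, x > 0."""
    D2 = np.diag(x * x)                              # scaling by the current iterate
    y = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)    # dual estimate
    d = -D2 @ (c - A.T @ y)                          # rescaled descent direction, A @ d = 0
    if np.all(d >= -1e-12):
        return x                                     # no descent component left: stop
    t = gamma * np.min(-x[d < 0] / d[d < 0])         # damped step keeps x strictly positive
    return x + t * d

# Example: min x1 + x2 subject to x1 + x2 + x3 = 1, x > 0 (optimum (0, 0, 1)).
x = np.array([0.3, 0.3, 0.4])
for _ in range(50):
    x = affine_scaling_step(np.array([[1.0, 1.0, 1.0]]),
                            np.array([1.0, 1.0, 0.0]), x)
print(x)                                             # approaches (0, 0, 1)
```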
Now let us come back once more to variational analysis. Variational
problems, in particular variational inequalities, also have their origin in the calculus of
variations, associated with the minimization of functionals in infinite-dimensional
spaces. The systematic study of this subject began in the early 1960s with the
seminal work of the Italian mathematician Guido Stampacchia and his col-
laborators, who used variational inequalities (VI's) as an analytic tool for studying
free boundary problems defined by nonlinear partial differential operators, aris-
ing from unilateral problems in elasticity and plasticity theory and in mechanics.
Some of the earliest books and papers in variational inequalities are Lions and
Stampacchia [78], Mancino and Stampacchia [79] and Stampacchia [104].
In particular, the first theorem of existence and uniqueness of the solutions of a
VI was proved in [104]. The books by Baiocchi and Capelo [2] and Kinder-
lehrer and Stampacchia [61] provide a thorough introduction to the appli-
cation of VI’s in infinite-dimensional function spaces. The book by Glowinski,
Lions and Trémolière [41] is among the earliest references to give a detailed
numerical treatment of such VI’s. Nowadays there is a huge literature on the
subject of infinite-dimensional VI’s and related problems.
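For orientation, the model problem of this literature in its basic form (notation mine): given a closed convex set $K$ in a Hilbert space and an operator $F$,
$$\text{find } u^{*}\in K \ \text{ such that }\ \langle F(u^{*}),\,v-u^{*}\rangle \ge 0 \quad \text{for all } v\in K.$$
For $F=\nabla f$ this is exactly the first-order optimality condition of $\min_{v\in K} f(v)$; under monotonicity and coercivity assumptions on $F$ of the kind used by Stampacchia, existence and uniqueness of $u^{*}$ follow.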
The development of mathematical programming and control theory has proceeded
almost contemporaneously with the systematic investigation of ill-posed problems
and their numerical treatment. It was clear from the very beginning that the main
classes of extremal problems include ill-posed problems. Among the variational
inequalities with important applications in different fields of physics there are
ill-posed problems, too. Nevertheless, up to now, in the development of numeri-
cal methods for finite- and infinite-dimensional extremal problems, ill-posedness
has not been a major point of consideration. As a rule, conditions ensuring
convergence of a method include assumptions on the problem which warrant its
well-posedness in a certain sense. Moreover, in many papers exact input data and
exact intermediate calculations are assumed. Therefore, the use of standard
optimization and discretization methods often proves to be unsuccessful for the
treatment of ill-posed problems.
In the first methods dealing with ill-posed linear programming and optimal
control problems, suggested by Tikhonov [107, 108, 109], the problem under
consideration was regularized by means of a sequence of well-posed problems
involving a regularized objective and preserving the constraints from the original
problem.
Essential progress in the development of solution methods for ill-posed varia-
tional inequalities was initiated by a paper of Mosco [84]. Based on Tikhonov’s
principle, he investigated a stable scheme for the sequential approximation of vari-
ational inequalities, where the regularization is performed simultaneously with an
approximation of the objective functional and the feasible set. Later on analogous
approaches became known as iterative regularization.
A method using the stabilizing properties of the proximal mapping (see
Moreau [83]) was introduced by Martinet [80] for the unconstrained minimiza-
tion of a convex functional in a Hilbert space. Rockafellar [98] laid the
theoretical foundation for the further advances in iterative proximal point reg-
ularization for ill-posed VI's with monotone operators and for convex optimization
problems. These results attracted the attention of numerical analysts, and the
number of papers in this field has increased rapidly during the last decades
(cf. [106]).
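For a convex quadratic $f(x)=\frac12 x^TQx-q^Tx$ the proximal subproblem has a closed form, so Martinet's iteration $x^{k+1}=\operatorname{argmin}_z\{f(z)+\frac{1}{2\lambda}\|z-x^k\|^2\}$ can be sketched directly; the rank-deficient test matrix below is an illustrative ill-posed instance, not an example taken from [80] or [98]:

```python
import numpy as np

def proximal_point(Q, q, x, lam=1.0, iters=100):
    """Iterate x_{k+1} = argmin_z 0.5*z@Q@z - q@z + (1/(2*lam))*||z - x_k||^2."""
    M = Q + np.eye(len(q)) / lam             # regularized, hence invertible, system
    for _ in range(iters):
        x = np.linalg.solve(M, q + x / lam)  # closed-form proximal step
    return x

# Ill-posed instance: Q is singular, the minimizers form the line x1 + x2 = 1,
# yet every proximal subproblem is uniquely solvable; starting from the origin
# the iterates converge to the minimum-norm solution (0.5, 0.5).
Q = np.array([[1.0, 1.0], [1.0, 1.0]])
print(proximal_point(Q, np.array([1.0, 1.0]), np.zeros(2)))
```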
5. Conclusion
Meanwhile optimization has grown into a powerful stream, fed by the ideas and
works of thousands of mathematicians. New directions of development have
opened up, such as robust programming, variational inequalities and complementar-
ity problems, mathematical programming with equilibrium constraints, PDE-con-
strained optimization, shape optimization and much more.
Likewise, the area of applications of mathematical programming has widened
continuously; new optimization models in medicine and neurology, drug design
and biomedicine, to name only a few, have appeared besides the classical application
areas in economics and the natural sciences. At the 21st International Symposium
on Mathematical Programming 2012 in Berlin one could count 40 parallel sessions
with approximately 1700 talks and about 2000 participants. Remember, at the
congress in Chicago in 1949 there were at most two dozen talks. In the face of
these figures it becomes clear that it is almost impossible to name all new
developments, and one has to be selective in presenting the available results.
This paper is an attempt to describe the early beginnings and some selected
mathematical ideas in optimization. The emphasis is mainly on the progress of
the American and Russian schools, which in my opinion have distinctly influenced
the development of mathematical programming. However, I am aware that the
selected material and topics mostly reflect my personal view. It is completely
obvious that other important trends in mathematical optimization have been
neglected and that substantial contributions of other nations remain unmentioned.
Acknowledgement
I would like to express my gratitude to V.F. Demyanov, who provided me with
some photos of Russian mathematicians from his collection Faces and Places.
Other photos have been taken from the books [75] and [40] and from Wikipedia.
References
[1] N.I. Akhiezer, The Classical Problem of Moments, Fizmatgiz, Moscow, 1961. (in
Russian)
[2] C. Baiocchi and A. Capelo, Variational and Quasi-Variational Inequalities: Appli-
cations to Free Boundary Problems, J. Wiley & Sons, Chichester, 1984.
[3] A.V. Balakrishnan, Optimal control problems in Banach spaces, J. Soc. Ind. Appl.
Math., Ser. A, Control 3 (1965) 152–180.
[4] E.M.L. Beale, The use of quadratic programming in stochastic linear programming,
RAND Report P-2404, RAND Corporation, 1961.
[5] J.F. Benders, Partitioning procedures for solving mixed-variables programming
problems, Numer. Math. 4 (1962) 238–252. doi:10.1007/BF01386316
[6] R. Bellman, Dynamic Programming, Princeton Univ. Press, Princeton, 1957.
[7] L.D. Berkovitz, An existence theorem for optimal control, JOTA 4 (1969) 77–86.
doi:10.1007/BF00927413
[8] V.G. Boltyanskij, Method of tents in the theory of extremum problems, Uspekhi
Matem. Nauk 30 (1975) 3–55. (in Russian)
[9] V.G. Boltyanskij, Mathematische Methoden der optimalen Steuerung, Akad. Ver-
lagsgesellschaft Geest & Portig, Leipzig, 1971.
[10] V.G. Boltyanskij, Optimale Steuerung diskreter Systeme, Akad. Verlagsgesellschaft
Geest & Portig, Leipzig, 1976.
[11] O. Bolza, Über Variationsprobleme mit Ungleichungen als Nebenbedingungen,
Math. Abhandlungen 1 (1914) 1–18.
[12] A.G. Butkovskij, Distributed Control Systems, Isd. Nauka, Moscow, 1969. (in
Russian)
[13] C. Carathéodory, Calculus of Variations and Partial Differential Equations of the
First Order, New York, Chelsea Publishing Co., 1982.
[14] A. Charnes, W.W. Cooper and B. Mellon, A model for optimizing production by
reference to cost surrogates, Econometrica 23 (1955) 307–323. doi:10.2307/1910387
[15] A. Charnes and W.W. Cooper, Chance-constrained programming, Management Sci.
5 (1959) 73–79.
[16] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[17] F.H. Clarke, V.F. Demyanov and F. Giannessi, Nonsmooth Optimization and Related
Topics, Plenum Press, New York, 1989.
[18] G.B. Dantzig, Linear Programming and Extensions, Princeton, 1963.
[19] G.B. Dantzig, Linear Inequalities and Related Systems, Princeton Univ. Press,
Princeton, 1966.
[20] G.B. Dantzig, Lectures in Differential Equations, Van Nostrand, Reinhold Co.,
New York, 1969.
[21] G.B. Dantzig, Linear programming under uncertainty, Management Sci. 1 (1955)
197–206. doi:10.1287/mnsc.1.3-4.197
[22] G.B. Dantzig and A. Madansky, On the solution of two-stage linear programs under
uncertainty, in: Proc. 4th Berkeley Symp. Math. Stat. Prob., Berkeley, 1961, pp. 165–176.
[23] G.B. Dantzig and Ph. Wolfe, Decomposition principles for linear programs, Oper.
Res. 8 (1960) 101–111. doi:10.1287/opre.8.1.101
[24] W.C. Davidon, Variable Metric Methods for Minimization, A.E.C. Res. and De-
velop. Report ANL-5990, Argonne National Laboratory, Illinois, 1959.
[25] V.F. Demyanov and A.M. Rubinov, Approximate Methods for Solving Extremum
Problems, Leningrad State Univ., Leningrad, 1968. (in Russian)
[26] V.F. Demyanov and V.N. Malozemov, Introduction to Minimax, Nauka, Moscow,
1972. (in Russian)
[27] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable Optimization, Nauka,
Moscow, 1981. (in Russian)
[28] V.F. Demyanov and A.M. Rubinov, Foundations of Nonsmooth Analysis and Qua-
sidifferential Calculus, Nauka, Moscow, 1990. (in Russian)
[29] J. Dennis and R. Schnabel, Numerical Methods for Unconstrained Optimization


and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, 1983.
[30] I. Dikin, An iterative solution of problems of linear and quadratic programming,
Doklady AN SSSR 174 (1967) 747–748. (in Russian)
[31] A.Ya. Dubovitskij and A.A. Milyutin, Extremum problems under constraints, Dok-
lady AN SSSR 149 (1963) 759–762. (in Russian)
[32] Y.M. Ermoliev, Stochastic Programming Methods, Nauka, Moscow, 1976. (in
Russian)
[33] L. Euler, Methodus inveniendi lineas curvas maximi minimive proprietate gau-
dentes, sive solutio problematis isoperimetrici latissimo sensu accepti, Opera Om-
nia, Series 1, Volume 24, 1744.
[34] J. Farkas, Theorie der einfachen Ungleichungen, Journal für die Reine und Ange-
wandte Mathematik 124 (1902) 1–27.
[35] A.V. Fiacco and G.P. McCormick, Nonlinear Programming: Sequential Uncon-
strained Minimization Techniques, Wiley, 1968.
[36] R. Fletcher and C.M. Reeves, Function minimization by conjugate gradients, Com-
puter J. 7 (1964) 149–154. doi:10.1093/comjnl/7.2.149
[37] L.R. Ford and D.R. Fulkerson, Flows in Networks, Princeton Univ. Press, Prince-
ton, 1962.
[38] J.B. Fourier, Analyse des Équations Déterminées, Navier, Paris, 1831.
[39] L. Garding, The Dirichlet problem, Math. Intelligencer 2 (1979) 42–52.
doi:10.1007/BF03024386
[40] S.I. Gass and A.A. Assad, An annotated timeline of operations research: an infor-
mal history, Int. Ser. in OR & Management Sci. 75 (2005).
[41] R. Glowinski, J.L. Lions and R. Trémolières, Numerical Analysis of Variational
Inequalities, North-Holland, Amsterdam, 1981.
[42] D. Goldfarb, Extension of Davidon’s variable metric method to maximization under
linear inequality and equality constraints, SIAM J. Appl. Math. 17 (1969) 739–764.
doi:10.1137/0117067
[43] A.J. Goldman and A.W. Tucker, Theory of linear programming, in: Kuhn
and Tucker (eds.), Linear Inequalities and Related Systems, Princeton, 1956,
pp. 53–97.
[44] R. Gomory, Outline of an algorithm for integer solutions to linear programs, Bull.
Amer. Math. Soc. 64 (1958) 275–278. doi:10.1090/S0002-9904-1958-10224-4
[45] H. Halkin, A maximum principle of Pontryagin, SIAM J. Control 4 (1966) 90–112.
doi:10.1137/0304009
[46] M.R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear
systems, J. Res. Natl. Bur. Stand. 49 (1952) 409–436. doi:10.6028/jres.049.044
[47] M.R. Hestenes, Calculus of Variational and Optimal Control Theory, Wiley, New
York, 1966.
[48] A.D. Ioffe and V.M. Tikhomirov, Theory of Extremum Problems, Nauka, Moscow,
1974. (in Russian)
[49] C.G.J. Jacobi, Vorlesungen über analytische Mechanik : Berlin 1847/48, (Nach
einer Mitschr. von Wilhelm Scheibner), Hrsg. von Helmut Pulte, Deutsche
Mathematiker-Vereinigung, Braunschweig, Vieweg, 1996.
[50] F. John, Extremum problems with inequalities as subsidiary conditions, in: Studies
and Essays, Presented to R. Courant on his 60th Birthday, Jan. 1948, Interscience,
New York, 187–204.
[51] P. Kall, Stochastic Linear Programming, Springer Verlag, Berlin, 1976.
doi:10.1007/978-3-642-66252-2
[52] L.V. Kantorovich, Mathematical Methods for Production Organization and Plan-
ning, Leningrad, 1939. (in Russian)
[53] L.V. Kantorovich, On an efficient method for solving some classes of extremum
problems, Doklady AN SSSR 28 (1940) 212–215. (in Russian)
[54] L.V. Kantorovich, Functional analysis and applied mathematics, Uspekhi Matem.
Nauk 6 (1948) 89–185. (in Russian)
[55] L.V. Kantorovich, Functional Analysis, Nauka, Moscow, 1959. (in Russian)
[56] L.V. Kantorovich, Economical Calculation of the Best Utilization of Resources,
Moscow, Academy of Sciences, 1960. (in Russian)
[57] L.V. Kantorovich, The Best Use of Economic Resources, Harvard U. Press, Cam-
bridge, 1965.
[58] N. Karmarkar, A new polynomial time algorithm for linear programming, Combi-
natorica 4 (1984) 373–395. doi:10.1007/BF02579150
[59] W. Karush, Minima of functions of several variables with inequalities as side con-
ditions, MSc Thesis, Univ. of Chicago, 1939.
[60] L. Khachiyan, A polynomial algorithm in linear programming, Doklady AN SSSR
244 (1979) 1093–1096. (in Russian)
[61] D. Kinderlehrer and G. Stampacchia, An Introduction to Variational Inequalities
and their Applications, Acad. Press, New York-London, 1980.
[62] K. Kiwiel, Proximal level bundle methods for convex nondifferentiable optimization,
saddle point problems and variational inequalities, Math. Programming 69 (B)
(1995) 89–109.
[63] V. Klee and G. Minty, How good is the simplex algorithm? in: Inequalities III, O.
Shisha, ed., Academic Press, New York, 1972, 159–175.
[64] T.C. Koopmans, Exchange Ratios between Cargoes on Various Routes (Non-
Refrigerated Dry Cargoes), Memorandum for the Combined Shipping Adjustment
Board, Washington, D.C., 1942.
[65] T.C. Koopmans, Activity Analysis of Production and Allocation, Wiley, New York,
1951.
[66] T.C. Koopmans and M. Montias, On the Description and Comparison of Economic
Systems, in: Comparison of Economic Systems, Univ. of California Press, 1971,
27–78.
[67] M.G. Krein and A.A. Nudelman, Markov’s Problem of Moments and Extremum
Problems, Nauka, Moscow, 1973. (in Russian)
[68] H.W. Kuhn and A.W. Tucker, Nonlinear Programming, Proc. of the Second Berke-
ley Symp. in Math., Statistics and Probability, Univ. of California Press, Berkeley,
1951, 481–492.
[69] H.W. Kuhn and A.W. Tucker, Linear and nonlinear programming, Oper. Res. 5
(1957) 244–257.
[70] J.L. Lagrange, Théorie des Fonctions Analytiques, Journal de l’École Polytech-
nique, 1797.
[71] J.L. Lagrange, Mécanique Analytique, Journal de l’École Polytechnique, 1788.
[72] L.M. Lancaster, The history of the application of mathematical programming to
menu planning, European Journal of Operational Research 57 (1992) 339–347.
doi:10.1016/0377-2217(92)90345-A
[73] L.J. Leifman, Functional Analysis, Optimization and Mathematical Economics: A
Collection of Papers Dedicated to the Memory of L.V. Kantorovich, Oxford Univ.
Press, 1990.
[74] C. Lemarechal, Lagrangian decomposition and nonsmooth optimization: Bundle
algorithm, prox iterations, augmented Lagrangian, in: Nonsmooth Optimization:
Methods and Applications (Erice, 1991) Gordon and Breach, Montreux, 1992,
201–206.
[75] J.K. Lenstra, A.H.G. Rinnooy Kan and A. Schrijver, History of Mathematical
Programming, CWI, North-Holland, 1991.
[76] K. Levenberg, A method for the solution of certain nonlinear problems in least
squares, Quarterly of Applied Mathem. 2 (1944) 164–168.
[77] E.S. Levitin and B.T. Polyak, Minimization methods under constraints, Zhurn.
Vychisl. Matem. i Matem. Fiz. 6 (1966) 787–823. (in Russian)
[78] J.L. Lions and G. Stampacchia, Variational inequalities, Comm. Pure Appl. Math.
20 (1967) 493–519. doi:10.1002/cpa.3160200302
[79] O.G. Mancino and G. Stampacchia, Convex programming and variational inequal-
ities, JOTA 9 (1972) 3–23. doi:10.1007/BF00932801
[80] B. Martinet, Régularisation d’inéquations variationelles par approximations suc-
cessives, RIRO 4 (1970) 154–159.
[81] H. Minkowski, Allgemeine Lehrsätze über konvexe Polyeder. Nachrichten von der
königlichen Gesellschaft der Wissenschaften zu Göttingen, Math.-Physik. Klasse
(1897) 198–219.
[82] G. Monge, Mémoire sur la théorie des déblais et des remblais. Histoire de l'Académie
Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique
pour la même année (1781) 666–704.
[83] J.J. Moreau, Proximité et dualité dans un espace Hilbertien, Bull. Soc. Math. France
93 (1965) 273–299.
[84] U. Mosco, Convergence of convex sets and of solutions of variational inequalities,
Advances in Mathematics 3 (1969) 510–585. doi:10.1016/0001-8708(69)90009-7
[85] J.v. Neumann, Zur Theorie der Gesellschaftsspiele, Math. Ann. 100 (1928)
295–320. doi:10.1007/BF01448847
[86] J.v. Neumann and O. Morgenstern, The Theory of Games and Economic Behavior,
Princeton Univ. Press, Princeton, 1944.
[87] L.W. Neustadt, The existence of the optimal control in the absence of convexity
conditions, J. Math. Anal. Appl. 7 (1963) 110–117.
doi:10.1016/0022-247X(63)90081-7
[88] M. Padberg, Linear Optimization and Extensions, Springer, 1995.
[89] C. van de Panne and W. Popp, Minimum-cost cattle feed under probabilistic problem con-
straint, Management Sci. 9 (1963) 405–430. doi:10.1287/mnsc.9.3.405
[90] B.T. Polyak, History of mathematical programming in the USSR, Math. Program-
ming (B) 91 (2002) 401–416. doi:10.1007/s101070100258
[91] L.S. Pontryagin, V.G. Boltyanski and R.V. Gamkrelidze, Mathematical Theory of
Optimal Processes, Nauka, Moscow, 1956. (in Russian)
[92] M. Powell, A method for nonlinear constraints in minimization problems, in:
Fletcher, R. (ed.): Optimization, Acad. Press, New York, 1969, 283–298.
[93] B.N. Pshenichnyj, Necessary Conditions for Extrema, Nauka, Moscow, 1969. (in
Russian)
[94] B.N. Pshenichnyj, Convex Analysis and Extremum Problems, Nauka, Moscow,
1980. (in Russian)
[95] C.J. Poussin, Sur la méthode de l'approximation minimum, Annales de la Société
Scientifique de Bruxelles 35 (1911) 1–16.
[96] E.Ya. Remez, Foundations of Numerical Methods for Chebyshev Approximation,
Naukova Dumka, Kiew, 1969. (in Russian)
[97] R.T. Rockafellar, Convex Analysis, Princeton University Press, 1970.
[98] R.T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J.
Control Optim. 14 (1976) 877–898. doi:10.1137/0314056
[99] N. Rouche, P. Habets and M. Laloy, Stability Theory by Lyapunov’s Direct Method,
Springer, 1977. doi:10.1007/978-1-4684-9362-7
[100] E. Roxin, The existence of optimal control, Michigan Math. J. 9 (1962) 109–119.
doi:10.1307/mmj/1028998668
[101] H. Schramm and J. Zowe, A version of the bundle idea for minimizing a nonsmooth
function: Conceptual idea, convergence analysis, numerical results, SIAM J. Optim.
2 (1992) 121–152. doi:10.1137/0802008
[102] N. Shor, Application of the subgradient method for the solution of network trans-
port problems, Notes of Sc. Seminar, Ukrainian Acad. of Sci., Kiew, 1962. (in
Russian)
[103] M. Slater, Lagrange Multipliers Revisited, Cowles Commission Discussion Paper,
No. 403, 1950.
[104] G. Stampacchia, Formes bilineares coercives sur les ensembles convexes, Comptes
Rendus Acad. Sciences Paris 258 (1964) 4413–4416.
[105] G.J. Stigler, Econometrica, 1949.
[106] R. Tichatschke, Proximal Point Methods for Variational Problems (2011),
http://www.math.uni-trier.de/~tichatschke/monograph.de.html
[107] A.N. Tikhonov, Improper problems of optimal planning and stable methods of their
solution, Soviet Math. Doklady 6 (1965) 1264–1267 (in Russian).
[108] A.N. Tikhonov, Methods for the regularization of optimal control problems, Soviet
Math. Doklady 6 (1965) 761–763 (in Russian).
[109] A.N. Tikhonov, On the stability of the functional optimization problems, J. Num.
Mat. i Mat. Fis. 6 (1966) 631–634 (in Russian).
[110] G. Tintner, Stochastic linear programming with applications to agricultural eco-
nomics, 2nd Symp. Linear programming 2 (1955) 197–228.
[111] E.S. Ventzel, Elements of Game Theory, Fizmatgiz, Moscow, 1959. (in Russian)
[112] N.N. Vorobyev, Finite non-coalition games, Uspekhi Matem. Nauk 14 (1959)
21–56. (in Russian)
[113] D.B. Yudin and E.G. Golstein, Problems and Methods of Linear Programming,
Sov. Radio, Moscow, 1961. (in Russian)
[114] D.B. Yudin and A.S. Nemirovskij, Complexity of Problems and Efficiency of Opti-
mization Methods, Nauka, Moscow, 1979. (in Russian)
[115] A.P. Yushkevich, History of Mathematics in Russia before 1917, Nauka, Moscow,
1968. (in Russian)
[116] S.I. Zukhovitskij, On the approximation of real functions in the sense of P.L.
Chebyshev, Uspekhi Matem. Nauk 11 (1956) 125–159. (in Russian)
Received 24 May 2012