Table of Contents
- Saddle points
- Lagrangian duality as a saddle-point
- Stranger duals: Turning Lagrangian duality into von Neumann's minimax theorem
Saddle points
Consider a two-player zero-sum game, where one player wishes to maximize the value of the game and the other wishes to minimize it. We can model the value of the game as a function $f : X \times Y \to \mathbb{R}$, where the first player chooses $x \in X$ to minimize $f$ and the second chooses $y \in Y$ to maximize it. Without any assumptions on $f$, we get the following inequality, called the max-min inequality:
$$\max_{y \in Y} \min_{x \in X} f(x, y) \le \min_{x \in X} \max_{y \in Y} f(x, y).$$
We can interpret the sides of this inequality as follows. On the right hand side, $x$ makes the first move, followed by $y$, who has knowledge of $x$'s move. On the left hand side, $y$ makes the first move, and $x$ follows with knowledge of $y$'s move. Intuitively, it is better for you if you play after your opponent, and this is exactly what the max-min inequality says.
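As a toy illustration (my own example, not from any reference): in matching pennies with pure strategies, the two orderings of play give different values, so the max-min inequality is strict.

```python
# Toy illustration of the max-min inequality on a finite game.
# The row player x minimizes, the column player y maximizes.
# Matching pennies payoff: f(x, y) = 1 if the choices match, else -1.

X = ["heads", "tails"]
Y = ["heads", "tails"]

def f(x, y):
    return 1 if x == y else -1

# y moves first, x responds with knowledge of y's move:
max_min = max(min(f(x, y) for x in X) for y in Y)

# x moves first, y responds with knowledge of x's move:
min_max = min(max(f(x, y) for y in Y) for x in X)

print(max_min, min_max)  # -1 1
assert max_min <= min_max
```

The gap between the two values ($-1 < 1$) reflects that whoever moves second can always win; no pure-strategy saddle point exists here.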
A key question is the existence of a saddle point: a pair $(x^\star, y^\star) \in X \times Y$ such that $f(x^\star, y) \le f(x^\star, y^\star) \le f(x, y^\star)$ for all $x \in X$ and $y \in Y$. When a saddle point exists, the max-min inequality holds with equality.
We say $f$ is convex-concave if $f(\cdot, y)$ is convex for each $y \in Y$ and $f(x, \cdot)$ is concave for each $x \in X$. [^1]
In this setting, we have the following result, von Neumann's minimax theorem.
If $X$ and $Y$ are compact convex sets and $f$ is convex-concave, then we have
$$\min_{x \in X} \max_{y \in Y} f(x, y) = \max_{y \in Y} \min_{x \in X} f(x, y).$$
In fact, there exists a generalization of this theorem to quasiconvex-quasiconcave functions, called Sion's minimax theorem, although I will not use it any further in this post. In applications, the biggest constraint imposed by the compactness of $X$ and $Y$ is that those sets must be bounded.
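To see the theorem in action, here is a small numerical sketch with an example function of my own choosing: $f(x, y) = x^2 + 2xy - y^2$ is convex in $x$ and concave in $y$, and on the compact convex sets $X = Y = [-1, 1]$ a grid search finds the same value for both orderings of play.

```python
# Numerical sketch of von Neumann's minimax theorem.
# f(x, y) = x^2 + 2xy - y^2 is convex in x and concave in y,
# and X = Y = [-1, 1] are compact convex sets.

def f(x, y):
    return x * x + 2 * x * y - y * y

grid = [i / 100 for i in range(-100, 101)]  # discretization of [-1, 1]

min_max = min(max(f(x, y) for y in grid) for x in grid)
max_min = max(min(f(x, y) for x in grid) for y in grid)

print(min_max, max_min)  # 0.0 0.0
```

Both orderings give the value $0$, attained at the saddle point $(x^\star, y^\star) = (0, 0)$, exactly as the theorem predicts.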
Lagrangian duality as a saddle-point
Let $f, g_1, \dots, g_m$ be convex functions, defined on a convex domain $\mathcal{D} \subseteq \mathbb{R}^n$. We consider convex optimization problems in the following standard form:
$$\begin{aligned} \min_{x \in \mathcal{D}} \quad & f(x) \\ \text{subject to} \quad & g_i(x) \le 0, \qquad i = 1, \dots, m. \end{aligned}$$
Denote the optimal value of this problem by $p^\star$. As is standard in convex optimization, we define a quantity, the Lagrangian, for $\lambda \in \mathbb{R}^m_{\ge 0}$ as follows:
$$L(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i g_i(x).$$
Observe that if $g_i(x) > 0$ for any $i$, then taking $\lambda_i \to \infty$ we can make $L(x, \lambda)$ arbitrarily large. Thus we may actually think of the optimization problem above as a two-player game, where $x$ (the 'primal player') plays first trying to minimize $L$, and $\lambda$ (the 'dual player') plays second trying to maximize it. Then we have
$$\begin{aligned} p^\star &= \min_{x \in \mathcal{D}} \max_{\lambda \ge 0} L(x, \lambda) \\ &\ge \max_{\lambda \ge 0} \min_{x \in \mathcal{D}} L(x, \lambda) =: d^\star. \end{aligned}$$
In the last line, we define $d^\star$ to be the value of $\max_{\lambda \ge 0} \min_{x \in \mathcal{D}} L(x, \lambda)$, the Lagrangian dual optimization problem. In contrast, we call the original optimization problem the primal. By applying the max-min inequality, we have just shown weak duality: $d^\star \le p^\star$.
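As a concrete sketch (the toy problem is my own choice, not from the text): minimize $f(x) = x^2$ subject to $g(x) = 1 - x \le 0$, so $p^\star = 1$ at $x = 1$. The Lagrangian is $L(x, \lambda) = x^2 + \lambda(1 - x)$, the dual function works out to $q(\lambda) = \lambda - \lambda^2/4$, and $d^\star = 1$ at $\lambda = 2$. A crude grid search recovers both values:

```python
# Weak duality on a toy problem: minimize x^2 subject to 1 - x <= 0.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x), with lam >= 0.

def lagrangian(x, lam):
    return x * x + lam * (1 - x)

xs = [i / 100 for i in range(-300, 301)]   # grid over the primal domain
lams = [i / 100 for i in range(0, 301)]    # grid over lam >= 0

# Primal value p*: minimize f over feasible points (1 - x <= 0).
p_star = min(x * x for x in xs if 1 - x <= 0)

# Dual value d*: maximize over lam the dual function min_x L(x, lam).
d_star = max(min(lagrangian(x, lam) for x in xs) for lam in lams)

print(p_star, d_star)  # 1.0 1.0
assert d_star <= p_star + 1e-9  # weak duality
```

Here the two values coincide; in this example strong duality in fact holds, which is where Slater's condition, discussed next, comes in.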
In convex optimization, we have a very famous form of a minimax theorem (although for some reason, I almost never see it called that). It is Slater's condition:
If there exists $\tilde{x} \in \operatorname{relint} \mathcal{D}$ such that $g_i(\tilde{x}) < 0$ for all $i$, then $p^\star = d^\star$. For any $i$ such that $g_i$ is affine, we only require $g_i(\tilde{x}) \le 0$.
In convex optimization, the situation where $p^\star = d^\star$ is called strong duality, but I really think of it as just a minimax theorem. This is really clear when we write the conclusion to match von Neumann's minimax theorem:
$$\min_{x \in \mathcal{D}} \max_{\lambda \ge 0} L(x, \lambda) = \max_{\lambda \ge 0} \min_{x \in \mathcal{D}} L(x, \lambda).$$
Despite the obvious similarity in its statement, this theorem is not directly comparable to von Neumann's minimax theorem. We can immediately observe the following differences:
- $L$ is a convex-linear function (convex in $x$, affine in $\lambda$), not just convex-concave. Linearity implies concavity, so this condition is stronger.
- The dual player's set $\mathbb{R}^m_{\ge 0}$ is fixed in Slater's condition, while in von Neumann's minimax theorem $Y$ can be any compact convex set.
- We place a condition on $g_1, \dots, g_m$ that is often merely technical in applications: that there exists some point $\tilde{x}$ that is strictly feasible with respect to the constraints, i.e. $g_i(\tilde{x}) < 0$ for all $i$.
- $\mathcal{D}$ is allowed to be any convex set; namely, it need not be compact (i.e. bounded). The dual set $\mathbb{R}^m_{\ge 0}$ is not bounded either!
I find it quite interesting, though, that if we also require the first three conditions to hold (the last of which does not usually cause trouble), we get an 'unbounded' version of the minimax theorem.
We quote the following table from [LJJ20]:
| Setting | Gradient complexity | Lower bound |
| --- | --- | --- |
| Strongly-Convex-Linear | ? | |
| Strongly-Convex-Concave | | |

Here, $\kappa$ is the strong-convexity condition number.
Stranger duals: Turning Lagrangian duality into von Neumann's minimax theorem
To be continued