# Duality
Boyd Chapter 5
The key concept of this chapter is optimizing an objective function **with constraints**: duality exchanges the roles of the objective function and the constraint functions.
## 1 Duality (Mechanism)
Let $x^*$ be an optimal point and $p^*=f_0(x^*)$ the optimal value of the problem
$\begin{aligned}
\min\; &f_0(x), \quad x \in R^n \\
\text { s.t. } &f_i(x) \leq 0, \quad i=1, \ldots, m \\
& h_i(x)=0, \quad i=1, \ldots, p
\end{aligned}$
with domain $D =\operatorname{dom} f_0 \,\cap\, \bigcap_i \operatorname{dom} f_i \,\cap\, \bigcap_i \operatorname{dom} h_i$
#def ***Lagrangian***: $L: R^n \times R^m \times R^p \rightarrow R$
$\begin{align*}
L(x, \lambda, v)=f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\sum_{i=1}^p v_i h_i(x)\\
\lambda_i, v_i \text {: Lagrange multipliers}, \; \lambda_i \in R_{+}, v_i \in R
\end{align*}$
#note The Lagrangian changes the constrained problem into an unconstrained problem by introducing more variables, i.e. $\lambda, v$.
#lemma If $x$ is feasible and $\lambda \succcurlyeq 0$, then $\sum_{i=1}^m \lambda_i f_i(x)\leq0$ and $\sum_{i=1}^p v_i h_i(x)=0$
#def ***Lagrange dual function*** (the infimum runs over all $x\in D$, not only feasible points)
$\begin{align*}
g(\lambda, v)&= \inf _{x \in D} L(x, \lambda, v)\\
\end{align*}$
#def ***Dual problem***. Step 1) for given $\lambda,v$, minimize $L(\bullet)$ over $x$ to obtain $g(\lambda,v)$. Step 2) $\max_{\lambda,v}\min_xL(\bullet)$
$\begin{align*}
\max _{\lambda, v} g(\lambda, v) &\quad\text { s.t. } \lambda \in R_{+}^m, v \in R^p\\
\max _{\lambda, v} g(\lambda, v)= &\max_{\lambda,v}\min_xL(x,\lambda,v) &\text{dual problem}\\
\underset{\text{if strong duality holds (e.g., Slater's condition)}}{\equiv} &\min_x\max_{\lambda,v}L(x,\lambda,v) &\text{primal problem}
\end{align*}$
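As a minimal worked example (not from the slides), take the one-variable problem $\min x^2$ s.t. $1-x\leq0$, so $p^*=1$ at $x^*=1$:
$\begin{align*}
L(x,\lambda)&= x^2+\lambda(1-x), \quad \lambda\geq0\\
g(\lambda)&= \inf_x L(x,\lambda)=-\tfrac{\lambda^2}{4}+\lambda \quad(\text{minimizer } x=\tfrac{\lambda}{2})\\
\max_{\lambda\geq0}g(\lambda)&= 1 \quad\text{at } \lambda=2, \text{ matching } p^*=1
\end{align*}$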
$\lambda, v$ are sometimes called ***shadow prices***: a price is incurred whenever a constraint is violated.
#lemma $g(\lambda, v)$ yields lower bounds on the optimal value $p^*=f_0(x^*)$ of the original problem.
#lemma $g(\lambda,v)$ is concave w.r.t. $\lambda, v$.
*Proof*. For each fixed $x$, $L(x,\lambda,v)$ is affine in $(\lambda,v)$:
$\begin{align*}
g(\lambda,v)&= \inf_{x\in D} f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\sum_{i=1}^p v_i h_i(x)\\
-g(\lambda,v) &= \sup_{x\in D} \underset{\text{affine (hence convex) w.r.t. } (\lambda, v)}{\underbrace{-f_0(x) -\sum_{i=1}^m \lambda_i f_i(x) -\sum_{i=1}^p v_i h_i(x) }}\\
\rightarrow -g(\lambda,v) & \quad\text{is convex w.r.t. } (\lambda,v) \text{ (pointwise sup preserves convexity)}\\
\Longleftrightarrow g(\lambda,v) & \quad\text{is concave w.r.t } (\lambda,v) \quad\text{Q.E.D.}
\end{align*}$
#lemma (Lower bound property) For any $\lambda\succcurlyeq0$ and any $v$, $g(\lambda, v)\leq p^*$
*Proof*. For any feasible $\tilde{x}$ and $\lambda\succcurlyeq0$ the following is true:
$\begin{align*}
& f_0(\tilde{x}) \geq L(\tilde{x}, \lambda, v) \quad \text {because } \sum \lambda_i f_i(\tilde{x})+\sum v_i h_i(\tilde{x}) \leq 0 \\
& L(\tilde{x}, \lambda, v) \geq g(\lambda, v)
\quad\text { by definition of } g(\lambda, v) \\
\Longrightarrow \; &p^*=f_0(x^*)\geq g(\lambda,v) \quad(\text{take } \tilde{x}=x^*) \quad\text{Q.E.D.}
\end{align*}$
Evaluating $g$ at any dual-feasible $(\lambda,v)$ thus provides knowledge of the "gap" $p^*-g(\lambda,v)$ to optimality.
## 2 Primal-dual Conversion
### 2.1 LP - Linear Programming
#### 2.1.1 LP with inequality constraints only
$\min c^T x,\;
\text { subject to }\left\{\begin{array} { c }
{ A x \preccurlyeq b } \\
{ x \succcurlyeq0 }
\end{array} \Rightarrow \left\{\begin{array}{c}
A x-b \preccurlyeq 0 \\
-x \preccurlyeq 0
\end{array}\right.\right.
$
**Lagrangian** (there is no $v$ since there are no equality constraints):
$\begin{align*}
L(x, \lambda) & =c^T x+\lambda_I^T(A x-b)-\lambda_{I I}^T x \\
& =-\lambda_I^T b+\left(A^T \lambda_I-\lambda_{II}+c\right)^T x, \quad \lambda_I, \lambda_{I I} \succcurlyeq0 \\
g(\lambda) &= \inf_x L(x,\lambda)= \begin{cases}
-b^T \lambda_I, &A^T\lambda_I+c\succcurlyeq0 \quad (\lambda_{II}=A^T\lambda_I+c, \text{ i.e., } A^T\lambda_I-\lambda_{II}+c=0)\\
-\infty, &\text{otherwise} \quad (A^T\lambda_I-\lambda_{II}+c\neq0)
\end{cases}
\end{align*}$
**Dual**:
$\begin{align*}
g(\lambda) &= -b^T \lambda_I\\
\max_\lambda g(\lambda)&= \max_\lambda -b^T\lambda_I \equiv\min_\lambda b^T\lambda_I, \quad \text{s.t. } A^T\lambda_I+c\succcurlyeq0, \;\lambda_{I}\succcurlyeq0\\
\end{align*}$
(the two forms are equivalent as problems; the minimization flips the sign of the optimal value)
#lemma For boundedness we need $\nabla f_0(x)=c\in K^*$, where $c=-A^T\lambda_I+\lambda_{II}$:
$\begin{align*}
K^*&=\set{[-A^T\;\;I]\begin{bmatrix}\lambda_I\\\lambda_{II}\end{bmatrix}\bigg|\; \begin{bmatrix}\lambda_I \\ \lambda_{II}\end{bmatrix}\succcurlyeq0 }\\
K&= \set{x|Ax \preccurlyeq0,\; x \succcurlyeq0} \quad\text{(the recession cone of the feasible set)}\\\\
\end{align*}$
**Dual of dual** is the primal for LP:
$\begin{align*}
&\min_\lambda b^T \lambda, \quad\text{s.t. } -A^T\lambda-c\preccurlyeq0, -\lambda\preccurlyeq0\\
L(\lambda,x)&= b^T \lambda+x_I^T(-A^T\lambda-c)+x_{II}^T(-\lambda)\\
&= (b-Ax_I-x_{II})\lambda-x_I^Tc\\
g(x)&= \min_{\lambda}L(\lambda,x)=\min_\lambda(b-Ax_I-x_{II})\lambda-x_I^Tc\\
&=\begin{cases}
-Ax_I-x_{II}+b=0 &\rightarrow g(x)-x_{I}^Tc\\
-Ax_I-x_{II}+b\neq0 &\rightarrow g(x)=-\infty\\
\end{cases}\\
\end{align*}$
$\begin{align*}
\max_xg(x)&= \max_x -c^Tx_I, \quad\text{s.t. } Ax_I+x_{II}=b, \;x_I,x_{II} \succcurlyeq0\\
&\equiv \min c^Tx, \quad\text{s.t. }Ax\preccurlyeq b, x\succcurlyeq0
\end{align*}$
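The primal-dual pair can be checked numerically. A minimal sketch using `scipy.optimize.linprog` (not part of the notes; the data $A, b, c$ below are made up): solve the primal and the dual of a small LP and compare optimal values.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical small LP: min c^T x  s.t.  A x <= b, x >= 0
c = np.array([1.0, 2.0])
A = np.array([[-1.0, -1.0],    # encodes x1 + x2 >= 1
              [ 1.0, -2.0]])
b = np.array([-1.0, 2.0])

# Primal: linprog minimizes c^T x s.t. A_ub x <= b_ub, with x >= 0 bounds.
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

# Dual from above: max -b^T l  s.t.  A^T l + c >= 0, l >= 0,
# rewritten for linprog as: min b^T l  s.t.  -A^T l <= c, l >= 0.
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(0, None)] * 2)

print("p* =", primal.fun)     # 1.0
print("d* =", -dual.fun)      # 1.0: strong duality holds for feasible LPs
```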
#### 2.1.2 LP with inequality and equality constraints
$\begin{align*}
\min c^T x,\;
\text { subject to }\left\{\begin{array} { c }
{ A x = b } \\
{ x \succcurlyeq0 }
\end{array} \Rightarrow \left\{\begin{array}{c}
A x-b = 0 \\
-x \preccurlyeq 0
\end{array}\right.\right.
\end{align*}$
Lagrangian:
$\begin{align*}
L(x, \lambda, v) & =c^T x+\lambda^T(-x)+v^T(A x-b) \\
& =-b^T v+\left(c+A^T v-\lambda\right)^T x
\end{align*}$
Dual:
$\begin{align*}
g(\lambda,v)=\inf_x L(x,\lambda,v)=\begin{cases}
-b^Tv &\text{if }A^Tv-\lambda+c=0 \\
-\infty &\text{otherwise}
\end{cases}
\end{align*}$
#lemma $g$ is linear on affine domain $\set{(\lambda, v) \mid A^T v-\lambda+c=0}$, hence $g$ is concave
#lemma On the domain of $g$ we have $\lambda=A^Tv+c$, so $\lambda \succcurlyeq0 \Longleftrightarrow A^Tv+c \succcurlyeq0$; and whenever $A^Tv+c \succcurlyeq0$, $p^*\geq -b^Tv$. The dual problem is therefore:
$\begin{align*}
&\max-b^Tv, &A^Tv+c \succcurlyeq0 \quad\text{or}\\
&\max b^Tv, &A^Tv-c \preccurlyeq0
\end{align*}$
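The same numerical check for the equality-constrained form, again a sketch with made-up data and `scipy.optimize.linprog` (here $v$ is free):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical standard-form LP: min c^T x  s.t.  A x = b, x >= 0
c = np.array([1.0, 1.0, 2.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])

primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)

# Dual: max -b^T v  s.t.  A^T v + c >= 0 (v free),
# i.e. for linprog: min b^T v  s.t.  -A^T v <= c.
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(None, None)])
print(primal.fun, -dual.fun)   # both 1.0
```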
### 2.2 QP - Quadratic Programming
#### 2.2.1 Quadratic form involving only $x$
$\begin{align*}
\min x^Tx, \quad\text{s.t. } Ax=b
\end{align*}$
**Lagrangian**:
$\begin{align*}
L(x,v)=x^T x+v^T(A x-b)
\end{align*}$
To minimize $L$ over $x$, we set $\nabla_x L(x,v)=2x+A^Tv=0$, which gives:
$\begin{align*}
x=-\frac{1}{2} A^T v \quad\text{(1)}
\end{align*}$
**Dual**:
$\begin{align*}
g(v)=L\left(x=-\frac{1}{2} A^T v, v\right)=-\frac{1}{4} v^T A A^T v-b^T v
\end{align*}$
which is concave w.r.t. $v$
$p^* \geq-\frac{1}{4} v^T A A^T v-b^T v, \quad \forall v$, by the lower bound property of Lagrange dual function
To maximize $g(v)$, we set $\nabla g(v)=-\frac{1}{2}AA^Tv-b=0 \rightarrow v=-2(AA^T)^{-1}b$ (assuming $A$ has full row rank). Thus we have
$\begin{align*}
g(v^*)=-\frac{1}{4} v^{*T} A A^T v^*-b^T v^*=b^T\left(A A^T\right)^{-1} b \quad\text{(2)}
\end{align*}$
By (1), (2):
$\begin{align*}
\left\{\begin{array}{c}
x^*=A^T\left(A A^T\right)^{-1} b \\
p^*=x^{* T} x^*=b^T\left(A A^T\right)^{-1} b
\end{array}\right.
\end{align*}$
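A quick numerical sanity check of (1), (2) with numpy (the matrix $A$ and vector $b$ below are arbitrary full-row-rank data, not from the notes):

```python
import numpy as np

# Hypothetical underdetermined system (A full row rank).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
b = np.array([1.0, 2.0])

v = -2.0 * np.linalg.solve(A @ A.T, b)   # dual optimum from (2)
x = -0.5 * A.T @ v                       # primal optimum from (1)

print(np.allclose(A @ x, b))                               # feasibility
print(np.isclose(x @ x, b @ np.linalg.solve(A @ A.T, b)))  # p* = b^T (A A^T)^{-1} b
print(np.allclose(x, np.linalg.pinv(A) @ b))               # least-norm solution
```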
#### 2.2.2 Quadratic form involving $P\in \mathbb{S}_{++}^n$ and $x$
$\begin{align*}
\min x^TPx, \quad\text{s.t. } Ax\preccurlyeq b \qquad (P\in\mathbb{S}_{++}^n \text{ given})
\end{align*}$
Lagrange dual function:
$\begin{align*}
g(\lambda)=\min_x x^TPx+\lambda^T(Ax-b)&= - \frac{1}{4}\lambda^TAP^{-1}A^T \lambda-b^T \lambda \quad(\text{minimizer } x=-\tfrac{1}{2}P^{-1}A^T\lambda)
\end{align*}$
Dual problem:
$\begin{align*}
\max_\lambda - \frac{1}{4}\lambda^TAP^{-1}A^T \lambda-b^T \lambda \quad\text{s.t.} \lambda\succcurlyeq0
\end{align*}$
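A hedged sketch with cvxpy (assuming it is installed; the data $P, A, b$ are made up): solve the primal QP, read off the solver's multiplier $\lambda^*$ for $Ax \preccurlyeq b$, and check that $g(\lambda^*)$ equals $p^*$.

```python
import numpy as np
import cvxpy as cp

# Hypothetical strictly convex instance.
P = np.array([[2.0, 0.5], [0.5, 1.0]])      # P in S++
A = np.array([[1.0, 1.0], [-1.0, 2.0]])
b = np.array([-1.0, 2.0])                   # forces x away from 0

x = cp.Variable(2)
constr = [A @ x <= b]
p_star = cp.Problem(cp.Minimize(cp.quad_form(x, P)), constr).solve()

lam = constr[0].dual_value                  # optimal lambda >= 0
g = -0.25 * lam @ A @ np.linalg.inv(P) @ A.T @ lam - b @ lam
print(p_star, g)                            # equal (Slater's condition holds)
```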
#### 2.2.3 Quadratic form with a linear term
$\begin{align*}
\min x^TAx+2b^Tx, \quad\text{s.t. } x^Tx\leq1, \;A\in\mathbb{S}^n
\end{align*}$
#note $x^TAx= x^TA^Tx=\frac{1}{2}x^T(A+A^T)x$, so WLOG we may take $A\in\mathbb{S}^n$
Dual function:
$\begin{align*}
g(\lambda)&= \min_{x}x^T(A+\lambda I)x+2b^Tx-\lambda\\
&= \begin{cases}
-\infty & A+\lambda I\not\succcurlyeq0, \text{ or } A+\lambda I \succcurlyeq0 \text{ and } b\notin R(A+\lambda I)\\
-b^T(A+\lambda I)^{\dagger}b-\lambda &\text{otherwise}
\end{cases}
\end{align*}$
Dual problem (can be converted into semidefinite programming):
$\begin{align*}
&\max_\lambda-b^T(A+\lambda I)^{\dagger}b-\lambda, \quad\text{s.t. }\lambda\geq0,\; A+\lambda I\succcurlyeq0,\; b\in R(A+\lambda I)\\
\equiv&\max_{\lambda,t} -t-\lambda, \quad\text{s.t. }\lambda\geq0,\; \begin{bmatrix}A+\lambda I&b\\b^T&t\end{bmatrix} \succcurlyeq0
\end{align*}$
The equivalence follows from a congruence transform (generalized Schur complement):
$\begin{align*}
\left[\begin{array}{cc}
I & 0 \\
-\left((A+\lambda I)^{\dagger} b\right)^T & I
\end{array}\right]\left[\begin{array}{cc}
A+\lambda I & b \\
b^T & t
\end{array}\right]\left[\begin{array}{cc}
I & -(A+\lambda I)^{\dagger} b \\
0 & I
\end{array}\right]
=\left[\begin{array}{cc}
A+\lambda I & 0 \\
0 & t-b^T(A+\lambda I)^{\dagger} b
\end{array}\right]
\end{align*}$
so the block matrix is PSD iff $A+\lambda I \succcurlyeq 0$, $b\in R(A+\lambda I)$, and $t \geq b^T(A+\lambda I)^{\dagger}b$.
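Even though this primal is nonconvex, the trust-region problem is known to have zero duality gap. A numpy sketch with made-up data (a brute-force grid stands in for the primal minimum):

```python
import numpy as np

# Hypothetical indefinite instance: A has a negative eigenvalue.
A = np.array([[1.0, 0.0], [0.0, -2.0]])
b = np.array([1.0, 1.0])

# Brute-force p*: dense polar grid over the unit disk.
r, th = np.linspace(0, 1, 200), np.linspace(0, 2 * np.pi, 400)
R, T = np.meshgrid(r, th)
X = np.stack([R * np.cos(T), R * np.sin(T)], axis=-1)
p_star = (np.einsum('...i,ij,...j->...', X, A, X) + 2 * X @ b).min()

# Dual: g(lam) = -b^T (A + lam I)^+ b - lam, valid for lam > 2 here.
lam = np.linspace(2.001, 10.0, 2000)
g = np.array([-b @ np.linalg.pinv(A + l * np.eye(2)) @ b - l for l in lam])
print(p_star, g.max())   # nearly equal: zero duality gap
```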
### 2.3 Duality via Conjugate Functions
$\begin{align*}
\min f_0(x), \quad\text{s.t. }Ax \preccurlyeq b, Cx=d
\end{align*}$
#lemma With $y=-(A^T \lambda+C^Tv)$, the $x$-dependent part of the Lagrangian is $f_0(x)-y^Tx$, and
$\begin{align*}
\inf_x\left(f_0(x)-y^Tx\right)=-\max_x\left(y^Tx-f_0(x)\right)=-f_0^*(y)
\end{align*}$
Dual function as conjugate:
$\begin{align*}
g(\lambda, v) & =\inf _{x \in \operatorname{dom} f_0}\left(f_0(x)+\lambda^T(A x-b)+v^T(C x-d)\right) \\
& =\inf _{x \in \operatorname{dom} f_0}\left(f_0(x)+\left(A^T \lambda+C^T v\right)^T x\right)-b^T \lambda-d^T v \\
& =-f_0^*\left(-A^T \lambda-C^T v\right)-b^T \lambda-d^T v \quad \text { (conjugate fn.)}\\
\quad\text{where }f_0^*(y)&= \max_{x\in \operatorname{dom} f_0}y^Tx-f_0(x)
\end{align*}$
![[Pasted image 20230214124852.png|400]]
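For instance (a standard special case, not in the slides), $f_0(x)=\frac{1}{2}\Vert x\Vert_2^2$ is self-conjugate, $f_0^*(y)=\frac{1}{2}\Vert y\Vert_2^2$, so the dual function comes out in closed form:
$\begin{align*}
g(\lambda,v)=-\tfrac{1}{2}\Vert A^T\lambda+C^Tv \Vert_2^2-b^T\lambda-d^Tv
\end{align*}$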
### 2.4 Entropy Maximization
$\begin{align*}
\min f_0(x)=\sum_{i=1}^n x_i \log x_i, \quad x \in R_{++}^n \quad\text{s.t. } Ax \preccurlyeq b, \; 1^Tx=1
\end{align*}$
#lemma $f_0^*(y)=\sum_{i=1}^n e^{y_i-1}, y_i \in R$
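*Derivation* (per coordinate): $\max_{x_i>0}\left(y_ix_i-x_i\log x_i\right)$ has stationarity $y_i-\log x_i-1=0$, so $x_i=e^{y_i-1}$, and the maximum value is $y_ie^{y_i-1}-e^{y_i-1}(y_i-1)=e^{y_i-1}$.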
Dual function:
$\begin{align*}
g(\lambda, v) & =-b^T \lambda-v-\sum_{i=1}^n e^{-a_i^T \lambda-v-1} \\
& =-b^T \lambda-v-e^{-v-1} \sum_{i=1}^n e^{-a_i^T \lambda}, \mathrm{a}_i: \text { the } \mathrm{i}^{\text {th }} \text { column of } \mathrm{A}
\end{align*}$
To maximize $g(\lambda,v)$, we set $v=\log \sum_{i=1}^n e^{-a_i^T \lambda}-1$
Dual problem:
$\begin{align*}
\max -b^T \lambda-\log\left(\sum\limits_{i=1}^ne^{-a_i^T \lambda}\right)\quad\text{s.t. }\lambda \succcurlyeq0
\end{align*}$
### 2.5 Example: Min volume covering ellipsoid
#lemma Given the ellipsoid $E_X=\set{z|z^TXz\leq1},\; X\in \mathbb{S}_{++}^{n}, z\in \mathbb{R}^n$, its volume is proportional to $(\det X^{-1})^{1/2}$
*Proof*. Let $X=U\Sigma U^T$ where $UU^T=I$ and $\Sigma=\operatorname{diag}(\lambda_1,\cdots,\lambda_n)$. Changing variables $y=U^Tz$:
$\begin{align*}
E_X&= \set{z|(z^TU) \Sigma (U^Tz)\leq1}=\set{y\,\big|\,\textstyle\sum_i\lambda_iy_i^2\leq1}, \quad\text{an ellipsoid with semi-axes } \lambda_i^{-1/2}\\
V_{E}&\propto \prod_i \lambda_i^{-1/2}=(\det X^{-1})^{1/2}\quad\text{Q.E.D.}
\end{align*}$
Primal problem:
$\begin{align*}
& \min f_0(X)=\log \operatorname{det} X^{-1}, \quad X \in \mathbb{S}_{++}^n \\
& \text { s.t. } a_i^T X a_i \leq 1, \; i=1, \ldots, m
\end{align*}$
Lagrangian:
$\begin{align*}
L(X, \lambda)=\log \operatorname{det} X^{-1}+\sum_{i=1}^m \lambda_i a_i^T X a_i-1^T \lambda, \quad \lambda \in R_{+}^m
\end{align*}$
Dual function:
$\begin{align*}
g(\lambda)=\min _X L(X, \lambda), \quad \lambda \in R_{+}^m
\end{align*}$
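Setting $\nabla_X L=-X^{-1}+\sum_i\lambda_ia_ia_i^T=0$ gives $X^{-1}=\sum_i\lambda_ia_ia_i^T$; substituting back, $\sum_i\lambda_ia_i^TXa_i=\operatorname{tr}\left(X\sum_i\lambda_ia_ia_i^T\right)=\operatorname{tr}(I)=n$, which is where the $+n$ in the dual below comes from.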
Dual problem:
$\begin{align*}
& \max \log \operatorname{det}\left(\sum_{i=1}^m \lambda_i a_i a_i^T\right)-1^T \lambda+n \\
& \text { s.t. } \sum_{i=1}^m \lambda_i a_i a_i^T \succ 0, \quad \lambda \in \mathbb{R}_{+}^m
\end{align*}$
## 3 Interpretation
### 3.1 Saddle-point Interpretation (game theory)
![[Pasted image 20230216124541.png|300]]
#def The **saddle point** $(\tilde{w},\tilde{z})$ satisfies $\max_zf(\tilde{w},z)=f(\tilde{w},\tilde{z}),\;\min_wf(w,\tilde{z})=f(\tilde{w},\tilde{z})$.
#lemma $\max _{z \in Z} \min _{w \in W} f(w, z) \leq \min _{w \in W} \max _{z \in Z} f(w, z)$
*Proof*. Given an arbitrary pair $(\widetilde{w}, \tilde{z}) \in W\times Z$:
$\begin{aligned}
& \min _{w \in W} f(w, \tilde{z}) \leq f(\widetilde{w}, \tilde{z}) \leq \max _{z \in Z} f(\widetilde{w}, z) \quad \forall (\widetilde{w}, \tilde{z}) \in W\times Z \\
& \min _{w \in W} f(w, \tilde{z}) \leq \max _{z \in Z} f(\widetilde{w}, z)\\
&\text{The left side does not depend on } \widetilde{w} \text{ and the right side does not depend on } \tilde{z}, \text{ so}\\
&\max _{z \in Z} \min _{w \in W} f(w, z) \leq \min _{w \in W} \max _{z \in Z} f(w, z) \quad\text{Q.E.D.}
\end{aligned}$
#theorem $\min _{w \in W} \max _{z \in Z} f(w, z)=\max _{z \in Z} \min _{w \in W} f(w, z)$ iff a saddle point of $f$ exists
*Proof*. Let:
$\begin{align*}
\tilde{w}=\arg\min_w\max_zf(w,z) &\leftrightarrow \min_w\max_zf(w,z)=\max_zf(\tilde{w},z) \\
\tilde{z}=\arg\max_z\min_wf(w,z) &\leftrightarrow \max_z\min_wf(w,z)=\min_wf(w,\tilde{z})
\end{align*}$
First we prove the forward proposition. Given $\max _{z \in Z} \min _{w \in W} f(w, z) = \min _{w \in W} \max _{z \in Z} f(w, z)$:
$\begin{align*}
f(\tilde{w},\tilde{z})&\leq\max_zf(\tilde{w},z)=\min_wf(w,\tilde{z})\leq f(\tilde{w},\tilde{z})\\
\Longrightarrow \quad& \text{all inequalities hold with equality, so } (\tilde{w},\tilde{z})\text{ is a saddle point}
\end{align*}$
Then we prove the converse proposition. Given saddle point $(\tilde{w},\tilde{z})$:
$\begin{align*}
\max_z\min_w f(w,z)&\geq\min_wf(w,\tilde{z})=f(\tilde{w},\tilde{z})\\
\min_w\max_zf(w,z)&\leq\max_zf(\tilde{w},z)=f(\tilde{w},\tilde{z})\\
\Longrightarrow \quad \max_z\min_wf(w,z)&= \min_w\max_zf(w,z) \quad\text{(combined with the previous lemma)} \quad\text{Q.E.D.}
\end{align*}$
**Example** (matrix games): compare the dual (max-min) and primal (min-max) values:
$\begin{align*}
\max _{z \in Z} \min _{w \in W} f(w, z) \leq \min _{w \in W} \max _{z \in Z} f(w, z)
\end{align*}$
Sub-example 1:
$\begin{align*}
f(w,z)=A_{wz}, \quad A=\left[\begin{array}{ccc}
1 & -1 & 3 \\
2 & 2 & -1 \\
3 & 1 & -2
\end{array}\right]
\end{align*}$
where the row index $w$ ranges over $w_1, w_2, w_3$ and the column index $z$ over $z_1, z_2, z_3$.
- $\max _{z \in Z} \min _{w \in W} f(w, z)=\boxed{1}$ $=\max \begin{cases} \min_w f(w,1)=1\\\min_w f(w,2)=-1\\\min_w f(w,3)=-2 \end{cases}$
- $\min _{w \in W} \max _{z \in Z} f(w, z)=\boxed{2}=\min \begin{cases} \max_z f(1,z)=3\\\max_z f(2,z)=2 \\\max_z f(3,z)=3\end{cases}$
- $\max _{z \in Z} \min _{w \in W} f(w, z)\neq \min _{w \in W} \max _{z \in Z}f(w, z)$
Sub-example 2:
$\begin{align*}
f(w,z)=A_{wz}, \quad A=\left[\begin{array}{ccc}
4 & 6 & 3 \\
2 & 3 & 1 \\
3 & 5 & 2
\end{array}\right]
\end{align*}$
with rows indexed by $w$ and columns by $z$ as before.
- $\max _{z} \min _{w} f(w, z)=\boxed{3}=\min_w\max_zf(w,z)$. ($A_{22}=3$ is the maximum of its row and the minimum of its column, so $(w^*, z^*)=(2,2)$ is a saddle point)
**Example**: *Single-bid game*:
$\begin{align*}
\min_w\max_z f(w,z)=w^TAz, \quad\text{s.t. }1^Tw=1, 1^Tz=1;\; w_i\in\set{0,1}, z_i\in\set{0,1}
\end{align*}$
Note that this problem is nonconvex: the bilinear objective $w^TAz$ is nonconvex in $(w,z)$, and the Boolean constraints $w_i,z_i\in\set{0,1}$ describe nonconvex sets.
#note $f(w,z)=w^TAz=\sum\limits_{i,j}a_{ij}w_iz_j$
Sub-example 1:
$\begin{align*}
A=\begin{bmatrix}1&2&3\\3&1&2\\2&3&1\end{bmatrix},\;w=\begin{bmatrix}1\\0\\0\end{bmatrix},\;z=\begin{bmatrix}0\\0\\1\end{bmatrix} \rightarrow f(w,z)=A_{13}=3
\end{align*}$
#question Which of $w, z$ represents the player, and which the dealer?
Sub-example 2: Relax the constraint to $w,z \succcurlyeq 0$ (keeping $1^Tw=1^Tz=1$) instead of $w_i,z_i\in\set{0,1}$
(The relaxation turns the *single-bid game* into a *multiple-bid game*, i.e., mixed strategies)
$\begin{align*}
A=\begin{bmatrix}1&2&3\\3&1&2\\2&3&1\end{bmatrix}, \;w=\begin{bmatrix}1/3\\1/3\\1/3\\\end{bmatrix},\;\max_zw^TAz\bigg\vert_{w=\frac{1}{3}[1,1,1]^T}=2
\end{align*}$
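The relaxed (mixed-strategy) game value can be computed as an LP: $\min_{w,t} t$ s.t. $A^Tw\preccurlyeq t\mathbf{1}$, $\mathbf{1}^Tw=1$, $w\succcurlyeq0$. A sketch with `scipy.optimize.linprog` (data from the sub-example):

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1., 2., 3.],
              [3., 1., 2.],
              [2., 3., 1.]])
m, n = A.shape

# Variables (w_1..w_m, t): min t  s.t.  A^T w <= t 1, sum(w) = 1, w >= 0.
c = np.r_[np.zeros(m), 1.0]
A_ub = np.c_[A.T, -np.ones(n)]           # A^T w - t 1 <= 0
A_eq = np.r_[np.ones(m), 0.0][None, :]   # sum(w) = 1
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * m + [(None, None)])
print(res.x[:m], res.fun)   # uniform strategy [1/3 1/3 1/3], value 2
```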
![[Pasted image 20230221124943.png|500]]
### 3.2 Geometric Interpretation (conjugate function)
![[4_duality 2023-02-16 13.25.15.excalidraw|700]]
$\begin{align*}
\min f_0(x), \quad\text{s.t. } f_1(x)\leq0
\end{align*}$
Let $t = f_0(x)$, $u=f_1(x)$, and $G=\set{(f_1(x),f_0(x))=(u,t)\mid x\in D}$. The Lagrangian and the dual function are
$\begin{align*}
L(x,\lambda)&= f_0(x)+\lambda f_1(x)\\
g(\lambda)&= \min_xL(x,\lambda)=\min_{(u,t)\in G}\,t+\lambda u\\
&= -\max_{(u,t)\in G}\,\big((-\lambda)u-t\big)
\end{align*}$
We can observe that, up to sign, $g$ is the conjugate of the lower boundary $t(u)$ of $G$: $-g(\lambda)=t^*(-\lambda)$.
#note If the problem is non-convex, there can be a ***duality gap***
### 3.3 Slater's Condition
#def ***Slater's condition***: if the primal problem is convex and there exists a strictly feasible point, i.e.
$\begin{align*}
\exists x\in\text{relint}\,D:\quad f_i(x)<0,\;i=1,\cdots,m,\quad h_i(x)=0,\;i=1,\cdots,p
\end{align*}$
then **strong duality** holds.
#def ***Relative interior***:
![[4_duality 2023-02-21 13.00.27.excalidraw]]
$\begin{align*}
\text{relint }C&= \set{x\in C|\;B(x,r)\cap \text{aff } C\subseteq C \text{ for some } r>0}\\
B(x,r) &= \set{y|\;\Vert y-x \Vert \leq r}, \quad\text{where }\Vert \bullet \Vert \text{ is any norm}
\end{align*}$
#lemma Let $d^*$ denote the optimal dual value, and let $\text{gap}=p^*-d^*\geq0$
- If $\text{gap}>0$, only weak duality holds
- If $\text{gap}=0$, strong duality holds
#note Slater's condition is sufficient, but not necessary
### 3.4 Shadow-Price Interpretation (diet problem)
**Primal problem**:
$\begin{align*}
&\min c^Tx, \quad\text{s.t. }Ax \succcurlyeq b, x \succcurlyeq0\\
\Longleftrightarrow &\min c^Tx, \quad\text{s.t. } -Ax+b \preccurlyeq0, -x \preccurlyeq0
\end{align*}$
- $b_i$ is the required amount of nutrient $i$ (e.g. protein, vitamins, ...)
- $A_{i:}$ (i.e. row $i$ of $A$) lists how much of nutrient $i$ each ingredient contains, so $(Ax)_i\geq b_i$ enforces the requirement; $x_j$ is the amount of ingredient $j$ and $c_j$ its price
**Dual problem**:
$\begin{align*}
\max\lambda^Tb, \quad\text{s.t. }A^T \lambda \preccurlyeq c, \lambda \succcurlyeq0
\end{align*}$
**Lagrangian**:
$\begin{align*}
L(x, \lambda) & =c^T x+\lambda_1^T(-A x+b)+\lambda_2^T(-x) \\
& =\left[c^T+\lambda_1^T(-A)-\lambda_2^T\right] x+\lambda_1^T b \\
c^T&=\lambda_1^TA+\lambda_2^T \;\text{(for bounded } g\text{)}, \text { hence } A^T \lambda_1 \preccurlyeq c \text{ since } \lambda_2 \succcurlyeq 0
\end{align*}$
#question Chef (buys real ingredients at prices $c$) vs. pharma company (sells synthetic nutrients at shadow prices $\lambda$)?
- $\lambda_{1i}$ is the shadow price of nutrient $i$
- $A_{:j}$ (i.e. column $j$ of $A$, equivalently row $j$ of $A^T$) is the nutrient composition of ingredient $j$, so $(A^T\lambda_1)_j$ is the synthetic cost of replicating ingredient $j$
- $c_j$ is the real price of ingredient $j$ (if acquired as real food); the dual constraint $A^T\lambda_1 \preccurlyeq c$ says the synthetic alternative must stay competitive
![[Pasted image 20230221133141.png]]
## 4 KKT Conditions
#def ***KKT*** (Karush-Kuhn-Tucker) conditions (suppose constraints are differentiable):
1. Primal constraints:
$\begin{align*}
f_i(x) &\leq0, \;i=1,\cdots,m\\
h_i(x) &=0, \;i=1,\cdots,p
\end{align*}$
2. Dual constraints: $\lambda \succcurlyeq0$
3. Complementary slackness: $\lambda_i f_i(x)=0, \;i=1,\cdots,m$
4. The gradient of the Lagrangian w.r.t. $x$ vanishes:
$\begin{align*}
\nabla_xf_0(x)+\sum\limits_{i=1}^m \lambda_i \nabla_xf_i(x)+\sum\limits_{i=1}^pv_i\nabla_xh_i(x) =0
\end{align*}$
#lemma If strong duality holds, KKT conditions are **necessary** for optimality
#lemma If the problem is convex, KKT conditions are **sufficient** for optimality
*Proof*. Suppose $(\tilde{x}, \tilde{\lambda}, \tilde{v})$ satisfies the KKT conditions. Since the problem is convex and $\nabla_xL(\tilde{x},\tilde{\lambda},\tilde{v})=0$, $\tilde{x}$ minimizes $L(x,\tilde{\lambda},\tilde{v})$ over $x$:
$\begin{align*}
g(\tilde{\lambda},\tilde{v})&= \min_xL(x,\tilde{\lambda}, \tilde{v })=L(\tilde{x},\tilde{\lambda}, \tilde{v})\\
&= f_0(\tilde{x})+\sum\limits_{i=1}^m\tilde{\lambda}_if_i(\tilde{x})+\sum\limits_{i=1}^p\tilde{v}_ih_i(\tilde{x})\\
&= f_0(\tilde{x}) \quad\text{(complementary slackness and } h_i(\tilde{x})=0)
\end{align*}$
so the duality gap is zero and $\tilde{x}$ is optimal. Q.E.D.
#lemma Convex problem + Slater's condition: KKT conditions are necessary and sufficient
#lemma The Hessian of the Lagrangian is PSD if the problem is convex and $\lambda \succcurlyeq0$ (the $h_i$ are affine, so their Hessians vanish):
$\begin{align*}
\nabla_x^2L(x,\lambda,v)=\nabla_x^2f_0(x)+\sum\limits_i\lambda_i \nabla_x^2f_i(x) \succcurlyeq0
\end{align*}$
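A numeric check of the KKT conditions on the equality-constrained QP of §2.2.1 (numpy, same hypothetical data as earlier; conditions 2 and 3 are vacuous since there are no inequality constraints):

```python
import numpy as np

# min x^T x  s.t.  A x = b  (from section 2.2.1)
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
b = np.array([1.0, 2.0])

x = A.T @ np.linalg.solve(A @ A.T, b)    # candidate primal optimum
v = -2.0 * np.linalg.solve(A @ A.T, b)   # candidate dual optimum

print(np.allclose(A @ x, b))             # 1. primal feasibility
print(np.allclose(2 * x + A.T @ v, 0))   # 4. stationarity of the Lagrangian
```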
## 5 Sensitivity (shadow-price & perturbation)
![[Pasted image 20230223123323.png|500]]
The original (unperturbed) problem, $u_i=w_i=0$: primal, Lagrangian, dual:
$\begin{align*}
\min f_0(x), &\quad\text{s.t. }f_i\leq0,\;h_i=0\\
L(x,\lambda,v)&= f_0(x)+\sum\limits_i \lambda_if_i+\sum\limits_i v_ih_i\\
g(\lambda, v) &= \min_x L(x,\lambda, v)\\
\max g(\lambda,v) &\quad\text{s.t. } \lambda\geq0
\end{align*}$
The perturbed problem's Lagrangian and dual: $\tilde{L}=L(x,\lambda,v)-\sum\limits_i \lambda_iu_i-\sum\limits_iv_iw_i$
$\begin{align*}
\min f_0(x), &\quad\text{s.t. }f_i\leq u_i,\; h_i=w_i\\
\tilde{g }(\lambda,v)&=g(\lambda, v)-u^T \lambda-w^Tv, \quad\text{s.t. }\lambda\geq0\\
p^*(u,w) &\geq \tilde{g}(\lambda^*,v^*)= g(\lambda^*, v^*)-u^T \lambda ^*-w^Tv^* \quad\text{(weak duality)}\\
&= p^*(0,0)-u^T\lambda^*-w^Tv^* \quad\text{(strong duality at } (u,w)=(0,0))\\
\text{If } p^* \text{ is differentiable: } \lambda_i^*&= -\frac{\partial p^*(u, w)}{\partial u_i}\bigg\vert_{(u,w)=(0,0)}, \quad v_i^*=-\frac{\partial p^*(u, w)}{\partial w_i}\bigg\vert_{(u,w)=(0,0)}
\end{align*}$
![[Pasted image 20230223124959.png|500]]
![[Pasted image 20230223125013.png|500]]
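A finite-difference check of the sensitivity result on the small LP from §2.1.1 (a sketch, assuming `scipy`; perturbing $b\to b+u$ corresponds to $f_i\leq u_i$ here):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[-1.0, -1.0], [1.0, -2.0]])
b = np.array([-1.0, 2.0])

def p_star(b):  # primal optimal value as a function of b
    return linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2).fun

lam = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(0, None)] * 2).x  # dual optimum
eps = 1e-6
grad = np.array([(p_star(b + eps * e) - p_star(b)) / eps for e in np.eye(2)])
print(grad, -lam)   # dp*/du_i  ~  -lambda_i*
```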
## 6 Generalized Inequalities (GI)
Introduction
![[Pasted image 20230228125610.png|400]]
- $f=[f_1(x),\cdots, f_m(x)]^T, \lambda=[\lambda_1, \cdots, \lambda_m]^T$
- $\lambda^Tf$ penalizes $f$ whenever it falls outside the $-\mathbb{R}_{+}^m$ cone, i.e.
$\begin{align*}
f \preccurlyeq_{R_+^m}0 &\Longleftrightarrow-f \succcurlyeq_{R_+^m}0\\
\lambda \succcurlyeq_{\mathbb{R}_{+}^{m}}0 & \quad\text{(} \mathbb{R}_{+}^{m}=(\mathbb{R}_{+}^{m})^*\text{: the nonnegative orthant is self-dual)}\\
\Rightarrow\;\lambda^Tf&= \sum\limits_i \lambda_if_i \leq0
\end{align*}$
- If $f\not\preccurlyeq_{\mathbb{R}_{+}^{m}}0$ , then $\exists \lambda\in \mathbb{R}_{+}^{m}, \quad\text{s.t. }\lambda^Tf>0$
#def $A \succcurlyeq_{K_i}B \Longleftrightarrow A-B\in K_i,\quad A \succ_{K_i}B \Longleftrightarrow A-B\in \operatorname{int}K_i$
#def $A\preccurlyeq_{K}B \Longleftrightarrow B-A\in K$
Primal:
$\begin{align*}
\min f_0(x), \quad\text{s.t. }f_i(x) \preccurlyeq_{K_i}0,\; i=1,\cdots, m,\quad h_i(x)=0,\; i=1,\cdots,p
\end{align*}$
Lagrangian:
$\begin{align*}
\mathcal{L}(x,\lambda,v)&= f_0(x)+\sum_{i=1}^m \lambda_i^T f_i(x)+\sum_{i=1}^p v_i h_i(x), \\
&\quad\text{where } \lambda_i \succcurlyeq_{K_i^*}0, \;i=1,\cdots,m,\; \lambda_i\in \mathbb{R}_{}^{k_i}
\end{align*}$
### 6.1 SOCP with GI constraints
Primal
$\begin{align*}
\min f^Tx, \quad\text{s.t. }&\left\|A_i x+b_i\right\|_2 \leq c_i^T x+d_i, \; i=1, \ldots, m\\
\quad\text{i.e. } &\left(A_i x+b_i,\; c_i^Tx+d_i\right) \in K_i, \; i=1, \ldots, m
\end{align*}$
Lagrangian:
$\begin{align*}
L(x, z, w)= & f^T x-\sum_i\left(z_i^T\left(A_i x+b_i\right)+w_i\left(c_i^T x+d_i\right)\right) \\
& \left(z_i, w_i\right) \in K_i^*, i=1, \ldots, m
\end{align*}$
Dual:
$\begin{align*}
\max -\sum_i\left(b_i^T z_i+d_i w_i\right), \quad\text{s.t. }&\Vert z_i \Vert_2\leq w_i,\;i=1,\cdots,m, \\
&\sum_i \left(A_i^T z_i+c_i w_i\right)=f
\end{align*}$
Why the multipliers live in the dual cone: take $K$ a $p$-norm cone, whose dual is the $q$-norm cone with $\frac{1}{p}+\frac{1}{q}=1$:
$\begin{align*}
&\begin{cases}
f \preccurlyeq_K0, &K=\left\{ \begin{bmatrix}u\\t\end{bmatrix} \bigg|\; \Vert u \Vert_p \leq t\right\}\\
\lambda \succcurlyeq_{K^*}0, &K^*=\left\{ \begin{bmatrix}v\\s\end{bmatrix} \bigg|\; \Vert v \Vert_q \leq s\right\}
\end{cases}\\
&\text{Writing } -f=\begin{bmatrix}u\\t\end{bmatrix}\in K \text{ and } \lambda=\begin{bmatrix}v\\s\end{bmatrix}\in K^*: \quad \lambda^T(-f)=v^Tu+st\geq0 \Longleftrightarrow \lambda^Tf\leq0\\\\
&\begin{array}{ll}
\text{1. If } f\prec_K0 & \text{then } \lambda^Tf<0 \;\;\forall \lambda\in K^*, \lambda\neq0\\
\text{2. If } \lambda\succ_{K^*}0 &\text{then } \lambda^Tf<0 \;\;\forall f \preccurlyeq_K 0, f\neq0\\
\text{3. If }f\not\preccurlyeq_{K}0 &\text{then } \exists \lambda\in K^* \text{ s.t. }\lambda^Tf>0
\end{array}
\end{align*}$
### 6.2 SDP with GI constraints
Primal
$\begin{align*}
\min c^Tx, \quad\text{s.t. }x_1 F_1+\cdots+x_n F_n+G \preccurlyeq 0, \quad\text { where } F_1, \ldots, F_n, G \in \mathbb{S}^k
\end{align*}$
Lagrangian:
$\begin{align*}
L(x,Z)=c^T x+\operatorname{tr}\left(\left(x_1 F_1+\cdots+x_n F_n+G\right) Z\right), \quad Z\in \mathbb{S}_+^k
\end{align*}$
Dual:
$\begin{align*}
g(Z)&= \inf_x L(x,Z)=\begin{cases}
\operatorname{tr}(GZ) & \text{if } \operatorname{tr}(F_iZ)+c_i=0,\;i=1,\cdots,n\\
-\infty & \text{otherwise}
\end{cases}\\
&\Rightarrow\; \max_Z \operatorname{tr}(GZ), \quad\text{s.t. } \operatorname{tr}(F_iZ)+c_i= 0,\;i=1,\cdots,n,\quad Z \succcurlyeq0
\end{align*}$
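A hedged sketch in cvxpy (assuming an SDP-capable solver such as SCS is installed; the data $F_i, G, c$ are made up): solve both the primal and the dual SDP and compare values.

```python
import numpy as np
import cvxpy as cp

# Hypothetical data: the LMI encodes the unit disk, so p* = -sqrt(2).
F1 = np.array([[1.0, 0.0], [0.0, -1.0]])
F2 = np.array([[0.0, 1.0], [1.0, 0.0]])
G = -np.eye(2)
c = np.array([1.0, 1.0])

# Primal: min c^T x  s.t.  x1 F1 + x2 F2 + G <= 0 (PSD order).
x = cp.Variable(2)
primal = cp.Problem(cp.Minimize(c @ x), [x[0] * F1 + x[1] * F2 + G << 0])

# Dual: max tr(G Z)  s.t.  tr(F_i Z) + c_i = 0, Z PSD.
Z = cp.Variable((2, 2), PSD=True)
dual = cp.Problem(cp.Maximize(cp.trace(G @ Z)),
                  [cp.trace(F1 @ Z) + c[0] == 0,
                   cp.trace(F2 @ Z) + c[1] == 0])

print(primal.solve(), dual.solve())   # both about -1.414 (strong duality)
```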
![[Pasted image 20230228133508.png|300]]
## 7 Summary
1. Duality provides a lower bound on the optimal value, even when the primal isn't convex
2. The KKT conditions convert a constrained minimization problem into a system of equations and inequalities
3. Lagrange multipliers measure the sensitivity of the optimal value to perturbations of the constraints
4. Generalized inequalities broaden the scope of convex optimization (e.g., SOCP and SDP)
[slides](https://cseweb.ucsd.edu/classes/wi23/cse203B-a/slides/lec5_duality.pdf)