- [Limiting Subgradient](../Non-Smooth%20Calculus/Limiting%20Subgradient.md) - [Cantor's Intersection Theorem](../../MATH%20601%20Functional%20Analysis,%20Measure%20Theory/Cantor's%20Intersection%20Theorem.md) --- ### **Intro** The content of the file is taken from chapter 5 of Boris Morducovhich book: Convex analysis and beyond. --- ### **Ekeland Variational Principle** Suppose $X$ is a complete metric space with metric $d$. The assumption of a complete metric means that we will be looking for the limit of some sequences. The lemma follows prescribes an algorithm for constructing a convergence sequence of iterates given any suboptimal point initially. #### **Lemma 5.1 | Ekeland Variational Principle Preparations** > Let $f: X \rightarrow \overline \R$ be an l.s.c function that is bounded below. > For any $\epsilon > 0$, select $x_0$ to be such that > $ > \begin{aligned} > f(x_0) < \inf_{x \in X} f(x) + \epsilon. > \end{aligned} > $ > For any $\lambda > 0$, define $F: X \rightarrow 2^X$ as: > $ > \begin{aligned} > F(x):= > \left\lbrace > y\in X \left| \; > f(y) + \frac{\epsilon}{\lambda} d(x, y) \le f(x) > \right. > \right\rbrace. > \end{aligned} > $ > and construct the suboptimal sequence $(x_k)_{k \ge 0}$ iteratively using the relations: > $ > \begin{aligned} > x_1 & \in \text{dom}\; f, > \\ > x_{k + 1} &\in F(x_k): f(x_{k + 1}) \le \left( > \inf_{x \in F(x_k)} f(x) > \right) + \frac{1}{k} \quad (k\in \N). > \end{aligned} > $ > Then there exists $z \in X$ such that $x_k \rightarrow z$ and $\{z\} = \bigcap_{k\in\N} F(x_k) \subseteq F(x_0)$ hence > $ > \begin{aligned} > f(z) + \frac{\epsilon}{\lambda}d(x_0, z) \le f(x_0) < \inf_{x \in X}f(x) + \epsilon > \end{aligned} > $ **Observations** The definition of the sequence $(x_k)_{k \ge0}$ looks suspiciously similar to the proximal point method. $F(x)$ is all points close to $x$ and lower than $f(x)$ in $X$. **Proof** Observe that: 1. By definition of $x_1\in \text{dom}\; f$, $F(x_1) \neq \emptyset$. 2. If $x_{k + 1} \in F(x_k)$, then $F(x_k)\neq \emptyset$. 3. $F(x_{k + 1})\neq \emptyset$ because $x_{k + 1}$ is in $F(x_{k + 1})$ always. Proceeding stepwise, we have the following key steps for our proof. 1. **(Step I)**: $F(x_k)$ is closed for all $x_k$. 2. **(Step II)**: $\forall k \in \N: F(x_{k + 1})\subseteq F(x_k)$. 3. **(Step III)**: Verify $\{z\} = \bigcap_{k \in \N}F(x_k)$ is a singleton using Cantor's intersection theorem. **(Step I)**. We show that $F(x_k)$ is closed for all $x_k$. For all $x \in X$, let $\bar x \in F(x)$, take any sequence $F(x_k) \ni z_n \rightarrow \bar x$, by completeness of the metric and lower semi-continuity of $f$, it has: $ \begin{aligned} f(\bar x) + \frac{\epsilon}{\lambda}d(\bar x, x_k) \le \left( \liminf_{n\rightarrow \infty} f(z_n) + \frac{\epsilon}{\lambda}d(z_n, x_k) \right) \le f(x_k). \end{aligned} $ Therefore, $\bar x \in F(x)$. **(Step II)**. We show that for all $k \in \N: F(x_{k + 1})\subseteq F(x_k)$. Pick any $y \in F(x_{k + 1})$ then $ \begin{aligned} f(y) + \frac{\epsilon}{\lambda} d(x_{k + 1}, y) \le f(x_{k + 1}). \end{aligned} $ By $x_{k + 1} \in F(x_k)$, it has $ \begin{aligned} f(x_k) &\ge f(x_{k + 1}) + \frac{\epsilon}{\lambda}d(x_{k}, x_{k + 1}) & \\ &\ge \left( f(y) + \frac{\epsilon}{\lambda}d(x_{k + 1}, y) \right) + \frac{\epsilon}{\lambda}d(x_{k}, x_{k + 1}) & y \in F(x_{k + 1}) \\ &\ge f(y) + \frac{\epsilon}{\lambda} d(x_{k}, y). & \text{triangle inequality. } \end{aligned} $ Therefore, $y \in F(x_{k})$ as well. **(Step III)**. We can use Cantor's intersection theorem, it has $\text{diam}\; F(x_k) \rightarrow 0$. because we checked that $F(x_k) \neq \emptyset$ for all $n \in \N$ from **(Step I)**. It remains to check that $\text{diam}(F_n)\rightarrow 0$. To show, take any $y \in F(x_{k + 1})$, then by definition it has $ \begin{aligned} \frac{\epsilon}{\lambda}d(y, x_{k + 1}) &\le f(x_{k + 1}) - f(y) \\ &\le \left( \inf_{x \in F(x_k)} f(x) + \frac{1}{k} \right) - f(y) & x_{k + 1} \in F(x_k) \\ &\le \left( \inf_{x \in F(x_{k + 1})} f(x) + \frac{1}{k} \right) - f(y) & \text{by }F(x_{k + 1}) \subseteq F(x_k) \\ &\le 1/k. \end{aligned} $ Therefore, the last line yields the result $d(y, x_{k + 1})\le \frac{\lambda}{\kappa\epsilon}$ for all $k \in \N$. Using triangle inequality, we have for all $u \in F(x_{k + 1})$ that $d(y, x_{k + 1}) \le d(y, u) + d(y, x_{k + 1})\le 2\frac{\lambda}{\kappa\epsilon}$. Therefore, the set $F(x_k)$ indeed converges to a singleton. **Remarks** No assumptions on convexity, this one only requires a complete metric space for limit of sequences, lsc to ensure the closure of the set$F(x_k)$. Nothing is said about the convergence of $f(x)$ yet. The convergence is possible even if the set of minimizer is non-unique. Of course, $F(x)$ is a subset of the level set of $f$ at height $f(x)$. Observe that the bounded below assumption is crucial for the description of choosing the sequence $(x_k)_{k \in \N}$. #### **Theorem 5.2 | Ekeland Variational Principle** > Let $(X, d)$ be a complete metric space. > Let $f: X \rightarrow \overline \R$ be an l.s.c function and bounded below. > For all $\epsilon > 0$ and $x_0 \in \text{dom}\; f$ such that $f(x_0) < \inf_{x \in X}f(x) + \epsilon$, it has for all $\lambda > 0$, there exists $z \in \text{dom}\; f$ such that: > 1. $f(z) + (\epsilon/\lambda) d(z, x_0) \le f(x_0)$, so $z$ is $\epsilon/\lambda$ suboptimal. > 2. $d(z, x_0) \le \lambda$, $z$ is $\lambda$ away from $x_0$. > 3. $f(z) < f(x) + (\epsilon/\lambda)d(x, z)$ for all $x \in X \setminus \{z\}$. **Proof** --- ### **Let's Follow Rock-Wets instead** Ekeland variational principle states something really weird. It seems quite important and See coverage in [Wikipedia](https://en.wikipedia.org/wiki/Ekeland%27s_variational_principle). On the internet, it says that this allows compactness to be used for function that is doesn't necessary has bounded level set. #### **Theorem | Ekeland Variational Principles** > Let $f : \R^n \rightarrow \overline \R$ be lsc and $\inf_x f(x) = f^+ > \infty$. > Let $\epsilon > 0$, suppose that $\bar x$ has $f(\bar x) \le f^+ + \epsilon$. > Then for any $\delta$, there exists a point $\tilde x \in \mathbb B(\bar x, \epsilon/\delta)$ such that $f(\tilde x) \le f(\bar x)$ and $(\forall x \in \R^n):\; f(x) + \delta\Vert x - \tilde x\Vert > f(\tilde x)$. **Proof** Let's consider the function $f_1 = f + \delta \Vert \cdot - \bar x\Vert$. Define $C := \argmin{x}\; f_1(x)$ then $C\neq \emptyset$ is compact because $f_1$ is level bounded: $ \begin{aligned} \{x | f_1(x) \le \alpha\} \subseteq \{x | f^+ + \delta \Vert x - \bar x\Vert \le \alpha\} = \mathbb B (\bar x, (\alpha - f^+)/\delta), \end{aligned} $ Now consider the function $f_C = f + \delta_C$ which is lsc and level bounded hence $\argmin{x} \;f_C(x) \neq\emptyset$. Let's choose $\tilde x \in \argmin{x}\; f_C(x)$. Obviously that for all $x \in C$, $f(\tilde x) \le f(x)$. When $x \neq C$, $f_1(\tilde x) < f_1( x)$ by definition of $C$ hence: $ \begin{aligned} f(\tilde x) \le f(x) + \delta(\Vert x - \bar x\Vert - \Vert \tilde x - \bar x\Vert) &\le f(x) + \delta \Vert x - \tilde x\Vert. \end{aligned} $ Hence, for all $x \neq \tilde x$: $f(\tilde x) < f(x) + \delta\Vert x - \tilde x\Vert$. Therefore, $\tilde x$ also minimizes $f_1$. In addition: $ \begin{aligned} f(\tilde x) < f(\tilde x) + \delta \Vert \tilde x - \bar x\Vert &\le f(\bar x) \le f^+ + \epsilon \implies \delta\Vert \tilde x - \bar x\Vert \le \epsilon \end{aligned} $ Which writes as $f(\tilde x) \le f(\bar x)$ and $\tilde x \in \mathbb (\overline x, \epsilon/\delta)$. **Remarks** It's just saying that for all lsc function and a $\epsilon$ suboptimal point $\bar x$, it's possible to always choose a new $\tilde x$ such that it's strictly below $f(x) + \delta \Vert x - \tilde x\Vert$, not infinitely away from $\tilde x$.