在这一节,我们用MCMC框架来考察几个例子。
一、和共轭凸函数的关系
设MM<script id="MathJax-Element-1" type="math/tex">M</script> 是函数p: \mathbb{R}^n \mapsto [-\infty, \infty]p:Rn?[?∞,∞]<script id="MathJax-Element-2" type="math/tex">p: \mathbb{R}^n \mapsto [-\infty, \infty]</script> 的上境图,易知此时有\begin{align*} w^* = p( \boldsymbol{0} ) \end{align*}
<script id="MathJax-Element-3" type="math/tex; mode=display">\begin{align*} w^* = p( \boldsymbol{0} ) \end{align*}</script> 且\begin{align*} q(\boldsymbol{\mu}) = \inf_{(\boldsymbol{u} ,w) \in epi(p)} \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{ \{ (\boldsymbol{u} ,w) | p(\boldsymbol{u} ) \leq w \} } \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \}, \end{align*}
<script id="MathJax-Element-4" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) = \inf_{(\boldsymbol{u} ,w) \in epi(p)} \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{ \{ (\boldsymbol{u} ,w) | p(\boldsymbol{u} ) \leq w \} } \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \}, \end{align*}</script> 将w<script id="MathJax-Element-5" type="math/tex">w</script> 用p(\boldsymbol{u} )<script id="MathJax-Element-6" type="math/tex">p(\boldsymbol{u} )</script> 替换可得\begin{align*} q(\boldsymbol{\mu}) = \inf_{ \boldsymbol{u} \in \mathbb{R}^n } \{ p(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \} = - \sup_{ \boldsymbol{u} \in \mathbb{R}^n } \{ (-\boldsymbol{\mu})^\top \boldsymbol{u} - p(\boldsymbol{u} ) \} = -p^*(-\boldsymbol{\mu}). \end{align*}
<script id="MathJax-Element-7" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) = \inf_{ \boldsymbol{u} \in \mathbb{R}^n } \{ p(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \} = - \sup_{ \boldsymbol{u} \in \mathbb{R}^n } \{ (-\boldsymbol{\mu})^\top \boldsymbol{u} - p(\boldsymbol{u} ) \} = -p^*(-\boldsymbol{\mu}). \end{align*}</script> 于是\begin{align*} q^* = \sup_{\boldsymbol{\mu} \in \mathbb{R}^n} q(\boldsymbol{\mu}) = \sup_{\boldsymbol{\mu} \in \mathbb{R}^n} \{ 0 \cdot (- \boldsymbol{\mu}) - p^*(-\boldsymbol{\mu}) \} = p^{**}( \boldsymbol{0} ), \end{align*}
<script id="MathJax-Element-8" type="math/tex; mode=display">\begin{align*} q^* = \sup_{\boldsymbol{\mu} \in \mathbb{R}^n} q(\boldsymbol{\mu}) = \sup_{\boldsymbol{\mu} \in \mathbb{R}^n} \{ 0 \cdot (- \boldsymbol{\mu}) - p^*(-\boldsymbol{\mu}) \} = p^{**}( \boldsymbol{0} ), \end{align*}</script> 由此易知,若p = p^{**}<script id="MathJax-Element-9" type="math/tex">p = p^{**}</script> (即p<script id="MathJax-Element-10" type="math/tex">p</script> 是正常闭凸函数),则有w^* = q^*<script id="MathJax-Element-11" type="math/tex">w^* = q^*</script> ,如右图所示。
二、一般的对偶优化
考虑最小化函数f: \mathbb{R}^n \mapsto [-\infty, \infty]<script id="MathJax-Element-12" type="math/tex">f: \mathbb{R}^n \mapsto [-\infty, \infty]</script> ,引入函数F: \mathbb{R}^{n+r} \mapsto [-\infty, \infty]<script id="MathJax-Element-13" type="math/tex">F: \mathbb{R}^{n+r} \mapsto [-\infty, \infty]</script> 使得\begin{align} \label{equ: general optimization duality f} f(\boldsymbol{x}) = F(\boldsymbol{x}, \boldsymbol{0} ), \ \forall \boldsymbol{x} \in \mathbb{R}^n. \end{align}
<script id="MathJax-Element-14" type="math/tex; mode=display">\begin{align} \label{equ: general optimization duality f} f(\boldsymbol{x}) = F(\boldsymbol{x}, \boldsymbol{0} ), \ \forall \boldsymbol{x} \in \mathbb{R}^n. \end{align}</script> 设函数p: \mathbb{R}^r \mapsto [-\infty, \infty]<script id="MathJax-Element-15" type="math/tex">p: \mathbb{R}^r \mapsto [-\infty, \infty]</script> 定义如下:\begin{align} \label{equ: general optimization duality p} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x},\boldsymbol{u}), \end{align}
<script id="MathJax-Element-16" type="math/tex; mode=display">\begin{align} \label{equ: general optimization duality p} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x},\boldsymbol{u}), \end{align}</script> 这里\boldsymbol{u} <script id="MathJax-Element-17" type="math/tex">\boldsymbol{u} </script> 可以看成是一个扰动项,p(\boldsymbol{u} )<script id="MathJax-Element-18" type="math/tex">p(\boldsymbol{u} )</script> 是原函数经过扰动后的最优解,当\boldsymbol{u} = \boldsymbol{0} <script id="MathJax-Element-19" type="math/tex">\boldsymbol{u} = \boldsymbol{0} </script> 时,p( \boldsymbol{0} )<script id="MathJax-Element-20" type="math/tex">p( \boldsymbol{0} )</script> 就是原函数的最优解,因为显然有\begin{align*} p(\boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x}, \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}). \end{align*}
<script id="MathJax-Element-21" type="math/tex; mode=display">\begin{align*} p(\boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x}, \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}). \end{align*}</script> 设M<script id="MathJax-Element-22" type="math/tex">M</script> 是p<script id="MathJax-Element-23" type="math/tex">p</script> 的上境图,易知此时有\begin{align*} w^* = p( \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x}, \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}). \end{align*}
<script id="MathJax-Element-24" type="math/tex; mode=display">\begin{align*} w^* = p( \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x}, \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}). \end{align*}</script> 且\begin{align*}q(\boldsymbol{\mu}) = \inf_{(\boldsymbol{u} ,w) \in epi(p)} \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{ \{ (\boldsymbol{u} ,w) | p(\boldsymbol{u} ) \leq w \} } \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{ \boldsymbol{u} \in \mathbb{R}^r } \{ p(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u}\}, \end{align*}
<script id="MathJax-Element-25" type="math/tex; mode=display">\begin{align*}q(\boldsymbol{\mu}) = \inf_{(\boldsymbol{u} ,w) \in epi(p)} \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{ \{ (\boldsymbol{u} ,w) | p(\boldsymbol{u} ) \leq w \} } \{ w + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{ \boldsymbol{u} \in \mathbb{R}^r } \{ p(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u}\}, \end{align*}</script> 将p(\boldsymbol{u} )<script id="MathJax-Element-26" type="math/tex">p(\boldsymbol{u} )</script> 用F(\boldsymbol{x}, \boldsymbol{u} )<script id="MathJax-Element-27" type="math/tex">F(\boldsymbol{x}, \boldsymbol{u} )</script> 替换可得\begin{align*} q(\boldsymbol{\mu}) = \inf_{ (\boldsymbol{x}, \boldsymbol{u} ) \in \mathbb{R}^{n+r} } \{ F(\boldsymbol{x}, \boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \} = - \sup_{ (\boldsymbol{x}, \boldsymbol{u} ) \in \mathbb{R}^{n+r} } \{ (-\boldsymbol{\mu})^\top \boldsymbol{u} - F(\boldsymbol{x}, \boldsymbol{u} ) \} = - F^*(\boldsymbol{0} , -\boldsymbol{\mu}), \end{align*}
<script id="MathJax-Element-28" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) = \inf_{ (\boldsymbol{x}, \boldsymbol{u} ) \in \mathbb{R}^{n+r} } \{ F(\boldsymbol{x}, \boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \} = - \sup_{ (\boldsymbol{x}, \boldsymbol{u} ) \in \mathbb{R}^{n+r} } \{ (-\boldsymbol{\mu})^\top \boldsymbol{u} - F(\boldsymbol{x}, \boldsymbol{u} ) \} = - F^*(\boldsymbol{0} , -\boldsymbol{\mu}), \end{align*}</script> 于是\begin{align*} q^* = \sup_{\boldsymbol{\mu} \in \mathbb{R}^r} q(\boldsymbol{\mu}) = \sup_{\boldsymbol{\mu} \in \mathbb{R}^r} \{ -F^*(\boldsymbol{0} , -\boldsymbol{\mu}) \} = - \inf_{\boldsymbol{\mu} \in \mathbb{R}^r} F^*(\boldsymbol{0} , -\boldsymbol{\mu}) = - \inf_{\boldsymbol{\mu} \in \mathbb{R}^r} F^*(\boldsymbol{0} , \boldsymbol{\mu}), \end{align*}
<script id="MathJax-Element-29" type="math/tex; mode=display">\begin{align*} q^* = \sup_{\boldsymbol{\mu} \in \mathbb{R}^r} q(\boldsymbol{\mu}) = \sup_{\boldsymbol{\mu} \in \mathbb{R}^r} \{ -F^*(\boldsymbol{0} , -\boldsymbol{\mu}) \} = - \inf_{\boldsymbol{\mu} \in \mathbb{R}^r} F^*(\boldsymbol{0} , -\boldsymbol{\mu}) = - \inf_{\boldsymbol{\mu} \in \mathbb{R}^r} F^*(\boldsymbol{0} , \boldsymbol{\mu}), \end{align*}</script> 由此易知,若想强对偶成立,应有\begin{align*} \inf_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) = - \inf_{\boldsymbol{\mu} \in \mathbb{R}^r} F^*(\boldsymbol{0} , \boldsymbol{\mu}). \end{align*}
<script id="MathJax-Element-30" type="math/tex; mode=display">\begin{align*} \inf_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) = - \inf_{\boldsymbol{\mu} \in \mathbb{R}^r} F^*(\boldsymbol{0} , \boldsymbol{\mu}). \end{align*}</script>
三、含有不等式约束的优化
式(\ref{equ: general optimization duality f}<script id="MathJax-Element-31" type="math/tex">\ref{equ: general optimization duality f}</script> )和式(\ref{equ: general optimization duality p}<script id="MathJax-Element-32" type="math/tex">\ref{equ: general optimization duality p}</script> )中F<script id="MathJax-Element-33" type="math/tex">F</script> 和p<script id="MathJax-Element-34" type="math/tex">p</script> 的不同选择可以得到不同的对偶问题,考虑含有不等式约束的优化问题:\begin{align} \label{equ: optimization with inequality constraints} \begin{split} \min_{\boldsymbol{x}} & \ f(\boldsymbol{x}) \\ \mbox{s.t.} & \ \boldsymbol{x} \in X, \ \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{0}. \end{split} \end{align}
<script id="MathJax-Element-35" type="math/tex; mode=display">\begin{align} \label{equ: optimization with inequality constraints} \begin{split} \min_{\boldsymbol{x}} & \ f(\boldsymbol{x}) \\ \mbox{s.t.} & \ \boldsymbol{x} \in X, \ \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{0}. \end{split} \end{align}</script> 其中X<script id="MathJax-Element-36" type="math/tex">X</script> 是\mathbb{R}^n<script id="MathJax-Element-37" type="math/tex">\mathbb{R}^n</script> 的非空子集,f: X \mapsto \mathbb{R}<script id="MathJax-Element-38" type="math/tex">f: X \mapsto \mathbb{R}</script> ,g_j: X \mapsto \mathbb{R}<script id="MathJax-Element-39" type="math/tex">g_j: X \mapsto \mathbb{R}</script> 是给定函数。引入扰动约束集合\begin{align*}C_{\boldsymbol{u} } = \{ \boldsymbol{x} \in X \ | \ \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} \}, \ \boldsymbol{u} \in \mathbb{R}^r. \end{align*}
<script id="MathJax-Element-40" type="math/tex; mode=display">\begin{align*}C_{\boldsymbol{u} } = \{ \boldsymbol{x} \in X \ | \ \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} \}, \ \boldsymbol{u} \in \mathbb{R}^r. \end{align*}</script> 和函数\begin{align*} F(\boldsymbol{x}, \boldsymbol{u} ) = \begin{cases} f(\boldsymbol{x}) & \boldsymbol{x} \in C_{\boldsymbol{u} }, \\ \infty & \boldsymbol{x} \not \in C_{\boldsymbol{u} }. \end{cases} \end{align*}
<script id="MathJax-Element-41" type="math/tex; mode=display">\begin{align*} F(\boldsymbol{x}, \boldsymbol{u} ) = \begin{cases} f(\boldsymbol{x}) & \boldsymbol{x} \in C_{\boldsymbol{u} }, \\ \infty & \boldsymbol{x} \not \in C_{\boldsymbol{u} }. \end{cases} \end{align*}</script> 显然对于\forall \boldsymbol{x} \in C_{ \boldsymbol{0} }<script id="MathJax-Element-42" type="math/tex">\forall \boldsymbol{x} \in C_{ \boldsymbol{0} }</script> 有F(\boldsymbol{x}, \boldsymbol{0} ) = f(\boldsymbol{x})<script id="MathJax-Element-43" type="math/tex">F(\boldsymbol{x}, \boldsymbol{0} ) = f(\boldsymbol{x})</script> 且\begin{align*} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x}, \boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } f(\boldsymbol{x}), \end{align*}
<script id="MathJax-Element-44" type="math/tex; mode=display">\begin{align*} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F(\boldsymbol{x}, \boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } f(\boldsymbol{x}), \end{align*}</script> 我们称之为原始函数,易知\begin{align*} w^* = p(\boldsymbol{0} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{0} } f(\boldsymbol{x}). \end{align*}
<script id="MathJax-Element-45" type="math/tex; mode=display">\begin{align*} w^* = p(\boldsymbol{0} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{0} } f(\boldsymbol{x}). \end{align*}</script>
另一方面,\begin{align} \label{equ: Lagrangian function} q(\boldsymbol{\mu}) = \inf_{ \boldsymbol{u} \in \mathbb{R}^r } \left\{ p(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \right\} = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top \boldsymbol{u} \right\} = \begin{cases} \inf_{\boldsymbol{x} \in X} \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top \boldsymbol{g}(\boldsymbol{x}) \right\} & \boldsymbol{\mu} \geq \boldsymbol{0}, \\ - \infty & otherwise. \end{cases}\end{align}
<script id="MathJax-Element-46" type="math/tex; mode=display">\begin{align} \label{equ: Lagrangian function} q(\boldsymbol{\mu}) = \inf_{ \boldsymbol{u} \in \mathbb{R}^r } \left\{ p(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \right\} = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top \boldsymbol{u} \right\} = \begin{cases} \inf_{\boldsymbol{x} \in X} \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top \boldsymbol{g}(\boldsymbol{x}) \right\} & \boldsymbol{\mu} \geq \boldsymbol{0}, \\ - \infty & otherwise. \end{cases}\end{align}</script> 我们称之为对偶函数或Lagrangian函数。
例4.2.1[线性规划的对偶]:考虑如下形式的线性规划:\begin{align*} \min_{\boldsymbol{x}} & \ \boldsymbol{c}^\top \boldsymbol{x} \\ \mbox{s.t.} & \ \boldsymbol{a}_j^\top \boldsymbol{x} \geq b_j, j = 1, \dots, r. \end{align*}
<script id="MathJax-Element-47" type="math/tex; mode=display">\begin{align*} \min_{\boldsymbol{x}} & \ \boldsymbol{c}^\top \boldsymbol{x} \\ \mbox{s.t.} & \ \boldsymbol{a}_j^\top \boldsymbol{x} \geq b_j, j = 1, \dots, r. \end{align*}</script> 其中\boldsymbol{c} \in \mathbb{R}^n<script id="MathJax-Element-48" type="math/tex">\boldsymbol{c} \in \mathbb{R}^n</script> ,\boldsymbol{a}_j \in \mathbb{R}^n<script id="MathJax-Element-49" type="math/tex">\boldsymbol{a}_j \in \mathbb{R}^n</script> ,b_j \in \mathbb{R}<script id="MathJax-Element-50" type="math/tex">b_j \in \mathbb{R}</script> ,j = 1, \dots, r<script id="MathJax-Element-51" type="math/tex">j = 1, \dots, r</script> 。于是由式(\ref{equ: Lagrangian function}<script id="MathJax-Element-52" type="math/tex">\ref{equ: Lagrangian function}</script> )知对于\boldsymbol{\mu} \geq \boldsymbol{0}<script id="MathJax-Element-53" type="math/tex">\boldsymbol{\mu} \geq \boldsymbol{0}</script> 有\begin{align*} q(\boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} \left\{ \boldsymbol{c}^\top \boldsymbol{x} + \sum_{j=1}^r \mu_j (b_j - \boldsymbol{a}_j^\top \boldsymbol{x}) \right\} = \begin{cases} \boldsymbol{\mu}^\top \boldsymbol{b} & \sum_{j=1}^r \mu_j \boldsymbol{a}_j = \boldsymbol{c}, \\ -\infty & otherwise. \end{cases} \end{align*}
<script id="MathJax-Element-54" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} \left\{ \boldsymbol{c}^\top \boldsymbol{x} + \sum_{j=1}^r \mu_j (b_j - \boldsymbol{a}_j^\top \boldsymbol{x}) \right\} = \begin{cases} \boldsymbol{\mu}^\top \boldsymbol{b} & \sum_{j=1}^r \mu_j \boldsymbol{a}_j = \boldsymbol{c}, \\ -\infty & otherwise. \end{cases} \end{align*}</script> 对于其它的\boldsymbol{\mu} \in \mathbb{R}^n<script id="MathJax-Element-55" type="math/tex">\boldsymbol{\mu} \in \mathbb{R}^n</script> 有q(\boldsymbol{\mu}) = -\infty<script id="MathJax-Element-56" type="math/tex">q(\boldsymbol{\mu}) = -\infty</script> ,因此对偶问题为\begin{align*} \max_{\boldsymbol{\mu}} & \ \boldsymbol{\mu}^\top \boldsymbol{b} \\ \mbox{s.t.} & \ \sum_{j=1}^r \mu_j \boldsymbol{a}_j = \boldsymbol{c}, \ \boldsymbol{\mu} \geq \boldsymbol{0}. \end{align*}
<script id="MathJax-Element-57" type="math/tex; mode=display">\begin{align*} \max_{\boldsymbol{\mu}} & \ \boldsymbol{\mu}^\top \boldsymbol{b} \\ \mbox{s.t.} & \ \sum_{j=1}^r \mu_j \boldsymbol{a}_j = \boldsymbol{c}, \ \boldsymbol{\mu} \geq \boldsymbol{0}. \end{align*}</script>
四、增广Lagrangian对偶
依然考虑问题(\ref{equ: optimization with inequality constraints}<script id="MathJax-Element-58" type="math/tex">\ref{equ: optimization with inequality constraints}</script> ),和之前相同,引入函数\begin{align*} F_c(\boldsymbol{x}, \boldsymbol{u} ) = \begin{cases} f(\boldsymbol{x}) + \frac{c}{2} \| \boldsymbol{u} \|^2 & \boldsymbol{x} \in C_{\boldsymbol{u} }, \\ \infty & \boldsymbol{x} \not \in C_{\boldsymbol{u} }. \end{cases} \end{align*}
<script id="MathJax-Element-59" type="math/tex; mode=display">\begin{align*} F_c(\boldsymbol{x}, \boldsymbol{u} ) = \begin{cases} f(\boldsymbol{x}) + \frac{c}{2} \| \boldsymbol{u} \|^2 & \boldsymbol{x} \in C_{\boldsymbol{u} }, \\ \infty & \boldsymbol{x} \not \in C_{\boldsymbol{u} }. \end{cases} \end{align*}</script> 其中c<script id="MathJax-Element-60" type="math/tex">c</script> 是一个正数。显然对于\forall \boldsymbol{x} \in C_{ \boldsymbol{0} }<script id="MathJax-Element-61" type="math/tex">\forall \boldsymbol{x} \in C_{ \boldsymbol{0} }</script> 有F(\boldsymbol{x}, \boldsymbol{0} ) = f(\boldsymbol{x})<script id="MathJax-Element-62" type="math/tex">F(\boldsymbol{x}, \boldsymbol{0} ) = f(\boldsymbol{x})</script> 且\begin{align*} p_c(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F_c(\boldsymbol{x}, \boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } \left\{ f(\boldsymbol{x}) + \frac{c}{2} \| \boldsymbol{u} \|^2 \right\}, \end{align*}
<script id="MathJax-Element-63" type="math/tex; mode=display">\begin{align*} p_c(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in \mathbb{R}^n} F_c(\boldsymbol{x}, \boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } \left\{ f(\boldsymbol{x}) + \frac{c}{2} \| \boldsymbol{u} \|^2 \right\}, \end{align*}</script> 易知\begin{align*} w^* = p_c(\boldsymbol{0} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{0} } f(\boldsymbol{x}). \end{align*}
<script id="MathJax-Element-64" type="math/tex; mode=display">\begin{align*} w^* = p_c(\boldsymbol{0} ) = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{0} } f(\boldsymbol{x}). \end{align*}</script> 即最小公共点没有改变。
另一方面,对偶函数\begin{align*} q_c(\boldsymbol{\mu}) = \inf_{ \boldsymbol{u} \in \mathbb{R}^r } \{ p_c(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top \boldsymbol{u} + \frac{c}{2} \| \boldsymbol{u} \|^2 \right\}, \end{align*}
<script id="MathJax-Element-65" type="math/tex; mode=display">\begin{align*} q_c(\boldsymbol{\mu}) = \inf_{ \boldsymbol{u} \in \mathbb{R}^r } \{ p_c(\boldsymbol{u} ) + \boldsymbol{\mu}^\top \boldsymbol{u} \} = \inf_{\boldsymbol{x} \in X, \boldsymbol{g}(\boldsymbol{x}) \leq \boldsymbol{u} } \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top \boldsymbol{u} + \frac{c}{2} \| \boldsymbol{u} \|^2 \right\}, \end{align*}</script> 对于固定的\boldsymbol{x} \in X<script id="MathJax-Element-66" type="math/tex">\boldsymbol{x} \in X</script> ,上式的下确界可以通过优化\boldsymbol{u} <script id="MathJax-Element-67" type="math/tex">\boldsymbol{u} </script> 的每一维依次得到,对于第j<script id="MathJax-Element-68" type="math/tex">j</script> 维,\begin{align*} \min_{u_j} & \ \mu_j u_j + \frac{c}{2} u_j^2 \\ \mbox{s.t.} & \ g_j(\boldsymbol{x}) \leq u_j. \end{align*}
<script id="MathJax-Element-69" type="math/tex; mode=display">\begin{align*} \min_{u_j} & \ \mu_j u_j + \frac{c}{2} u_j^2 \\ \mbox{s.t.} & \ g_j(\boldsymbol{x}) \leq u_j. \end{align*}</script> 易知最优解为\begin{align*} g_j^+(\boldsymbol{x}, \boldsymbol{\mu}, c) = \max \left\{ -\frac{\mu_j}{c}, g_j(\boldsymbol{x}) \right\}, \ j = 1, \dots, r, \end{align*}
<script id="MathJax-Element-70" type="math/tex; mode=display">\begin{align*} g_j^+(\boldsymbol{x}, \boldsymbol{\mu}, c) = \max \left\{ -\frac{\mu_j}{c}, g_j(\boldsymbol{x}) \right\}, \ j = 1, \dots, r, \end{align*}</script> 故\begin{align} \label{equ: augmented Lagrangian function} q_c(\boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in X} \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top g^+(\boldsymbol{x}, \boldsymbol{\mu}, c) + \frac{c}{2} \| g^+(\boldsymbol{x}, \boldsymbol{\mu}, c) \|^2 \right\}, \end{align}
<script id="MathJax-Element-71" type="math/tex; mode=display">\begin{align} \label{equ: augmented Lagrangian function} q_c(\boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in X} \left\{ f(\boldsymbol{x}) + \boldsymbol{\mu}^\top g^+(\boldsymbol{x}, \boldsymbol{\mu}, c) + \frac{c}{2} \| g^+(\boldsymbol{x}, \boldsymbol{\mu}, c) \|^2 \right\}, \end{align}</script> 其中g^+(\boldsymbol{x}, \boldsymbol{\mu}, c)<script id="MathJax-Element-72" type="math/tex">g^+(\boldsymbol{x}, \boldsymbol{\mu}, c)</script> 的第j<script id="MathJax-Element-73" type="math/tex">j</script> 维是g_j^+(\boldsymbol{x}, \boldsymbol{\mu}, c)<script id="MathJax-Element-74" type="math/tex">g_j^+(\boldsymbol{x}, \boldsymbol{\mu}, c)</script> 。
式(\ref{equ: augmented Lagrangian function}<script id="MathJax-Element-75" type="math/tex">\ref{equ: augmented Lagrangian function}</script> )称作增广Lagrangian函数,相较于式(\ref{equ: Lagrangian function}<script id="MathJax-Element-76" type="math/tex">\ref{equ: Lagrangian function}</script> ),增广Lagrangian函数多引入了最后那个二次项,这使得它多了些优异的性质,例如它常常是实值函数(当f<script id="MathJax-Element-77" type="math/tex">f</script> 和g_j<script id="MathJax-Element-78" type="math/tex">g_j</script> 都是连续函数且X<script id="MathJax-Element-79" type="math/tex">X</script> 是紧集时),有时甚至是可微的。
五、极大极小问题
设函数\phi: X \times Z \mapsto R<script id="MathJax-Element-80" type="math/tex">\phi: X \times Z \mapsto R</script> ,其中X<script id="MathJax-Element-81" type="math/tex">X</script> 和Z<script id="MathJax-Element-82" type="math/tex">Z</script> 分别是\mathbb{R}^n<script id="MathJax-Element-83" type="math/tex">\mathbb{R}^n</script> 和\mathbb{R}^m<script id="MathJax-Element-84" type="math/tex">\mathbb{R}^m</script> 的非空子集,则极大极小问题可以形式化为:\begin{align*} \min_{\boldsymbol{x}} & \ \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}) \\ \mbox{s.t.} & \ \boldsymbol{x} \in X. \end{align*}
<script id="MathJax-Element-85" type="math/tex; mode=display">\begin{align*} \min_{\boldsymbol{x}} & \ \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}) \\ \mbox{s.t.} & \ \boldsymbol{x} \in X. \end{align*}</script> 和\begin{align*} \max_{\boldsymbol{z}} & \ \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}) \\ \mbox{s.t.} & \ \boldsymbol{z} \in Z. \end{align*}
<script id="MathJax-Element-86" type="math/tex; mode=display">\begin{align*} \max_{\boldsymbol{z}} & \ \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}) \\ \mbox{s.t.} & \ \boldsymbol{z} \in Z. \end{align*}</script> 一个有趣的问题是在什么条件下有极大极小等式\begin{align*} \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}). \end{align*}
<script id="MathJax-Element-87" type="math/tex; mode=display">\begin{align*} \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}). \end{align*}</script> 成立,且相应的极值都能取到,这在零和博弈和对偶优化理论里都是很常见的问题。
引入函数p: \mathbb{R}^m \mapsto [-\infty, \infty]<script id="MathJax-Element-88" type="math/tex">p: \mathbb{R}^m \mapsto [-\infty, \infty]</script> :\begin{align} \label{equ: minimax perturbation function} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \{ \phi(\boldsymbol{x}, \boldsymbol{z}) - \boldsymbol{u} ^\top \boldsymbol{z} \}, \end{align}
<script id="MathJax-Element-89" type="math/tex; mode=display">\begin{align} \label{equ: minimax perturbation function} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \{ \phi(\boldsymbol{x}, \boldsymbol{z}) - \boldsymbol{u} ^\top \boldsymbol{z} \}, \end{align}</script> 这里p(\boldsymbol{u} )<script id="MathJax-Element-90" type="math/tex">p(\boldsymbol{u} )</script> 可以看成是一个扰动函数,线性项\boldsymbol{u} ^\top \boldsymbol{z}<script id="MathJax-Element-91" type="math/tex">\boldsymbol{u} ^\top \boldsymbol{z}</script> 是扰动项,设M<script id="MathJax-Element-92" type="math/tex">M</script> 是p<script id="MathJax-Element-93" type="math/tex">p</script> 的上境图,那么\begin{align*} w^* = p( \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}).\end{align*}
<script id="MathJax-Element-94" type="math/tex; mode=display">\begin{align*} w^* = p( \boldsymbol{0} ) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}).\end{align*}</script>
下面研究q^*<script id="MathJax-Element-95" type="math/tex">q^*</script> ,在此先引入函数闭凹包的概念。函数f: X \mapsto [-\infty, \infty]<script id="MathJax-Element-96" type="math/tex">f: X \mapsto [-\infty, \infty]</script> 的闭凹包为\begin{align*} cl(conc(f)) = - cl(conv(-f)). \end{align*}
<script id="MathJax-Element-97" type="math/tex; mode=display">\begin{align*} cl(conc(f)) = - cl(conv(-f)). \end{align*}</script> 结合命题1.3.13知\begin{align} \label{equ: concave closure} \sup_{\boldsymbol{x} \in X} f(\boldsymbol{x}) = - \inf_{\boldsymbol{x} \in X} -f(\boldsymbol{x}) = - \inf_{\boldsymbol{x} \in \mathbb{R}^n} cl(conv(-f))(\boldsymbol{x}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} - cl(conv(-f))(\boldsymbol{x}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} cl(conc(f))(\boldsymbol{x}). \end{align}
<script id="MathJax-Element-98" type="math/tex; mode=display">\begin{align} \label{equ: concave closure} \sup_{\boldsymbol{x} \in X} f(\boldsymbol{x}) = - \inf_{\boldsymbol{x} \in X} -f(\boldsymbol{x}) = - \inf_{\boldsymbol{x} \in \mathbb{R}^n} cl(conv(-f))(\boldsymbol{x}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} - cl(conv(-f))(\boldsymbol{x}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} cl(conc(f))(\boldsymbol{x}). \end{align}</script>
命题4.2.2:设函数\phi: X \times Z \mapsto R<script id="MathJax-Element-99" type="math/tex">\phi: X \times Z \mapsto R</script> ,其中X<script id="MathJax-Element-100" type="math/tex">X</script> 和Z<script id="MathJax-Element-101" type="math/tex">Z</script> 分别是\mathbb{R}^n<script id="MathJax-Element-102" type="math/tex">\mathbb{R}^n</script> 和\mathbb{R}^m<script id="MathJax-Element-103" type="math/tex">\mathbb{R}^m</script> 的非空子集。若对于\forall \boldsymbol{x} \in X<script id="MathJax-Element-104" type="math/tex">\forall \boldsymbol{x} \in X</script> ,-cl(conc(\phi))(\boldsymbol{x}, \cdot)<script id="MathJax-Element-105" type="math/tex">-cl(conc(\phi))(\boldsymbol{x}, \cdot)</script> 是正常函数,考虑之前的MCMC框架,设函数p<script id="MathJax-Element-106" type="math/tex">p</script> 的定义如式(\ref{equ: minimax perturbation function}<script id="MathJax-Element-107" type="math/tex">\ref{equ: minimax perturbation function}</script> ),M = epi(p)<script id="MathJax-Element-108" type="math/tex">M = epi(p)</script> ,那么对偶函数为\begin{align*} q(\boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in X} cl(conc(\phi))(\boldsymbol{x}, \boldsymbol{\mu}). \end{align*}
<script id="MathJax-Element-109" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in X} cl(conc(\phi))(\boldsymbol{x}, \boldsymbol{\mu}). \end{align*}</script>
证明:将p(\boldsymbol{u} )<script id="MathJax-Element-110" type="math/tex">p(\boldsymbol{u} )</script> 重新写为\begin{align*} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X} p_{\boldsymbol{x}}(\boldsymbol{u} ), \end{align*}
<script id="MathJax-Element-111" type="math/tex; mode=display">\begin{align*} p(\boldsymbol{u} ) = \inf_{\boldsymbol{x} \in X} p_{\boldsymbol{x}}(\boldsymbol{u} ), \end{align*}</script> 其中p_{\boldsymbol{x}}(\boldsymbol{u} ) = \sup_{\boldsymbol{z} \in Z} \{ \phi(\boldsymbol{x}, \boldsymbol{z}) - \boldsymbol{u} ^\top \boldsymbol{z} \}, \boldsymbol{x} \in X <script id="MathJax-Element-112" type="math/tex">p_{\boldsymbol{x}}(\boldsymbol{u} ) = \sup_{\boldsymbol{z} \in Z} \{ \phi(\boldsymbol{x}, \boldsymbol{z}) - \boldsymbol{u} ^\top \boldsymbol{z} \}, \boldsymbol{x} \in X </script> 。将\phi(\boldsymbol{x}, \boldsymbol{z})<script id="MathJax-Element-113" type="math/tex">\phi(\boldsymbol{x}, \boldsymbol{z})</script> 看作\boldsymbol{z}<script id="MathJax-Element-114" type="math/tex">\boldsymbol{z}</script> 的函数,易知有\begin{align*} - \phi^*(\boldsymbol{x}, \boldsymbol{z}) = \sup_{\boldsymbol{z} \in Z} \{ \boldsymbol{u} ^\top \boldsymbol{z} + \phi(\boldsymbol{x}, \boldsymbol{z}) \} = p_{\boldsymbol{x}}(-\boldsymbol{u} ), \end{align*}
<script id="MathJax-Element-115" type="math/tex; mode=display">\begin{align*} - \phi^*(\boldsymbol{x}, \boldsymbol{z}) = \sup_{\boldsymbol{z} \in Z} \{ \boldsymbol{u} ^\top \boldsymbol{z} + \phi(\boldsymbol{x}, \boldsymbol{z}) \} = p_{\boldsymbol{x}}(-\boldsymbol{u} ), \end{align*}</script> 即p_{\boldsymbol{x}}(-\cdot)<script id="MathJax-Element-116" type="math/tex">p_{\boldsymbol{x}}(-\cdot)</script> 是- \phi(\boldsymbol{x}, \cdot)<script id="MathJax-Element-117" type="math/tex">- \phi(\boldsymbol{x}, \cdot)</script> 的共轭函数,由共轭定理(4)知
\begin{align} \label{equ: minimax duality} p_{\boldsymbol{x}}^*(-\cdot) = - \phi^{**}(\boldsymbol{x}, \cdot) = cl(conv(-\phi)) (\boldsymbol{x}, \cdot) = - cl(conc(\phi)) (\boldsymbol{x}, \cdot), \end{align}
<script id="MathJax-Element-118" type="math/tex; mode=display">\begin{align} \label{equ: minimax duality} p_{\boldsymbol{x}}^*(-\cdot) = - \phi^{**}(\boldsymbol{x}, \cdot) = cl(conv(-\phi)) (\boldsymbol{x}, \cdot) = - cl(conc(\phi)) (\boldsymbol{x}, \cdot), \end{align}</script> 于是\begin{align*} p_{\boldsymbol{x}}^*(-\boldsymbol{\mu}) = - cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{\mu}). \end{align*}
<script id="MathJax-Element-119" type="math/tex; mode=display">\begin{align*} p_{\boldsymbol{x}}^*(-\boldsymbol{\mu}) = - cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{\mu}). \end{align*}</script> 因此对于任意\boldsymbol{\mu} \in \mathbb{R}^m<script id="MathJax-Element-120" type="math/tex">\boldsymbol{\mu} \in \mathbb{R}^m</script> 有\begin{align*} q(\boldsymbol{\mu}) & = \inf_{\boldsymbol{u} \in \mathbb{R}^m} \{ p(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} \\ & = \inf_{\boldsymbol{u} \in \mathbb{R}^m}\inf_{\boldsymbol{x} \in X} \{ p_{\boldsymbol{x}}(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} \\ & = \inf_{\boldsymbol{x} \in X} \inf_{\boldsymbol{u} \in \mathbb{R}^m} \{ p_{\boldsymbol{x}}(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} \\ & = \inf_{\boldsymbol{x} \in X} \left\{ - \sup_{\boldsymbol{u} \in \mathbb{R}^m} \{ \boldsymbol{u} ^\top (-\boldsymbol{\mu}) - p_{\boldsymbol{x}}(\boldsymbol{u} )\} \right \} \\ & = \inf_{\boldsymbol{x} \in X} \{ -p_{\boldsymbol{x}}^*(-\boldsymbol{\mu}) \} \\ & = \inf_{\boldsymbol{x} \in X} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{\mu}). \end{align*}
<script id="MathJax-Element-121" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) & = \inf_{\boldsymbol{u} \in \mathbb{R}^m} \{ p(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} \\ & = \inf_{\boldsymbol{u} \in \mathbb{R}^m}\inf_{\boldsymbol{x} \in X} \{ p_{\boldsymbol{x}}(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} \\ & = \inf_{\boldsymbol{x} \in X} \inf_{\boldsymbol{u} \in \mathbb{R}^m} \{ p_{\boldsymbol{x}}(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} \\ & = \inf_{\boldsymbol{x} \in X} \left\{ - \sup_{\boldsymbol{u} \in \mathbb{R}^m} \{ \boldsymbol{u} ^\top (-\boldsymbol{\mu}) - p_{\boldsymbol{x}}(\boldsymbol{u} )\} \right \} \\ & = \inf_{\boldsymbol{x} \in X} \{ -p_{\boldsymbol{x}}^*(-\boldsymbol{\mu}) \} \\ & = \inf_{\boldsymbol{x} \in X} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{\mu}). \end{align*}</script>
-cl(conc(\phi))(\boldsymbol{x}, \cdot)<script id="MathJax-Element-122" type="math/tex">-cl(conc(\phi))(\boldsymbol{x}, \cdot)</script> 是正常函数的条件是必需的,否则式(\ref{equ: minimax duality}<script id="MathJax-Element-123" type="math/tex">\ref{equ: minimax duality}</script> )中的第二个等号可能不成立。此外我们还可得到如下一些结论:
- 一般来说,我们有\begin{align*} \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}) \leq q^* \leq w^* = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}), \end{align*}
<script id="MathJax-Element-124" type="math/tex; mode=display">\begin{align*} \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}) \leq q^* \leq w^* = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}), \end{align*}</script> 其中第一个不等号成立是因为\begin{align*} q(\boldsymbol{\mu}) = \inf_{\boldsymbol{u} \in \mathbb{R}^m} \{ p(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} = \inf_{\boldsymbol{u} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \{ \phi(\boldsymbol{x}, \boldsymbol{z}) + \boldsymbol{u} ^\top(\boldsymbol{\mu} - \boldsymbol{z}) \} \geq \inf_{\boldsymbol{u} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{\mu}), \end{align*}
<script id="MathJax-Element-125" type="math/tex; mode=display">\begin{align*} q(\boldsymbol{\mu}) = \inf_{\boldsymbol{u} \in \mathbb{R}^m} \{ p(\boldsymbol{u} ) + \boldsymbol{u} ^\top \boldsymbol{\mu} \} = \inf_{\boldsymbol{u} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \{ \phi(\boldsymbol{x}, \boldsymbol{z}) + \boldsymbol{u} ^\top(\boldsymbol{\mu} - \boldsymbol{z}) \} \geq \inf_{\boldsymbol{u} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{\mu}) = \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{\mu}), \end{align*}</script> 于是有\begin{align*} q^* = \sup_{\boldsymbol{\mu} \in \mathbb{R}^m} q(\boldsymbol{\mu}) \geq \sup_{\boldsymbol{\mu} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{\mu}) \geq \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}). \end{align*}
<script id="MathJax-Element-126" type="math/tex; mode=display">\begin{align*} q^* = \sup_{\boldsymbol{\mu} \in \mathbb{R}^m} q(\boldsymbol{\mu}) \geq \sup_{\boldsymbol{\mu} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{\mu}) \geq \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}). \end{align*}</script> 第二个等号是弱对偶定理,由此可见,若极大极小等式成立,则有强对偶成立。 - 若\begin{align*} \phi(\boldsymbol{x}, \boldsymbol{z}) = cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z}), \forall \boldsymbol{x} \in X, \boldsymbol{z} \in Z \end{align*}
<script id="MathJax-Element-127" type="math/tex; mode=display">\begin{align*} \phi(\boldsymbol{x}, \boldsymbol{z}) = cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z}), \forall \boldsymbol{x} \in X, \boldsymbol{z} \in Z \end{align*}</script> 成立且对于\forall \boldsymbol{x} \in X<script id="MathJax-Element-128" type="math/tex">\forall \boldsymbol{x} \in X</script> ,-cl(conc(\phi)) (\boldsymbol{x}, \cdot)<script id="MathJax-Element-129" type="math/tex">-cl(conc(\phi)) (\boldsymbol{x}, \cdot)</script> 是正常函数,则有\begin{align*} q^* = \sup_{\boldsymbol{z} \in \mathbb{R}^m} q(\boldsymbol{z}) = \sup_{\boldsymbol{z} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} cl(conc(\phi))(\boldsymbol{x}, \boldsymbol{z}) = \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}), \end{align*}
<script id="MathJax-Element-130" type="math/tex; mode=display">\begin{align*} q^* = \sup_{\boldsymbol{z} \in \mathbb{R}^m} q(\boldsymbol{z}) = \sup_{\boldsymbol{z} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} cl(conc(\phi))(\boldsymbol{x}, \boldsymbol{z}) = \sup_{\boldsymbol{z} \in Z} \inf_{\boldsymbol{x} \in X} \phi(\boldsymbol{x}, \boldsymbol{z}), \end{align*}</script> 这意味着若\phi = cl(conc(\phi))<script id="MathJax-Element-131" type="math/tex">\phi = cl(conc(\phi))</script> ,则极大极小等式等价于强对偶关系。 - 由式(\ref{equ: concave closure}<script id="MathJax-Element-132" type="math/tex">\ref{equ: concave closure}</script> )知\sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}) = \sup_{\boldsymbol{z} \in \mathbb{R}^m} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z})<script id="MathJax-Element-133" type="math/tex">\sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}) = \sup_{\boldsymbol{z} \in \mathbb{R}^m} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z})</script> ,于是\begin{align*} w^* = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in \mathbb{R}^m} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z}). \end{align*}
<script id="MathJax-Element-134" type="math/tex; mode=display">\begin{align*} w^* = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in Z} \phi(\boldsymbol{x}, \boldsymbol{z}) = \inf_{\boldsymbol{x} \in X} \sup_{\boldsymbol{z} \in \mathbb{R}^m} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z}). \end{align*}</script> 若对于\forall \boldsymbol{x} \in X<script id="MathJax-Element-135" type="math/tex">\forall \boldsymbol{x} \in X</script> ,-cl(conc(\phi)) (\boldsymbol{x}, \cdot)<script id="MathJax-Element-136" type="math/tex">-cl(conc(\phi)) (\boldsymbol{x}, \cdot)</script> 是正常函数,由命题4.2.2知\begin{align*} q^* = \sup_{\boldsymbol{z} \in \mathbb{R}^m} q(\boldsymbol{z}) = \sup_{\boldsymbol{z} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z}) \end{align*}
<script id="MathJax-Element-137" type="math/tex; mode=display">\begin{align*} q^* = \sup_{\boldsymbol{z} \in \mathbb{R}^m} q(\boldsymbol{z}) = \sup_{\boldsymbol{z} \in \mathbb{R}^m} \inf_{\boldsymbol{x} \in X} cl(conc(\phi)) (\boldsymbol{x}, \boldsymbol{z}) \end{align*}</script> 由此可知w^*<script id="MathJax-Element-138" type="math/tex">w^*</script> 和q^*<script id="MathJax-Element-139" type="math/tex">q^*</script> 分别是cl(conc(\phi))<script id="MathJax-Element-140" type="math/tex">cl(conc(\phi))</script> 的\inf \sup<script id="MathJax-Element-141" type="math/tex">\inf \sup</script> 值和\sup \inf<script id="MathJax-Element-142" type="math/tex">\sup \inf</script> 值。