In mathematics, mirror descent is an iterative optimization algorithm for finding a local minimum of a differentiable function.
It generalizes algorithms such as gradient descent and multiplicative weights.
Mirror descent was originally proposed by Nemirovski and Yudin in 1983. [1]
In gradient descent with the sequence of learning rates applied to a differentiable function , one starts with a guess for a local minimum of and considers the sequence such that
This can be reformulated by noting that
In other words, minimizes the first-order approximation to at with added proximity term .
This squared Euclidean distance term is a particular example of a Bregman distance. Using other Bregman distances will yield other algorithms such as Hedge which may be more suited to optimization over particular geometries. [2] [3]
We are given convex function to optimize over a convex set , and given some norm on .
We are also given differentiable convex function , - strongly convex with respect to the given norm. This is called the distance-generating function, and its gradient is known as the mirror map.
Starting from initial , in each iteration of Mirror Descent:
Mirror descent in the online optimization setting is known as Online Mirror Descent (OMD). [4]
In mathematics, mirror descent is an iterative optimization algorithm for finding a local minimum of a differentiable function.
It generalizes algorithms such as gradient descent and multiplicative weights.
Mirror descent was originally proposed by Nemirovski and Yudin in 1983. [1]
In gradient descent with the sequence of learning rates applied to a differentiable function , one starts with a guess for a local minimum of and considers the sequence such that
This can be reformulated by noting that
In other words, minimizes the first-order approximation to at with added proximity term .
This squared Euclidean distance term is a particular example of a Bregman distance. Using other Bregman distances will yield other algorithms such as Hedge which may be more suited to optimization over particular geometries. [2] [3]
We are given convex function to optimize over a convex set , and given some norm on .
We are also given differentiable convex function , - strongly convex with respect to the given norm. This is called the distance-generating function, and its gradient is known as the mirror map.
Starting from initial , in each iteration of Mirror Descent:
Mirror descent in the online optimization setting is known as Online Mirror Descent (OMD). [4]