Nesterov's gradient acceleration

From Calculus
Revision as of 03:48, 9 September 2014 by Vipul (talk | contribs) (Created page with "==Definition== '''Nesterov's gradient acceleration''' refers to a general approach that can be used to modify a gradient descent-type method to improve its initial conver...")

Definition

Nesterov's gradient acceleration refers to a general approach that can be used to modify a gradient descent-type method to improve its initial convergence.

The two-step iteration description

In this description, there are two intertwined sequences of iterates that constitute our guesses:

x^(0), x^(1), x^(2), ...

y^(0), y^(1), y^(2), ...

Explicitly, the sequences are intertwined as follows:

x^(0), y^(0), x^(1), y^(1), x^(2), y^(2), ...

We use parenthesized superscripts to denote the iteration stage, because the subscripts are reserved for use by the coordinates.

For the initialization, we set:

x^(0) = y^(0) = initial guess

The two-step iteration is as follows:

  • x^(k+1) = result of the gradient descent-type method applied to y^(k)
  • y^(k+1) = x^(k+1) + m^(k+1) (x^(k+1) − x^(k))

Note that the sign on the right side is not a typographical error: y(k+1) is a non-convex combination of x(k) and x(k+1). The value m(k+1) is a member of a predetermined sequence of real numbers in the interval [0,1) that is independent of the specific problem. A typical expression is m(k)=(k1)/(k+2) (so m(k+1)=k/(k+3)).