Gradient descent with decaying learning rate


Revision as of 15:11, 1 September 2014

==Definition==

'''Gradient descent with decaying learning rate''' is a form of [[gradient descent]] where the learning rate varies as a function of the number of iterations, but is not otherwise dependent on the value of the vector at that stage. The update rule is as follows:

<math>\vec{x}^{(k+1)} = \vec{x}^{(k)} - \alpha_k \nabla f\left(\vec{x}^{(k)}\right)</math>

where <math>\alpha_k</math> depends only on <math>k</math> and not on the choice of <math>\vec{x}^{(k)}</math>.
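The update rule above can be sketched in code. This is a minimal illustration, not a reference implementation; the function and parameter names (`gradient_descent_decaying`, `grad_f`, `alpha`) are chosen here for demonstration, and the objective and schedule in the usage example are hypothetical.

```python
import numpy as np

def gradient_descent_decaying(grad_f, x0, alpha, num_iters):
    """Gradient descent where the step size alpha(k) depends only on the
    iteration number k, never on the current iterate x^{(k)}."""
    x = np.asarray(x0, dtype=float)
    for k in range(num_iters):
        # Update rule: x^{(k+1)} = x^{(k)} - alpha_k * grad f(x^{(k)})
        x = x - alpha(k) * grad_f(x)
    return x

# Example: minimize f(x) = ||x||^2 / 2 (gradient: x) using a
# linearly decaying learning rate alpha_k = alpha_0 / (k + 1).
x_min = gradient_descent_decaying(
    grad_f=lambda x: x,
    x0=[4.0, -2.0],
    alpha=lambda k: 0.5 / (k + 1),
    num_iters=100,
)
```

Because `alpha` is a function of `k` alone, any decay schedule can be passed in without changing the descent loop itself.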

==Cases==

{| class="wikitable"
! Type of decay !! Example expression for <math>\alpha_k</math> !! More information
|-
| linear decay || <math>\alpha_k = \frac{\alpha_0}{k+1}</math> || [[Gradient descent with linearly decaying learning rate]]
|-
| quadratic decay || <math>\alpha_k = \frac{\alpha_0}{(k+1)^2}</math> || [[Gradient descent with quadratically decaying learning rate]]
|-
| exponential decay || <math>\alpha_k = \alpha_0 e^{-\beta k}</math> where <math>\beta > 0</math> || [[Gradient descent with exponentially decaying learning rate]]
|}
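The three decay schedules above can each be written as a function of the iteration number <math>k</math> alone. A minimal sketch follows; the values of `alpha0` and `beta` are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical schedule parameters for demonstration.
alpha0, beta = 0.5, 0.1

def linear_decay(k):
    # alpha_k = alpha_0 / (k + 1)
    return alpha0 / (k + 1)

def quadratic_decay(k):
    # alpha_k = alpha_0 / (k + 1)^2
    return alpha0 / (k + 1) ** 2

def exponential_decay(k):
    # alpha_k = alpha_0 * e^{-beta * k}, with beta > 0
    return alpha0 * math.exp(-beta * k)
```

All three start at <math>\alpha_0</math> when <math>k = 0</math> and decrease monotonically; quadratic decay shrinks the step size faster than linear decay, while the rate of exponential decay is controlled by <math>\beta</math>.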