Gradient descent with decaying learning rate
Definition
Gradient descent with decaying learning rate is a form of gradient descent where the learning rate varies as a function of the number of iterations, but does not otherwise depend on the value of the vector at that stage. The update rule is as follows:

<math>\vec{x}^{(k+1)} = \vec{x}^{(k)} - \alpha_k \nabla f\left(\vec{x}^{(k)}\right)</math>

where <math>\alpha_k</math> depends only on <math>k</math> and not on the choice of <math>\vec{x}^{(k)}</math>.
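As a minimal sketch of this update rule, the Python snippet below runs gradient descent with a step size that is a function of the iteration number alone. The objective <math>f(\vec{x}) = \tfrac{1}{2}\|\vec{x}\|^2</math> (whose gradient is <math>\vec{x}</math>) and the schedule <math>\alpha_k = 1/k</math> are illustrative choices, not prescribed by the definition.

```python
import numpy as np

def gradient_descent_decaying(grad_f, x0, schedule, num_iters=100):
    """Gradient descent where the learning rate alpha_k depends only on
    the iteration number k, via the supplied schedule function."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, num_iters + 1):
        alpha_k = schedule(k)           # learning rate depends only on k
        x = x - alpha_k * grad_f(x)     # x^(k+1) = x^(k) - alpha_k * grad f(x^(k))
    return x

# Illustrative usage: f(x) = ||x||^2 / 2, so grad f(x) = x,
# with the decaying schedule alpha_k = 1/k.
x_min = gradient_descent_decaying(
    grad_f=lambda x: x,
    x0=[4.0, -2.0],
    schedule=lambda k: 1.0 / k,
)
print(x_min)
```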
Cases
| Type of decay | Example expression for <math>\alpha_k</math> | More information |
|---|---|---|
| linear decay | <math>\alpha_k = \frac{1}{k}</math> | Gradient descent with linearly decaying learning rate |
| quadratic decay | <math>\alpha_k = \frac{1}{k^2}</math> | Gradient descent with quadratically decaying learning rate |
| exponential decay | <math>\alpha_k = e^{-\lambda k}</math> where <math>\lambda > 0</math> | Gradient descent with exponentially decaying learning rate |
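The sketch below writes out these three schedules as functions of the iteration number, assuming the example expressions in the table above; the parameter value <code>lam = 0.1</code> is an illustrative choice.

```python
import math

# Example decay schedules as functions of the iteration number k >= 1,
# matching the example expressions in the table above.
def linear_decay(k):
    return 1.0 / k

def quadratic_decay(k):
    return 1.0 / k**2

def exponential_decay(k, lam=0.1):
    # lam > 0 is an illustrative parameter choice
    return math.exp(-lam * k)

# Compare how quickly each schedule shrinks the learning rate.
for k in (1, 10, 100):
    print(k, linear_decay(k), quadratic_decay(k), exponential_decay(k))
```

One relevant distinction among these schedules: <math>\sum_k 1/k</math> diverges, while <math>\sum_k 1/k^2</math> and <math>\sum_k e^{-\lambda k}</math> converge, so under quadratic or exponential decay the iterates can only travel a bounded total distance, whereas linear decay does not impose such a bound.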