Gradient descent with decaying learning rate
Gradient descent with decaying learning rate is a form of gradient descent in which the learning rate $\alpha_t$ varies as a function of the iteration number $t$, but does not otherwise depend on the value of the vector at that stage. The update rule is as follows:

$$x_{t+1} = x_t - \alpha_t \nabla f(x_t)$$

where $\alpha_t$ depends only on $t$ and not on the choice of $x_t$.
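A minimal sketch of this update rule in Python, assuming the caller supplies the gradient of a differentiable objective; the function names, the quadratic test objective, and the inverse-time schedule used here are illustrative choices, not part of the definition above:

```python
import numpy as np

def gradient_descent_decaying(grad_f, x0, schedule, num_iters=100):
    """Gradient descent where the learning rate alpha_t is a function
    of the iteration number t only (a decay schedule)."""
    x = np.asarray(x0, dtype=float)
    for t in range(1, num_iters + 1):
        alpha_t = schedule(t)          # depends only on t, not on x
        x = x - alpha_t * grad_f(x)    # x_{t+1} = x_t - alpha_t * grad f(x_t)
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x,
# with a linearly decaying learning rate alpha_t = alpha_1 / t.
x_min = gradient_descent_decaying(
    grad_f=lambda x: 2 * x,
    x0=[3.0, -4.0],
    schedule=lambda t: 0.25 / t,
)
print(x_min)  # approaches the minimizer [0, 0]
```

Note that the schedule is passed in as its own function: this keeps the update rule generic over all the decay types listed in the table below.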
| Type of decay | Example expression for $\alpha_t$ | More information |
| --- | --- | --- |
| linear decay | $\alpha_t = \frac{\alpha_1}{t}$ | Gradient descent with linearly decaying learning rate |
| quadratic decay | $\alpha_t = \frac{\alpha_1}{t^2}$ | Gradient descent with quadratically decaying learning rate |
| exponential decay | $\alpha_t = \alpha_1 e^{-kt}$, where $k > 0$ | Gradient descent with exponentially decaying learning rate |
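For concreteness, the three schedules in the table can be written as plain Python functions compatible with the sketch above; the constants $\alpha_1 = 0.5$ and $k = 0.1$ are arbitrary illustrative values, not prescribed by the table:

```python
import math

alpha_1 = 0.5  # scale constant; arbitrary illustrative value
k = 0.1        # exponential decay rate, k > 0; arbitrary illustrative value

def linear_decay(t):
    # alpha_t = alpha_1 / t
    return alpha_1 / t

def quadratic_decay(t):
    # alpha_t = alpha_1 / t^2
    return alpha_1 / t**2

def exponential_decay(t):
    # alpha_t = alpha_1 * e^(-k t)
    return alpha_1 * math.exp(-k * t)

# Compare how quickly each schedule shrinks the learning rate.
for t in (1, 10, 100):
    print(t, linear_decay(t), quadratic_decay(t), exponential_decay(t))
```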