Gradient descent with decaying learning rate

Definition

Gradient descent with decaying learning rate is a form of gradient descent where the learning rate varies as a function of the iteration number, but does not otherwise depend on the value of the current iterate. The update rule is as follows:

${\vec {x}}^{(k+1)}={\vec {x}}^{(k)}-\alpha _{k}\nabla f\left({\vec {x}}^{(k)}\right)$

where $\alpha _{k}$ depends only on $k$ and not on the value of ${\vec {x}}^{(k)}$.
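
The update rule can be illustrated with a minimal Python sketch. The names `gradient_descent`, `grad_f`, and `schedule` are illustrative assumptions and not part of the definition; `grad_f` is assumed to return the gradient of the objective at a point, and `schedule` maps the iteration number $k$ to $\alpha _{k}$.

```python
from typing import Callable, List

Vector = List[float]


def gradient_descent(
    grad_f: Callable[[Vector], Vector],
    x0: Vector,
    schedule: Callable[[int], float],
    num_iterations: int,
) -> Vector:
    """Run gradient descent with a learning rate that depends only on k."""
    x = list(x0)
    for k in range(num_iterations):
        alpha_k = schedule(k)  # depends on k only, never on x
        g = grad_f(x)          # gradient at the current iterate
        x = [xi - alpha_k * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: f(x) = x_1^2 + x_2^2, whose gradient is 2x.
def grad_quadratic(x: Vector) -> Vector:
    return [2.0 * xi for xi in x]


# Learning rate alpha_k = alpha_0 / (k + 1) with an assumed alpha_0 = 0.25.
result = gradient_descent(grad_quadratic, [1.0, -2.0], lambda k: 0.25 / (k + 1), 10)
print(result)
```

Because the schedule is passed in as a function of $k$ alone, the loop cannot adapt the step size to the iterate, which is exactly the restriction in the definition.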

Cases

Type of decay	Example expression for $\alpha _{k}$	More information
linear decay	$\alpha _{k}={\frac {\alpha _{0}}{k+1}}$	Gradient descent with linearly decaying learning rate
quadratic decay	$\alpha _{k}={\frac {\alpha _{0}}{(k+1)^{2}}}$	Gradient descent with quadratically decaying learning rate
exponential decay	$\alpha _{k}=\alpha _{0}e^{-\beta k}$ where $\beta >0$	Gradient descent with exponentially decaying learning rate
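
The example expressions in the table can be written as small Python functions. This is a sketch only: the function names and the sample values of $\alpha _{0}$ and $\beta$ are assumptions chosen for illustration.

```python
import math


def linear_decay(k: int, alpha0: float) -> float:
    """alpha_k = alpha_0 / (k + 1)"""
    return alpha0 / (k + 1)


def quadratic_decay(k: int, alpha0: float) -> float:
    """alpha_k = alpha_0 / (k + 1)^2"""
    return alpha0 / (k + 1) ** 2


def exponential_decay(k: int, alpha0: float, beta: float) -> float:
    """alpha_k = alpha_0 * exp(-beta * k), with beta > 0"""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return alpha0 * math.exp(-beta * k)


# Print the first few learning rates for each schedule
# (alpha_0 = 1.0 and beta = 0.5 are arbitrary sample values).
for k in range(4):
    print(
        k,
        linear_decay(k, 1.0),
        quadratic_decay(k, 1.0),
        exponential_decay(k, 1.0, 0.5),
    )
```

All three schedules start at $\alpha _{0}$ when $k=0$ and decrease monotonically as $k$ grows.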