Accelerated gradient method
The term accelerated gradient method is used for variants of gradient descent that involve an "acceleration" or "momentum" term.
In terms of a gradient descent step and a momentum step
A typical accelerated gradient method carries out two steps in every iteration:
- A gradient descent-type step, which moves along the negative gradient direction (this could be executed in any of a number of ways, such as gradient descent with a constant learning rate, or parallel coordinate descent with a constant learning rate).
- A momentum or acceleration step, which moves further along the line from the previous iterate to the current iterate, with a momentum term deciding how far to move.
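The two steps above can be sketched as follows. This is a minimal heavy-ball-style illustration, not a definitive implementation; the function name `accelerated_gradient` and the specific learning rate and momentum values are assumptions chosen for the example.

```python
import numpy as np

def accelerated_gradient(grad, x0, lr=0.1, momentum=0.8, iters=300):
    """Gradient descent with a momentum step (heavy-ball style sketch).

    Each iteration combines:
      - a gradient step: move along the negative gradient, scaled by `lr`
      - a momentum step: move further along the direction from the
        previous iterate to the current iterate, scaled by `momentum`
    """
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    for _ in range(iters):
        x_new = x - lr * grad(x) + momentum * (x - x_prev)
        x_prev, x = x, x_new
    return x

# Example: minimize f(x) = 0.5 * x^T A x for a simple diagonal A,
# whose unique minimizer is the origin.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x_star = accelerated_gradient(grad, x0=[5.0, 5.0])
```

Note that the gradient and momentum updates are folded into one assignment here; other variants (such as Nesterov's method) evaluate the gradient at the extrapolated point rather than the current iterate.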
In terms of a sequence of global quadratic approximations
Fill this in later