Gradient descent using Newton's method
Definition
Gradient descent using Newton's method is a variant of gradient descent where the step size along the gradient direction is determined using Newton's method. In other words, we move the same way that we would move if we were applying Newton's method to the function restricted to the line through the point in the direction of the gradient vector.
By default, we are referring to gradient descent using one iteration of Newton's method, i.e., we stop Newton's method after one iteration. However, we can consider gradient descent using Newton's method where we use multiple iterations of Newton's method to determine the step size for gradient descent.
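The following short derivation (a sketch using only the definitions above) makes this precise and produces the step-size formula stated in the next section. Write <math>\vec{v}_k = \nabla f(\vec{x}_k)</math> and restrict <math>f</math> to the gradient line through <math>\vec{x}_k</math>:

<math>g(t) = f(\vec{x}_k - t\vec{v}_k)</math>

By the chain rule,

<math>g'(0) = -\vec{v}_k^T \nabla f(\vec{x}_k) = -|\vec{v}_k|^2, \qquad g''(0) = \vec{v}_k^T (H(f)(\vec{x}_k)) \vec{v}_k</math>

where <math>H(f)</math> denotes the Hessian matrix of <math>f</math>. One iteration of Newton's method for minimizing <math>g</math>, started at <math>t = 0</math>, gives

<math>t_1 = 0 - \frac{g'(0)}{g''(0)} = \frac{|\vec{v}_k|^2}{\vec{v}_k^T (H(f)(\vec{x}_k)) \vec{v}_k}</math>

so the resulting move is <math>\vec{x}_{k+1} = \vec{x}_k - t_1 \vec{v}_k</math>.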
Learning algorithm
Explicitly, the learning algorithm is:

<math>\vec{x}_{k+1} = \vec{x}_k - \alpha_k \nabla f(\vec{x}_k)</math>

where <math>\alpha_k</math> is the reciprocal of the second derivative of <math>f</math> along the unit vector in the direction of the gradient (the value <math>t_1</math> derived above). Explicitly, if <math>\vec{v}_k = \nabla f(\vec{x}_k)</math>, we have:

<math>\alpha_k = \frac{|\vec{v}_k|^2}{\vec{v}_k^T(H(f)(\vec{x}_k))\vec{v}_k}</math>

where <math>H(f)(\vec{x}_k)</math> is the Hessian matrix of <math>f</math> at <math>\vec{x}_k</math>.
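As a concrete illustration, here is a minimal Python sketch of this update rule, assuming the objective supplies its gradient and Hessian as callables; the function and parameter names (<code>gradient_descent_newton</code>, <code>f_grad</code>, <code>f_hess</code>) are illustrative, not from any particular library.

<syntaxhighlight lang="python">
import numpy as np

def gradient_descent_newton(f_grad, f_hess, x0, num_steps=100, tol=1e-10):
    """Gradient descent with the step size chosen by one iteration of
    Newton's method along the gradient direction (names illustrative)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        v = f_grad(x)                       # v_k = grad f(x_k)
        if np.dot(v, v) < tol ** 2:         # stop once the gradient is tiny
            break
        curv = v @ f_hess(x) @ v            # v_k^T H(f)(x_k) v_k
        # Assumes curv > 0 (positive curvature along the gradient);
        # otherwise this step size is not meaningful as a descent step.
        alpha = np.dot(v, v) / curv         # alpha_k = |v_k|^2 / (v^T H v)
        x = x - alpha * v                   # x_{k+1} = x_k - alpha_k v_k
    return x

# Worked check on a positive-definite quadratic f(x) = (1/2) x^T A x - b^T x,
# whose gradient is A x - b, Hessian is A, and exact minimizer is A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min = gradient_descent_newton(lambda x: A @ x - b, lambda x: A, np.zeros(2))
print(x_min, np.linalg.solve(A, b))  # the two results should agree closely
</syntaxhighlight>

For the multiple-iteration variant mentioned in the definition, one would instead iterate <math>t \leftarrow t - g'(t)/g''(t)</math> on the restricted function <math>g</math> a few times before moving, re-evaluating the gradient and Hessian at <math>\vec{x}_k - t\vec{v}_k</math> at each inner iteration.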