Newton's method converges quadratically from sufficiently close to a root of multiplicity one


Statement

Suppose f is a function of one variable that is at least twice continuously differentiable around a point x* in its domain, and further, f(x*) = 0 while f'(x*) ≠ 0 (so that x* is a root of multiplicity one). Then, there exists ε > 0 such that for any x^(0) ∈ (x* − ε, x* + ε), the sequence obtained by applying Newton's method for root-finding for a function of one variable starting from x^(0) either reaches the root in finitely many steps or has quadratic convergence (or higher-order convergence) to the root x*.
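To make the iteration concrete, here is a minimal Python sketch (the helper name `newton`, the example function f(x) = x^2 − 2, and the fixed step count are illustrative choices, not part of the statement):

```python
# Minimal sketch of Newton's method for root-finding in one variable.
# The name `newton` and the fixed iteration count are illustrative choices.

def newton(f, f_prime, x0, num_steps=10):
    """Iterate x <- x - f(x)/f'(x) starting from x0."""
    x = x0
    for _ in range(num_steps):
        x = x - f(x) / f_prime(x)
    return x

# f(x) = x^2 - 2 has a simple root at sqrt(2), where f'(sqrt(2)) != 0,
# so starting close enough gives quadratic convergence.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.5)
```

With quadratic convergence the number of correct digits roughly doubles per step, so ten steps are far more than enough here.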

Case that the second derivative is nonzero

In the case that f''(x*) ≠ 0, we get precisely quadratic convergence from sufficiently close to the root, with convergence rate (1/2)|f''(x*)/f'(x*)|.
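A quick numerical sanity check (a sketch using our own test function f(x) = x^2 − 2, not part of the statement): the successive error ratios |h^(k+1)|/|h^(k)|^2 should approach (1/2)|f''(x*)/f'(x*)|, which here is 1/(2√2):

```python
# Sketch: check the quadratic convergence rate (1/2)|f''(x*)/f'(x*)|
# for f(x) = x^2 - 2, whose simple root is x* = sqrt(2).
import math

f = lambda x: x * x - 2
fp = lambda x: 2 * x
x_star = math.sqrt(2)
predicted = 0.5 * abs(2 / fp(x_star))  # f'' = 2 identically

x = 1.5
errors = [abs(x - x_star)]
for _ in range(3):
    x = x - f(x) / fp(x)
    errors.append(abs(x - x_star))

# Ratios |e_{k+1}| / |e_k|^2 should approach `predicted`.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(3)]
```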

General case of an infinitely differentiable function with finite order of zero

Suppose that, of the integers greater than or equal to 2, r is the smallest integer such that f^(r)(x*) ≠ 0. Then, we get convergence of order r, with convergence rate:

((r − 1)/r!) |f^(r)(x*)/f'(x*)|
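As an illustration of this higher-order case (with a test function of our own choosing): f(x) = sin(x) has a simple root at x* = 0 with f''(0) = 0 and f'''(0) = −1, so r = 3 and the predicted rate is ((3 − 1)/3!)·|−1/1| = 1/3, with cubic convergence:

```python
# Sketch: cubic convergence of Newton's method on f(x) = sin(x) at x* = 0,
# where f''(0) = 0 and f'''(0) = -1, so r = 3 and the rate is 2/3! = 1/3.
import math

x = 0.1
iterates = [x]
for _ in range(2):
    x = x - math.sin(x) / math.cos(x)  # Newton step for f = sin
    iterates.append(x)

# |x_{k+1}| / |x_k|^3 should be close to 1/3 (the root is x* = 0).
ratio = abs(iterates[2]) / abs(iterates[1]) ** 3
```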

Corresponding statement for optimization

The preceding statements are in the context of Newton's method for root-finding for a function of one variable. They also imply corresponding statements for Newton's method for optimization of a function of one variable, but all the derivative orders increase by one. Explicitly, for a function F, the statements for finding the minimum of F correspond to the statements for finding the zeros of F' (i.e., set f = F' to easily translate statements back and forth).

Suppose x* is a point of local minimum of F. Suppose F is at least three times continuously differentiable around x*, and F''(x*) ≠ 0. Then, there exists a value ε > 0 such that for all x^(0) ∈ (x* − ε, x* + ε), Newton's method for optimization, starting at x^(0), converges to x*, and the convergence is quadratic or of higher order.
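A minimal sketch of the optimization iteration x^(k+1) = x^(k) − F'(x^(k))/F''(x^(k)), using a test function of our own choosing, F(x) = x^2 + x^3, which has a local minimum at x* = 0 with F''(0) = 2 ≠ 0; the predicted quadratic rate is (1/2)|F'''(0)/F''(0)| = 3/2:

```python
# Sketch: Newton's method for optimization on F(x) = x^2 + x^3
# (local minimum at x* = 0, F''(0) = 2, F'''(0) = 6).
Fp = lambda x: 2 * x + 3 * x ** 2   # F'
Fpp = lambda x: 2 + 6 * x           # F''

x = 0.1
errors = [abs(x)]
for _ in range(3):
    x = x - Fp(x) / Fpp(x)          # Newton step on F' (i.e., f = F')
    errors.append(abs(x))

# |e_{k+1}| / |e_k|^2 should approach (1/2)|F'''(0)/F''(0)| = 1.5.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(3)]
```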

Case that the third derivative is nonzero

In the case that F'''(x*) ≠ 0, the convergence is precisely quadratic and the convergence rate is:

(1/2)|F'''(x*)/F''(x*)|

General case of an infinitely differentiable function with finite order of zero

Suppose that, of the integers greater than or equal to 2, r is the smallest integer such that F^(r+1)(x*) ≠ 0. Then, we get convergence of order r, with convergence rate:

((r − 1)/r!) |F^(r+1)(x*)/F''(x*)|
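For a concrete instance (again with a made-up test function): F(x) = x^2 + x^4 has a local minimum at x* = 0 with F''(0) = 2, F'''(0) = 0, and F''''(0) = 24, so r = 3 and the predicted rate is ((3 − 1)/3!)·|24/2| = 4:

```python
# Sketch: order-3 convergence of Newton's method for optimization on
# F(x) = x^2 + x^4 (minimum at x* = 0; F'''(0) = 0, F''''(0) = 24, so r = 3).
Fp = lambda x: 2 * x + 4 * x ** 3    # F'
Fpp = lambda x: 2 + 12 * x ** 2      # F''

x = 0.1
iterates = [x]
for _ in range(2):
    x = x - Fp(x) / Fpp(x)
    iterates.append(x)

# |x_{k+1}| / |x_k|^3 should be close to ((r-1)/r!)|F''''(0)/F''(0)| = 4.
ratio = abs(iterates[2]) / abs(iterates[1]) ** 3
```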

Proof

Proof for the case that the second derivative is nonzero

Since the function is twice continuously differentiable around x*, there is a value δ > 0 such that f has a partial Taylor expansion around x* (going up to degree two) for x ∈ (x* − δ, x* + δ). Explicitly, if h = x − x*, we can write:

f(x) = f(x* + h) = f(x*) + h f'(x*) + (h^2/2) f''(x*) + o(h^2)

where the o(h^2) formally means that:

lim_{h → 0} [f(x* + h) − f(x*) − h f'(x*) − (h^2/2) f''(x*)] / h^2 = 0

Note that in case the function is three or more times differentiable, that term is actually O(h^3).
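This remainder behavior is easy to observe numerically; the following sketch uses f(x) = e^x − 1 (our own choice, with x* = 0 and f'(0) = f''(0) = 1), where the degree-two remainder divided by h^2 shrinks like h/6:

```python
# Sketch: the Taylor remainder f(h) - f(0) - h f'(0) - (h^2/2) f''(0),
# divided by h^2, tends to 0 as h -> 0. Here f(x) = e^x - 1, so the
# remainder is e^h - 1 - h - h^2/2 ~ h^3/6 (the O(h^3) behavior).
import math

def remainder_over_h2(h):
    # math.expm1(h) computes e^h - 1 accurately for small h
    return (math.expm1(h) - h - h ** 2 / 2) / h ** 2

# Ratios shrink roughly tenfold as h does, consistent with o(h^2).
ratios = [remainder_over_h2(10.0 ** -k) for k in (2, 3, 4)]
```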

Suppose x^(k), the k-th iterate of Newton's method, is in the interval (x* − δ, x* + δ). Define:


h^(k) = x^(k) − x*

We thus obtain:

f(x^(k)) = f(x*) + h^(k) f'(x*) + ((h^(k))^2/2) f''(x*) + o((h^(k))^2)

Since f(x*)=0, this simplifies to:

f(x^(k)) = h^(k) (f'(x*) + (h^(k)/2) f''(x*) + o(h^(k)))

We also have:

f'(x^(k)) = f'(x*) + h^(k) f''(x*) + o(h^(k))

We therefore obtain:

f(x^(k))/f'(x^(k)) = h^(k) · (1 + (h^(k)/2)(f''(x*)/f'(x*)) + o(h^(k))) / (1 + h^(k) (f''(x*)/f'(x*)) + o(h^(k)))

Taylor-expanding the reciprocal of the denominator and simplifying, we obtain:

f(x^(k))/f'(x^(k)) = h^(k) (1 − (h^(k)/2)(f''(x*)/f'(x*)) + o(h^(k)))

We have:

x^(k+1) = x^(k) − f(x^(k))/f'(x^(k))

Recall that h^(k) = x^(k) − x*. Similarly, h^(k+1) = x^(k+1) − x*. Thus:

h^(k+1) = h^(k) − h^(k) (1 − (h^(k)/2)(f''(x*)/f'(x*)) + o(h^(k)))

This simplifies to:

h^(k+1) = ((h^(k))^2/2)(f''(x*)/f'(x*)) + o((h^(k))^2)

Notice that if h^(k) is sufficiently small, then we are guaranteed that |h^(k+1)| < |h^(k)|, so that the corresponding statement holds for h^(k+1) as well, and all subsequent iterates remain in the interval. In particular, we obtain convergence, and moreover, we obtain that:

lim_{k → ∞} |x^(k+1) − x*| / |x^(k) − x*|^2 = lim_{k → ∞} |h^(k+1)| / |h^(k)|^2 = (1/2) |f''(x*)/f'(x*)|

Proof for the case that a higher finite order derivative is nonzero

Consider the case that f'(x*) ≠ 0, and, of the integers that are at least 2, r is the smallest one such that f^(r)(x*) ≠ 0. The proof remains very similar to the above, with the following changes.

We get:

f(x^(k)) = h^(k) (f'(x*) + ((h^(k))^(r−1)/r!) f^(r)(x*) + o((h^(k))^(r−1)))

and:

f'(x^(k)) = f'(x*) + ((h^(k))^(r−1)/(r − 1)!) f^(r)(x*) + o((h^(k))^(r−1))

Dividing and Taylor-expanding, we get:

f(x^(k))/f'(x^(k)) = h^(k) (1 + (1/r! − 1/(r − 1)!) (h^(k))^(r−1) (f^(r)(x*)/f'(x*)) + o((h^(k))^(r−1)))

Thus, we get:

h^(k+1) = h^(k) − f(x^(k))/f'(x^(k)) = ((r − 1)/r!) (h^(k))^r (f^(r)(x*)/f'(x*)) + o((h^(k))^r)

Dividing by (h^(k))^r and taking absolute values, we get:

|h^(k+1)/(h^(k))^r| = |((r − 1)/r!) (f^(r)(x*)/f'(x*)) + o(1)|

So, if we start from sufficiently close to the root, the limit as k → ∞ is:

((r − 1)/r!) |f^(r)(x*)/f'(x*)|