Broyden's method for root-finding for a vector-valued function of a vector variable

Definition

Suppose $x_{1}, x_{2}, \dots, x_{m}$ are variables and $F_{1}, F_{2}, \dots, F_{m}$ are real-valued functions of these variables, each of which is jointly differentiable with respect to the variables on a given domain. We can define the vector-valued function $\vec{F} (x_{1}, x_{2}, \dots, x_{m}) = (F_{1} (x_{1}, x_{2}, \dots, x_{m}), F_{2} (x_{1}, x_{2}, \dots, x_{m}), \dots, F_{m} (x_{1}, x_{2}, \dots, x_{m}))$ . If we wish, we can also treat the input variables as the coordinates of a vector $\vec{x} = (x_{1}, x_{2}, \dots, x_{m})$ . We therefore have a differentiable $\vec{F} : R^{m} \to R^{m}$ .

Broyden's method is a slight variant of Newton's method for root-finding for a vector-valued function of a vector variable. The main difference is that we do not compute the inverse of the Jacobian at each stage of the iteration. Instead, we compute it only once and use a rank one update at subsequent stages.

Iterative step

Recall that the iterative step for Newton's method is given by:

${\vec{x}}_{n} = {\vec{x}}_{n - 1} - (J (\vec{F}) ({\vec{x}}_{n - 1}))^{- 1} \vec{F} ({\vec{x}}_{n - 1})$
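For comparison with what follows, here is a minimal sketch of the Newton step above in the case $m = 2$, where the linear system can be solved by hand. The function names `newton_2d`, `F_example` and `J_example`, the test system, and the starting point are illustrative assumptions, not part of the method's definition.

```python
def newton_2d(F, J, x, tol=1e-12, max_iter=50):
    """Newton's method for F: R^2 -> R^2 (illustrative helper).

    F(x) returns a pair (F1, F2); J(x) returns the exact 2x2 Jacobian
    as ((a, b), (c, d)).
    """
    for _ in range(max_iter):
        f1, f2 = F(x)
        if max(abs(f1), abs(f2)) < tol:
            break
        (a, b), (c, d) = J(x)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J s = F using the explicit inverse of a 2x2 matrix.
        s1 = (d * f1 - b * f2) / det
        s2 = (a * f2 - c * f1) / det
        # Newton step: x_n = x_{n-1} - J^{-1} F(x_{n-1})
        x = (x[0] - s1, x[1] - s2)
    return x


# Assumed example system: x^2 + y^2 = 4 and x = y.
def F_example(x):
    return (x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1])


def J_example(x):
    return ((2.0 * x[0], 2.0 * x[1]), (1.0, -1.0))


newton_root = newton_2d(F_example, J_example, (1.0, 2.0))
```

Note that the Jacobian is re-evaluated and re-inverted on every pass through the loop; this is the cost that Broyden's method avoids.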

The idea behind Broyden's method is that, instead of recomputing $(J (\vec{F}) ({\vec{x}}_{n - 1}))^{- 1}$ from scratch at every iteration, we compute the Jacobian and its inverse only at the first step and then apply a rank one update at each later step. The matrix $J_{n - 1}$ (our approximation to the Jacobian at ${\vec{x}}_{n - 1}$) should satisfy the secant condition:

$J_{n - 1} ({\vec{x}}_{n - 1} - {\vec{x}}_{n - 2}) ≃ \vec{F} ({\vec{x}}_{n - 1}) - \vec{F} ({\vec{x}}_{n - 2})$

However, for $m > 1$ the equation above is underdetermined: there are many matrices that satisfy the condition. So, among all matrices satisfying the condition, we choose the one for which the Frobenius norm $\| J_{n - 1} - J_{n - 2} \|_{F}$ of the change is as small as possible. Solving this constrained minimization problem gives the formula:

$J_{n - 1} = J_{n - 2} + \frac{Δ {\vec{F}}_{n - 2} - J_{n - 2} Δ {\vec{x}}_{n - 2}}{| Δ {\vec{x}}_{n - 2} |^{2}} Δ {\vec{x}}_{n - 2}^{T}$ where

$Δ {\vec{x}}_{n - 2} = {\vec{x}}_{n - 1} - {\vec{x}}_{n - 2}$

$Δ {\vec{F}}_{n - 2} = \vec{F} ({\vec{x}}_{n - 1}) - \vec{F} ({\vec{x}}_{n - 2})$
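The following sketch checks numerically what the update is designed to do: the new matrix satisfies the secant condition exactly, while its action on vectors orthogonal to the step is unchanged. The helper name `broyden_update` and the sample numbers are assumptions chosen for illustration.

```python
def broyden_update(J, dx, dF):
    """Return J + ((dF - J dx) / |dx|^2) dx^T (illustrative helper).

    J is a square matrix stored as a list of rows; dx and dF are lists.
    """
    m = len(dx)
    J_dx = [sum(J[i][k] * dx[k] for k in range(m)) for i in range(m)]
    norm_sq = sum(d * d for d in dx)
    residual = [dF[i] - J_dx[i] for i in range(m)]
    # Add the outer product of (residual / |dx|^2) with dx.
    return [[J[i][j] + residual[i] * dx[j] / norm_sq for j in range(m)]
            for i in range(m)]


# Assumed sample data for the check.
J_old = [[2.0, 0.0], [0.0, 1.0]]
dx = [1.0, 2.0]
dF = [3.0, 1.0]
J_new = broyden_update(J_old, dx, dF)

# Secant condition: J_new applied to dx should reproduce dF.
J_new_dx = [sum(J_new[i][k] * dx[k] for k in range(2)) for i in range(2)]

# A vector orthogonal to dx: J_new and J_old should act on it identically.
v = [2.0, -1.0]
J_new_v = [sum(J_new[i][k] * v[k] for k in range(2)) for i in range(2)]
J_old_v = [sum(J_old[i][k] * v[k] for k in range(2)) for i in range(2)]
```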

Note that the update to the Jacobian is a rank-one update, i.e., $J_{n - 1} - J_{n - 2}$ is a rank one matrix (arising as an outer product of two vectors). Therefore, we can use the Sherman-Morrison formula to calculate the inverse of the matrix:

$J_{n - 1}^{- 1} = J_{n - 2}^{- 1} + \frac{Δ {\vec{x}}_{n - 2} - J_{n - 2}^{- 1} Δ {\vec{F}}_{n - 2}}{Δ {\vec{x}}_{n - 2}^{T} J_{n - 2}^{- 1} Δ {\vec{F}}_{n - 2}} (Δ {\vec{x}}_{n - 2}^{T} J_{n - 2}^{- 1})$

In words, we are updating the Jacobian along the direction of the line joining ${\vec{x}}_{n - 2}$ and ${\vec{x}}_{n - 1}$ while trying to change it as little as possible overall.
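Putting the pieces together, here is a minimal sketch of the full iteration that carries the inverse Jacobian forward via the Sherman-Morrison update, so that no matrix is inverted after the first step. The names `broyden`, `mat_vec`, `vec_mat` and `F_example`, the tolerance, the test system, and the hand-computed initial inverse are all assumptions made for illustration.

```python
def mat_vec(A, v):
    """Matrix-vector product A v, with A stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def vec_mat(v, A):
    """Row-vector-matrix product v^T A."""
    return [sum(v[i] * A[i][j] for i in range(len(v)))
            for j in range(len(A[0]))]


def broyden(F, x0, J0_inv, tol=1e-10, max_iter=50):
    """Broyden iteration using the inverse update (illustrative helper).

    J0_inv is the inverse of the Jacobian at x0, computed once up front.
    """
    x, Fx = list(x0), F(x0)
    J_inv = [row[:] for row in J0_inv]
    m = len(x)
    for _ in range(max_iter):
        if max(abs(c) for c in Fx) < tol:
            break
        # Quasi-Newton step with the current approximate inverse.
        step = mat_vec(J_inv, Fx)
        x_new = [xi - si for xi, si in zip(x, step)]
        F_new = F(x_new)
        dx = [a - b for a, b in zip(x_new, x)]
        dF = [a - b for a, b in zip(F_new, Fx)]
        x, Fx = x_new, F_new
        # Sherman-Morrison update of the inverse.
        J_inv_dF = mat_vec(J_inv, dF)
        denom = sum(a * b for a, b in zip(dx, J_inv_dF))
        if denom == 0.0:
            break  # update undefined; stop with the current iterate
        u = [(a - b) / denom for a, b in zip(dx, J_inv_dF)]
        w = vec_mat(dx, J_inv)
        J_inv = [[J_inv[i][j] + u[i] * w[j] for j in range(m)]
                 for i in range(m)]
    return x


# Assumed example system: x^2 + y^2 = 4 and x = y.
def F_example(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]


# Inverse of the exact Jacobian [[2, 4], [1, -1]] at the start point (1, 2).
J0_inv = [[1.0 / 6.0, 2.0 / 3.0], [1.0 / 6.0, -1.0 / 3.0]]
broyden_root = broyden(F_example, [1.0, 2.0], J0_inv)
```

Each iteration costs only matrix-vector products and one outer product, rather than a fresh Jacobian evaluation and inversion. A production implementation would typically add safeguards (for example a line search or a restart when the denominator is very small) that this sketch omits.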

Extreme cases

In the case $m = 1$ , so that we are dealing with functions of one variable, Broyden's method reduces to the usual secant method. Explicitly, the formula becomes:

$x_{n} := x_{n - 1} - \frac{f (x_{n - 1})}{\frac{f (x_{n - 1}) - f (x_{n - 2})}{x_{n - 1} - x_{n - 2}}}$

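This simplifies to the usual secant method formula $x_{n} := x_{n - 1} - f (x_{n - 1}) \frac{x_{n - 1} - x_{n - 2}}{f (x_{n - 1}) - f (x_{n - 2})}$ , because for $m = 1$ the secant condition determines the $1 \times 1$ matrix $J_{n - 1}$ uniquely as the slope of the secant line.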
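As a small illustration of the one-variable case, the sketch below implements the secant iteration directly. The function name `secant`, the tolerance, and the test function $f(x) = x^{2} - 2$ with starting points 1 and 2 are assumptions chosen for the example.

```python
def secant(f, x_prev, x_curr, tol=1e-12, max_iter=50):
    """Secant method for a scalar function f (illustrative helper)."""
    f_prev, f_curr = f(x_prev), f(x_curr)
    for _ in range(max_iter):
        # Stop on convergence, or if the secant line is horizontal.
        if abs(f_curr) < tol or f_curr == f_prev:
            break
        # The slope plays the role of the 1x1 Jacobian approximation.
        slope = (f_curr - f_prev) / (x_curr - x_prev)
        x_next = x_curr - f_curr / slope
        x_prev, f_prev = x_curr, f_curr
        x_curr, f_curr = x_next, f(x_next)
    return x_curr


# Assumed example: the positive root of x^2 - 2.
secant_root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
```

Unlike the general case, two starting points stand in for the initial Jacobian here: the first slope is obtained from them rather than from a derivative.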