# Bisection method

## Definition

The bisection method, also called the interval halving method, binary search method, and dichotomy method, is a root-finding algorithm.

## Summary

| Item | Value |
| --- | --- |
| Initial condition | Works for a continuous function $f$ (or more generally, a function $f$ satisfying the intermediate value property) on an interval $[a,b]$, given that $f(a)$ and $f(b)$ have opposite signs. |
| Iterative step | At stage $i$, we know that $f$ has a root in an interval $[a_i,b_i]$. We test $f$ at the midpoint $(a_i + b_i)/2$. We then define the new interval $[a_{i+1},b_{i+1}]$ as the left half $[a_i, (a_i+b_i)/2]$ if the signs of $f$ at $a_i$ and $(a_i + b_i)/2$ oppose one another, and as the right half otherwise. |
| Convergence rate | The size of the interval within which we are guaranteed to have a root halves at each step. The distance between the root and our "best guess" at any stage (say, the midpoint of the guaranteed interval) has an upper bound that halves at each step, so we have linear convergence. |
| Computational tools needed | Floating-point arithmetic to compute averages; the ability to compute the value of a function at a point or, more minimally, to determine whether the value is positive or negative. |
| Termination stage | We may terminate the algorithm based either on the size of the interval in the domain (i.e., we know that we are close to a root) or on the closeness of the function value to zero. How we terminate depends on our goals. |

## Initial condition

The bisection method works for a continuous function $f$ (or more generally, a function $f$ satisfying the intermediate value property) on an interval $[a,b]$ given that $f(a)$ and $f(b)$ have opposite signs.

The bisection method can be used to find a root of a continuous function on a connected interval if we are able to locate two points in the domain of the function where it has opposite signs. We simply restrict the function to that domain and apply the method.

## Iterative step

Let $a_0 = a, b_0 = b$.

At stage $n$, for $n$ a positive integer, we proceed as follows.

Prior knowledge:

• $f(a_{n-1})$ and $f(b_{n-1})$ are both numerically distinguishable from zero (so they have defined signs, positive or negative) and they have opposite signs.
• Combining that with the fact that $f$ is a continuous function, the intermediate value theorem tells us that $f$ has a root on $[a_{n-1},b_{n-1}]$.

Iterative step:

• Compute $f((a_{n - 1} + b_{n - 1})/2)$.
• If it is equal to (or numerically indistinguishable from) zero, then return $(a_{n-1} + b_{n-1})/2$ as the root and terminate the algorithm.
• If $f((a_{n-1} + b_{n-1})/2)$ has sign opposite to $f(a_{n-1})$ (and therefore the same as $f(b_{n-1})$), then choose $a_n = a_{n - 1}$ and $b_n = (a_{n-1} + b_{n-1})/2$, so the new interval (for the next iteration) is $[a_n,b_n] = [a_{n-1}, (a_{n-1}+b_{n-1})/2]$.
• If $f((a_{n-1} + b_{n-1})/2)$ has sign opposite to $f(b_{n-1})$ (and therefore the same as $f(a_{n-1})$), then choose $a_n = (a_{n-1} + b_{n-1})/2$ and $b_n = b_{n-1}$, so the new interval (for the next iteration) is $[a_n,b_n] = [(a_{n-1}+b_{n-1})/2, b_{n-1}]$.
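The iterative step above can be sketched in code. This is a minimal illustration, not a production implementation; the function name `bisect` and the tolerance and iteration-cap parameters are choices made here for the example.

```python
# Minimal sketch of the bisection iteration described above.
def bisect(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a root of f on [a, b], assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        m = (a + b) / 2
        fm = f(m)
        if fm == 0 or (b - a) / 2 < tol:
            return m                 # midpoint is (close enough to) a root
        if fa * fm < 0:              # sign change on [a, m]: keep the left half
            b, fb = m, fm
        else:                        # sign change on [m, b]: keep the right half
            a, fa = m, fm
    return (a + b) / 2

root = bisect(lambda x: x * x - 2, 0.0, 2.0)  # approximates sqrt(2)
```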

## Convergence rate

### Rate of decline in guaranteed interval size

We have the relationship:

$b_n - a_n= \frac{b_{n-1} - a_{n-1}}{2}$

Thus, we obtain that:

$b_n - a_n = \frac{b_0 - a_0}{2^n} = \frac{b - a}{2^n}$

In particular, each interval is half the size of the preceding one (a fixed ratio of proportionality of 1/2), and the interval size decays exponentially with $n$. This fits the paradigm of linear convergence.
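The halving relationship can be verified numerically. The helper below and the test function $x^2 - 2$ are illustrative choices for this sketch; the widths come out exactly because halving is exact in binary floating point.

```python
# Record the nested intervals and check that b_n - a_n = (b - a) / 2**n.
def bisection_intervals(f, a, b, steps):
    intervals = [(a, b)]
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:   # sign change on the left half
            b = m
        else:                  # sign change on the right half
            a = m
        intervals.append((a, b))
    return intervals

ivals = bisection_intervals(lambda x: x * x - 2, 0.0, 2.0, 10)
widths = [hi - lo for lo, hi in ivals]
```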

### Convergence rate for best guess point estimate

At stage $n$, we can define our best guess $c_n$ as the midpoint $(a_n + b_n)/2$.

The sequence $(c_n)$ converges to a root, say $\alpha$. We need to determine the rate of convergence.

The convergence is not necessarily steady: some steps could get us closer to the root than others, and some steps may move us farther from the root in absolute terms. Therefore, the appropriate notion of convergence rate should rely on the R-convergence idea: bound the error from above by a known sequence:

We know that:

$\alpha \in (a_n,b_n), \mbox{ i.e., } a_n < \alpha < b_n$

Since $c_n$ is the midpoint of this interval, it follows that:

$|c_n - \alpha| < \frac{b_n - a_n}{2}$

Thus, we can define a sequence:

$\varepsilon_n = \frac{b_n - a_n}{2} = \frac{b - a}{2^{n+1}}$

Clearly, the sequence $\varepsilon_n$ has linear convergence to zero, therefore the original sequence of points has R-linear convergence to $\alpha$.
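The bound $|c_n - \alpha| < (b - a)/2^{n+1}$ can be checked numerically. The example function $f(x) = x^2 - 2$ on $[0, 2]$, with root $\sqrt{2}$, is an illustrative choice; the small slack term absorbs floating-point rounding.

```python
import math

# Check the R-linear error bound |c_n - alpha| < (b - a) / 2**(n + 1).
f = lambda x: x * x - 2
a, b = 0.0, 2.0
alpha = math.sqrt(2)
width0 = b - a
for n in range(40):
    c = (a + b) / 2                                     # best guess c_n at stage n
    assert abs(c - alpha) < width0 / 2 ** (n + 1) + 1e-15
    if f(a) * f(c) <= 0:                                # root lies in the left half
        b = c
    else:                                               # root lies in the right half
        a = c
```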

## Computational tools needed

• We need to be able to compute halves of intervals. This is most easily done using floating-point binary arithmetic.
• We also need to be able to compute function values at particular points, and determine the signs of these values. Note that we do not care about computing the function value per se but we do need reliable information about its size.

## Termination

### Domain-based termination

We may terminate the algorithm when the size of the interval within which a root is guaranteed has fallen below a certain pre-specified length $\ell$. The number of steps for such termination can be predicted in advance as:

$\left\lceil \log_2 \left(\frac{|b - a|}{\ell}\right)\right\rceil$

At this stage, we may return the interval or the midpoint, depending on the desired format of the answer.
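The predicted step count can be confirmed in code. This is a sketch with an illustrative target length and test function; the step counter simply tallies halvings until the interval drops below $\ell$.

```python
import math

# Domain-based termination: halve until the bracketing interval is shorter
# than ell, then compare the step count with ceil(log2((b - a) / ell)).
def bisect_to_length(f, a, b, ell):
    steps = 0
    while b - a >= ell:
        m = (a + b) / 2
        if f(a) * f(m) <= 0:   # sign change on the left half
            b = m
        else:                  # sign change on the right half
            a = m
        steps += 1
    return (a + b) / 2, steps

a0, b0, ell = 0.0, 2.0, 1e-6
midpoint, steps = bisect_to_length(lambda x: x * x - 2, a0, b0, ell)
predicted = math.ceil(math.log2((b0 - a0) / ell))
```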

### Output-based termination

We may terminate the algorithm once the value of the function at the midpoint is sufficiently close to zero.

The number of stages needed for such termination cannot be computed simply by knowing the length of the domain. However, if $f$ is differentiable and we have an upper bound $B$ on the magnitude of the derivative of $f$, then we know that if we are within $\varepsilon/B$ distance of the root on the domain, the absolute value of the function value is at most $\varepsilon$. We can therefore put an upper bound on the number of steps necessary.

$\left\lceil \log_2 \left(\frac{|b - a|B}{\varepsilon}\right)\right\rceil$
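This bound can be illustrated with a concrete choice: for $f(x) = x^2 - 2$ on $[0, 2]$ we have $|f'(x)| = |2x| \le 4$, so we may take $B = 4$. This is a sketch under those assumptions, not a general-purpose routine.

```python
import math

# After ceil(log2((b - a) * B / eps)) halvings, the midpoint is within
# eps / B of the root, hence |f(midpoint)| <= eps.
a, b, B, eps = 0.0, 2.0, 4.0, 1e-8
f = lambda x: x * x - 2
n_max = math.ceil(math.log2((b - a) * B / eps))
for _ in range(n_max):
    m = (a + b) / 2
    if f(a) * f(m) <= 0:   # sign change on the left half
        b = m
    else:                  # sign change on the right half
        a = m
residual = abs(f((a + b) / 2))
```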

## Selection of root found and sensitivity to interval choice

### This finds only one root, not all roots

It's worth noting that this process finds only one root, not all roots. Consider, for instance, the function:

$f(x) := (x - 1)(x - 4)(x - 5)$

on the interval $[0,6]$. This is negative at 0 and positive at 6. At the midpoint 3, it is positive, so our first iteration picks the left half of the interval, namely $[0,3]$. The process then gradually converges to the root 1, and the roots 4 and 5 are missed. They are missed because an even number of roots lie in the discarded interval $[3,6]$, so the function has the same sign at both of its endpoints.
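The example can be run directly. This sketch hard-codes the function and interval from the discussion above.

```python
# Bisection on f(x) = (x - 1)(x - 4)(x - 5) over [0, 6]: the first halving
# keeps [0, 3], so the method converges to the root 1 and never sees 4 or 5.
f = lambda x: (x - 1) * (x - 4) * (x - 5)
a, b = 0.0, 6.0
for _ in range(50):
    m = (a + b) / 2
    if f(a) * f(m) <= 0:   # sign change on the left half
        b = m
    else:                  # sign change on the right half
        a = m
root = (a + b) / 2
```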

### The root found may transition as we move either endpoint of the interval

Consider a function:

$g(x) := (x - 1)(x - 3)(x - 5)$

Suppose we apply the bisection method to determine a root of $g$ on the interval $[0,a]$ where $a > 5$. Note that the sign of $g$ is negative at 0 and positive at $a$, so the bisection method is applicable. However, what root it converges to depends on the value of $a$. Explicitly:

• If $5 < a < 6$, then the first iteration yields the interval $[0,a/2]$, with $a/2 < 3$, and therefore we converge to the root 1 (which is the only root in the interval).
• If $6 < a < 10$, then the first iteration yields the interval $[a/2,a]$, with $a/2 > 3$, and therefore we converge to the root 5 (which is the only root in the interval).
• More generally, if $5 \cdot 2^n < a < 6 \cdot 2^n$ for $n$ a nonnegative integer, we converge to the root 1. On the other hand, if $6 \cdot 2^n < a < 10 \cdot 2^n$ for $n$ a nonnegative integer, we converge to the root 5.
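The case analysis above can be spot-checked with representative values of $a$ from each range. The helper name `bisect_root` is a choice made for this sketch.

```python
# Which root bisection finds for g(x) = (x - 1)(x - 3)(x - 5) on [0, a]
# depends on a; check one representative a from each range above.
def bisect_root(g, lo, hi, steps=60):
    for _ in range(steps):
        m = (lo + hi) / 2
        if g(lo) * g(m) <= 0:   # sign change on the left half
            hi = m
        else:                   # sign change on the right half
            lo = m
    return (lo + hi) / 2

g = lambda x: (x - 1) * (x - 3) * (x - 5)
r1 = bisect_root(g, 0.0, 5.5)    # 5 < a < 6   -> root 1
r2 = bisect_root(g, 0.0, 7.0)    # 6 < a < 10  -> root 5
r3 = bisect_root(g, 0.0, 11.0)   # 10 < a < 12 -> root 1
```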

### The process sacrifices smart use of information about the function for a guaranteed convergence rate

By always picking the midpoint, we may be ignoring valuable information about just how far from zero the values of the function at the endpoints are. There are other methods that are better suited to making use of this information, such as the false position (regula falsi) method, the secant method, and Newton's method. However, these methods either work only for some functions or take more of a "gamble": they could go more horribly wrong.

## Sensitivity to the form in which the function is presented

### The qualitative behavior depends purely on the signum of the function

In order to determine how the bisection method works for a particular function $f$, it suffices to know the function $\operatorname{sgn} \circ f$, i.e., the composite of the signum function and $f$. Explicitly, the function that predicts the way the bisection method will unfold is the function:

$(\operatorname{sgn} \circ f)(x) := \left \lbrace \begin{array}{rl} 1, & \mbox{ if } f(x) > 0 \\ -1, & \mbox{ if } f(x) < 0 \\ 0, & \mbox{ if } f(x) = 0\\\end{array}\right.$

Further, the behavior is also invariant under the flipping of all signs. So it depends only on the signum of the function, up to a global sign.

From the computational perspective, there is an important caveat to the above: what matters for the signum function is not whether the actual value of $f$ is positive, negative, or zero, but rather, whether the value as computed is definitely positive, definitely negative, or numerically indistinguishable from zero.

Modulo this computational caveat, two functions that are positive at the same places and negative at the same places would exhibit the same behavior with respect to the bisection method. In particular, if $q(x) = (p(x))^3$, $p$ and $q$ behave identically under the bisection method.
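This can be demonstrated by comparing the nested intervals produced by a function and its cube. The functions $p(x) = x - 2$ and $q(x) = (x-2)^3$ are illustrative choices with the same sign pattern.

```python
# Two functions with the same signum drive bisection identically:
# p(x) = x - 2 and q(x) = (x - 2)**3 produce the same nested intervals.
def intervals(f, a, b, steps=20):
    out = [(a, b)]
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:   # sign change on the left half
            b = m
        else:                  # sign change on the right half
            a = m
        out.append((a, b))
    return out

p = lambda x: x - 2
q = lambda x: (x - 2) ** 3
same = intervals(p, 0.0, 3.0) == intervals(q, 0.0, 3.0)
```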

### Invariance under scalar multiplication

The qualitative behavior of the bisection method (and in particular, the sequence of nested intervals obtained while running the method) is invariant under scalar multiplication of the function by a constant. In other words, the functions $f$ and $x \mapsto \lambda f(x)$, for a constant $\lambda \ne 0$, behave the same way. This is modulo the caveat about numerical precision. Note, however, that given a particular function, multiplying it by a constant can never improve the precision (because we cannot get more information than already exists) so the key point remains that replacing the function by a scalar multiple doesn't help.
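The invariance under scaling can also be checked directly, including for a negative constant, since the sign test multiplies two function values and the factors of $\lambda$ cancel in sign. The scale factor $-1000$ here is an arbitrary illustrative choice.

```python
# f and lambda * f trace out the same bisection trajectory.
f = lambda x: x * x - 2
g = lambda x: -1000.0 * (x * x - 2)

def run(h, a, b, steps=20):
    path = []
    for _ in range(steps):
        m = (a + b) / 2
        if h(a) * h(m) <= 0:   # sign change on the left half
            b = m
        else:                  # sign change on the right half
            a = m
        path.append((a, b))
    return path

same_path = run(f, 0.0, 2.0) == run(g, 0.0, 2.0)
```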

### Covariance under affine transformations on the domain

Consider a function $f$ and the function $x \mapsto f(mx + c)$ for $m \ne 0$ and $c \in \mathbb{R}$. The functions behave the same way with respect to the bisection method, provided that we transform the interval accordingly: the nested intervals for the second function have endpoints of the form $(x - c)/m$, where $x$ is the corresponding interval endpoint for $f$. Note that in the case that $m$ is negative, the roles of left and right endpoints also get switched.
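A numerical sketch of this covariance, using the illustrative choices $f(x) = x^2 - 2$, $m = 3$, $c = 1$ (so $m > 0$ and no left/right swap occurs): running bisection on $h(x) = f(3x + 1)$ over the preimage interval tracks the run of $f$, up to the affine change of variable.

```python
# Covariance under an affine change of variable on the domain.
f = lambda x: x * x - 2           # root sqrt(2) on [0, 2]
m_, c_ = 3.0, 1.0                 # illustrative affine parameters
h = lambda x: f(m_ * x + c_)      # root at (sqrt(2) - 1) / 3

def bisect_mid(fun, a, b, steps=40):
    for _ in range(steps):
        mid = (a + b) / 2
        if fun(a) * fun(mid) <= 0:   # sign change on the left half
            b = mid
        else:                        # sign change on the right half
            a = mid
    return (a + b) / 2

r_f = bisect_mid(f, 0.0, 2.0)
r_h = bisect_mid(h, (0.0 - c_) / m_, (2.0 - c_) / m_)  # preimage interval
```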

## The case of polynomials

### Polynomials with distinct roots

Consider a polynomial of the form:

$f(x) = g(x)(x - \alpha_1)(x - \alpha_2) \dots (x - \alpha_n)$

on an interval $[a,b]$ where $g$ has constant sign on the interval (and in particular, has no root) and $a <\alpha_1 < \alpha_2 < \dots < \alpha_n < b$. Also assume that $n$ is odd. Thus, the bisection method can be applied to the function $f$.

In order to determine the value of $i$ for which the bisection method converges to $\alpha_i$, it suffices to know the values:

$\left\{ \frac{\alpha_1 - a}{b - a}, \frac{\alpha_2 - a}{b - a}, \dots, \frac{\alpha_n - a}{b - a}\right\}$

Moreover, one of the following must be true for the $\alpha_i$ to which the bisection method converges:

• The quotient $\frac{\alpha_i - a}{b - a}$ is a dyadic rational, i.e., it has the form $t/2^s$ for an integer $t$ and positive integer $s$, and the bisection method terminates in finitely many steps at precisely $\alpha_i$.
• $i$ is odd, so that there are an even number of roots in the interval $(a,\alpha_i)$ and an even number of roots in the interval $(\alpha_i,b)$, and the bisection method converges to the root $\alpha_i$ but does not reach it in finitely many steps. The rationale for this case is that every time we narrow the interval, we discard either an even number of roots on the left or an even number of roots on the right.
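The dyadic-rational case can be seen with a small example of my choosing: for $f(x) = x - 3/2$ on $[0, 2]$, the ratio $(\alpha - a)/(b - a) = 3/4$ is dyadic, and the second midpoint is exactly the root, so the method terminates finitely.

```python
# When (alpha - a)/(b - a) is a dyadic rational, bisection can terminate
# exactly: here alpha = 1.5 on [0, 2], ratio 3/4 = 3 / 2**2.
f = lambda x: x - 1.5
a, b = 0.0, 2.0
root = None
for n in range(10):
    m = (a + b) / 2
    if f(m) == 0:          # midpoint is exactly the root: terminate
        root = m
        break
    if f(a) * f(m) < 0:    # sign change on the left half
        b = m
    else:                  # sign change on the right half
        a = m
```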

### Polynomials with some root repetition

Consider a polynomial of the form:

$f(x) = g(x)(x - \alpha_1)^{m_1}(x - \alpha_2)^{m_2} \dots (x - \alpha_n)^{m_n}$

on an interval $[a,b]$ where $g$ has constant sign on the interval (and in particular, has no root) and $a <\alpha_1 < \alpha_2 < \dots < \alpha_n < b$. Further, assume that $f(a)$ and $f(b)$ have opposite signs. This is equivalent to the assumption that $\sum_{j=1}^n m_j$ is odd, or equivalently, that the number of $m_j$s that are odd is odd.

It is still the case that the behavior can be predicted by knowledge of:

$\left\{ \frac{\alpha_1 - a}{b - a}, \frac{\alpha_2 - a}{b - a}, \dots, \frac{\alpha_n - a}{b - a}\right\}$

along with knowledge of the tuple $(m_1,m_2,\dots,m_n)$. Moreover, one of the following must be true for the $\alpha_i$ to which the process converges:

• The quotient $\frac{\alpha_i - a}{b - a}$ is a dyadic rational, i.e., it has the form $t/2^s$ for an integer $t$ and positive integer $s$, and the bisection method terminates in finitely many steps at precisely $\alpha_i$.
• $m_i$ is odd, and the number of $m_j$s, $j < i$, with $m_j$ odd, is even (the latter is equivalent to requiring that the number of $m_j$s, $j > i$, with $m_j$ odd, is even) and the bisection method converges to the root but does not terminate at it in finitely many steps.

Further, note the following: with the exception of the situation where the bisection method terminates in finitely many steps (which can happen only if the relevant quotient $(\alpha_i - a)/(b - a)$ is a dyadic rational), what matters is only the set of values $\alpha_j$ for which $m_j$ is odd. We do not need to know the values of the $\alpha_j$s for which $m_j$ is even, nor do we need to know the values of the odd $m_j$s themselves.