Bisection method: Difference between revisions
No edit summary |
|||
Line 103: | Line 103: | ||
<math>(\operatorname{sgn} \circ f)(x) := \left \lbrace \begin{array}{rl} 1, & \mbox{ if } f(x) > 0 \\ -1, & \mbox{ if } f(x) < 0 \\ 0, & \mbox{ if } f(x) = 0\\\end{array}\right.</math> | <math>(\operatorname{sgn} \circ f)(x) := \left \lbrace \begin{array}{rl} 1, & \mbox{ if } f(x) > 0 \\ -1, & \mbox{ if } f(x) < 0 \\ 0, & \mbox{ if } f(x) = 0\\\end{array}\right.</math> | ||
From the computational perspective, there is an important caveat to the above: what matters for the signum function is not whether the actual value of <math>f</math> is positive, negative, or zero, but rather, whether the value as computed is definitely positive, definitely negative, or numerically indistinguishable from zero. | |||
Modulo this computational caveat, two functions that are positive at the same places and negative at the same places would exhibit the same behavior with respect to the bisection method. In particular, if <math>q(x) = (p(x))^3</math>, <math>p</math> and <math>q</math> behave identically under the bisection method. | |||
===The process sacrifices smart use of information about the function for a guaranteed convergence rate=== | ===The process sacrifices smart use of information about the function for a guaranteed convergence rate=== |
Revision as of 04:24, 26 April 2014
This article is about a root-finding algorithm. See all root-finding algorithms
Definition
The bisection method, also called the interval halving method, binary search method, and dichotomy method, is a root-finding algorithm.
Summary
Item | Value |
---|---|
Initial condition | works for a continuous function (or more generally, a function satisfying the intermediate value property) on an interval given that and have opposite signs. |
Iterative step | At stage , we know that has a root in an interval . We test at . We then define the new interval as the left half if the signs of at and oppose one another, and as the right half otherwise. |
Convergence rate | The size of the interval within which we are guaranteed to have a root halves at each step. The distance between the root and our "best guess" at any stage (say, the midpoint of the guaranteed interval) has an upper bound that halves at each step, so we have linear convergence. |
Computational tools needed | Floating-point arithmetic to compute averages Ability to compute the value of a function at a point, or more minimalistically, determine whether the value is positive or negative. |
Termination stage | We may terminate the algorithm based either on the size of the interval in the domain (i.e., we know that we are close to a root) or the closeness of the function value to zero. How we terminate depends on our goals. |
Initial condition
The bisection method works for a continuous function (or more generally, a function satisfying the intermediate value property) on an interval given that and have opposite signs.
The bisection method can be used to find a root of a continuous function on a connected interval if we are able to locate two points in the domain of the function where it has opposite signs. We simply restrict the function to that domain and apply the method.
Iterative step
Let .
At stage for a nonnegative integer.
Prior knowledge:
- and are both numerically distinguishable from zero (so they have defined signs, positive or negative) and they have opposite signs.
- Combining that with the fact that is a continuous function, the intermediate value theorem tells us that has a root on .
Iterative step:
- Compute .
- If it is equal to (or numerically indistinguishable from) zero, then return as the root and terminate the algorithm.
- If has sign opposite to (and therefore the same as ), then choose and , so the new interval (for the next iteration) is .
- If has sign opposite to (and therefore the same as ), the nchoose and , so the new interval (for the next iteration) is .
Convergence rate
At stage , we can define our best guess as the midpoint .
We can measure the convergence rate by looking at the size of the interval within which a root is guaranteed, and how this changes with time. We notice that the interval size halves at each stage, so that . The distance between and the location of an actual root is bounded from above by half the interval length, so this also asymptotically halves with each iteration (at worst, and on average).
This is a form of linear convergence.
Computational tools needed
- We need to be able to compute halves of intervals. This is most easily done using floating-point binary arithmetic.
- We also need to be able to compute function values at particular points, and determine the signs of these values. Note that we do not care about computing the function value per se but we do need reliable information about its size.
Termination
Domain-based termination
We may terminate the algorithm when the size of the interval within which a root is guaranteed has fallen below a certain pre-specified length . The number of steps for such termination can be predicted in advance as:
At this stage, we may return the interval or the midpoint, depending on the desired format of the answer.
Output-based termination
We may terminate the algorithm once the value of the function at the midpoint is sufficiently close to zero.
The number of stages needed for such termination cannot be computed simply by knowing the length of the domain. However, if is differentiable and we have an upper bound on the magnitude of the derivative of , then we know that if we are within distance of the root on the domain, the absolute value of the function value is at most . We can therefore put an upper bound on the number of steps necessary.
Selection of root found and sensitivity to interval choice
This finds only one root, not all roots
It's worth noting that this process finds only one root, not all roots. Consider, for instance, the function:
on the interval . This is negative at 0 and positive at 6. At the midpoint 3, it is positive, so our first iteration picks the left half of the interval, namely . The process will then gradually converge to the root 1. The roots 4 and 5 get missed out. The reason they get missed out is that because an even number of them appeared in the test interval , we had the same sign of the function at both ends of the interval.
The root found may transition as we move either endpoint of the interval
Consider a function:
Suppose we apply the bisection method to determine a root of on the interval where . Note that the sign of is negative at 0 and positive at , so the bisection method is applicable. However, what root it converges to depends on the value of . Explicitly:
- If , then the first iteration yields the interval , with , and therefore we converge to the root 1 (which is the only root in the interval).
- If , then the first iteration yields the interval , with , and therefore we converge to the root 5 (which is the only root in the interval).
- More generally, if for a nonnegative integer, we converge to the root 1. On the other hand, if for a nonnegative integer, we converge to the root 5.
The qualitative behavior depends purely on the signum of the function
In order to determine how the bisection method works for a particular function , it suffices to know the function , i.e., the composite of the signum function and . Explicitly, the function that predicts the way the bisection method will unfold is the function:
From the computational perspective, there is an important caveat to the above: what matters for the signum function is not whether the actual value of is positive, negative, or zero, but rather, whether the value as computed is definitely positive, definitely negative, or numerically indistinguishable from zero.
Modulo this computational caveat, two functions that are positive at the same places and negative at the same places would exhibit the same behavior with respect to the bisection method. In particular, if , and behave identically under the bisection method.
The process sacrifices smart use of information about the function for a guaranteed convergence rate
By always picking the midpoint, we may be ignoring valuable information about just how far from zero the values of the functions at the endpoints are. There are some other methods that are better suited to making use of this information. However, these methods either work only for some functions or take more of a "gamble": they could go more horribly wrong. The methods include: