Numerical differentiation

Definition

Numerical differentiation refers to a method for computing the approximate numerical value of the derivative of a function at a point in its domain as a difference quotient. Explicitly, the numerical derivative of a function $f$ at a point $x_0$ may be computed using any of these three formulas, for a sufficiently small positive real number $h$:

| Expression | Interpretation of limit as $h \to 0^+$ |
| --- | --- |
| Forward difference quotient $\frac{f(x_0 + h) - f(x_0)}{h}$, comes from the forward difference form of the finite difference | The right-hand derivative $f'(x_0^+)$. If $f$ is differentiable at $x_0$, this equals the two-sided derivative $f'(x_0)$. |
| Backward difference quotient $\frac{f(x_0) - f(x_0 - h)}{h}$, comes from the backward difference form of the finite difference | The left-hand derivative $f'(x_0^-)$. If $f$ is differentiable at $x_0$, this equals the two-sided derivative $f'(x_0)$. |
| Central difference quotient $\frac{f(x_0 + h) - f(x_0 - h)}{2h}$, comes from the central difference form of the finite difference | If $f$ is differentiable at $x_0$, this equals the two-sided derivative $f'(x_0)$. Otherwise, however, it does not have any direct interpretation as a one-sided derivative of $f$. |
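
As a concrete illustration, here is a minimal Python sketch of the three difference quotients (the helper names are illustrative choices, not standard library functions; `f` can be any real-valued callable):

```python
import math

def forward_difference(f, x0, h):
    """Forward difference quotient: approximates the right-hand derivative."""
    return (f(x0 + h) - f(x0)) / h

def backward_difference(f, x0, h):
    """Backward difference quotient: approximates the left-hand derivative."""
    return (f(x0) - f(x0 - h)) / h

def central_difference(f, x0, h):
    """Central difference quotient: approximates the two-sided derivative."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

# Example: derivative of sin at 1; the exact value is cos(1) = 0.5403...
print(forward_difference(math.sin, 1.0, 1e-5))
print(backward_difference(math.sin, 1.0, 1e-5))
print(central_difference(math.sin, 1.0, 1e-5))
```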

Relative precision of the formulas

The central difference quotient provides substantially greater precision but requires somewhat more computation

Suppose that $f$ has a Taylor series around $x_0$. In other words, we can expand:

$$f(x_0 + h) = f(x_0) + f'(x_0)h + \frac{f''(x_0)}{2!}h^2 + \frac{f'''(x_0)}{3!}h^3 + \cdots$$

In this case, the three methods for approximating the derivative give us the following results:

| Method | Computed approximate value (Taylor expansion in $h$) | Error term (computed value minus actual value $f'(x_0)$) | Order of convergence (smallest exponent on $h$ with nonzero coefficient in Taylor expansion of error; higher order is better) |
| --- | --- | --- | --- |
| Forward difference quotient | $f'(x_0) + \frac{f''(x_0)}{2}h + \frac{f'''(x_0)}{6}h^2 + \cdots$ | $\frac{f''(x_0)}{2}h + \frac{f'''(x_0)}{6}h^2 + \cdots$ | 1 |
| Backward difference quotient | $f'(x_0) - \frac{f''(x_0)}{2}h + \frac{f'''(x_0)}{6}h^2 - \cdots$ | $-\frac{f''(x_0)}{2}h + \frac{f'''(x_0)}{6}h^2 - \cdots$ | 1 |
| Central difference quotient | $f'(x_0) + \frac{f'''(x_0)}{6}h^2 + \cdots$ | $\frac{f'''(x_0)}{6}h^2 + \cdots$ | 2 |

We therefore see that the central difference quotient computes a substantially more precise value for the derivative.
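
This can be checked empirically. Below is a short sketch (standard library only) that prints the error of the forward and central difference quotients for shrinking $h$: the forward error should shrink by roughly a factor of 10 per step (order 1), the central error by roughly a factor of 100 (order 2).

```python
import math

f, df, x0 = math.exp, math.exp, 1.0  # exp is its own derivative

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    fwd_err = abs((f(x0 + h) - f(x0)) / h - df(x0))
    cen_err = abs((f(x0 + h) - f(x0 - h)) / (2 * h) - df(x0))
    print(f"h={h:.0e}  forward error={fwd_err:.2e}  central error={cen_err:.2e}")
```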

Note that for the result above to hold, we do not require the function to have a Taylor series; we only need the function to be three or more times continuously differentiable (in fact, a somewhat weaker version holds if the function is only twice continuously differentiable).

We can understand the trade-off between precision and computation as follows:

  • The central difference quotient requires the computation of the function at two points other than $x_0$. If the function value at $x_0$ is already known, this is twice the computational load of the forward difference quotient or backward difference quotient.
  • The output of the central difference quotient converges quadratically (rather than linearly) to the correct value as $h \to 0^+$. Therefore, we can get answers close to the actual derivative without making $h$ as small: if we need a step size of about $\epsilon$ to get a certain level of precision from the forward difference quotient, a step size of about $\sqrt{\epsilon}$ yields similar precision from the central difference quotient.

How the trade-off between greater computational cost and greater precision plays out depends on the cost of computing the function, and on how that cost changes as we make $h$ smaller.
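
As a rough numerical illustration of the $\epsilon$ versus $\sqrt{\epsilon}$ relationship above (a sketch; math.sin stands in for a more expensive function):

```python
import math

x0, exact = 1.0, math.cos(1.0)

# Central difference with step 1e-3 reaches roughly the same precision as
# forward difference with the much smaller step 1e-6.
fwd = (math.sin(x0 + 1e-6) - math.sin(x0)) / 1e-6
cen = (math.sin(x0 + 1e-3) - math.sin(x0 - 1e-3)) / (2 * 1e-3)
print(abs(fwd - exact))  # ~4e-7, error of order |f''(x0)|/2 * h
print(abs(cen - exact))  # ~9e-8, error of order |f'''(x0)|/6 * h^2
```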

In case of uncertainty regarding differentiability or in case of measurement error, we should not use a single difference quotient but should plot multiple points

In the case of uncertainty regarding whether the function is indeed differentiable, or in case of concern about measurement error in the values of the function, we should not rely on a single difference quotient. Instead, we should consider plotting the value of the function at the point and at nearby points, and fitting a linear function, or a low-degree polynomial, through the points, using polynomial regression if necessary to find the best fit.

The simplest form of such a check is to verify that the points of the graph corresponding to the nearby domain points (for instance, $x_0 - h$, $x_0$, and $x_0 + h$) are almost collinear on the graph of $f$.
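
One way to carry out such a fit, sketched with NumPy (the helper name derivative_by_fit and the choice of five symmetric sample points are illustrative assumptions):

```python
import numpy as np

def derivative_by_fit(f, x0, h, degree=1, num_points=5):
    """Estimate f'(x0) by fitting a low-degree polynomial (least squares)
    through function values sampled symmetrically around x0."""
    xs = x0 + h * np.linspace(-1.0, 1.0, num_points)
    ys = np.array([f(x) for x in xs])
    coeffs = np.polyfit(xs, ys, degree)        # polynomial regression
    return np.polyval(np.polyder(coeffs), x0)  # slope of the fit at x0

# With noisy measurements, the fitted slope averages out some of the noise
# that a single difference quotient would amplify.
rng = np.random.default_rng(0)
noisy_sin = lambda x: np.sin(x) + rng.normal(scale=1e-4)
print(derivative_by_fit(noisy_sin, 1.0, 0.1))  # close to cos(1) = 0.5403...
```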

Application of numerical differentiation to functions of more than one variable

Suppose $f$ is a function of $n$ variables. We can use the method of numerical differentiation to calculate the partial derivatives of $f$, and hence the gradient vector of $f$. The idea is to compute each partial derivative using a difference quotient. Note that:

  • If we use forward difference quotients for all the partial derivatives, we need to evaluate the function at $n + 1$ points: the point at which we are making the computation, and $n$ points, each obtained by moving a bit from the starting point in one coordinate direction. Explicitly, if the point at which we are trying to compute the gradient vector is $(a_1, a_2, \dots, a_n)$ and we choose a value of $h$, we need to calculate the function values at the points:

$$(a_1, a_2, \dots, a_n), \quad (a_1 + h, a_2, \dots, a_n), \quad (a_1, a_2 + h, \dots, a_n), \quad \dots, \quad (a_1, a_2, \dots, a_n + h)$$

Note that it is not necessary to choose the same $h$ in all directions. We could choose different increments $h_1, h_2, \dots, h_n$ in the different directions, and use the values at the corresponding points:

$$(a_1, a_2, \dots, a_n), \quad (a_1 + h_1, a_2, \dots, a_n), \quad (a_1, a_2 + h_2, \dots, a_n), \quad \dots, \quad (a_1, a_2, \dots, a_n + h_n)$$

  • Similarly, if we are using backward difference quotients, or a mix of forward and backward difference quotients, we need to compute the function value at a total of $n + 1$ points.
  • If we use central difference quotients, we need a total of $2n$ function value computations to calculate all the partial derivatives (the value at the point itself is not needed); see the sketch below.
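
A minimal sketch of both approaches (the function names are illustrative; `f` takes a NumPy array and returns a scalar):

```python
import numpy as np

def gradient_forward(f, a, h=1e-6):
    """Gradient via forward differences: n + 1 function evaluations."""
    a = np.asarray(a, dtype=float)
    f0 = f(a)  # value at the base point, reused for every coordinate
    grad = np.empty_like(a)
    for i in range(a.size):
        step = np.zeros_like(a)
        step[i] = h
        grad[i] = (f(a + step) - f0) / h
    return grad

def gradient_central(f, a, h=1e-6):
    """Gradient via central differences: 2n function evaluations;
    the value at the base point itself is never used."""
    a = np.asarray(a, dtype=float)
    grad = np.empty_like(a)
    for i in range(a.size):
        step = np.zeros_like(a)
        step[i] = h
        grad[i] = (f(a + step) - f(a - step)) / (2 * h)
    return grad

# Example: f(x, y) = x^2 * y at (1, 2); the exact gradient is (4, 1).
f = lambda v: v[0] ** 2 * v[1]
print(gradient_forward(f, [1.0, 2.0]))   # ~[4. 1.]
print(gradient_central(f, [1.0, 2.0]))   # ~[4. 1.]
```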