Proof of chain rule for differentiation: Difference between revisions

Latest revision as of 03:24, 21 January 2013

This article describes a proof of the chain rule for differentiation.

Specific point, named functions version

Suppose $f$ and $g$ are functions such that $g$ is differentiable at a point $x=x_{0}$ , and $f$ is differentiable at $g(x_{0})$ . Then the composite $f\circ g$ is differentiable at $x_{0}$ , and we have:
$\!{\frac {d}{dx}}[f(g(x))]|_{x=x_{0}}=f'(g(x_{0}))g'(x_{0})$
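As a sanity check on the statement, the formula can be compared with a finite-difference estimate of the derivative of the composite. The following is a minimal sketch, not part of the proof: the choices $f=\sin$, $g(x)=x^{2}$, the point `x0 = 1.3` and the helper `central_difference` are all illustrative assumptions.

```python
import math


def f(u):
    # Illustrative outer function (an assumption, not from the article).
    return math.sin(u)


def g(x):
    # Illustrative inner function (an assumption, not from the article).
    return x * x


def f_prime(u):
    return math.cos(u)


def g_prime(x):
    return 2.0 * x


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


x0 = 1.3
numeric = central_difference(lambda x: f(g(x)), x0)  # estimate of (f o g)'(x0)
chain_rule = f_prime(g(x0)) * g_prime(x0)            # f'(g(x0)) * g'(x0)
print(numeric, chain_rule)
```

The two printed values should agree up to the accuracy of the finite-difference approximation.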

Pure Leibniz notation version

Suppose $u=g(x)$ is a function of $x$ and $v=f(u)$ is a function of $u$ . Then,
${\frac {dv}{dx}}={\frac {dv}{du}}{\frac {du}{dx}}$

Proof

Intuitive proof using the pure Leibniz notation version

The following intuitive proof is not rigorous, but captures the underlying idea:

Start with the expression $\!{\frac {dv}{du}}{\frac {du}{dx}}$ .
Cancel the $du$ between the denominator and the numerator.
We are left with $\!{\frac {dv}{dx}}$ .

First attempt at formalizing the intuition

This again is not a complete proof, but it gets closer:

Note the following product relation between difference quotients when measured between any pair of distinct points $x=x_{1}$ and $x=x_{2}$ :

${\frac {\Delta v}{\Delta x}}={\frac {\Delta v}{\Delta u}}{\frac {\Delta u}{\Delta x}}$

In the functional notation, this would read as saying the obvious thing that:

${\frac {f(g(x_{2}))-f(g(x_{1}))}{x_{2}-x_{1}}}={\frac {f(g(x_{2}))-f(g(x_{1}))}{g(x_{2})-g(x_{1})}}{\frac {g(x_{2})-g(x_{1})}{x_{2}-x_{1}}}$

Now, set $x_{1}=x_{0}$ and let $x_{2}=x$ be a point close to, but distinct from, $x_{0}$. We get:

${\frac {f(g(x))-f(g(x_{0}))}{x-x_{0}}}={\frac {f(g(x))-f(g(x_{0}))}{g(x)-g(x_{0})}}{\frac {g(x)-g(x_{0})}{x-x_{0}}}$

Now, take the limit on both sides as $x\to x_{0}$ , and we get:

$\lim _{x\to x_{0}}{\frac {f(g(x))-f(g(x_{0}))}{x-x_{0}}}=\lim _{x\to x_{0}}{\frac {f(g(x))-f(g(x_{0}))}{g(x)-g(x_{0})}}\lim _{x\to x_{0}}{\frac {g(x)-g(x_{0})}{x-x_{0}}}$

Since $g$ is differentiable at $x_{0}$, $g$ is also continuous at $x_{0}$ (differentiability implies continuity), so $g(x)\to g(x_{0})$ as $x\to x_{0}$. The first limit on the right side can therefore be rewritten as a limit in $g(x)$:

$\lim _{x\to x_{0}}{\frac {f(g(x))-f(g(x_{0}))}{x-x_{0}}}=\lim _{g(x)\to g(x_{0})}{\frac {f(g(x))-f(g(x_{0}))}{g(x)-g(x_{0})}}\lim _{x\to x_{0}}{\frac {g(x)-g(x_{0})}{x-x_{0}}}$

Each of the three limits is now, by definition, a derivative, so we get:

$(f\circ g)'(x_{0})=f'(g(x_{0}))g'(x_{0})$

In the $\Delta$ -notation, we are simply taking the appropriate limits on

${\frac {\Delta v}{\Delta x}}={\frac {\Delta v}{\Delta u}}{\frac {\Delta u}{\Delta x}}$

and getting that:

${\frac {dv}{dx}}={\frac {dv}{du}}{\frac {du}{dx}}$

Fixing the bug in the proof

The proof above is correct in essentials, but has one bug: $\Delta u=g(x)-g(x_{0})$ may equal zero for some $x\neq x_{0}$, making the difference quotient $\Delta v/\Delta u$ undefined at those points. This is harmless if it happens at only finitely many points, because the limit depends only on values of $x$ in a small enough neighborhood of $x_{0}$, and we can choose a neighborhood that excludes all of them. It does become an issue, however, if $g(x)=g(x_{0})$ at points $x$ arbitrarily close to $x_{0}$.
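A concrete instance of the problematic situation can be sketched as follows. The functions here are illustrative assumptions, not taken from the article: $g(x)=\max(x,0)^{2}$ is differentiable at $x_{0}=0$ but equals $g(x_{0})=0$ for every $x<0$, so the quotient $\Delta v/\Delta u$ is undefined at points arbitrarily close to $x_{0}$.

```python
def g(x):
    # Hypothetical inner function: differentiable everywhere,
    # but constant (equal to 0) for all x <= 0.
    return max(x, 0.0) ** 2


def f(u):
    # Hypothetical outer function.
    return 3.0 * u + 1.0


def naive_quotient(x, x0):
    """Delta v / Delta u, exactly as used in the first attempt."""
    return (f(g(x)) - f(g(x0))) / (g(x) - g(x0))


x0 = 0.0
failures = []
for k in range(1, 6):
    x = -(10.0 ** -k)  # points approaching x0 from the left
    try:
        naive_quotient(x, x0)
    except ZeroDivisionError:
        # g(x) == g(x0), so the quotient is undefined at this x.
        failures.append(x)

print(failures)
```

Every sampled point to the left of $x_{0}$ ends up in `failures`, however close it is to $x_{0}$, while the quotient is perfectly well defined for $x>0$.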

The solution to this problem is to replace the expression ${\frac {f(g(x))-f(g(x_{0}))}{g(x)-g(x_{0})}}$ by the expression:

$H(x):=\left\lbrace {\begin{array}{rl}f'(g(x_{0})),&{\mbox{ if }}g(x)=g(x_{0})\\{\frac {f(g(x))-f(g(x_{0}))}{g(x)-g(x_{0})}},&{\mbox{ if }}g(x)\neq g(x_{0})\\\end{array}}\right.$

Alternatively, $H(x)=h(g(x))$ where:

$h(u):=\left\lbrace {\begin{array}{rl}f'(u),&{\mbox{ if }}u=g(x_{0})\\{\frac {f(u)-f(g(x_{0}))}{u-g(x_{0})}},&{\mbox{ if }}u\neq g(x_{0})\\\end{array}}\right.$

In other words, we fill in the removable discontinuity of the difference quotient at $u=g(x_{0})$ before substituting it into the product expression. With this replacement, the proof becomes rigorous.

Here is the full rigorous proof. We first verify that the following identity always holds for $x\neq x_{0}$ :

${\frac {f(g(x))-f(g(x_{0}))}{x-x_{0}}}=H(x){\frac {g(x)-g(x_{0})}{x-x_{0}}}$

We prove this by making two cases:

$g(x)=g(x_{0})$ : In this case, the left side is zero, and the right side is $f'(g(x_{0}))\cdot 0=0$ .
$g(x)\neq g(x_{0})$ : In this case, cancel $g(x)-g(x_{0})$ between the denominator of the first expression and the numerator of the second expression on the right side to get the left side. This is similar to the buggy proof.
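The two cases above can be checked numerically. This is a minimal sketch under assumed choices: $g(x)=\max(x,0)^{2}$, $f(u)=u^{2}+3u$ and $x_{0}=0$ are hypothetical, picked so that sample points with $x<0$ fall in the first case and sample points with $x>0$ fall in the second.

```python
def g(x):
    # Hypothetical inner function with g(x) == g(0) for all x <= 0.
    return max(x, 0.0) ** 2


def f(u):
    # Hypothetical outer function.
    return u * u + 3.0 * u


def f_prime(u):
    return 2.0 * u + 3.0


def H(x, x0):
    """Patched quotient: f'(g(x0)) where g(x) == g(x0), else Delta v / Delta u."""
    du = g(x) - g(x0)
    if du == 0.0:
        return f_prime(g(x0))
    return (f(g(x)) - f(g(x0))) / du


x0 = 0.0
for x in (-0.5, -1e-3, 1e-3, 0.5):
    left = (f(g(x)) - f(g(x0))) / (x - x0)
    right = H(x, x0) * (g(x) - g(x0)) / (x - x0)
    print(x, left, right)  # the two sides agree in both cases
```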

Since the equality now holds for all $x\neq x_{0}$ , it makes sense to try to take the limit. We have:

$\lim _{x\to x_{0}}{\frac {f(g(x))-f(g(x_{0}))}{x-x_{0}}}=\lim _{x\to x_{0}}\left[H(x){\frac {g(x)-g(x_{0})}{x-x_{0}}}\right]$

We now split the limit of the product on the right side into a product of limits. This step is valid only if the limits of both factors exist as finite numbers, which we verify below.

$\lim _{x\to x_{0}}{\frac {f(g(x))-f(g(x_{0}))}{x-x_{0}}}=\lim _{x\to x_{0}}H(x)\lim _{x\to x_{0}}{\frac {g(x)-g(x_{0})}{x-x_{0}}}$

The left side is $(f\circ g)'(x_{0})$ . The second expression on the right side is $g'(x_{0})$ . It remains to calculate $\lim _{x\to x_{0}}H(x)$ . This is $\lim _{x\to x_{0}}h(g(x))$ . We compute this as follows:

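The differentiability of $f$ at $g(x_{0})$ guarantees that $h$ is continuous at $g(x_{0})$: as $u\to g(x_{0})$, the difference quotient defining $h(u)$ tends to $f'(g(x_{0}))$, which is exactly $h(g(x_{0}))$.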
The differentiability of $g$ at $x_{0}$ guarantees that $g$ is continuous at $x_{0}$ .

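With both these facts, the composite $h\circ g$ is continuous at $x_{0}$, so $\lim _{x\to x_{0}}h(g(x))=h(g(x_{0}))$, and $h(g(x_{0}))=f'(g(x_{0}))$ by the definition of $h$. Plugging this back in gives $(f\circ g)'(x_{0})=f'(g(x_{0}))g'(x_{0})$, which completes the proof.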
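The limit of $H$ can also be observed numerically. The sketch below again assumes the hypothetical functions $g(x)=\max(x,0)^{2}$ and $f(u)=u^{2}+3u$ with $x_{0}=0$; it evaluates $H$ on both sides of $x_{0}$ and records how far the values are from $f'(g(x_{0}))$.

```python
def g(x):
    # Hypothetical inner function with g(x) == g(0) for all x <= 0.
    return max(x, 0.0) ** 2


def f(u):
    # Hypothetical outer function.
    return u * u + 3.0 * u


def f_prime(u):
    return 2.0 * u + 3.0


def H(x, x0):
    """Patched quotient: f'(g(x0)) where g(x) == g(x0), else Delta v / Delta u."""
    du = g(x) - g(x0)
    if du == 0.0:
        return f_prime(g(x0))
    return (f(g(x)) - f(g(x0))) / du


x0 = 0.0
target = f_prime(g(x0))  # the claimed value of lim H(x) as x -> x0
gaps = []
for k in range(1, 7):
    step = 10.0 ** -k
    # Largest deviation from the target over the two points x0 - step, x0 + step.
    gap = max(abs(H(x0 - step, x0) - target), abs(H(x0 + step, x0) - target))
    gaps.append(gap)
    print(step, gap)
```

The recorded gaps shrink as the step decreases, consistent with $\lim _{x\to x_{0}}H(x)=f'(g(x_{0}))$. In this example $g'(x_{0})=0$, as it must be whenever $g(x)=g(x_{0})$ at points arbitrarily close to $x_{0}$, so both sides of the chain rule are zero here.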

@@ Line 25: / Line 25: @@
 Note the following product relation between difference quotients when measured between any pair of distinct points <math>x = x_1</math> and <math>x = x_2</math>:
-<math>\frac{\Delta v}{\Delta x} = \frac{Delta v}{\Delta u} \frac{Delta u}{\Delta x}</math>
+<math>\frac{\Delta v}{\Delta x} = \frac{\Delta v}{\Delta u} \frac{\Delta u}{\Delta x}</math>
 In the functional notation, this would read as saying the obvious thing that:
@@ Line 49: / Line 49: @@
 In the <math>\Delta</math>-notation, we are simply taking the appropriate limits on
-<math>\frac{\Delta v}{\Delta x} = \frac{Delta v}{\Delta u} \frac{Delta u}{\Delta x}</math>
+<math>\frac{\Delta v}{\Delta x} = \frac{\Delta v}{\Delta u} \frac{\Delta u}{\Delta x}</math>
 and getting that:
@@ Line 61: / Line 61: @@
 The solution to those problem is to replace the expression <math>\frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}</math> by the expression:
-<math>\left \lbrace \begin{array}{rl} f'(g(x_0)), & \mbox{ if } g(x) = g(x_0) \\ \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}, & \mbox{ if } g(x) \ne g(x_0) \\\end{array}\right.</math>
+<math>H(x) := \left \lbrace \begin{array}{rl} f'(g(x_0)), & \mbox{ if } g(x) = g(x_0) \\ \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}, & \mbox{ if } g(x) \ne g(x_0) \\\end{array}\right.</math>
+Alternatively, <math>H(x) = h(g(x))</math> where:
+<math>h(u) := \left \lbrace \begin{array}{rl} f'(u), & \mbox{ if } u = g(x_0) \\ \frac{f(u) - f(g(x_0))}{u - g(x_0)}, & \mbox{ if } u \ne g(x_0) \\\end{array}\right.</math>
 In other words, we fill in the removable discontinuity before plugging it into the product expression. If we use this expression instead, the proof becomes rigorous.
-Here is the full rigorous proof. {{fillin}}
+Here is the full rigorous proof. We first verify that the following identity always holds for <math>x \ne x_0</math>:
+<math>\frac{f(g(x)) - f(g(x_0))}{x - x_0} = H(x) \frac{g(x) - g(x_0)}{x - x_0}</math>
+We prove this by making two cases:
+* <math>g(x) = g(x_0)</math>: In this case, the left side is zero, and the right side is <math>f'(g(x_0)) \cdot 0 = 0</math>.
+* <math>g(x) \ne g(x_0)</math>: In this case, cancel <math>g(x) - g(x_0)</math> between the denominator of the first expression and the numerator of the second expression on the right side to get the left side. This is similar to the buggy proof.
+Since the equality now holds for all <math>x \ne x_0</math>, it makes sense to try to take the limit. We have:
+<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \left[H(x) \frac{g(x) - g(x_0)}{x - x_0}\right]</math>
+We now split the limit for the product on the right side. Note that such a splitting is a gamble, and works only if both pieces have clear limits.
+<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} H(x) \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0}</math>
+The left side is <math>(f \circ g)'(x_0)</math>. The second expression on the right side is <math>g'(x_0)</math>. It remains to calculate <math>\lim_{x \to x_0} H(x)</math>. This is <math>\lim_{x \to x_0} h(g(x))</math>. We compute this as follows:
+* The differentiability of <math>f</math> at <math>g(x_0)</math> guarantees that <math>h</math> is continuous at <math>g(x_0)</math>.
+* The differentiability of <math>g</math> at <math>x_0</math> guarantees that <math>g</math> is continuous at <math>x_0</math>.
+With both these facts, we get that <math>\lim_{x \to x_0} h(g(x)) = h(g(x_0)) = f'(g(x_0))</math> by definition. Plugging this back in completes the proof.