Proof of chain rule for differentiation: Difference between revisions

From Calculus
(Created page with "This article describes a proof of the chain rule for differentiation. ==Specific point, named functions version== Suppose <math>f</math> and <math>g</math> are functio...")
 
 
(2 intermediate revisions by the same user not shown)
Line 25: Line 25:
Note the following product relation between difference quotients when measured between any pair of distinct points <math>x = x_1</math> and <math>x = x_2</math>:


<math>\frac{\Delta v}{\Delta x} = \frac{\Delta v}{\Delta u} \frac{\Delta u}{\Delta x}</math>


In the functional notation, this would read as saying the obvious thing that:
Line 49: Line 49:
In the <math>\Delta</math>-notation, we are simply taking the appropriate limits on


<math>\frac{\Delta v}{\Delta x} = \frac{\Delta v}{\Delta u} \frac{\Delta u}{\Delta x}</math>


and getting that:
Line 61: Line 61:
The solution to this problem is to replace the expression <math>\frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}</math> by the expression:


<math>H(x) := \left \lbrace \begin{array}{rl} f'(g(x_0)), & \mbox{ if } g(x) = g(x_0) \\ \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}, & \mbox{ if } g(x) \ne g(x_0) \\\end{array}\right.</math>
 
Alternatively, <math>H(x) = h(g(x))</math> where:
 
<math>h(u) := \left \lbrace \begin{array}{rl} f'(u), & \mbox{ if } u = g(x_0) \\ \frac{f(u) - f(g(x_0))}{u - g(x_0)}, & \mbox{ if } u \ne g(x_0) \\\end{array}\right.</math>


In other words, we fill in the removable discontinuity before plugging it into the product expression. If we use this expression instead, the proof becomes rigorous.


Here is the full rigorous proof. We first verify that the following identity always holds for <math>x \ne x_0</math>:
 
<math>\frac{f(g(x)) - f(g(x_0))}{x - x_0} = H(x) \frac{g(x) - g(x_0)}{x - x_0}</math>
 
We prove this by making two cases:
 
* <math>g(x) = g(x_0)</math>: In this case, the left side is zero, and the right side is <math>f'(g(x_0)) \cdot 0 = 0</math>.
* <math>g(x) \ne g(x_0)</math>: In this case, cancel <math>g(x) - g(x_0)</math> between the denominator of the first expression and the numerator of the second expression on the right side to get the left side. This is similar to the buggy proof.
 
Since the equality now holds for all <math>x \ne x_0</math>, it makes sense to try to take the limit. We have:
 
<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \left[H(x) \frac{g(x) - g(x_0)}{x - x_0}\right]</math>
 
We now split the limit for the product on the right side. Note that such a splitting is a gamble, and works only if both pieces have clear limits.
 
<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} H(x) \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0}</math>
 
The left side is <math>(f \circ g)'(x_0)</math>. The second expression on the right side is <math>g'(x_0)</math>. It remains to calculate <math>\lim_{x \to x_0} H(x)</math>. This is <math>\lim_{x \to x_0} h(g(x))</math>. We compute this as follows:
 
* The differentiability of <math>f</math> at <math>g(x_0)</math> guarantees that <math>h</math> is continuous at <math>g(x_0)</math>.
* The differentiability of <math>g</math> at <math>x_0</math> guarantees that <math>g</math> is continuous at <math>x_0</math>.
 
With both these facts, we get that <math>\lim_{x \to x_0} h(g(x)) = h(g(x_0)) = f'(g(x_0))</math> by definition. Plugging this back in completes the proof.

Latest revision as of 03:24, 21 January 2013

This article describes a proof of the chain rule for differentiation.

Specific point, named functions version

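Suppose <math>f</math> and <math>g</math> are functions such that <math>g</math> is differentiable at a point <math>x = x_0</math>, and <math>f</math> is differentiable at <math>g(x_0)</math>. Then the composite <math>f \circ g</math> is differentiable at <math>x_0</math>, and we have:

<math>(f \circ g)'(x_0) = f'(g(x_0))g'(x_0)</math>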

Pure Leibniz notation version

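Suppose <math>u</math> is a function of <math>x</math> and <math>v</math> is a function of <math>u</math>. Then,

<math>\frac{dv}{dx} = \frac{dv}{du} \frac{du}{dx}</math>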

Proof

Intuitive proof using the pure Leibniz notation version

The following intuitive proof is not rigorous, but captures the underlying idea:

  • Start with the expression <math>\frac{dv}{du} \frac{du}{dx}</math>.
  • Cancel the <math>du</math> between the denominator and the numerator.
  • We are left with <math>\frac{dv}{dx}</math>.
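
As a quick illustration (the functions here are chosen only as an example and do not come from the article), take <math>u = x^2</math> and <math>v = \sin u</math>. Then <math>\frac{dv}{du} = \cos u</math> and <math>\frac{du}{dx} = 2x</math>, so the rule gives:

<math>\frac{dv}{dx} = \frac{dv}{du} \frac{du}{dx} = \cos(u) \cdot 2x = 2x\cos(x^2)</math>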

First attempt at formalizing the intuition

This again is not a complete proof, but it gets closer:

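Note the following product relation between difference quotients when measured between any pair of distinct points <math>x = x_1</math> and <math>x = x_2</math>:

<math>\frac{\Delta v}{\Delta x} = \frac{\Delta v}{\Delta u} \frac{\Delta u}{\Delta x}</math>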

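In the functional notation, this would read as saying the obvious thing that:

<math>\frac{f(g(x_2)) - f(g(x_1))}{x_2 - x_1} = \frac{f(g(x_2)) - f(g(x_1))}{g(x_2) - g(x_1)} \cdot \frac{g(x_2) - g(x_1)}{x_2 - x_1}</math>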

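Now, set <math>x_1 = x_0</math> and let <math>x_2 = x</math> be some number close enough to <math>x_0</math>. We get:

<math>\frac{f(g(x)) - f(g(x_0))}{x - x_0} = \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)} \cdot \frac{g(x) - g(x_0)}{x - x_0}</math>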

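Now, take the limit on both sides as <math>x \to x_0</math>, and we get:

<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)} \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0}</math>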

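Since <math>g</math> is differentiable at <math>x_0</math>, <math>g</math> is continuous at <math>x_0</math> as well (using the fact that differentiable implies continuous) and thus the first limit on the right side can be taken as:

<math>\lim_{g(x) \to g(x_0)} \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}</math>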

The three limits are now definitionally derivatives, so we get:

<math>(f \circ g)'(x_0) = f'(g(x_0))g'(x_0)</math>

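In the <math>\Delta</math>-notation, we are simply taking the appropriate limits on

<math>\frac{\Delta v}{\Delta x} = \frac{\Delta v}{\Delta u} \frac{\Delta u}{\Delta x}</math>

and getting that:

<math>\frac{dv}{dx} = \frac{dv}{du} \frac{du}{dx}</math>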

Fixing the bug in the proof

The proof above is correct in essentials, but has one bug: <math>g(x) - g(x_0)</math> may be equal to zero for <math>x \ne x_0</math>, making the difference quotient undefined at these points. This is not an issue if it happens at only finitely many points, because we can restrict attention to <math>x</math> close enough to <math>x_0</math> to avoid them when taking the limit. It does become an issue, however, if <math>g(x) = g(x_0)</math> at points arbitrarily close to <math>x_0</math>.
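
For a concrete instance of the problematic case (a standard illustrative example, not taken from the article), take <math>x_0 = 0</math> and:

<math>g(x) := \left \lbrace \begin{array}{rl} x^2 \sin(1/x), & \mbox{ if } x \ne 0 \\ 0, & \mbox{ if } x = 0 \\\end{array}\right.</math>

Then <math>g</math> is differentiable at <math>0</math> with <math>g'(0) = \lim_{x \to 0} x \sin(1/x) = 0</math>, yet <math>g(1/(n\pi)) = 0 = g(0)</math> for every positive integer <math>n</math>. The points <math>1/(n\pi)</math> tend to <math>0</math>, so the difference quotient of <math>f</math> with denominator <math>g(x) - g(0)</math> is undefined at points arbitrarily close to <math>x_0</math>.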

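The solution to this problem is to replace the expression <math>\frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}</math> by the expression:

<math>H(x) := \left \lbrace \begin{array}{rl} f'(g(x_0)), & \mbox{ if } g(x) = g(x_0) \\ \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)}, & \mbox{ if } g(x) \ne g(x_0) \\\end{array}\right.</math>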

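Alternatively, <math>H(x) = h(g(x))</math> where:

<math>h(u) := \left \lbrace \begin{array}{rl} f'(u), & \mbox{ if } u = g(x_0) \\ \frac{f(u) - f(g(x_0))}{u - g(x_0)}, & \mbox{ if } u \ne g(x_0) \\\end{array}\right.</math>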

In other words, we fill in the removable discontinuity before plugging it into the product expression. If we use this expression instead, the proof becomes rigorous.

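Here is the full rigorous proof. We first verify that the following identity always holds for <math>x \ne x_0</math>:

<math>\frac{f(g(x)) - f(g(x_0))}{x - x_0} = H(x) \frac{g(x) - g(x_0)}{x - x_0}</math>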

We prove this by making two cases:

  • <math>g(x) = g(x_0)</math>: In this case, the left side is zero, and the right side is <math>f'(g(x_0)) \cdot 0 = 0</math>.
  • <math>g(x) \ne g(x_0)</math>: In this case, cancel <math>g(x) - g(x_0)</math> between the denominator of the first expression and the numerator of the second expression on the right side to get the left side. This is similar to the buggy proof.

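Since the equality now holds for all <math>x \ne x_0</math>, it makes sense to try to take the limit. We have:

<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \left[H(x) \frac{g(x) - g(x_0)}{x - x_0}\right]</math>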

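We now split the limit for the product on the right side. Note that such a splitting is a gamble, and works only if both pieces have clear limits.

<math>\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} H(x) \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0}</math>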

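The left side is <math>(f \circ g)'(x_0)</math>. The second expression on the right side is <math>g'(x_0)</math>. It remains to calculate <math>\lim_{x \to x_0} H(x)</math>. This is <math>\lim_{x \to x_0} h(g(x))</math>. We compute this as follows: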

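  • The differentiability of <math>f</math> at <math>g(x_0)</math> guarantees that <math>h</math> is continuous at <math>g(x_0)</math>.
  • The differentiability of <math>g</math> at <math>x_0</math> guarantees that <math>g</math> is continuous at <math>x_0</math>.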

With both these facts, we get that <math>\lim_{x \to x_0} h(g(x)) = h(g(x_0)) = f'(g(x_0))</math> by definition. Plugging this back in completes the proof.
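
The role of the filled-in quotient can be checked numerically. The following is a minimal sketch, not part of the proof: the helper names `make_H` and `identity_holds`, and the choices <math>f(u) = e^u</math>, <math>g(x) = \max(x, 0)^2</math>, <math>x_0 = 0</math>, are assumptions made only for illustration. This <math>g</math> is differentiable at <math>0</math> and satisfies <math>g(x) = g(x_0)</math> for every <math>x \le 0</math>, so the naive difference quotient is undefined arbitrarily close to <math>x_0</math>, while the filled-in version is defined everywhere and satisfies the product identity.

```python
import math

def make_H(f, f_prime, g, x0):
    """Build H(x): the difference quotient of f at g(x0), with the
    removable discontinuity filled in by f'(g(x0)). Hypothetical helper."""
    u0 = g(x0)

    def H(x):
        du = g(x) - u0
        if du == 0:
            return f_prime(u0)          # case g(x) = g(x0)
        return (f(g(x)) - f(u0)) / du   # case g(x) != g(x0)

    return H

# Illustrative choices (assumptions, not from the article).
f = math.exp
f_prime = math.exp

def g(x):
    # g(x) = max(x, 0)^2 is differentiable at 0 with g'(0) = 0,
    # and g(x) = g(0) for every x <= 0.
    return max(x, 0.0) ** 2

x0 = 0.0
H = make_H(f, f_prime, g, x0)

def identity_holds(x):
    """Check (f(g(x)) - f(g(x0)))/(x - x0) = H(x) * (g(x) - g(x0))/(x - x0)
    up to floating-point tolerance, for x != x0."""
    lhs = (f(g(x)) - f(g(x0))) / (x - x0)
    rhs = H(x) * (g(x) - g(x0)) / (x - x0)
    return math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12)

for x in (-0.5, -1e-3, 1e-3, 0.5):
    print(x, H(x), identity_holds(x))
```

For negative <math>x</math> the naive quotient would divide zero by zero, whereas `H` returns <math>f'(g(x_0))</math> and both sides of the identity are zero, matching the first case of the proof.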