Summary table of multivariable derivatives
Latest revision as of 20:28, 17 June 2020
This page is a summary table of multivariable derivatives.
- TODO maybe good to have separate rows for evaluated and pre-evaluated versions, for things that are functions/can be applied
Single-variable real function
For comparison and completeness, we give a summary table of the single-variable derivative. Let $f : \mathbf R \to \mathbf R$ be a single-variable real function.
| Term | Notation | Type | Definition | Notes |
|---|---|---|---|---|
| Derivative of $f$ | $f'$ or $\frac{df}{dx}$ | $\mathbf R \to \mathbf R$ | | |
| Derivative of $f$ at $x_0$ | $f'(x_0)$ or $\frac{df}{dx}(x_0)$ or $\left.\frac{df}{dx}\right\rvert_{x = x_0}$ | $\mathbf R$ | $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$ | In the most general multivariable case, $f'(x_0)$ will become a linear transformation, so analogously we may wish to talk about the single-variable $f'(x_0)$ as the function $\mathbf R \to \mathbf R$ defined by $f'(x_0)(h) = f'(x_0)\,h$, where on the left side "$f'(x_0)$" is a function and on the right side "$f'(x_0)$" is a number. If "$f'(x_0)$" is a function, we can evaluate it at $1$ to recover the number: $f'(x_0)(1) = f'(x_0)$. This is pretty confusing, and in practice everyone thinks of "$f'(x_0)$" in the single-variable case as a number, making the notation divergent; see Notational confusion of multivariable derivatives § The derivative as a linear transformation in the several variable case and a number in the single-variable case for more information. |
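The two readings of the derivative at a point (a number versus a linear map) can be made concrete with a small numerical sketch. This is only an illustration under stated assumptions: the helper names `derivative_at` and `derivative_as_linear_map`, the step size, and the example function `square` are hypothetical and not part of the original page.

```python
def derivative_at(f, x0, h=1e-6):
    """Central-difference estimate of the number f'(x0)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)


def derivative_as_linear_map(f, x0):
    """Return the linear map v -> f'(x0) * v (the 'function' reading of f'(x0))."""
    slope = derivative_at(f, x0)
    return lambda v: slope * v


def square(x):
    # Hypothetical example function f(x) = x^2, so f'(3) = 6.
    return x * x


slope = derivative_at(square, 3.0)              # the number reading
linear = derivative_as_linear_map(square, 3.0)  # the linear-map reading
print(slope, linear(1.0), linear(2.0))
```

Evaluating the linear map at `1.0` gives back the slope, which is the "evaluate at $1$" step described in the notes column.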
Real-valued function of R^n
Let $f : \mathbf R^n \to \mathbf R$ be a real-valued function of $\mathbf R^n$.
| Term | Notation | Type | Definition | Notes |
|---|---|---|---|---|
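| Partial derivative of $f$ with respect to its $j$th variable | $\partial_j f$ or $\frac{\partial f}{\partial x_j}$ or $D_j f$ or $f_{x_j}$ or $f_j$ | $\mathbf R^n \to \mathbf R$ | $\partial_j f(x) = \lim_{t \to 0} \frac{f(x + te_j) - f(x)}{t}$ | Here $e_j$ is the $j$th vector of the standard basis, i.e. the vector with all zeroes except a one in the $j$th spot. Therefore $x + te_j$ can also be written $(x_1, \ldots, x_{j-1}, x_j + t, x_{j+1}, \ldots, x_n)$ when broken down into components. |
| Gradient | $\nabla f$ | $\mathbf R^n \to \mathbf R^n$ | $\nabla f(x) = (\partial_1 f(x), \ldots, \partial_n f(x))$ | |
| Gradient at $x_0$ | $\nabla f(x_0)$ or $\operatorname{grad} f(x_0)$ | $\mathbf R^n$ | $(\partial_1 f(x_0), \ldots, \partial_n f(x_0))$ or the vector $c$ such that $\lim_{x \to x_0} \frac{\lvert f(x) - f(x_0) - c \cdot (x - x_0) \rvert}{\lVert x - x_0 \rVert} = 0$ | |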
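| Directional derivative in the direction of $v$ | $D_v f$ or $\partial_v f$ | $\mathbf R^n \to \mathbf R$ | $D_v f(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t}$ | When $v = e_j$, this reduces to the $j$th partial derivative. |
| Total derivative with respect to the $j$th variable | $\frac{df}{dx_j}$ | $\mathbf R \to \mathbf R$ | For $i \ne j$, we treat the variable $x_i = g_i(x_j)$ as a function of $x_j$, and take the single-variable derivative with respect to $x_j$ (more formally, $g : \mathbf R \to \mathbf R^n$ is a function whose $j$th component $g_j = \mathrm{id}$ is the identity function, and we differentiate $f \circ g$). By the chain rule this becomes $\frac{df}{dx_j} = \nabla f(g(x_j)) \cdot g'(x_j) = \frac{\partial f}{\partial x_1} \frac{dx_1}{dx_j} + \cdots + \frac{\partial f}{\partial x_n} \frac{dx_n}{dx_j}$ | |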
I think in this case, since $f'(x_0)(v)$ coincides with $\nabla f(x_0) \cdot v$, people don't usually define the derivative separately. For example, Folland in Advanced Calculus defines differentiability but not the derivative! He just says that the vector that makes a function differentiable is the gradient.
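The claim that the derivative applied to $v$ coincides with $\nabla f(x_0) \cdot v$ can be checked numerically. The following is a minimal finite-difference sketch, not a definitive implementation: the helper names, the step size, and the example function `example_f` are assumptions chosen for illustration.

```python
def partial(f, x, j, h=1e-6):
    """Central-difference estimate of the j-th partial derivative of f at x."""
    forward = list(x)
    backward = list(x)
    forward[j] += h
    backward[j] -= h
    return (f(forward) - f(backward)) / (2 * h)


def gradient(f, x, h=1e-6):
    """Vector of all n partial derivatives of f at x."""
    return [partial(f, x, j, h) for j in range(len(x))]


def directional(f, x, v, h=1e-6):
    """Central-difference estimate of D_v f(x); v need not be a unit vector."""
    forward = [xi + h * vi for xi, vi in zip(x, v)]
    backward = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(forward) - f(backward)) / (2 * h)


def example_f(x):
    # Hypothetical f(x1, x2) = x1^2 * x2 + x2, with gradient (2*x1*x2, x1^2 + 1).
    return x[0] ** 2 * x[1] + x[1]


x0 = [1.0, 2.0]
v = [3.0, -1.0]
grad = gradient(example_f, x0)                 # close to (4, 2)
dot = sum(g * vi for g, vi in zip(grad, v))    # gradient dotted with v
d_v = directional(example_f, x0, v)            # directional derivative along v
print(grad, dot, d_v)
```

Passing a standard basis vector such as `[1.0, 0.0]` as `v` reproduces the corresponding partial derivative, matching the note in the directional derivative row.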
"Total derivative" is used for two different things (which coincide in the special case where they both make sense); see my answer https://math.stackexchange.com/a/3698838/35525 for details.
TODO: answer questions like "Is the gradient the derivative?"
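As a worked example of the total derivative with respect to one variable, here is a small computation under assumed choices (the function $f$ and the dependence $x_2 = \sin x_1$ are hypothetical and chosen only for illustration). Take $f(x_1, x_2) = x_1^2 x_2$ and suppose $x_2 = g_2(x_1) = \sin x_1$. The chain rule gives

$$
\begin{aligned}
\frac{df}{dx_1} &= \frac{\partial f}{\partial x_1} \frac{dx_1}{dx_1} + \frac{\partial f}{\partial x_2} \frac{dx_2}{dx_1} \\
&= 2 x_1 x_2 \cdot 1 + x_1^2 \cos x_1 \\
&= 2 x_1 \sin x_1 + x_1^2 \cos x_1 .
\end{aligned}
$$

This agrees with differentiating $x_1^2 \sin x_1$ directly as a single-variable function. By contrast, the partial derivative $\frac{\partial f}{\partial x_1} = 2 x_1 x_2$ holds $x_2$ fixed and so omits the second term.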
Vector-valued function of R
Let $f : \mathbf R \to \mathbf R^n$ be a vector-valued function of $\mathbf R$. A parametric curve (or parametrized curve) is an example of this. Since the function is vector-valued, some authors use a boldface letter like $\mathbf f$.
| Term | Notation | Type | Definition | Notes |
|---|---|---|---|---|
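| Velocity vector at $t_0$ | $f'(t_0)$ or $\frac{df}{dt}(t_0)$ | $\mathbf R^n$ | $\lim_{t \to t_0} \frac{f(t) - f(t_0)}{t - t_0} = (f_1'(t_0), \ldots, f_n'(t_0))$ | |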
Note the absence of partial and directional derivatives. There is only one variable with respect to which we can differentiate, so there is no direction to choose.
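For a curve, the velocity vector is obtained by differentiating each component with respect to the single parameter. A minimal sketch, assuming a hypothetical helper `velocity` and the unit circle as the example curve:

```python
import math


def velocity(curve, t0, h=1e-6):
    """Componentwise central difference of a curve R -> R^n at t0."""
    ahead = curve(t0 + h)
    behind = curve(t0 - h)
    return [(a - b) / (2 * h) for a, b in zip(ahead, behind)]


def unit_circle(t):
    # Hypothetical example curve f(t) = (cos t, sin t), with f'(t) = (-sin t, cos t).
    return [math.cos(t), math.sin(t)]


vel = velocity(unit_circle, 0.0)  # close to (0, 1)
print(vel)
```

There is no direction argument here because the domain is one-dimensional; the only choice is the point `t0`.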
Vector-valued function of R^n
Let $f : \mathbf R^n \to \mathbf R^m$ be a vector-valued function of $\mathbf R^n$. Since the function is vector-valued, some authors use a boldface letter like $\mathbf f$.
| Term | Notation | Type | Definition | Notes |
|---|---|---|---|---|
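| Partial derivative with respect to the $j$th variable | $\partial_j f$ or $\frac{\partial f}{\partial x_j}$ or $D_j f$ or $f_{x_j}$ or $f_j$ | $\mathbf R^n \to \mathbf R^m$ | $\partial_j f(x) = \lim_{t \to 0} \frac{f(x + te_j) - f(x)}{t}$ | |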
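| Directional derivative in the direction of $v$ | $D_v f$ or $\partial_v f$ | $\mathbf R^n \to \mathbf R^m$ | $D_v f(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t}$ | |
| Total or Fréchet derivative (sometimes just called the derivative) at point $x_0 \in \mathbf R^n$ | $f'(x_0)$ or $(Df)_{x_0}$ or $d_{x_0}f$ | $\mathbf R^n \to \mathbf R^m$ | The linear transformation $L$ such that $\lim_{x \to x_0} \frac{\lVert f(x) - f(x_0) - L(x - x_0) \rVert}{\lVert x - x_0 \rVert} = 0$ | The derivative at a given point is a linear transformation. One might wonder then what the derivative (without giving a point) is, i.e. what meaning to assign to "$f'$" as we can in the single-variable case. Its type would have to be $\mathbf R^n \to \mathbf R^n \to \mathbf R^m$ or more specifically $\mathbf R^n \to \mathcal L(\mathbf R^n, \mathbf R^m)$ (where $\mathcal L(\mathbf R^n, \mathbf R^m)$ is the set of linear transformations from $\mathbf R^n$ to $\mathbf R^m$). Also the notation $f'(x_0)$ is slightly confusing: if the total derivative is a function, what happens if $n = m = 1$? We see that $f'(x_0) \colon \mathbf R \to \mathbf R$, so the single-variable derivative isn't actually a number! To get the actual slope of the tangent line, we must evaluate the function at $1$: $f'(x_0)(1) \in \mathbf R$. Some authors avoid this by using different notation in the general multivariable case. Others accept this type error and ignore it. |
| Derivative matrix, differential matrix, Jacobian matrix at point $x_0 \in \mathbf R^n$ | $Df(x_0)$ or $\mathcal M(f'(x_0))$ | $\mathcal M_{m,n}(\mathbf R)$ | $\begin{pmatrix}\partial_1 f_1(x_0) & \cdots & \partial_n f_1(x_0) \\ \vdots & \ddots & \vdots \\ \partial_1 f_m(x_0) & \cdots & \partial_n f_m(x_0)\end{pmatrix}$ | Since the total derivative is a linear transformation, and since linear transformations from $\mathbf R^n$ to $\mathbf R^m$ have a one-to-one correspondence with real-valued $m$ by $n$ matrices, the behavior of the total derivative can be summarized in a matrix; that summary is the derivative matrix. Some authors say that the total derivative is the matrix. TODO: talk about gradient vectors as rows. |
| Total derivative with respect to the $j$th variable | $\frac{df}{dx_j}$ | | | |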
Note the absence of the gradient in the above table. The generalization of the gradient to the $\mathbf R^n \to \mathbf R^m$ case is the derivative matrix.
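The relationship between the derivative matrix and the total derivative can be sketched numerically: build the $m$ by $n$ matrix of partial derivatives by finite differences, then check that it approximates the change in $f$ to first order. The helper names, step sizes, and the example map below are assumptions for illustration only.

```python
def jacobian(f, x, h=1e-6):
    """Finite-difference m-by-n derivative matrix of f: R^n -> R^m at x."""
    m = len(f(x))
    n = len(x)
    matrix = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        f_plus = f(forward)
        f_minus = f(backward)
        for i in range(m):
            # Entry (i, j) is the j-th partial derivative of the i-th component.
            matrix[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return matrix


def apply_linear(matrix, v):
    """Matrix-vector product: the total derivative applied to v."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def example_map(x):
    # Hypothetical f(x1, x2) = (x1 * x2, x1 + x2, x1^2), a map R^2 -> R^3.
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]


x0 = [1.0, 2.0]
J = jacobian(example_map, x0)  # close to [[2, 1], [1, 1], [2, 0]]

# First-order check: f(x0 + step) - f(x0) - J step should be small relative to step.
step = [1e-3, -2e-3]
moved = example_map([a + b for a, b in zip(x0, step)])
base = example_map(x0)
predicted = apply_linear(J, step)
residual = max(abs(mv - bs - pr) for mv, bs, pr in zip(moved, base, predicted))
print(J, residual)
```

Each row of `J` is the gradient of one component function, which is one way to see the derivative matrix as the generalization of the gradient.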
See also
- Notational confusion of multivariable derivatives
- Relation between gradient vector and partial derivatives
- Relation between gradient vector and directional derivatives
- Directional derivative
- machinelearning:Summary table of probability terms
References
- Tao, Terence. Analysis II. 2nd ed. Hindustan Book Agency. 2009.
- Folland, Gerald B. Advanced Calculus. Pearson. 2002.
- Pugh, Charles Chapman. Real Mathematical Analysis. Springer. 2010.
External links
- this post does a similar thing: https://reallyeli.com/posts/total_derivative.html