Gradient vector

From Calculus
{{multivariable analogue of|derivative}}
==Definition at a point==


===Generic definition===

Suppose <math>f</math> is a function of many variables. We can view <math>f</math> as a function of a vector variable <math>\overline{x}</math>. The gradient vector of <math>f</math> at a particular point in the domain is a vector whose direction captures the direction (in the domain) along which changes to <math>f</math> are concentrated, and whose magnitude is the directional derivative of <math>f</math> in that direction.

If the gradient vector of <math>f</math> exists at a point, then we say that <math>f</math> is differentiable at that point.

===Formal epsilon-delta definition===
Suppose <math>f</math> is a function of a vector variable <math>\overline{x}</math>. Suppose <math>\overline{c}</math> is a point in the interior of the [[domain]] of <math>f</math>, i.e., <math>f</math> is defined in an open ball centered at <math>\overline{c}</math>. The gradient vector of <math>f</math> at <math>\overline{c}</math>, denoted <math>(\nabla f)(\overline{c})</math>, is a vector <math>\overline{v}</math> satisfying the following:


* For every <math>\varepsilon > 0</math>
* there exists <math>\delta > 0</math> such that
* for every <math>\overline{x}</math> satisfying <math>0 < |\overline{x} - \overline{c}| < \delta</math> (in other words, <math>\overline{x}</math> is in an open ball of radius <math>\delta</math> centered at <math>\overline{c}</math>, but not equal to <math>\overline{c}</math>)
* we have <math>|f(\overline{x}) - f(\overline{c}) - \overline{v} \cdot (\overline{x} - \overline{c})| < \varepsilon|\overline{x} - \overline{c}|</math>
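The bulleted definition can be sanity-checked numerically: when <math>\overline{v}</math> is the gradient vector, the error ratio <math>|f(\overline{x}) - f(\overline{c}) - \overline{v} \cdot (\overline{x} - \overline{c})|/|\overline{x} - \overline{c}|</math> tends to zero as <math>\overline{x} \to \overline{c}</math>. The sketch below is a hypothetical illustration (the function <math>f(x,y) = x^2 + 3y</math> and the point are not from this page):

```python
import math

# Hypothetical example: f(x, y) = x^2 + 3y, with gradient (2x, 3),
# so at c = (1, 2) the candidate vector is v = (2, 3).
def f(x, y):
    return x**2 + 3*y

c = (1.0, 2.0)
v = (2.0, 3.0)

# The error |f(x) - f(c) - v . (x - c)| should be o(|x - c|):
# the printed ratio shrinks in proportion to the distance r.
for r in [1.0, 0.1, 0.01, 0.001]:
    # a point at distance r from c, along the diagonal direction
    x = (c[0] + r / math.sqrt(2), c[1] + r / math.sqrt(2))
    dx = (x[0] - c[0], x[1] - c[1])
    err = abs(f(*x) - f(*c) - (v[0] * dx[0] + v[1] * dx[1]))
    print(r, err / r)
```

For this quadratic the error equals <math>r^2/2</math> exactly, so each ratio is half the distance; for a non-differentiable function, no choice of <math>\overline{v}</math> drives the ratio to zero.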


===Note on why the epsilon-delta definition is necessary===
Intuitively, we want to define the gradient vector analogously to the derivative of a function of one variable, i.e., as the limit of the difference quotient:
<math>\lim_{\overline{x} \to \overline{c}} \frac{f(\overline{x}) - f(\overline{c})}{\overline{x} - \overline{c}}</math>


Unfortunately, the above notation does not make direct sense, because it is not permissible to divide a scalar by a vector. To rectify this, we revisit what the <math>\varepsilon</math>-<math>\delta</math> definition of the derivative says. It turns out that the <math>\varepsilon</math>-<math>\delta</math> definition can more readily be generalized to functions of vector variables. The key insight is to use the [[dot product of vectors]].


<center>{{#widget:YouTube|id=XFda15ReMQ8}}</center>


==Definition as a function==
===Generic definition===

Suppose <math>f</math> is a function of many variables. We can view <math>f</math> as a function of a vector variable <math>\overline{x}</math>. The gradient vector of <math>f</math> is a vector-valued function <math>\nabla f</math> (with vector outputs in the same dimension as vector inputs) defined as follows: it sends every point to the gradient vector of the function at the point. Note that the domain of <math>\nabla f</math> is precisely the subset of the domain of <math>f</math> where the gradient vector is defined.
If the gradient vector of <math>f</math> exists at all points of the domain of <math>f</math>, we say that <math>f</math> is differentiable everywhere on its domain.


<center>{{#widget:YouTube|id=jyC5wEgBipg}}</center>


==Relation with directional derivatives and partial derivatives==


===Relation with directional derivatives===


{{further|[[Relation between gradient vector and directional derivatives]]}}
{| class="sortable" border="1"
! Version type !! Statement
|-
| at a point, in vector notation (multiple variables) || Suppose <math>f</math> is a function of a vector variable <math>\overline{x}</math>. Suppose <math>\overline{u}</math> is a unit vector and <math>\overline{c}</math> is a point in the domain of <math>f</math>. Suppose that the gradient vector of <math>f</math> at <math>\overline{c}</math> exists. We denote this gradient vector by <math>(\nabla f)(\overline{c})</math>. Then, we have the following relationship:<br><math>D_{\overline{u}}f(\overline{c}) = (\nabla f)(\overline{c}) \cdot \overline{u}</math><br>The right side here is the dot product of vectors.
|-
| generic point, in vector notation (multiple variables) || Suppose <math>f</math> is a function of a vector variable <math>\overline{x}</math>. Suppose <math>\overline{u}</math> is a unit vector. We then have:<br><math>D_{\overline{u}}f(\overline{x}) = (\nabla f)(\overline{x}) \cdot \overline{u}</math><br>The right side here is a dot product of vectors. The equality holds whenever the right side makes sense.
|-
| generic point, point-free notation (multiple variables) || Suppose <math>f</math> is a function of a vector variable <math>\overline{x}</math>. Suppose <math>\overline{u}</math> is a unit vector. We then have:<br><math>D_{\overline{u}}f = (\nabla f) \cdot \overline{u}</math><br>The right side here is a dot product of vector-valued functions (the constant function <math>\overline{u}</math> and the gradient vector of <math>f</math>). The equality holds whenever the right side makes sense.
|}
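The relation in the table above can be illustrated numerically: the difference quotient of <math>f</math> along a unit vector <math>\overline{u}</math> approaches the dot product of the gradient vector with <math>\overline{u}</math>. The function and point below are a hypothetical illustration, not from this page:

```python
# Hypothetical example: f(x, y) = x^2 * y, whose gradient is (2xy, x^2).
def f(x, y):
    return x**2 * y

def grad_f(x, y):
    return (2*x*y, x**2)

c = (1.0, 2.0)
u = (0.6, 0.8)  # a unit vector, since 0.6^2 + 0.8^2 = 1

# Directional derivative via a symmetric difference quotient along u ...
h = 1e-6
num = (f(c[0] + h*u[0], c[1] + h*u[1]) - f(c[0] - h*u[0], c[1] - h*u[1])) / (2*h)

# ... agrees with the dot product (grad f)(c) . u
g = grad_f(*c)
dot = g[0]*u[0] + g[1]*u[1]
print(num, dot)  # both close to 3.2
```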


<center>{{#widget:YouTube|id=TAeK0MVuzJU}}</center>
<center>{{#widget:YouTube|id=T8WGcg_T-DM}}</center>
===Relation with partial derivatives===


{{further|[[Relation between gradient vector and partial derivatives]]}}
{| class="sortable" border="1"
! Version type !! Statement
|-
| at a point, in multivariable notation || Suppose <math>f</math> is a real-valued function of <math>n</math> variables <math>x_1,x_2,\dots,x_n</math>. Suppose <math>(a_1,a_2,\dots,a_n)</math> is a point in the domain of <math>f</math> such that the [[gradient vector]] of <math>f</math> at <math>(a_1,a_2,\dots,a_n)</math>, denoted <math>(\nabla f)(a_1,a_2,\dots,a_n)</math>, exists. Then, the [[partial derivative]]s of <math>f</math> with respect to all variables exist, and the coordinates of the gradient vector are the partial derivatives. In other words:<br><math>(\nabla f)(a_1,a_2,\dots,a_n) = \langle f_{x_1}(a_1,a_2,\dots,a_n), f_{x_2}(a_1,a_2,\dots,a_n), \dots, f_{x_n}(a_1,a_2,\dots,a_n)\rangle </math>
|-
| generic point, in multivariable notation || Suppose <math>f</math> is a real-valued function of <math>n</math> variables <math>x_1,x_2,\dots,x_n</math>. Then, we have <br><math>(\nabla f)(x_1,x_2,\dots,x_n) = \langle f_{x_1}(x_1,x_2,\dots,x_n), f_{x_2}(x_1,x_2,\dots,x_n), \dots, f_{x_n}(x_1,x_2,\dots,x_n)\rangle </math>.<br>Equality holds [[concept of equality conditional to existence of one side|wherever the left side makes sense]].
|-
| generic point, point-free notation || Suppose <math>f</math> is a function of <math>n</math> variables <math>x_1,x_2,\dots,x_n</math>. Then, we have <br><math>\nabla f = \langle f_{x_1}, f_{x_2}, \dots, f_{x_n} \rangle</math>. Equality holds [[concept of equality conditional to existence of one side|wherever the left side makes sense]].
|}
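The statement that the coordinates of the gradient vector are the partial derivatives can be sketched with a central-difference approximation of each partial. The function below is a hypothetical example, not taken from this page:

```python
# Hypothetical example: f(x, y, z) = x*y + z^2, whose partials are (y, x, 2z).
def f(x, y, z):
    return x*y + z**2

def numerical_gradient(func, point, h=1e-6):
    """Approximate each partial derivative f_{x_i} by a central difference."""
    grad = []
    for i in range(len(point)):
        up = list(point); up[i] += h
        down = list(point); down[i] -= h
        grad.append((func(*up) - func(*down)) / (2*h))
    return grad

a = (2.0, 3.0, 1.0)
print(numerical_gradient(f, a))  # close to [3.0, 2.0, 2.0] = (y, x, 2z) at a
```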
 
<center>{{#widget:YouTube|id=84hww4kkoq8}}</center>
 
===Note on continuous partials===
 
{{further|[[Continuous partials implies differentiable]]}}
 
This says that if all the [[partial derivative]]s of a function are continuous at and around a point in the domain, then the function is in fact differentiable there, and hence the gradient vector exists and is given in terms of the partial derivatives as described above.
 
In particular, if all the partials exist and are continuous everywhere, the gradient vector exists everywhere and is given as described above.
 
Note that this is significant because, ''a priori'' (i.e., without checking continuity), knowledge of the partials tells us what the gradient vector should be if it exists, but it doesn't tell us whether the gradient vector does exist. Continuity helps bridge that knowledge gap.
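This gap can be seen concretely in the standard cautionary example <math>f(x,y) = xy/(x^2+y^2)</math> (with <math>f(0,0) = 0</math>): both partials at the origin exist and equal <math>0</math>, so the candidate gradient is <math>\langle 0, 0 \rangle</math>, yet <math>f</math> is not differentiable there. A sketch (this example is standard, but not from this page):

```python
import math

# f(x, y) = x*y / (x^2 + y^2) away from the origin, f(0, 0) = 0.
# f(h, 0) = f(0, h) = 0, so f_x(0,0) = f_y(0,0) = 0 and the candidate
# gradient is v = (0, 0). But f is identically 1/2 along the diagonal,
# so the epsilon-delta error ratio blows up instead of tending to 0.
def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x*y / (x**2 + y**2)

for r in [0.1, 0.01, 0.001]:
    x = (r / math.sqrt(2), r / math.sqrt(2))  # distance r from the origin
    err = abs(f(*x) - f(0.0, 0.0) - 0.0)      # v . (x - c) = 0 for v = (0, 0)
    print(r, err / r)  # the ratio grows like 1/(2r)
```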
 
==Graphical interpretation==
 
===For a function of two variables===
 
Suppose <math>f</math> is a function of two variables <math>x,y</math> and suppose <math>(x_0,y_0)</math> is a point in the domain. We say that <math>f</math> is differentiable at a point <math>(x_0,y_0)</math> if the [[gradient vector]] exists at the point. This is equivalent to the [[graph of a function of two variables|graph of the function]] having a well defined tangent plane at <math>(x_0,y_0,f(x_0,y_0))</math>. Further, the equation of this tangent plane is given by:
 
<math>z - f(x_0,y_0) = f_x(x_0,y_0)(x - x_0) + f_y(x_0,y_0)(y - y_0)</math>
 
Another way of putting this is:
 
<math>z - f(x_0,y_0) = (\nabla f)(x_0,y_0) \cdot (\langle x,y \rangle - \langle x_0,y_0\rangle)</math>
 
Note that it is possible that the partial derivatives both exist but the function is not differentiable. In this case, the surface does not have a well defined tangent plane at the point. Even though we can define a plane by the equation above, this is not the tangent plane, because the tangent plane does not exist.
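The tangent-plane equation can also be checked numerically: near the point of tangency, the plane matches the surface to second order in the offset. The function <math>f(x,y) = x^2 + y^2</math> below is a hypothetical illustration, not from this page:

```python
# Hypothetical example: f(x, y) = x^2 + y^2 at (x0, y0) = (1, 1).
# Here f_x = 2x and f_y = 2y, so the tangent plane at (1, 1, 2) is
# z = f(x0, y0) + f_x(x0, y0)*(x - x0) + f_y(x0, y0)*(y - y0).
def f(x, y):
    return x**2 + y**2

x0, y0 = 1.0, 1.0
fx, fy = 2*x0, 2*y0  # partial derivatives at (x0, y0)

def tangent_plane(x, y):
    return f(x0, y0) + fx*(x - x0) + fy*(y - y0)

# The gap between surface and plane shrinks quadratically with the offset d:
for d in [0.1, 0.01]:
    print(d, abs(f(x0 + d, y0 + d) - tangent_plane(x0 + d, y0 + d)))
```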


<center>{{#widget:YouTube|id=sICGShQ7CHE}}{{#widget:YouTube|id=hX5KhF_Xog0}}</center>
===For a function of multiple variables===


Suppose <math>f</math> is a function of multiple variables <math>x_1,x_2,\dots,x_n</math> and suppose <math>(a_1,a_2,\dots,a_n)</math> is a point in the domain of <math>f</math>. We say that <math>f</math> is differentiable at <math>(a_1,a_2,\dots,a_n)</math> if the [[gradient vector]] <math>(\nabla f)(a_1,a_2,\dots,a_n)</math> exists. This is equivalent to the [[graph of a function of multiple variables|graph of the function]] having a well defined tangent hyperplane at the point <math>(a_1,a_2,\dots,a_n,f(a_1,a_2,\dots,a_n))</math>. The equation of the tangent hyperplane is given by:


<math>x_{n+1} - f(a_1,a_2,\dots,a_n) = (\nabla f)(a_1,a_2,\dots,a_n) \cdot (\langle x_1,x_2,\dots,x_n\rangle - \langle a_1,a_2,\dots,a_n \rangle)</math>
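As in the two-variable case, the hyperplane equation can be checked numerically at a sample point. The three-variable function below is a hypothetical illustration, not from this page:

```python
# Hypothetical example: f(x1, x2, x3) = x1*x2 + x3, with gradient
# (x2, x1, 1), evaluated at a = (1, 2, 0).
def f(x1, x2, x3):
    return x1*x2 + x3

a = (1.0, 2.0, 0.0)
g = (a[1], a[0], 1.0)  # (grad f)(a)

# x4 = f(a) + (grad f)(a) . (x - a) defines the tangent hyperplane.
def hyperplane(x):
    return f(*a) + sum(gi * (xi - ai) for gi, xi, ai in zip(g, x, a))

x = (1.01, 2.02, 0.03)
print(f(*x), hyperplane(x))  # agree to first order in the offset from a
```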

Latest revision as of 22:08, 8 May 2016
