Partial derivative: Difference between revisions

From Calculus
 
(17 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{multivariable analogue of|derivative}}
==Definition at a point==
==Definition at a point==


===Generic definition===
===Generic definition===


Suppose <math>f</math> is a function of more than one variable, where <math>x</math> is one of the input variables to <math>f</math>. Fix a choice <math>x = x_0</math> and fix the values of all the other variables. The '''partial derivative''' of <math>f</math> with respect to <math>x</math>, denoted <math>\partial f/\partial x</math>, or <math>f_x</math>, is defined as the derivative at <math>x_0</math> of the function that sends <math>x</math> to <math>f</math> at <math>x</math> for the same fixed choice of the other input variables.
Suppose <math>f</math> is a function of more than one variable, where <math>x</math> is one of the input variables to <math>f</math>. Fix a choice <math>x = x_0</math> and fix the values of all the other variables. The '''partial derivative''' of <math>f</math> with respect to <math>x</math> ''at the point'', denoted <math>\partial f/\partial x</math>, or <math>f_x</math>, is defined as the derivative at <math>x_0</math> of the function that sends <math>x</math> to <math>f</math> at <math>x</math> for the same fixed choice of the other input variables.


<center>{{#widget:YouTube|id=wPEMBebMm64}}</center>
<center>{{#widget:YouTube|id=rsZc1SYuiig}}</center>


===For a function of two variables===
===For a function of two variables===
Line 23: Line 25:
|}
|}


<center>{{#widget:YouTube|id=vC89LB9mK1g}}{{#widget:YouTube|id=Tj4Nf6xWZh0}}</center>
<center>{{#widget:YouTube|id=P9ciiweXh_A}}{{#widget:YouTube|id=j6TthcrlvoU}}</center>


===For a function of multiple variables===
===For a function of multiple variables===
Line 43: Line 45:
|}
|}


<center>{{#widget:YouTube|id=dt-hRS9VNWs}}</center>
<center>{{#widget:YouTube|id=bWn-XXfMleE}}</center>


==Definition as a function==
==Definition as a function==
Line 70: Line 72:
|}
|}


<center>{{#widget:YouTube|id=fvyCAWWFyKA}}</center>
<center>{{#widget:YouTube|id=0zu9BstjjZA}}</center>


===For a function of multiple variables===
===For a function of multiple variables===
Line 86: Line 88:
|}
|}


==Related notions==
==Graphical interpretation==


* [[Derivative]]
===For a function of two variables at a point===
* [[Higher derivative]]
* [[Higher partial derivative]]


==Caveats==
Suppose <math>f</math> is a function of two variables <math>x,y</math> and <math>(x_0,y_0)</math> is a point in the domain of the function. Consider the graph of <math>f</math> in three-dimensional space, given by <math>z =f(x,y)</math>.


===Value of partial derivative depends on all inputs===
We have the following:


{{further|[[Value of partial derivative depends on all inputs]]}}
{| class="sortable" border="1"
! Partial derivative !! Graphical interpretation
|-
| The partial derivative <math>f_x(x_0,y_0)</math> at a point <math>(x_0,y_0)</math> in the domain of the function || The slope of the tangent line at <math>x=  x_0</math> to the restriction of the graph of <math>f</math> to the plane <math>y = y_0</math>.
|-
| The partial derivative <math>f_y(x_0,y_0)</math> at a point <math>(x_0,y_0)</math> in the domain of the function || The slope of the tangent line at <math>y = y_0</math> to the restriction of the graph of <math>f</math> to the plane <math>x = x_0</math>.
|}


The ''general'' expression for the partial derivative with respect to any of the inputs is an expression in terms of ''all'' the inputs. For instance, the ''general'' expression for <math>f_x(x,y)</math> is an expression involving both <math>x</math> and <math>y</math>. This is because, even though the <math>y</math>-coordinate is kept constant when calculating the partial derivative at a ''particular'' point, that constant value need not be the same at all the points.
<center>{{#widget:YouTube|id=yMu-ZIoGjTo}}</center>


For instance, consider:
===For a function of multiple variables at a point===


<math>f(x,y) := x^2 + y^2 + xy^2</math>
Suppose <math>f</math> is a function of <math>n</math> variables <math>x_1,x_2,\dots,x_n</math> and suppose <math>(a_1,a_2,\dots,a_n)</math> is a point in the domain of <math>f</math>. Consider the graph of <math>f</math> in <math>\R^{n+1}</math> given by:


Then, we have:
<math>x_{n+1} = f(x_1,x_2,\dots,x_n)</math>


<math>f_x(x,y) = 2x + y^2</math>
For any <math>i \in \{ 1,2,\dots,n\}</math>, we define the partial derivative <math>f_{x_i}(a_1,a_2,\dots,a_n)</math>, also denoted <math>f_i(a_1,a_2,\dots,a_n)</math>, as follows:


and:
* First, consider the intersection of the graph of <math>f</math> with the plane given by the set of <math>n - 1</math> equations <math>x_j = a_j</math> for all <math>j \ne i</math>. This is a plane parallel to the <math>x_ix_{n+1}</math>-plane.
* In this plane, consider the slope of the tangent line at <math>x_i =a_i</math>. This is the value of the partial derivative.


<math>f_y(x,y) = 2y + 2xy</math>
==Related notions==


Note that ''each'' of the expressions involves ''both'' the variables <math>x</math> and <math>y</math>. In particular, this means that the ''value'' of <math>f_x</math> at a point depends on ''both'' the <math>x</math>-coordinate and the <math>y</math>-coordinate of the point. Thus, for instance, <math>f_x(2,3) = 13</math> and <math>f_x(2,4) = 20</math> are ''distinct'' because of the different <math>y</math>-values.
* [[Derivative]]
* [[Higher derivative]]
* [[Higher partial derivative]]
 
==Domain considerations==
 
As already noted in the definition of partial derivative, the domain of the partial derivative of a function with respect to a variable is a subset of the domain of the function. However, we can actually say a little more.


In fact, the only cases where the partial derivative with respect to one variable depends only on that variable is where the function is additively separable in terms of a function purely of that variable and a function of the other variables.
===For a function of two variables===


<center>{{#widget:YouTube|id=XYydvau-qIQ}}</center>
Suppose <math>f</math> is a function of two variables <math>x,y</math>. Then, a ''necessary'' condition for us to make sense of the partial derivative <math>f_x</math> at a point <math>(x_0,y_0)</math> is that <math>f</math> be defined on a small open interval about the point <math>x_0</math> on the line <math>y = y_0</math>. Note that it is ''not necessary'' that <math>f</math> actually be defined in an open ball surrounding the point <math>(x_0,y_0)</math> -- the only thing that matters is that <math>f</math> be defined under slight perturbations of <math>x</math>, holding <math>y</math> constant.


===Meaning of partial derivative depends on all variables===
Similar remarks apply to <math>f_y</math>: a ''necessary'' condition for us to make sense of the partial derivative <math>f_y</math> at a point <math>(x_0,y_0)</math> is that <math>f</math> be defined on a small open interval about the point <math>y_0</math> on the line <math>x = x_0</math>.


{{further|[[Meaning of partial derivative depends on entire coordinate system]]}}
Consider, for instance, a function defined on the set <math>[0,1] \times [0,1]</math>, i.e., the set <math>\{ (x,y) \mid 0 \le x \le 1, 0 \le y \le 1 \}</math>. It makes sense to try computing the partial derivative <math>f_x</math> at all points in the subset <math>(0,1) \times [0,1]</math>, i.e., all points whose <math>x</math>-coordinate is ''strictly'' between <math>0</math> and <math>1</math>, but the <math>y</math>-coordinate is allowed to take the extreme values 0 and 1. Similarly, it makes sense to try computing the partial derivative <math>f_y</math> at all points in the subset <math>[0,1] \times (0,1)</math>, i.e., all points whose <math>y</math>-coordinate is ''strictly'' between <math>0</math> and <math>1</math>, but the <math>x</math>-coordinate is allowed to take the extreme values 0 and 1.
 
Note that the above only refers to the points at which it makes sense to ''try'' computing the partial derivative. It may still turn out that the partial derivative does not exist at many of these points.


This is a very subtle but very important point. It says that the ''partial derivative'' with respect to one variable depends not only on the choice of that particular variable, but on the choice of the other variables that are being kept constant for the purpose of computing the partial derivative. If a coordinate transformation is performed that changes what those ''other variables'' are, that could affect the value of the partial derivative.
<center>{{#widget:YouTube|id=9Q9OrXye748}}</center>


This has a very real-world corollary. In economics and social science, we often talk of the partial derivative with respect to one variable as measuring what happens ''ceteris paribus'' on the other variables. However, the notion of ''ceteris paribus'' on other variables depends on what the other variables are. If we redefine the coordinate system to change that meaning, the partial derivative can change.
==Caveats==


For instance, consider the function:
===Value of partial derivative depends on all inputs===


<math>u = f(x,y) := 2x + 3y</math>
{{further|[[Value of partial derivative depends on all inputs]]}}


In this case, we have:
{{#lst:value of partial derivative depends on all inputs|example 1}}


<math>\frac{\partial u}{\partial x} = f_x(x,y) = 2</math>
===Meaning of partial derivative depends on entire coordinate system===


Now, suppose we consider <math>u</math> in terms of <math>x</math> and <math>v = x + y</math>. Then, <math>f(x,y)</math>, as a function of <math>x</math> and <math>v</math>, is <math>3v - x</math>. Thus, we get:
{{further|[[Meaning of partial derivative depends on entire coordinate system]]}}


<math>\frac{\partial u}{\partial x} = -1</math>
This is a very subtle but very important point. It says that the ''partial derivative'' with respect to one variable depends not only on the choice of that particular variable, but on the choice of the other variables that are being kept constant for the purpose of computing the partial derivative. If a coordinate transformation is performed that changes what those ''other variables'' are, that could affect the value of the partial derivative.


Note that the two partial derivatives with respect to <math>x</math> are not equal. The reason for this is that in the first case, we are taking the partial derivative with respect to <math>x</math> keeping <math>y</math> constant, whereas in the second case, we are taking the partial derivative with respect to <math>x</math> keeping <math>v = x + y</math> constant.
This has a very real-world corollary. In economics and social science, we often talk of the partial derivative with respect to one variable as measuring what happens ''ceteris paribus'' on the other variables. However, the notion of ''ceteris paribus'' on other variables depends on what the other variables are. If we redefine the coordinate system to change that meaning, the partial derivative can change.


<center>{{#widget:YouTube|id=dTHMasMwolI}}</center>
{{#lst:meaning of partial derivative depends on entire coordinate system|example 1}}

Latest revision as of 16:58, 25 January 2013

This article describes an analogue for functions of multiple variables of the following term/fact/notion for functions of one variable: derivative

Definition at a point

Generic definition

Suppose <math>f</math> is a function of more than one variable, where <math>x</math> is one of the input variables to <math>f</math>. Fix a choice <math>x = x_0</math> and fix the values of all the other variables. The partial derivative of <math>f</math> with respect to <math>x</math> at the point, denoted <math>\partial f/\partial x</math>, or <math>f_x</math>, is defined as the derivative at <math>x_0</math> of the function that sends <math>x</math> to <math>f</math> at <math>x</math> for the same fixed choice of the other input variables.


For a function of two variables

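Suppose <math>f</math> is a real-valued function of two variables <math>x,y</math>, i.e., the domain of <math>f</math> is a subset of <math>\R^2</math>. Suppose <math>(x_0,y_0)</math> is a point in the domain of <math>f</math>, i.e., it's the point with <math>x = x_0</math> and <math>y = y_0</math> (here, <math>x_0,y_0</math> are actual numerical values). We define the partial derivatives at <math>(x_0,y_0)</math> as follows: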

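{| class="sortable" border="1"
! Item !! For partial derivative with respect to <math>x</math> !! For partial derivative with respect to <math>y</math>
|-
| Notation || <math>\left. \frac{\partial}{\partial x}f(x,y) \right\vert_{(x,y) = (x_0,y_0)}</math><br>Also denoted <math>f_x(x_0,y_0)</math> or <math>f_1(x_0,y_0)</math> || <math>\left. \frac{\partial}{\partial y}f(x,y) \right\vert_{(x,y) = (x_0,y_0)}</math><br>Also denoted <math>f_y(x_0,y_0)</math> or <math>f_2(x_0,y_0)</math>
|-
| Definition as derivative || <math>\left. \frac{d}{dx}f(x,y_0) \right\vert_{x = x_0}</math>. In other words, it is the derivative (at <math>x = x_0</math>) of the function <math>x \mapsto f(x,y_0)</math> || <math>\left. \frac{d}{dy}f(x_0,y) \right\vert_{y = y_0}</math>. In other words, it is the derivative (at <math>y = y_0</math>) of the function <math>y \mapsto f(x_0,y)</math>.
|-
| Definition as limit (using derivative as limit of difference quotient) || <math>\lim_{x \to x_0} \frac{f(x,y_0) - f(x_0,y_0)}{x - x_0}</math><br><math>\lim_{h \to 0} \frac{f(x_0 + h,y_0) - f(x_0,y_0)}{h}</math> || <math>\lim_{y \to y_0} \frac{f(x_0,y) - f(x_0,y_0)}{y - y_0}</math><br><math>\lim_{h \to 0} \frac{f(x_0,y_0 + h) - f(x_0,y_0)}{h}</math>
|-
| Definition as directional derivative || Directional derivative at <math>(x_0,y_0)</math> with respect to a unit vector in the positive <math>x</math>-direction. || Directional derivative at <math>(x_0,y_0)</math> with respect to a unit vector in the positive <math>y</math>-direction.
|}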
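
The limit definitions in the table can be approximated numerically by evaluating the difference quotient at a small but nonzero h. The following is a minimal sketch, not part of the original article: the helper names `partial_x` and `partial_y`, the step size, and the sample function are all illustrative assumptions, and a finite h only approximates the limit.

```python
def partial_x(f, x0, y0, h=1e-6):
    """Approximate f_x(x0, y0) by (f(x0 + h, y0) - f(x0, y0)) / h, holding y = y0 fixed."""
    return (f(x0 + h, y0) - f(x0, y0)) / h


def partial_y(f, x0, y0, h=1e-6):
    """Approximate f_y(x0, y0) by (f(x0, y0 + h) - f(x0, y0)) / h, holding x = x0 fixed."""
    return (f(x0, y0 + h) - f(x0, y0)) / h


def example(x, y):
    # Hypothetical sample function chosen for illustration: x^2 y + y^3
    return x * x * y + y ** 3


# Exact values at (1, 2): f_x = 2xy = 4 and f_y = x^2 + 3y^2 = 13
print(partial_x(example, 1.0, 2.0))  # close to 4
print(partial_y(example, 1.0, 2.0))  # close to 13
```

Only one input is perturbed in each helper, which mirrors the "fix the values of all the other variables" step in the definition.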

For a function of multiple variables

The notation here gets a little messy, so read it carefully. We consider a function <math>f</math> of <math>n</math> variables, which we generically denote <math>(x_1,x_2,\dots,x_n)</math> respectively. Consider a point <math>(a_1,a_2,\dots,a_n)</math> in the domain of the function. In other words, this is a point where <math>x_1 = a_1, x_2 = a_2, \dots, x_n = a_n</math>.

Suppose <math>i</math> is a natural number in the set <math>\{ 1,2,3,\dots,n \}</math>.

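{| class="sortable" border="1"
! Item !! Value for partial derivative with respect to <math>x_i</math>
|-
| Notation || <math>\left. \frac{\partial}{\partial x_i}f(x_1,x_2,\dots,x_n) \right\vert_{(x_1,x_2,\dots,x_n) = (a_1,a_2,\dots,a_n)}</math><br>Also denoted <math>f_{x_i}(a_1,a_2,\dots,a_n)</math> or <math>f_i(a_1,a_2,\dots,a_n)</math>
|-
| Definition as derivative || <math>\left. \frac{d}{dx_i}f(a_1,a_2,\dots,a_{i-1},x_i,a_{i+1},\dots,a_n) \right\vert_{x_i = a_i}</math>. In other words, it is the derivative of the function <math>x_i \mapsto f(a_1,a_2,\dots,a_{i-1},x_i,a_{i+1},\dots,a_n)</math> with respect to <math>x_i</math>, evaluated at the point <math>x_i = a_i</math>.
|-
| Definition as a limit (using derivative as limit of difference quotient) || <math>\lim_{x_i \to a_i} \frac{f(a_1,a_2,\dots,a_{i-1},x_i,a_{i+1},\dots,a_n) - f(a_1,a_2,\dots,a_n)}{x_i - a_i}</math>
|-
| Definition as a directional derivative || Directional derivative in the positive <math>x_i</math>-direction.
|}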
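
The same difference quotient carries over to n variables: perturb only the i-th coordinate and leave the rest at their fixed values. The sketch below is an illustration under stated assumptions, not the article's own method; the helper `partial`, its 1-based index argument, and the sample function are hypothetical names introduced here.

```python
def partial(f, point, i, h=1e-6):
    """Approximate the partial derivative of f with respect to x_i at `point`.

    `i` is 1-based to match the article's indexing x_1, ..., x_n.
    Only the i-th coordinate is shifted by h; all others stay fixed.
    """
    shifted = list(point)
    shifted[i - 1] += h
    return (f(*shifted) - f(*point)) / h


def sample(x1, x2, x3):
    # Hypothetical sample function: x1*x2 + x3^2
    return x1 * x2 + x3 ** 2


a = (1.0, 2.0, 3.0)
# Exact partials at a: f_1 = x2 = 2, f_2 = x1 = 1, f_3 = 2*x3 = 6
approximations = [partial(sample, a, i) for i in (1, 2, 3)]
print(approximations)
```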

Definition as a function

Generic definition

Suppose <math>f</math> is a function of more than one variable, where <math>x</math> is one of the input variables to <math>f</math>. The partial derivative of <math>f</math> with respect to <math>x</math>, denoted <math>\partial f/\partial x</math>, or <math>f_x</math>, is defined as the function that sends points in the domain of <math>f</math> (including values of all the variables) to the partial derivative with respect to <math>x</math> of <math>f</math> (i.e., the derivative treating the other inputs as constants for the computation of the derivative). In particular, the domain of the partial derivative of <math>f</math> with respect to <math>x</math> is a subset of the domain of <math>f</math>.

We can compute the partial derivative of f relative to each of the inputs to f.

MORE ON THE WAY THIS DEFINITION OR FACT IS PRESENTED: We first present the version that deals with a specific point (typically with a <math>{}_0</math> subscript) in the domain of the relevant functions, and then discuss the version that deals with a point that is free to move in the domain, by dropping the subscript. Why do we do this?
The purpose of the specific point version is to emphasize that the point is fixed for the duration of the definition, i.e., it does not move around while we are defining the construct or applying the fact. However, the definition or fact applies not just for a single point but for all points satisfying certain criteria, and thus we can get further interesting perspectives on it by varying the point we are considering. This is the purpose of the second, generic point version.

For a function of two variables

Suppose <math>f</math> is a real-valued function of two variables <math>x,y</math>, i.e., the domain of <math>f</math> is a subset of <math>\R^2</math>. The partial derivatives of <math>f</math> with respect to <math>x</math> and <math>y</math> are both functions of two variables, and the domain of each is a subset of the domain of <math>f</math>.

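{| class="sortable" border="1"
! Item !! For partial derivative with respect to <math>x</math> !! For partial derivative with respect to <math>y</math>
|-
| Notation || <math>\frac{\partial}{\partial x}f(x,y)</math><br>Also denoted <math>f_x(x,y)</math> or <math>f_1(x,y)</math> || <math>\frac{\partial}{\partial y}f(x,y)</math><br>Also denoted <math>f_y(x,y)</math> or <math>f_2(x,y)</math>
|-
| Definition as derivative || It is the derivative of the function <math>x \mapsto f(x,y)</math>, treating <math>y</math> as an unknown constant || It is the derivative of the function <math>y \mapsto f(x,y)</math>, treating <math>x</math> as an unknown constant
|-
| Definition as limit (using derivative as limit of difference quotient) || <math>\lim_{h \to 0} \frac{f(x + h,y) - f(x,y)}{h}</math> || <math>\lim_{h \to 0} \frac{f(x,y + h) - f(x,y)}{h}</math>
|-
| Definition as directional derivative || Directional derivative with respect to a unit vector in the positive <math>x</math>-direction. || Directional derivative with respect to a unit vector in the positive <math>y</math>-direction.
|}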

For a function of multiple variables

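{| class="sortable" border="1"
! Item !! Value for partial derivative with respect to <math>x_i</math>
|-
| Notation || <math>\frac{\partial}{\partial x_i}f(x_1,x_2,\dots,x_n)</math><br>Also denoted <math>f_{x_i}(x_1,x_2,\dots,x_n)</math> or <math>f_i(x_1,x_2,\dots,x_n)</math>
|-
| Definition as derivative || It is the derivative of the function <math>x_i \mapsto f(x_1,x_2,\dots,x_{i-1},x_i,x_{i+1},\dots,x_n)</math> with respect to <math>x_i</math>, where all the other variables are treated as unknown constants while doing the differentiation.
|-
| Definition as a limit (using derivative as limit of difference quotient) || <math>\lim_{h \to 0} \frac{f(x_1,x_2,\dots,x_{i-1},x_i + h,x_{i+1},\dots,x_n) - f(x_1,x_2,\dots,x_n)}{h}</math>
|-
| Definition as a directional derivative || Directional derivative in the positive <math>x_i</math>-direction.
|}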

Graphical interpretation

For a function of two variables at a point

Suppose <math>f</math> is a function of two variables <math>x,y</math> and <math>(x_0,y_0)</math> is a point in the domain of the function. Consider the graph of <math>f</math> in three-dimensional space, given by <math>z = f(x,y)</math>.

We have the following:

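{| class="sortable" border="1"
! Partial derivative !! Graphical interpretation
|-
| The partial derivative <math>f_x(x_0,y_0)</math> at a point <math>(x_0,y_0)</math> in the domain of the function || The slope of the tangent line at <math>x = x_0</math> to the restriction of the graph of <math>f</math> to the plane <math>y = y_0</math>.
|-
| The partial derivative <math>f_y(x_0,y_0)</math> at a point <math>(x_0,y_0)</math> in the domain of the function || The slope of the tangent line at <math>y = y_0</math> to the restriction of the graph of <math>f</math> to the plane <math>x = x_0</math>.
|}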

For a function of multiple variables at a point

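Suppose <math>f</math> is a function of <math>n</math> variables <math>x_1,x_2,\dots,x_n</math> and suppose <math>(a_1,a_2,\dots,a_n)</math> is a point in the domain of <math>f</math>. Consider the graph of <math>f</math> in <math>\R^{n+1}</math> given by:

<math>x_{n+1} = f(x_1,x_2,\dots,x_n)</math>

For any <math>i \in \{ 1,2,\dots,n \}</math>, we define the partial derivative <math>f_{x_i}(a_1,a_2,\dots,a_n)</math>, also denoted <math>f_i(a_1,a_2,\dots,a_n)</math>, as follows:

  • First, consider the intersection of the graph of <math>f</math> with the plane given by the set of <math>n - 1</math> equations <math>x_j = a_j</math> for all <math>j \ne i</math>. This is a plane parallel to the <math>x_ix_{n+1}</math>-plane.
  • In this plane, consider the slope of the tangent line at <math>x_i = a_i</math>. This is the value of the partial derivative.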

Related notions

  • Derivative
  • Higher derivative
  • Higher partial derivative

Domain considerations

As already noted in the definition of partial derivative, the domain of the partial derivative of a function with respect to a variable is a subset of the domain of the function. However, we can actually say a little more.

For a function of two variables

Suppose <math>f</math> is a function of two variables <math>x,y</math>. Then, a necessary condition for us to make sense of the partial derivative <math>f_x</math> at a point <math>(x_0,y_0)</math> is that <math>f</math> be defined on a small open interval about the point <math>x_0</math> on the line <math>y = y_0</math>. Note that it is not necessary that <math>f</math> actually be defined in an open ball surrounding the point <math>(x_0,y_0)</math> -- the only thing that matters is that <math>f</math> be defined under slight perturbations of <math>x</math>, holding <math>y</math> constant.

Similar remarks apply to <math>f_y</math>: a necessary condition for us to make sense of the partial derivative <math>f_y</math> at a point <math>(x_0,y_0)</math> is that <math>f</math> be defined on a small open interval about the point <math>y_0</math> on the line <math>x = x_0</math>.

Consider, for instance, a function defined on the set <math>[0,1] \times [0,1]</math>, i.e., the set <math>\{ (x,y) \mid 0 \le x \le 1, 0 \le y \le 1 \}</math>. It makes sense to try computing the partial derivative <math>f_x</math> at all points in the subset <math>(0,1) \times [0,1]</math>, i.e., all points whose <math>x</math>-coordinate is strictly between <math>0</math> and <math>1</math>, but the <math>y</math>-coordinate is allowed to take the extreme values 0 and 1. Similarly, it makes sense to try computing the partial derivative <math>f_y</math> at all points in the subset <math>[0,1] \times (0,1)</math>, i.e., all points whose <math>y</math>-coordinate is strictly between <math>0</math> and <math>1</math>, but the <math>x</math>-coordinate is allowed to take the extreme values 0 and 1.

Note that the above only refers to the points at which it makes sense to try computing the partial derivative. It may still turn out that the partial derivative does not exist at many of these points.
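
Both halves of this remark can be seen on a small example. The sketch below assumes a hypothetical function on the closed unit square, f(x, y) = |x - 1/2|, chosen here purely for illustration; the helper names and the choice to raise `ValueError` outside the domain are likewise assumptions, not anything from the article.

```python
def f(x, y):
    """Hypothetical function defined only on the closed unit square [0,1] x [0,1]."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("point outside the domain [0,1] x [0,1]")
    return abs(x - 0.5)


def x_quotient(x0, y0, h):
    # Difference quotient in the x-direction, holding y = y0 fixed
    return (f(x0 + h, y0) - f(x0, y0)) / h


def y_quotient(x0, y0, h):
    # Difference quotient in the y-direction, holding x = x0 fixed
    return (f(x0, y0 + h) - f(x0, y0)) / h


# (0.5, 0.0) lies in (0,1) x [0,1], so x can be perturbed in both directions.
right = x_quotient(0.5, 0.0, 1e-6)
left = x_quotient(0.5, 0.0, -1e-6)
print(right, left)  # roughly 1 and -1: the one-sided quotients disagree

# The same point has y = 0, an extreme value, so y cannot be perturbed downward.
try:
    y_quotient(0.5, 0.0, -1e-6)
    y_defined_below = True
except ValueError:
    y_defined_below = False
print(y_defined_below)
```

At this point it makes sense to try computing the partial derivative with respect to x, yet the disagreeing one-sided quotients suggest it does not exist there, while the partial derivative with respect to y cannot even be attempted because f is undefined for y slightly below 0.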


Caveats

Value of partial derivative depends on all inputs

For further information, refer to: Value of partial derivative depends on all inputs


For instance, consider:

<math>f(x,y) := x^2 + y^2 + xy^2</math>

Then, we have:

<math>f_x(x,y) = 2x + y^2</math>

and:

<math>f_y(x,y) = 2y + 2xy</math>

Note that each of the expressions involves both the variables <math>x</math> and <math>y</math>. In particular, this means that the value of <math>f_x</math> at a point depends on both the <math>x</math>-coordinate and the <math>y</math>-coordinate of the point. Thus, for instance:

<math>f_x(2,3) = 2(2) + 3^2 = 4 + 9 = 13</math>

<math>f_x(2,4) = 2(2) + 4^2 = 4 + 16 = 20</math>

Despite the same <math>x</math>-value of 2 in both cases, the <math>f_x</math>-values are different because of differences in the input <math>y</math>-values.

Similarly, consider:

<math>f_y(1,4) = 2(4) + 2(1)(4) = 8 + 8 = 16</math>

<math>f_y(2,4) = 2(4) + 2(2)(4) = 8 + 16 = 24</math>

Despite the same <math>y</math>-value of 4 in both cases, the <math>f_y</math>-values are different because of differences in the input <math>x</math>-values.
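
The four values above can be reproduced directly, and cross-checked against a difference quotient. This is a minimal sketch: the function and its two partial derivative formulas are taken from the example, while the helper `quotient_x` and its step size are assumptions added for the numerical check.

```python
def f(x, y):
    # The example function: x^2 + y^2 + x y^2
    return x**2 + y**2 + x * y**2


def f_x(x, y):
    # Partial derivative with respect to x, treating y as a constant
    return 2 * x + y**2


def f_y(x, y):
    # Partial derivative with respect to y, treating x as a constant
    return 2 * y + 2 * x * y


def quotient_x(x, y, h=1e-6):
    # Hypothetical helper: difference quotient approximating f_x(x, y)
    return (f(x + h, y) - f(x, y)) / h


print(f_x(2, 3), f_x(2, 4))  # 13 20
print(f_y(1, 4), f_y(2, 4))  # 16 24
print(quotient_x(2.0, 3.0), quotient_x(2.0, 4.0))  # close to 13 and 20
```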



Meaning of partial derivative depends on entire coordinate system

For further information, refer to: Meaning of partial derivative depends on entire coordinate system

This is a very subtle but very important point. It says that the partial derivative with respect to one variable depends not only on the choice of that particular variable, but on the choice of the other variables that are being kept constant for the purpose of computing the partial derivative. If a coordinate transformation is performed that changes what those other variables are, that could affect the value of the partial derivative.

This has a very real-world corollary. In economics and social science, we often talk of the partial derivative with respect to one variable as measuring what happens ceteris paribus on the other variables. However, the notion of ceteris paribus on other variables depends on what the other variables are. If we redefine the coordinate system to change that meaning, the partial derivative can change.


Consider the function:

<math>u = f(x,y) := 2x + 3y</math>

In this case, we have:

<math>\frac{\partial u}{\partial x} = f_x(x,y) = 2</math>

Now, suppose we consider <math>u</math> in terms of <math>x</math> and <math>v = x + y</math>. Then, we have <math>y = v - x</math>. Rewriting <math>u</math> in terms of <math>x</math> and <math>v</math>, we get:

<math>u = 2x + 3y = 2x + 3(v - x) = 2x + 3v - 3x = 3v - x</math>

In other words, we can define <math>u</math> as a function of two variables <math>x</math> and <math>v</math>. If we use the letter <math>g</math> to denote this new function, we get:

<math>u = g(x,v) := 3v - x</math>

In this case, we have:

<math>\frac{\partial u}{\partial x} = g_x(x,v) = -1</math>

Note that the two partial derivatives with respect to <math>x</math> are not equal. The reason for this is that in the first case, we are taking the partial derivative with respect to <math>x</math> keeping <math>y</math> constant, whereas in the second case, we are taking the partial derivative with respect to <math>x</math> keeping <math>v = x + y</math> constant. In the second case, when we increase <math>x</math> slightly, the value of <math>y</math> decreases to keep the total <math>x + y</math> constant.

Here's the geometric interpretation:

  • In the first case, where we are computing <math>f_x(x,y)</math>, we are geometrically computing the directional derivative along the positive <math>x</math>-direction, i.e., along a line with constant <math>y</math>-coordinate.
  • In the second case, where we are computing <math>g_x(x,v)</math>, we are geometrically computing (up to scalar multiples) the directional derivative along lines with <math>x + y</math> constant. These lines are downward sloping with a slope of <math>-1</math>.
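
The two answers can be checked numerically by perturbing x while holding a different second variable fixed. In this sketch, f and g are the functions from the example; the helper `partial_first`, the sample point, and the step size are assumptions introduced for illustration.

```python
def f(x, y):
    # u as a function of x and y
    return 2 * x + 3 * y


def g(x, v):
    # The same quantity u, written as a function of x and v = x + y
    return 3 * v - x


def partial_first(func, a, b, h=1e-6):
    # Hypothetical helper: difference quotient in the first argument,
    # holding the second argument fixed
    return (func(a + h, b) - func(a, b)) / h


x, y = 1.0, 2.0
v = x + y

same_value = f(x, y) == g(x, v)          # both describe the same u at this point
du_dx_y_fixed = partial_first(f, x, y)   # hold y constant
du_dx_v_fixed = partial_first(g, x, v)   # hold v = x + y constant

print(same_value)
print(du_dx_y_fixed)  # close to 2
print(du_dx_v_fixed)  # close to -1
```

Both calls perturb the same variable x at the same point; the results differ only because of what is being held constant.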