Partial derivative

From Calculus
This article describes the analogue, for functions of multiple variables, of the following notion for functions of one variable: derivative

Definition at a point

Generic definition

Suppose f is a function of more than one variable, where x is one of the input variables to f. Fix a value x = x_0 and fix the values of all the other variables; together these determine a point in the domain of f. The partial derivative of f with respect to x at this point, denoted \partial f/\partial x or f_x, is defined as the derivative at x_0 of the function that sends x to the value of f, with all the other input variables held at their fixed values.

For a function of two variables

Suppose f is a real-valued function of two variables x,y, i.e., the domain of f is a subset of \R^2. Suppose (x_0,y_0) is a point in the domain of f, i.e., it's the point with x = x_0 and y = y_0 (here, x_0,y_0 are actual numerical values). We define the partial derivatives at (x_0,y_0) as follows:

For the partial derivative with respect to x:

  • Notation: \frac{\partial f(x,y)}{\partial x}|_{(x,y) = (x_0,y_0)}. Also denoted f_x(x_0,y_0) or f_1(x_0,y_0).
  • Definition as derivative: \frac{d}{dx}f(x,y_0)|_{x = x_0}. In other words, it is the derivative (at x = x_0) of the function x \mapsto f(x,y_0).
  • Definition as limit (using derivative as limit of difference quotient): \lim_{x \to x_0} \frac{f(x,y_0) - f(x_0,y_0)}{x - x_0}. Equivalently, \lim_{h \to 0} \frac{f(x_0 + h,y_0) - f(x_0,y_0)}{h}.
  • Definition as directional derivative: Directional derivative at (x_0,y_0) with respect to a unit vector in the positive x-direction.

For the partial derivative with respect to y:

  • Notation: \frac{\partial f(x,y)}{\partial y}|_{(x,y) = (x_0,y_0)}. Also denoted f_y(x_0,y_0) or f_2(x_0,y_0).
  • Definition as derivative: \frac{d}{dy}f(x_0,y)|_{y = y_0}. In other words, it is the derivative (at y = y_0) of the function y \mapsto f(x_0,y).
  • Definition as limit (using derivative as limit of difference quotient): \lim_{y \to y_0} \frac{f(x_0,y) - f(x_0,y_0)}{y - y_0}. Equivalently, \lim_{h \to 0} \frac{f(x_0,y_0 + h) - f(x_0,y_0)}{h}.
  • Definition as directional derivative: Directional derivative at (x_0,y_0) with respect to a unit vector in the positive y-direction.
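
As a rough numerical illustration of the limit definition, the sketch below approximates both partial derivatives at a fixed point by evaluating the difference quotient for a small h. The sample function f(x, y) = x^2 y + y^3, the helper names partial_x and partial_y, and the step size are assumptions chosen for illustration; they are not part of the definition.

```python
def f(x, y):
    # Sample function chosen for illustration (an assumption, not from the article).
    return x**2 * y + y**3


def partial_x(f, x0, y0, h=1e-6):
    # Difference quotient in x, with y held fixed at y0.
    return (f(x0 + h, y0) - f(x0, y0)) / h


def partial_y(f, x0, y0, h=1e-6):
    # Difference quotient in y, with x held fixed at x0.
    return (f(x0, y0 + h) - f(x0, y0)) / h


# Exact values at (1, 2): f_x = 2xy = 4 and f_y = x^2 + 3y^2 = 13.
approx_fx = partial_x(f, 1.0, 2.0)
approx_fy = partial_y(f, 1.0, 2.0)
print(approx_fx, approx_fy)
```

The printed values should be close to, but not exactly equal to, 4 and 13, because a fixed small h only approximates the limit.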

For a function of multiple variables

The notation here gets a little messy, so read it carefully. We consider a function f of n variables, which we generically denote x_1,x_2,\dots,x_n. Consider a point (a_1,a_2,\dots,a_n) in the domain of the function. In other words, this is the point where x_1 = a_1, x_2 = a_2, \dots, x_n = a_n.

Suppose i is a natural number in the set \{ 1,2,3,\dots,n \}.

For the partial derivative with respect to x_i:

  • Notation: \frac{\partial}{\partial x_i}f(x_1,x_2,\dots,x_n)|_{(x_1,x_2,\dots,x_n) = (a_1,a_2,\dots,a_n)}. Also denoted f_{x_i}(a_1,a_2,\dots,a_n) or f_i(a_1,a_2,\dots,a_n).
  • Definition as derivative: \frac{d}{dx_i}f(a_1,a_2,\dots,a_{i-1},x_i,a_{i+1}, \dots,a_n)|_{x_i = a_i}. In other words, it is the derivative of the function x_i \mapsto f(a_1,a_2,\dots,a_{i-1},x_i,a_{i+1},\dots,a_n) with respect to x_i, evaluated at the point x_i = a_i.
  • Definition as a limit (using derivative as limit of difference quotient): \lim_{x_i \to a_i} \frac{f(a_1,a_2,\dots,a_{i-1},x_i,a_{i+1},\dots,a_n) - f(a_1,a_2,\dots,a_n)}{x_i - a_i}
  • Definition as a directional derivative: Directional derivative in the positive x_i-direction.
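
The indexing in the n-variable case can be easier to follow in code. The sketch below perturbs only the i-th coordinate of a point and forms the difference quotient. The helper name partial_at, the 1-based index convention, and the sample function g are hypothetical choices made for illustration.

```python
def partial_at(f, point, i, h=1e-6):
    """Approximate the partial derivative of f with respect to x_i at `point`.

    `i` is 1-based to match the indexing x_1, ..., x_n (an illustrative choice).
    """
    shifted = list(point)
    shifted[i - 1] += h  # perturb only the i-th coordinate; all others stay fixed
    return (f(*shifted) - f(*point)) / h


def g(x1, x2, x3):
    # Hypothetical sample function of three variables.
    return x1 * x2 + x2 * x3**2


a = (1.0, 2.0, 3.0)
# Exact partials at a: g_1 = x2 = 2, g_2 = x1 + x3^2 = 10, g_3 = 2*x2*x3 = 12.
approximations = [partial_at(g, a, i) for i in (1, 2, 3)]
print(approximations)
```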

Definition as a function

Generic definition

Suppose f is a function of more than one variable, where x is one of the input variables to f. The partial derivative of f with respect to x, denoted \partial f/\partial x, or f_x is defined as the function that sends points in the domain of f (including values of all the variables) to the partial derivative with respect to x of f (i.e., the derivative treating the other inputs as constants for the computation of the derivative). In particular, the domain of the partial derivative of f with respect to x is a subset of the domain of f.

We can compute the partial derivative of f relative to each of the inputs to f.

A note on the way this definition is presented: we first present the version that deals with a specific point in the domain of the relevant functions (typically marked with a 0 subscript, as in x_0), and then discuss the version that deals with a point that is free to move in the domain, by dropping the subscript. Why do we do this?
The purpose of the specific point version is to emphasize that the point is fixed for the duration of the definition, i.e., it does not move around while we are defining the construct or applying the fact. However, the definition or fact applies not just for a single point but for all points satisfying certain criteria, and thus we can get further interesting perspectives on it by varying the point we are considering. This is the purpose of the second, generic point version.

For a function of two variables

Suppose f is a real-valued function of two variables x,y, i.e., the domain of f is a subset of \R^2. The partial derivatives of f with respect to x and y are both functions of two variables each of which has domain a subset of the domain of f.

For the partial derivative with respect to x:

  • Notation: \frac{\partial f(x,y)}{\partial x}. Also denoted f_x(x,y) or f_1(x,y).
  • Definition as derivative: It is the derivative of the function x \mapsto f(x,y), treating y as an unknown constant.
  • Definition as limit (using derivative as limit of difference quotient): \lim_{h \to 0} \frac{f(x + h,y) - f(x,y)}{h}
  • Definition as directional derivative: Directional derivative with respect to a unit vector in the positive x-direction.

For the partial derivative with respect to y:

  • Notation: \frac{\partial f(x,y)}{\partial y}. Also denoted f_y(x,y) or f_2(x,y).
  • Definition as derivative: It is the derivative of the function y \mapsto f(x,y), treating x as an unknown constant.
  • Definition as limit (using derivative as limit of difference quotient): \lim_{h \to 0} \frac{f(x,y + h) - f(x,y)}{h}
  • Definition as directional derivative: Directional derivative with respect to a unit vector in the positive y-direction.

For a function of multiple variables

For the partial derivative with respect to x_i:

  • Notation: \frac{\partial}{\partial x_i}f(x_1,x_2,\dots,x_n). Also denoted f_{x_i}(x_1,x_2,\dots,x_n) or f_i(x_1,x_2,\dots,x_n).
  • Definition as derivative: It is the derivative of the function x_i \mapsto f(x_1,x_2,\dots,x_{i-1},x_i,x_{i+1},\dots,x_n) with respect to x_i, where all the other variables are treated as unknown constants while doing the differentiation.
  • Definition as a limit (using derivative as limit of difference quotient): \lim_{h \to 0} \frac{f(x_1,x_2,\dots,x_{i-1},x_i + h, x_{i+1},\dots,x_n) - f(x_1,x_2,\dots,x_n)}{h}
  • Definition as a directional derivative: Directional derivative in the positive x_i-direction.
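
To illustrate the shift from "partial derivative at a point" to "partial derivative as a function", the sketch below builds a new function from f that can be evaluated at any point where the difference quotient makes sense. The name partial_derivative, the 1-based index, the fixed step h, and the sample function are assumptions for illustration; the result is a numerical approximation, not the exact partial derivative.

```python
def partial_derivative(f, i, h=1e-6):
    """Return a function approximating the i-th partial derivative of f (1-based i)."""
    def df(*point):
        shifted = list(point)
        shifted[i - 1] += h  # vary only the i-th input
        return (f(*shifted) - f(*point)) / h
    return df


def f(x, y):
    # Hypothetical sample function; exactly, f_x = 2xy and f_y = x^2.
    return x**2 * y


f_x = partial_derivative(f, 1)  # a function of (x, y)
f_y = partial_derivative(f, 2)  # also a function of (x, y)

# The same function f_x can now be evaluated at different points.
print(f_x(1.0, 2.0), f_x(3.0, -1.0), f_y(3.0, -1.0))
```

Note that f_x and f_y each take the same inputs as f, matching the statement that the partial derivative is itself a function on a subset of the domain of f.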

Graphical interpretation

For a function of two variables at a point

Suppose f is a function of two variables x,y and (x_0,y_0) is a point in the domain of the function. Consider the graph of f in three-dimensional space, given by z =f(x,y).

We have the following:

  • The partial derivative f_x(x_0,y_0) at a point (x_0,y_0) in the domain of the function: the slope of the tangent line at x = x_0 to the restriction of the graph of f to the plane y = y_0.
  • The partial derivative f_y(x_0,y_0) at a point (x_0,y_0) in the domain of the function: the slope of the tangent line at y = y_0 to the restriction of the graph of f to the plane x = x_0.

For a function of multiple variables at a point

Suppose f is a function of n variables x_1,x_2,\dots,x_n and suppose (a_1,a_2,\dots,a_n) is a point in the domain of f. Consider the graph of f in \R^{n+1} given by:

x_{n+1} = f(x_1,x_2,\dots,x_n)

For any i \in \{ 1,2,\dots,n\}, we can interpret the partial derivative f_{x_i}(a_1,a_2,\dots,a_n), also denoted f_i(a_1,a_2,\dots,a_n), graphically as follows:

  • First, consider the intersection of the graph of f with the plane given by the set of n - 1 equations x_j = a_j for all j \ne i. This is a plane parallel to the x_ix_{n+1}-plane.
  • In this plane, consider the slope of the tangent line at x_i =a_i. This is the value of the partial derivative.

Related notions

Domain considerations

As already noted in the definition of partial derivative, the domain of the partial derivative of a function with respect to a variable is a subset of the domain of the function. However, we can actually say a little more.

For a function of two variables

Suppose f is a function of two variables x,y. Then, a necessary condition for us to make sense of the partial derivative f_x at a point (x_0,y_0) is that f be defined on a small open interval about the point x_0 on the line y = y_0. Note that it is not necessary that f actually be defined in an open ball surrounding the point (x_0,y_0) -- the only thing that matters is that f be defined under slight perturbations of x, holding y constant.

Similar remarks apply to f_y: a necessary condition for us to make sense of the partial derivative f_y at a point (x_0,y_0) is that f be defined on a small open interval about the point y_0 on the line x = x_0.

Consider, for instance, a function defined on the set [0,1] \times [0,1], i.e., the set \{ (x,y) \mid 0 \le x \le 1, 0 \le y \le 1 \}. It makes sense to try computing the partial derivative f_x at all points in the subset (0,1) \times [0,1], i.e., all points whose x-coordinate is strictly between 0 and 1, but the y-coordinate is allowed to take the extreme values 0 and 1. Similarly, it makes sense to try computing the partial derivative f_y at all points in the subset [0,1] \times (0,1), i.e., all points whose y-coordinate is strictly between 0 and 1, but the x-coordinate is allowed to take the extreme values 0 and 1.

Note that the above only refers to the points at which it makes sense to try computing the partial derivative. It may still turn out that the partial derivative does not exist at many of these points.
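
The unit-square example can be sketched in code. Below, a hypothetical function f is defined only on [0,1] \times [0,1] and raises an error elsewhere. The helpers can_attempt_fx and can_attempt_fy (names assumed for illustration) test whether f is defined slightly to either side of the point along the relevant line. Checking a single small step h is only a crude stand-in for "defined on a small open interval", and passing the check says nothing about whether the partial derivative actually exists.

```python
def f(x, y):
    # Hypothetical function defined only on the closed unit square [0,1] x [0,1].
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("point outside the domain of f")
    return x * y + x**2


def can_attempt_fx(f, x0, y0, h=1e-6):
    """Check that f is defined just left and right of x0 on the line y = y0."""
    try:
        f(x0 - h, y0)
        f(x0 + h, y0)
        return True
    except ValueError:
        return False


def can_attempt_fy(f, x0, y0, h=1e-6):
    """Check that f is defined just below and above y0 on the line x = x0."""
    try:
        f(x0, y0 - h)
        f(x0, y0 + h)
        return True
    except ValueError:
        return False


print(can_attempt_fx(f, 0.5, 0.0))  # x interior, y on the boundary
print(can_attempt_fx(f, 0.0, 0.5))  # x on the boundary
print(can_attempt_fy(f, 0.0, 0.5))  # y interior, x on the boundary
print(can_attempt_fy(f, 0.5, 1.0))  # y on the boundary
```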

Caveats

Value of partial derivative depends on all inputs

For further information, refer: Value of partial derivative depends on all inputs


For instance, consider:

f(x,y) := x^2 + y^2 + xy^2

Then, we have:

f_x(x,y) = 2x + y^2

and:

f_y(x,y) = 2y + 2xy

Note that each of the expressions involves both the variables x and y. In particular, this means that the value of f_x at a point depends on both the x-coordinate and the y-coordinate of the point. Thus, for instance:

f_x(2,3) = 2(2) + 3^2 = 4 + 9 = 13

f_x(2,4) = 2(2) + 4^2 = 4 + 16 = 20

Despite the same x-value of 2 in both cases, the f_x-values are different because of differences in the input y-values.

Similarly, consider:

f_y(1,4) = 2(4) + 2(1)(4) = 8 + 8 = 16

f_y(2,4) = 2(4) + 2(2)(4) = 8 + 16 = 24

Despite the same y-value of 4 in both cases, the f_y-values are different because of differences in the input x-values.
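
The arithmetic above can be double-checked with a short script. It encodes the formulas for f_x and f_y from this example and, as an independent cross-check, compares f_x with a central difference quotient of f. The helper name central_fx and the step size are assumptions for illustration.

```python
def f(x, y):
    return x**2 + y**2 + x * y**2


def f_x(x, y):
    # Partial derivative of f with respect to x.
    return 2 * x + y**2


def f_y(x, y):
    # Partial derivative of f with respect to y.
    return 2 * y + 2 * x * y


def central_fx(x, y, h=1e-5):
    # Central difference quotient in x, used only as a numerical cross-check.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


print(f_x(2, 3), f_x(2, 4))  # 13 20
print(f_y(1, 4), f_y(2, 4))  # 16 24
print(central_fx(2.0, 3.0), central_fx(2.0, 4.0))  # approximately 13 and 20
```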


Meaning of partial derivative depends on entire coordinate system

For further information, refer: Meaning of partial derivative depends on entire coordinate system

This is a very subtle but very important point. It says that the partial derivative with respect to one variable depends not only on the choice of that particular variable, but on the choice of the other variables that are being kept constant for the purpose of computing the partial derivative. If a coordinate transformation is performed that changes what those other variables are, that could affect the value of the partial derivative.

This has a very real-world corollary. In economics and social science, we often talk of the partial derivative with respect to one variable as measuring what happens ceteris paribus on the other variables. However, the notion of ceteris paribus on other variables depends on what the other variables are. If we redefine the coordinate system to change that meaning, the partial derivative can change.


Consider the function:

u = f(x,y) := 2x + 3y

In this case, we have:

\frac{\partial u}{\partial x} = f_x(x,y) = 2

Now, suppose we consider f in terms of x and v = x + y. Then, we have y = v - x. Rewriting u in terms of x and v, we get:

u = 2x + 3y = 2x + 3(v - x) = 2x + 3v - 3x = 3v - x

In other words, we can define u as a function of two variables x and v. If we use the letter g to denote this new function, we get:

u = g(x,v) := 3v - x

In this case, we have:

\frac{\partial u}{\partial x} = g_x(x,v) = -1

Note that the two partial derivatives with respect to x are not equal. The reason for this is that in the first case, we are taking the partial derivative with respect to x keeping y constant, whereas in the second case, we are taking the partial derivative with respect to x keeping v = x + y constant. In the second case, when we increase x slightly, the value of y decreases by the same amount to keep the total v constant. Increasing x by h therefore changes u = 2x + 3y by 2h - 3h = -h, which is why the partial derivative is -1 rather than 2.

Here's the geometric interpretation:

  • In the first case, where we are computing f_x(x,y), we are geometrically computing the directional derivative along the positive x-direction, i.e., along a line with constant y-coordinate.
  • In the second case, where we are computing g_x(x,v), we are geometrically computing (up to scalar multiples) the directional derivative along lines with x + y constant. These lines are downward sloping with a slope of -1.
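
A small numerical check of this example: the sketch below evaluates the same quantity u through f(x, y) and through g(x, v), then forms a difference quotient in the first argument of each. The helper name partial_first, the sample point, and the step size are assumptions for illustration.

```python
def f(x, y):
    return 2 * x + 3 * y


def g(x, v):
    # The same quantity u, expressed in terms of x and v = x + y, so y = v - x.
    return f(x, v - x)


def partial_first(func, a, b, h=1e-6):
    # Difference quotient in the first argument, holding the second argument fixed.
    return (func(a + h, b) - func(a, b)) / h


x0, y0 = 1.0, 5.0
v0 = x0 + y0  # the same point, described in (x, v) coordinates

du_dx_y_fixed = partial_first(f, x0, y0)  # holds y constant: close to 2
du_dx_v_fixed = partial_first(g, x0, v0)  # holds v = x + y constant: close to -1
print(f(x0, y0), g(x0, v0))
print(du_dx_y_fixed, du_dx_v_fixed)
```

Both functions return the same value of u at the point, yet the two difference quotients in x differ, because each holds a different second variable fixed.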