Hessian matrix: Difference between revisions

From Calculus
Revision as of 01:18, 24 April 2012

Definition at a point

For a function of two variables at a point

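Suppose <math>f</math> is a real-valued function of two variables <math>x,y</math> and <math>(x_0,y_0)</math> is a point in the domain of <math>f</math>. Suppose all four second-order partial derivatives exist at <math>(x_0,y_0)</math>, i.e., the two pure second-order partials <math>f_{xx}(x_0,y_0), f_{yy}(x_0,y_0)</math> exist, and so do the two second-order mixed partial derivatives <math>f_{xy}(x_0,y_0)</math> and <math>f_{yx}(x_0,y_0)</math>. Then, the Hessian matrix of <math>f</math> at <math>(x_0,y_0)</math>, denoted <math>H(f)(x_0,y_0)</math>, is a <math>2 \times 2</math> matrix of real numbers defined as follows:

<math>H(f)(x_0,y_0) = \begin{pmatrix} f_{xx}(x_0,y_0) & f_{xy}(x_0,y_0) \\ f_{yx}(x_0,y_0) & f_{yy}(x_0,y_0) \end{pmatrix}</math>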
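As a quick illustration (the function and the point here are chosen for this example and are not part of the original text), take <math>f(x,y) = x^3y^2</math> and the point <math>(1,2)</math>. The first-order partials are <math>f_x = 3x^2y^2</math> and <math>f_y = 2x^3y</math>, so the second-order partials are <math>f_{xx} = 6xy^2</math>, <math>f_{xy} = f_{yx} = 6x^2y</math>, and <math>f_{yy} = 2x^3</math>. Evaluating each of these at <math>(1,2)</math> gives:

<math>H(f)(1,2) = \begin{pmatrix} 24 & 12 \\ 12 & 2 \end{pmatrix}</math>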

For a function of multiple variables at a point

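Suppose <math>f</math> is a real-valued function of multiple variables <math>(x_1,x_2,\dots,x_n)</math>. Suppose <math>(a_1,a_2,\dots,a_n)</math> is a point in the domain of <math>f</math>. In other words, <math>a_1,a_2,\dots,a_n</math> are real numbers and the point has coordinates <math>x_1 = a_1, x_2 = a_2, \dots,x_n = a_n</math>. Suppose, further, that all the second-order partials (pure and mixed) of <math>f</math> with respect to these variables exist at the point <math>(a_1,a_2,\dots,a_n)</math>. Then, the Hessian matrix of <math>f</math> at <math>(a_1,a_2,\dots,a_n)</math>, denoted <math>H(f)(a_1,a_2,\dots,a_n)</math>, is an <math>n \times n</math> matrix of real numbers defined as follows: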

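The <math>(ij)^{th}</math> entry (i.e., the entry in the <math>i^{th}</math> row and <math>j^{th}</math> column) is <math>f_{x_ix_j}(a_1,a_2,\dots,a_n)</math>. This is the same as <math>\frac{\partial^2}{\partial x_j \partial x_i}f(x_1,x_2,\dots,x_n)|_{(x_1,x_2,\dots,x_n) = (a_1,a_2,\dots,a_n)}</math>. Note that the two notations list the variables in opposite orders because their conventions differ: the subscript notation is read left to right (differentiate with respect to <math>x_i</math> first, then <math>x_j</math>), whereas the Leibniz notation is read right to left.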

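The matrix looks like this:

<math>\begin{pmatrix} f_{x_1x_1}(a_1,a_2,\dots,a_n) & f_{x_1x_2}(a_1,a_2,\dots,a_n) & \dots & f_{x_1x_n}(a_1,a_2,\dots,a_n) \\ f_{x_2x_1}(a_1,a_2,\dots,a_n) & f_{x_2x_2}(a_1,a_2,\dots,a_n) & \dots & f_{x_2x_n}(a_1,a_2,\dots,a_n) \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1}(a_1,a_2,\dots,a_n) & f_{x_nx_2}(a_1,a_2,\dots,a_n) & \dots & f_{x_nx_n}(a_1,a_2,\dots,a_n) \end{pmatrix}</math>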
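The entry-by-entry definition above can be approximated numerically. The following is a minimal sketch, not a definitive implementation: it uses central finite differences in place of exact differentiation, and the names `numerical_hessian`, `example_f`, and the step size `h` are assumptions introduced here for illustration.

```python
from typing import Callable, List, Sequence


def numerical_hessian(
    f: Callable[[Sequence[float]], float],
    a: Sequence[float],
    h: float = 1e-4,
) -> List[List[float]]:
    """Approximate the Hessian of f at the point a.

    Entry (i, j) approximates the second-order partial of f with respect
    to x_i and x_j, using a central difference with step h (an assumed,
    hand-picked value).
    """
    n = len(a)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):

            def shifted(di: int, dj: int) -> float:
                # Move coordinate i by di*h and coordinate j by dj*h.
                # When i == j both shifts land on the same coordinate.
                p = list(a)
                p[i] += di * h
                p[j] += dj * h
                return f(p)

            hessian[i][j] = (
                shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)
            ) / (4 * h * h)
    return hessian


def example_f(p: Sequence[float]) -> float:
    # Hypothetical test function: f(x, y, z) = x^2 * y + y * z^3.
    x, y, z = p
    return x**2 * y + y * z**3


if __name__ == "__main__":
    # The exact Hessian at (1, 2, 3), computed by hand, is
    # [[4, 2, 0], [2, 0, 27], [0, 27, 36]]; the printed values
    # should be close to it, up to finite-difference error.
    for row in numerical_hessian(example_f, [1.0, 2.0, 3.0]):
        print(row)
```

A finite-difference approximation is only a sanity check: it says nothing about whether the second-order partials actually exist at the point, which the definition requires.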

Definition as a function

For a function of two variables

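Suppose <math>f</math> is a real-valued function of two variables <math>x,y</math>. The Hessian matrix of <math>f</math>, denoted <math>H(f)</math>, is a <math>2 \times 2</math> matrix-valued function that sends each point to the Hessian matrix at that point, if that matrix is defined. It is defined as:

<math>H(f)(x,y) = \begin{pmatrix} f_{xx}(x,y) & f_{xy}(x,y) \\ f_{yx}(x,y) & f_{yy}(x,y) \end{pmatrix}</math>

In the point-free notation, we can write this as:

<math>H(f) = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}</math>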
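The difference between the point-free form and the value at a point can be sketched in code: the point-free Hessian is a matrix whose entries are functions, and evaluating at a point applies each entry to that point. This is only an illustration. The function <math>f(x,y) = x^3y^2</math>, its hand-computed partials, and the names `hessian_entries` and `evaluate` are assumptions made for this example.

```python
from typing import Callable, List

Entry = Callable[[float, float], float]

# Second-order partials of the hypothetical function f(x, y) = x^3 * y^2,
# worked out by hand.
def f_xx(x: float, y: float) -> float:
    return 6 * x * y**2

def f_xy(x: float, y: float) -> float:
    return 6 * x**2 * y

def f_yx(x: float, y: float) -> float:
    return 6 * x**2 * y

def f_yy(x: float, y: float) -> float:
    return 2 * x**3

# Point-free form: a 2 x 2 matrix whose entries are functions.
hessian_entries: List[List[Entry]] = [
    [f_xx, f_xy],
    [f_yx, f_yy],
]

def evaluate(entries: List[List[Entry]], x: float, y: float) -> List[List[float]]:
    """Apply every entry to the point (x, y), giving a matrix of numbers."""
    return [[entry(x, y) for entry in row] for row in entries]

if __name__ == "__main__":
    print(evaluate(hessian_entries, 1, 2))  # prints [[24, 12], [12, 2]]
```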

For a function of multiple variables

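Suppose <math>f</math> is a function of variables <math>x_1,x_2,\dots,x_n</math>. The Hessian matrix of <math>f</math>, denoted <math>H(f)</math>, is an <math>n \times n</math> matrix-valued function that sends each point to the Hessian matrix at that point, if the matrix is defined. It is defined as:

<math>H(f)(x_1,x_2,\dots,x_n) = \begin{pmatrix} f_{x_1x_1}(x_1,x_2,\dots,x_n) & f_{x_1x_2}(x_1,x_2,\dots,x_n) & \dots & f_{x_1x_n}(x_1,x_2,\dots,x_n) \\ f_{x_2x_1}(x_1,x_2,\dots,x_n) & f_{x_2x_2}(x_1,x_2,\dots,x_n) & \dots & f_{x_2x_n}(x_1,x_2,\dots,x_n) \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1}(x_1,x_2,\dots,x_n) & f_{x_nx_2}(x_1,x_2,\dots,x_n) & \dots & f_{x_nx_n}(x_1,x_2,\dots,x_n) \end{pmatrix}</math>

In the point-free notation, we can write it as:

<math>H(f) = \begin{pmatrix} f_{x_1x_1} & f_{x_1x_2} & \dots & f_{x_1x_n} \\ f_{x_2x_1} & f_{x_2x_2} & \dots & f_{x_2x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1} & f_{x_nx_2} & \dots & f_{x_nx_n} \end{pmatrix}</math>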

Under continuity assumptions

If we assume that all the second-order partials of <math>f</math> are continuous functions everywhere, then the following hold:

  • The Hessian matrix of <math>f</math> at any point is a symmetric matrix, i.e., its <math>(ij)^{th}</math> entry equals its <math>(ji)^{th}</math> entry. This follows from Clairaut's theorem on equality of mixed partials.
  • We can think of the Hessian matrix as describing the second derivative of <math>f</math>, just as the gradient vector describes its first derivative.
  • <math>f</math> is twice differentiable as a function. Hence, the Hessian matrix of <math>f</math> is the same as the Jacobian matrix of the gradient vector <math>\nabla f</math>, where the latter is viewed as a vector-valued function.

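Note that the final conclusion, i.e., that the Hessian matrix is the Jacobian matrix of the gradient vector, actually only requires the existence of the gradient vector and of its first-order partial derivatives, hence it holds even if the second-order partials are not continuous.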
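The relationship between the Hessian and the Jacobian of the gradient can be checked numerically. The sketch below is an illustration under stated assumptions rather than a proof: it takes the hand-computed gradient of the hypothetical function <math>f(x,y) = x^3y^2</math>, differentiates it by central differences, and compares the result with the hand-computed Hessian. The names `grad_f`, `numerical_jacobian`, and the step size `h` are introduced here and are not from the original text.

```python
from typing import Callable, List, Sequence


def grad_f(p: Sequence[float]) -> List[float]:
    # Gradient of the hypothetical function f(x, y) = x^3 * y^2,
    # computed by hand: (f_x, f_y) = (3 x^2 y^2, 2 x^3 y).
    x, y = p
    return [3 * x**2 * y**2, 2 * x**3 * y]


def numerical_jacobian(
    g: Callable[[Sequence[float]], List[float]],
    a: Sequence[float],
    h: float = 1e-6,
) -> List[List[float]]:
    """Approximate the Jacobian of the vector-valued function g at a.

    Entry (i, j) approximates the partial derivative of the i-th
    component of g with respect to x_j, by a central difference.
    """
    n = len(a)
    m = len(g(a))
    jacobian = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(a)
        minus = list(a)
        plus[j] += h
        minus[j] -= h
        g_plus, g_minus = g(plus), g(minus)
        for i in range(m):
            jacobian[i][j] = (g_plus[i] - g_minus[i]) / (2 * h)
    return jacobian


if __name__ == "__main__":
    # With g = grad_f, entry (i, j) approximates f_{x_i x_j}, so the
    # result should be close to the hand-computed Hessian at (1, 2),
    # namely [[24, 12], [12, 2]], and close to symmetric.
    for row in numerical_jacobian(grad_f, [1.0, 2.0]):
        print(row)
```

Since the second-order partials of this polynomial are continuous, the off-diagonal entries agree up to finite-difference error, as Clairaut's theorem predicts.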