Hessian matrix: Difference between revisions

From Calculus
Definition at a point

For a function of two variables at a point


Suppose <math>f</math> is a real-valued function of two variables <math>x,y</math> and <math>(x_0,y_0)</math> is a point in the domain of <math>f</math>. Suppose all the four second-order partial derivatives exist at <math>(x_0,y_0)</math>, i.e., the two pure second-order partials <math>f_{xx}(x_0,y_0),f_{yy}(x_0,y_0)</math> exist, and so do the two [[second-order mixed partial derivative]]s <math>f_{xy}(x_0,y_0)</math> and <math>f_{yx}(x_0,y_0)</math>. Then, the Hessian matrix of <math>f</math> at <math>(x_0,y_0)</math>, denoted <math>H(f)(x_0,y_0)</math>, is a <math>2 \times 2</math> matrix of real numbers defined as follows:


<math>\begin{pmatrix} f_{xx}(x_0,y_0) & f_{xy}(x_0,y_0) \\ f_{yx}(x_0,y_0) & f_{yy}(x_0,y_0) \end{pmatrix}</math>
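To make the definition concrete, here is a small Python sketch (standard library only; the sample function f(x,y) = x^3 + x*y^2 and the point (1, 2) are illustrative choices, not from the article) that approximates the four second-order partials at a point by central finite differences:

```python
# Approximate the 2x2 Hessian of f(x, y) = x^3 + x*y^2 (illustrative
# example function) at (x0, y0) using central finite differences.
# Exact second-order partials: f_xx = 6x, f_xy = f_yx = 2y, f_yy = 2x.

def f(x, y):
    return x**3 + x * y**2

def hessian_2d(f, x0, y0, h=1e-4):
    """Return [[f_xx, f_xy], [f_yx, f_yy]] at (x0, y0), approximately."""
    f_xx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
    f_yy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
    # Four-point stencil for the mixed partial; this stencil returns the
    # same value for f_xy and f_yx by construction.
    f_xy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
            - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)
    return [[f_xx, f_xy], [f_xy, f_yy]]

H = hessian_2d(f, 1.0, 2.0)  # exact Hessian at (1, 2) is [[6, 4], [4, 2]]
```
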

Revision as of 00:59, 24 April 2012


For a function of multiple variables at a point

Suppose <math>f</math> is a real-valued function of multiple variables <math>x_1,x_2,\dots,x_n</math>. Suppose <math>(a_1,a_2,\dots,a_n)</math> is a point in the domain of <math>f</math>. In other words, <math>a_1,a_2,\dots,a_n</math> are real numbers and the point has coordinates <math>x_1 = a_1, x_2 = a_2, \dots, x_n = a_n</math>. Suppose, further, that all the second-order partials (pure and mixed) of <math>f</math> with respect to these variables exist at the point <math>(a_1,a_2,\dots,a_n)</math>. Then, the Hessian matrix of <math>f</math> at <math>(a_1,a_2,\dots,a_n)</math>, denoted <math>H(f)(a_1,a_2,\dots,a_n)</math>, is an <math>n \times n</math> matrix of real numbers defined as follows:

The <math>(ij)^{th}</math> entry (i.e., the entry in the <math>i^{th}</math> row and <math>j^{th}</math> column) is <math>f_{x_ix_j}(a_1,a_2,\dots,a_n)</math>. This is the same as <math>\frac{\partial^2 f}{\partial x_j \partial x_i}(a_1,a_2,\dots,a_n)</math>. Note that in the two notations, the order in which we write the partials differs because the convention differs (left-to-right versus right-to-left).

The matrix looks like this:

<math>\begin{pmatrix} f_{x_1x_1} & f_{x_1x_2} & \dots & f_{x_1x_n} \\ f_{x_2x_1} & f_{x_2x_2} & \dots & f_{x_2x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1} & f_{x_nx_2} & \dots & f_{x_nx_n} \end{pmatrix}</math>

where each entry is evaluated at the point <math>(a_1,a_2,\dots,a_n)</math>.
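The general definition can also be sketched numerically. A minimal Python version (standard library only; the test function and point are illustrative choices, not from the article), where entry (i, j) approximates the second-order partial with respect to the i-th and j-th variables:

```python
# Approximate the n x n Hessian of a function f, taking a list of n
# floats, at a given point, by central finite differences.
# Entry (i, j) approximates f_{x_i x_j} evaluated at the point.

def hessian(f, point, h=1e-4):
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                q = list(point)
                q[i] += di * h
                q[j] += dj * h
                return f(q)
            if i == j:
                # Pure second-order partial: three-point stencil.
                H[i][j] = (shifted(1, 0) - 2 * f(list(point))
                           + shifted(-1, 0)) / h**2
            else:
                # Mixed partial: four-point stencil.
                H[i][j] = (shifted(1, 1) - shifted(1, -1)
                           - shifted(-1, 1) + shifted(-1, -1)) / (4 * h**2)
    return H

# Illustrative function of three variables: f = x1^2*x2 + x2*x3, whose
# exact Hessian at (1, 2, 3) is [[4, 2, 0], [2, 0, 1], [0, 1, 0]].
g = lambda v: v[0]**2 * v[1] + v[1] * v[2]
H = hessian(g, [1.0, 2.0, 3.0])
```
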

Definition as a function

For a function of two variables

Suppose <math>f</math> is a real-valued function of two variables <math>x,y</math>. The Hessian matrix of <math>f</math>, denoted <math>H(f)</math>, is a matrix-valued function that sends each point <math>(x_0,y_0)</math> to the Hessian matrix of <math>f</math> at that point, if that matrix is defined. It is defined as:

<math>H(f)(x,y) = \begin{pmatrix} f_{xx}(x,y) & f_{xy}(x,y) \\ f_{yx}(x,y) & f_{yy}(x,y) \end{pmatrix}</math>

In the point-free notation, we can write this as:

<math>H(f) = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}</math>

For a function of multiple variables

Suppose <math>f</math> is a function of <math>n</math> variables <math>x_1,x_2,\dots,x_n</math>. The Hessian matrix of <math>f</math>, denoted <math>H(f)</math>, is a matrix-valued function that sends each point to the Hessian matrix of <math>f</math> at that point, if the matrix is defined. It is defined as:

<math>H(f)(x_1,x_2,\dots,x_n) = \begin{pmatrix} f_{x_1x_1}(x_1,\dots,x_n) & \dots & f_{x_1x_n}(x_1,\dots,x_n) \\ \vdots & \ddots & \vdots \\ f_{x_nx_1}(x_1,\dots,x_n) & \dots & f_{x_nx_n}(x_1,\dots,x_n) \end{pmatrix}</math>

In the point-free notation, we can write it as:

<math>H(f) = \begin{pmatrix} f_{x_1x_1} & f_{x_1x_2} & \dots & f_{x_1x_n} \\ f_{x_2x_1} & f_{x_2x_2} & \dots & f_{x_2x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1} & f_{x_nx_2} & \dots & f_{x_nx_n} \end{pmatrix}</math>

Under continuity assumptions

If we assume that all the second-order partials of <math>f</math> are continuous functions everywhere, then the following happens:

  • The Hessian matrix of <math>f</math> at any point is a symmetric matrix, i.e., its <math>(ij)^{th}</math> entry equals its <math>(ji)^{th}</math> entry. This follows from Clairaut's theorem on equality of mixed partials.
  • We can think of the Hessian matrix as the second derivative of the function, i.e., it is a matrix describing the second derivative.
  • <math>f</math> is twice differentiable as a function. Hence, the Hessian matrix of <math>f</math> is the same as the Jacobian matrix of the gradient vector <math>\nabla f</math>, where the latter is viewed as a vector-valued function.

Note that the final conclusion actually only requires the existence of the gradient vector, hence it holds even if the second-order partials are not continuous.
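The Hessian-as-Jacobian-of-the-gradient view can be illustrated numerically: approximate the gradient by central differences, differentiate each gradient component again, and compare with the exact Hessian. A small Python sketch (standard library only; the sample function f(x, y) = x^3 + x*y^2 is an illustrative choice, not from the article):

```python
# Numerically: the Hessian of f equals the Jacobian of the gradient of f.
# Illustrative sample function f(x, y) = x^3 + x*y^2, whose exact
# Hessian at (1, 2) is [[6, 4], [4, 2]].

def grad(f, point, h=1e-5):
    """Central-difference gradient of f at point (a list of floats)."""
    g = []
    for i in range(len(point)):
        up, dn = list(point), list(point)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

def jacobian_of_gradient(f, point, h=1e-4):
    """Entry (i, j) approximates d(df/dx_i)/dx_j, i.e. f_{x_i x_j}."""
    n = len(point)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, dn = list(point), list(point)
        up[j] += h
        dn[j] -= h
        gu, gd = grad(f, up), grad(f, dn)
        for i in range(n):
            J[i][j] = (gu[i] - gd[i]) / (2 * h)
    return J

f = lambda v: v[0]**3 + v[0] * v[1]**2
J = jacobian_of_gradient(f, [1.0, 2.0])
# J approximates [[6, 4], [4, 2]] and is (nearly) symmetric.
```
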