This article describes an analogue for functions of multiple variables of the following term/fact/notion for functions of one variable: second derivative
Definition
Definition in terms of Jacobian matrix and gradient vector
Suppose $f$ is a real-valued function of $n$ variables $x_1, x_2, \ldots, x_n$. The '''Hessian matrix''' of $f$ is an $n \times n$ matrix-valued function with domain a subset of the domain of $f$, defined as follows: the Hessian matrix at any point in the domain is the Jacobian matrix of the gradient vector of $f$ at the point. In point-free notation, we denote by $H(f)$ the Hessian matrix function, and we define it as:

$$H(f) = J(\nabla f)$$
Interpretation as second derivative
The Hessian matrix function is the correct notion of second derivative for a real-valued function of $n$ variables. Here's why:
- The correct notion of first derivative for a scalar-valued function of multiple variables is the gradient vector, so the correct notion of first derivative for $f$ is $\nabla f$.
- The gradient vector $\nabla f$ is itself a vector-valued function with $n$-dimensional inputs and $n$-dimensional outputs. The correct notion of derivative for that is the Jacobian matrix, with $n$-dimensional inputs and outputs valued in $n \times n$ matrices.
Thus, the Hessian matrix is the correct notion of second derivative.
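As a hypothetical worked example (not from the article) of this two-step process, take $f(x, y) = x^3 y + y^2$. The first derivative is the gradient vector:

$$\nabla f = \left( 3x^2 y, \; x^3 + 2y \right)$$

and the second derivative is the Jacobian matrix of that gradient, with row $i$ containing the partials of the $i^{th}$ gradient component:

$$H(f) = J(\nabla f) = \begin{pmatrix} \frac{\partial}{\partial x}(3x^2 y) & \frac{\partial}{\partial y}(3x^2 y) \\ \frac{\partial}{\partial x}(x^3 + 2y) & \frac{\partial}{\partial y}(x^3 + 2y) \end{pmatrix} = \begin{pmatrix} 6xy & 3x^2 \\ 3x^2 & 2 \end{pmatrix}$$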
Definition in terms of second-order partial derivatives
For further information, refer: Relation between Hessian matrix and second-order partial derivatives
Wherever the Hessian matrix for a function exists, its entries can be described as second-order partial derivatives of the function. Explicitly, if $f$ is a real-valued function of $n$ variables $x_1, x_2, \ldots, x_n$, the Hessian matrix $H(f)$ is an $n \times n$ matrix-valued function whose $(ij)^{th}$ entry is the second-order partial derivative $f_{x_ix_j}$, which is the same as $\frac{\partial^2 f}{\partial x_j \partial x_i}$. Note that the diagonal entries give second-order pure partial derivatives whereas the off-diagonal entries give second-order mixed partial derivatives.
Computationally useful definition at a point
For a function of two variables at a point
Suppose $f$ is a real-valued function of two variables $x, y$ and $(x_0, y_0)$ is a point in the domain of $f$ at which $f$ is twice differentiable. In particular, this means that all the four second-order partial derivatives exist at $(x_0, y_0)$, i.e., the two pure second-order partials $f_{xx}(x_0, y_0)$ and $f_{yy}(x_0, y_0)$ exist, and so do the two second-order mixed partial derivatives $f_{xy}(x_0, y_0)$ and $f_{yx}(x_0, y_0)$. Then, the Hessian matrix of $f$ at $(x_0, y_0)$, denoted $H(f)(x_0, y_0)$, can be expressed explicitly as a $2 \times 2$ matrix of real numbers defined as follows:

$$H(f)(x_0, y_0) = \begin{pmatrix} f_{xx}(x_0, y_0) & f_{xy}(x_0, y_0) \\ f_{yx}(x_0, y_0) & f_{yy}(x_0, y_0) \end{pmatrix}$$
{{#widget:YouTube|id=47WX0VfWS8k}}
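The entries of this matrix can be checked numerically. Below is a minimal sketch (an illustration, not part of the definition) that approximates the four second-order partials by central finite differences; the test function $f(x, y) = x^3 y + y^2$ is a hypothetical example, with exact Hessian $\begin{pmatrix} 12 & 3 \\ 3 & 2 \end{pmatrix}$ at the point $(1, 2)$.

```python
def hessian_at_point(f, x0, y0, h=1e-4):
    """Approximate the 2x2 Hessian of f at (x0, y0) by central differences.

    Entry (1,1) ~ f_xx, (1,2) ~ f_xy, (2,1) ~ f_yx, (2,2) ~ f_yy.
    Both mixed partials use the same symmetric difference formula,
    so they come out equal here by construction.
    """
    fxx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
    fyy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
    fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
           - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

# Hypothetical example: f(x, y) = x^3 y + y^2, so f_xx = 6xy,
# f_xy = f_yx = 3x^2, f_yy = 2; at (1, 2) the exact Hessian is [[12, 3], [3, 2]].
H = hessian_at_point(lambda x, y: x**3 * y + y**2, 1.0, 2.0)
```

The approximation error here is dominated by floating-point cancellation in the second differences, so the entries agree with the exact values only to a few decimal places.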
For a function of multiple variables at a point
Suppose $f$ is a real-valued function of multiple variables $x_1, x_2, \ldots, x_n$. Suppose $(a_1, a_2, \ldots, a_n)$ is a point in the domain of $f$ at which $f$ is twice differentiable. In other words, $a_1, a_2, \ldots, a_n$ are real numbers and the point has coordinates $(a_1, a_2, \ldots, a_n)$. Suppose, further, that all the second-order partials (pure and mixed) of $f$ with respect to these variables exist at the point $(a_1, a_2, \ldots, a_n)$. Then, the Hessian matrix of $f$ at $(a_1, a_2, \ldots, a_n)$, denoted $H(f)(a_1, a_2, \ldots, a_n)$, is an $n \times n$ matrix of real numbers that can be expressed explicitly as follows:

The $(ij)^{th}$ entry (i.e., the entry in the $i^{th}$ row and $j^{th}$ column) is $f_{x_ix_j}(a_1, a_2, \ldots, a_n)$. This is the same as $\frac{\partial^2 f}{\partial x_j \partial x_i}$ evaluated at $(a_1, a_2, \ldots, a_n)$. Note that in the two notations, the order in which we write the partials differs because the convention differs (left-to-right versus right-to-left).

The matrix looks like this:

$$H(f)(a_1, a_2, \ldots, a_n) = \begin{pmatrix} f_{x_1x_1} & f_{x_1x_2} & \cdots & f_{x_1x_n} \\ f_{x_2x_1} & f_{x_2x_2} & \cdots & f_{x_2x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1} & f_{x_nx_2} & \cdots & f_{x_nx_n} \end{pmatrix}$$

where each entry is evaluated at the point $(a_1, a_2, \ldots, a_n)$.
{{#widget:YouTube|id=FIRMeFAeYqc}}
Definition as a function
For a function of two variables
Suppose $f$ is a real-valued function of two variables $x, y$. The Hessian matrix of $f$, denoted $H(f)$, is a $2 \times 2$ matrix-valued function that sends each point to the Hessian matrix at that point, if that matrix is defined. It is defined as:

$$H(f)(x, y) = \begin{pmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{pmatrix}$$

In the point-free notation, we can write this as:

$$H(f) = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}$$
{{#widget:YouTube|id=39VO16DieuQ}}
For a function of multiple variables
Suppose $f$ is a real-valued function of $n$ variables $x_1, x_2, \ldots, x_n$. The Hessian matrix of $f$, denoted $H(f)$, is an $n \times n$ matrix-valued function that sends each point to the Hessian matrix at that point, if the matrix is defined. Its $(ij)^{th}$ entry at any point is the second-order partial $f_{x_ix_j}$ evaluated at that point. In the point-free notation, we can write it as:

$$H(f) = \begin{pmatrix} f_{x_1x_1} & f_{x_1x_2} & \cdots & f_{x_1x_n} \\ f_{x_2x_1} & f_{x_2x_2} & \cdots & f_{x_2x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_nx_1} & f_{x_nx_2} & \cdots & f_{x_nx_n} \end{pmatrix}$$
{{#widget:YouTube|id=DeFoV-NfjQQ}}
Under continuity assumptions
If we assume that all the second-order partials of $f$ are continuous functions everywhere, then the following happens:
- The Hessian matrix of $f$ at any point is a symmetric matrix, i.e., its $(ij)^{th}$ entry equals its $(ji)^{th}$ entry. This follows from Clairaut's theorem on equality of mixed partials.
- We can think of the Hessian matrix as the second derivative of the function, i.e., it is a matrix describing the second derivative.
- $f$ is twice differentiable as a function. Hence, the Hessian matrix of $f$ is the same as the Jacobian matrix of the gradient vector $\nabla f$, where the latter is viewed as a vector-valued function.
Note that the final conclusion actually only requires the existence of the gradient vector, hence it holds even if the second-order partials are not continuous.
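The identity $H(f) = J(\nabla f)$ can also be illustrated numerically. The sketch below (an illustration, not from the article) first approximates the gradient vector by central differences, then takes a central-difference Jacobian of that gradient function; the hypothetical test function $f(x, y) = x^3 y + y^2$ has exact Hessian $\begin{pmatrix} 6xy & 3x^2 \\ 3x^2 & 2 \end{pmatrix}$, which is symmetric as Clairaut's theorem predicts.

```python
def grad(f, p, h=1e-5):
    """Central-difference gradient of f : R^n -> R at the point p (a list)."""
    g = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

def hessian_as_jacobian_of_grad(f, p, h=1e-4):
    """Hessian of f at p, computed as the Jacobian matrix of grad f:
    entry (i, j) is the partial of the i-th gradient component along x_j."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        gu, gd = grad(f, up), grad(f, dn)
        for i in range(n):
            H[i][j] = (gu[i] - gd[i]) / (2 * h)
    return H

# Hypothetical example: f(x, y) = x^3 y + y^2; the exact Hessian at (1, 2)
# is [[12, 3], [3, 2]], and the matrix is symmetric (Clairaut's theorem).
H = hessian_as_jacobian_of_grad(lambda p: p[0]**3 * p[1] + p[1]**2, [1.0, 2.0])
```

Since the mixed partials here are computed independently (one as $\partial_y$ of the first gradient component, the other as $\partial_x$ of the second), their near-equality is a genuine numerical check of symmetry rather than an artifact of the formula.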