Quadratic function of multiple variables

Definition

Consider variables $x_{1},x_{2},\dots ,x_{n}$ . A quadratic function of the variables $x_{1},x_{2},\dots ,x_{n}$ is a function of the form:

$\left(\sum _{i=1}^{n}\sum _{j=1}^{n}a_{ij}x_{i}x_{j}\right)+\left(\sum _{i=1}^{n}b_{i}x_{i}\right)+c$

In vector form, if we denote by ${\vec {x}}$ the column vector with coordinates $x_{1},x_{2},\dots ,x_{n}$ , then we can write the function as:

${\vec {x}}^{T}A{\vec {x}}+{\vec {b}}^{T}{\vec {x}}+c$

where $A$ is an $n\times n$ matrix with entries $a_{ij}$ and ${\vec {b}}$ is the column vector with entries $b_{i}$.

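Note that the matrix $A$ is not unique: because ${\vec {x}}^{T}A{\vec {x}}$ is a scalar, it equals its own transpose ${\vec {x}}^{T}A^{T}{\vec {x}}$, so the function depends on $A$ only through $A+A^{T}$. Hence, if $A+A^{T}=F+F^{T}$, we can replace $A$ by $F$ without changing the function. In particular, we can replace $A$ by the matrix $(A+A^{T})/2$ and have the advantage of working with a symmetric matrix.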
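The following is a minimal sketch, in plain Python, of evaluating a quadratic function from the summation formula and checking that replacing $A$ by $(A+A^{T})/2$ leaves the value unchanged. The helper names `quadratic` and `symmetrize` and the sample numbers are illustrative assumptions, not part of the definition.

```python
def quadratic(A, b, c, x):
    """Evaluate sum_ij a_ij x_i x_j + sum_i b_i x_i + c (A, b, x as lists)."""
    n = len(x)
    quad = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return quad + lin + c


def symmetrize(A):
    """Return the symmetric matrix (A + A^T) / 2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical sample data: A is deliberately not symmetric.
A = [[1.0, 4.0], [0.0, 3.0]]
b = [2.0, -1.0]
c = 5.0
x = [1.5, -2.0]

S = symmetrize(A)
value_original = quadratic(A, b, c, x)
value_symmetric = quadratic(S, b, c, x)
print(S, value_original, value_symmetric)
```

Both evaluations should agree, since only the sum $a_{ij}+a_{ji}$ enters the coefficient of $x_{i}x_{j}$.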

Key data

For the discussion here, assume that $A$ has been made a symmetric matrix.

Item	Value
default domain	the whole of $\mathbb {R} ^{n}$
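range	If the matrix $A$ is indefinite (neither positive semidefinite nor negative semidefinite), the range is all of $\mathbb {R}$. If $A$ is positive definite, the range is $[m,\infty )$ where $m$ is the minimum value. If $A$ is negative definite, the range is $(-\infty ,m]$ where $m$ is the maximum value. If $A$ is semidefinite but singular, the same description holds only when $A$ is nonzero and $2A{\vec {x}}+{\vec {b}}={\vec {0}}$ has a solution; if there is no solution, the range is all of $\mathbb {R}$ (for example, $x_{1}^{2}+x_{2}$ has a positive semidefinite $A$ but range $\mathbb {R}$). If $A=0$ and ${\vec {b}}={\vec {0}}$, the function is constant with range $\{c\}$.
local minimum value and points of attainment	If the matrix $A$ is positive definite, then $c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$, attained at ${\frac {-1}{2}}A^{-1}{\vec {b}}$. If $A$ is positive semidefinite but singular, a minimum exists only when $2A{\vec {x}}+{\vec {b}}={\vec {0}}$ has a solution; it is then attained at every solution ${\vec {x}}$, with value $c+{\frac {1}{2}}{\vec {b}}^{T}{\vec {x}}$. Otherwise, no local minimum value.
local maximum value and points of attainment	If the matrix $A$ is negative definite, then $c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$, attained at ${\frac {-1}{2}}A^{-1}{\vec {b}}$. If $A$ is negative semidefinite but singular, a maximum exists only when $2A{\vec {x}}+{\vec {b}}={\vec {0}}$ has a solution; it is then attained at every solution ${\vec {x}}$, with value $c+{\frac {1}{2}}{\vec {b}}^{T}{\vec {x}}$. Otherwise, no local maximum value.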

Differentiation

Partial derivatives and gradient vector

The partial derivative with respect to the variable $x_{i}$ , and therefore also the $i^{th}$ coordinate of the gradient vector, is given by:

${\frac {\partial f}{\partial x_{i}}}=\left(\sum _{j=1}^{n}(a_{ij}+a_{ji})x_{j}\right)+b_{i}$

In terms of the matrix and vector notation, the gradient vector, expressed as a column vector, is:

$(\nabla f)({\vec {x}})=(A+A^{T}){\vec {x}}+{\vec {b}}$

In the case that $A$ is a symmetric matrix, this simplifies to:

$(\nabla f)({\vec {x}})=2A{\vec {x}}+{\vec {b}}$
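One way to sanity-check the gradient formula is to compare it against central finite differences. The sketch below does this in plain Python; the function names, the step size `h`, and the sample data are assumptions made for illustration.

```python
def quadratic(A, b, c, x):
    """Evaluate sum_ij a_ij x_i x_j + sum_i b_i x_i + c."""
    n = len(x)
    quad = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return quad + sum(b[i] * x[i] for i in range(n)) + c


def gradient(A, b, x):
    """Closed-form gradient (A + A^T) x + b, valid for any square A."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) + b[i]
            for i in range(n)]


def numerical_gradient(A, b, c, x, h=1e-6):
    """Approximate each partial derivative by a central difference."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        diff = quadratic(A, b, c, x_plus) - quadratic(A, b, c, x_minus)
        grad.append(diff / (2 * h))
    return grad


# Hypothetical sample data with a non-symmetric A.
A = [[1.0, 4.0], [0.0, 3.0]]
b = [2.0, -1.0]
c = 5.0
x = [1.5, -2.0]

exact = gradient(A, b, x)
approx = numerical_gradient(A, b, c, x)
print(exact, approx)
```

For a quadratic function the central difference has no truncation error, so the two results should differ only by floating-point rounding.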

Hessian matrix

The Hessian matrix of the quadratic function is the matrix $A+A^{T}$ . In the case that $A$ is symmetric, this simplifies to $2A$ .
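
As a short check, differentiating the partial derivative formula for ${\frac {\partial f}{\partial x_{i}}}$ once more, this time with respect to $x_{k}$, gives the $(i,k)$ entry of the Hessian:

${\frac {\partial ^{2}f}{\partial x_{k}\partial x_{i}}}=a_{ik}+a_{ki}$

This is constant in ${\vec {x}}$ and symmetric in $i$ and $k$, which matches the matrix $A+A^{T}$.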

Higher derivatives

All derivative tensors of order three and higher are zero, because the Hessian matrix is constant.

Cases

For the discussion of cases, assume that $A$ is a symmetric matrix. If $A$ is not symmetric, replace it by the symmetric matrix $(A+A^{T})/2$ .

Positive definite case

First, we consider the case where $A$ is a symmetric positive definite matrix. In other words, we can write $A$ in the form:

$A=M^{T}M$

where $M$ is an $n\times n$ invertible matrix.
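The matrix $M$ is not unique. One possible choice, assumed here purely for illustration, is the upper-triangular Cholesky factor. The sketch below computes it in plain Python; the function name `cholesky_upper` and the sample matrix are hypothetical.

```python
import math


def cholesky_upper(A):
    """Return an upper-triangular M with M^T M = A.

    Assumes A is symmetric positive definite; raises ValueError otherwise.
    """
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Subtract the contribution of the rows already computed.
            s = A[i][j] - sum(M[k][i] * M[k][j] for k in range(i))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                M[i][i] = math.sqrt(s)
            else:
                M[i][j] = s / M[i][i]
    return M


# Hypothetical sample matrix, symmetric and positive definite.
A = [[4.0, 2.0], [2.0, 5.0]]
M = cholesky_upper(A)

# Rebuild M^T M entry by entry to confirm that it reproduces A.
n = len(A)
product = [[sum(M[k][i] * M[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
print(M, product)
```

Any other invertible $M$ with $M^{T}M=A$, such as the symmetric square root of $A$, would serve equally well in the argument that follows.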

We can "complete the square" for this function:

$f({\vec {x}})=\left(M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}\right)^{T}\left(M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}\right)+\left(c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}\right)$

In other words:

$f({\vec {x}})=\left\|M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}\right\|^{2}+\left(c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}\right)$

Since a squared norm is nonnegative, $f$ is minimized exactly when the vector inside the norm is zero, that is, when:

$M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}={\vec {0}}$

Multiplying on the left by $M^{-1}$ and using $M^{-1}(M^{T})^{-1}=(M^{T}M)^{-1}=A^{-1}$, we obtain that the minimum occurs at:

${\vec {x}}=-{\frac {1}{2}}A^{-1}{\vec {b}}$

Moreover, the value of the minimum is:

$c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$
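
The formulas for the minimizer and the minimum value can be checked numerically on a small case. The sketch below uses a hypothetical $2\times 2$ positive definite matrix and solves $A{\vec {y}}={\vec {b}}$ by Cramer's rule rather than forming $A^{-1}$ explicitly; the helper names and the data are assumptions.

```python
def quadratic(A, b, c, x):
    """Evaluate sum_ij a_ij x_i x_j + sum_i b_i x_i + c."""
    n = len(x)
    quad = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return quad + sum(b[i] * x[i] for i in range(n)) + c


def solve_2x2(A, rhs):
    """Solve A y = rhs for an invertible 2x2 matrix A using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det,
            (A[0][0] * rhs[1] - A[1][0] * rhs[0]) / det]


# Hypothetical data: A is symmetric with a_11 = 2 > 0 and det(A) = 1 > 0,
# so it is positive definite.
A = [[2.0, 1.0], [1.0, 1.0]]
b = [-4.0, 2.0]
c = 1.0

y = solve_2x2(A, b)                      # y = A^{-1} b
x_min = [-0.5 * yi for yi in y]          # minimizer -(1/2) A^{-1} b
min_value = c - 0.25 * sum(bi * yi for bi, yi in zip(b, y))

# Compare with direct evaluation at x_min and at a few nearby points.
value_at_x_min = quadratic(A, b, c, x_min)
offsets = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
nearby = [quadratic(A, b, c, [x_min[0] + dx, x_min[1] + dy])
          for dx, dy in offsets]
print(x_min, min_value, value_at_x_min, nearby)
```

The value computed from $c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$ should match the direct evaluation at the minimizer, and every nearby point should give a strictly larger value.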