Quadratic function of multiple variables


Definition

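Consider variables $x_1, x_2, \dots, x_n$. A quadratic function of the variables $x_1, x_2, \dots, x_n$ is a function of the form:

$$\left(\sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j\right) + \left(\sum_{i=1}^n b_i x_i\right) + c$$

In vector form, if we denote by $\vec{x}$ the column vector with coordinates $x_1, x_2, \dots, x_n$, then we can write the function as:

$$\vec{x}^T A \vec{x} + \vec{b}^T \vec{x} + c$$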

where $A$ is an $n \times n$ matrix with entries $a_{ij}$ and $\vec{b}$ is the column vector with entries $b_i$.

Note that the matrix $A$ is non-unique: if $A + A^T = F + F^T$ then we could replace $A$ by $F$. Therefore, we could choose to replace $A$ by the matrix $(A + A^T)/2$ and have the advantage of working with a symmetric matrix.
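The following is a minimal sketch, in plain Python, of why the symmetrized matrix gives the same quadratic form. The helper names `quad_form` and `symmetrize` and the example matrix are illustrative assumptions, not part of the original article.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def symmetrize(A):
    """Return the symmetric matrix (A + A^T) / 2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]


# Illustrative non-symmetric matrix and test point (assumed values).
A = [[1.0, 4.0], [0.0, 3.0]]
S = symmetrize(A)  # the off-diagonal entries 4 and 0 are averaged to 2
x = [2.0, -1.0]

value_A = quad_form(A, x)  # quadratic form with the original matrix
value_S = quad_form(S, x)  # quadratic form with the symmetrized matrix
print(value_A, value_S)    # the two values agree
```

The agreement holds for every choice of `x`, because only the sums $a_{ij} + a_{ji}$ enter the coefficient of $x_i x_j$.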

Key data

For the discussion here, assume that A has been made a symmetric matrix.

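| Item | Value |
|---|---|
| default domain | the whole of $\mathbb{R}^n$ |
| range | If $A$ is neither positive semidefinite nor negative semidefinite, the range is all of $\mathbb{R}$. |
| | If $A$ is nonzero and positive semidefinite and $\vec{b}$ is in the image of $A$, the range is $[m, \infty)$ where $m$ is the minimum value. If $A$ is nonzero and negative semidefinite and $\vec{b}$ is in the image of $A$, the range is $(-\infty, m]$ where $m$ is the maximum value. |
| | If $A$ is semidefinite but $\vec{b}$ is not in the image of $A$, the range is all of $\mathbb{R}$. |
| local minimum value and points of attainment | If $A$ is positive definite, the local minimum value is $c - \frac{1}{4}\vec{b}^T A^{-1} \vec{b}$, attained at $-\frac{1}{2} A^{-1} \vec{b}$. |
| | If $A$ is positive semidefinite but not positive definite, it depends on whether $\vec{b}$ is in the image of $A$. If yes, replace $A^{-1}\vec{b}$ with any solution $\vec{v}$ to $A\vec{v} = \vec{b}$, so we get a local minimum value of $c - \frac{1}{4}\vec{b}^T \vec{v}$, attained at $-\frac{1}{2}\vec{v}$. |
| | If $A$ is not positive semidefinite, or if $\vec{b}$ is not in the image of $A$, there is no local minimum value. |
| local maximum value and points of attainment | If $A$ is negative definite, the local maximum value is $c - \frac{1}{4}\vec{b}^T A^{-1} \vec{b}$, attained at $-\frac{1}{2} A^{-1} \vec{b}$. |
| | If $A$ is negative semidefinite but not negative definite, it depends on whether $\vec{b}$ is in the image of $A$. If yes, replace $A^{-1}\vec{b}$ with any solution $\vec{v}$ to $A\vec{v} = \vec{b}$, so we get a local maximum value of $c - \frac{1}{4}\vec{b}^T \vec{v}$, attained at $-\frac{1}{2}\vec{v}$. |
| | If $A$ is not negative semidefinite, or if $\vec{b}$ is not in the image of $A$, there is no local maximum value. |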
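
As a hedged illustration of the semidefinite case, take $n = 2$ and the matrix below, which is chosen purely as an example. It is positive semidefinite but not positive definite:

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad f(x_1, x_2) = x_1^2 + b_1 x_1 + b_2 x_2 + c$$

If $\vec{b} = (2, 0)^T$, then $\vec{b}$ is in the image of $A$: every $\vec{v} = (2, t)^T$ solves $A\vec{v} = \vec{b}$, and $\vec{b}^T\vec{v} = 4$ regardless of $t$. The minimum value is $c - \frac{1}{4} \cdot 4 = c - 1$, attained at every point $-\frac{1}{2}\vec{v} = (-1, -t/2)^T$, that is, along the whole line $x_1 = -1$. This agrees with completing the square directly:

$$x_1^2 + 2x_1 + c = (x_1 + 1)^2 + (c - 1)$$

If instead $\vec{b} = (0, 1)^T$, then $\vec{b}$ is not in the image of $A$ (the image consists of vectors whose second coordinate is zero). Here $f(x_1, x_2) = x_1^2 + x_2 + c$, which is unbounded below as $x_2 \to -\infty$ and whose gradient $(2x_1, 1)^T$ never vanishes, so there is no local minimum.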

Differentiation

Partial derivatives and gradient vector

The partial derivative with respect to the variable $x_i$, and therefore also the $i$th coordinate of the gradient vector, is given by:

$$\frac{\partial f}{\partial x_i} = \left(\sum_{j=1}^n (a_{ij} + a_{ji}) x_j\right) + b_i$$

In terms of the matrix and vector notation, the gradient vector, expressed as a column vector, is:

$$(\nabla f)(\vec{x}) = (A + A^T)\vec{x} + \vec{b}$$

In the case that $A$ is a symmetric matrix, this simplifies to:

$$(\nabla f)(\vec{x}) = 2A\vec{x} + \vec{b}$$
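The gradient formula can be sanity-checked numerically. The sketch below compares $(A + A^T)\vec{x} + \vec{b}$ against central finite differences in plain Python; the function names, the step size `h`, and the example values of `A`, `b`, `c`, and `x` are all assumptions made for illustration.

```python
def f(A, b, c, x):
    """Evaluate the quadratic function x^T A x + b^T x + c."""
    n = len(x)
    quad = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return quad + lin + c


def gradient(A, b, x):
    """Gradient from the closed form (A + A^T) x + b."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) + b[i]
            for i in range(n)]


def numerical_gradient(A, b, c, x, h=1e-6):
    """Approximate each partial derivative by a central difference."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(A, b, c, x_plus) - f(A, b, c, x_minus)) / (2 * h))
    return grad


# Illustrative data (assumed values); A is deliberately non-symmetric.
A = [[1.0, 4.0], [0.0, 3.0]]
b = [5.0, -2.0]
c = 7.0
x = [2.0, -1.0]

exact = gradient(A, b, x)
approx = numerical_gradient(A, b, c, x)
print(exact, approx)  # the two should agree up to rounding error
```

Because $f$ is quadratic, the central difference has no truncation error, so any discrepancy between the two results comes from floating-point rounding alone.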

Hessian matrix

The Hessian matrix of the quadratic function is the matrix $A + A^T$. In the case that $A$ is symmetric, this simplifies to $2A$.
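
The entries of the Hessian can be read off by differentiating the partial derivative formula once more. A short sketch, with $k$ introduced here as a second index:

$$\frac{\partial^2 f}{\partial x_k \, \partial x_i} = \frac{\partial}{\partial x_k}\left(\sum_{j=1}^n (a_{ij} + a_{ji}) x_j + b_i\right) = a_{ik} + a_{ki}$$

This is the $(i,k)$ entry of $A + A^T$. For instance, with the illustrative (assumed) matrix $A = \begin{pmatrix} 1 & 4 \\ 0 & 3 \end{pmatrix}$, the Hessian is $A + A^T = \begin{pmatrix} 2 & 4 \\ 4 & 6 \end{pmatrix}$ at every point.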

Higher derivatives

All derivative tensors of order three and higher are zero, since the entries of the Hessian matrix are constants.

Cases

For the discussion of cases, assume that $A$ is a symmetric matrix. If $A$ is not symmetric, replace it by the symmetric matrix $(A + A^T)/2$.

Positive definite case

First, we consider the case where $A$ is a symmetric positive definite matrix. In other words, we can write $A$ in the form:

$$A = M^T M$$

where $M$ is an $n \times n$ invertible matrix.

We can "complete the square" for this function:

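$$f(\vec{x}) = \left(M\vec{x} + \frac{1}{2}(M^T)^{-1}\vec{b}\right)^T \left(M\vec{x} + \frac{1}{2}(M^T)^{-1}\vec{b}\right) + \left(c - \frac{1}{4}\vec{b}^T A^{-1} \vec{b}\right)$$

In other words:

$$f(\vec{x}) = \left\| M\vec{x} + \frac{1}{2}(M^T)^{-1}\vec{b} \right\|^2 + \left(c - \frac{1}{4}\vec{b}^T A^{-1} \vec{b}\right)$$

The squared norm is nonnegative, so the function is minimized exactly when the expression whose norm we are measuring is zero:

$$M\vec{x} + \frac{1}{2}(M^T)^{-1}\vec{b} = 0$$

Multiplying by $M^{-1}$ and using $M^{-1}(M^T)^{-1} = (M^T M)^{-1} = A^{-1}$, we obtain that the minimum occurs at:

$$\vec{x} = -\frac{1}{2} A^{-1} \vec{b}$$

Moreover, the value of the minimum is:

$$c - \frac{1}{4}\vec{b}^T A^{-1} \vec{b}$$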
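
The formulas for the minimizer and the minimum value can be checked on a small example. The sketch below is plain Python for the $2 \times 2$ case only. The helper `solve_2x2` (Cramer's rule) and the values of `A`, `b`, and `c` are assumptions chosen for illustration. Here $A = M^T M$ with $M = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$, so $A$ is positive definite.

```python
def solve_2x2(A, rhs):
    """Solve A v = rhs for a 2x2 matrix A using Cramer's rule (illustrative helper)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    v0 = (rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det
    v1 = (A[0][0] * rhs[1] - rhs[0] * A[1][0]) / det
    return [v0, v1]


# Assumed example data: A = M^T M with M = [[1, 1], [0, 2]].
A = [[1.0, 1.0], [1.0, 5.0]]
b = [2.0, -6.0]
c = 3.0


def f(x):
    """Evaluate x^T A x + b^T x + c for the example data above."""
    quad = sum(A[i][j] * x[i] * x[j] for i in range(2) for j in range(2))
    return quad + b[0] * x[0] + b[1] * x[1] + c


v = solve_2x2(A, b)                                  # v = A^{-1} b
x_min = [-0.5 * vi for vi in v]                      # minimizer -(1/2) A^{-1} b
min_value = c - 0.25 * (b[0] * v[0] + b[1] * v[1])   # c - (1/4) b^T A^{-1} b

# Evaluate f at the minimizer and at a few nearby points for comparison.
offsets = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
nearby = [f([x_min[0] + dx, x_min[1] + dy]) for dx, dy in offsets]
print(x_min, min_value, f(x_min), nearby)
```

Evaluating `f(x_min)` reproduces `min_value`, and each nearby value is larger, as expected for a positive definite $A$. For larger $n$, a linear solver or a Cholesky factorization would replace the Cramer's rule helper.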