Quadratic function of multiple variables: Difference between revisions

Revision as of 02:53, 26 May 2014

Definition

Consider variables $x_{1}, x_{2}, \dots, x_{n}$ . A quadratic function of the variables $x_{1}, x_{2}, \dots, x_{n}$ is a function of the form:

$(\sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{i j} x_{i} x_{j}) + (\sum_{i = 1}^{n} b_{i} x_{i}) + c$

In vector form, if we denote by $\vec{x}$ the column vector with coordinates $x_{1}, x_{2}, \dots, x_{n}$ , then we can write the function as:

${\vec{x}}^{T} A \vec{x} + {\vec{b}}^{T} \vec{x} + c$

where $A$ is an $n \times n$ matrix with entries $a_{i j}$ and $\vec{b}$ is the column vector with entries $b_{i}$.

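Note that the matrix $A$ is not unique: if $A + A^{T} = F + F^{T}$ then we could replace $A$ by $F$ without changing the function. This is because ${\vec{x}}^{T} A \vec{x}$ is a scalar and so equals its own transpose ${\vec{x}}^{T} A^{T} \vec{x}$, which means it depends only on $A + A^{T}$. In particular, we could choose to replace $A$ by the matrix $(A + A^{T}) / 2$ and have the advantage of working with a symmetric matrix.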
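The replacement of $A$ by $(A + A^{T}) / 2$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names `quad_form` and `symmetrize` and the sample matrix are illustrative assumptions, not part of the original article.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def symmetrize(A):
    """Return the symmetric matrix (A + A^T) / 2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical example data: A is deliberately not symmetric.
A = [[1.0, 4.0], [0.0, 3.0]]
S = symmetrize(A)
x = [2.0, -1.0]

value_with_A = quad_form(A, x)
value_with_S = quad_form(S, x)
print(S, value_with_A, value_with_S)
```

Both quadratic forms agree at every point, so either matrix defines the same function.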

Key data

For the discussion here, assume that $A$ has been made a symmetric matrix.

Item	Value
default domain	the whole of $R^{n}$
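range	If the matrix $A$ is indefinite (neither positive semidefinite nor negative semidefinite), the range is all of $R$. If the matrix $A$ is nonzero and positive semidefinite, the range is $[m, \infty)$ where $m$ is the minimum value, provided that a minimum is attained (this is always so when $A$ is positive definite); if no minimum is attained, the range is all of $R$. If the matrix $A$ is nonzero and negative semidefinite, the range is $(- \infty, m]$ where $m$ is the maximum value, provided that a maximum is attained; if no maximum is attained, the range is all of $R$.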
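local minimum value and points of attainment	If the matrix $A$ is positive definite, then $c - \frac{1}{4} {\vec{b}}^{T} A^{- 1} \vec{b}$, attained at $\frac{- 1}{2} A^{- 1} \vec{b}$. If $A$ is positive semidefinite but not positive definite, a local minimum exists only when $\vec{b}$ lies in the column space of $A$; it is then attained at every solution $\vec{x}$ of $2 A \vec{x} = - \vec{b}$, with value $c + \frac{1}{2} {\vec{b}}^{T} \vec{x}$. Otherwise, no local minimum value.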
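local maximum value and points of attainment	If the matrix $A$ is negative definite, then $c - \frac{1}{4} {\vec{b}}^{T} A^{- 1} \vec{b}$, attained at $\frac{- 1}{2} A^{- 1} \vec{b}$. If $A$ is negative semidefinite but not negative definite, a local maximum exists only when $\vec{b}$ lies in the column space of $A$; it is then attained at every solution $\vec{x}$ of $2 A \vec{x} = - \vec{b}$, with value $c + \frac{1}{2} {\vec{b}}^{T} \vec{x}$. Otherwise, no local maximum value.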

Differentiation

Partial derivatives and gradient vector

The partial derivative with respect to the variable $x_{i}$, and therefore also the $i^{\text{th}}$ coordinate of the gradient vector, is given by:

$\frac{\partial f}{\partial x_{i}} = (\sum_{j = 1}^{n} (a_{i j} + a_{j i}) x_{j}) + b_{i}$

In terms of the matrix and vector notation, the gradient vector, expressed as a column vector, is:

$(\nabla f) (\vec{x}) = (A + A^{T}) \vec{x} + \vec{b}$

In the case that $A$ is a symmetric matrix, this simplifies to:

$(\nabla f) (\vec{x}) = 2 A \vec{x} + \vec{b}$
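The gradient formula $(A + A^{T}) \vec{x} + \vec{b}$ can be compared against a central finite-difference approximation. This is only a sketch under stated assumptions: the function names, the step size `h`, and the sample coefficients are hypothetical choices made for illustration.

```python
def f(A, b, c, x):
    """Evaluate x^T A x + b^T x + c using plain Python lists."""
    n = len(x)
    quadratic = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    linear = sum(b[i] * x[i] for i in range(n))
    return quadratic + linear + c


def gradient(A, b, x):
    """Return (A + A^T) x + b, the gradient from the formula above."""
    n = len(x)
    return [
        sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) + b[i]
        for i in range(n)
    ]


def numeric_gradient(A, b, c, x, h=1e-6):
    """Approximate each partial derivative by a central difference."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(A, b, c, forward) - f(A, b, c, backward)) / (2 * h))
    return grad


# Hypothetical example data; A does not need to be symmetric here.
A = [[1.0, 4.0], [0.0, 3.0]]
b = [5.0, -2.0]
c = 7.0
x = [2.0, -1.0]

exact = gradient(A, b, x)
approx = numeric_gradient(A, b, c, x)
print(exact, approx)
```

For a quadratic function the central difference has no truncation error, so the two results should differ only by floating-point rounding.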

Hessian matrix

The Hessian matrix of the quadratic function is the matrix $A + A^{T}$ . In the case that $A$ is symmetric, this simplifies to $2 A$ .

Higher derivatives

All derivative tensors of order three and higher are zero, because the Hessian matrix is constant.
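One consequence, given here as a short worked derivation rather than as part of the original text: since the derivatives of order three and higher vanish, the second-order Taylor expansion of the function is exact. The symbol $\vec{h}$ denotes an arbitrary displacement vector and is introduced only for this illustration.

$f (\vec{x} + \vec{h}) = f (\vec{x}) + {((A + A^{T}) \vec{x} + \vec{b})}^{T} \vec{h} + \frac{1}{2} {\vec{h}}^{T} (A + A^{T}) \vec{h}$

This can be confirmed by expanding ${(\vec{x} + \vec{h})}^{T} A (\vec{x} + \vec{h}) + {\vec{b}}^{T} (\vec{x} + \vec{h}) + c$ directly. The last term equals ${\vec{h}}^{T} A \vec{h}$, because ${\vec{h}}^{T} A^{T} \vec{h} = {\vec{h}}^{T} A \vec{h}$.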

Cases

For the discussion of cases, assume that $A$ is a symmetric matrix. If $A$ is not symmetric, replace it by the symmetric matrix $(A + A^{T}) / 2$ .

Positive definite case

First, we consider the case where $A$ is a symmetric positive definite matrix. In other words, we can write $A$ in the form:

$A = M^{T} M$

where $M$ is an $n \times n$ invertible matrix.
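One possible way to obtain such a matrix $M$ is the Cholesky factorization, which yields an upper triangular $M$. The article does not prescribe this choice, so the sketch below is an assumption made for illustration; the function name `cholesky_upper` and the sample matrix are hypothetical.

```python
import math


def cholesky_upper(A):
    """Return an upper triangular M with A = M^T M.

    A must be symmetric positive definite; a non-positive pivot raises
    ValueError because no such factorization exists in that case.
    """
    n = len(A)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    # The lower factor satisfies A = lower * lower^T, so M is its transpose.
    return [[lower[j][i] for j in range(n)] for i in range(n)]


def transpose_times(M):
    """Return M^T M, used to check the factorization."""
    n = len(M)
    return [
        [sum(M[k][i] * M[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical symmetric positive definite example.
A = [[4.0, 2.0], [2.0, 5.0]]
M = cholesky_upper(A)
print(M, transpose_times(M))
```

Any other invertible $M$ with $M^{T} M = A$ would serve equally well in the argument that follows; the triangular factor is just convenient to compute.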

We can "complete the square" for this function:

$f (\vec{x}) = {(M \vec{x} + \frac{1}{2} (M^{T})^{- 1} \vec{b})}^{T} (M \vec{x} + \frac{1}{2} (M^{T})^{- 1} \vec{b}) + (c - \frac{1}{4} {\vec{b}}^{T} A^{- 1} \vec{b})$

In other words:

$f (\vec{x}) = {‖ M \vec{x} + \frac{1}{2} (M^{T})^{- 1} \vec{b} ‖}^{2} + (c - \frac{1}{4} {\vec{b}}^{T} A^{- 1} \vec{b})$

The squared norm is nonnegative, so the function is minimized exactly when the vector inside the norm is zero:

$M \vec{x} + \frac{1}{2} (M^{T})^{- 1} \vec{b} = \vec{0}$

Multiplying on the left by $M^{- 1}$ and using $M^{- 1} (M^{T})^{- 1} = (M^{T} M)^{- 1} = A^{- 1}$, we obtain that the minimum occurs at:

$\vec{x} = - \frac{1}{2} A^{- 1} \vec{b}$

Moreover, the value of the minimum is:

$c - \frac{1}{4} {\vec{b}}^{T} A^{- 1} \vec{b}$
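The formulas for the point of attainment and the minimum value can be tried on a small example. The sketch below assumes that $A$ is symmetric positive definite and solves $A \vec{y} = \vec{b}$ by Gaussian elimination instead of forming $A^{- 1}$ explicitly; the function names and the sample data are hypothetical.

```python
def solve(A, rhs):
    """Solve A y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(A[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][k] * y[k] for k in range(i + 1, n))
        y[i] = s / aug[i][i]
    return y


def minimize_quadratic(A, b, c):
    """Return (x_min, f_min) for x^T A x + b^T x + c.

    Assumes A is symmetric positive definite, as in this section.
    """
    y = solve(A, b)  # y = A^{-1} b
    x_min = [-0.5 * yi for yi in y]
    f_min = c - 0.25 * sum(bi * yi for bi, yi in zip(b, y))
    return x_min, f_min


def f(A, b, c, x):
    """Evaluate the quadratic function directly, for comparison."""
    n = len(x)
    quadratic = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return quadratic + sum(b[i] * x[i] for i in range(n)) + c


# Hypothetical example data.
A = [[4.0, 2.0], [2.0, 5.0]]
b = [4.0, -6.0]
c = 1.0

x_min, f_min = minimize_quadratic(A, b, c)
print(x_min, f_min, f(A, b, c, x_min))
```

Evaluating the function directly at the computed point should reproduce the minimum value given by the closed-form expression, and nearby points should give larger values.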

@@ Line 24: / Line 24: @@
 | [[range]] || If the matrix <math>A</math> is not positive semidefinite or negative semidefinite, the range is all of <math>\R</math>.<br>If the matrix <math>A</math> is positive semidefinite, the range is <math>[m,\infty)</math> where <math>m</math> is the minimum value. If the matrix <math>A</math> is negative semidefinite, the range is <math>(-\infty,m]</math> where <math>m</math> is the maximum value.
 |-
-| [[local minimum value]] and points of attainment || If the matrix <math>A</math> is positive semidefinite, then <math>c - \frac{1}{4}\vec{b}^TM\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math>, where <math>A = M^TM</math><br>Otherwise, no local minimum value
+| [[local minimum value]] and points of attainment || If the matrix <math>A</math> is positive definite, then <math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math><br>If <math>A</math> is positive semidefinite but not positive definite(?)<br>Otherwise, no local minimum value
 |-
-| [[local maximum value]] and points of attainment || If the matrix <math>A</math> is negative semidefinite, then <math>c - \frac{1}{4}\vec{b}^TM\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math>, where <math>A = -M^TM</math><br>Otherwise, no local maximum value
+| [[local maximum value]] and points of attainment || If the matrix <math>A</math> is negative semidefinite, then <math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math>, where <math>A = -M^TM</math><br>If <math>A</math> is negative semidefinite but not negative definite (?)<br>Otherwise, no local maximum value
 |}