Quadratic function of multiple variables: Difference between revisions

Revision as of 03:41, 26 May 2014

Definition

Consider variables $x_{1},x_{2},\dots ,x_{n}$ . A quadratic function of the variables $x_{1},x_{2},\dots ,x_{n}$ is a function of the form:

$\left(\sum _{i=1}^{n}\sum _{j=1}^{n}a_{ij}x_{i}x_{j}\right)+\left(\sum _{i=1}^{n}b_{i}x_{i}\right)+c$

In vector form, if we denote by ${\vec {x}}$ the column vector with coordinates $x_{1},x_{2},\dots ,x_{n}$ , then we can write the function as:

${\vec {x}}^{T}A{\vec {x}}+{\vec {b}}^{T}{\vec {x}}+c$

where $A$ is an $n\times n$ matrix with entries $a_{ij}$ and ${\vec {b}}$ is the column vector with entries $b_{i}$ .

Note that the matrix $A$ is non-unique: if $A+A^{T}=F+F^{T}$ then we could replace $A$ by $F$ . Therefore, we could choose to replace $A$ by the matrix $(A+A^{T})/2$ and have the advantage of working with a symmetric matrix.
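As a minimal sketch of this non-uniqueness, the following plain-Python code evaluates the function with a non-symmetric $A$ and with $(A+A^{T})/2$ and shows that the two values agree. The helper names `quad` and `symmetrize` and the sample numbers are illustrative assumptions, not part of the article.

```python
def quad(A, b, c, x):
    """Evaluate x^T A x + b^T x + c, with A given as a list of rows."""
    n = len(x)
    second_order = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    first_order = sum(b[i] * x[i] for i in range(n))
    return second_order + first_order + c


def symmetrize(A):
    """Return the symmetric matrix (A + A^T) / 2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]


# Illustrative data (hypothetical, chosen so the arithmetic is exact).
A = [[1.0, 3.0], [0.0, 2.0]]   # not symmetric
b = [1.0, -1.0]
c = 5.0
x = [2.0, -3.0]

S = symmetrize(A)              # [[1.0, 1.5], [1.5, 2.0]]
print(quad(A, b, c, x), quad(S, b, c, x))   # both evaluate to 14.0
```

Only the sum $a_{ij}+a_{ji}$ enters the function, which is why any matrix with the same symmetric part gives the same values.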

Key data

For the discussion here, assume that $A$ has been made a symmetric matrix.

Item	Value	Consistency with the case $n=1$ , where $f(x)=ax^{2}+bx+c$ , $A=(a)$ (a $1\times 1$ matrix), ${\vec {b}}=(b)$ (a 1-dimensional vector)
default domain	the whole of $\mathbb {R} ^{n}$	the whole of $\mathbb {R}$
range	If the matrix $A$ is neither positive semidefinite nor negative semidefinite, the range is all of $\mathbb {R}$ . If the matrix $A$ is positive definite or ( $A$ is positive semidefinite and ${\vec {b}}$ is in its image), the range is $[m,\infty )$ where $m$ is the minimum value. If the matrix $A$ is negative definite or ( $A$ is negative semidefinite and ${\vec {b}}$ is in its image), the range is $(-\infty ,m]$ where $m$ is the maximum value. If $A$ is semidefinite and ${\vec {b}}$ is not in its image, the range is again all of $\mathbb {R}$ .	The case of "neither positive semidefinite nor negative semidefinite" does not arise for $n=1$ . Moreover, assuming $a\neq 0$ , every semidefinite case is definite, so we only have to consider the positive definite case and the negative definite case. The positive definite case corresponds to $a>0$ . The negative definite case corresponds to $a<0$ .
local minimum value and points of attainment	If the matrix $A$ is positive definite, then $c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$ , attained at ${\frac {-1}{2}}A^{-1}{\vec {b}}$ . If $A$ is positive semidefinite but not positive definite, it depends on whether ${\vec {b}}$ is in the image of $A$ . If yes, replace $A^{-1}{\vec {b}}$ with any solution ${\vec {v}}$ to $A{\vec {v}}={\vec {b}}$ , so we get a local minimum value of $c-{\frac {1}{4}}{\vec {b}}^{T}{\vec {v}}$ attained at ${\frac {-1}{2}}{\vec {v}}$ . If $A$ is not positive semidefinite or if ${\vec {b}}$ is not in the image of $A$ , there is no local minimum value.	The positive definite case corresponds to $a>0$ : Here, the local minimum value of $c-{\frac {b^{2}}{4a}}$ is attained at ${\frac {-b}{2a}}$ (consistent with the matrix formulation). The negative definite case corresponds to $a<0$ , and there is no minimum in this case.
local maximum value and points of attainment	If the matrix $A$ is negative definite, then $c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$ , attained at ${\frac {-1}{2}}A^{-1}{\vec {b}}$ . If $A$ is negative semidefinite but not negative definite, it depends on whether ${\vec {b}}$ is in the image of $A$ . If yes, replace $A^{-1}{\vec {b}}$ with any solution ${\vec {v}}$ to $A{\vec {v}}={\vec {b}}$ , so we get a local maximum value of $c-{\frac {1}{4}}{\vec {b}}^{T}{\vec {v}}$ attained at ${\frac {-1}{2}}{\vec {v}}$ . If $A$ is not negative semidefinite or if ${\vec {b}}$ is not in the image of $A$ , there is no local maximum value.	The negative definite case corresponds to $a<0$ : Here, the local maximum value of $c-{\frac {b^{2}}{4a}}$ is attained at ${\frac {-b}{2a}}$ (consistent with the matrix formulation). The positive definite case corresponds to $a>0$ , and there is no maximum in this case.
gradient vector function (analogous to the derivative)	${\vec {x}}\mapsto 2A{\vec {x}}+{\vec {b}}$	the derivative is $x\mapsto 2ax+b$ (consistent with the matrix formulation)
Hessian matrix (analogous to the second derivative)	${\vec {x}}\mapsto 2A$ (constant matrix-valued function)	the second derivative is the constant function $x\mapsto 2a$ (consistent with the matrix formulation)
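
As an illustration of the semidefinite rows (a worked example with a matrix chosen only for illustration), take $n=2$ and:

$A={\begin{pmatrix}1&0\\0&0\end{pmatrix}},\qquad f(x_{1},x_{2})=x_{1}^{2}+b_{1}x_{1}+b_{2}x_{2}+c$

Here $A$ is positive semidefinite but not positive definite, and its image is the set of vectors whose second coordinate is zero. If $b_{2}=0$ , then ${\vec {b}}$ is in the image, every ${\vec {v}}=(b_{1},t)^{T}$ solves $A{\vec {v}}={\vec {b}}$ , and the minimum value is:

$c-{\frac {1}{4}}{\vec {b}}^{T}{\vec {v}}=c-{\frac {b_{1}^{2}}{4}}$

It is attained along the whole line $x_{1}=-b_{1}/2$ , which is the set of points ${\frac {-1}{2}}{\vec {v}}$ as $t$ varies. If $b_{2}\neq 0$ , then $f$ is unbounded in both directions along the $x_{2}$ -axis, so the range is all of $\mathbb {R}$ and there is no local minimum.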

Differentiation

Partial derivatives and gradient vector

The partial derivative with respect to the variable $x_{i}$ , and therefore also the $i^{th}$ coordinate of the gradient vector, is given by:

${\frac {\partial f}{\partial x_{i}}}=\left(\sum _{j=1}^{n}(a_{ij}+a_{ji})x_{j}\right)+b_{i}$

In terms of the matrix and vector notation, the gradient vector, expressed as a column vector, is:

$(\nabla f)({\vec {x}})=(A+A^{T}){\vec {x}}+{\vec {b}}$

In the case that $A$ is a symmetric matrix, this simplifies to:

$(\nabla f)({\vec {x}})=2A{\vec {x}}+{\vec {b}}$
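
The gradient formula can be sanity-checked numerically. The sketch below compares $(A+A^{T}){\vec {x}}+{\vec {b}}$ against central finite differences. The function names, the step size `h`, and the sample data are assumptions made for illustration only.

```python
def quad(A, b, c, x):
    """Evaluate x^T A x + b^T x + c, with A given as a list of rows."""
    n = len(x)
    return (sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
            + sum(b[i] * x[i] for i in range(n)) + c)


def gradient(A, b, x):
    """Closed form (A + A^T) x + b; valid whether or not A is symmetric."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) + b[i]
            for i in range(n)]


def numeric_gradient(A, b, c, x, h=1e-6):
    """Central differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((quad(A, b, c, x_plus) - quad(A, b, c, x_minus)) / (2 * h))
    return grad


# Illustrative data (hypothetical).
A = [[1.0, 3.0], [0.0, 2.0]]
b = [1.0, -1.0]
c = 5.0
x = [2.0, -3.0]

print(gradient(A, b, x))             # [-4.0, -7.0]
print(numeric_gradient(A, b, c, x))  # agrees up to rounding error
```

For a quadratic function the central difference has no truncation error, so any discrepancy between the two results comes from floating-point rounding alone.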

Hessian matrix

The Hessian matrix of the quadratic function is the matrix $A+A^{T}$ . In the case that $A$ is symmetric, this simplifies to $2A$ .
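
For example (an illustrative choice, not taken from the text above), with $n=2$ and the non-symmetric matrix:

$A={\begin{pmatrix}1&3\\0&2\end{pmatrix}},\qquad f(x_{1},x_{2})=x_{1}^{2}+3x_{1}x_{2}+2x_{2}^{2}+b_{1}x_{1}+b_{2}x_{2}+c$

the second-order partial derivatives are ${\frac {\partial ^{2}f}{\partial x_{1}^{2}}}=2$ , ${\frac {\partial ^{2}f}{\partial x_{1}\partial x_{2}}}=3$ , and ${\frac {\partial ^{2}f}{\partial x_{2}^{2}}}=4$ , so the Hessian matrix is:

${\begin{pmatrix}2&3\\3&4\end{pmatrix}}=A+A^{T}$

This is also twice the symmetrized matrix $(A+A^{T})/2$ .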

Higher derivatives

All the higher derivative tensors are zero.

Cases

For the discussion of cases, assume that $A$ is a symmetric matrix. If $A$ is not symmetric, replace it by the symmetric matrix $(A+A^{T})/2$ .

Positive definite case

First, we consider the case where $A$ is a symmetric positive definite matrix. In other words, we can write $A$ in the form:

$A=M^{T}M$

where $M$ is an $n\times n$ invertible matrix.
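
The matrix $M$ is not unique. One possible choice, assumed here purely for illustration, is the upper-triangular Cholesky factor. The following sketch computes it in plain Python; the function name `cholesky_upper` and the sample matrix are hypothetical.

```python
import math


def cholesky_upper(A):
    """Return upper-triangular M with A = M^T M for symmetric positive definite A.

    Raises ValueError if a non-positive pivot shows A is not positive definite.
    """
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Subtract the contribution of the rows of M already computed.
            s = A[i][j] - sum(M[k][i] * M[k][j] for k in range(i))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                M[i][i] = math.sqrt(s)
            else:
                M[i][j] = s / M[i][i]
    return M


A = [[4.0, 2.0], [2.0, 5.0]]   # illustrative symmetric positive definite matrix
M = cholesky_upper(A)
print(M)                        # [[2.0, 1.0], [0.0, 2.0]]
```

Any other invertible $M$ with $M^{T}M=A$ (for instance the symmetric square root of $A$ ) would serve equally well in what follows.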

We can "complete the square" for this function, using $A=M^{T}M$ and hence $A^{-1}=M^{-1}(M^{T})^{-1}$ :

$f({\vec {x}})=\left(M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}\right)^{T}\left(M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}\right)+\left(c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}\right)$

In other words:

$f({\vec {x}})=\left\|M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}\right\|^{2}+\left(c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}\right)$

The squared norm is nonnegative and equals zero exactly when its argument is the zero vector, so $f$ is minimized when we have:

$M{\vec {x}}+{\frac {1}{2}}(M^{T})^{-1}{\vec {b}}={\vec {0}}$

Simplifying (multiply both sides on the left by $M^{-1}$ and use $M^{-1}(M^{T})^{-1}=(M^{T}M)^{-1}=A^{-1}$ ), we obtain that the minimum occurs at:

${\vec {x}}=-{\frac {1}{2}}A^{-1}{\vec {b}}$

Moreover, the value of the minimum is:

$c-{\frac {1}{4}}{\vec {b}}^{T}A^{-1}{\vec {b}}$
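
The two formulas above can be checked on a small case. The sketch below handles only $n=2$ and solves $A{\vec {v}}={\vec {b}}$ by Cramer's rule. The helper names and the sample data are assumptions for illustration, and $A$ is assumed to be symmetric positive definite.

```python
def quad(A, b, c, x):
    """Evaluate x^T A x + b^T x + c, with A given as a list of rows."""
    n = len(x)
    return (sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
            + sum(b[i] * x[i] for i in range(n)) + c)


def solve_2x2(A, r):
    """Solve A v = r for a 2x2 matrix A by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [(r[0] * A[1][1] - A[0][1] * r[1]) / det,
            (A[0][0] * r[1] - r[0] * A[1][0]) / det]


def minimize_quadratic(A, b, c):
    """Return (x_min, f_min), assuming A is symmetric positive definite."""
    v = solve_2x2(A, b)                      # v = A^{-1} b
    x_min = [-0.5 * vi for vi in v]          # -(1/2) A^{-1} b
    f_min = c - 0.25 * sum(bi * vi for bi, vi in zip(b, v))
    return x_min, f_min


# Illustrative data (hypothetical): A has eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
b = [-6.0, -6.0]
c = 10.0

x_min, f_min = minimize_quadratic(A, b, c)
print(x_min, f_min)            # [1.0, 1.0] 4.0
print(quad(A, b, c, x_min))    # 4.0, matching the closed-form minimum value
```

For larger $n$ one would solve the linear system $A{\vec {v}}={\vec {b}}$ with a general method rather than forming $A^{-1}$ explicitly.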

@@ Line 24: / Line 24: @@
 | [[range]] || If the matrix <math>A</math> is not positive semidefinite or negative semidefinite, the range is all of <math>\R</math>.<br>If the matrix <math>A</math> is positive definite or (<math>A</math> is positive semidefinite and <math>\vec{b}</math> is in its image), the range is <math>[m,\infty)</math> where <math>m</math> is the minimum value. If the matrix <math>A</math> is negative definite or (<math>A</math> is negative semidefinite and <math>\vec{b}</math> is in its image), the range is <math>(-\infty,m]</math> where <math>m</math> is the maximum value. || The case of "not positive semidefinite or negative semidefinite" does not arise for <math>n = 1</math>. Moreover, all the semidefinite cases must be definite, so we only have to consider the positive definite case and the negative definite case.<br>The positive definite case corresponds to <math>a > 0</math><br>The negative definite case corresponds to <math>a < 0</math>
 |-
-| [[local minimum value]] and points of attainment || If the matrix <math>A</math> is positive definite, then <math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math><br>If <math>A</math> is positive semidefinite but not positive definite, it depends on whether <math>\vec{b}</math> is in the image of <math>A</math>. If yes, replace <math>A^{-1}\vec{b}</math> with the solution <math>\vec{v}</math> to <math>A\vec{v} = \vec{b}</math>, so we get a local minimum of <math>c - \frac{1}{4}\vec{b}^T\vec{v}</math> attained at <math>\frac{-1}{2}\vec{v}</math><br>If <math>A</math> is not positive semidefinite or if <math>\vec{b}</math> is not in the image of <math>A</math>, no local minimum value || The positive definite case corresponds to <math>a > 0</math>: Here, the local minimum value of <math>c - \frac{b^2}{4a}</math> is attained at <math>\frac{-b}{2a}</math> (consistent with the matrix formulation)<br>The negative definite case corresponds to <math>a < 0</math>, and there is no minimum in this case.
+| [[local minimum value]] and points of attainment || If the matrix <math>A</math> is negative definite, then <math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math><br>If <math>A</math> is positive semidefinite but not positive definite, it depends on whether <math>\vec{b}</math> is in the image of <math>A</math>. If yes, replace <math>A^{-1}\vec{b}</math> with the solution <math>\vec{v}</math> to <math>A\vec{v} = \vec{b}</math>, so we get a local minimum of <math>c - \frac{1}{4}\vec{b}^T\vec{v}</math> attained at <math>\frac{-1}{2}\vec{v}</math><br>If <math>A</math> is not positive semidefinite or if <math>\vec{b}</math> is not in the image of <math>A</math>, no local minimum value || The positive definite case corresponds to <math>a > 0</math>: Here, the local minimum value of <math>c - \frac{b^2}{4a}</math> is attained at <math>\frac{-b}{2a}</math> (consistent with the matrix formulation)<br>The negative definite case corresponds to <math>a < 0</math>, and there is no minimum in this case.
 |-
 | [[local maximum value]] and points of attainment || If the matrix <math>A</math> is negative semidefinite, then <math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>, attained at <math>\frac{-1}{2}A^{-1}\vec{b}</math><br>If <math>A</math> is negative semidefinite but not negative definite, it depends on whether <math>\vec{b}</math> is in the image of <math>A</math>. If yes, replace <math>A^{-1}\vec{b}</math> with the solution <math>\vec{v}</math> to <math>A\vec{v} = \vec{b}</math>, so we get a local minimum of <math>c - \frac{1}{4}\vec{b}^T\vec{v}</math> attained at <math>\frac{-1}{2}\vec{v}</math><br>If <math>A</math> is not positive semidefinite or if <math>\vec{b}</math> is not in the image of <math>A</math>, no local minimum value || The negative definite case corresponds to <math>a < 0</math>: Here, the local maximum value of <math>c - \frac{b^2}{4a}</math> is attained at <math>\frac{-b}{2a}</math> (consistent with the matrix formulation)<br>The positive definite case corresponds to <math>a > 0</math>, and there is no maximum in this case.