Quadratic function of multiple variables: Difference between revisions
Revision as of 19:17, 11 May 2014
Definition
Consider variables <math>x_1, x_2, \dots, x_n</math>. A quadratic function of the variables is a function of the form:

<math>f(x_1, x_2, \dots, x_n) = \sum_{i=1}^n \sum_{j=1}^n a_{ij}x_ix_j + \sum_{i=1}^n b_ix_i + c</math>

In vector form, if we denote by <math>\vec{x}</math> the column vector with coordinates <math>x_1, x_2, \dots, x_n</math>, then we can write the function as:

<math>\vec{x}^TA\vec{x} + \vec{b}^T\vec{x} + c</math>

where <math>A</math> is an <math>n \times n</math> matrix with entries <math>a_{ij}</math> and <math>\vec{b}</math> is the column vector with entries <math>b_i</math>.

Note that the matrix <math>A</math> is non-unique: if <math>A + A^T = F + F^T</math>, then we could replace <math>A</math> by <math>F</math>. Therefore, we could choose to replace <math>A</math> by the matrix <math>(A + A^T)/2</math> and have the advantage of working with a symmetric matrix.
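The non-uniqueness of <math>A</math> can be checked numerically: replacing <math>A</math> by its symmetrization <math>(A + A^T)/2</math> leaves the quadratic function unchanged. A minimal sketch using NumPy, with made-up values for <math>A</math>, <math>\vec{b}</math>, <math>c</math>:

```python
import numpy as np

# Illustrative values only (n = 2); these A, b, c are not from the article.
A = np.array([[2.0, 1.0],
              [3.0, 4.0]])   # deliberately not symmetric
b = np.array([1.0, -2.0])
c = 5.0

def f(x, M):
    """Quadratic function x^T M x + b^T x + c."""
    return x @ M @ x + b @ x + c

A_sym = (A + A.T) / 2   # symmetrized matrix defining the same function

x = np.array([0.7, -1.3])
print(f(x, A), f(x, A_sym))   # the two values agree
```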
Key data
| Item | Value |
|---|---|
| default domain | the whole of <math>\R^n</math> |
| range | If the matrix <math>A + A^T</math> is not positive semidefinite or negative semidefinite, the range is all of <math>\R</math>. If the matrix is positive semidefinite, the range is <math>[m, \infty)</math> where <math>m</math> is the minimum value. If the matrix is negative semidefinite, the range is <math>(-\infty, M]</math> where <math>M</math> is the maximum value. |
| local minimum value and points of attainment | If the matrix <math>A + A^T</math> is positive definite, then the local (and absolute) minimum value is <math>c - \frac{1}{2}\vec{b}^T(A + A^T)^{-1}\vec{b}</math>, attained at the unique point <math>\vec{x} = -(A + A^T)^{-1}\vec{b}</math>. A local minimum also exists if the matrix is positive semidefinite and the gradient vanishes somewhere. Otherwise, no local minimum value. |
| local maximum value and points of attainment | If the matrix <math>A + A^T</math> is negative definite, then the local (and absolute) maximum value is <math>c - \frac{1}{2}\vec{b}^T(A + A^T)^{-1}\vec{b}</math>, attained at the unique point <math>\vec{x} = -(A + A^T)^{-1}\vec{b}</math>. A local maximum also exists if the matrix is negative semidefinite and the gradient vanishes somewhere. Otherwise, no local maximum value. |
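The local-minimum entry follows from setting the gradient <math>(A + A^T)\vec{x} + \vec{b}</math> to zero, which gives the minimizer <math>-(A + A^T)^{-1}\vec{b}</math> and minimum value <math>c - \frac{1}{2}\vec{b}^T(A + A^T)^{-1}\vec{b}</math>. A sketch verifying this numerically, with made-up values for <math>A</math>, <math>\vec{b}</math>, <math>c</math>:

```python
import numpy as np

# Illustrative values: A is not symmetric, but A + A^T is positive definite.
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])
b = np.array([1.0, -1.0])
c = 0.5

S = A + A.T
x_min = -np.linalg.solve(S, b)                 # x = -(A + A^T)^{-1} b
f_min = c - 0.5 * b @ np.linalg.solve(S, b)    # claimed minimum value

def f(x):
    return x @ A @ x + b @ x + c

# f at the candidate minimizer matches the closed form, and random
# nearby points are never smaller.
print(f(x_min), f_min)
rng = np.random.default_rng(0)
others = [f(x_min + rng.normal(size=2)) for _ in range(100)]
print(min(others) >= f_min)
```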
Differentiation
Partial derivatives and gradient vector
The partial derivative with respect to the variable <math>x_i</math>, and therefore also the <math>i^{th}</math> coordinate of the gradient vector, is given by:

<math>\frac{\partial f}{\partial x_i} = \left(\sum_{j=1}^n (a_{ij} + a_{ji})x_j\right) + b_i</math>

In terms of the matrix and vector notation, the gradient vector, expressed as a column vector, is:

<math>(\nabla f)(\vec{x}) = (A + A^T)\vec{x} + \vec{b}</math>

In the case that <math>A</math> is a symmetric matrix, this simplifies to:

<math>(\nabla f)(\vec{x}) = 2A\vec{x} + \vec{b}</math>
This can be obtained by applying the product rule for differentiation to the functional form.
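The gradient formula <math>(A + A^T)\vec{x} + \vec{b}</math> can be checked against finite differences. A minimal sketch, with arbitrary illustrative values for <math>A</math>, <math>\vec{b}</math>, <math>c</math> (for a quadratic, central differences are exact up to rounding):

```python
import numpy as np

# Arbitrary illustrative values, not from the article.
A = np.array([[1.0, 2.0],
              [0.5, 3.0]])
b = np.array([-1.0, 4.0])
c = 2.0

def f(x):
    return x @ A @ x + b @ x + c

def grad_formula(x):
    """Gradient from the article's formula: (A + A^T) x + b."""
    return (A + A.T) @ x + b

def grad_numeric(x, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x = np.array([0.3, -0.8])
print(grad_formula(x), grad_numeric(x))   # the two agree
```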
Hessian matrix
The Hessian matrix of the quadratic function is the constant matrix <math>A + A^T</math>.
Higher derivatives
All the higher derivative tensors are zero.
Cases
Positive definite case
First, we consider the case where <math>A</math> is a symmetric positive definite matrix. In other words, we can write <math>A</math> in the form:

<math>A = W^TW</math>

where <math>W</math> is an <math>n \times n</math> invertible matrix.

We can "complete the square" for this function:

<math>\vec{x}^TA\vec{x} + \vec{b}^T\vec{x} + c = (W\vec{x})^T(W\vec{x}) + \vec{b}^T\vec{x} + c</math>

In other words:

<math>f(\vec{x}) = \left\|W\vec{x} + \frac{1}{2}(W^T)^{-1}\vec{b}\right\|^2 + c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>

This is minimized when the expression whose norm we are measuring is zero, so it is minimized when we have:

<math>W\vec{x} + \frac{1}{2}(W^T)^{-1}\vec{b} = \vec{0}</math>

Simplifying, we obtain that the minimum occurs at:

<math>\vec{x} = -\frac{1}{2}W^{-1}(W^T)^{-1}\vec{b} = -\frac{1}{2}A^{-1}\vec{b}</math>

Moreover, the value of the minimum is:

<math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>
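The completion-of-the-square result can be verified numerically: build a symmetric positive definite <math>A = W^TW</math> from an invertible <math>W</math> and compare <math>f</math> at <math>-\frac{1}{2}A^{-1}\vec{b}</math> with the claimed minimum value <math>c - \frac{1}{4}\vec{b}^TA^{-1}\vec{b}</math>. The particular <math>W</math>, <math>\vec{b}</math>, <math>c</math> below are chosen arbitrarily for illustration:

```python
import numpy as np

W = np.array([[1.0, 2.0],
              [0.0, 1.0]])   # invertible (determinant 1)
A = W.T @ W                  # symmetric positive definite by construction
b = np.array([2.0, -4.0])
c = 3.0

def f(x):
    return x @ A @ x + b @ x + c

x_min = -0.5 * np.linalg.solve(A, b)             # -(1/2) A^{-1} b
f_min = c - 0.25 * b @ np.linalg.solve(A, b)     # c - (1/4) b^T A^{-1} b
print(f(x_min), f_min)   # these agree
```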