L1-regularized quadratic function of multiple variables

From Calculus

Definition

A L1-regularized quadratic function of the variables x1,x2,,xn is a function of the form (satisfying the positive definiteness condition below):

f(x1,x2,,xn):=(i=1nj=1naijxixj)+(i=1nbixi)+λi=1n|xi|+c

In vector form, if we denote by x the column vector with coordinates x1,x2,,xn, then we can write the function as:

xTAx+bTx+λ|x|1+c

where A is the n×n matrix with entries aij and b is the column vector with entries bi.

Note that the matrix A is non-unique: if A+AT=F+FT then we could replace A by F. Therefore, we could choose to replace A by the matrix (A+AT)/2. We will thus assume that A is a symmetric matrix.

We impose the further restriction that the matrix A be a symmetric positive definite matrix.

Key data

Item Value
default domain the whole of Rn

Differentiation

Partial derivatives and gradient vector

The partial derivative with respect to the variable xi, and therefore also the ith coordinate of the gradient vector (if it exists), is given as follows when xi0:

fxi=(j=1n(aij+aji)xj)+bi+λsgn(xi)

By the symmetry assumption, this becomes:

fxi=(j=1n2aijxj)+bi+λsgn(xi)

The partial derivative is undefined when xi=0.

The gradient vector exists if and only if all the coordinates are nonzero.

In vector notation, the gradient vector is as follows for all x with all coordinates nonzero:

f(x)=2Ax+b+λsgn¯(x)

where sgn¯ is the signum vector function.

Hessian matrix

The Hessian matrix of the function, defined wherever all the coordinates are nonzero, is the matrix 2A.