L1-regularized quadratic function of multiple variables
A -regularized quadratic function of the variables is a function of the form (satisfying the positive definiteness condition below):
In vector form, if we denote by the column vector with coordinates , then we can write the function as:
where is the matrix with entries and is the column vector with entries .
Note that the matrix is non-unique: if then we could replace by . Therefore, we could choose to replace by the matrix . We will thus assume that is a symmetric matrix.
We impose the further restriction that the matrix be a symmetric positive definite matrix.
|default domain||the whole of|
Partial derivatives and gradient vector
The partial derivative with respect to the variable , and therefore also the coordinate of the gradient vector (if it exists), is given as follows when :
By the symmetry assumption, this becomes:
The partial derivative is undefined when .
The gradient vector exists if and only if all the coordinates are nonzero.
In vector notation, the gradient vector is as follows for all with all coordinates nonzero:
where is the signum vector function.
The Hessian matrix of the function, defined wherever all the coordinates are nonzero, is the matrix .
We know the following two facts about the function:
- The function is a strictly convex function, and therefore it has a unique point of local minimum on the whole domain that is therefore also the point of absolute minimum. Further, it is either one of the points where the gradient vector is undefined, or it is the unique point where the gradient vector is defined and equal to zero.
- The function is a piecewise quadratic function, with the quadratic form associated with the function (namely, the quadratic term in the expression) the same in all pieces. The total number of pieces is , namely the different orthants (the points with different sign combinations for the coordinates).
Since is a symmetric positive definite matrix, we can write in the form:
where is a invertible matrix.
We can "complete the square" for this function:
In other words:
If this equals zero, that must happen at a unique point, and that point must satisfy:
Simplifying, we obtain that if we can find a solution to the equation below, that is the unique point of local and absolute minimum:
Moreover, the value of the minimum is: