Parallel coordinate descent


Definition

Parallel coordinate descent is a variant of gradient descent in which we use a different learning rate in each coordinate. Explicitly, with ordinary gradient descent, each iterate is defined by subtracting a scalar multiple of the gradient vector from the previous iterate:

Ordinary gradient descent: $\vec{x}_{n+1} = \vec{x}_n - \alpha \, \nabla f(\vec{x}_n)$

In parallel coordinate descent, we use a vector learning rate, i.e., a learning rate that may differ in each coordinate:

Parallel coordinate descent: for each coordinate $i$: $(\vec{x}_{n+1})_i = (\vec{x}_n)_i - \alpha_i \, \frac{\partial f}{\partial x_i}(\vec{x}_n)$

Alternatively, using coordinate-wise vector multiplication (the Hadamard product, denoted $\odot$), we can describe the above as: $\vec{x}_{n+1} = \vec{x}_n - \vec{\alpha} \odot \nabla f(\vec{x}_n)$
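
A minimal sketch of this update rule in NumPy, under assumed names (`parallel_coordinate_descent`, `grad_f`, `alphas` are illustrative, not from the source) and with a fixed iteration count in place of a proper convergence criterion. The coordinate-wise product `alphas * grad_f(x)` is exactly the vector formulation above.

```python
import numpy as np

def parallel_coordinate_descent(grad_f, x0, alphas, num_iters=100):
    """Sketch of parallel coordinate descent.

    grad_f    : function returning the gradient of f at a point
    x0        : starting point (array-like)
    alphas    : vector of per-coordinate learning rates, same shape as x0
    num_iters : fixed iteration count (a real implementation would
                typically stop on a convergence criterion instead)
    """
    x = np.asarray(x0, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    for _ in range(num_iters):
        # Coordinate-wise (Hadamard) product of the learning-rate vector
        # with the gradient vector -- the update from the definition above.
        x = x - alphas * grad_f(x)
    return x

# Example: f(x, y) = 4x^2 + y^2, whose gradient is (8x, 2y).
# The curvature differs across coordinates, which is a typical reason
# to pick a different learning rate in each coordinate.
grad = lambda x: np.array([8.0 * x[0], 2.0 * x[1]])
minimum = parallel_coordinate_descent(grad, x0=[1.0, 1.0],
                                      alphas=[0.1, 0.4])
print(minimum)  # approaches the minimum at (0, 0)
```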