# Logarithmic scoring rule is proper

## Statement

The logarithmic scoring rule is a proper scoring rule. Explicitly:

Consider a random variable that can take $n$ distinct values $1, 2, \dots, n$. Suppose we estimate probabilities $q_1, q_2, \dots, q_n$ for these values respectively (with $q_i \ge 0$, $\sum_{i=1}^n q_i = 1$). The logarithmic scoring rule works as follows: for every instance $i$ of the random variable, we assign a score equal to the negative of the logarithm of the corresponding probability $q_i$. Explicitly, if the instances are $i_1, i_2, \dots, i_m$, the total score is:

$$\sum_{j=1}^m -\log q_{i_j}$$

The claim is that, if the actual probabilities are $p_1, p_2, \dots, p_n$, then the assignment that minimizes the expected value of the score is $q_i = p_i$ for all $i$.
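
As an illustration of the statement, here is a minimal Python sketch. The function names `total_log_score` and `expected_log_score`, the example probabilities, and the 0-based indexing of values are all assumptions made for this example; it is a numerical sanity check, not a proof.

```python
import math

def total_log_score(q, instances):
    """Total logarithmic score: sum of -log q[i] over the observed instances."""
    return sum(-math.log(q[i]) for i in instances)

def expected_log_score(p, q):
    """Expected score of one instance: sum over i of p[i] * (-log q[i])."""
    return sum(pi * -math.log(qi) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]                      # hypothetical actual probabilities

# Total score for three observed instances (values indexed from 0 here).
observed = total_log_score(p, [0, 0, 1])

# Expected score when reporting the actual probabilities versus other estimates.
honest = expected_log_score(p, p)
skewed = expected_log_score(p, [0.6, 0.3, 0.1])
uniform = expected_log_score(p, [1 / 3, 1 / 3, 1 / 3])

print(honest, skewed, uniform)
```

In this example the estimate equal to the actual probabilities attains the smallest expected score of the three, consistent with the claim.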

## Related facts

## Proof

### Reduction to one random instance

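Under the assumption that the instances are independent of each other and identically distributed, the expected total score is the sum of the expected scores of the individual instances, so it suffices to show the result for one instance.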
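
The reduction can be checked numerically on a small case. The sketch below is an assumption-laden illustration: `expected_single` and `expected_total` are hypothetical helper names, and the probabilities and the number of instances are made up. It enumerates every possible sequence of instances and compares the exact expected total score with the number of instances times the expected score of one instance.

```python
import itertools
import math

def expected_single(p, q):
    """Expected score of one instance: sum over i of p[i] * (-log q[i])."""
    return sum(pi * -math.log(qi) for pi, qi in zip(p, q))

def expected_total(p, q, m):
    """Exact expected total score over m independent instances, by enumeration."""
    total = 0.0
    for seq in itertools.product(range(len(p)), repeat=m):
        prob = math.prod(p[i] for i in seq)          # independence: probabilities multiply
        score = sum(-math.log(q[i]) for i in seq)    # total score of this sequence
        total += prob * score
    return total

p = [0.5, 0.3, 0.2]   # hypothetical actual probabilities
q = [0.6, 0.3, 0.1]   # hypothetical estimate
m = 3

lhs = expected_total(p, q, m)
rhs = m * expected_single(p, q)
print(lhs, rhs)
```

Since the two quantities differ only by the constant factor `m`, any estimate that minimizes one also minimizes the other.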

### Reduction to the case that all probabilities are strictly between zero and one

We now show that if any particular $p_i = 0$, the corresponding $q_i$ must equal 0, and if any $p_i = 1$, the corresponding $q_i$ must equal 1.

*Fill this in later*

### Proof for one instance and where all the actual probabilities are nonzero and less than one

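The expected value of the score for one instance, with actual probabilities $p_i$ and estimated probabilities $q_i$, is:

$$\sum_{i=1}^n p_i (-\log q_i) = -\sum_{i=1}^n p_i \log q_i$$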

In words, we weight each score by the probability that that score is attained.

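We are constrained to lie on the codimension one hyperplane given by $\sum_{i=1}^n q_i = 1$. We can therefore use the idea of Lagrange multipliers to find the optima. The gradient vector of the expected value function with respect to $(q_1, q_2, \dots, q_n)$ is the vector with coordinates:

$$\left(-\frac{p_1}{q_1}, -\frac{p_2}{q_2}, \dots, -\frac{p_n}{q_n}\right)$$

The normal vector to the hyperplane is given as the gradient vector of the function $q_1 + q_2 + \dots + q_n$, and is the vector:

$$(1, 1, \dots, 1)$$

By the theory of Lagrange multipliers, we have that at any local extreme value, there exists a value $\lambda$ such that:

$$-\frac{p_i}{q_i} = \lambda$$

for all $i$. In other words:

$$p_i = -\lambda q_i$$

for all $i$. Adding up, we get:

$$\sum_{i=1}^n p_i = -\lambda \sum_{i=1}^n q_i$$

The constraint gives $\sum_{i=1}^n q_i = 1$, and we have that $\sum_{i=1}^n p_i = 1$ as well (these are the actual probabilities, so they add up to 1), so we get:

$$1 = -\lambda$$

so $\lambda = -1$. Plugging back, we get that the only point that could potentially be a point of local extremum satisfies $q_i = p_i$ for all $i$.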
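
The Lagrange condition can be checked numerically. In this rough sketch, `gradient` and `is_multiple_of_ones` are hypothetical helper names and the probabilities are made up. It evaluates the gradient of the expected score at the estimate equal to the actual probabilities, where every coordinate should equal -1, and at another estimate, where the gradient should fail to be parallel to the all-ones normal vector.

```python
def gradient(p, q):
    """Gradient of the expected score -sum_i p[i] * log(q[i]) with respect to q."""
    return [-pi / qi for pi, qi in zip(p, q)]

def is_multiple_of_ones(v, tol=1e-12):
    """True when all coordinates of v agree, i.e. v is parallel to (1, 1, ..., 1)."""
    return max(v) - min(v) <= tol

p = [0.5, 0.3, 0.2]                              # hypothetical actual probabilities

grad_at_p = gradient(p, p)                       # every coordinate is -1, so the multiplier is -1
grad_elsewhere = gradient(p, [0.6, 0.3, 0.1])    # coordinates differ from one another

print(grad_at_p, is_multiple_of_ones(grad_at_p))
print(grad_elsewhere, is_multiple_of_ones(grad_elsewhere))
```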

We can now verify that this is indeed a point of local *minimum*. *Fill this in later*
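
Pending the full argument, one way the verification might go is through second derivatives: the second partial derivative of the expected score with respect to each estimated probability is the actual probability divided by the square of the estimate, and the mixed partials vanish, so the function is strictly convex wherever all probabilities are positive. The sketch below is only a numerical check of that idea, not the author's argument; `hessian_diagonal`, the perturbation directions, and the step size are assumptions chosen for illustration.

```python
import math

def expected_log_score(p, q):
    """Expected score of one instance: sum over i of p[i] * (-log q[i])."""
    return sum(pi * -math.log(qi) for pi, qi in zip(p, q))

def hessian_diagonal(p, q):
    """Second partials p[i] / q[i]**2 of the expected score (mixed partials are zero)."""
    return [pi / qi ** 2 for pi, qi in zip(p, q)]

p = [0.5, 0.3, 0.2]                  # hypothetical actual probabilities
diag = hessian_diagonal(p, p)        # all entries positive when every p[i] > 0

# Directions whose coordinates sum to zero keep the estimate on the hyperplane.
directions = [[1, -1, 0], [0, 1, -1], [1, 0, -1], [-2, 1, 1]]
eps = 0.01
base = expected_log_score(p, p)
increases = [
    expected_log_score(p, [pi + sign * eps * di for pi, di in zip(p, d)]) - base
    for d in directions
    for sign in (1, -1)
]
print(diag, min(increases))
```

Every feasible perturbation tried here raises the expected score, which is what a local minimum requires.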

We can also verify that the absolute minimum does not occur at the boundary: if $p_i > 0$ but we set $q_i = 0$, then our expected score is $\infty$, because there's a nonzero probability of paying an infinite cost, namely, in the case that the random variable takes the value $i$.
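
The boundary behaviour can be illustrated with a short sketch. Here `expected_log_score` is a hypothetical helper that, by assumption, treats the negative logarithm of zero as positive infinity and skips values whose actual probability is zero; the probabilities are made up.

```python
import math

def expected_log_score(p, q):
    """Expected score, treating -log(0) as +infinity and skipping impossible values."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue              # the value never occurs, so it contributes nothing
        if qi == 0:
            return math.inf       # positive chance of paying an infinite cost
        total += pi * -math.log(qi)
    return total

p = [0.5, 0.3, 0.2]                                   # hypothetical actual probabilities
on_boundary = expected_log_score(p, [0.7, 0.3, 0.0])  # assigns 0 to a possible value
interior = expected_log_score(p, p)

print(on_boundary, interior)
```

Any estimate that assigns zero probability to a value that can actually occur has infinite expected score, so it cannot beat an interior estimate with finite expected score.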