Dane Johnson
1
Part a
We want to find xMAP = arg max_{x∈ℝ} πpost(x) = arg min_{x∈ℝ} (−log πpost(x)). In the first case we find the minimum of −log(πpost(x)).
Setting the derivative of −log(πpost(x)) to zero shows that x = 0 is a critical point.
Then, since in this scenario the second derivative of −log(πpost(x)) is positive for any x, x = 0 must be the minimum of −log(πpost(x)). So in this scenario xMAP = 0.
In the second case we again find the minimum of −log(πpost(x)). Setting the derivative to zero shows that x = 1 is a critical point.
Then, since in this scenario the second derivative of −log(πpost(x)) is positive for any x, x = 1 must be the minimum of −log(πpost(x)). So in this scenario xMAP = 1.
Since πpost(x) is the sum of two scaled Gaussian functions (which, under our coefficient assumptions, are strictly positive), one with mean x = 0 and the other with mean x = 1, the tails of the two components interact nontrivially whenever neither coefficient is significantly larger than the other, giving an xMAP strictly between 0 and 1. The relative sizes of the coefficients then determine whether xMAP lies closer to 0 or to 1 (see figures 3 and 4).
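The effect of the coefficients can be checked numerically. The following sketch locates xMAP for a two-component Gaussian mixture by a grid search; the weight alpha and the standard deviations s0, s1 are illustrative values, not the assignment's parameters.

```python
import numpy as np

# Mixture posterior: alpha * N(x; 0, s0^2) + (1 - alpha) * N(x; 1, s1^2).
# alpha, s0, s1 are illustrative, not the assignment's values.
def post(x, alpha=0.7, s0=0.3, s1=0.3):
    g0 = np.exp(-x**2 / (2 * s0**2)) / (s0 * np.sqrt(2 * np.pi))
    g1 = np.exp(-(x - 1)**2 / (2 * s1**2)) / (s1 * np.sqrt(2 * np.pi))
    return alpha * g0 + (1 - alpha) * g1

xs = np.linspace(-1.0, 2.0, 300001)   # fine grid over the support
x_map = xs[np.argmax(post(xs))]       # arg max of the posterior density
```

With more weight on the component at 0, the maximiser lands just above 0, pulled slightly toward 1 by the tail of the other component.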
Next, the conditional mean:
The simplification comes from the fact that the two components of πpost(x) are themselves Gaussian functions and can each serve as a posterior distribution. In lecture we saw that in this case xCM = µ, the mean of the Gaussian. Splitting the integral as we did applies the definition of xCM to each of these two Gaussian functions, which have means 0 and 1.
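The splitting argument can be verified by quadrature: for a normalised two-component mixture, the conditional mean is the weighted sum of the component means. The weights w0, w1 and the standard deviations below are illustrative, not values from the assignment.

```python
import numpy as np

# pi(x) = w0 * N(x; 0, s0^2) + w1 * N(x; 1, s1^2) with w0 + w1 = 1,
# so the conditional mean should be w0 * 0 + w1 * 1 = w1.
w0, w1, s0, s1 = 0.6, 0.4, 0.2, 0.2
xs = np.linspace(-3.0, 4.0, 400001)
dx = xs[1] - xs[0]
pi_post = (w0 * np.exp(-xs**2 / (2 * s0**2)) / (s0 * np.sqrt(2 * np.pi))
           + w1 * np.exp(-(xs - 1)**2 / (2 * s1**2)) / (s1 * np.sqrt(2 * np.pi)))
x_cm = np.sum(xs * pi_post) * dx   # Riemann sum for the integral of x * pi(x)
```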
Part b
See the end of the document for the figures. In the first case, xMAP is arguably the better representation of the underlying function, since a larger proportion of the area under the curve (that is, of the probability mass) is clustered around that value. In the second case, the conditional mean is better under the same criterion of preferring a value near which most of the mass is clustered. In the first case the variance is small relative to α, which explains why the mass is clustered away from 1 − α; in the second case the larger spread around x = 1 means that 1 − α ≈ 1 lies nearer to where most of the area under the curve is located.
In the first case the conditional covariance follows from the mixture variance formula. In the second case the same formula gives σ² = 0.0198. In both cases the conditional covariance is much closer to α than either of the two component variances.
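The conditional covariance can likewise be checked two ways: by quadrature as the second central moment, and via the standard mixture-variance identity. All parameter values in this sketch are illustrative, not the assignment's.

```python
import numpy as np

# var = w0*(s0^2 + m0^2) + w1*(s1^2 + m1^2) - x_cm^2 for a two-component
# Gaussian mixture; compared below against direct quadrature.
w0, w1 = 0.6, 0.4
m0, m1 = 0.0, 1.0
s0, s1 = 0.2, 0.2
xs = np.linspace(-3.0, 4.0, 400001)
dx = xs[1] - xs[0]
pi_post = (w0 * np.exp(-(xs - m0)**2 / (2 * s0**2)) / (s0 * np.sqrt(2 * np.pi))
           + w1 * np.exp(-(xs - m1)**2 / (2 * s1**2)) / (s1 * np.sqrt(2 * np.pi)))
x_cm = np.sum(xs * pi_post) * dx
var_quad = np.sum((xs - x_cm)**2 * pi_post) * dx        # E[(x - x_cm)^2]
var_formula = w0 * (s0**2 + m0**2) + w1 * (s1**2 + m1**2) - x_cm**2
```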
2
The model can be transformed to
log(Y) = log(E) + log(G(X)),
where log(E) ∼ N(0, σ²I). When the variance is unknown, the likelihood function for log(Y) conditioned on X = x is
π(log(y) | x, σ²) = (2πσ²)^(−n/2) exp(−‖log(y) − log(G(x))‖² / (2σ²)),
while in the case that the variance is known, the likelihood function is
π(log(y) | x) ∝ exp(−‖log(y) − log(G(x))‖² / (2σ²)).
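The transformation can be sanity-checked by simulation: multiplying by lognormal noise in the original model becomes adding Gaussian noise after taking logs. Here G is a hypothetical forward map and sigma an assumed noise level; neither comes from the assignment.

```python
import numpy as np

# Y = E * G(X)  ->  log(Y) = log(G(X)) + log(E), with log(E) ~ N(0, sigma^2).
rng = np.random.default_rng(0)
sigma = 0.1

def G(x):
    return np.exp(-x) + 1.0        # any strictly positive forward model works

x = 0.7
log_E = sigma * rng.standard_normal(100_000)
Y = np.exp(log_E) * G(x)           # multiplicative noise in the original model
resid = np.log(Y) - np.log(G(x))   # additive Gaussian noise after the transform
```

The residuals recover log(E), so their sample mean is near 0 and their sample standard deviation is near sigma, matching the stated noise model.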
Figures



