1 Homework 3 – Berkeley STAT 157
In [1]: from mxnet import nd, autograd, gluon
        import matplotlib.pyplot as plt
2 1. Logistic Regression for Binary Classification
In multiclass classification we typically use the exponential model
$$p(y|o) = \mathrm{softmax}(o)_y = \frac{\exp(o_y)}{\sum_{y'} \exp(o_{y'})}$$
1.1. Show that this parametrization has a spurious degree of freedom. That is, show that both o and o + c with c ∈ R lead to the same probability estimate.
1.2. For binary classification, i.e. whenever we have only two classes {−1, 1}, we can arbitrarily set o_{−1} = 0. Using the shorthand o = o_1, show that this is equivalent to
$$p(y = 1|o) = \frac{1}{1 + \exp(-o)}$$
1.3. Show that the log-likelihood loss (often called logistic loss) for labels y ∈ {−1,1} is thus given by
−log p(y|o) = log(1 + exp(−y · o))
1.4. Show that for y = −1 the logistic loss asymptotes to o for o → ∞ and to exp(o) for o → −∞.
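Parts 1.1 to 1.3 can be sanity-checked numerically before writing the proofs. The sketch below is plain Python rather than Gluon. The helper names `softmax`, `sigmoid` and `logistic_loss`, and the test values, are illustrative assumptions and not part of the assignment. A numeric check does not replace the derivation.

```python
import math

def softmax(o):
    """Softmax of a list of logits (shifted by max(o) for numerical stability)."""
    m = max(o)
    exps = [math.exp(v - m) for v in o]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(o):
    """Logistic function 1 / (1 + exp(-o))."""
    return 1.0 / (1.0 + math.exp(-o))

def logistic_loss(y, o):
    """Binary logistic loss log(1 + exp(-y * o)) for y in {-1, +1}."""
    return math.log(1.0 + math.exp(-y * o))

# 1.1: adding the same constant c to every logit leaves the probabilities unchanged.
o = [0.5, -1.2, 3.0]   # arbitrary example logits (assumption)
c = 7.0                # arbitrary shift (assumption)
p = softmax(o)
p_shifted = softmax([v + c for v in o])

# 1.2: with o_{-1} = 0 and o_1 = o, the two-class softmax is a sigmoid in o.
o1 = 1.7
p_neg, p_pos = softmax([0.0, o1])   # index 0 is class -1, index 1 is class +1

# 1.3: the negative log-probability matches log(1 + exp(-y * o)) for both labels.
nll_pos = -math.log(p_pos)
nll_neg = -math.log(p_neg)

print(p, p_shifted)
print(p_pos, sigmoid(o1))
print(nll_pos, logistic_loss(1, o1))
print(nll_neg, logistic_loss(-1, o1))
```

Each printed pair should agree up to floating-point rounding.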
3 2. Logistic Regression and Autograd
1. Implement the binary logistic loss l(y,o) = log(1 + exp(−y · o)) in Gluon
2. Plot its values for y ∈ {−1,1} over the range of o ∈ [−5,5].
3. Plot its derivative with respect to o for o ∈ [−5,5] using ‘autograd’.
In [2]: def loss(y,o):
            ## add your loss function here
            return l
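One possible shape for the loss and its derivative is sketched below in plain Python, as a stand-in for the Gluon and `autograd` version the exercise asks for. The function names, the overflow-safe rewriting, and the evaluation grid are assumptions made for illustration. The derivative is written by hand and compared against a central finite difference, which is the same check one could run against `autograd` output.

```python
import math

def logistic_loss(y, o):
    """l(y, o) = log(1 + exp(-y * o)), evaluated without overflow for large |o|."""
    m = -y * o
    # Identity: log(1 + exp(m)) = max(m, 0) + log(1 + exp(-|m|))
    return max(m, 0.0) + math.log1p(math.exp(-abs(m)))

def logistic_loss_grad(y, o):
    """Hand-derived derivative dl/do = -y / (1 + exp(y * o))."""
    z = y * o
    if z >= 0:
        e = math.exp(-z)
        return -y * e / (1.0 + e)
    return -y / (1.0 + math.exp(z))

def numeric_grad(y, o, eps=1e-6):
    """Central finite difference, used only to cross-check the formula above."""
    return (logistic_loss(y, o + eps) - logistic_loss(y, o - eps)) / (2 * eps)

# Evaluate on o in [-5, 5] for both labels (grid spacing is an arbitrary choice).
grid = [-5 + 0.5 * i for i in range(21)]
for o in grid:
    print(o,
          logistic_loss(1, o), logistic_loss(-1, o),
          logistic_loss_grad(1, o), logistic_loss_grad(-1, o))
```

In terms of the margin y · o, this loss behaves like exp(−y · o) when the margin is large and positive, and like −y · o when it is large and negative.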
4 3. Ohm’s Law
Imagine that you’re a young physicist, maybe named Georg Simon Ohm, trying to figure out how current and voltage depend on each other for resistors. You have some idea but you aren’t quite sure yet whether the dependence is linear or quadratic. So you take some measurements, conveniently given to you as ‘ndarrays’ in Python. They are indicated by ‘current’ and ‘voltage’.
Your goal is to identify the coefficients of the following three models using automatic differentiation and least mean squares regression. The three models are:
1. Quadratic model where voltage = c + r · current + q · current^2.
2. Linear model where voltage = c + r · current.
3. Ohm’s law where voltage = r · current.
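For the simplest of the three models, voltage = r · current, the fitting loop can be sketched in plain Python as follows. This is a stand-in, not the Gluon solution: the gradient is derived by hand where the assignment asks for automatic differentiation, and the learning rate, step count, and helper names are assumptions chosen for illustration. The measurements are copied from this section as Python lists.

```python
current = [1.5420291, 1.8935232, 2.1603365, 2.5381863, 2.893443,
           3.838855, 3.925425, 4.2233696, 4.235571, 4.273397,
           4.9332876, 6.4704757, 6.517571, 6.87826, 7.0009003,
           7.035741, 7.278681, 7.7561755, 9.121138, 9.728281]
voltage = [63.802246, 80.036026, 91.4903, 108.28776, 122.781975,
           161.36314, 166.50816, 176.16772, 180.29395, 179.09758,
           206.21027, 272.71857, 272.24033, 289.54745, 293.8488,
           295.2281, 306.62274, 327.93243, 383.16296, 408.65967]
n = len(current)

def mse(r):
    """Mean squared error of the model voltage = r * current."""
    return sum((r * i - v) ** 2 for i, v in zip(current, voltage)) / n

def mse_grad(r):
    """Hand-derived d(mse)/dr = (2/n) * sum(i * (r * i - v))."""
    return 2.0 * sum(i * (r * i - v) for i, v in zip(current, voltage)) / n

# Plain gradient descent; lr = 0.01 is small enough to be stable for this data.
r = 0.0
lr = 0.01
for _ in range(200):
    r -= lr * mse_grad(r)

# Closed-form least-squares solution for a line through the origin, as a cross-check.
r_closed = sum(i * v for i, v in zip(current, voltage)) / sum(i * i for i in current)

print(r, r_closed, mse(r))
```

The linear and quadratic models follow the same pattern with two and three parameters, one gradient component per coefficient.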
In [3]: current = nd.array([1.5420291, 1.8935232, 2.1603365, 2.5381863, 2.893443,
                            3.838855, 3.925425, 4.2233696, 4.235571, 4.273397,
                            4.9332876, 6.4704757, 6.517571, 6.87826, 7.0009003,
                            7.035741, 7.278681, 7.7561755, 9.121138, 9.728281])
        voltage = nd.array([63.802246, 80.036026, 91.4903, 108.28776, 122.781975,
                            161.36314, 166.50816, 176.16772, 180.29395, 179.09758,
                            206.21027, 272.71857, 272.24033, 289.54745, 293.8488,
                            295.2281, 306.62274, 327.93243, 383.16296, 408.65967])
5 4. Entropy
Let’s compute the binary entropy of a number of interesting data sources.
1. Assume that you’re watching the output generated by a monkey at a typewriter. The monkey presses any of the 44 keys of the typewriter at random (you can assume that it has not discovered any special keys or the shift key yet). How many bits of randomness per character do you observe?
2. Unhappy with the monkey, you replace it with a drunk typesetter. It is able to generate words, albeit not coherently. Instead, it picks a random word out of a vocabulary of 2,000 words. Moreover, assume that the average length of a word is 4.5 letters in English. How many bits of randomness per character do you observe now?
3. Still unhappy with the result, you replace the typesetter with a high-quality language model. These can obtain perplexity numbers as low as 20 points per character. The perplexity is defined as a length-normalized probability, i.e.
$$\mathrm{PPL}(x) = \left[p(x)\right]^{1/\mathrm{length}(x)}$$
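The three cases reduce to short logarithm computations, sketched below under stated assumptions: keys and words are drawn uniformly and independently, spaces between words are ignored, and a perplexity of 20 per character is read in the conventional sense, as 2 raised to the per-character entropy in bits (the inverse of the length-normalized probability). The variable names are illustrative.

```python
import math

# 1. Monkey: 44 equally likely keys, so each character carries log2(44) bits.
bits_monkey = math.log2(44)

# 2. Typesetter: one of 2,000 equally likely words, spread over 4.5 letters on average.
bits_per_word = math.log2(2000)
bits_typesetter = bits_per_word / 4.5

# 3. Language model: perplexity 20 per character corresponds to log2(20) bits per character.
bits_language_model = math.log2(20)

print(bits_monkey, bits_typesetter, bits_language_model)
```

Under these assumptions the word-level typesetter carries the fewest bits per character of the three sources, since 4.5 letters share the log2(2000) bits of a single word choice.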
6 5. Wien’s Approximation for the Temperature (bonus)
We will now abuse Gluon to estimate the temperature of a black body. The energy emanated from a black body is given by Wien’s approximation:
$$B_\lambda(T) = \frac{2 h c^2}{\lambda^5} \exp\left(-\frac{h c}{\lambda k T}\right)$$
That is, the amount of energy depends on the inverse fifth power of the wavelength λ and, through the exponential factor, on the temperature T of the body. The latter ensures a cutoff beyond a temperature-characteristic peak. Let us define this and plot it.
In [4]: # Lightspeed
        c = 299792458
        # Planck's constant
        h = 6.62607004e-34
        # Boltzmann constant
        k = 1.38064852e-23
        # Wavelength scale (micrometers)
        lamscale = 1e-6
        # Pulling out all powers of 10 upfront
        p_out = 2 * h * c**2 / lamscale**5
        p_in = (h / k) * (c/lamscale)

        # Wien's law
        def wien(lam, t):
            return (p_out / lam**5) * nd.exp(-p_in / (lam * t))

        # Plot the radiance for a few different temperatures
        lam = nd.arange(0,100,0.01)
        for t in [10, 100, 150, 200, 250, 300, 350]:
            radiance = wien(lam, t)
            plt.plot(lam.asnumpy(), radiance.asnumpy(), label=('T=' + str(t) + 'K'))
        plt.legend()
        plt.show()

Next we assume that we are a fearless physicist measuring some data. Of course, we need to pretend that we don’t really know the temperature. But we measure the radiation at a few wavelengths.
In [5]: # real temperature is approximately 0C
        realtemp = 273
        # we observe at 3000nm up to 20,000nm wavelength
        wavelengths = nd.arange(3,20,2)
        # our infrared filters are pretty lousy …
        delta = nd.random_normal(shape=(len(wavelengths))) * 1
        radiance = wien(wavelengths + delta,realtemp)
        plt.plot(wavelengths.asnumpy(), radiance.asnumpy(), label='measured')
        plt.plot(wavelengths.asnumpy(), wien(wavelengths, realtemp).asnumpy(), labe
        plt.legend()
        plt.show()

Use Gluon to estimate the real temperature based on the variables wavelengths and radiance.
• You can use Wien’s law implementation wien(lam,t) as your forward model.
• Use the loss function l(y, y′) = (log y − log y′)^2 to measure accuracy.
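A useful cross-check for whatever the Gluon optimization returns: under the log-squared loss above, the logarithm of Wien's law is linear in 1/T, so the minimizer has a closed form. The sketch below is plain Python and swaps in that closed-form least-squares solution for the gradient-based fit the exercise asks for. The function name `estimate_temperature` and the noise-free synthetic measurements are assumptions made for illustration.

```python
import math

# Constants and scaling copied from the notebook cell that defines wien().
c = 299792458
h = 6.62607004e-34
k = 1.38064852e-23
lamscale = 1e-6
p_out = 2 * h * c**2 / lamscale**5
p_in = (h / k) * (c / lamscale)

def wien(lam, t):
    """Wien's approximation for a scalar wavelength (in units of lamscale)."""
    return (p_out / lam**5) * math.exp(-p_in / (lam * t))

def estimate_temperature(wavelengths, radiance):
    """Minimize sum((log y - log wien(lam, T))**2) over u = 1/T in closed form.

    log wien(lam, T) = a - b * u with a = log(p_out) - 5 * log(lam), b = p_in / lam,
    so the least-squares solution is u = sum(b * (a - log y)) / sum(b * b).
    """
    num = 0.0
    den = 0.0
    for lam, y in zip(wavelengths, radiance):
        a = math.log(p_out) - 5 * math.log(lam)
        b = p_in / lam
        num += b * (a - math.log(y))
        den += b * b
    return den / num  # T = 1 / u

# Synthetic, noise-free measurements at the same wavelengths as nd.arange(3, 20, 2).
wavelengths = [float(w) for w in range(3, 20, 2)]
radiance = [wien(lam, 273.0) for lam in wavelengths]
t_hat = estimate_temperature(wavelengths, radiance)
print(t_hat)
```

With the noisy `radiance` from the notebook the estimate will differ from 273, since the perturbation there is applied to the wavelengths rather than to the radiance values.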
