Created by Ankur Mali
Course Policy: Carefully read all the instructions before you start working on the assignment.
• Please avoid single-line answers; submissions without any explanation will receive 0 points.
• Late assignments lose 50 percent of their points per day (formula: x/2, where x is the number of days late); the counter starts at 02:30:00 pm.
• We will create a Canvas submission page for this assignment. Submit all files on Canvas.
• All queries related to the assignment should use the subject line "IST597:Assignment00100 Queries".
Problem 1. Build your own custom optimizer. (7 points)
In this assignment you will build a custom stochastic optimization algorithm to update your model weights. You will modify the starter code provided for assignment one and build on top of it. In other words, you will replace the Keras optimizer with a custom-built optimizer (Algorithm 1). Compare the custom optimizer with the built-in Keras optimizers (SGD, RMSProp, and Adam) and show performance across ten trials. Report your findings and comment on speed, stability, and robustness. Note: based on assignment 1, select the best model (with and without regularization) for each dataset. You should have a total of 12 models for both datasets.
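One way to organize the ten-trial comparison is sketched below. It is only an illustration: `run_trials`, `train_once`, and `fake_train` are hypothetical names, and the sketch assumes each training run returns a single final score (for example, test accuracy). The mean score speaks to quality, the standard deviation across seeds to stability, and the mean wall-clock time to speed.

```python
import statistics
import time


def run_trials(train_once, n_trials=10):
    """Run `train_once(seed)` n_trials times and summarize the results.

    `train_once` is a hypothetical callable: it trains one model with the
    given seed and returns its final score as a float.
    """
    scores, seconds = [], []
    for seed in range(n_trials):
        start = time.perf_counter()
        scores.append(train_once(seed))
        seconds.append(time.perf_counter() - start)
    return {
        "mean_score": statistics.mean(scores),
        # Spread across seeds is a simple proxy for stability.
        "std_score": statistics.stdev(scores) if n_trials > 1 else 0.0,
        "mean_seconds": statistics.mean(seconds),
    }


# Stand-in for a real training run, so the sketch executes on its own.
def fake_train(seed):
    return 0.90 + 0.001 * seed


summary = run_trials(fake_train, n_trials=10)
print(summary)
```

In practice `fake_train` would be replaced by a function that builds the model, trains it with one optimizer, and evaluates it; calling `run_trials` once per optimizer gives directly comparable summaries.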
Algorithm 1: Stochastic optimization. Here g_t^2 and g_t^3 indicate the element-wise square and cube of g_t, respectively. Set α = [1e−2, 2e−5], β1 = 0.9, β2 = 0.999, β3 = 0.999987, and […]. All operations on vectors are element-wise. With β1^t, β2^t, and β3^t we denote β1, β2, and β3 to the power t.
Require: α : Stepsize
Require: β1, β2, β3 ∈ [0, 1] : Exponential decay rates for the moment estimates
Require: f(θ) : Stochastic objective function with parameters θ
Require: θ_0 : Initial parameter vector
m_0 ← 0 (Initialize 1st moment vector)
v_0 ← 0 (Initialize 2nd moment vector)
u_0 ← 0 (Initialize 3rd moment vector)
t ← 0 (Initialize timestep)
while θ_t not converged do
    t ← t + 1
    g_t ← ∇θ f_t(θ_{t−1}) (Get gradients w.r.t. stochastic objective at timestep t)
    m_t ← β1 · m_{t−1} + (1 − β1) · g_t (Update biased first moment estimate)
    v_t ← β2 · v_{t−1} + (1 − β2) · g_t^2 (Update biased second raw moment estimate)
    u_t ← β3 · u_{t−1} + (1 − β3) · g_t^3 (Update biased third raw moment estimate)
    m̂_t ← m_t / (1 − β1^t) (Compute bias-corrected first moment estimate)
    v̂_t ← v_t / (1 − β2^t) (Compute bias-corrected second raw moment estimate)
    û_t ← u_t / (1 − β3^t) (Compute bias-corrected third raw moment estimate)
    (Update parameters)
end while
return θ_t (Resulting parameters)
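The moment bookkeeping in the algorithm can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the required solution: the class name `ThreeMomentOptimizer` and the helper `adam_like_step` are hypothetical, the bias correction is assumed to divide each moment by 1 − β^t, and the final parameter-update formula is not given in the text above, so the default step is an Adam-style placeholder (with an assumed ε = 1e−8) that ignores the third moment. The `step_rule` argument is where the actual update from the assignment would be plugged in.

```python
import math


def adam_like_step(m_hat, v_hat, u_hat, eps=1e-8):
    """PLACEHOLDER update direction (assumption): Adam-style, ignores u_hat."""
    return [m / (math.sqrt(v) + eps) for m, v in zip(m_hat, v_hat)]


class ThreeMomentOptimizer:
    """Tracks first, second, and third moment estimates of the gradient."""

    def __init__(self, n_params, alpha=1e-2, beta1=0.9, beta2=0.999,
                 beta3=0.999987, step_rule=None):
        self.alpha = alpha
        self.betas = (beta1, beta2, beta3)
        self.m = [0.0] * n_params  # 1st moment vector
        self.v = [0.0] * n_params  # 2nd moment vector
        self.u = [0.0] * n_params  # 3rd moment vector
        self.t = 0                 # timestep
        # Swap in the assignment's own update rule here.
        self.step_rule = step_rule or adam_like_step

    def step(self, theta, grad):
        """Return updated parameters given current parameters and gradient."""
        self.t += 1
        b1, b2, b3 = self.betas
        # Biased moment estimates (element-wise square and cube of grad).
        self.m = [b1 * m + (1 - b1) * g for m, g in zip(self.m, grad)]
        self.v = [b2 * v + (1 - b2) * g ** 2 for v, g in zip(self.v, grad)]
        self.u = [b3 * u + (1 - b3) * g ** 3 for u, g in zip(self.u, grad)]
        # Bias correction (assumed form: divide by 1 - beta^t).
        m_hat = [m / (1 - b1 ** self.t) for m in self.m]
        v_hat = [v / (1 - b2 ** self.t) for v in self.v]
        u_hat = [u / (1 - b3 ** self.t) for u in self.u]
        direction = self.step_rule(m_hat, v_hat, u_hat)
        return [th - self.alpha * d for th, d in zip(theta, direction)]


# Toy objective f(theta) = sum(theta_i^2), whose gradient is 2 * theta.
theta = [1.0, -2.0]
opt = ThreeMomentOptimizer(n_params=2, alpha=1e-2)
for _ in range(500):
    grad = [2.0 * th for th in theta]
    theta = opt.step(theta, grad)
print(theta)
```

Keeping the update direction in a separate function means the moment tracking can be tested on its own, and the same skeleton carries over to a TensorFlow version where the lists become tensors and the gradients come from the starter code.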



