Worth 15 points
• The last problem of this homework will ask you to do a few simple things in TensorFlow on Google’s computing service, Google Cloud Platform (GCP). This part of the assignment is meant to challenge you to learn a new tool from scratch by reading documentation and following its tutorials. This is tough, by design. Start early so that you have plenty of time to grapple with the material.
1 Warmup: Constructing a 3-tensor (1 point)
Create a TensorFlow constant tensor tflogo with shape 5-by-4-by-3. This tensor will represent the 5-by-4-by-3 volume that contains the orange structure depicted in the logo (said another way, the orange structure is inscribed in this 5-by-4-by-3 volume). Each cell of your tensor should correspond to one cell in this volume. Each entry of your tensor should be 1 if and only if the corresponding cell is part of the orange structure, and should be 0 otherwise. Looking at the logo, we see that the orange structure can be broken into 11 cubic cells, so your tensor tflogo should have precisely 11 non-zero entries. For the sake of consistency, the (0,3,2)-entry of your tensor (using 0-indexing) should correspond to the top rear corner of the structure where the cross of the “T” meets the top of the “F”. Note: if you look carefully, the shadows in the logo do not correctly reflect the orange structure—the shadow of the “T” is incorrectly drawn. Do not let this fool you!
Figure 1: The TensorFlow logo.
were looking at one horizontal slice of the tensor at a time, working your way from top to bottom.
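The indexing convention can be illustrated with a small NumPy sketch. This is only an illustration of the bookkeeping, not the answer: the cells in `example_cells` are placeholders that do not trace the logo, and treating the first axis as the top-to-bottom slice index is an assumption. An array built this way could then be handed to tf.constant.

```python
import numpy as np

# Start from an all-zero 5-by-4-by-3 volume.
# Assumption: the first axis indexes horizontal slices from top to bottom.
volume = np.zeros((5, 4, 3), dtype=np.int32)

# Placeholder cells only; these do NOT describe the orange structure.
example_cells = [(0, 3, 2), (0, 2, 2), (1, 3, 2)]
for cell in example_cells:
    volume[cell] = 1  # a tuple index selects a single (i, j, k) entry

# Sanity check: count the non-zero entries (11 for the real logo).
nonzero_count = int(np.count_nonzero(volume))
print(volume.shape, nonzero_count)

# Print one horizontal slice at a time to inspect the layout.
for i in range(volume.shape[0]):
    print("slice", i)
    print(volume[i])
```

Printing slice by slice makes it easy to compare each 4-by-3 layer against the picture before wrapping the array in a TensorFlow constant.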
2 Building and training simple models (4 points)
In this problem, you’ll use TensorFlow to build the loss functions for a pair of commonly used statistical models. In all cases, your answer should include placeholder variables x and ytrue, which will serve as the predictor (independent variable) and response (dependent variable), respectively. Please use W to denote a parameter that multiplies the predictor, and b to denote a bias parameter (i.e., a parameter that is added).
1. Logistic regression with a negative log-likelihood loss. In this model, which we discussed briefly in class, the binary variable $Y$ is distributed as a Bernoulli random variable with success parameter $\sigma(W^T X + b)$, where $\sigma(z) = (1 + \exp(-z))^{-1}$ is the logistic function, $X \in \mathbb{R}^6$ is the predictor random variable, and $W \in \mathbb{R}^6$, $b \in \mathbb{R}$ are the model parameters. Derive the log-likelihood of $Y$, and write the TensorFlow code that represents the negative log-likelihood loss function. Hint: the loss should be a sum over all observations of a negative log-likelihood term.
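As a numerical reference for checking a TensorFlow loss, here is a minimal NumPy sketch of the summed negative log-likelihood. It assumes the standard Bernoulli form, in which one observation contributes $\log(1 + e^{z}) - yz$ with $z = W^T x + b$; the function name `logistic_nll` is hypothetical, and NumPy stands in for TensorFlow here.

```python
import numpy as np

def logistic_nll(x, y_true, W, b):
    """Summed negative log-likelihood of a logistic model.

    x: (n, d) predictors, y_true: (n,) labels in {0, 1},
    W: (d,) weights, b: scalar bias.
    """
    z = x @ W + b
    # -[y*log(sigma(z)) + (1-y)*log(1-sigma(z))] simplifies to
    # log(1 + exp(z)) - y*z; logaddexp(0, z) evaluates log(1 + exp(z))
    # without overflow for large |z|.
    return float(np.sum(np.logaddexp(0.0, z) - y_true * z))

# With W = 0 and b = 0 every observation has probability 1/2,
# so the loss is n * log(2).
x_demo = np.ones((4, 6))
y_demo = np.array([0.0, 1.0, 0.0, 1.0])
print(logistic_nll(x_demo, y_demo, np.zeros(6), 0.0))
```

The all-zero case is a convenient check: whatever your TensorFlow expression looks like, it should also return n log 2 when W and b are zero.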
2. Estimating parameters in logistic regression. The zip file at http://www-personal.umich.edu/~klevin/teaching/Winter2019/STATS507/HW10_logistic.zip contains four Numpy .npy files that contain train and test data generated from a logistic model:
• logistic_xtest.npy : contains a 500-by-6 matrix whose rows are the independent variables (predictors) from the test set.
• logistic_xtrain.npy : contains a 2000-by-6 matrix whose rows are the independent variables (predictors) from the train set.
• logistic_ytest.npy : contains a binary 500-dimensional vector of dependent variables (responses) from the test set.
• logistic_ytrain.npy : contains a binary 2000-dimensional vector of dependent variables (responses) from the train set.
The i-th row of the matrix in logistic_xtrain.npy is the predictor for the response in the i-th entry of the vector in logistic_ytrain.npy, and analogously for the two test set files. Please include these files in your submission so that we can run your code without downloading them again. Note: we didn’t discuss reading numpy data from files. To load the files, you can simply call xtrain = np.load('xtrain.npy') to read the data into the variable xtrain. xtrain will be a Numpy array.
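If you have not used .npy files before, the following self-contained sketch shows the round trip. The file name `demo_xtrain.npy` and the small array are made up for the demonstration; with the homework data you would pass the real file name to np.load instead.

```python
import os
import tempfile

import numpy as np

# A small stand-in array; the real training matrix is 2000-by-6.
demo = np.arange(12, dtype=np.float64).reshape(2, 6)

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name, used only for this demonstration.
    path = os.path.join(tmpdir, "demo_xtrain.npy")
    np.save(path, demo)     # write the array in .npy format
    loaded = np.load(path)  # read it back as a Numpy array

print(type(loaded).__name__, loaded.shape, loaded.dtype)
```

Checking the shape and dtype right after loading is a cheap way to confirm that the predictors and responses line up before you feed them to your placeholders.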
Load the training data and use it to obtain estimates of W and b by minimizing the negative log-likelihood via gradient descent. Another note: you’ll have to play around with the learning rate and the number of steps. Two good ways to check if optimization is finding a good minimizer:
• Try printing the training data loss before and after optimization.
• Use the test data to validate your estimated parameters.
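The first check above can be sketched in plain NumPy. Everything here is an assumption made for illustration: the data are synthetic, the generating parameters `W_gen` and `b_gen` are invented, and the learning rate and step count are arbitrary starting points rather than recommended values. In your solution TensorFlow's optimizer replaces the hand-written update.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 6

# Synthetic data from a logistic model with made-up parameters.
W_gen = np.array([0.5, -1.0, 0.25, 0.0, 1.0, -0.5])
b_gen = 0.3
x = rng.normal(size=(n, d))
prob = 1.0 / (1.0 + np.exp(-(x @ W_gen + b_gen)))
y = (rng.uniform(size=n) < prob).astype(np.float64)

def nll(W, b):
    # Summed negative log-likelihood, in overflow-safe form.
    z = x @ W + b
    return float(np.sum(np.logaddexp(0.0, z) - y * z))

W = np.zeros(d)
b = 0.0
learning_rate = 0.5  # applied to the gradient averaged over observations
num_steps = 500

loss_before = nll(W, b)
for _ in range(num_steps):
    residual = 1.0 / (1.0 + np.exp(-(x @ W + b))) - y  # sigma(z) - y
    W -= learning_rate * (x.T @ residual) / n
    b -= learning_rate * residual.mean()
loss_after = nll(W, b)

print("loss before:", loss_before)
print("loss after: ", loss_after)
```

If the loss fails to drop, or jumps around, the learning rate is the first thing to adjust.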
3. Evaluating logistic regression on test data. Load the test data. What is the negative log-likelihood of your model on this test data? That is, what is the negative log-likelihood when you use your estimated parameters with the previously unseen test data?
4. Evaluating the estimated logistic parameters. The data was, in reality, generated with
W = (1,1,2,3,5,8), b = −1.
Write TensorFlow expressions to compute the squared error between your estimated parameters and their true values. Evaluate the error in recovering W and b separately. What are the squared errors? Note: you need only evaluate the error of your final estimates, not at every step.
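For reference, the arithmetic can be sketched in NumPy as follows. The estimates `W_hat` and `b_hat` are invented numbers, and reading "squared error" for W as the sum of squared componentwise differences is an assumption.

```python
import numpy as np

# True generating parameters, as stated above.
W_true = np.array([1.0, 1.0, 2.0, 3.0, 5.0, 8.0])
b_true = -1.0

# Hypothetical estimates, for illustration only.
W_hat = np.array([1.1, 1.1, 2.1, 3.1, 5.1, 8.1])
b_hat = -0.8

# Squared error for W: sum of squared componentwise differences.
W_sq_err = float(np.sum((W_hat - W_true) ** 2))
# Squared error for the scalar bias.
b_sq_err = float((b_hat - b_true) ** 2)

print(W_sq_err, b_sq_err)
```

In TensorFlow the same two quantities are built from your trained W and b variables and evaluated once, after training has finished.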
5. For ease of grading, please make the variables from the above problems available in a dictionary called results_logistic. The dictionary should have keys 'W', 'Wsqerr', 'b', 'bsqerr', 'log_lik_test', with respective values sess.run(x) where x ranges over the corresponding quantities. For example, if my squared error for W is stored in a TF variable called W_squared_error, then the key 'Wsqerr' should have value sess.run(W_squared_error).
6. Classification of normally distributed data. The .zip file at http://www-personal.umich.edu/~klevin/teaching/Winter2019/STATS507/HW10_normal.zip contains four Numpy .npy files that contain train and test data generated from K = 3 different classes. Each class $k \in \{1, 2, 3\}$ has an associated mean $\mu_k \in \mathbb{R}$ and variance $\sigma_k^2 \in \mathbb{R}$, and all observations from a given class are i.i.d. $N(\mu_k, \sigma_k^2)$. The four files are:
• normal_xtest.npy : contains a 500-vector whose entries are the independent variables (predictors) from the test set.
• normal_xtrain.npy : contains a 2000-vector whose entries are the independent variables (predictors) from the train set.
• normal_ytest.npy : contains a 500-by-3 dimensional matrix whose rows are one-hot encodings of the class labels for the test set.
• normal_ytrain.npy : contains a 2000-by-3 dimensional matrix whose rows are one-hot encodings of the class labels for the train set.
The i-th entry of the vector in normal_xtrain.npy is the observed random variable from the class whose label is given by the i-th row of the matrix in normal_ytrain.npy, and analogously for the two test set files. Please include these files in your submission so that we can run your code without downloading them again.
Load the training data and use it to obtain estimates of the vector of class means $\mu = (\mu_0, \mu_1, \mu_2)$ and variances $\sigma^2 = (\sigma_0^2, \sigma_1^2, \sigma_2^2)$ by minimizing the cross-entropy between the estimated normals and the one-hot encodings of the class labels (as we did in our softmax regression example in class). Please name the corresponding variables mu and sigma2. This time, instead of using gradient descent, use Adagrad, supplied by TensorFlow as the function tf.train.AdagradOptimizer. Adagrad is a stochastic gradient descent algorithm, popular in machine learning. You can call this just like the gradient descent optimizer we used in class; just supply a learning rate. Documentation for the TF implementation of Adagrad can be found here: https://www.tensorflow.org/api_docs/python/tf/train/AdagradOptimizer. See https://en.wikipedia.org/wiki/Stochastic_gradient_descent for more information about stochastic gradient descent and the Adagrad algorithm.
Note: you’ll no longer be able to use the built-in logit cross-entropy that we used for training our models in lecture. Your cross-entropy for one observation should now look something like $-\sum_k y'_k \log p_k$, where $y'$ is the one-hot encoded vector and $p$ is the vector whose k-th entry is the (estimated) probability of the observation under the k-th class. Another note: do not include any estimation of the mixing coefficients (i.e., the class priors) in your model. You only need to estimate three means and three variances, because we are building a discriminative model in this problem.
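One way to read the cross-entropy above is sketched below in NumPy. The key assumption, which the problem statement does not spell out, is that $p_k$ is the class-k normal density at the observation, normalized so that the entries of $p$ sum to one across the three classes (no class priors). The function names are hypothetical, and in your solution the same arithmetic would be written with TensorFlow ops.

```python
import numpy as np

def class_log_probs(x, mu, sigma2):
    """Log of p for each observation: normalized normal densities.

    x: (n,) observations, mu: (K,) means, sigma2: (K,) variances.
    Returns an (n, K) array whose rows exponentiate to sum to 1.
    """
    diff = x[:, None] - mu[None, :]
    log_dens = -0.5 * np.log(2.0 * np.pi * sigma2)[None, :] \
               - 0.5 * diff ** 2 / sigma2[None, :]
    # Log-sum-exp over classes, shifted by the row maximum for stability.
    row_max = log_dens.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(
        np.sum(np.exp(log_dens - row_max), axis=1, keepdims=True))
    return log_dens - log_norm

def cross_entropy(x, y_onehot, mu, sigma2):
    # Sum over observations of -sum_k y'_k log p_k.
    return float(-np.sum(y_onehot * class_log_probs(x, mu, sigma2)))

# Identical classes give p_k = 1/3 for every k, so the loss is n * log(3).
x_demo = np.array([0.0, 1.0, -2.0])
y_demo = np.eye(3)
print(cross_entropy(x_demo, y_demo, np.zeros(3), np.ones(3)))
```

Working with log-densities rather than densities avoids taking the logarithm of numbers that underflow to zero when an observation is far from a class mean.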
7. Evaluating loss on test data. Load the test data. What is the cross-entropy of your model on this test data? That is, what is the cross-entropy when you use your estimated parameters with the previously unseen test data?
8. Evaluating parameter estimation on test data. The true parameter values for the three classes were $\mu_0 = -1, \sigma_0^2 = 0.5$; $\mu_1 = 0, \sigma_1^2 = 1$; $\mu_2 = 3, \sigma_2^2 = 1.5$.
Write a TensorFlow expression to compute the total squared error (i.e., summed over the six parameters) between your estimates and their true values. What is the squared error? Note: you need only evaluate the error of your final estimates, not at every step.
9. Evaluating classification error on test data. Write and evaluate a TensorFlow expression that computes the classification error of your estimated model averaged over the test data.
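A minimal NumPy sketch of the error computation follows, assuming that the predicted class is the one with the largest estimated probability and that classification error means the fraction of misclassified observations. The scores and labels are invented; in TensorFlow the analogous building blocks are tf.argmax, tf.not_equal, and tf.reduce_mean.

```python
import numpy as np

def classification_error(scores, y_onehot):
    """Fraction of rows whose highest-scoring class differs from the label."""
    predicted = np.argmax(scores, axis=1)
    actual = np.argmax(y_onehot, axis=1)
    return float(np.mean(predicted != actual))

# Made-up class probabilities for four observations and three classes.
scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4],
                   [0.6, 0.3, 0.1]])
# One-hot labels for classes 0, 1, 2, 1.
labels = np.eye(3)[[0, 1, 2, 1]]

print(classification_error(scores, labels))  # prints 0.25
```

Only the last observation is misclassified (predicted class 0, true class 1), which gives one error out of four.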
10. Again, for ease of grading, define a dictionary called results_class, with keys 'mu', 'sigma2', 'crossent_test', 'class_error', with values corresponding to the evaluation (again using sess.run) of your estimate of µ, σ2, the cross-entropy on the test set, and the classification error from the previous problem.
3 Building a Complicated Model (1 point)
Note: it will not be enough to simply copy the tutorial’s python code into your jupyter notebook, since the demo code supplied in the tutorials is meant to be run from the command line.
Another note: If it was not clear, you are, for this problem and this problem only, permitted to copy-paste code from the TensorFlow tutorials as much as you like without penalty.
One more note: Please make sure that in both tutorial.ipynb and your main submission notebook uniqname.hw10.ipynb you do not set any training times to be excessively long. You are free to set the number of training steps as you like for running on your own machine, but please set these parameters to something more reasonable in your submission so that we do not need to wait too long when running your notebook. Aim to set the number of training steps so that we can run each of your submitted notebooks in less than a minute.
4 Running Models on Google Cloud Platform (9 points)
In this problem, you’ll get a bit of experience running TensorFlow jobs on Google Cloud Platform (GCP), Google’s cloud computing service. Google has provided us with a grant, which will provide each of you with free compute time on GCP.
Good luck, and have fun!
The first thing you should do is claim your share of the grant money by visiting this link: https://google.secure.force.com/GCPEDU?cid=VYZbhLIwytS0UVxuWxYyRYgNVxPMOf37oBx0hRmx
Once you have claimed your credits, you should create a project, which will serve as a repository for your work on this problem. You should name your project uniqname-stats507w19, where uniqname is your unique name in all lower-case letters. Your project’s billing should be automatically linked to your credits, but you can verify this fact in the billing section dashboard in the GCP browser console. Please add both me (UMID klevin) and your GSI Roger Fan (UMID rogerfan) as owners. You can do this in the IAM tab of the IAM & admin dashboard by clicking “Add” near the top of the page, and listing our UMich emails and specifying our Roles as Project → Owner.
Note: this problem is comparatively complicated, and involves a lot of moving parts. At the end of this problem (several pages below), I have included a list of all the files that should be included in your submission for this problem, as well as a list of what should be on your GCP project upon submission.
2. Let us return to the classifier that you trained above on the normally distributed data. In this and the next several subproblems, we will take an adaptation of that model and upload it to GCP where it will serve as a prediction node similar to the one you built in the tutorial above. Train the same classifier on the same training data, but this time, save the resulting trained model in a directory called normal_trained. You’ll want to use the tf.saved_model.simple_save function. Refer to the GCP documentation at https://cloud.google.com/ml-engine/docs/deploying-models, and the documentation on the tf.saved_model.simple_save function, here: https://www.tensorflow.org/programmers_guide/saved_model#save_and_restore_models. Please include a copy of this model directory in your submission. Hint: a stumbling block in this problem is figuring out what to supply as the inputs and outputs arguments to the simple_save function. Your arguments should look something like inputs = {'x':x}, outputs = {'prediction':prediction}.
3. Let’s upload that model to GCP. First, we need somewhere to put your model. You already set up a bucket in the tutorial, but let’s build a separate one. Create a new bucket called uniqname-stats507w19-hw10-normal, where uniqname is your uniqname. You should be able to do this by making minor changes to the commands you ran in the tutorial, or by following the instructions at https://cloud.google.com/solutions/running-distributed-tensorflow-on-compute-engine#creating_a_cloud_storage_bucket. Now, we need to upload your saved model to this bucket. There are several ways to do this, but the easiest is to follow the instructions at https://cloud.google.com/storage/docs/uploading-objects and upload your model through the GUI. Optional challenge (worth no extra points, just bragging rights): Instead of using the GUI, download and install the Google Cloud SDK, available at https://cloud.google.com/sdk/ and use the gsutil command line tool to upload your model to a storage bucket.
4. Now we need to create a version of your model. Versions are how the GCP machine learning tools organize different instances of the same model (e.g., the same model trained on two different data sets). To do this, follow the instructions located at https://cloud.google.com/ml-engine/docs/deploying-models#creating_a_model_version, which will ask you to
• Upload a SavedModel directory (which you just did)
• Create a Cloud ML Engine model resource
• Create a Cloud ML Engine version resource (this specifies where your model is stored, among other information)
• Enable the appropriate permissions on your account.
--runtime-version 1.6 to your gcloud ml-engine versions create command, and making sure that you are running TensorFlow version 1.6 on your local machine (i.e., the machine where you’re running Jupyter). Running version 1.7 locally while running 1.6 on GCP also seems to work fine.
5. Create a .json file corresponding to a single prediction instance on the input observation x = 4. Name this .json file instance.hw10.json, and please include a copy of it in your submission. Hint: you will likely find it easiest to use nano/vim/emacs to edit the .json file from the tutorial (GCP Cloud Shell has versions of all three of these editors). Doing this will allow you to edit a copy of the .json file directly in the GCP shell instead of going through the trouble of repeatedly downloading and uploading files. Being proficient with a shell-based text editor is also, generally speaking, a good skill for a data scientist to have.
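If you would rather generate the file than type it, a sketch along these lines may help. It assumes that the instance is a single-line JSON object whose key matches the input name you passed to simple_save (here 'x'); whether the value should be a bare number or a nested list depends on the shape of your placeholder, so treat the exact layout as something to verify against your own model. The file is written to a temporary directory so the demonstration leaves nothing behind.

```python
import json
import os
import tempfile

# Assumption: the model was saved with inputs={'x': x}, so the key is "x".
# Depending on the placeholder's shape, the value may need to be [4.0].
instance = {"x": 4.0}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "instance.hw10.json")
    # Write one instance per line (newline-delimited JSON).
    with open(path, "w") as f:
        f.write(json.dumps(instance) + "\n")
    # Read the line back to confirm that it parses as valid JSON.
    with open(path) as f:
        round_trip = json.loads(f.readline())

print(round_trip)
```

Parsing the file back before submitting a prediction request catches quoting mistakes, which are easy to make when editing JSON by hand.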
6. Okay, it’s time to make a prediction. Follow the instructions at https://cloud.google.com/ml-engine/docs/online-predict#requesting_predictions to submit the observation in your .json file to your running model. Your model will make a prediction, and print the output of the model to the screen. Please include a copy-paste of the command you ran to request this prediction as well as the resulting output. Which cluster does your model think x = 4 came from? Hint: if you are getting errors about dimensions being wrong, make sure that your instance has the correct dimension expected by your model. Second hint: if you are encountering an error along the lines of Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="NodeDef mentions attr 'output_type', this is an indication that there is a mismatch between the version of TensorFlow that you used to create your model and the one that you are running on GCP. See the discussion of gcloud ml-engine versions create above.
That’s all of it! Great work! Here is a list of all files that should be included for this problem in your submission, as well as a list of what processes or resources should be left running in your GCP project:
• You should leave the datalab notebook and its supporting resources (i.e., the prediction node and storage bucket) from the GCP ML tutorial running in your GCP project.
• Include in your submission a copy of the saved model directory constructed from your classifier. You should also have a copy of this directory in a storage bucket on GCP.
• Leave a storage bucket running on GCP containing your uploaded model directory. This storage bucket should contain a model with a single version.
• Include in your submission a .json file representing a single observation. You need not include a copy of this file in a storage bucket on GCP; it will be stored by default in your GCP home directory if you created it in a text editor in the GCP shell.
• Include in your jupyter notebook a copy-paste of the command you ran to request your model’s prediction on the .json file, and please include the output that was printed to the screen in response to that prediction request. Note: Please make sure that the cell(s) that you copy-paste into is/are set to be Raw NBconvert cell(s), so that your commands display as code but are not run as code by Jupyter.



