              | Model selection                              | Performance
              | Best param log2 λ | Mean of MSE | Std of MSE | MSE on train | MSE on test
Least squares | –                 | –           | –          |              |
              | w = (show your estimated w)
              | l1(w) =             l2(w) =             Spars =
LASSO         |                   |             |            |              |
              | w = (show your estimated w)
              | l1(w) =             l2(w) =             Spars =
Ridge         |                   |             |            |              |
              | w = (show your estimated w)
              | l1(w) =             l2(w) =             Spars =
Caption for statistics in the table:
• Best param λ: the regularization coefficient you choose using cross validation.
• Mean of MSE: the averaged MSE of the 5-fold cross validation process for your chosen λ.
• Std of MSE: the standard deviation of MSE of the 5-fold cross validation process for your chosen λ.
• l1(w): l1 norm of w
• l2(w): l2 norm of w
• Spars: Sparsity, i.e., the number of zeros in the augmented weight vector
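The statistics listed in the caption can be computed along the following lines. This is only a minimal sketch using ridge regression in closed form: the synthetic data, the log2 λ grid from -10 to 10, the helper names, the choice to penalize the bias weight w0, and the use of the population standard deviation are all assumptions made for illustration, not part of the assignment.

```python
import numpy as np

def augment(X):
    """Prepend a column of ones so that w[0] plays the role of the bias w0."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def ridge_fit(Xa, y, lam):
    """Closed-form ridge solution on augmented features.
    Assumption: the bias weight is penalized like the others."""
    d = Xa.shape[1]
    return np.linalg.solve(Xa.T @ Xa + lam * np.eye(d), Xa.T @ y)

def mse(Xa, y, w):
    return float(np.mean((Xa @ w - y) ** 2))

def cv_mse(Xa, y, lam, k=5, seed=0):
    """Return (mean, std) of the validation MSE over k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        val = folds[i]
        tr = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(Xa[tr], y[tr], lam)
        scores.append(mse(Xa[val], y[val], w))
    return float(np.mean(scores)), float(np.std(scores))

def weight_stats(w, tol=1e-8):
    """l1 norm, l2 norm, and sparsity (entries with |w_i| < tol count as zero)."""
    l1 = float(np.sum(np.abs(w)))
    l2 = float(np.sqrt(np.sum(w ** 2)))
    spars = int(np.sum(np.abs(w) < tol))
    return l1, l2, spars

# Synthetic stand-in data (hypothetical; replace with the provided datasets).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.1 * rng.normal(size=50)
Xa = augment(X)

# Model selection: pick the log2(lambda) with the lowest mean CV MSE.
results = {p: cv_mse(Xa, y, 2.0 ** p) for p in range(-10, 11)}
best_p = min(results, key=lambda p: results[p][0])

# Refit on the full training set with the chosen lambda and report the statistics.
w_hat = ridge_fit(Xa, y, 2.0 ** best_p)
l1, l2, spars = weight_stats(w_hat)
print("best log2(lambda):", best_p)
print("CV mean/std of MSE:", results[best_p])
print("train MSE:", mse(Xa, y, w_hat))
print("l1, l2, sparsity:", l1, l2, spars)
```

The same loop would apply to LASSO by swapping `ridge_fit` for an l1 solver; only the fitting function changes, not the cross-validation bookkeeping.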
(b) Bob learned that l1 regularization could lead to more sparsity, and he really wants to visualize this. So he collects another bunch of datasets for 2-dimensional (before augmentation) features:
Dataset  | Number of training samples (N_tr) | Number of testing samples
Dataset4 | 10                                | 1000
Dataset5 | 30                                | 1000
Dataset6 | 100                               | 1000
Dataset7 | 10                                | 1000
Dataset8 | 30                                | 1000
Dataset9 | 100                               | 1000
He tries them out and finds that the last three datasets (7, 8, 9) are "special cases" where the l1 norm might not provide the intended result.
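As a rough illustration of the sparsity claim, the sketch below fits LASSO by coordinate descent with soft-thresholding and compares it with closed-form ridge on synthetic data in which the second feature is irrelevant. Everything here is an assumption for illustration: the objective (1/(2N))‖Xw − y‖² + λ‖w‖_l1 (the assignment may scale it differently), the penalized bias weight, the value λ = 0.5, and the synthetic data.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(Xa, y, lam, n_iter=200):
    """Coordinate descent for (1/(2N))||Xa w - y||^2 + lam * ||w||_1.
    Assumption: every augmented weight, including the bias, is penalized."""
    n, d = Xa.shape
    w = np.zeros(d)
    col_sq = np.sum(Xa ** 2, axis=0) / n
    for _ in range(n_iter):
        for j in range(d):
            # Residual with feature j's current contribution removed.
            r = y - Xa @ w + Xa[:, j] * w[j]
            rho = Xa[:, j] @ r / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w

def ridge_closed_form(Xa, y, lam):
    """Minimizer of (1/(2N))||Xa w - y||^2 + (lam/2) * ||w||_2^2."""
    n, d = Xa.shape
    return np.linalg.solve(Xa.T @ Xa / n + lam * np.eye(d), Xa.T @ y / n)

# Hypothetical data: y depends on the first feature only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)
Xa = np.hstack([np.ones((200, 1)), X])

w_lasso = lasso_cd(Xa, y, lam=0.5)
w_ridge = ridge_closed_form(Xa, y, lam=0.5)
print("LASSO w:", w_lasso, "zeros:", int(np.sum(w_lasso == 0)))
print("Ridge w:", w_ridge, "zeros:", int(np.sum(w_ridge == 0)))
```

The soft-threshold step is what lets a LASSO coordinate land exactly on zero, whereas the ridge solution only shrinks weights toward zero.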
i. Repeat (a)(i) for all new datasets. (You'll have 6 tables.)
ii. For each dataset, draw the following plot in the 2D space w2 vs. w1 with w0 = your estimated w0: (1) draw the curves "MSE = training_MSE of your estimated w" and "MSE = 10 + training_MSE of your estimated w"; (2) draw the curve for ‖w‖_l1 = the l1 norm of your estimated w. Repeat this plot for the ridge regression results, except that in (2) you draw the curve for ‖w‖_l2 = the l2 norm of your estimated w. (You therefore have 2 plots for each dataset. An example is shown below.)
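One possible way to produce such a plot is to evaluate the training MSE and the norm on a grid of (w1, w2) values with w0 held fixed, then draw the requested level sets as contours. The sketch below is not the example code from the homework folder; the function names, the synthetic data, the stand-in estimate `w_hat`, and the choice to include |w0| in the norm are all assumptions. Plotting relies on matplotlib, which is imported only inside `plot_level_sets`.

```python
import numpy as np

def mse_grid(Xa, y, w0, w1_vals, w2_vals):
    """Training MSE at every (w1, w2) grid point, with the bias fixed at w0."""
    W1, W2 = np.meshgrid(w1_vals, w2_vals)
    # Broadcast to shape (n_samples, len(w2_vals), len(w1_vals)).
    pred = (w0 * Xa[:, 0, None, None]
            + Xa[:, 1, None, None] * W1
            + Xa[:, 2, None, None] * W2)
    M = np.mean((pred - y[:, None, None]) ** 2, axis=0)
    return W1, W2, M

def norm_grid(w0, W1, W2, kind="l1"):
    """Norm of the augmented vector (w0, w1, w2) on the grid (assumes w0 is included)."""
    if kind == "l1":
        return np.abs(w0) + np.abs(W1) + np.abs(W2)
    return np.sqrt(w0 ** 2 + W1 ** 2 + W2 ** 2)

def plot_level_sets(W1, W2, M, w_hat, train_mse, kind="l1"):
    """Draw the two MSE curves and the norm curve through the estimate w_hat."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.contour(W1, W2, M, levels=[train_mse, train_mse + 10], colors=["C0", "C1"])
    N = norm_grid(w_hat[0], W1, W2, kind)
    level = np.sum(np.abs(w_hat)) if kind == "l1" else np.sqrt(np.sum(w_hat ** 2))
    ax.contour(W1, W2, N, levels=[level], colors="k")
    ax.plot(w_hat[1], w_hat[2], "r*")  # the estimated (w1, w2)
    ax.set_xlabel("w1")
    ax.set_ylabel("w2")
    ax.set_aspect("equal")
    return fig

# Hypothetical data and a stand-in estimate; replace with your own results.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 0.5 + 2.0 * X[:, 0] + 0.1 * rng.normal(size=30)
Xa = np.hstack([np.ones((30, 1)), X])
w_hat = np.array([0.5, 1.5, 0.0])
train_mse = float(np.mean((Xa @ w_hat - y) ** 2))

w1_vals = np.linspace(-6.0, 6.0, 121)
w2_vals = np.linspace(-6.0, 6.0, 101)
W1, W2, M = mse_grid(Xa, y, w_hat[0], w1_vals, w2_vals)
# fig = plot_level_sets(W1, W2, M, w_hat, train_mse, kind="l1")
```

Calling `plot_level_sets` once with `kind="l1"` for the LASSO estimate and once with `kind="l2"` for the ridge estimate gives the two plots per dataset. If the estimate happens to minimize the MSE exactly in (w1, w2), the first MSE level set collapses to a point and may not be drawn.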
iii. Based on the statistics and plots, answer the following questions:
1. Observe and explain how the plots relate to sparsity.
2. How much effect does the regularizer have, judging from the plots (i.e., how different is the regularized performance (MSE) from the unregularized performance)?
3. Observe and explain how Lasso has a different effect on the "special case" datasets than on the other datasets.
Hint: please refer to the example code file in the homework folder on how to generate such plots.



