Deep Learning – Each combination of hyper-parameters will specify how to set each of the following:
num_epochs: the number of iterations through the training section of the dataset [a positive integer]
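For intuition, the number of parameter updates this implies is num_epochs × (dataset size / batch size); the figures below (55,000 training images, batch size 100) are illustrative assumptions, not values fixed by the assignment:

```python
# Hypothetical values for illustration: the MNIST train split and a batch size of 100.
dataset_size = 55000
batch_size = 100

# One epoch corresponds to this many gradient updates.
updates_per_epoch = dataset_size // batch_size

def total_updates(num_epochs):
    """Total number of gradient updates performed over `num_epochs` epochs."""
    return num_epochs * updates_per_epoch

print(updates_per_epoch)   # 550 updates per epoch
print(total_updates(15))   # 8250 updates for num_epochs=15
```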

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Train.
i, train_accuracy, test_accuracy = 0, [], []
log_period_updates = int(log_period_samples / batch_size)
with tf.train.MonitoredSession() as sess:
  while mnist.train.epochs_completed < num_epochs:

    # Update.
    i += 1
    batch_xs, batch_ys = mnist.train.next_batch(batch_size)

    # Training step.
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    # Periodically evaluate.
    if i % log_period_updates == 0:

      # Compute and store train accuracy on 20% of the training data.
      a = 0.2
      ex = eval_mnist.train.images
      ey = eval_mnist.train.labels
      size = int(ey.shape[0] * a)
      part_ex = ex[0:size, :]
      part_ey = ey[0:size, :]
      train = sess.run(accuracy, feed_dict={x: part_ex, y_: part_ey})
      print("%dth iter train accuracy %f" % (i, train))
      train_accuracy.append(train)

      # Compute and store test accuracy.
      test = sess.run(accuracy, feed_dict={x: eval_mnist.test.images,
                                           y_: eval_mnist.test.labels})
      print("%dth iter test accuracy %f" % (i, test))
      test_accuracy.append(test)

# Save results in a list.
experiments_task1.append(
    ((num_epochs, learning_rate), train_accuracy, test_accuracy))
Model 2 (20 pts)
1 hidden layer (32 units) with a ReLU non-linearity, followed by a softmax.
(input → non-linear layer → linear layer → softmax → class probabilities)
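As a shape check, the forward pass of this architecture (ReLU layer → linear layer → softmax) can be sketched in plain NumPy; the weights below are random placeholders standing in for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Placeholder parameters matching the model's shapes: 784 -> 32 -> 10.
w1, b1 = rng.normal(size=(784, 32)) * 0.05, np.zeros(32)
w2, b2 = rng.normal(size=(32, 10)) * 0.05, np.zeros(10)

x = rng.normal(size=(4, 784))   # a batch of 4 flattened 28x28 images
h = relu(x @ w1 + b1)           # non-linear layer
probs = softmax(h @ w2 + b2)    # linear layer + softmax

print(probs.shape)              # (4, 10)
print(probs.sum(axis=1))        # each row sums to 1
```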
Hyper-parameters
Train the model with three different hyper-parameter settings:
num_epochs=15, learning_rate=0.0001
num_epochs=15, learning_rate=0.005
num_epochs=15, learning_rate=0.1
# CAREFUL: Running this cell resets experiments_task2, where results should be stored.
# Store results of runs with different configurations as tuples
# ((num_epochs, learning_rate), train_accuracy, test_accuracy).
experiments_task2 = []
settings = [(15, 0.0001), (15, 0.005), (15, 0.1)]
print('Training Model 2')

# Train Model 2 with the different hyper-parameter settings.
for (num_epochs, learning_rate) in settings:

  # Reset graph, recreate placeholders and dataset.
  tf.reset_default_graph()
  x, y_ = get_placeholders()
  mnist = get_data()       # use for training.
  eval_mnist = get_data()  # use for evaluation.

  # Define model, loss, update and evaluation metric.
  initializer = tf.contrib.layers.xavier_initializer()

  # Non-linear layer.
  w_1 = tf.Variable(initializer([784, 32]))
  b_1 = tf.Variable(initializer([32]))
  h_1 = tf.nn.relu(tf.matmul(x, w_1) + b_1)

  # Linear layer.
  w_2 = tf.Variable(initializer([32, 10]))
  b_2 = tf.Variable(initializer([10]))
  logits = tf.matmul(h_1, w_2) + b_2
  y = tf.nn.softmax(logits)
  loss = tf.reduce_sum(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
  train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

  # Evaluation.
  correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

  # Train.
  i, train_accuracy, test_accuracy = 0, [], []
  log_period_updates = int(log_period_samples / batch_size)
  with tf.train.MonitoredSession() as sess:
    while mnist.train.epochs_completed < num_epochs:

      # Update.
      i += 1
      batch_xs, batch_ys = mnist.train.next_batch(batch_size)

      # Training step.
      sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

      # Periodically evaluate.
      if i % log_period_updates == 0:

        # Compute and store train accuracy on 20% of the training data.
        a = 0.2
        ex = eval_mnist.train.images
        ey = eval_mnist.train.labels
        size = int(ey.shape[0] * a)
        part_ex = ex[0:size, :]
        part_ey = ey[0:size, :]
        train = sess.run(accuracy, feed_dict={x: part_ex, y_: part_ey})
        print("%dth iter train accuracy %f" % (i, train))
        train_accuracy.append(train)

        # Compute and store test accuracy.
        test = sess.run(accuracy, feed_dict={x: eval_mnist.test.images,
                                             y_: eval_mnist.test.labels})
        print("%dth iter test accuracy %f" % (i, test))
        test_accuracy.append(test)

  # Save results in a list.
  experiments_task2.append(
      ((num_epochs, learning_rate), train_accuracy, test_accuracy))
Model 3 (20 pts)
2 hidden layers (32 units each) with ReLU non-linearities, followed by a softmax.
(input → non-linear layer → non-linear layer → linear layer → softmax → class probabilities)
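As a quick sanity check on model size, the parameter count of this architecture (weights plus biases per fully connected layer, for 784 inputs, two 32-unit hidden layers, and 10 outputs) can be tallied directly:

```python
def dense_params(n_in, n_out):
    """Number of parameters in a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

total = dense_params(784, 32) + dense_params(32, 32) + dense_params(32, 10)
print(dense_params(784, 32))  # 25120 (first hidden layer)
print(dense_params(32, 32))   # 1056  (second hidden layer)
print(dense_params(32, 10))   # 330   (output layer)
print(total)                  # 26506 parameters in total
```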
Hyper-parameters
Train the model with three different hyper-parameter settings:
num_epochs=5, learning_rate=0.003
num_epochs=40, learning_rate=0.003
num_epochs=40, learning_rate=0.05
# CAREFUL: Running this cell resets experiments_task3, where results should be stored.
# Store results of runs with different configurations as tuples
# ((num_epochs, learning_rate), train_accuracy, test_accuracy).
experiments_task3 = []
settings = [(5, 0.003), (40, 0.003), (40, 0.05)]
print('Training Model 3')

# Train Model 3 with the different hyper-parameter settings.
for (num_epochs, learning_rate) in settings:

  # Reset graph, recreate placeholders and dataset.
  tf.reset_default_graph()
  x, y_ = get_placeholders()
  mnist = get_data()       # use for training.
  eval_mnist = get_data()  # use for evaluation.

  # Define model, loss, update and evaluation metric.
  initializer = tf.contrib.layers.xavier_initializer()

  # Non-linear layer 1.
  w_1 = tf.Variable(initializer([784, 32]))
  b_1 = tf.Variable(initializer([32]))
  h_1 = tf.nn.relu(tf.matmul(x, w_1) + b_1)

  # Non-linear layer 2.
  w_2 = tf.Variable(initializer([32, 32]))
  b_2 = tf.Variable(initializer([32]))
  h_2 = tf.nn.relu(tf.matmul(h_1, w_2) + b_2)

  # Linear layer.
  w_3 = tf.Variable(initializer([32, 10]))
  b_3 = tf.Variable(initializer([10]))
  logits = tf.matmul(h_2, w_3) + b_3
  y = tf.nn.softmax(logits)
  loss = tf.reduce_sum(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
  train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

  # Evaluation.
  correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

  # Train.
  i, train_accuracy, test_accuracy = 0, [], []
  log_period_updates = int(log_period_samples / batch_size)
  with tf.train.MonitoredSession() as sess:
    while mnist.train.epochs_completed < num_epochs:

      # Update.
      i += 1
      batch_xs, batch_ys = mnist.train.next_batch(batch_size)

      # Training step.
      sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

      # Periodically evaluate.
      if i % log_period_updates == 0:

        # Compute and store train accuracy on 20% of the training data.
        a = 0.2
        ex = eval_mnist.train.images
        ey = eval_mnist.train.labels
        size = int(ey.shape[0] * a)
        part_ex = ex[0:size, :]
        part_ey = ey[0:size, :]
        train = sess.run(accuracy, feed_dict={x: part_ex, y_: part_ey})
        print("%dth iter train accuracy %f" % (i, train))
        train_accuracy.append(train)

        # Compute and store test accuracy.
        test = sess.run(accuracy, feed_dict={x: eval_mnist.test.images,
                                             y_: eval_mnist.test.labels})
        print("%dth iter test accuracy %f" % (i, test))
        test_accuracy.append(test)

  # Save results in a list.
  experiments_task3.append(
      ((num_epochs, learning_rate), train_accuracy, test_accuracy))
Model 4 (20 pts)
Model
3-layer convolutional model (2 convolutional layers, each followed by max pooling) + 1 non-linear layer (32 units), followed by a softmax.
(input (28×28) → conv (3×3×8) + maxpool (2×2) → conv (3×3×8) + maxpool (2×2) → flatten → non-linear layer → linear layer → softmax → class probabilities)
Use padding='SAME' for both the convolution and the max pooling layers.
Employ plain convolution (stride 1), and for the max pooling operations use 2×2 sliding windows with no overlapping pixels (note: each pooling operation down-samples the input image by a factor of 2 in each spatial dimension).
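The shape bookkeeping implied by these choices can be verified directly: a SAME-padded, stride-1 convolution preserves spatial size, and each SAME-padded 2×2 max pool with stride 2 halves it (rounding up), so 28 → 14 → 7 and the flattened feature vector has 7·7·8 = 392 entries:

```python
import math

def same_conv_size(n, stride=1):
    """Output spatial size of a SAME-padded convolution."""
    return math.ceil(n / stride)

def same_pool_size(n, window=2):
    """Output spatial size of a SAME-padded max pool whose stride equals its window."""
    return math.ceil(n / window)

size1 = same_pool_size(same_conv_size(28))     # after conv1 + pool1
size2 = same_pool_size(same_conv_size(size1))  # after conv2 + pool2
flat_len = size2 * size2 * 8                   # flattened length with 8 channels
print(size1)     # 14
print(size2)     # 7
print(flat_len)  # 392
```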
Hyper-parameters
Train the model with three different hyper-parameter settings:
num_epochs=5, learning_rate=0.01
num_epochs=10, learning_rate=0.001
num_epochs=20, learning_rate=0.001
# CAREFUL: Running this cell resets experiments_task4, where results should be stored.
# Store results of runs with different configurations as tuples
# ((num_epochs, learning_rate), train_accuracy, test_accuracy).
experiments_task4 = []
settings = [(5, 0.01), (10, 0.001), (20, 0.001)]
print('Training Model 4')

# Train Model 4 with the different hyper-parameter settings.
for (num_epochs, learning_rate) in settings:

  # Reset graph, recreate placeholders and dataset.
  tf.reset_default_graph()
  x, y_ = get_placeholders()
  x_image = tf.reshape(x, [-1, 28, 28, 1])
  mnist = get_data()       # use for training.
  eval_mnist = get_data()  # use for evaluation.

  # Define model, loss, update and evaluation metric.
  initializer = tf.contrib.layers.xavier_initializer()

  # Conv layer 1.
  w_conv1 = tf.Variable(initializer([3, 3, 1, 8]))
  b_conv1 = tf.Variable(initializer([8]))
  h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, w_conv1, strides=[1, 1, 1, 1],
                                    padding='SAME') + b_conv1)
  h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                           padding='SAME')

  # Conv layer 2.
  w_conv2 = tf.Variable(initializer([3, 3, 8, 8]))
  b_conv2 = tf.Variable(initializer([8]))
  h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, w_conv2, strides=[1, 1, 1, 1],
                                    padding='SAME') + b_conv2)
  h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                           padding='SAME')

  # Flatten.
  h_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 8])

  # Non-linear layer.
  w_n = tf.Variable(initializer([7 * 7 * 8, 32]))
  b_n = tf.Variable(initializer([32]))
  h_n = tf.nn.relu(tf.matmul(h_flat, w_n) + b_n)

  # Linear layer + softmax & loss.
  w_linear = tf.Variable(initializer([32, 10]))
  b_linear = tf.Variable(initializer([10]))
  logits = tf.matmul(h_n, w_linear) + b_linear
  y = tf.nn.softmax(logits)
  loss = tf.reduce_sum(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
  train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

  # Evaluation.
  correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

  # Train.
  i, train_accuracy, test_accuracy = 0, [], []
  log_period_updates = int(log_period_samples / batch_size)
  with tf.train.MonitoredSession() as sess:
    while mnist.train.epochs_completed < num_epochs:

      # Update.
      i += 1
      batch_xs, batch_ys = mnist.train.next_batch(batch_size)

      # Training step.
      sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

      # Periodically evaluate.
      if i % log_period_updates == 0:

        # Compute and store train accuracy on 20% of the training data.
        a = 0.2
        ex = eval_mnist.train.images
        ey = eval_mnist.train.labels
        size = int(ey.shape[0] * a)
        part_ex = ex[0:size, :]
        part_ey = ey[0:size, :]
        train = sess.run(accuracy, feed_dict={x: part_ex, y_: part_ey})
        print("%dth iter train accuracy %f" % (i, train))
        train_accuracy.append(train)

        # Compute and store test accuracy.
        test = sess.run(accuracy, feed_dict={x: eval_mnist.test.images,
                                             y_: eval_mnist.test.labels})
        print("%dth iter test accuracy %f" % (i, test))
        test_accuracy.append(test)

  # Save results in a list.
  experiments_task4.append(
      ((num_epochs, learning_rate), train_accuracy, test_accuracy))
Evaluation

Feed more training data
Extension (Ungraded)
In the previous tasks you have used plain Stochastic Gradient Descent to train the models.
There is a large literature on variants of Stochastic Gradient Descent that improve learning speed and robustness to hyper-parameters.
Here you can find the documentation for several optimizers already implemented in TensorFlow, as well as the original papers proposing these methods.
AdamOptimizer and RMSPropOptimizer are among the most commonly employed in Deep Learning.
How does replacing SGD with these optimizers affect the previous results?
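For intuition about what changes, RMSProp scales each gradient component by a running root-mean-square of past gradients, while SGD applies the raw gradient. A minimal NumPy sketch of the two update rules on a toy quadratic (the textbook RMSProp rule, not TensorFlow's exact implementation, with arbitrary learning rates):

```python
import numpy as np

def sgd_step(w, grad, lr):
    """Plain SGD: move against the raw gradient."""
    return w - lr * grad

def rmsprop_step(w, grad, ms, lr, decay=0.9, eps=1e-8):
    """One RMSProp update; `ms` is the running mean of squared gradients."""
    ms = decay * ms + (1.0 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(ms) + eps)
    return w, ms

# Minimize f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w_sgd = w_rms = np.array([5.0, -3.0])
ms = np.zeros_like(w_rms)
for _ in range(100):
    w_sgd = sgd_step(w_sgd, w_sgd, lr=0.01)
    w_rms, ms = rmsprop_step(w_rms, w_rms, ms, lr=0.01)

# Both move toward the minimum at 0; RMSProp's effective step size is
# roughly lr per iteration once `ms` tracks the gradient magnitude.
print(np.abs(w_sgd).max())
print(np.abs(w_rms).max())
```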
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
# CAREFUL: Running this cell resets experiments_task5 (the RMSPropOptimizer runs),
# where results should be stored.
# Store results of runs with different configurations as tuples
# ((num_epochs, learning_rate), train_accuracy, test_accuracy).
experiments_task5 = []
settings = [(5, 0.01), (10, 0.001), (20, 0.001)]
print('Training Model 4.2')

# Train Model 4.2 with the different hyper-parameter settings.
for (num_epochs, learning_rate) in settings:

  # Reset graph, recreate placeholders and dataset.
  tf.reset_default_graph()
  x, y_ = get_placeholders()
  x_image = tf.reshape(x, [-1, 28, 28, 1])
  mnist = get_data()       # use for training.
  eval_mnist = get_data()  # use for evaluation.

  # Define model, loss, update and evaluation metric.
  initializer = tf.contrib.layers.xavier_initializer()

  # Conv layer 1.
  w_conv1 = tf.Variable(initializer([3, 3, 1, 8]))
  b_conv1 = tf.Variable(initializer([8]))
  h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, w_conv1, strides=[1, 1, 1, 1],
                                    padding='SAME') + b_conv1)
  h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                           padding='SAME')

  # Conv layer 2.
  w_conv2 = tf.Variable(initializer([3, 3, 8, 8]))
  b_conv2 = tf.Variable(initializer([8]))
  h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, w_conv2, strides=[1, 1, 1, 1],
                                    padding='SAME') + b_conv2)
  h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                           padding='SAME')

  # Flatten.
  h_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 8])

  # Non-linear layer.
  w_n = tf.Variable(initializer([7 * 7 * 8, 32]))
  b_n = tf.Variable(initializer([32]))
  h_n = tf.nn.relu(tf.matmul(h_flat, w_n) + b_n)

  # Linear layer + softmax & loss.
  w_linear = tf.Variable(initializer([32, 10]))
  b_linear = tf.Variable(initializer([10]))
  logits = tf.matmul(h_n, w_linear) + b_linear
  y = tf.nn.softmax(logits)
  loss = tf.reduce_sum(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))

  # Use RMSProp instead of plain SGD.
  train_step = tf.train.RMSPropOptimizer(learning_rate).minimize(loss)

  # Evaluation.
  correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

  # Train.
  i, train_accuracy, test_accuracy = 0, [], []
  log_period_updates = int(log_period_samples / batch_size)
  with tf.train.MonitoredSession() as sess:
    while mnist.train.epochs_completed < num_epochs:

      # Update.
      i += 1
      batch_xs, batch_ys = mnist.train.next_batch(batch_size)
      sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

      # Periodically evaluate.
      if i % log_period_updates == 0:

        # Compute and store train accuracy on 20% of the training data.
        a = 0.2
        ex = eval_mnist.train.images
        ey = eval_mnist.train.labels
        size = int(ey.shape[0] * a)
        train = sess.run(accuracy,
                         feed_dict={x: ex[0:size, :], y_: ey[0:size, :]})
        print("%dth iter train accuracy %f" % (i, train))
        train_accuracy.append(train)

        # Compute and store test accuracy.
        test = sess.run(accuracy, feed_dict={x: eval_mnist.test.images,
                                             y_: eval_mnist.test.labels})
        print("%dth iter test accuracy %f" % (i, test))
        test_accuracy.append(test)

  # Save results in a list.
  experiments_task5.append(
      ((num_epochs, learning_rate), train_accuracy, test_accuracy))
