COL 341: Assignment 2
Notes:
• This assignment has two parts: Neural Networks and Convolutional Neural Networks.
• You are advised to use vector operations (wherever possible) for best performance.
• Include a report of at most 5 pages briefly describing what you did. Include any observations and/or plots required by the questions in the report.
• You should use Python for all your programming solutions.
• Your assignments will be auto-graded, so make sure you test your programs before submitting. We will use your code to train the model on the training data and predict on the test set.
• Input/output format, submission format and other details are included. Your programs should be modular enough to accept specified parameters.
• You should submit work of your own. You should cite the source, if you choose to use any external resource.
• You can use a total of 7 buffer days across all assignments.
• Data is available at this link
In this problem, we’ll train neural networks to classify a binary-class (Toy) dataset and a multi-class (CIFAR10) dataset.
(a) (12.5 points) Write a program to implement a general neural network architecture. Implement the back-propagation algorithm from first principles to train the network. You should train the network using Mini-Batch Gradient Descent. Your implementation should be generic enough to work with different architectures. Assume a fully connected architecture, i.e., each unit in a hidden layer is connected to every unit in the next layer. Have an option for an adaptive learning rate.
Use Binary Cross Entropy (BCE) loss as the loss function. Use sigmoid as the activation function for the units in the intermediate and output layers.
L = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]    (1)

Here i indexes samples and n is the number of samples in the batch. y_i is a single value which is 1 if the label is 1 and 0 otherwise. \hat{y}_i \in (0, 1) is the predicted probability of sample i belonging to class 1.
Use your implementation to train a neural network on the Toy training dataset with predefined parameters (corresponding to a fixed learning rate) and predict on the publicly available Toy testing dataset.
• Fixed Learning Rate
w^{(t)} = w^{(t-1)} - \eta_0 \, \nabla_w L\!\left(w; X_{b(t-1):bt},\, y_{b(t-1):bt}\right)    (2)
Here t indicates the epoch number while w denotes the model weights. b indicates the batch size while X, y denote the data and labels respectively. \eta_0 and L represent the fixed learning rate and the loss function respectively. \nabla_w L signifies the gradient of the loss function with respect to the model weights.
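The fixed-rate update in Eq. (2), together with order-preserving batching, can be sketched as follows (a minimal NumPy sketch; function names are our own):

```python
import numpy as np

def sgd_step(w, grad, eta0):
    """One fixed-learning-rate update: w(t) = w(t-1) - eta0 * grad."""
    return w - eta0 * grad

def minibatches(X, y, b):
    """Yield consecutive batches of size b without shuffling,
    matching the guideline of preserving the data order."""
    for start in range(0, len(X), b):
        yield X[start:start + b], y[start:start + b]
```

The last batch may be smaller than b when the number of samples is not a multiple of the batch size.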
(b) (12.5 points) Modify the neural network architecture built in part a) to cater for a multi-class dataset (CIFAR10). You have been provided 2 CSV data files corresponding to the train and test sets respectively. Each row in a data file corresponds to an image of size 32×32. The last entry of each row is the label associated with the image (-1 for the test set), preceded by the vector of gray-scale pixel intensities of the image. The data belongs to 10 different classes corresponding to 10 categories of images. Use Cross Entropy (CE) loss as the loss function. Use sigmoid as the activation function for the units in the intermediate layers. Use softmax as the activation function for the units in the output layer.
L = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{k} y_{ij} \log \hat{y}_{ij}    (3)

\hat{y}_{ij} = \frac{e^{z_{ij}}}{\sum_{l=1}^{k} e^{z_{il}}}    (4)

Here i indexes samples while j indexes class labels. y_i is a one-hot vector in which only the entry corresponding to the true class of sample i is non-zero. n is the number of samples in the batch. k is the number of labels. \hat{y}_{ij} \in (0, 1), with \sum_j \hat{y}_{ij} = 1 for all i, is the predicted probability of sample i belonging to class j. z_{ij} is the value input into the j-th perceptron of the softmax layer for sample i.
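The softmax activation, the CE loss, and the one-hot encoding of labels can be sketched in NumPy as follows (function names and the max-subtraction trick are our own choices; subtracting the row maximum leaves the softmax unchanged but prevents overflow):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax of the pre-activations z, shape (n, k)."""
    z = z - z.max(axis=1, keepdims=True)  # stability: softmax is shift-invariant
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ce_loss(y_onehot, y_hat, eps=1e-12):
    """Cross-entropy averaged over the batch; y_onehot and y_hat are (n, k)."""
    return -np.mean(np.sum(y_onehot * np.log(np.clip(y_hat, eps, 1.0)), axis=1))

def one_hot(labels, k=10):
    """Map integer class labels in {0, ..., k-1} to one-hot rows."""
    out = np.zeros((len(labels), k))
    out[np.arange(len(labels)), labels] = 1.0
    return out
```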
Use your implementation to train a neural network on the given CIFAR10 training dataset with predefined parameters (corresponding to an adaptive learning rate) and predict on the publicly available CIFAR10 testing dataset.
• Adaptive Learning Rate
w^{(t)} = w^{(t-1)} - \frac{\eta_0}{\sqrt{t}} \, \nabla_w L\!\left(w; X_{b(t-1):bt},\, y_{b(t-1):bt}\right)    (5)

Here t indicates the epoch number while w denotes the model weights. b indicates the batch size while X, y denote the data and labels respectively. \eta_0 and L represent the seed value of the adaptive learning rate and the loss function respectively. \nabla_w L signifies the gradient of the loss function with respect to the model weights.
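A common adaptive schedule consistent with a single seed value \eta_0 is \eta(t) = \eta_0 / \sqrt{t}; a one-line sketch under that assumption:

```python
def adaptive_eta(eta0, t):
    """Adaptive learning rate eta(t) = eta0 / sqrt(t), where eta0 is the
    seed value from param.txt and t is the epoch number (starting at 1)."""
    return eta0 / t ** 0.5
```

With this schedule the step size at epoch 4 is half the seed value.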
(c) (12.5 points) Use your implementation to train a neural network on the given CIFAR10 dataset with predefined parameters. Experiment with different types of architectures: vary the number of hidden layers, try different numbers of units in each layer, different loss and activation functions, etc. What happens as you increase the number of units or the number of hidden layers? Comment on your observations and submit your best performing architecture.
Note: Use holdout method to find the best architecture. Split the data set into two: a set of examples to train with, and a validation set. Train the model on the training set. Use the prediction on the validation set to determine which architecture to use.
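The holdout split described above can be sketched as follows (the 20% validation fraction and the function name are our own choices):

```python
import numpy as np

def holdout_split(X, y, val_frac=0.2, seed=0):
    """Randomly partition (X, y) into a training set and a validation set.
    The validation set is used only to compare candidate architectures."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_frac)
    val, tr = idx[:n_val], idx[n_val:]
    return X[tr], y[tr], X[val], y[val]
```

Note that shuffling here applies only to the model-selection experiments of parts c) and d); parts a) and b) forbid reordering the data.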
(d) (12.5 points) In the previous part, absolute pixel values are used as the features of the image. Can features other than the absolute pixel values be used? Experiment with various feature extraction techniques such as Gabor filters, DCT, FFT, HOG, wavelets, etc. to extract features from the images. You can use predefined Python libraries to create these features. How do these changes affect your accuracy compared to the previous parts?
Comment on your observations and report the details of your best performing design.
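As one illustration, the raw pixels can be replaced by the magnitude spectrum of the 2-D FFT (a sketch using only NumPy; HOG, Gabor, DCT, and wavelet features follow the same row-in, feature-vector-out pattern via libraries such as scikit-image or SciPy):

```python
import numpy as np

def fft_features(row, size=32):
    """Map one flattened size x size gray-scale image (a CSV row without
    its label) to the magnitudes of its 2-D Fourier coefficients."""
    img = np.asarray(row, dtype=float).reshape(size, size)
    return np.abs(np.fft.fft2(img)).ravel()
```

The output has the same length as the input row, so it can be swapped in for the pixel features without changing the network's input size.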
Evaluation:
• For part-a and part-b, you can get 0 (error), partial marks (the code runs fine but the weights match the expected values only within some predefined threshold), or full marks (works as expected).
• For part-c and part-d, marks will be given based on accuracy on the test dataset. There will be relative marking for this part.
• For part-c and part-d, marking will be done in two parts: code (75%) and report (25%).
Submission Instructions:
Neural Network
Submit your code in 4 executable python files called neural a.py, neural b.py, neural c.py, neural d.py
The file names should correspond to parts [a, b, c, d] of the assignment. The parameters depend on the mode:
python neural a.py trainfile.csv param.txt weightfile.txt
Here you have to write a line-aligned weightfile.txt containing the weight values of the trained model. param.txt will contain five lines of input: the first is a number in [1-2] indicating which learning-rate strategy to use; the second is the fixed learning rate (for ”1”) or the seed value for the adaptive learning rate (for ”2”); the third is the maximum number of iterations (also print the number of iterations your program takes); the fourth is the batch size for one iteration of mini-batch gradient descent; the fifth is the architecture of the neural network, a sequence of numbers denoting the number of perceptrons in consecutive hidden layers, e.g. 10 10 5 denotes 3 hidden layers with 10, 10 and 5 perceptrons respectively.
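The five-line param.txt format described above can be parsed as follows (a sketch; the function name is our own, and it takes the lines already read from the file):

```python
def parse_params(lines):
    """Parse the five lines of param.txt into typed values:
    strategy (1 = fixed, 2 = adaptive), learning rate or seed value,
    max iterations, batch size, and the hidden-layer sizes."""
    strategy = int(lines[0])
    lr_or_seed = float(lines[1])
    max_iters = int(lines[2])
    batch_size = int(lines[3])
    hidden = [int(x) for x in lines[4].split()]
    return strategy, lr_or_seed, max_iters, batch_size, hidden
```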
python neural b.py trainfile.csv param.txt weightfile.txt Same as for part a.
python neural c.py trainfile.csv testfile.csv outputfile.txt
Here you have to write the predictions (1 per line) to a line-aligned outputfile.txt. There is no param.txt input; the parameters should correspond to your best architecture and must be comprehensively detailed in the report.
python neural d.py trainfile.csv testfile.csv outputfile.txt Same as for part c.
Part (e)
Here you have to submit a PDF file with all details of the best architectures and features for parts c) and d). Report the results and observations of all variations you experimented with, irrespective of whether they led to an increase in accuracy. There will be no demos or presentations for this part.
Coding Guidelines:
(a) For parts a) and b)
• Don’t shuffle/change order of the data files.
• Don’t scale/normalize the data in any manner.
• Don’t apply early stopping criterion. Run for full specified number of iterations.
• To ensure uniformity of weight initialization, initialize all weights as zero.
• Be careful to code Mini-batch Gradient Descent to be closely consistent with the theoretical framework taught in class.
• Parameters for which public evaluation will be done are given in the data files, alongside the expected model weights after 1 iteration of training (to help with debugging). Final evaluation will be done on a different training set with different parameters.
(b) For parts c) and d)
• Design your code with an emphasis on vector operations and minimal use of loops to ensure run-time efficiency. Note that Moodle allows a maximum of 16 minutes for testing individual submissions and your code must finish executing within 10 minutes.
• Have suitable thresholds on the following to ensure the above: 1) batch size × number of iterations, and 2) the product of the numbers of perceptrons in the hidden layers.
• Using feature engineering in part c) will be seen as an attempt to subvert the assignment by gaining an unfair advantage and will most likely invite disciplinary action.
• There will be another required file called MoodleID where you simply have to fill in your Moodle ID, e.g. me2110786. This is necessary for the construction of a leaderboard where ranks based on the best scores so far can be seen.
Extra Readings (highly recommended for part (a,b)):
(a) Backpropagation for different loss functions

Convolutional Neural Networks
(a) (25 Points) Write a program to implement a convolutional neural network with the following structure:
i. CONV1: Convolutional layer with 3 RGB inputs and 64 outputs with 3 × 3 filter size.
ii. POOL1: 2 × 2 max pooling layer.
iii. CONV2: Convolutional layer with 128 outputs and 3 × 3 filter size.
iv. POOL2: 2 × 2 max pooling layer.
v. FC1: Fully connected layer with 512 outputs.
vi. FC2: Fully connected layer with 256 outputs.
vii. SM: Soft-max layer for classification with 10 outputs.
Instructions:
• Each row of the array stores a 32×32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.
• For all convolution layers use stride S = 1, padding P = 1
• All layers except for the pooling layers and the softmax layer should use ReLU as non-linearity.
• Use batch-normalization on the last layer activations (before computing the softmax) when training the network.
• You are allowed to experiment with different loss functions and optimizers but you have to document all the obtained results.
• You are requested to experiment with learning rates and different methods for parameter initialization.
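Given the channel layout and the stride/padding settings above, the row-to-image conversion and the resulting layer sizes can be checked with a short sketch (helper names are our own):

```python
import numpy as np

def row_to_image(row):
    """Reshape one 3072-entry row (1024 R values, then G, then B, each in
    row-major order) into a (32, 32, 3) image array."""
    a = np.asarray(row)
    return a.reshape(3, 32, 32).transpose(1, 2, 0)

def conv_out(n, k=3, s=1, p=1):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def pool_out(n, k=2, s=2):
    """Spatial size after max pooling with kernel k and stride s."""
    return (n - k) // s + 1

# Trace the sizes through CONV1 -> POOL1 -> CONV2 -> POOL2:
n = pool_out(conv_out(32))   # 3x3 conv with S=1, P=1 keeps 32; pooling halves to 16
n = pool_out(conv_out(n))    # 16 -> 16 -> 8
flat = n * n * 128           # number of features entering FC1
```

With S = 1 and P = 1 every convolution preserves the spatial size, so only the two pooling layers shrink the feature maps, leaving 8 × 8 × 128 = 8192 inputs to FC1.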
Report:
• Report results and observations of all variations experimented with.
• Separately describe the configuration and training details of your best performing model.
• Line plot to show training loss on the y-axis and the epoch on the x-axis.
(b) (25 Points) Experiments with CNNs: There have been many advancements in the design of CNN architectures, for example Inception-Net and ResNet. For this problem you have to experiment with different CNN architectures on the CIFAR10 dataset and report your observations. You are free to choose any CNN architecture for this problem, but you have to code it yourself.
Note: Refer to the following links for an analysis of the state of the art architectures and the available codes.
i. SOTA for CIFAR 10
ii. Papers with Code

Instructions:
• Look for multiple architectures and experiment with their codes.
• You are allowed to use all sorts of feature engineering.
• Report the results obtained for all the experiments.
• Elaborately describe the configuration and training details of your best performing model and report the following:
i. Description and novelty over other architectures.
ii. An analysis of why it works better than the rest.
iii. Hyper-parameters and configurations
You might have to read the research paper for having a clear idea of the model.
For this problem you will be evaluated not only on the accuracy obtained but also on the reported analysis of the best performing architecture and a demo.
Submission Instructions
(a) For problem 2(a)
• All the codes should be in Python 3.5 or higher.
• Please specify all the dependencies in a README file.
• You have to submit the prediction file along with your submission. We’ll run all your codes again and if your predictions do not match with the submitted prediction file, there will be a heavy penalty/disciplinary action.
• You will use Keras (with TensorFlow backend) to train the CNN.
• Similar to Q1(c), your submission should include one file called cnn a.py which should run the training process with the best set of parameters. It should be runnable using the command:
python cnn a.py trainfile.csv testfile.csv output.txt and write the predictions for test data (1 per line) in output.txt.
• The codes will not be auto-graded on Moodle; rather, you have to submit your code in the above format. We’ll grade it and notify the results later.
• You are advised to use the HPC facility for GPUs.
• Your code should complete within 1 hour.
• For any error in the code or layout of output file, a penalty will be applied depending on the issue.
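A minimal Keras configuration sketch of the fixed architecture from part (a) is given below. The optimizer, the loss form, and the exact placement of batch normalization (here after FC2, immediately before the softmax layer) are assumptions to document and tune; the assignment only fixes the layer structure itself.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the part 2(a) structure; CONV layers use S=1, P=1 ("same" padding).
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),   # CONV1
    layers.MaxPooling2D(2),                                               # POOL1
    layers.Conv2D(128, 3, strides=1, padding="same", activation="relu"),  # CONV2
    layers.MaxPooling2D(2),                                               # POOL2
    layers.Flatten(),
    layers.Dense(512, activation="relu"),                                 # FC1
    layers.Dense(256, activation="relu"),                                 # FC2
    layers.BatchNormalization(),   # batch-norm on the last activations
    layers.Dense(10, activation="softmax"),                               # SM
])
model.compile(optimizer="adam",                      # assumed; experiment freely
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```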
(b) For problem 2(b)
• All submission instructions for 2(a).
• You have to submit the prediction file along with your submission and we’ll accordingly create a leader-board on the basis of test performance. We’ll run all your codes again and if your prediction does not match with the submitted prediction file, there will be a heavy penalty/disciplinary action.
• Report the results obtained with all the architectures but elaborately explain only the one with best results.
• Submit the code used to get the best results.
• Similar to Q1(c, d), there will be another required file called MoodleID where you simply have to fill in your Moodle ID, e.g. me2110786. This is necessary for the construction of the leader-board.
Coding Guidelines Same as Q1.
Grading Scheme
• For part (b), your submission will only be considered if it leads to better results as compared to your part (a) submission.
• You’ll be awarded 33% of the total marks for getting any improvement.
• The rest of the marks will depend on accuracy, report, and demo.
Extra Readings:
• Getting started with the Keras Sequential model

• CNN in Keras
• Dropout

• ResNet
