Assignment 12
Total: 34 Points
General Instructions
• Submit uncompressed file(s) to the Dropbox folder via BeachBoard (not by email).
1. Using scikit-learn, evaluate the classification accuracy of a decision tree, bagging, AdaBoost, and a random forest.
(a) Load the breast cancer data using sklearn.datasets.load_breast_cancer.
(b) (2 points) Print out the names of the features (X) and the name of the target (y).
(c) (2 points) Allocate half of the data to Train (X_train, y_train) and the remaining half to Test (X_test, y_test).
(d) The common goal of the classifiers is to predict the target from the features.
(e) The classifiers should be trained on the Train set and tested on the Test set.
(f) Use the ‘Gini’ index as the criterion and fix the maximum depth of the trees at 2.
(g) (5 points) Write a program that builds a decision tree from X_train, y_train and predicts y_pred from X_test. You can compute the accuracy of the classifier by comparing y_pred with y_test. Print out the accuracy.
(h) (5 points) Visualize the tree using sklearn.tree.plot_tree. Each node of the tree should show the feature name.
(i) (5 points) Similarly, write a program that builds multiple decision trees using bagging. Record the prediction accuracy in bagging_score while varying the parameter n_estimators. Draw a 2D line plot with n_estimators on the X-axis and bagging_score on the Y-axis. The plot should have more than 20 data points with different X-axis values.
(j) (5 points) Similarly, write a program that builds multiple decision trees using AdaBoost. Draw a 2D line plot with n_estimators on the X-axis and boost_score on the Y-axis. The plot should have more than 20 data points with different X-axis values.
(k) (10 points) Similarly, write a program that builds multiple decision trees using a random forest. Draw a 3D surface plot with n_estimators on the X-axis, max_features on the Y-axis, and forest_score on the Z-axis. The plot should have more than 100 data points with different pairs of X-axis and Y-axis values.
(l) Submit your Assn12.ipynb, which must include all the plots.
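One possible way to set up the computations in (a) through (k) is sketched below. It is a starting point under stated assumptions, not a reference solution: the random_state values, the specific ranges of n_estimators and max_features, and the helper make_tree are illustrative choices that the assignment does not prescribe, and target_names is used here as one reading of "the name of the target". Plotting is left out; the score arrays are shaped so they can be passed to matplotlib line and surface plots.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# (a), (b): load the data and print feature and target names.
data = load_breast_cancer()
X, y = data.data, data.target
print("Features:", list(data.feature_names))
print("Target classes:", list(data.target_names))

# (c): half of the samples for training, half for testing.
# random_state=0 is an assumption, used only for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)


def make_tree():
    """Hypothetical helper: a Gini tree with maximum depth 2, as in (f)."""
    return DecisionTreeClassifier(criterion="gini", max_depth=2,
                                  random_state=0)


# (g): a single decision tree, scored by comparing y_pred with y_test.
tree = make_tree().fit(X_train, y_train)
y_pred = tree.predict(X_test)
tree_score = accuracy_score(y_test, y_pred)
print("Decision tree accuracy:", tree_score)

# (i), (j): vary n_estimators over 25 values (assumed range 1..25).
n_values = list(range(1, 26))
bagging_score = []
boost_score = []
for n in n_values:
    bag = BaggingClassifier(make_tree(), n_estimators=n, random_state=0)
    bagging_score.append(bag.fit(X_train, y_train).score(X_test, y_test))
    boost = AdaBoostClassifier(make_tree(), n_estimators=n, random_state=0)
    boost_score.append(boost.fit(X_train, y_train).score(X_test, y_test))

# (k): an 11 x 10 grid of (n_estimators, max_features) pairs, 110 in total.
forest_n = list(range(1, 12))   # assumed range of n_estimators
forest_m = list(range(1, 11))   # assumed range of max_features
forest_score = np.zeros((len(forest_m), len(forest_n)))
for i, m in enumerate(forest_m):
    for j, n in enumerate(forest_n):
        forest = RandomForestClassifier(n_estimators=n, max_features=m,
                                        criterion="gini", max_depth=2,
                                        random_state=0)
        forest_score[i, j] = forest.fit(X_train, y_train).score(X_test, y_test)

# Grids with the same shape as forest_score, ready for a 3D surface plot.
N_grid, M_grid = np.meshgrid(forest_n, forest_m)
```

For (h), sklearn.tree.plot_tree accepts a feature_names argument, so passing data.feature_names labels each node. For the plots, bagging_score and boost_score can be drawn against n_values as line plots, and N_grid, M_grid, forest_score can be given to plot_surface on a matplotlib 3D axes.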



