CS 422/622 Project 3

1 Training AdaBoost (65 points)
File name: adaboost.py
Implement a function in Python:
adaboost_train(X, Y, max_iter)
that takes sample data, sample labels, and an iteration count as inputs. The function should return variables
f and alpha. f is an array of trained decision tree stumps (trees with a max depth of 1). alpha is a 1D
array of the calculated alpha values. To represent sample weights in subsequent iterations, create a new dataset
with duplicated samples. For example, if we have a dataset with 4 samples (s1, s2, s3, s4) with weights (1/8,
1/8, 1/4, 1/2), on the next iteration we will create a new dataset with 8 samples (s1, s2, s3, s3, s4, s4,
s4, s4). Now s4 makes up half of the dataset, while s1 and s2 each contribute only 1/8 of the
whole dataset. At each iteration, train a decision tree stump, get its predictions on the training data, and
compare them to the actual labels. Use the resulting weighted error to calculate alpha and to update the
sample weights for the subsequent iterations.
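
A minimal sketch of one way to structure this function is shown below. It is not the required implementation; it assumes scikit-learn's DecisionTreeClassifier for the stumps, labels encoded as -1/+1, and the standard AdaBoost weight formula alpha_t = 0.5 * ln((1 - eps_t) / eps_t), with duplication counts chosen to match the 1/8, 1/8, 1/4, 1/2 example above.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, Y, max_iter):
    X, Y = np.asarray(X, dtype=float), np.asarray(Y)
    n = len(Y)
    weights = np.full(n, 1.0 / n)      # start with uniform weights
    f, alpha = [], []
    cur_X, cur_Y = X, Y                # dataset used in the current round

    for _ in range(max_iter):
        # Train a stump (max depth 1) on the current, possibly duplicated, data.
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(cur_X, cur_Y)
        f.append(stump)

        # The weighted error is measured on the original samples.
        miss = stump.predict(X) != Y
        eps = np.clip(weights[miss].sum(), 1e-10, 1 - 1e-10)  # guard log(0)

        # Standard AdaBoost: alpha_t = 0.5 * ln((1 - eps_t) / eps_t).
        a = 0.5 * np.log((1 - eps) / eps)
        alpha.append(a)

        # Reweight: misclassified samples up, correct ones down, then normalize.
        weights = weights * np.exp(np.where(miss, a, -a))
        weights = weights / weights.sum()

        # Emulate the new weights by duplicating samples in proportion to them,
        # as described above (weights 1/8, 1/8, 1/4, 1/2 -> counts 1, 1, 2, 4).
        # Note this can grow large when the smallest weight is tiny.
        counts = np.maximum(1, np.round(weights / weights.min()).astype(int))
        idx = np.repeat(np.arange(n), counts)
        cur_X, cur_Y = X[idx], Y[idx]

    return f, np.asarray(alpha)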
Write-Up: What is the reason for using a decision tree stump rather than a decision tree with a greater
depth? How does this differentiate AdaBoost from a random forest ensemble method?
2 Testing AdaBoost (25 points)
File name: adaboost.py
Implement a function in Python:
adaboost_test(X, Y, f, alpha)
that takes sample data, sample labels, the previously trained array f, and the previously calculated array
alpha as inputs. The function should return an accuracy value indicating the overall performance of the
AdaBoost algorithm.
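
As a companion to the training sketch above, and under the same assumptions (-1/+1 labels, scikit-learn stumps), the test function might combine the stumps with an alpha-weighted vote, predicting sign(sum_t alpha_t * f_t(x)):

import numpy as np

def adaboost_test(X, Y, f, alpha):
    X, Y = np.asarray(X, dtype=float), np.asarray(Y)
    # Ensemble prediction: sign of the alpha-weighted vote of the stumps.
    scores = np.zeros(len(Y))
    for stump, a in zip(f, alpha):
        scores += a * stump.predict(X)
    pred = np.sign(scores)
    return np.mean(pred == Y)   # fraction of samples classified correctly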
Write-Up: What would need to change to run an AdaBoost algorithm with a perceptron rather than a
decision tree?
3 Report (10 points)
Students in 622 are required to submit their report as a LaTeX-generated PDF along with the source file.
Students in 422 may submit a README.txt, or a LaTeX report with source files for extra credit.