CS 6375 Homework 4: Machine Learning

In this project, you will implement the following three algorithms and test their performance on the 10 datasets available on MS Teams. Each dataset has three files: “.ts.data” (training data), “.valid.data” (validation data), and “.test.data” (test data).
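A minimal loading sketch, assuming the files are comma-separated rows of 0/1 values; the path prefix below is a placeholder, so adjust it and the delimiter to match the files on MS Teams.

```python
import numpy as np

def load_split(prefix):
    """Load the train/validation/test splits for one dataset prefix (e.g. a placeholder name)."""
    train = np.loadtxt(prefix + ".ts.data", delimiter=",", dtype=int)
    valid = np.loadtxt(prefix + ".valid.data", delimiter=",", dtype=int)
    test = np.loadtxt(prefix + ".test.data", delimiter=",", dtype=int)
    return train, valid, test
```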
1. Tree Bayesian networks. (20 points) Use the Chow-Liu algorithm to learn the structure and parameters of the Bayesian network. Use 1-Laplace smoothing to ensure that there are no zero counts when computing the mutual information and no zero probabilities in the model. See Section 2 in [Meila and Jordan, 2001]. An implementation of the Chow-Liu tree is provided in the code base given to you; a small sketch of the smoothed mutual-information step follows this item.
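As a rough illustration of the smoothing requirement, here is a sketch of a 1-Laplace-smoothed pairwise mutual-information matrix and the maximum-spanning-tree step for binary data. The function names are my own; use the provided code base for the actual implementation.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def smoothed_mutual_info(data):
    """Pairwise mutual information with 1-Laplace smoothing; data is a (n, d) array of 0/1."""
    n, d = data.shape
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            joint = np.ones((2, 2))  # add-one smoothing: every joint count starts at 1
            for a in (0, 1):
                for b in (0, 1):
                    joint[a, b] += np.sum((data[:, i] == a) & (data[:, j] == b))
            joint /= joint.sum()
            px = joint.sum(axis=1, keepdims=True)
            py = joint.sum(axis=0, keepdims=True)
            mi[i, j] = mi[j, i] = np.sum(joint * np.log(joint / (px * py)))
    return mi

def chow_liu_edges(mi):
    """Edges of the maximum-weight spanning tree over a mutual-information matrix."""
    w = -(mi + 1e-12)         # negate: scipy computes a *minimum* spanning tree
    np.fill_diagonal(w, 0.0)  # no self-loops
    mst = minimum_spanning_tree(w).toarray()
    return list(zip(*np.nonzero(mst)))
```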
2. Mixtures of Tree Bayesian networks using EM. (40 points) The model is defined as follows. We have one latent variable taking k values, and each mixture component is a Tree Bayesian network. Thus, the distribution over the observed variables, denoted by X (the variables in the data), is given by:
P(X = x) = \sum_{i=1}^{k} p_i T_i(X = x)

where p_i is the probability of the i-th mixture component and T_i is the distribution represented by the i-th Tree Bayesian network. Write code to learn the structure and parameters of the model using the EM algorithm (in the M-step, each mixture component is learned using the Chow-Liu algorithm). Run the EM algorithm until convergence or for at most 50 iterations, whichever comes first. See Section 3 in [Meila and Jordan, 2001]. Use k ∈ {2, 5, 10, 20}. Test performance using the test set.
In the code provided, see the file MIXTURE CLT.py: you have to write two functions, “learn(..)” and “computeLL(..)”. A skeleton of both is sketched below.
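A hedged skeleton of those two functions. It assumes the provided Chow-Liu class is importable as CLT and exposes update(dataset, weights) for weighted learning and getProb(row) for the probability of a single row; these names are guesses, so adapt them to the actual interface in the code base.

```python
import numpy as np
from scipy.special import logsumexp
from CLT import CLT  # assumed: the Chow-Liu class from the provided code base

class MixtureCLT:
    def __init__(self, k):
        self.k = k
        self.mixture_probs = np.full(k, 1.0 / k)  # p_i
        self.clt_list = []                        # T_i, one tree per component

    def _log_joint(self, dataset):
        """Matrix of log p_i + log T_i(x), shape (n, k)."""
        return np.array([[np.log(self.mixture_probs[i]) +
                          np.log(self.clt_list[i].getProb(row))
                          for i in range(self.k)] for row in dataset])

    def learn(self, dataset, max_iter=50, epsilon=1e-4):
        n = dataset.shape[0]
        # Random soft initialization of the responsibilities.
        weights = np.random.dirichlet(np.ones(self.k), size=n)  # (n, k)
        self.clt_list = [CLT() for _ in range(self.k)]
        prev_ll = -np.inf
        for _ in range(max_iter):
            # M-step: re-estimate p_i and re-learn each weighted Chow-Liu tree.
            self.mixture_probs = weights.sum(axis=0) / n
            for i in range(self.k):
                self.clt_list[i].update(dataset, weights[:, i])
            # E-step: responsibilities proportional to p_i * T_i(x).
            log_joint = self._log_joint(dataset)
            log_norm = logsumexp(log_joint, axis=1)
            weights = np.exp(log_joint - log_norm[:, None])
            ll = log_norm.mean()
            if abs(ll - prev_ll) < epsilon:  # converged
                break
            prev_ll = ll

    def computeLL(self, dataset):
        """Average log-likelihood: mean over rows of log sum_i p_i T_i(x)."""
        return logsumexp(self._log_joint(dataset), axis=1).mean()
```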
3. Mixtures of Tree Bayesian networks using Random Forests. (40 points) The model is defined as above (see Item (2)). Learn the structure and parameters of the model using the following Random-Forests-style approach. Given two hyper-parameters (k, r), generate k sets of bootstrap samples and learn the i-th Tree Bayesian network from the i-th set of bootstrap samples by randomly setting exactly r mutual information scores to 0 (as before, use the Chow-Liu algorithm, with r mutual information scores set to 0, to learn the structure and parameters of each Tree Bayesian network). Select k and r using the validation set and use 1-Laplace smoothing. You can either set p_i = 1/k for all i or use any reasonable alternative (a reasonable alternative earns extra credit). Describe your (reasonable) method precisely in your report. Does it improve over the baseline approach that uses p_i = 1/k? A sketch of the bootstrap-and-zeroing step follows this item.
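A sketch of the bootstrap-and-zeroing step, reusing the smoothed_mutual_info and chow_liu_edges helpers from the earlier sketch (my own helpers, not the provided code base) and leaving the choice of the mixture weights p_i to the reader.

```python
import numpy as np

def forest_components(data, k, r, seed=0):
    """Learn k tree structures, each from a bootstrap sample with r MI entries zeroed."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]
    components = []
    for _ in range(k):
        boot = data[rng.integers(0, n, size=n)]      # bootstrap sample of size n
        mi = smoothed_mutual_info(boot)              # from the earlier sketch
        for idx in rng.choice(len(pairs), size=r, replace=False):
            a, b = pairs[idx]
            mi[a, b] = mi[b, a] = 0.0                # force exactly r scores to 0
        components.append(chow_liu_edges(mi))        # structure of the i-th tree
    return components
```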
4. (Extra credit, 10 points) Describe an algorithm (you don’t have to implement it) for learning mixtures of Tree Bayesian networks using the gradient boosting approach. As a starting point, you can refer to the paper by Rosset and Segal [Rosset and Segal, 2002]. Your description should be formal enough that a graduate student with knowledge of machine learning and basic math could understand and implement your algorithm.
Report the test-set Log-Likelihood (LL) score on the 10 datasets available on MS Teams. For EM and Random Forests (since they are randomized algorithms), choose the hyper-parameters (k and r) using the validation set, then run the algorithms 5 times and report the average and standard deviation (a small helper for this is sketched below). Can you rank the algorithms in terms of accuracy (measured using test-set LL) based on your experiments? Comment on why you think the ranking makes sense.
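For the reporting step, a small helper along the following lines may be useful; fit_and_score is a placeholder for whichever randomized learner (EM or Random Forests) is being run with its selected hyper-parameters, and it should return the test-set LL for one run.

```python
import numpy as np

def repeated_test_ll(fit_and_score, n_runs=5):
    """Run the learner n_runs times and return (mean, std) of the test-set LL."""
    scores = np.array([fit_and_score(seed) for seed in range(n_runs)])
    return scores.mean(), scores.std()
```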
What to turn in:
• Your report in PDF format describing your experimental evaluation.
• Code, along with a Readme file describing how to compile and run it.
• Note: Please submit a single zip file containing your report and your code (rar/gzip/tar
and other formats are not allowed).
References
[Meila and Jordan, 2001] Meila, M. and Jordan, M. I. (2001). Learning with mixtures of
trees. Journal of Machine Learning Research, 1:1–48.
[Rosset and Segal, 2002] Rosset, S. and Segal, E. (2002). Boosting density estimation. In Proceedings of the 15th International Conference on Neural Information Processing Systems, NIPS’02, pages 657–664, Cambridge, MA, USA. MIT Press.