CS 458 Project 3 Solved

P3-1. Revisit Text Documents Classification

Use the 20 newsgroups dataset embedded in scikit-learn:
from sklearn.datasets import fetch_20newsgroups
(See https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_20newsgroups.html#sklearn.datasets.fetch_20newsgroups)

(a) Load the following 4 categories from the 20 newsgroups dataset: categories = ['rec.autos',
'talk.religion.misc', 'comp.graphics', 'sci.space'].
(b) Build classifiers using the following methods:
• Support Vector Machine (sklearn.svm.LinearSVC)
• Naive Bayes classifier (sklearn.naive_bayes.MultinomialNB)
• K-nearest neighbors (sklearn.neighbors.KNeighborsClassifier)
• Random forest (sklearn.ensemble.RandomForestClassifier)
• AdaBoost classifier (sklearn.ensemble.AdaBoostClassifier)
Optimize the hyperparameters of each method and compare their results.
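
One possible way to organize the comparison in P3-1 is sketched below. It is only an illustration under stated assumptions: the TF-IDF features, the parameter grids, and the helper names `CANDIDATES`, `make_search`, and `compare_on_newsgroups` are hypothetical choices that do not come from the assignment, and the grids are kept deliberately small.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

CATEGORIES = ["rec.autos", "talk.religion.misc", "comp.graphics", "sci.space"]

# Assumed (illustrative) hyperparameter grids, one per required classifier.
CANDIDATES = {
    "LinearSVC": (LinearSVC(), {"clf__C": [0.1, 1, 10]}),
    "MultinomialNB": (MultinomialNB(), {"clf__alpha": [0.01, 0.1, 1.0]}),
    "KNeighbors": (KNeighborsClassifier(), {"clf__n_neighbors": [5, 15, 30]}),
    "RandomForest": (RandomForestClassifier(random_state=0),
                     {"clf__n_estimators": [100, 300]}),
    "AdaBoost": (AdaBoostClassifier(random_state=0),
                 {"clf__n_estimators": [50, 200]}),
}


def make_search(name, cv=5):
    """Return a grid search over a TF-IDF + classifier pipeline."""
    estimator, grid = CANDIDATES[name]
    pipe = Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english")),  # assumed features
        ("clf", estimator),
    ])
    return GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")


def compare_on_newsgroups():
    """Tune every candidate on the train split and score it on the test split.

    Note: fetch_20newsgroups downloads the data on first use.
    """
    train = fetch_20newsgroups(subset="train", categories=CATEGORIES)
    test = fetch_20newsgroups(subset="test", categories=CATEGORIES)
    results = {}
    for name in CANDIDATES:
        search = make_search(name).fit(train.data, train.target)
        results[name] = (search.best_params_,
                         search.score(test.data, test.target))
    return results
```

Calling `compare_on_newsgroups()` returns, for each method, the selected parameters and the held-out accuracy, which can then be tabulated for the comparison.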

P3-2. Recognizing hand-written digits

Use the hand-written digits dataset embedded in scikit-learn:
from sklearn import datasets
digits = datasets.load_digits()
(a) Develop a multi-layer perceptron classifier to recognize images of hand-written digits. To
build your classifier, you can use:
sklearn.neural_network.MLPClassifier

(See https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier)
Instructions: use sklearn.model_selection.train_test_split to split your dataset into random
train and test subsets, where you set test_size=0.5.
(b) Optimize the hyperparameters of your neural network to maximize the classification
accuracy. Show the confusion matrix of your neural network. Discuss your results and compare
them with those of a support vector classifier (see https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#sphx-glr-auto-examples-classification-plot-digits-classification-py).
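
The workflow in P3-2 could look roughly like the sketch below. The pixel scaling, the small parameter grid, `random_state=0`, and the default-parameter `SVC` baseline are all assumptions made for illustration; only `load_digits`, `MLPClassifier`, `train_test_split`, and `test_size=0.5` come from the assignment text.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

digits = datasets.load_digits()
X = digits.data / 16.0  # pixel values range from 0 to 16; scaling is an assumption
y = digits.target

# Random 50/50 split, as the instructions require (the seed is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Assumed, deliberately small grid; a real search would likely be wider.
param_grid = {
    "hidden_layer_sizes": [(50,), (100,)],
    "alpha": [1e-4, 1e-2],
}
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0), param_grid, cv=3
)
search.fit(X_train, y_train)

mlp_pred = search.predict(X_test)
mlp_acc = accuracy_score(y_test, mlp_pred)
cm = confusion_matrix(y_test, mlp_pred)  # rows: true digit, columns: predicted

# Baseline support vector classifier with default parameters, for comparison.
svc = SVC().fit(X_train, y_train)
svc_acc = accuracy_score(y_test, svc.predict(X_test))

print("best MLP parameters:", search.best_params_)
print("MLP accuracy:", mlp_acc, "SVC accuracy:", svc_acc)
print(cm)
```

Off-diagonal entries of `cm` show which digit pairs the network confuses, which is a natural starting point for the discussion.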

P3-3. Nonlinear Support Vector Machine

(a) Randomly generate the following 2-class data points
import numpy as np
np.random.seed(0)
X = np.random.rand(300, 2)*10-5
Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)
(b) Develop a nonlinear SVM binary classifier (sklearn.svm.NuSVC).
(c) Plot these data points and the corresponding decision boundaries, similar to the
figure on slide 131 of Chapter 4.
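
A minimal sketch of P3-3 follows, assuming the default RBF kernel and default `nu` and `gamma` of `NuSVC`; these are starting points, not tuned values. The 200 x 200 evaluation grid, the colormaps, and the helper name `plot_decision_boundary` are illustrative assumptions, and the plot requires matplotlib.

```python
import numpy as np
from sklearn.svm import NuSVC

# Data generation exactly as given in part (a).
np.random.seed(0)
X = np.random.rand(300, 2) * 10 - 5
Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

# NuSVC uses an RBF kernel by default, which makes the classifier nonlinear.
clf = NuSVC()
clf.fit(X, Y)

# Evaluate the decision function on a grid covering the data range [-5, 5].
xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)


def plot_decision_boundary(ax=None):
    """Draw the decision function, its zero level set, and the data points."""
    import matplotlib.pyplot as plt  # imported here so fitting works without it

    if ax is None:
        ax = plt.gca()
    ax.imshow(Z, extent=(-5, 5, -5, 5), origin="lower", aspect="auto",
              cmap="PuOr_r")
    ax.contour(xx, yy, Z, levels=[0], linewidths=2, colors="k")  # boundary
    ax.scatter(X[:, 0], X[:, 1], c=Y.astype(int), cmap="Paired",
               edgecolors="k", s=30)
    return ax
```

Calling `plot_decision_boundary()` and then `matplotlib.pyplot.show()` displays the figure; the black contour is the set of points where the decision function equals zero.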