# CS 458 Project 1 Solved

## P1-1. Curse of Dimensionality.

Reproduce a figure similar to the one on slide 37 of Chapter 2, i.e.,
(a) Generate 1000 points from a uniform distribution in a given dimension, and then
compute the difference between the maximum and minimum distance over all pairs of points. Hint: refer to the
tutorial “Introduction to Numpy and Pandas” for how to generate random points.

(b) Repeat (a) for different dimensions from 2 to 50.
Plot $\log_{10}\left(\frac{\max - \min}{\min}\right)$ under different numbers of dimensions.
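
As a rough illustration of steps (a) and (b), the sketch below draws points from a uniform distribution on the unit hypercube and evaluates the log-ratio for each dimension. The function names `log_contrast` and `plot_contrast`, the unit-cube range, the fixed seed, and the Gram-matrix shortcut for pairwise distances are all assumptions made for this example, not part of the assignment.

```python
import numpy as np


def log_contrast(dim, n_points=1000, rng=None):
    """Return log10((max - min) / min) over all pairwise Euclidean distances."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: points are uniform on the unit hypercube [0, 1)^dim.
    points = rng.uniform(0.0, 1.0, size=(n_points, dim))

    # Squared distances via the Gram matrix: |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    sq_norms = np.sum(points ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T

    # Keep each unordered pair once (strict upper triangle) and clip
    # small negative values caused by floating-point round-off.
    upper = np.triu_indices(n_points, k=1)
    dists = np.sqrt(np.clip(sq_dists[upper], 0.0, None))

    d_min, d_max = dists.min(), dists.max()
    return np.log10((d_max - d_min) / d_min)


def plot_contrast(dims=range(2, 51), seed=0):
    """Plot the log-ratio for every dimension in dims (2 to 50 by default)."""
    import matplotlib.pyplot as plt  # imported here so the helper above needs only NumPy

    rng = np.random.default_rng(seed)
    values = [log_contrast(d, rng=rng) for d in dims]
    plt.plot(list(dims), values, marker="o")
    plt.xlabel("Number of dimensions")
    plt.ylabel("log10((max - min) / min)")
    plt.title("Curse of dimensionality")
    plt.show()
```

Calling `plot_contrast()` produces the curve. The value should fall as the dimension grows, since the spread between pairwise distances shrinks relative to the smallest distance.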

## P1-2. The Iris Dataset (https://en.wikipedia.org/wiki/Iris_flower_data_set)

The Iris dataset is bundled with scikit-learn. You can install scikit-learn by following the
instructions (https://scikit-learn.org/stable/install.html). Then you can load the Iris dataset starting from
the following import:

`from sklearn import datasets`

The Iris dataset contains the sepal length, sepal width, petal length, and petal width of 3 different
types of irises (Setosa, Versicolour, and Virginica), stored in a 150×4 numpy.ndarray.
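
A minimal loading sketch, assuming scikit-learn is installed. The variable names `iris`, `X`, and `y` are chosen for this example only.

```python
from sklearn import datasets

# load_iris returns a Bunch object with the data and its metadata.
iris = datasets.load_iris()

X = iris.data    # feature matrix, one row per flower
y = iris.target  # integer class label per flower (0, 1, or 2)

print(X.shape)             # dimensions of the feature matrix
print(iris.feature_names)  # column names, in the order used by X
print(iris.target_names)   # species name for each integer label
```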

a) Data Visualization. Duplicate the following figure using a scatter plot.
b) Find the best discretization for the petal length and the petal width that can best separate the
Iris data and plot a figure similar to the figure in slide 54 in Chapter 2. For each flower type, list
in a table how many data samples are correctly separated and how many are not correctly
separated.
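
One possible way to approach part b) is a brute-force search over two cut points per feature, sketched below. This is only an illustration under stated assumptions: the helpers `best_two_cuts` and `per_class_counts` are hypothetical names, "best" is taken to mean the largest number of correctly separated samples, and the three bins are assumed to map to the classes in label order (which relies on petal sizes growing from Setosa to Versicolour to Virginica).

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
y = iris.target
petal_length = iris.data[:, 2]  # column 2 is petal length (cm)
petal_width = iris.data[:, 3]   # column 3 is petal width (cm)


def best_two_cuts(values, labels):
    """Search two thresholds (low < high) that split values into 3 bins.

    Assumption: bin 0 is predicted as class 0, bin 1 as class 1, bin 2 as class 2.
    Returns (number of correct samples, low, high).
    """
    uniq = np.unique(values)
    candidates = (uniq[:-1] + uniq[1:]) / 2.0  # midpoints between observed values
    best = (-1, None, None)
    for i, low in enumerate(candidates):
        for high in candidates[i + 1:]:
            pred = np.digitize(values, [low, high])
            correct = int((pred == labels).sum())
            if correct > best[0]:
                best = (correct, float(low), float(high))
    return best


def per_class_counts(values, labels, low, high):
    """Return {class: (correct, incorrect)} for the given thresholds."""
    pred = np.digitize(values, [low, high])
    return {
        int(c): (
            int(((labels == c) & (pred == c)).sum()),
            int(((labels == c) & (pred != c)).sum()),
        )
        for c in np.unique(labels)
    }


for name, values in [("petal length", petal_length), ("petal width", petal_width)]:
    correct, low, high = best_two_cuts(values, y)
    print(name, "cuts:", low, high, "correct:", correct)
    for c, (ok, bad) in per_class_counts(values, y, low, high).items():
        print("  ", iris.target_names[c], "correct:", ok, "incorrect:", bad)
```

The per-class counts printed at the end can be copied into the table the task asks for. Each feature is searched independently here; a joint search over both features is another option.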

## P1-3. Principal Component Analysis for The Iris Dataset

You can use the PCA implementation bundled with scikit-learn through the following import:

`from sklearn.decomposition import PCA`
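
A minimal usage sketch follows. The task statement is not shown beyond the import, so the choice of two components and the variable names `pca` and `X_reduced` are assumptions for illustration only.

```python
from sklearn import datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()
X = iris.data

# Assumption: project the 4 features onto the first 2 principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # shape of the projected data
print(pca.explained_variance_ratio_)  # fraction of variance captured per component
```

The array `X_reduced` can then be drawn as a scatter plot colored by `iris.target` to inspect how well the classes separate in the reduced space.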