# CS 458 Project 4 Solved

## P4-1. Hierarchical Clustering Dendrogram

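(a) Randomly generate the following data points:

    import numpy as np

    np.random.seed(0)
    # Three Gaussian blobs of 50 points each, centered at (2, 2), (6, 10), and (10, 2)
    X1 = np.random.randn(50, 2) + [2, 2]
    X2 = np.random.randn(50, 2) + [6, 10]
    X3 = np.random.randn(50, 2) + [10, 2]
    # Stack the blobs into a single (150, 2) array
    X = np.concatenate((X1, X2, X3))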

(b) Use sklearn.cluster.AgglomerativeClustering to cluster the points generated in (a). Plot
the resulting dendrogram.

Instructions: Set distance_threshold=0 and n_clusters=None in AgglomerativeClustering so that
the full merge tree is computed. The default metric used to compute the linkage is 'euclidean',
so you do not need to change this parameter.
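One way to approach (b) is sketched below. AgglomerativeClustering has no built-in dendrogram plot, so this sketch converts the fitted model into a SciPy linkage matrix and passes it to scipy.cluster.hierarchy.dendrogram. The helper name linkage_matrix_from, the truncation to the top three levels, and the output file name are illustrative assumptions, not part of the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering

# Same data as in (a).
np.random.seed(0)
X1 = np.random.randn(50, 2) + [2, 2]
X2 = np.random.randn(50, 2) + [6, 10]
X3 = np.random.randn(50, 2) + [10, 2]
X = np.concatenate((X1, X2, X3))

# distance_threshold=0 with n_clusters=None makes scikit-learn build the
# full tree and store the merge distances in model.distances_.
model = AgglomerativeClustering(distance_threshold=0, n_clusters=None).fit(X)


def linkage_matrix_from(model):
    """Convert a fitted model into the (n_samples - 1, 4) matrix SciPy expects."""
    n_samples = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, merge in enumerate(model.children_):
        # Indices below n_samples are original points; larger indices refer
        # to the cluster created by merge number (index - n_samples).
        counts[i] = sum(
            1 if child < n_samples else counts[child - n_samples]
            for child in merge
        )
    return np.column_stack(
        [model.children_, model.distances_, counts]
    ).astype(float)


Z = linkage_matrix_from(model)

fig, ax = plt.subplots(figsize=(8, 4))
# Show only the top three levels of the tree to keep the plot readable.
dendrogram(Z, truncate_mode="level", p=3, ax=ax)
ax.set_title("Hierarchical Clustering Dendrogram")
ax.set_xlabel("Number of points in node (or index of point if no parenthesis)")
fig.savefig("dendrogram.png")
```

Each row of Z records one merge: the two child indices, the merge distance, and the number of original points in the new cluster. The last row therefore covers all 150 points.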

## P4-2. Clustering structured dataset

(a) Generate a swiss roll dataset:

    from sklearn.datasets import make_swiss_roll

    # Generate data (swiss roll dataset)
    n_samples = 1500
    noise = 0.05
    X, _ = make_swiss_roll(n_samples, noise=noise)
    # Make it thinner
    X[:, 1] *= 0.5

(b) Use sklearn.cluster.AgglomerativeClustering to cluster the points generated in (a), with
the parameters n_clusters=6, connectivity=connectivity, linkage='ward', where connectivity is
computed as follows:

    from sklearn.neighbors import kneighbors_graph
    connectivity = kneighbors_graph(X, n_neighbors=10, include_self=False)

Plot the clustered data in a 3D figure and use different colors for different clusters in your figure.

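A minimal sketch of (b) follows, using the parameters stated above. The random_state=0 argument, the tab10 colormap, the viewing angle, and the output file name are assumptions added for reproducibility and readability; they are not specified by the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph

# Data from (a); random_state=0 is an added assumption for reproducibility.
X, _ = make_swiss_roll(1500, noise=0.05, random_state=0)
X[:, 1] *= 0.5  # make the roll thinner

# Restrict merges to each point's 10 nearest neighbors, so clusters grow
# along the rolled surface instead of jumping across its layers.
connectivity = kneighbors_graph(X, n_neighbors=10, include_self=False)
ward = AgglomerativeClustering(
    n_clusters=6, connectivity=connectivity, linkage="ward"
).fit(X)
labels = ward.labels_

# 3D scatter plot with one color per cluster label.
fig = plt.figure(figsize=(6, 5))
ax = fig.add_subplot(projection="3d")
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=labels, cmap="tab10", s=10)
ax.view_init(elev=7, azim=-80)
ax.set_title("Ward linkage with k-NN connectivity (6 clusters)")
fig.savefig("swiss_roll_ward.png")
```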
(c) Use sklearn.cluster.DBSCAN to cluster the points generated in (a). Plot the clustered data in
a 3D figure and use different colors for different clusters in your figure. Discuss and compare the
results of DBSCAN with the results in (b).
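A possible starting point for (c) is sketched below. The assignment does not give DBSCAN parameters, so eps=1.5 and min_samples=5 are assumed values chosen only for illustration, and they will likely need tuning. The random_state=0 argument and the output file name are likewise assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_swiss_roll

# Data from (a); random_state=0 is an added assumption for reproducibility.
X, _ = make_swiss_roll(1500, noise=0.05, random_state=0)
X[:, 1] *= 0.5  # make the roll thinner

# eps and min_samples are assumed values, not taken from the assignment.
db = DBSCAN(eps=1.5, min_samples=5).fit(X)
labels = db.labels_

# DBSCAN marks noise points with the label -1; they are not a cluster.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print(f"clusters found: {n_clusters}, noise points: {n_noise}")

fig = plt.figure(figsize=(6, 5))
ax = fig.add_subplot(projection="3d")
is_noise = labels == -1
# Clustered points colored by label, noise points drawn in black.
ax.scatter(X[~is_noise, 0], X[~is_noise, 1], X[~is_noise, 2],
           c=labels[~is_noise], cmap="tab10", s=10)
ax.scatter(X[is_noise, 0], X[is_noise, 1], X[is_noise, 2],
           c="black", s=10, label="noise")
ax.view_init(elev=7, azim=-80)
ax.set_title(f"DBSCAN (eps=1.5, min_samples=5): {n_clusters} clusters")
fig.savefig("swiss_roll_dbscan.png")
```

For the comparison, note that DBSCAN does not take a target number of clusters. The cluster count follows from the density parameters, and sparse points may be labeled as noise, whereas the Ward model in (b) always assigns every point to one of exactly six clusters.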

## P4-3. Clustering the handwritten digits data

Use the handwritten digits dataset embedded in scikit-learn:
from sklearn import datasets