
# CS/ECE/ME532 Assignment 10 solution



## 1. Neural net functions

a) Sketch the function generated by the following 3-neuron ReLU neural network:

$$f(x) = 2(x - 0.5)_+ - 2(2x - 1)_+ + 4(0.5x - 2)_+,$$

where $x \in \mathbb{R}$ and where $(z)_+ = \max(0, z)$ for any $z \in \mathbb{R}$. Note that this is a single-input, single-output function. Plot $f(x)$ vs. $x$ by hand.
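To check a hand sketch, the piecewise-linear function can also be evaluated numerically. A minimal sketch using NumPy (the grid of evaluation points is arbitrary):

```python
import numpy as np

def relu(z):
    """Elementwise (z)_+ = max(0, z)."""
    return np.maximum(0.0, z)

def f(x):
    """The 3-neuron ReLU network from part (a)."""
    return 2 * relu(x - 0.5) - 2 * relu(2 * x - 1) + 4 * relu(0.5 * x - 2)

x = np.linspace(-1, 6, 701)
y = f(x)
# Breakpoints fall at x = 0.5 (first two neurons) and x = 4 (third neuron):
# f(x) = 0 for x <= 0.5, f(x) = 1 - 2x on [0.5, 4], and f(x) = -7 for x >= 4.
print(f(0.0), f(1.0), f(4.0), f(10.0))  # → 0.0 -1.0 -7.0 -7.0
```

Plotting `y` against `x` (e.g. with matplotlib) reproduces the sketch: flat at 0, a slope of $-2$ between the breakpoints, then flat at $-7$.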
b) Consider the continuous function depicted below. Approximate this function with a ReLU neural network with 2 neurons. The function should be of the form

$$f(x) = \sum_{j=1}^{2} v_j (w_j x + b_j)_+.$$

Indicate the weights and biases of each neuron and sketch the neural network function.
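Since the target function lives in the figure (not reproduced here), the specific weights depend on reading it off; a small helper for evaluating any 2-neuron network of this form, with placeholder values for $v_j$, $w_j$, $b_j$ that should be replaced by the ones read from the figure, might look like:

```python
import numpy as np

def two_neuron_net(x, v, w, b):
    """f(x) = sum_j v_j * (w_j * x + b_j)_+ for scalar input x."""
    x = np.asarray(x, dtype=float)
    return sum(vj * np.maximum(0.0, wj * x + bj) for vj, wj, bj in zip(v, w, b))

# Placeholder parameters -- substitute the values read off from the figure.
v = [1.0, -1.0]
w = [1.0, 1.0]
b = [0.0, -1.0]
print(two_neuron_net(np.linspace(-2, 2, 5), v, w, b))  # → [0. 0. 0. 1. 1.]
```

With these placeholder values the network is a ramp that rises from 0 at $x = 0$ and saturates at 1 for $x \ge 1$, which illustrates how two ReLU neurons combine into a kink-then-flatten shape.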
c) A neural network $f_w$ can be used for binary classification by predicting the label as $\hat{y} = \mathrm{sign}(f_w(x))$. Consider a setting where $x \in \mathbb{R}^2$ and the desired classifier outputs $-1$ if both elements of $x$ are less than or equal to zero and $+1$ otherwise. Sketch the desired classification regions in the two-dimensional plane, and provide a formula for a ReLU network with 2 neurons that can produce the desired classification. For simplicity, assume in this question that $\mathrm{sign}(0) = -1$.
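One candidate network (a sketch, not necessarily the only valid answer) is $f(x) = (x_1)_+ + (x_2)_+$: it is zero exactly when both coordinates are $\le 0$, and strictly positive otherwise, so with the convention $\mathrm{sign}(0) = -1$ it produces the desired labels. Checking this numerically:

```python
import numpy as np

def f(x):
    """2-neuron ReLU network: f(x) = (x1)_+ + (x2)_+."""
    return np.maximum(0.0, x[0]) + np.maximum(0.0, x[1])

def classify(x):
    """Predicted label sign(f(x)), with the convention sign(0) = -1."""
    return 1 if f(x) > 0 else -1

# Both coordinates <= 0 -> f = 0 -> label -1; otherwise f > 0 -> label +1.
for point in [(-1.0, -2.0), (0.0, 0.0), (1.0, -1.0), (-0.5, 3.0)]:
    print(point, classify(point))
```

The decision region for $-1$ is the closed third quadrant, matching the sketch the question asks for.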

## 2. Gradients of a neural net.

Consider a 2-layer neural network of the form $f(x) = \sum_{j=1}^{J} v_j (w_j^T x)_+$. Suppose we want to train our network on a dataset of $N$ samples $x_i$ with corresponding labels $y_i$, using a least squares loss function

$$L = \sum_{i=1}^{N} \bigl(f(x_i) - y_i\bigr)^2.$$

Derive the gradient descent update steps for the input weights $w_j$ and output weights $v_j$.
## 3. Compressing neural nets.

Large neural network models can be approximated by considering low-rank approximations to weight matrices. The neural network $f(x) = \sum_{j=1}^{J} v_j (w_j^T x)_+$ can be written as

$$f(x) = v^T (Wx)_+,$$

where $v$ is a $J \times 1$ vector of the output weights and $W$ is a $J \times d$ matrix with $j$th row $w_j^T$. Let $\sigma_1, \sigma_2, \ldots$ denote the singular values of $W$ and assume that $\sigma_i \le \epsilon$ for $i > r$. Let $f_r$ denote the neural network obtained by replacing $W$ with its best rank-$r$ approximation $\hat{W}_r$. Assuming that $x$ has unit norm, find an upper bound on the difference $\max_x |f(x) - f_r(x)|$. (Hint: for any pair of vectors $a$ and $b$, the following inequality holds: $\|a_+ - b_+\|_2 \le \|a - b\|_2$.)
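One way to see a bound: $|f(x) - f_r(x)| = |v^T((Wx)_+ - (\hat{W}_r x)_+)| \le \|v\| \, \|(W - \hat{W}_r)x\| \le \sigma_{r+1}\|v\|$ for unit-norm $x$, using Cauchy–Schwarz, the hint, and the Eckart–Young theorem. A numerical sanity check of this bound with random $W$ and $v$ (dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
J, d, r = 8, 6, 3
W = rng.normal(size=(J, d))
v = rng.normal(size=J)

# Best rank-r approximation of W via truncated SVD (Eckart--Young).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_r = U[:, :r] * s[:r] @ Vt[:r, :]

def f(x, M):
    return v @ np.maximum(0.0, M @ x)

# Compare |f(x) - f_r(x)| against sigma_{r+1} * ||v|| over random unit vectors.
bound = s[r] * np.linalg.norm(v)
worst = max(
    abs(f(x / np.linalg.norm(x), W) - f(x / np.linalg.norm(x), W_r))
    for x in rng.normal(size=(1000, d))
)
print(worst, bound)   # worst observed gap stays below the bound
```

Since $\sigma_{r+1} \le \epsilon$ by assumption, the bound in the problem's notation is $\epsilon \|v\|$.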

## 4. Face Emotion Classification with a three-layer neural network.

In this problem, use code from an activity (or libraries such as Keras and TensorFlow).

a) Build a classifier using a fully connected three-layer neural network with logistic activation functions. The network should:

- take a vector $x \in \mathbb{R}^{10}$ as input (nine features plus a constant offset),
- have a single, fully connected hidden layer with 32 neurons,
- output a scalar $\hat{y}$.

Note that since the logistic activation function is always positive, your decision rule should be as follows: $\hat{y} > 0.5$ corresponds to a 'happy' face, while $\hat{y} \le 0.5$ is not happy.
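The face-emotion dataset from the activity is not included here, so the following is only a sketch on synthetic stand-in data. It implements the stated architecture (10 inputs including the constant offset, one fully connected 32-neuron hidden layer with logistic activations, a scalar logistic output) trained by plain gradient descent on squared error; the labels, step size, and loss are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, W2):
    """10 -> 32 -> 1 fully connected net, logistic activations throughout."""
    return sigmoid(sigmoid(X @ W1) @ W2)[:, 0]

# Synthetic stand-in for the face-emotion data: nine features plus a constant offset.
N = 200
X = np.hstack([rng.normal(size=(N, 9)), np.ones((N, 1))])   # x in R^10
y = (X[:, 0] + X[:, 1] > 0).astype(float)                   # fake 0/1 labels

W1 = rng.normal(scale=0.5, size=(10, 32))   # input -> hidden (32 neurons)
W2 = rng.normal(scale=0.5, size=(32, 1))    # hidden -> scalar output
eta = 0.5                                    # step size (assumed)

loss0 = np.mean((forward(X, W1, W2) - y) ** 2)
for _ in range(500):
    H = sigmoid(X @ W1)
    yhat = sigmoid(H @ W2)[:, 0]
    # Backpropagate mean squared error through both logistic layers.
    dout = ((yhat - y) * yhat * (1 - yhat))[:, None] / N
    gW2 = H.T @ dout
    gW1 = X.T @ ((dout @ W2.T) * H * (1 - H))
    W2 -= eta * gW2
    W1 -= eta * gW1

yhat = forward(X, W1, W2)
pred = yhat > 0.5          # decision rule: yhat > 0.5 means 'happy'
print("loss:", loss0, "->", np.mean((yhat - y) ** 2))
```

Swapping `X` and `y` for the actual face-emotion features and labels from the activity (or replacing the loop with a Keras model of the same shape) gives the classifier the problem asks for.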