EL-GY-9123 Homework 7 Neural Networks Solved

Neural Networks

1. Consider a neural network on a 3-dimensional input x = (x_1, x_2, x_3) with weights and biases:
W^H = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}, \quad
b^H = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}, \quad
W^O = [1, 1, -1, -1], \quad b^O = -1.5.

Assume the network uses the threshold activation function

g_{act}(z) = \begin{cases} 1, & \text{if } z \ge 0 \\ 0, & \text{if } z < 0 \end{cases} \qquad (1)

and the threshold output function

\hat{y} = \begin{cases} 1, & \text{if } z^O \ge 0 \\ 0, & \text{if } z^O < 0. \end{cases} \qquad (2)
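For checking answers by hand, the full forward pass under these definitions can be sketched in a few lines of NumPy (the batch layout, with one sample per row, is my assumed convention):

```python
import numpy as np

# Parameters from problem 1: four hidden units on a 3-dimensional input.
WH = np.array([[1, 0, 1],
               [0, 1, 1],
               [1, 1, 0],
               [1, 1, 1]], dtype=float)
bH = np.array([0, 0, -1, 1], dtype=float)
WO = np.array([1, 1, -1, -1], dtype=float)
bO = -1.5

def step(z):
    # threshold function (1)/(2): 1 if z >= 0, else 0
    return (z >= 0).astype(float)

def forward(X):
    """X has shape (N, 3), one sample per row; returns yhat of shape (N,)."""
    U = step(X @ WH.T + bH)   # hidden activations u^H
    zO = U @ WO + bO          # output pre-activation z^O
    return step(zO)
```

For example, `forward` on x = (-2, -2, 2) activates only the first two hidden units (z^O = 0.5), while x = (1, 1, 0) activates all four (z^O = -1.5).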

(a) Write the components of z^H and u^H as functions of (x_1, x_2, x_3). For each component j, indicate the region of (x_1, x_2, x_3)-space (a half-space bounded by the unit's hyperplane) where u^H_j = 1.

(b) Write z^O as a function of (x_1, x_2, x_3). In what region is \hat{y} = 1? (You may describe it with mathematical formulae.)

2. Consider a neural network used for regression with a scalar input x and scalar target y:

z^H_j = W^H_j x + b^H_j, \quad u^H_j = \max\{0, z^H_j\}, \quad j = 1, \ldots, N_h,

z^O = \sum_{k=1}^{N_h} W^O_k u^H_k + b^O, \quad \hat{y} = g_{out}(z^O).

The hidden weights and biases are:

W^H = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}, \quad b^H = \begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix}.

(a) What is the number N_h of hidden units? For each j = 1, \ldots, N_h, draw u^H_j as a function of x over a suitable range of values of x. You may draw them on one plot, or on multiple plots.
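As a sketch for this part: the three hidden units here are u^H_1 = max(0, -x - 1), u^H_2 = max(0, x + 1), and u^H_3 = max(0, x - 2), which could be plotted as follows (matplotlib is one possible choice; the file name is mine):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

WH = np.array([-1.0, 1.0, 1.0])   # hidden weights from the problem
bH = np.array([-1.0, 1.0, -2.0])  # hidden biases

x = np.linspace(-4.0, 4.0, 400)
U = np.maximum(0.0, np.outer(x, WH) + bH)  # (400, 3): u_j = max(0, W_j x + b_j)

for j in range(U.shape[1]):
    plt.plot(x, U[:, j], label=f"u{j + 1}")
plt.xlabel("x")
plt.legend()
plt.savefig("hidden_units.png")
```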

(b) Since the network is for regression, you may choose the output activation function g_{out}(z^O) to be linear. Given training data (x_i, y_i), i = 1, \ldots, N, formally define the loss function you would use to train the network.
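For reference, one standard choice (a squared-error loss, sketched under the assumption of the linear output g_{out}(z) = z) is:

L(W^O, b^O) = \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2, \qquad \hat{y}_i = \sum_{k=1}^{N_h} W^O_k u^H_{ik} + b^O.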

(c) Using the output activation and loss function selected in part (b), set up the formulation to determine the output weights and bias, W^O and b^O, for the training data below. You should be able to find a closed-form solution. Write a few lines of Python code to solve the problem.

x_i | -2 | -1 | 0 | 3 | 3.5
y_i |  0 |  0 | 1 | 3 | 3
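With a linear output and squared-error loss, fitting (W^O, b^O) is an ordinary linear least-squares problem in the hidden activations. A minimal sketch (the variable names are mine):

```python
import numpy as np

# Hidden-layer parameters from the problem.
WH = np.array([-1.0, 1.0, 1.0])
bH = np.array([-1.0, 1.0, -2.0])

# Training data from the table.
x = np.array([-2.0, -1.0, 0.0, 3.0, 3.5])
y = np.array([0.0, 0.0, 1.0, 3.0, 3.0])

# Hidden activations, one row per sample: u_j = max(0, W_j x + b_j).
U = np.maximum(0.0, np.outer(x, WH) + bH)            # shape (5, 3)

# Append a column of ones so the bias is fit jointly:
# [U, 1] @ [W^O; b^O] ~= y is an ordinary least-squares problem.
A = np.hstack([U, np.ones((len(x), 1))])
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
WO, bO = theta[:-1], theta[-1]
```

For this particular data the design matrix has full column rank, so the least-squares solution is unique and in fact fits the five points exactly.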

(d) Based on your solution for the output weights and bias, draw \hat{y} vs. x over a suitable range of values of x. Write a few lines of Python code to do this.
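One possible sketch, reusing the hidden-layer parameters and plugging in output weights from the part (c) fit (the values W^O = [0, 1, -1], b^O = 0 are assumed from that fit, and matplotlib is one possible plotting choice):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

WH = np.array([-1.0, 1.0, 1.0])
bH = np.array([-1.0, 1.0, -2.0])
WO = np.array([0.0, 1.0, -1.0])   # assumed from the part (c) least-squares fit
bO = 0.0

xg = np.linspace(-3.0, 5.0, 200)
U = np.maximum(0.0, np.outer(xg, WH) + bH)  # ReLU hidden units
yhat = U @ WO + bO                          # linear output activation

plt.plot(xg, yhat)
plt.xlabel("x")
plt.ylabel("yhat")
plt.savefig("yhat_vs_x.png")
```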

(e) Write a function predict that outputs \hat{y} given a vector of inputs x. Assume x is a vector representing a batch of samples and yhat is a vector with the corresponding outputs. Use the output activation function you selected in part (b), but your function should take the weights and biases for both layers as inputs. Clearly state any assumptions on the formats of the weights and biases. Also, to receive full credit, you must not use any for loops.
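A vectorized sketch (the flat-vector parameter formats below are my assumed conventions):

```python
import numpy as np

def predict(x, WH, bH, WO, bO):
    """Batch prediction for the scalar-input ReLU network.

    Assumed formats: x has shape (N,) (one scalar input per sample),
    WH and bH have shape (Nh,), WO has shape (Nh,), bO is a scalar.
    The output activation is linear, as chosen in part (b).
    """
    Z = np.outer(x, WH) + bH     # (N, Nh) hidden pre-activations
    U = np.maximum(0.0, Z)       # ReLU hidden units, no for loops
    return U @ WO + bO           # (N,) linear outputs yhat
```

The broadcasting in `np.outer(x, WH) + bH` replaces the per-sample and per-unit loops, which is what makes the no-for-loops requirement easy to meet.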

3. Consider a neural network that takes each input x and produces a prediction \hat{y} given

z_j = \sum_{k=1}^{N_i} W_{jk} x_k + b_j, \quad u_j = \frac{1}{1 + \exp(-z_j)}, \quad j = 1, \ldots, M,

\hat{y} = \frac{\sum_{j=1}^{M} a_j u_j}{\sum_{j=1}^{M} u_j}, \qquad (3)

where M is the number of hidden units and is fixed (i.e. not trainable). To train the model,
we get training data (x_i, y_i), i = 1, \ldots, N.

(a) Rewrite the equations (3) for the batch of inputs x_i from the training data. That is, correctly add the index i to the equations.
(b) If we use the loss function

L = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2,

draw the computation graph describing the mapping from the x_i and the parameters to L. Indicate which nodes are trainable parameters.

(c) Compute the gradient \partial L / \partial \hat{y}_i for all i.
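As a check, differentiating the squared-error loss from part (b) term by term gives

\frac{\partial L}{\partial \hat{y}_i} = -2\,(y_i - \hat{y}_i), \qquad i = 1, \ldots, N.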

(d) Suppose that, in backpropagation, we have computed \partial L / \partial \hat{y}_i for all i, represented as \partial L / \partial \hat{y}. Describe how to compute the components of the gradient \partial L / \partial u.

(e) Suppose that we have computed the gradient \partial L / \partial u. Describe how you would compute the gradient \partial L / \partial z.

(f) Suppose that we have computed the gradient \partial L / \partial z. Describe how you would compute the gradients \partial L / \partial W_{jk} and \partial L / \partial b_j.

(g) Putting all of the above together, describe how you would compute the gradients \partial L / \partial W_{jk} and \partial L / \partial b_j.

(h) Write a few lines of Python code to implement the gradient \partial L / \partial u (as in part (d)), given the gradient \partial L / \partial \hat{y}. Indicate how you represent the gradients. Full credit requires that you avoid for loops.
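Since \hat{y}_i = (\sum_j a_j u_{ij}) / S_i with S_i = \sum_j u_{ij}, the quotient rule gives \partial \hat{y}_i / \partial u_{ij} = (a_j - \hat{y}_i) / S_i. One vectorized sketch (the array-shape conventions are my assumptions):

```python
import numpy as np

def grad_u(U, a, dL_dyhat):
    """Backprop through yhat_i = sum_j(a_j * u_ij) / sum_j(u_ij).

    Assumed formats: U is (N, M) holding the u_ij, a is (M,),
    dL_dyhat is (N,). Returns dL/dU with shape (N, M); no for loops.
    """
    S = U.sum(axis=1, keepdims=True)      # (N, 1) denominators S_i
    yhat = (U @ a)[:, None] / S           # (N, 1) predictions
    # d yhat_i / d u_ij = (a_j - yhat_i) / S_i, then the chain rule
    # multiplies each row by the upstream gradient dL/dyhat_i.
    return dL_dyhat[:, None] * (a[None, :] - yhat) / S
```

Broadcasting over the (N, M) array replaces both the sample loop and the hidden-unit loop.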