ECE M146 Homework 6

1. The pdf for two jointly Gaussian random variables X and Y is of the following form, parameterized by the scalars m1, m2, σ1, σ2, and ρXY:

f_{X,Y}(x, y) = \frac{\exp\left\{ -\frac{1}{2(1-\rho_{XY}^2)} \left[ \left( \frac{x-m_1}{\sigma_1} \right)^2 - 2\rho_{XY} \left( \frac{x-m_1}{\sigma_1} \right)\left( \frac{y-m_2}{\sigma_2} \right) + \left( \frac{y-m_2}{\sigma_2} \right)^2 \right] \right\}}{2\pi\sigma_1\sigma_2\sqrt{1-\rho_{XY}^2}}. \quad (1)

The pdf for a multivariate jointly Gaussian random variable Z ∈ R^k is of the following form, parameterized by µ ∈ R^k and Σ ∈ R^{k×k}:

f_Z(z) = \frac{\exp\left( -\frac{1}{2}(z-\mu)^T \Sigma^{-1} (z-\mu) \right)}{\sqrt{(2\pi)^k |\Sigma|}}. \quad (2)
Suppose Z = [X, Y]^T, i.e., z = [x, y]^T.

(a) Find µ, Σ^{-1}, and Σ in terms of m1, m2, σ1, σ2, and ρXY.
(b) Suppose ρXY = 0. What is Σ in this case? Can you write fX,Y(x, y) as the product of two univariate Gaussian distributions? Are X and Y independent?
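As a quick sanity check on the relationship between (1) and (2) that this problem explores, here is a short numerical sketch in Python. The parameter values are made up, and it assumes the standard bivariate covariance structure Σ = [[σ1², ρσ1σ2], [ρσ1σ2, σ2²]], which is essentially what part (a) asks you to verify:

```python
# Sanity check: the bivariate pdf (1), evaluated directly, should match the
# multivariate form (2) with mu = [m1, m2] and the standard bivariate Sigma.
import numpy as np

m1, m2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.3  # made-up parameters
x, y = 0.7, 0.2

# Equation (1), written out term by term.
q = ((x - m1) / s1) ** 2 \
    - 2 * rho * ((x - m1) / s1) * ((y - m2) / s2) \
    + ((y - m2) / s2) ** 2
f1 = np.exp(-q / (2 * (1 - rho ** 2))) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho ** 2))

# Equation (2) with k = 2.
mu = np.array([m1, m2])
Sigma = np.array([[s1 ** 2, rho * s1 * s2], [rho * s1 * s2, s2 ** 2]])
d = np.array([x, y]) - mu
f2 = np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d) / np.sqrt((2 * np.pi) ** 2 * np.linalg.det(Sigma))

print(f1, f2)  # the two values should agree to machine precision
```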
1
2. Gaussian Discriminant Analysis (GDA) models the class conditional distribution as multivariate Gaussian, i.e., P(X|Y) ∼ N(µY, Σ). Suppose we want to enforce the Naive Bayes (NB) assumption, i.e., P(Xi|Y, Xj) = P(Xi|Y), ∀j ≠ i, on GDA.
Show that under the NB assumption all off-diagonal elements of Σ equal 0: Σi,j = 0, ∀i ≠ j.
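To build intuition for why the NB assumption forces the off-diagonal entries to zero, here is a small illustrative sketch (all values made up, not part of the proof): with a nonzero off-diagonal entry in Σ, conditioning on Xj shifts the distribution of Xi.

```python
# Illustration only: when Sigma has a nonzero off-diagonal entry, knowing X2
# shifts the conditional distribution of X1, so P(X1 | Y, X2) != P(X1 | Y).
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])  # off-diagonal entry 0.8 != 0
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=200_000)

# Conditional means of X1 on two slices of X2: they differ noticeably.
print(X[X[:, 1] > 1.0, 0].mean())   # ~ +0.8 * E[X2 | X2 > 1] > 0
print(X[X[:, 1] < -1.0, 0].mean())  # ~ the negative counterpart
```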

3. Consider the classification problem for two classes, C0 and C1. In the generative approach, we model the class-conditional distributions P(x|C0) and P(x|C1), as well as the class priors P(C0) and P(C1). The posterior probability for class C0 can be written as

P(C_0|x) = \frac{P(x|C_0)P(C_0)}{P(x|C_0)P(C_0) + P(x|C_1)P(C_1)}.
(a) Show that P(C0|x) = σ(a), where σ(a) is the sigmoid function defined by

\sigma(a) = \frac{1}{1 + \exp(-a)}.

Find a in terms of P(x|C0), P(x|C1), P(C0) and P(C1).
(b) In the GDA model, we have the class conditional distributions as follows:

P(x|C_0) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left( -\frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0) \right),

P(x|C_1) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left( -\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1) \right).

Suppose we are able to find the maximum likelihood estimates of µ0, µ1, Σ, P(C0), and P(C1). Show that a = w^T x + b for some w and b. Find w and b in terms of µ0, µ1, Σ, P(C0), and P(C1). This shows that the decision boundary is linear.
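Not required for the proof, but a quick numerical way to convince yourself that a is affine in x when the covariance is shared (made-up parameters; it uses the fact that any affine function f satisfies f(u) + f(v) − f(u + v) − f(0) = 0):

```python
# Numerical check: with a shared Sigma, a(x) = ln[P(x|C0)P(C0)] - ln[P(x|C1)P(C1)]
# should be affine in x, so the affinity residual below must vanish.
import numpy as np
from scipy.stats import multivariate_normal

mu0, mu1 = np.array([1.0, 0.0]), np.array([-1.0, 2.0])  # made-up parameters
Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
p0, p1 = 0.4, 0.6

def a(x):
    return (multivariate_normal.logpdf(x, mu0, Sigma) + np.log(p0)
            - multivariate_normal.logpdf(x, mu1, Sigma) - np.log(p1))

u, v = np.array([0.5, -1.2]), np.array([2.0, 0.7])
print(a(u) + a(v) - a(u + v) - a(np.zeros(2)))  # ~0: a is affine, so the boundary is linear
```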
(c) In (b), we modeled the class conditional distributions with the same covariance matrix Σ. Now let us consider two classes that have different covariance matrices, as follows:

P(x|C_0) = \frac{1}{(2\pi)^{n/2}|\Sigma_0|^{1/2}} \exp\left( -\frac{1}{2}(x-\mu_0)^T \Sigma_0^{-1} (x-\mu_0) \right),

P(x|C_1) = \frac{1}{(2\pi)^{n/2}|\Sigma_1|^{1/2}} \exp\left( -\frac{1}{2}(x-\mu_1)^T \Sigma_1^{-1} (x-\mu_1) \right).

Suppose we are able to find the maximum likelihood estimates of µ0, µ1, Σ0, Σ1, P(C0), and P(C1). Show that a = x^T A x + w^T x + b for some A, w, and b. Find A, w, and b in terms of µ0, µ1, Σ0, Σ1, P(C0), and P(C1). This shows that the decision boundary is quadratic.
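As a hint for where the quadratic term comes from, the first step of the expansion is (a sketch, not the full solution):

a = \ln\frac{P(x|C_0)P(C_0)}{P(x|C_1)P(C_1)} = -\frac{1}{2}(x-\mu_0)^T \Sigma_0^{-1} (x-\mu_0) + \frac{1}{2}(x-\mu_1)^T \Sigma_1^{-1} (x-\mu_1) + \frac{1}{2}\ln\frac{|\Sigma_1|}{|\Sigma_0|} + \ln\frac{P(C_0)}{P(C_1)}.

Collecting the terms quadratic in x shows that A is determined by Σ_0^{-1} − Σ_1^{-1}; when Σ0 = Σ1, this term vanishes and we recover the linear case of (b).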
4. We are given a training set {(x^{(i)}, y^{(i)}); i = 1, · · · , m}, where x^{(i)} ∈ R^n and y^{(i)} ∈ {0, 1}. We consider the Gaussian Discriminant Analysis (GDA) model, which models P(x|y) using multivariate Gaussians. Writing out the model, we have:

P(y = 1) = \phi = 1 - P(y = 0),

P(x|y = 0) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left( -\frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0) \right),

P(x|y = 1) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left( -\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1) \right).
The log-likelihood of the data is given by:

L(\phi, \mu_0, \mu_1, \Sigma) = \ln P(x^{(1)}, \cdots, x^{(m)}, y^{(1)}, \cdots, y^{(m)}) = \ln \prod_{i=1}^{m} P(x^{(i)}|y^{(i)}) P(y^{(i)}).

In this exercise, we want to maximize L(φ, µ0, µ1, Σ) with respect to φ and µ0. The maximization over Σ is left for discussion.
(a) Write down the explicit expression for P(x^{(1)}, · · · , x^{(m)}, y^{(1)}, · · · , y^{(m)}) and L(φ, µ0, µ1, Σ).
(b) Find the maximum likelihood estimate for φ. How do you know such φ is
the “best” but not the “worst”? Hint: Show that the second derivative of
L(φ, µ0, µ1, Σ) with respect to φ is negative.
(c) Find the maximum likelihood estimate for µ0. How do you know such µ0 is the “best” but not the “worst”? Hint: Show that the Hessian matrix of L(φ, µ0, µ1, Σ) with respect to µ0 is negative definite. You may use the following: if A is positive definite, then A^{-1} is also positive definite. Also, B is negative definite if −B is positive definite.
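Here is a small numerical illustration of part (b), with made-up labels: the Bernoulli part of the log-likelihood is concave in φ, and its peak falls at the sample mean of the labels, which is what the derivative calculation should produce.

```python
# Illustration: the Bernoulli part of the log-likelihood,
# sum_i [y_i ln(phi) + (1 - y_i) ln(1 - phi)], is concave and peaks at mean(y).
import numpy as np

y = np.array([1, 0, 0, 1, 1, 0, 1, 1])  # toy labels
phis = np.linspace(0.01, 0.99, 981)
ll = y.sum() * np.log(phis) + (len(y) - y.sum()) * np.log(1 - phis)
print(phis[ll.argmax()], y.mean())  # both ~ 0.625
```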

5. In this exercise, you will implement a binary classifier using the Gaussian Discriminant
Analysis (GDA) model in MATLAB. The data is given in data.csv. The first two
columns are the feature values and the last column contains the class labels.
(a) Visualization. Plot the data from different classes in different colors. Is the data
linearly separable?

(b) In the GDA model, we assume the class label follows a Bernoulli distribution, and we model the class conditional distributions as multivariate Gaussians with the same covariance matrix (Σ) and different means (µ0 and µ1). Find the maximum likelihood estimates of the parameters P(y = 0) (the parameter of the Bernoulli distribution), µ0, µ1, and Σ given this data set.
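A minimal sketch of how these estimates could be computed, in Python for concreteness (the assignment asks for MATLAB); it assumes data.csv has rows of the form x1,x2,label with labels in {0, 1} and no header:

```python
# MLE for the GDA parameters: Bernoulli parameter, per-class means,
# and the pooled covariance matrix.
import numpy as np

data = np.loadtxt('data.csv', delimiter=',')
X, y = data[:, :2], data[:, 2]
m = len(y)

phi0 = np.mean(y == 0)           # ML estimate of P(y = 0)
mu0 = X[y == 0].mean(axis=0)     # per-class sample means
mu1 = X[y == 1].mean(axis=0)

# Pooled covariance: average the outer products of the residuals, centering
# each point at the mean of its own class.
R = X - np.where((y == 0)[:, None], mu0, mu1)
Sigma = R.T @ R / m

print(phi0, mu0, mu1, Sigma)
```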

(c) Using the result you found in Question 3 and your ML estimates of the model parameters, find the decision boundary parameterized by w^T x + b = 0. Report w and b, and plot the decision boundary on the same plot.

(d) Visualize your results by plotting the contours of the two distributions P(x, y = 0) and P(x, y = 1). For consistency, set 'LevelList' ('levels' for Python) to logspace(-3,-1,7). Does your decision boundary pass through the points where the two distributions have equal probabilities? Explain why.
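A plotting sketch for parts (a), (c), and (d) along the same lines, assuming the estimates phi0, mu0, mu1, Sigma from the previous sketch and that w and b have already been computed as in Question 3(b):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# (a) scatter the two classes in different colors
plt.scatter(*X[y == 0].T, c='tab:blue', s=8, label='class 0')
plt.scatter(*X[y == 1].T, c='tab:red', s=8, label='class 1')

# grid over the data range for contours and the boundary
g0, g1 = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
                     np.linspace(X[:, 1].min(), X[:, 1].max(), 200))
grid = np.dstack([g0, g1])

# (d) contours of the joint densities P(x, y = 0) and P(x, y = 1)
levels = np.logspace(-3, -1, 7)
plt.contour(g0, g1, phi0 * multivariate_normal.pdf(grid, mu0, Sigma), levels=levels)
plt.contour(g0, g1, (1 - phi0) * multivariate_normal.pdf(grid, mu1, Sigma), levels=levels)

# (c) decision boundary w^T x + b = 0, drawn as the zero level set
# (w and b assumed to be computed already, per Question 3(b))
plt.contour(g0, g1, grid @ w + b, levels=[0.0], colors='k')
plt.legend(); plt.show()
```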