EE 660 Homework 4 (Week 5) solved

Original price was: $35.00.Current price is: $30.00. $25.50

Category:

Description

5/5 - (3 votes)

1. Consider the email spam classification problem of Murphy Problem 8.1. Suppose you
intend to use a linear perceptron classifier on that data (not logistic regression as
directed in Problem 8.1). In the parts below, unless stated otherwise, assume the
dataset of N = 4601 samples is split into N_Tr = 3000 for training and N_Test = 1601
for testing. Also, for the tolerance δ in the VC generalization bound, use δ = 0.1 (for a
certainty of 0.9). The parts below have short answers.
Hint: You may use the relation that if H is a linear perceptron classifier in D
dimensions (D features), d_VC(H) = D + 1. (This will be proved in Problem 2.)
a) What is the VC dimension of the hypothesis set?
b) Expressing the upper bound on the out-of-sample error as E_out(g) ≤ E_in(g) + ε_VC,
for E_in(g) measured on the training data, use d_VC from part (a) to get a value
for ε_VC.
c) To get a lower ε_VC, suppose you reduce the number of features to D = 10, and
also increase the training set size to N_Tr = 10,000. Now what is ε_VC?
d) Suppose that you had control over the number of training samples N_Tr (by
collecting more email data). How many training samples would ensure a
generalization error of ε_VC = 0.1, again with probability 0.9 (the same tolerance
δ = 0.1), and using the reduced feature set (10 features)?
e) Instead suppose you use the test set to measure E_in(g), so let's call it E_test(g).
What is the hypothesis set now? What is its cardinality?
f) Continuing from part (e), use the bound E_out(g) ≤ E_test(g) + ε. Use the
original feature set and the original test set, so that N_Test = 1601. Give an
appropriate expression for ε and calculate it numerically. (A numerical sketch of
these bounds follows this problem.)
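To make parts (b), (d), and (f) concrete, here is a minimal numerical sketch. It assumes the AML forms of the two bounds: the VC bound ε_VC = sqrt((8/N) ln(4 m_H(2N)/δ)) with the polynomial growth-function bound m_H(n) ≤ n^d_VC + 1, and the Hoeffding bound ε = sqrt(ln(2M/δ)/(2N)) for a finite hypothesis set of size M. Function names are my own, and the feature count D must be read off the Murphy dataset.

```python
import numpy as np

def eps_vc(N, dvc, delta=0.1):
    """VC generalization bound (AML form), with m_H(n) <= n**dvc + 1."""
    mH = (2.0 * N) ** dvc + 1.0                     # growth-function bound at 2N
    return np.sqrt((8.0 / N) * np.log(4.0 * mH / delta))

def eps_test(N_test, M=1, delta=0.1):
    """Hoeffding bound for a finite hypothesis set of size M (part (f))."""
    return np.sqrt(np.log(2.0 * M / delta) / (2.0 * N_test))

def sample_complexity(dvc, eps=0.1, delta=0.1, N=1000.0, iters=100):
    """Part (d): solve eps_vc(N) = eps for N by the standard fixed-point
    iteration N = (8 / eps**2) * ln(4 * ((2N)**dvc + 1) / delta)."""
    for _ in range(iters):
        N = (8.0 / eps**2) * np.log(4.0 * ((2.0 * N) ** dvc + 1.0) / delta)
    return int(np.ceil(N))

# Example with the reduced feature set of part (c): D = 10, so dvc = 11.
print(eps_vc(N=10_000, dvc=11))      # part (c)
print(sample_complexity(dvc=11))     # part (d)
print(eps_test(N_test=1601))         # part (f), if the test hypothesis set has M = 1
```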
2. AML Exercise 2.4 (page 52). In addition to the hints given in the book, you can solve
the problem by following the steps outlined below.
For part (a):
i. Write a point x_i as a (d+1)-dimensional vector;
ii. Construct the (d+1) × (d+1) matrix X suggested by the book;
iii. Write h(X), the output of the perceptron, as a function of X and the weights w
(note that h(X) is a (d+1)-dimensional vector with elements +1 and -1);
iv. Using the nonsingularity of X, justify how any h(X) can be obtained (illustrated
numerically in the sketch after these steps).
For part (b):
i. Write a point x_k as a linear combination of the other d+1 points;
ii. Write h(x_k) (the output for the chosen point x_k) and substitute the value of x_k
by the expression just found in the previous item (Hint: use the sgn{·} function);
iii. What part of your expression in (ii) determines the class assignment of each point
x_i, for i ≠ k?
iv. You have just proven (part (a)) that h(X) with X of size (d+1) × (d+1) can be
shattered. When we add a (d+2)-th line to X, can it still be shattered? In other
words, can you choose the value of h(x_k)? Justify your answer. Hint: you can
choose the class label of the other (d+1) points.
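As a sanity check on part (a) (my own illustration, not part of the exercise), the sketch below builds a generically nonsingular X from d + 1 points in homogeneous coordinates and verifies that the weights w = X⁻¹y reproduce every one of the 2^(d+1) labelings, i.e. that the d + 1 points are shattered.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
d = 3
# Rows of X are the points x_i = [1, (d random coordinates)]; such a random
# square matrix is nonsingular with probability 1.
X = np.column_stack([np.ones(d + 1), rng.standard_normal((d + 1, d))])

for labels in itertools.product([-1.0, 1.0], repeat=d + 1):
    y = np.array(labels)
    w = np.linalg.solve(X, y)                   # w = X^{-1} y, so X @ w = y
    assert np.array_equal(np.sign(X @ w), y)    # the perceptron realizes this dichotomy
print(f"all {2 ** (d + 1)} dichotomies realized: these {d + 1} points are shattered")
```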
3. AML Problem 2.24 (page 75), except
>> Replace part (a) with:
(a.1) For a single given dataset D, give an expression for g^(D)(x). (AML
notation)
(a.2) Find ḡ(x) analytically; express your answer in simplest form.
>> For parts (b) and (c), obtain E_D[E_out] by direct numerical computation, not by
adding bias and var (see the Monte Carlo sketch after this problem).
>> For part (d), obtain bias(x), var(x), bias, var, and E_D[E_out], all by analytical
(pencil and paper) techniques.
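A minimal Monte Carlo sketch of the "direct numerical computation" asked for in parts (b) and (c), assuming the setup of AML Problem 2.24 as I recall it (target f(x) = x², two-point datasets with x drawn uniformly from [−1, 1], and g^(D) the line through the two data points); check these details against the book before relying on the numbers.

```python
import numpy as np

rng = np.random.default_rng(660)
n_datasets = 200_000

# Each dataset D = {(x1, x1^2), (x2, x2^2)}; g^(D)(x) = a*x + b is the line
# through the two points.
x1, x2 = rng.uniform(-1.0, 1.0, (2, n_datasets))
a = x1 + x2       # slope:     (x1^2 - x2^2) / (x1 - x2)
b = -x1 * x2      # intercept:  x1^2 - a * x1

# E_D[E_out] = E_D E_x[(g^(D)(x) - f(x))^2]; one fresh test x per dataset
# gives an unbiased Monte Carlo estimate of the double expectation.
x = rng.uniform(-1.0, 1.0, n_datasets)
print("E_D[E_out] ≈", np.mean((a * x + b - x**2) ** 2))
```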
4. AML Problem 2.13 (a), (b).
5. AML Problem 4.4 (a)-(c), plus additional parts (i)-(iii) below.
>> For part (c), assume both g_10(x) and f(x) are given as functions of x, and you
can express your answer in terms of them; and define
E_out(g_10) = E_{x,y}{ [g_10(x) − y(x)]² }.
(i) In Fig. 4.3(a), set σ² = 0.5, and traverse the horizontal line from N ≈ 60 to N ≈
130. Explain why H_10 transitions from overfit to good fit (relative to H_2).
(ii) Also in Fig. 4.3(a), set N = 100, and traverse the vertical line from σ² = 0 to
σ² = 2. Explain why H_10 transitions from good fit to overfit (relative to H_2).
(iii) In Fig. 4.3(b), set N ≈ 75, and traverse the vertical line from Q_f = 0 to Q_f =
100. Explain the behavior. (A simplified numerical stand-in for this experiment
follows.)
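For parts (i)-(iii), here is a simplified stand-in for the experiment behind Fig. 4.3, a sketch of my own rather than AML's exact protocol (the book uses normalized Legendre-polynomial targets; this uses raw random coefficients): fit H_2 and H_10 to noisy samples of a degree-Q_f target and estimate the overfit measure E_out(g_10) − E_out(g_2).

```python
import numpy as np

rng = np.random.default_rng(4)

def overfit_measure(N, sigma2, Qf=20, trials=300):
    """Estimate E_out(g10) - E_out(g2) for random degree-Qf targets on [-1, 1]
    with additive Gaussian noise of variance sigma2."""
    grid = np.linspace(-1.0, 1.0, 201)          # where E_out is approximated
    diffs = []
    for _ in range(trials):
        f = np.polynomial.Polynomial(rng.standard_normal(Qf + 1))  # random target
        x = rng.uniform(-1.0, 1.0, N)
        y = f(x) + np.sqrt(sigma2) * rng.standard_normal(N)
        g2 = np.polynomial.Polynomial.fit(x, y, 2)    # least-squares fit in H_2
        g10 = np.polynomial.Polynomial.fit(x, y, 10)  # least-squares fit in H_10
        diffs.append(np.mean((g10(grid) - f(grid)) ** 2)
                     - np.mean((g2(grid) - f(grid)) ** 2))
    return float(np.mean(diffs))

# Part (i): traverse the horizontal line at sigma^2 = 0.5 as N grows; the
# overfit measure should shrink as the sample size increases.
for N in (60, 100, 130):
    print(N, overfit_measure(N, sigma2=0.5))
```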