EL 9123 Homework 3 Model Order Selection Solved

Model Order Selection

1. For each of the following pairs of true functions f0(x) and model classes f(x, β) determine:

(i) if the model class is linear;

(ii) if there is no under-modeling; and

(iii) if there is no under-modeling, what is the true parameter?

(a) f0(x) = 1 + 2x, f(x, β) = β0 + β1x + β2x

(b) f0(x) = 1 + 1/(2 + 3x), f(x, a0, a1, b0, b1) = (a0 + a1x)/(b0 + b1x).

(c) f0(x) = (x1 − x2)^2 and f(x, a, b1, b2, c1, c2) = a + b1 x1 + b2 x2 + c1 x1^2 + c2 x2^2.

2. In this problem, we will see how to calculate the bias when there is under-modeling.

Suppose that training data (xi, yi), i = 1, . . . , n is fit using a simple linear model of the form
ŷ = f(x, β) = β0 + β1x.

However, the true relation between x and y is given by
y = f0(x), f0(x) = β00 + β01x + β02x^2,

where the “true” function f0(x) is quadratic and β0 = (β00, β01, β02) is the vector of the true parameters. There is no noise.

(a) Write an expression for the least-squares estimate β̂ = (β̂0, β̂1) in terms of the training data (xi, yi), i = 1, . . . , n. These expressions will involve multiple steps. You do not need to simplify the equations. Just make sure you state clearly how one would compute β̂ from the training values.

(b) Using the fact that yi = f0(xi) in the training data, write the expression for β̂ = (β̂0, β̂1) in terms of the values xi and the true parameter values β0. Again, you do not need to simplify the equations. Just make sure you state clearly how one would compute β̂ from the true parameter vector β0 and x.
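One way to organize parts (a) and (b) is in matrix form. The block below is a sketch, not the official solution: the symbols A (the n × 2 feature matrix of the linear model) and A_0 (the n × 3 feature matrix of the true quadratic) are notation introduced here for illustration, and the formula assumes that at least two of the xi are distinct so that A^T A is invertible.

```latex
% Part (a): stack the training data and solve the normal equations
A = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix},
\qquad
y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},
\qquad
\hat{\beta} = (A^{\mathsf T} A)^{-1} A^{\mathsf T} y .

% Part (b): with no noise, y_i = f_0(x_i), so y = A_0 \beta_0
A_0 = \begin{bmatrix} 1 & x_1 & x_1^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{bmatrix},
\qquad
y = A_0 \beta_0
\;\Longrightarrow\;
\hat{\beta} = (A^{\mathsf T} A)^{-1} A^{\mathsf T} A_0 \, \beta_0 .
```

In this form the estimate is a fixed linear map applied to the true parameter vector, which makes explicit that it depends only on the training inputs and on β0.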

(c) Suppose that the true parameters are β0 = (1, 2, −1) and the model is trained using 10 values xi uniformly spaced in [0, 1]. Write a short Python program to compute the estimated parameters β̂. Plot the estimated function f(x, β̂) and the true function f0(x) for x ∈ [0, 3].
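A minimal sketch of the program part (c) asks for, assuming NumPy for the fit and (optionally) matplotlib for the plot. The names `f0`, `A`, `beta_hat` and `plot_fit` are illustrative choices, not part of the assignment, and using `np.linalg.lstsq` is just one of several equivalent ways to solve the least-squares problem.

```python
import numpy as np

# True quadratic parameters (beta00, beta01, beta02) given in part (c)
beta_true = np.array([1.0, 2.0, -1.0])

def f0(x):
    """True function f0(x) = beta00 + beta01*x + beta02*x^2."""
    return beta_true[0] + beta_true[1] * x + beta_true[2] * x**2

# Training data: 10 uniformly spaced points in [0, 1], no noise
x_train = np.linspace(0.0, 1.0, 10)
y_train = f0(x_train)

# Feature matrix for the linear model y_hat = beta0 + beta1*x
A = np.column_stack([np.ones_like(x_train), x_train])

# Least-squares solution of A @ beta ~= y_train
beta_hat, *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Evaluate the fitted line and the true function on [0, 3]
x_plot = np.linspace(0.0, 3.0, 200)
y_fit = beta_hat[0] + beta_hat[1] * x_plot
y_true = f0(x_plot)

def plot_fit():
    """Plot the fitted line against the true quadratic (requires matplotlib)."""
    import matplotlib.pyplot as plt
    plt.plot(x_plot, y_true, label="true f0(x)")
    plt.plot(x_plot, y_fit, "--", label="linear fit")
    plt.axvspan(0.0, 1.0, alpha=0.1, label="training range")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.legend()
    plt.grid(True)
    plt.show()
```

Calling `plot_fit()` draws both curves; shading the training interval makes it easier to see that the two functions stay close on [0, 1] and separate outside it.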

(d) For what value of x in the range [0, 3] is the squared bias Bias^2(x) = (f(x, β̂) − f0(x))^2 largest?
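Part (d) can be checked numerically with a simple grid search. The sketch below is self-contained and assumes the setup of part (c); the grid resolution of 301 points and the names `x_grid`, `bias_sq` and `x_worst` are arbitrary illustrative choices.

```python
import numpy as np

# Setup from part (c): true parameters and noise-free training data
beta_true = np.array([1.0, 2.0, -1.0])      # (beta00, beta01, beta02)
x_train = np.linspace(0.0, 1.0, 10)
# np.polyval expects the highest-degree coefficient first, hence the reversal
y_train = np.polyval(beta_true[::-1], x_train)

# Degree-1 least-squares fit; np.polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x_train, y_train, 1)

# Squared bias on a grid over [0, 3]
x_grid = np.linspace(0.0, 3.0, 301)
y_fit = intercept + slope * x_grid
y_true = np.polyval(beta_true[::-1], x_grid)
bias_sq = (y_fit - y_true) ** 2

# Grid point where the squared bias is largest
x_worst = x_grid[np.argmax(bias_sq)]
print(f"largest squared bias at x = {x_worst:.2f}")
```

Under these assumptions the search lands on the right endpoint of the interval, the point furthest from the training range [0, 1].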

3. A medical researcher wishes to evaluate a new diagnostic test for cancer. A clinical trial is conducted where the diagnostic measurement y of each patient is recorded along with attributes of a sample of cancerous tissue from the patient. Three possible models are considered for the diagnostic measurement:

• Model 1: The diagnostic measurement y depends linearly only on the cancer volume.

• Model 2: The diagnostic measurement y depends linearly on the cancer volume and the patient’s age.

• Model 3: The diagnostic measurement y depends linearly on the cancer volume and the patient’s age, but the dependence (slope) on the cancer volume is different for two types of cancer: Type I and II.

(a) Define variables for the cancer volume, age and cancer type and write a linear model for the predicted value ŷ in terms of these variables for each of the three models above. For Model 3, you will want to use one-hot coding.

(b) What are the numbers of parameters in each model? Which model is the most complex?

(c) Since the models in part (a) are linear, given training data, we should have ŷ = Aβ, where ŷ is the vector of predicted values on the training data, A is a feature matrix and β is the vector of parameters. To test the different models, data is collected from 100 patients. The records of the first three patients are shown below:

Patient ID   Measurement y   Cancer type   Cancer volume   Patient age
12           5               I             0.7             55
34           10              II            1.3             65
23           15              II            1.6             70

Based on this data, what would be the values of the first three rows of the three A matrices for the three models in part (a)?
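The first three rows of the feature matrices can be assembled directly from the table. The sketch below uses one possible parameterization and is not the only valid answer: it assumes an intercept column in every model, and for Model 3 it one-hot codes the cancer type and multiplies each indicator by the volume, so that each type gets its own slope while the intercept and age coefficient are shared. The variable names and column order are illustrative.

```python
import numpy as np

# First three patient records from the table
vol = np.array([0.7, 1.3, 1.6])          # cancer volume
age = np.array([55.0, 65.0, 70.0])       # patient age
ctype = np.array(["I", "II", "II"])      # cancer type

# One-hot indicators for the two cancer types
is_type1 = (ctype == "I").astype(float)
is_type2 = (ctype == "II").astype(float)
ones = np.ones_like(vol)

# Model 1: y_hat = b0 + b1*vol
A1 = np.column_stack([ones, vol])

# Model 2: y_hat = b0 + b1*vol + b2*age
A2 = np.column_stack([ones, vol, age])

# Model 3 (assumed form): y_hat = b0 + b1*age + b2*vol*[type I] + b3*vol*[type II]
A3 = np.column_stack([ones, age, vol * is_type1, vol * is_type2])

for name, A in [("A1", A1), ("A2", A2), ("A3", A3)]:
    print(name, "shape", A.shape)
    print(A)
```

With this choice the three models have 2, 3 and 4 columns respectively, which is also one way to count their parameters in part (b).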

(d) To evaluate the models, 10-fold cross validation is used with the following results.

Model   Mean training RSS   Mean test RSS   Test RSS std deviation
1       2.0                 2.01            0.03
2       0.7                 0.72            0.04
3       0.65                0.70            0.05

All RSS values are per sample, and the last column is the (biased) standard deviation, not the standard error.

Which model should be selected based on the “one standard error rule”?
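The one standard error rule can be sketched in a few lines. The code below rests on two assumptions that the problem does not spell out: the standard error is obtained from the biased standard deviation over K folds as std / sqrt(K − 1), and the models are listed from simplest to most complex. The function name `one_se_rule` is hypothetical.

```python
import math

def one_se_rule(mean_test, std_test, n_folds):
    """Return (index, target) for the simplest model within one SE of the best.

    mean_test, std_test: per-model mean and biased std of the test RSS,
    ordered from simplest to most complex model.
    """
    # Assumed conversion from the biased std over the folds to a standard error
    se = [s / math.sqrt(n_folds - 1) for s in std_test]

    # Model with the lowest mean test RSS
    best = min(range(len(mean_test)), key=lambda i: mean_test[i])

    # Any model at or below this target counts as "as good as the best"
    target = mean_test[best] + se[best]

    # Pick the simplest model that meets the target
    for i, m in enumerate(mean_test):
        if m <= target:
            return i, target

# Cross-validation results from the table (Models 1, 2, 3)
mean_test = [2.01, 0.72, 0.70]
std_test = [0.03, 0.04, 0.05]
idx, target = one_se_rule(mean_test, std_test, n_folds=10)
print(f"target = {target:.4f}, selected Model {idx + 1}")
```

With the table's numbers the target is about 0.717, so Model 2's mean test RSS of 0.72 falls just above it and this sketch selects Model 3. Using the standard deviation 0.05 directly instead of the standard error would raise the target to 0.75 and change the selection, which is why the distinction in the table note matters.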