## Description

1. Use SimHousingPrices1

to simulate data that is a polynomial with normally-distributed

noise. The function SimPoly should receive as input:

RealThetas: A real vector of D+1 coefficients for a D-degree polynomial P(x)

StdDev: A non-negative scalar that denotes the scale of fluctuation of the

output around the polynomial value

x: A real vector of input datapoints

The function should provide as output:

y: The outputs. Each output yi is P(xi)+ei where ei is a simulated value of a

normally-distributed random variable, with mean zero and variance

2

.

The function should be in a submitted folder called Assignment02.Problem01

[20 points]

2. Define a cubic polynomial with based on the digits in your UNI (mine would be 2x

3

+x

2

+6x+9

as my uni is ip2169). Use SimPoly to simulate outputs with this polynomial and =0.1.

Simulate outputs for N training inputs and M testing inputs that are uniformly distributed in [-

1,1]. Perform polynomial curve fitting of degrees 0 to 8 by defining the relevant pseudoinverse

for the relevant matrices, and compare empirical risks on training and testing data by plotting

them along the degree axis. Do all this three times: run #1 with N=10, M=10; run #2 with N=100,

M=10; run #3 with N=10, M=100.

Your code should save files with the following information (as columns of numbers):

x.train.[R].txt – for R =1,2,3: 3 training inputs for the corresponding run

x.test.[R].txt – for R =1,2,3: 3 testing inputs for the corresponding run

y.train.[R].txt – for R =1,2,3: 3 training outputs for the corresponding run

y.test.[R].txt – for R =1,2,3: 3 testing outputs for the corresponding run

ThetaStar.[R].[D].txt – for R =1,2,3, and D =0, … ,8 : 3×11 files, each with the fit

coefficients for for the corresponding run and corresponding

degree polynomial.

Risk.train.[R].txt – for R =1,2,3: 3 training empirical risk values for the

corresponding run

Risk.test.[R].txt – for R =1,2,3: 3 testing outputs for the corresponding run

The function to do all of this should be called should be called FitCubic() in a submitted

folder called Assignment02.Problem02

[60 points]

3. Simulate data for logistic regression. Use SimHousingPrices1

to simulate classification

data that is drawn with probability that is logistically dependent on a linear combination of inputs,

plus normally-distributed noise. The function SimLogistic should receive as input:

RealThetas: A real vector of D+1 linear coefficients

x: A real matrix with D columns of input datapoints (row vectors)

The function should provide as output:

y: The binary outputs. Each output yi is randomly chosen with probability

Pr(yi =1) = 1/(1+exp(-zi)) where zi is a -defined linear combination of

the coordinates of the i-th input vector

The function should be in a submitted folder called Assignment02.Problem03

[20 points]

Good luck!1

1

You are encouraged to use the function in the posted solution for Assignment #1. Using your own function is

allowed, but it is at your own risk.