## Description

1. Let the $n$-by-$p$ rank-$r$ ($n > p > r$) matrix $X$ have SVD $X = U\Sigma V^T$, where $U$ is $n$-by-$r$, $\Sigma$ is $r$-by-$r$, and $V$ is $p$-by-$r$.

   a) Find the SVD of $Z = X^T$ in terms of $U$, $\Sigma$, and $V$.

   b) Find the orthonormal basis for the best rank-1 subspace to approximate the rows of $Z$ in terms of $U$, $V$, and $\Sigma$.
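The transpose relationship in part a) can be checked numerically. This is a minimal sketch in NumPy with a random rank-$r$ matrix (not the problem's data): since $X = U\Sigma V^T$, taking transposes gives $Z = X^T = V\Sigma U^T$, so the roles of $U$ and $V$ simply swap.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 8, 5, 3
# Build a random n-by-p matrix of rank r as a product of thin factors.
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))

# Thin SVD of X, truncated to the r nonzero singular values: X = U S V^T.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]

# The SVD of Z = X^T swaps the roles of U and V: Z = V S U^T.
Z = X.T
assert np.allclose(Z, Vt.T @ np.diag(s) @ U.T)
```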

2. Uniqueness of solutions and Tikhonov regularization (ridge regression).

   The least-squares problem is $\min_w \|y - Xw\|_2^2$. Assume $X$ is $n$-by-$p$ with $p < n$.

a) Under what conditions is the solution to the least-squares problem not unique?

   b) The Tikhonov-regularized least-squares problem is

      $$\min_w \; \|y - Xw\|_2^2 + \lambda \|w\|_2^2.$$

      Show that this can be written as an ordinary least-squares problem $\min_w \|\hat{y} - \hat{X}w\|_2^2$ and find $\hat{y}$ and $\hat{X}$.

   c) Use the results from the previous part to determine the conditions under which the Tikhonov-regularized least-squares problem has a unique solution.
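One way to sanity-check a candidate answer to part b) is numerically. The sketch below uses the standard stacking construction, $\hat{y} = [y; 0]$ and $\hat{X} = [X; \sqrt{\lambda} I]$, as an assumed answer (verify it yourself algebraically), and confirms that ordinary least squares on the augmented problem matches the ridge closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, lam = 20, 5, 0.5
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Ridge closed form: w* = (X^T X + lam I)^{-1} X^T y.
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Stacked ("augmented") ordinary least squares:
#   y_hat = [y; 0],  X_hat = [X; sqrt(lam) I]
# so that ||y_hat - X_hat w||^2 = ||y - Xw||^2 + lam ||w||^2.
X_hat = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_hat = np.concatenate([y, np.zeros(p)])
w_ols = np.linalg.lstsq(X_hat, y_hat, rcond=None)[0]

assert np.allclose(w_ridge, w_ols)
```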

3. Pseudoinverse and truncated SVD. The solution to the ridge regression problem

   $$\min_w \; \|y - Xw\|_2^2 + \lambda \|w\|_2^2$$

   is given by $w^* = (X^T X + \lambda I)^{-1} X^T y$. The pseudoinverse of $X$, denoted $X^\dagger$, can be defined by taking the limit of the ridge regression solution as $\lambda \to 0$ (from above):

   $$X^\dagger = \lim_{\lambda \downarrow 0} (X^T X + \lambda I)^{-1} X^T.$$
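The limit definition can be illustrated numerically. A sketch, assuming a random full-column-rank matrix: as $\lambda$ shrinks, the regularized inverse approaches NumPy's `np.linalg.pinv`, which computes the pseudoinverse directly from the SVD.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 4))
p = X.shape[1]

# As lambda decreases, (X^T X + lam I)^{-1} X^T approaches pinv(X).
for lam in [1e-2, 1e-6, 1e-10]:
    approx = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    err = np.linalg.norm(approx - np.linalg.pinv(X))
    print(f"lambda={lam:.0e}  ||approx - pinv|| = {err:.2e}")
```

The printed error shrinks with $\lambda$, consistent with the limit definition.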


   a) Let $X \in \mathbb{R}^{n \times p}$, $p \le n$, have SVD $X = U\Sigma V^T = \sum_{i=1}^p \sigma_i u_i v_i^T$. Show that

      $$(X^T X + \lambda I)^{-1} X^T = \sum_{i=1}^p \frac{\sigma_i}{\sigma_i^2 + \lambda} \, v_i u_i^T.$$

      Hint: Note that $X^T X = V \Sigma^2 V^T$ and $\lambda I = V \lambda I V^T$.
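Before proving the identity in part a), it can be checked numerically. A sketch with a random full-column-rank $X$ (the sum on the right is built term by term from the SVD):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, lam = 9, 4, 0.3
X = rng.standard_normal((n, p))

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # thin SVD: p terms

lhs = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
# sum_i  sigma_i / (sigma_i^2 + lam) * v_i u_i^T
rhs = sum(s[i] / (s[i]**2 + lam) * np.outer(Vt[i], U[:, i]) for i in range(p))

assert np.allclose(lhs, rhs)
```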

   b) Using the limit definition of the pseudoinverse above, show that when $X^T X$ is invertible, $X^\dagger = (X^T X)^{-1} X^T$.

   c) Argue that when $X$ is square and invertible, $X^\dagger = X^{-1}$.

   d) Argue that if $X$ is rank $r < p$, then for $\lambda > 0$,

      $$(X^T X + \lambda I)^{-1} X^T = \sum_{i=1}^r \frac{\sigma_i}{\sigma_i^2 + \lambda} \, v_i u_i^T.$$

   e) Now argue that if $X$ is rank $r < p$,

      $$X^\dagger = \sum_{i=1}^r \frac{1}{\sigma_i} \, v_i u_i^T = V \Sigma_r^{-1} U^T,$$

      where $\Sigma_r^{-1}$ is a matrix with $1/\sigma_i$ on the diagonal for $i = 1, \ldots, r$, and zeros elsewhere.
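The rank-deficient formula in parts d)–e) can be compared against `np.linalg.pinv`, which discards near-zero singular values in the same way. A sketch, using a random rank-$r$ matrix and an explicit `rcond` cutoff so both computations truncate at the same rank:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, r = 10, 6, 3
# Rank-r matrix via a product of thin factors.
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
# Truncated-SVD pseudoinverse: keep only the r nonzero singular values.
X_dagger = sum(1.0 / s[i] * np.outer(Vt[i], U[:, i]) for i in range(r))

# pinv with rcond=1e-10 drops the numerically-zero singular values too.
assert np.allclose(X_dagger, np.linalg.pinv(X, rcond=1e-10))
```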

4. The data file is available with a matrix $X$ of 100 three-dimensional data points. A script is available with code to assist you with visualizing and fitting this data. Use the results of the SVD to find $a$, a basis for the best (minimum sum of squared distances) one-dimensional subspace for the data.

   a) Run the code to display the data in the first figure. Use the rotate tool to inspect the scatter plot from different angles. Does the data appear to lie very close to a one-dimensional subspace? Does the data appear to be zero mean?

   b) Figure 2 depicts the centered data and the one-dimensional subspace that contains the dominant feature you identified using the SVD. Use the rotate tool to inspect the data and one-dimensional subspace from different angles. Is a one-dimensional subspace a reasonable fit to the data? Comment on the error.

   c) Now comment out (insert %) the line of code that subtracts the mean of the data. Does the dominant feature identified by SVD continue to be a good fit to the data? Comment on the importance of removing the mean before performing PCA.
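The effect in part c) can be reproduced with synthetic data. This is a NumPy sketch standing in for the provided script (which appears to be MATLAB, given the `%` comment syntax); the data here is made up, not the course's data file:

```python
import numpy as np

rng = np.random.default_rng(5)
# 100 points near a line in direction d, shifted by a large nonzero mean.
d = np.array([1.0, 2.0, -1.0]) / np.sqrt(6.0)
t = rng.standard_normal(100)
data = np.outer(t, d) + 0.01 * rng.standard_normal((100, 3)) + np.array([5.0, 5.0, 5.0])

def top_direction(X):
    """First right singular vector of X (rows are data points)."""
    return np.linalg.svd(X, full_matrices=False)[2][0]

v_centered = top_direction(data - data.mean(axis=0))
v_raw = top_direction(data)

# After centering, the top singular vector recovers d; without centering,
# the large mean dominates and the top direction points toward it instead.
print("centered:", np.abs(v_centered @ d))  # close to 1
print("raw:     ", np.abs(v_raw @ d))       # noticeably smaller
```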
