## Description

1. In this part, we will apply the PCA technique to the task of image denoising. Consider the images barbara256.png and stream.png present in the corresponding data/ subfolder; these images have gray levels in the range 0 to 255. For the latter image, you should extract the top-left portion of size 256 by 256. Add zero-mean Gaussian noise of σ = 20 to one of these images using the MATLAB code `im1 = im + randn(size(im))*20;`. Note that this noise is image-independent. (If during the course of your implementation your program takes too long, you can instead work with the file barbara256-part.png, which has size 128 by 128 instead of 256 by 256. You can likewise extract the top-left 128 by 128 part of the stream.png image. You will not be penalized for working on these image parts.)
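For readers working outside MATLAB, an equivalent NumPy sketch of the noise-addition step (the seed and the synthetic `im` below are only placeholders for the actual loaded image):

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, for reproducibility only
im = rng.uniform(0.0, 255.0, (256, 256))  # stand-in for the loaded gray-level image
sigma = 20.0

# Zero-mean Gaussian noise of standard deviation sigma, mirroring the
# MATLAB line  im1 = im + randn(size(im))*20  from the problem statement.
im1 = im + rng.standard_normal(im.shape) * sigma
```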

(a) In the first part, you will divide the entire noisy image 'im1' into overlapping patches of size 7 by 7, and create a matrix P of size 49 × N, where N is the total number of image patches. Each column of P is a single patch reshaped to form a vector. Compute the eigenvectors of the matrix $PP^T$, and the eigen-coefficients of each noisy patch. Let us denote the $j$th eigen-coefficient of the $i$th (noisy) patch (i.e. $P_i$) by $\alpha_{ij}$. Define

$$\bar{\alpha}_j^2 = \max\left(0, \frac{1}{N}\left[\sum_{i=1}^{N} \alpha_{ij}^2\right] - \sigma^2\right),$$

which is basically an estimate of the average squared eigen-coefficients of the 'original (clean) patches'.

Now, your task is to manipulate the noisy coefficients $\{\alpha_{ij}\}$ using the following rule, which is along the lines of the Wiener filter update that we studied in class:

$$\alpha_{ij}^{\text{denoised}} = \frac{\alpha_{ij}}{1 + \frac{\sigma^2}{\bar{\alpha}_j^2}}.$$

Here, $\alpha_{ij}^{\text{denoised}}$ stands for the $j$th eigen-coefficient of the $i$th denoised patch. Note that $\frac{\sigma^2}{\bar{\alpha}_j^2}$ is an estimate of the inverse SNR (ISNR), which we absolutely need for any practical implementation of a Wiener filter update.

After updating the coefficients by the Wiener filter rule, you should reconstruct the denoised patches and re-assemble them to produce the final denoised image, which we will call 'im2'. Since you chose overlapping patches, there will be multiple values at any pixel; take care of this situation using simple averaging. Write a function myPCADenoising1.m to implement this. Display the final image 'im2' in your report and state its RMSE, computed as $\frac{\|\text{im2}_{\text{denoised}} - \text{im2}_{\text{orig}}\|_2}{\|\text{im2}_{\text{orig}}\|_2}$.
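In case it helps to see the pipeline end-to-end, here is a rough NumPy sketch of part (a) (the assignment itself asks for a MATLAB implementation; all function and variable names here are illustrative, not prescribed):

```python
import numpy as np

def pca_denoise_global(im1, sigma, p=7):
    """Sketch of part (a): global PCA + Wiener-like shrinkage of eigen-coefficients."""
    H, W = im1.shape
    # Collect all overlapping p x p patches as columns of P (size p*p x N).
    patches = [im1[r:r+p, c:c+p].ravel()
               for r in range(H - p + 1) for c in range(W - p + 1)]
    P = np.array(patches).T                      # 49 x N for p = 7
    # Eigenvectors of P P^T (symmetric, so eigh applies).
    _, V = np.linalg.eigh(P @ P.T)
    alpha = V.T @ P                              # eigen-coefficients of each patch
    # alpha_bar_j^2 = max(0, (1/N) sum_i alpha_ij^2 - sigma^2)
    abar2 = np.maximum(0.0, (alpha**2).mean(axis=1) - sigma**2)
    # Wiener-like shrinkage: alpha / (1 + sigma^2 / alpha_bar_j^2);
    # coefficients with abar2 = 0 are set to zero outright.
    with np.errstate(divide='ignore'):
        shrink = np.where(abar2 > 0, 1.0 / (1.0 + sigma**2 / abar2), 0.0)
    denoised = V @ (alpha * shrink[:, None])
    # Re-assemble: average the overlapping reconstructions at each pixel.
    out = np.zeros_like(im1, dtype=float)
    cnt = np.zeros_like(out)
    k = 0
    for r in range(H - p + 1):
        for c in range(W - p + 1):
            out[r:r+p, c:c+p] += denoised[:, k].reshape(p, p)
            cnt[r:r+p, c:c+p] += 1
            k += 1
    return out / cnt
```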

(b) In the second part, you will modify this technique. Given any patch $P_i$ in the noisy image, you should collect the K = 200 most similar patches (in a mean-squared error sense) from within a 31 × 31 neighborhood centered at the top-left corner of $P_i$. We will call this set of similar patches $Q_i$ (this set will of course include $P_i$). Build an eigen-space from $Q_i$ and denoise the eigen-coefficients corresponding to only $P_i$ using the Wiener update mentioned earlier. The only change is that $\bar{\alpha}_j^2$ will now be defined using only the patches from $Q_i$ (as opposed to patches from all over the image). Reconstruct the denoised version of $P_i$. Repeat this for every patch in the noisy image (i.e. create a fresh eigen-space each time). At any pixel, there will be multiple values due to overlapping patches; simply average them. Write a function myPCADenoising2.m to implement this. Reconstruct the final denoised image, display it in your report and state the RMSE value. Do so for both barbara and stream.
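A possible way to gather the most similar patches for part (b), sketched in NumPy (the 31 × 31 search window and K = 200 come from the problem statement; everything else, including the function name, is illustrative):

```python
import numpy as np

def similar_patches(im1, r0, c0, p=7, win=31, K=200):
    """Collect up to K patches most similar (in MSE) to the patch whose
    top-left corner is (r0, c0), searching over top-left corners inside a
    win x win neighborhood centered at (r0, c0)."""
    H, W = im1.shape
    ref = im1[r0:r0+p, c0:c0+p].ravel()
    half = win // 2
    cands = []
    for r in range(max(0, r0 - half), min(H - p, r0 + half) + 1):
        for c in range(max(0, c0 - half), min(W - p, c0 + half) + 1):
            patch = im1[r:r+p, c:c+p].ravel()
            cands.append((np.mean((patch - ref)**2), patch))
    cands.sort(key=lambda t: t[0])   # reference patch sorts first (MSE = 0)
    Q = np.array([patch for _, patch in cands[:K]]).T   # p*p x (<= K) matrix Q_i
    return Q
```

From here, the Wiener update of part (a) is applied with $\bar{\alpha}_j^2$ estimated from the columns of this matrix only.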

(c) Now run your bilateral filter code from Homework 2 on the noisy version of the barbara image. Compare the denoised result with the results of the previous two steps for both images. What differences do you observe? What are the differences between this PCA-based approach and the bilateral filter?

(d) Consider that a student clamps the values in the noisy image 'im1' to the [0, 255] range, and then denoises it using the aforementioned PCA-based filtering technique, which assumes Gaussian noise. Is this approach correct? Why (not)? [10 + 20 + 5 + 5 = 40 points]

2. Read Section 1 of the paper 'An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration', published in the IEEE Transactions on Image Processing in August 1996. A copy of this paper is available in the homework folder.

(a) Describe the procedure in the paper for determining the translation between two given images. What is the time complexity of this procedure for predicting translation if the images are of size N × N? How does it compare with the time complexity of a pixel-wise image comparison procedure for predicting the translation?

(b) Also, briefly explain the approach for correcting for rotation between two images, as proposed in Section II of this paper. Write down an equation or two to illustrate your point.

[10+10=20 points]

3. Consider a matrix A of size m × n, m ≤ n. Define $P = A^T A$ and $Q = AA^T$. (Note: all matrices, vectors and scalars involved in this question are real-valued.)

(a) Prove that for any vector y with the appropriate number of elements, we have $y^T P y \geq 0$. Similarly show that $z^T Q z \geq 0$ for a vector z with the appropriate number of elements. Why are the eigenvalues of P and Q non-negative?

(b) If u is an eigenvector of P with eigenvalue λ, show that Au is an eigenvector of Q with eigenvalue λ. If v is an eigenvector of Q with eigenvalue µ, show that $A^T v$ is an eigenvector of P with eigenvalue µ. What will be the number of elements in u and v?

(c) If $v_i$ is an eigenvector of Q, define $u_i \triangleq \frac{A^T v_i}{\|A^T v_i\|_2}$. Then prove that there exists some real, non-negative $\gamma_i$ such that $Au_i = \gamma_i v_i$.

(d) It can be shown that $u_i^T u_j = 0$ for $i \neq j$, and likewise $v_i^T v_j = 0$ for $i \neq j$, for correspondingly distinct eigenvalues. (You did this in HW4, where you showed that the eigenvectors of symmetric matrices are orthonormal.) Now define $U = [v_1|v_2|v_3|\ldots|v_m]$ and $V = [u_1|u_2|u_3|\ldots|u_m]$, and show that $A = U\Gamma V^T$, where $\Gamma$ is a diagonal matrix containing the non-negative values $\gamma_1, \gamma_2, \ldots, \gamma_m$. With this, you have just established the existence of the singular value decomposition of any matrix A. This is a key result in linear algebra, and it is widely used in image processing, computer vision, computer graphics, statistics, machine learning, numerical analysis, natural language processing and data mining. [7.5 + 7.5 + 7.5 + 7.5 = 30 points]
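The construction in (b)-(d) can also be checked numerically. The sketch below builds U, Γ and V exactly as defined above for a random A and verifies $A = U\Gamma V^T$ (a sanity check, not a proof; it assumes the eigenvalues of Q are distinct, which holds almost surely for a random A):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                          # m <= n as in the problem
A = rng.standard_normal((m, n))
Q = A @ A.T                          # m x m, symmetric positive semi-definite

mu, Vq = np.linalg.eigh(Q)           # columns of Vq: eigenvectors v_i of Q
U_cols, V_cols, gammas = [], [], []
for i in range(m):
    v = Vq[:, i]
    u = A.T @ v                      # u_i = A^T v_i / ||A^T v_i||_2
    u = u / np.linalg.norm(u)
    gamma = float(v @ (A @ u))       # recover gamma_i from A u_i = gamma_i v_i
    U_cols.append(v); V_cols.append(u); gammas.append(gamma)

U = np.column_stack(U_cols)          # U = [v_1|...|v_m]  (as in the problem)
V = np.column_stack(V_cols)          # V = [u_1|...|u_m]
Gamma = np.diag(gammas)

assert np.allclose(A, U @ Gamma @ V.T)   # A = U Gamma V^T
assert all(g >= 0 for g in gammas)       # the gamma_i are non-negative
```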

4. Suppose you are standing in a well-illuminated room with a large window, and you take a picture of the scene outside. The window undesirably acts as a semi-reflecting surface, and hence the picture will contain a reflection of the scene inside the room, besides the scene outside. While solutions exist for separating the two components from a single picture, here you will look at a simpler-to-solve version of this problem where you take two pictures. The first picture g1 is taken by adjusting your camera lens so that the scene outside (f1) is in focus (we will assume that the scene outside has negligible depth variation compared to its distance from the camera, and so it makes sense to say that the entire scene outside is in focus), while the reflection off the window surface (f2) will now be defocussed or blurred. This can be written as $g_1 = f_1 + h_2 * f_2$, where $h_2$ stands for the blur kernel that acted on $f_2$. The second picture g2 is taken by focusing the camera onto the surface of the window, with the scene outside being defocussed. This can be written as $g_2 = h_1 * f_1 + f_2$, where $h_1$ is the blur kernel acting on $f_1$. Given g1 and g2, and assuming h1 and h2 are known, your task is to derive a formula to determine f1 and f2. Note that we are making the simplifying assumption that there was no relative motion between the camera and the scene outside while the two pictures were being acquired, and that there were no changes whatsoever to the scene outside or inside. Even with all these assumptions, you will notice something inherently problematic about the formula you will derive. What is it? [5+5 = 10 points]
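As a starting point, the convolution theorem turns the two imaging equations into a pointwise linear system at each spatial frequency $(u, v)$ (capital letters denote Fourier transforms; this merely restates the model, and the derivation itself is left to you):

```latex
G_1(u,v) = F_1(u,v) + H_2(u,v)\,F_2(u,v), \qquad
G_2(u,v) = H_1(u,v)\,F_1(u,v) + F_2(u,v)
```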
