Assignment 2 STA410H1F/2102H1F solution


1. An interesting variation of rejection sampling is the ratio of uniforms method. We start
by taking a bounded function h with h(x) ≥ 0 for all x and ∫_{−∞}^{∞} h(x) dx < ∞. We then
define the region

Ch = {(u, v) : 0 ≤ u ≤ √h(v/u)}

and generate (U, V) uniformly distributed on Ch. We then define the random variable X = V/U.
(a) The joint density of (U, V) is f(u, v) = 1/|Ch| for (u, v) ∈ Ch, where |Ch| is the area of Ch.
Show that the joint density of (U, X) is g(u, x) = u/|Ch| for 0 ≤ u ≤ √h(x), and that the
density of X is γh(x) for some γ > 0.
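A sketch of the change-of-variables computation behind part (a) (not a full solution):

```latex
% Map (u,v) \mapsto (u,x) with x = v/u, i.e. v = ux; the Jacobian is
% |\partial(u,v)/\partial(u,x)| = u, so
g(u,x) = f(u, ux)\,u = \frac{u}{|C_h|}
  \quad\text{for } 0 \le u \le \sqrt{h(x)} .
% Integrating out u gives the marginal density of X:
f_X(x) = \int_0^{\sqrt{h(x)}} \frac{u}{|C_h|}\,du = \frac{h(x)}{2|C_h|},
% so \gamma = 1/(2|C_h|).
```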
(b) The implementation of this method is somewhat complicated by the fact that it is
typically difficult to sample (U, V ) from a uniform distribution on Ch. However, it is usually
possible to find a rectangle of the form Dh = {(u, v) : 0 ≤ u ≤ u+, v− ≤ v ≤ v+} such that
Ch is contained within Dh. Thus, to draw (U, V) from a uniform distribution on Ch, we can
use rejection sampling where we draw proposals (U*, V*) from a uniform distribution on the
rectangle Dh; note that the proposals U* and V* are independent random variables with
Unif(0, u+) and Unif(v−, v+) distributions, respectively. Show that we can define u+, v−, and
v+ as follows:
u+ = max_x √h(x),   v− = min_x x√h(x),   v+ = max_x x√h(x).
(Hint: It suffices to show that if (u, v) ∈ Ch then (u, v) ∈ Dh where Dh is defined using u+,
v−, and v+ above.)
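The hint can be carried out as follows (a sketch; write x = v/u for a point (u, v) with u > 0):

```latex
% If (u,v) \in C_h with x = v/u, then by definition
0 \le u \le \sqrt{h(x)} \le \max_x \sqrt{h(x)} = u_+ .
% For v = xu: if x \ge 0, then since 0 \le u \le \sqrt{h(x)},
v = xu \le x\sqrt{h(x)} \le v_+ ,
% while if x < 0, multiplying u \le \sqrt{h(x)} by x reverses the inequality:
v = xu \ge x\sqrt{h(x)} \ge v_- .
% Hence (u,v) \in D_h, so C_h \subseteq D_h.
```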
(c) Implement (in R) the method above for the standard normal distribution, taking
h(x) = exp(−x²/2). In this case, u+ = 1, v− = −√(2/e) = −0.8577639, and
v+ = √(2/e) = 0.8577639. What is the probability that proposals are accepted?
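A minimal R sketch of part (c), assuming the bounds u+ = 1 and v± = ±√(2/e) given above (the function name and return format are illustrative choices):

```r
# Ratio of uniforms for the standard normal, h(x) = exp(-x^2/2).
# Proposals (U, V) are uniform on the rectangle [0, u+] x [v-, v+];
# a proposal is accepted when U <= sqrt(h(V/U)), i.e. U^2 <= h(V/U).
ratio_of_uniforms <- function(n) {
  uplus <- 1                 # max over x of sqrt(h(x))
  vplus <- sqrt(2 / exp(1))  # max over x of x*sqrt(h(x)) = sqrt(2/e)
  x <- numeric(0)
  tries <- 0
  while (length(x) < n) {
    m <- n - length(x)
    u <- runif(m, 0, uplus)
    v <- runif(m, -vplus, vplus)
    accept <- u > 0 & u^2 <= exp(-(v / u)^2 / 2)
    x <- c(x, (v / u)[accept])
    tries <- tries + m
  }
  list(sample = x, accept_rate = n / tries)
}
```

The theoretical acceptance probability is |Ch|/|Dh| = (√(2π)/2)/(2√(2/e)) = √(πe)/4 ≈ 0.7306, which the empirical `accept_rate` should approach for large n.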
2. Suppose we observe y1, · · · , yn where
yi = θi + εi (i = 1, · · · , n)
where {εi} is a sequence of random variables with mean 0 and finite variance representing
noise. We will assume that θ1, · · · , θn are dependent in the sense that |θi − θi−1| is small
for most values of i. The least squares estimates of θ1, · · · , θn are trivial — θ̂i = yi for all i
— but we can modify least squares in a number of ways to accommodate the “smoothness”
assumption on {θi}. In this problem, we will consider estimating {θi} by minimizing
Σ_{i=1}^{n} (yi − θi)² + λ Σ_{i=2}^{n} (θi − θi−1)²

where λ > 0 is a tuning parameter. (The term λ Σ_{i=2}^{n} (θi − θi−1)² is a “roughness”
penalty. In practice, it is more common to use λ Σ_{i=2}^{n} |θi − θi−1| as it better estimates
jumps in {θi}.)
(a) Show that the estimates θ̂1, · · · , θ̂n satisfy the equations

y1 = (1 + λ)θ̂1 − λθ̂2
yj = −λθ̂j−1 + (1 + 2λ)θ̂j − λθ̂j+1 (j = 2, · · · , n − 1)
yn = −λθ̂n−1 + (1 + λ)θ̂n.

(Note that if we write this in matrix form y = Aλθ̂, the matrix Aλ is sparse.)
(b) Show (using results from class) that both the Jacobi and Gauss-Seidel algorithms can be
used to compute the estimates defined in part (a).
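Part (c) below asks for such an implementation; a minimal Gauss-Seidel sketch for the equations in part (a) might look like the following (tolerance and iteration cap are arbitrary choices, and n ≥ 3 is assumed):

```r
# Gauss-Seidel for the tridiagonal system in part (a):
#   y1 = (1 + lambda)*theta[1] - lambda*theta[2]
#   yj = -lambda*theta[j-1] + (1 + 2*lambda)*theta[j] - lambda*theta[j+1]
#   yn = -lambda*theta[n-1] + (1 + lambda)*theta[n]
# Each sweep solves equation j for theta[j] using the latest values.
gauss_seidel <- function(y, lambda, tol = 1e-8, max_iter = 10000) {
  n <- length(y)
  theta <- y  # least squares solution as the starting value
  for (iter in 1:max_iter) {
    old <- theta
    theta[1] <- (y[1] + lambda * theta[2]) / (1 + lambda)
    for (j in 2:(n - 1)) {
      theta[j] <- (y[j] + lambda * (theta[j - 1] + theta[j + 1])) / (1 + 2 * lambda)
    }
    theta[n] <- (y[n] + lambda * theta[n - 1]) / (1 + lambda)
    if (max(abs(theta - old)) < tol) break
  }
  theta
}
```

Because the system matrix is strictly diagonally dominant for any λ > 0, the sweeps converge for any starting value.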
(c) Write a function in R to implement the Gauss-Seidel algorithm above. The inputs for this
function are a vector of responses y and the tuning parameter lambda. Test your function
(for various tuning parameters) on data generated by the following command:
> y <- c(rep(0,250),rep(1,250),rep(0,50),rep(1,450)) + rnorm(1000,0,0.1)

How does varying λ change the estimates, particularly the estimates of the transitions from
0 to 1 in the step function?

Supplemental problems (not to hand in):

3. To generate random variables from some distributions, we need to generate two independent
random variables Y and V where Y has a uniform distribution over some finite set and V has
a uniform distribution on [0, 1]. It turns out that Y and V can be generated from a single
Unif(0, 1) random variable U.
(a) Suppose for simplicity that the finite set is {0, 1, · · · , n − 1} for some integer n ≥ 2. For
U ∼ Unif(0, 1), define Y = ⌊nU⌋ and V = nU − Y where ⌊x⌋ is the integer part of x. Show
that Y has a uniform distribution on the set {0, 1, · · · , n − 1}, V has a uniform distribution
on [0, 1], and Y and V are independent.
(b) What happens to the precision of V defined in part (a) as n increases? (For example, if
U has 16 decimal digits and n = 10^6, how many decimal digits will V have?) Is the method
in part (a) feasible if n is very large?

4. One issue with rejection sampling is a lack of efficiency due to the rejection of random
variables generated from the proposal density. An alternative is the acceptance-complement
(A-C) method described below. Suppose we want to generate a continuous random variable
from a density f(x) and that f(x) = f1(x) + f2(x) (where both f1 and f2 are non-negative)
where f1(x) ≤ g(x) for some density function g. Then the A-C method works as follows:
1. Generate two independent random variables U ∼ Unif(0, 1) and X with density g.
2. If U ≤ f1(X)/g(X) then return X.
3. Otherwise (that is, if U > f1(X)/g(X)) generate X from the density
f̃2(x) = f2(x) / ∫_{−∞}^{∞} f2(t) dt.
Note that we must be able to easily sample from g and f̃2 in order for the A-C method to
be efficient; in some cases, they can both be taken to be uniform distributions.
(a) Show that the A-C method generates a random variable with a density f. What is the
probability that the X generated in step 1 (from g) is “rejected” in step 2?
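One way to verify part (a) (a sketch): the returned X either survives step 2 or is drawn afresh in step 3, so for any set A,

```latex
P(X \in A)
 = \int_A \frac{f_1(x)}{g(x)}\, g(x)\, dx
 + \Bigl(1 - \int_{-\infty}^{\infty} f_1(t)\, dt\Bigr) \int_A \tilde f_2(x)\, dx .
% Since 1 - \int f_1 = \int f_2 and \tilde f_2 = f_2 / \int f_2,
% the second term is \int_A f_2(x)\, dx, so the density of X is f_1 + f_2 = f.
% The step-2 rejection probability is 1 - \int f_1 = \int f_2 .
```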
(b) Suppose we want to sample from the truncated Cauchy density

f(x) = 2/(π(1 + x²)) (−1 ≤ x ≤ 1)
using the A-C method with f2(x) = k, a constant, for −1 ≤ x ≤ 1 (so that f̃2(x) = 1/2 is the
uniform density on [−1, 1]) with
f1(x) = f(x) − f2(x) = f(x) − k (−1 ≤ x ≤ 1).
If g(x) is also the uniform density on [−1, 1], for what range of values of k can the A-C
method be applied?
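For orientation (a sketch, to be justified in a full solution): the A-C method needs 0 ≤ f1(x) ≤ g(x) on [−1, 1], which pins down the admissible k:

```latex
% f_1(x) = f(x) - k \ge 0 requires k \le \min_{[-1,1]} f = f(\pm 1) = \tfrac{1}{\pi},
% f_1(x) \le g(x) = \tfrac{1}{2} requires k \ge \max_{[-1,1]} f - \tfrac{1}{2}
%   = \tfrac{2}{\pi} - \tfrac{1}{2}, so
\frac{2}{\pi} - \frac{1}{2} \;\le\; k \;\le\; \frac{1}{\pi} .
```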
(c) Defining f1, f2, and g as in part (b), what value of k minimizes the probability that X
generated in step 1 of the A-C algorithm is rejected?
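A sketch of the A-C sampler for this truncated Cauchy, assuming a k in the admissible range (k = 0.2 here is purely illustrative):

```r
# Acceptance-complement sampling from f(x) = 2/(pi*(1 + x^2)) on [-1, 1]
# with f2(x) = k constant, f1 = f - k, and g = uniform density on [-1, 1].
ac_truncated_cauchy <- function(n, k = 0.2) {
  x <- runif(n, -1, 1)             # step 1: X ~ g
  u <- runif(n)
  f1 <- 2 / (pi * (1 + x^2)) - k   # f1(X)
  g <- 1 / 2                       # g(x) on [-1, 1]
  reject <- u > f1 / g             # step 2 fails for these draws ...
  # step 3: replace them with draws from f2tilde = Unif(-1, 1)
  x[reject] <- runif(sum(reject), -1, 1)
  x
}
```

Note that the step-2 rejection probability here is ∫f2 = 2k, which is the quantity part (c) asks you to minimize over the admissible k.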
5. Suppose we want to generate a random variable X from the tail of a standard normal
distribution, that is, a normal distribution conditioned to be greater than some constant b.
The density in question is
f(x) = exp(−x²/2) / (√(2π)(1 − Φ(b))) for x ≥ b

with f(x) = 0 for x < b, where Φ(x) is the standard normal distribution function. Consider
rejection sampling using the shifted exponential proposal density g(x) = b exp(−b(x − b)) for
x ≥ b.
(a) Let Y be an exponential random variable with mean 1 and U a uniform random variable
on [0, 1] independent of Y. Show that the rejection sampling scheme sets X = b + Y/b if
−2 ln(U) ≥ Y²/b². (Hint: Note that b + Y/b has density g.)
(b) Show that the probability of acceptance is given by

√(2π) b (1 − Φ(b)) / exp(−b²/2).

What happens to this probability for large values of b? (Hint: You need to evaluate
M = max f(x)/g(x).)
(c) Suppose we replace the proposal density g defined above by gλ(x) = λ exp(−λ(x − b)) for
x ≥ b. (Note that gλ is also a shifted exponential density.) What value of λ maximizes the
probability of acceptance? (Hint: Note that you are trying to solve the minimax problem

min_{λ>0} max_{x≥b} f(x)/gλ(x)

for λ. Because the density gλ(x) has heavier tails, the minimax problem above will have the
same solution as the maximin problem

max_{x≥b} min_{λ>0} f(x)/gλ(x),

which may be easier to solve.)
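The sampler in problem 5(a) can be sketched directly in R (b = 2 below is an arbitrary test value):

```r
# Rejection sampling from the standard normal tail beyond b, using the
# shifted exponential proposal g(x) = b*exp(-b*(x - b)) for x >= b.
# Accept X = b + Y/b (with Y ~ Exp(1)) when -2*log(U) >= (Y/b)^2.
rtail_normal <- function(n, b) {
  x <- numeric(0)
  tries <- 0
  while (length(x) < n) {
    m <- n - length(x)
    y <- rexp(m)   # mean-1 exponential
    u <- runif(m)
    keep <- -2 * log(u) >= (y / b)^2
    x <- c(x, (b + y / b)[keep])
    tries <- tries + m
  }
  list(sample = x, accept_rate = n / tries)
}
```

The acceptance probability from part (b), √(2π) b (1 − Φ(b)) / exp(−b²/2), is about 0.843 at b = 2 and tends to 1 as b → ∞, so the proposal becomes more efficient the further out in the tail we sample.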