## Description

## Assignment A3.1 (5.1 in Textbook):

(You can access the data directly via links such as http://www2.stat.duke.edu/~pdh10/FCBS/Exercises/school1.dat, or load it in R with `school1 <- scan('http://www2.stat.duke.edu/~pdh10/FCBS/Exercises/school1.dat')`.)

Studying: The files school1.dat, school2.dat and school3.dat contain data on the amount of time students from three high schools spent on studying or homework during an exam period. Analyze data from each of these schools separately, using the normal model with a conjugate prior distribution, in which $\{\mu_0 = 5, \sigma_0^2 = 4, \kappa_0 = 1, \nu_0 = 2\}$, and compute or approximate the following:

• posterior means and 95% confidence intervals for the mean $\theta$ and standard deviation $\sigma$ from each school;

• the posterior probability that $\theta_i < \theta_j < \theta_k$ for all six permutations $\{i, j, k\}$ of $\{1, 2, 3\}$;

• the posterior probability that $\tilde{Y}_i < \tilde{Y}_j < \tilde{Y}_k$ for all six permutations $\{i, j, k\}$ of $\{1, 2, 3\}$, where $\tilde{Y}_i$ is a sample from the posterior predictive distribution of school $i$.

• Compute the posterior probability that $\theta_1$ is bigger than both $\theta_2$ and $\theta_3$, and the posterior probability that $\tilde{Y}_1$ is bigger than both $\tilde{Y}_2$ and $\tilde{Y}_3$.
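The quantities above can be approximated by Monte Carlo sampling from each school's posterior. The following is a minimal sketch in R, assuming the standard conjugate updates for the normal model (prior $1/\sigma^2 \sim \text{gamma}(\nu_0/2, \nu_0\sigma_0^2/2)$ and $\theta \mid \sigma^2 \sim \text{normal}(\mu_0, \sigma^2/\kappa_0)$). The function name `posterior_draws` and the simulated stand-in data are hypothetical; replace the stand-in vectors with the result of `scan()` on the three school files.

```r
# Draw S Monte Carlo samples from the conjugate normal posterior of one school.
# y: observed study times; mu0, s20, k0, nu0: prior parameters from the exercise.
posterior_draws <- function(y, mu0 = 5, s20 = 4, k0 = 1, nu0 = 2, S = 10000) {
  n <- length(y)
  ybar <- mean(y)
  s2 <- var(y)
  kn <- k0 + n
  nun <- nu0 + n
  mun <- (k0 * mu0 + n * ybar) / kn
  s2n <- (nu0 * s20 + (n - 1) * s2 + k0 * n * (ybar - mu0)^2 / kn) / nun
  sigma2 <- 1 / rgamma(S, nun / 2, rate = nun * s2n / 2)  # 1/sigma^2 is gamma
  theta <- rnorm(S, mun, sqrt(sigma2 / kn))               # theta given sigma^2
  ytilde <- rnorm(S, theta, sqrt(sigma2))                 # posterior predictive
  list(theta = theta, sigma = sqrt(sigma2), ytilde = ytilde,
       mun = mun, s2n = s2n)
}

# Simulated stand-in data (NOT the real school files), one vector per school.
set.seed(1)
schools <- list(rnorm(25, 9, 4), rnorm(23, 7, 4), rnorm(20, 8, 4))
draws <- lapply(schools, posterior_draws)

# Posterior mean and 95% quantile-based interval for theta of school 1.
mean(draws[[1]]$theta)
quantile(draws[[1]]$theta, c(0.025, 0.975))

# Monte Carlo estimate of Pr(theta_1 < theta_2 < theta_3 | data).
th <- sapply(draws, function(d) d$theta)  # S x 3 matrix, one column per school
mean(th[, 1] < th[, 2] & th[, 2] < th[, 3])

# Monte Carlo estimate of Pr(theta_1 is the largest | data).
mean(th[, 1] > th[, 2] & th[, 1] > th[, 3])
```

The same comparisons applied to the `ytilde` draws give the posterior predictive probabilities; the other five orderings follow by permuting the column indices.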

## Assignment A3.2 (5.4 in Textbook):

Jeffreys' prior: For sampling models expressed in terms of a $p$-dimensional vector $\psi$, Jeffreys' prior (Exercise 3.11) is defined as $p_J(\psi) \propto \sqrt{|I(\psi)|}$, where $|I(\psi)|$ is the determinant of the $p \times p$ matrix $I(\psi)$ having entries $I(\psi)_{k,l} = -\mathrm{E}\left[\partial^2 \log p(Y \mid \psi) / \partial\psi_k \partial\psi_l\right]$.

• Show that Jeffreys' prior for the normal model is $p_J(\theta, \sigma^2) \propto (\sigma^2)^{-3/2}$.

• Let $y = (y_1, \ldots, y_n)$ be the observed values of an i.i.d. sample from a $\text{normal}(\theta, \sigma^2)$ population. Find a probability density $p_J(\theta, \sigma^2 \mid y)$ such that $p_J(\theta, \sigma^2 \mid y) \propto p_J(\theta, \sigma^2)\, p(y \mid \theta, \sigma^2)$. It may be convenient to write this joint density as $p_J(\theta \mid \sigma^2, y) \times p_J(\sigma^2 \mid y)$. Can this joint density be considered a posterior density?
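As a starting point for the first part, the Fisher information of a single observation $Y \sim \text{normal}(\theta, \sigma^2)$ can be worked out as sketched here. The sketch assumes the parameterization $\psi = (\theta, \sigma^2)$ and that Jeffreys' prior is proportional to the square root of the determinant of $I(\psi)$. For $n$ i.i.d. observations $I(\psi)$ is multiplied by $n$, which does not change the proportionality.

$$
\begin{aligned}
\log p(y \mid \theta, \sigma^2) &= -\tfrac{1}{2} \log(2\pi\sigma^2) - \frac{(y - \theta)^2}{2\sigma^2}, \\
\frac{\partial^2 \log p}{\partial\theta^2} &= -\frac{1}{\sigma^2}, \qquad
\frac{\partial^2 \log p}{\partial\theta\,\partial\sigma^2} = -\frac{y - \theta}{\sigma^4}, \qquad
\frac{\partial^2 \log p}{\partial(\sigma^2)^2} = \frac{1}{2\sigma^4} - \frac{(y - \theta)^2}{\sigma^6}, \\
I(\theta, \sigma^2) &= \begin{pmatrix} 1/\sigma^2 & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}, \qquad
\sqrt{|I(\theta, \sigma^2)|} = \frac{1}{\sqrt{2}\,\sigma^3} \propto (\sigma^2)^{-3/2}.
\end{aligned}
$$

The entries of $I$ follow from $\mathrm{E}[Y - \theta] = 0$ and $\mathrm{E}[(Y - \theta)^2] = \sigma^2$.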

## Assignment A3.3 (6.2 in Textbook):

Mixture model: The file glucose.dat contains the plasma glucose concentration of 532 females

from a study on diabetes (see Exercise 7.6).

• Make a histogram or kernel density estimate of the data. Describe how this empirical

distribution deviates from the shape of a normal distribution.


• Consider the following mixture model for these data: For each study participant there is an unobserved group membership variable $X_i$ which is equal to 1 or 2 with probability $p$ and $1 - p$, respectively. If $X_i = 1$ then $Y_i \sim \text{normal}(\theta_1, \sigma_1^2)$, and if $X_i = 2$ then $Y_i \sim \text{normal}(\theta_2, \sigma_2^2)$. Let $p \sim \text{beta}(a, b)$, $\theta_j \sim \text{normal}(\mu_0, \tau_0^2)$ and $1/\sigma_j^2 \sim \text{gamma}(\nu_0/2, \nu_0\sigma_0^2/2)$ for both $j = 1$ and $j = 2$. Obtain the full conditional distributions of $(X_1, \ldots, X_n)$, $p$, $\theta_1$, $\theta_2$, $\sigma_1^2$ and $\sigma_2^2$.

• Setting $a = b = 1$, $\mu_0 = 120$, $\tau_0^2 = 200$, $\sigma_0^2 = 1000$ and $\nu_0 = 10$, implement the Gibbs sampler for at least 10,000 iterations. Let $\theta_{(1)}^{(s)} = \min\{\theta_1^{(s)}, \theta_2^{(s)}\}$ and $\theta_{(2)}^{(s)} = \max\{\theta_1^{(s)}, \theta_2^{(s)}\}$. Compute and plot the autocorrelation functions of $\theta_{(1)}^{(s)}$ and $\theta_{(2)}^{(s)}$, as well as their effective sample sizes.

• For each iteration $s$ of the Gibbs sampler, sample a value $x \sim \text{binary}(p^{(s)})$, then sample $\tilde{Y}^{(s)} \sim \text{normal}(\theta_x^{(s)}, \sigma_x^{2(s)})$. Plot a histogram or kernel density estimate for the empirical distribution of $\tilde{Y}^{(1)}, \ldots, \tilde{Y}^{(S)}$, and compare it to the empirical distribution of the data from the first part of this exercise. Discuss the adequacy of this two-component mixture model for the glucose data.
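One possible shape for the sampler and its diagnostics is sketched below in R. It assumes the gamma prior is placed on the precision $1/\sigma_j^2$, so that all full conditionals are standard (Bernoulli, beta, normal and gamma). The names `gibbs_mixture` and `ess_simple` are hypothetical, the data are simulated stand-ins rather than glucose.dat, and the effective sample size uses a simple truncated autocorrelation sum, which is not the spectral estimate computed by `coda::effectiveSize`.

```r
# Gibbs sampler for a two-component normal mixture (sketch).
gibbs_mixture <- function(y, S = 10000, a = 1, b = 1, mu0 = 120, t20 = 200,
                          s20 = 1000, nu0 = 10) {
  n <- length(y)
  theta <- c(mean(y) - sd(y), mean(y) + sd(y))  # arbitrary starting values
  s2 <- rep(var(y), 2)
  p <- 0.5
  out <- matrix(NA_real_, S, 5, dimnames = list(
    NULL, c("p", "theta1", "theta2", "s2_1", "s2_2")))
  for (s in seq_len(S)) {
    # Group memberships: X_i = 1 with probability proportional to w1.
    w1 <- p * dnorm(y, theta[1], sqrt(s2[1]))
    w2 <- (1 - p) * dnorm(y, theta[2], sqrt(s2[2]))
    x <- 2 - rbinom(n, 1, w1 / (w1 + w2))
    n1 <- sum(x == 1)
    p <- rbeta(1, a + n1, b + n - n1)
    for (j in 1:2) {
      yj <- y[x == j]
      nj <- length(yj)
      t2n <- 1 / (1 / t20 + nj / s2[j])
      mun <- t2n * (mu0 / t20 + sum(yj) / s2[j])
      theta[j] <- rnorm(1, mun, sqrt(t2n))
      s2[j] <- 1 / rgamma(1, (nu0 + nj) / 2,
                          rate = (nu0 * s20 + sum((yj - theta[j])^2)) / 2)
    }
    out[s, ] <- c(p, theta, s2)
  }
  out
}

# Simple ESS: S / (1 + 2 * sum of autocorrelations), with the sum truncated
# at the first non-positive autocorrelation (an assumption of this sketch).
ess_simple <- function(z, max_lag = 200) {
  rho <- acf(z, lag.max = min(max_lag, length(z) - 1), plot = FALSE)$acf[-1]
  cut <- which(rho <= 0)[1]
  if (!is.na(cut)) rho <- rho[seq_len(cut - 1)]
  length(z) / (1 + 2 * sum(rho))
}

# Simulated stand-in data (NOT glucose.dat) and a short run for illustration.
set.seed(1)
y_demo <- c(rnorm(300, 105, 15), rnorm(200, 150, 25))
out <- gibbs_mixture(y_demo, S = 2000)

# Order the component means within each iteration, then diagnose.
theta_min <- pmin(out[, "theta1"], out[, "theta2"])
theta_max <- pmax(out[, "theta1"], out[, "theta2"])
acf(theta_min, plot = FALSE)  # set plot = TRUE to draw the ACF
c(ess_simple(theta_min), ess_simple(theta_max))

# Posterior predictive draws: pick a component, then sample from it.
S <- nrow(out)
x <- ifelse(runif(S) < out[, "p"], 1, 2)
ytilde <- rnorm(S, out[cbind(seq_len(S), 1 + x)],
                sqrt(out[cbind(seq_len(S), 3 + x)]))
```

For the assignment itself, `y_demo` would be replaced by the glucose values and `S` set to at least 10,000; `hist(ytilde)` or `plot(density(ytilde))` then gives the predictive distribution to compare against the data.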

Sheet 3 is due on Nov 11th. Submit your solutions before 5:00 pm on Nov 11th.
