## Description

## Question 1:

In this question you are going to analyze the ‘wtloss’ data, available on the course website. Briefly, the data come from a weight loss trial in which K=120 patients were randomized to three treatment arms: dietary counseling at baseline (diet=0), dietary counseling at all study visits (diet=1), and dietary counseling at all visits plus free access to an exercise facility (diet=2). Each patient visited the study clinic monthly, for up to 12 months; at each visit their weight was measured.

(a) Use lme() to fit two linear mixed effects models, both including a main effect for diet, a main effect for time, and a diet by time interaction:

(i) random intercepts only

(ii) random intercepts and random slopes

Report the results in a table that would be suitable for a clinical journal, and provide precise interpretations of the fixed effects and variance components from model (ii).
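As a starting point for part (a), the two random-effects specifications might be written as in the sketch below. It runs on a small simulated stand-in rather than the course data: the column names `id`, `time`, and `weight`, the coding of `time` as months 0 to 12, and every simulated value are assumptions made only so the code runs; only `diet` is named in the assignment.

```r
library(nlme)

# Simulated stand-in for the 'wtloss' data (NOT the real trial data).
# Column names id, time, weight are assumptions; diet is coded 0/1/2 as in the text.
set.seed(1)
K <- 30                                  # fewer patients than the real trial, for speed
dat <- expand.grid(time = 0:12, id = seq_len(K))
dat$diet <- factor((dat$id - 1) %% 3)
arm <- as.integer(as.character(dat$diet))
b0 <- rnorm(K, sd = 5)[dat$id]           # patient-specific intercept deviations
b1 <- rnorm(K, sd = 0.3)[dat$id]         # patient-specific slope deviations
dat$weight <- 100 + b0 + (-0.5 - 0.3 * arm + b1) * dat$time +
  rnorm(nrow(dat), sd = 2)

# (i) random intercepts only
fit_ri <- lme(weight ~ diet * time, random = ~ 1 | id, data = dat)

# (ii) random intercepts and random slopes for time
fit_rs <- lme(weight ~ diet * time, random = ~ 1 + time | id, data = dat)

summary(fit_rs)$tTable   # fixed-effect estimates, standard errors, t-tests
VarCorr(fit_rs)          # estimated variance components
```

With the real data, `dat` would be replaced by the wtloss data frame and the variable names adjusted to match it.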

(b) Consider conducting a test for whether the random intercepts/slopes model provides a significantly better fit to the data than the random intercept model. Write down the null and alternative hypotheses. In class we learned that this is a non-standard testing scenario, and that the null distribution of the likelihood ratio test statistic is a mixture of $\chi^2_1$ and $\chi^2_2$ distributions. In the lme help file look up simulate.lme. Use this function to simulate the null distribution, setting n.sim=1000 and seed=1504. Plot this distribution, along with the $\chi^2_1$ distribution and the $\chi^2_2$ distribution, highlighting the 95th percentiles. What do you conclude about the adequacy of the random intercept model as compared to the random intercept/slopes model?
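The plot requested in part (b) could be laid out along the lines of the base R sketch below. Two things here are assumptions rather than part of the assignment: the mixture is taken to have equal weights (0.5, 0.5), and `lrt_null` is only a placeholder drawn from that mixture so that the plotting code runs. In the actual analysis it would be replaced by the likelihood ratio statistics obtained from simulate.lme.

```r
# 95th percentiles of the two reference distributions
q1 <- qchisq(0.95, df = 1)
q2 <- qchisq(0.95, df = 2)

# CDF of the mixture; the equal weights are an assumption of this sketch
mix_cdf <- function(x, w = 0.5) w * pchisq(x, df = 1) + (1 - w) * pchisq(x, df = 2)
q_mix <- uniroot(function(x) mix_cdf(x) - 0.95, interval = c(q1, q2))$root

# Placeholder for the simulated null LRT statistics (NOT simulate.lme output)
set.seed(1)
n_sim <- 1000
from_df1 <- rbinom(n_sim, size = 1, prob = 0.5) == 1
lrt_null <- ifelse(from_df1, rchisq(n_sim, df = 1), rchisq(n_sim, df = 2))

hist(lrt_null, breaks = 40, freq = FALSE, ylim = c(0, 1),
     main = "Null distribution of the LRT statistic", xlab = "LRT statistic")
curve(dchisq(x, df = 1), from = 0.05, to = 15, add = TRUE, lty = 2)
curve(dchisq(x, df = 2), from = 0.05, to = 15, add = TRUE, lty = 3)
abline(v = c(q1, q2), lty = c(2, 3))           # chi-square 95th percentiles
abline(v = quantile(lrt_null, 0.95), lwd = 2)  # empirical 95th percentile
legend("topright", lty = c(2, 3, 1), lwd = c(1, 1, 2),
       legend = c("chi-square, 1 df", "chi-square, 2 df", "empirical 95th pct."))
```

Under the equal-weight assumption, the mixture's 95th percentile `q_mix` falls between `q1` and `q2`, which gives a useful reference when judging the simulated distribution.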

(c) Conduct a residual analysis of model (i) from part (a). Report your results in a concise manner and briefly summarize what you conclude, including whether the results from this analysis are consistent with what you concluded from part (b).

(d) Use geeglm() to fit the same mean model from part (a) using GEE 1.5, based on: (i) working independence (GEE-I), (ii) working exchangeable (GEE-E), and (iii) working AR-1 (GEE-AR1). Report the results in a table that would be suitable for a clinical journal, and provide precise interpretations of the regression parameter estimates from GEE-E.
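The three working correlation structures in part (d) map onto the `corstr` argument of geeglm() from the geepack package, roughly as sketched below. The data are again a simulated stand-in (the column names `id`, `time`, `weight` and all values are assumptions), and the block skips the fits if geepack is not installed. Note that geeglm() expects the rows of each cluster to be contiguous, which the construction below guarantees.

```r
# Simulated stand-in for the 'wtloss' data (NOT the real trial data);
# rows are already grouped by id, as geeglm() requires.
set.seed(2)
K <- 30
dat <- expand.grid(time = 0:12, id = seq_len(K))
dat$diet <- factor((dat$id - 1) %% 3)
arm <- as.integer(as.character(dat$diet))
dat$weight <- 100 + rnorm(K, sd = 5)[dat$id] +
  (-0.5 - 0.3 * arm) * dat$time + rnorm(nrow(dat), sd = 2)

gee_fits <- list()
if (requireNamespace("geepack", quietly = TRUE)) {
  library(geepack)
  gee_fits <- list(
    "GEE-I"   = geeglm(weight ~ diet * time, id = id, data = dat,
                       corstr = "independence"),
    "GEE-E"   = geeglm(weight ~ diet * time, id = id, data = dat,
                       corstr = "exchangeable"),
    "GEE-AR1" = geeglm(weight ~ diet * time, id = id, data = dat,
                       corstr = "ar1")
  )
  # Estimates with robust (sandwich) standard errors for each fit
  print(lapply(gee_fits, function(f) summary(f)$coefficients))
}
```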

(e) State the null and alternative hypotheses for the test of whether the rate of weight loss differs among the treatment groups. Conduct the test for the GEE-E estimator and describe the results using language that would be suitable for a non-biostatistician collaborator.
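One common way to carry out a joint test of several regression coefficients, such as the one in part (e), is a multivariate Wald test. The helper below is a generic sketch, not the course's prescribed method: `wald_test` is a hypothetical name, and the toy numbers only illustrate the call and are not results from the trial.

```r
# Joint Wald test of H0: beta[idx] = 0, given estimates and their covariance.
wald_test <- function(beta, V, idx) {
  b <- beta[idx]
  stat <- drop(t(b) %*% solve(V[idx, idx, drop = FALSE]) %*% b)
  df <- length(b)
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Toy numbers purely to show the call (NOT estimates from the wtloss data)
beta_toy <- c(b4 = -0.4, b5 = -0.9)
V_toy <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)
wald_test(beta_toy, V_toy, idx = 1:2)
```

For a GEE fit, `beta` would be the estimated coefficient vector, `V` the robust covariance matrix of the estimates, and `idx` the positions of the two diet by time interaction terms. Where geepack stores the robust covariance is worth confirming in the package documentation.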

(f) Assuming the mean model is correctly specified, comment on the consistency of the point estimates reported in parts (a) and (d), as well as on the validity of the standard error estimates.

## Question 2 (Optional):

In this question we are going to try to understand how much each cluster (or subject) and each observation per cluster (or subject) is weighted by GEE. That is, even though $V_k$ is called a ‘working covariance matrix,’ it might be more natural to think of it as a working weighting matrix, where the weight matrix $W_k$ is equal to $V_k^{-1}$. For simplicity, we’ll consider the special case where the response is continuous and the variance is constant (i.e., homogeneous). In that case, the GEE estimating equation is given by:

$$
\sum_{k=1}^{K} D_k^T V_k^{-1} (Y_k - \mu_k) = \sum_{k=1}^{K} X_k^T W_k (Y_k - \mu_k).
$$

Define the total weight given to cluster $k$ as

$$
W_{\mathrm{tot},k} = \sum_{i=1}^{n_k} \sum_{j=1}^{n_k} w_{k,ij},
$$

where $w_{k,ij}$ is the $(i, j)$th element of $W_k$.

(a) Assuming an exchangeable correlation structure with correlation parameter $\rho$, calculate $W_{\mathrm{tot},k}$ as a function of $n_k$ and $\rho$ using the identity:

$$
(a I_m + b 1_m)^{-1} = \frac{1}{a} I_m - \frac{b}{a(a + mb)} 1_m,
$$

where $I_m$ denotes the $m \times m$ identity matrix and $1_m$ denotes the $m \times m$ matrix of 1’s.

(b) From this, derive the form of the weight of a person with $n_k = 10$ relative to that of a person with $n_k = 5$. Calculate this value for $\rho = 0.9$, $\rho = 0.5$, and $\rho = 0.1$ and comment on the trend that you observe.

(c) What do the results in part (b) say about the weight for each subject when working independence is used?

(d) The per-observation weight (per single observation within a person) can be thought of as $W_{\mathrm{tot},k}/n_k$. Using the results from part (b), comment on the trend in the per-observation weight received.
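For Question 2, a quick numerical check can complement the algebra. The base R sketch below verifies the matrix inversion identity for one arbitrary choice of $a$, $b$, and $m$, and computes the total and per-observation weights by directly inverting an exchangeable working correlation matrix. It assumes the constant variance is scaled to 1, so that the working covariance equals the working correlation; the function names `exch_corr` and `total_weight` are hypothetical.

```r
# Exchangeable working correlation: (1 - rho) * I + rho * J,
# where J is the n x n matrix of ones (variance scaled to 1 by assumption).
exch_corr <- function(n, rho) {
  (1 - rho) * diag(n) + rho * matrix(1, n, n)
}

# Total weight for a cluster: sum of all elements of the inverse matrix.
total_weight <- function(n, rho) {
  sum(solve(exch_corr(n, rho)))
}

# Numerical check of the inversion identity for one arbitrary (a, b, m).
a <- 0.7; b <- 0.3; m <- 4
lhs <- solve(a * diag(m) + b * matrix(1, m, m))
rhs <- diag(m) / a - b / (a * (a + m * b)) * matrix(1, m, m)
identity_holds <- isTRUE(all.equal(lhs, rhs))

# Total weights, their ratio, and per-observation weights for n_k = 5 and 10.
rhos <- c(0.9, 0.5, 0.1)
weights <- data.frame(
  rho      = rhos,
  w_tot_5  = sapply(rhos, function(r) total_weight(5, r)),
  w_tot_10 = sapply(rhos, function(r) total_weight(10, r))
)
weights$ratio      <- weights$w_tot_10 / weights$w_tot_5
weights$per_obs_5  <- weights$w_tot_5 / 5
weights$per_obs_10 <- weights$w_tot_10 / 10
print(weights)
```

Comparing the printed values against the closed-form expression derived by hand is a simple way to catch algebra slips.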