## Description

## 1. Skink Temperatures

Skinks are tested for their preferred daytime temperature. Each one is placed in a long tank which is warmer at one end, cooler at the other. The temperature at the position where it settles is recorded. There are four different species of skink, and we wish to test (at the 5% level of significance) whether the species differ in their preferred temperature.

The following table gives the data.

| Species | Preferred temperatures (°C) | Total | Mean |
|---|---|---|---|
| A | 18 21 22 18 20 19 19 23 17 22 | 199 | 19.9 |
| B | 24 18 19 21 20 17 23 22 22 19 | 205 | 20.5 |
| C | 22 21 24 19 25 18 23 21 24 22 | 219 | 21.9 |
| D | 21 19 26 24 25 21 20 20 27 25 | 228 | 22.8 |

SAS output is given on pages 3 and 4.

(a) When running the experiment, other possible factors, such as time of day, light, and amount of food recently eaten, are kept as near constant as possible. Why?

(b) The skinks are not put in the tank together. Why?

(c) Give values of n and p (the number of treatments) for this experiment. How many degrees of freedom are in the Treatments row, the Error row and the Total row of the ANOVA table? (Give the algebraic expressions and the actual values for this experiment.)

(d) Use the output to write up the ANOVA in the style suggested in the Assignment Guidelines on page 1. You should include a statement of the (complete) model equation, and also comments on whether the assumptions are satisfied. Use a 5% significance level for the ANOVA test.

One-Way Analysis of Variance

Dependent Variable: Temperature

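| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 3 | 52.0750000 | 17.3583333 | 3.06 | 0.0402 |
| Error | 36 | 203.9000000 | 5.6638889 | | |
| Corrected Total | 39 | 255.9750000 | | | |

| R-Square | Coeff Var | Root MSE | Temperature Mean |
|---|---|---|---|
| 0.203438 | 11.18633 | 2.379893 | 21.27500 |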
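
As a cross-check, the sums of squares and F value in the ANOVA table can be recomputed directly from the skink data. The following is a minimal Python sketch using only the standard library; the function name `one_way_anova` and the variable names are illustrative choices, not part of the SAS output. The p-value is not computed, since that would need the F distribution.

```python
# Skink preferred temperatures, transcribed from the data table.
temps = {
    "A": [18, 21, 22, 18, 20, 19, 19, 23, 17, 22],
    "B": [24, 18, 19, 21, 20, 17, 23, 22, 22, 19],
    "C": [22, 21, 24, 19, 25, 18, 23, 21, 24, 22],
    "D": [21, 19, 26, 24, 25, 21, 20, 20, 27, 25],
}


def one_way_anova(groups):
    """Return (df_model, df_error, ss_model, ss_error, F) for a one-way layout."""
    values = [y for ys in groups.values() for y in ys]
    n, p = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-group (Model) sum of squares.
    ss_model = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    # Within-group (Error) sum of squares.
    ss_error = sum(
        (y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys
    )
    df_model, df_error = p - 1, n - p
    f_value = (ss_model / df_model) / (ss_error / df_error)
    return df_model, df_error, ss_model, ss_error, f_value


df_model, df_error, ss_model, ss_error, f_value = one_way_anova(temps)
print(df_model, df_error, round(ss_model, 4), round(ss_error, 4), round(f_value, 2))
```

The printed values should agree with the Model and Error rows of the table, up to rounding.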

One-Way Analysis of Variance

Levene’s Test for Homogeneity of Temperature Variance

ANOVA of Squared Deviations from Group Means

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Species | 3 | 86.1428 | 28.7143 | 1.36 | 0.2691 |
| Error | 36 | 757.4 | 21.0391 | | |
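
The heading "ANOVA of Squared Deviations from Group Means" describes how this version of Levene's test is built: each observation is replaced by its squared deviation from its own group mean, and an ordinary one-way ANOVA is run on those values. A minimal Python sketch of that construction, applied to the skink data, is given here; the helper name `anova_ss` is an illustrative assumption.

```python
# Skink preferred temperatures, transcribed from the data table.
temps = {
    "A": [18, 21, 22, 18, 20, 19, 19, 23, 17, 22],
    "B": [24, 18, 19, 21, 20, 17, 23, 22, 22, 19],
    "C": [22, 21, 24, 19, 25, 18, 23, 21, 24, 22],
    "D": [21, 19, 26, 24, 25, 21, 20, 20, 27, 25],
}


def anova_ss(groups):
    """Return (ss_between, ss_within, F) for a list of groups."""
    values = [y for ys in groups for y in ys]
    grand = sum(values) / len(values)
    ss_between = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in groups)
    ss_within = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups for y in ys)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_value


# Step 1: squared deviation of each observation from its own group mean.
sq_dev = [[(y - sum(ys) / len(ys)) ** 2 for y in ys] for ys in temps.values()]

# Step 2: ordinary one-way ANOVA on the squared deviations.
ss_between, ss_within, f_levene = anova_ss(sq_dev)
print(round(ss_between, 4), round(ss_within, 1), round(f_levene, 2))
```

The results should match the Species and Error rows of the Levene table, up to rounding.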

Means and Descriptive Statistics

| Species | Mean of Temperature | Std. Dev. of Temperature | Minimum of Temperature | Maximum of Temperature |
|---|---|---|---|---|
| (overall) | 21.275 | 2.5619253577 | 17 | 27 |
| A | 19.9 | 2.0248456731 | 17 | 23 |
| B | 20.5 | 2.2730302828 | 17 | 24 |
| C | 21.9 | 2.2335820757 | 18 | 25 |
| D | 22.8 | 2.8982753492 | 19 | 27 |

## 2. Nasal Sprays

Improvement in breathing airflow is measured for twenty-five people suffering from nasal congestion. They were treated with either a saline spray (A) or one of four nasal sprays (B, C, D, E) available over the counter in pharmacies.

| Spray | Airflow improvement | Total | Mean |
|---|---|---|---|
| A | 15 10 16 14 8 | 63 | 12.6 |
| B | 25 41 37 44 26 | 173 | 34.6 |
| C | 21 6 9 15 14 | 65 | 13.0 |
| D | 16 7 24 22 15 | 84 | 16.8 |
| E | 24 15 39 34 30 | 142 | 28.4 |
| | | 527 | 21.08 |

Relevant SAS output follows, on pages 5 to 7.

Write a report that compares the five treatments using the guidelines on page 1. Ensure that you comment on all the included SAS output. Use a 5% significance level for all statistical tests. Make a recommendation for either a single best nasal spray, or a group of best choices which are similar in their effects; refer to the Tukey test to justify your decision.

One-Way Analysis of Variance

Results: Nasal Spray Example

The ANOVA Procedure

Class Level Information

| Class | Levels | Values |
|---|---|---|
| Spray | 5 | A B C D E |

Number of Observations Read 25

Number of Observations Used 25

Dependent Variable: Improvement

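| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 4 | 1959.440000 | 489.860000 | 9.73 | 0.0002 |
| Error | 20 | 1006.400000 | 50.320000 | | |
| Corrected Total | 24 | 2965.840000 | | | |

| R-Square | Coeff Var | Root MSE | Improvement Mean |
|---|---|---|---|
| 0.660669 | 33.65113 | 7.093659 | 21.08000 |

| Source | DF | Anova SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Spray | 4 | 1959.440000 | 489.860000 | 9.73 | 0.0002 |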
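
The fit statistics reported under the ANOVA table (R-Square, Coeff Var, Root MSE) are simple functions of the sums of squares and the overall mean. The short Python sketch below recomputes them from the values in the nasal spray output, using the usual definitions; the variable names are illustrative.

```python
import math

# Values copied from the nasal spray ANOVA output.
ss_model = 1959.44
ss_error = 1006.40
df_error = 20
grand_mean = 21.08

ss_total = ss_model + ss_error           # Corrected Total sum of squares
r_square = ss_model / ss_total           # proportion of variation explained
root_mse = math.sqrt(ss_error / df_error)  # square root of the Error Mean Square
coeff_var = 100 * root_mse / grand_mean  # Root MSE as a percentage of the mean

print(round(r_square, 6), round(root_mse, 6), round(coeff_var, 5))
```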

STAT 292, 2020 5 Assignment 3

The ANOVA Procedure

Nasal Spray Example

Levene’s Test for Homogeneity of Improvement Variance

ANOVA of Squared Deviations from Group Means

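| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Type | 4 | 11893.9 | 2973.5 | 1.59 | 0.2156 |
| Error | 20 | 37393.0 | 1869.6 | | |

| Level of Type | N | Improvement Mean | Improvement Std Dev |
|---|---|---|---|
| A | 5 | 12.6000000 | 3.43511281 |
| B | 5 | 34.6000000 | 8.67755726 |
| C | 5 | 13.0000000 | 5.78791845 |
| D | 5 | 16.8000000 | 6.68580586 |
| E | 5 | 28.4000000 | 9.28977933 |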

Tukey’s Studentized Range (HSD) Test for Improvement

Note: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha 0.05

Error Degrees of Freedom 20

Error Mean Square 50.32

Critical Value of Studentized Range 4.23186

Minimum Significant Difference 13.425

Means with the same letter are not significantly different.

| Tukey Grouping | Mean | N | Type |
|---|---|---|---|
| A | 34.600 | 5 | B |
| B A | 28.400 | 5 | E |
| B C | 16.800 | 5 | D |
| C | 13.000 | 5 | C |
| C | 12.600 | 5 | A |
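
The Minimum Significant Difference in the Tukey output follows from the critical value of the studentized range, the Error Mean Square and the common group size. The Python sketch below recomputes it and flags which pairs of spray means differ by more than that amount. The critical value is copied from the SAS output rather than derived, and the variable names are illustrative.

```python
import math
from itertools import combinations

# Group means from the nasal spray data table.
means = {"A": 12.6, "B": 34.6, "C": 13.0, "D": 16.8, "E": 28.4}

q_crit = 4.23186   # Critical Value of Studentized Range, copied from the output
mse = 50.32        # Error Mean Square
n_per_group = 5

# Minimum significant difference for equal group sizes.
msd = q_crit * math.sqrt(mse / n_per_group)

# A pair is declared different when its means are further apart than the MSD.
significant = {
    (a, b): abs(means[a] - means[b]) > msd
    for a, b in combinations(sorted(means), 2)
}

print(round(msd, 3))
for (a, b), flag in significant.items():
    print(a, b, round(abs(means[a] - means[b]), 1), "different" if flag else "not different")
```

Pairs flagged "not different" are the ones that share a letter in the Tukey grouping.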

Nonparametric One-Way ANOVA

The NPAR1WAY Procedure: Nasal Spray Example

Wilcoxon Scores (Rank Sums) for Variable Improvement

Classified by Variable Type

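| Type | N | Sum of Scores | Expected Under H0 | Std Dev Under H0 | Mean Score |
|---|---|---|---|---|---|
| A | 5 | 36.50 | 65.0 | 14.682756 | 7.30 |
| B | 5 | 108.00 | 65.0 | 14.682756 | 21.60 |
| C | 5 | 35.00 | 65.0 | 14.682756 | 7.00 |
| D | 5 | 55.50 | 65.0 | 14.682756 | 11.10 |
| E | 5 | 90.00 | 65.0 | 14.682756 | 18.00 |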

Average scores were used for ties.

Kruskal-Wallis Test

Chi-Square 15.8695

DF 4

Pr > Chi-Square 0.0032
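
The rank sums and the Kruskal-Wallis statistic can be recomputed from the raw data. The following Python sketch ranks all 25 observations together (average ranks for ties), sums the ranks per spray, and applies the usual tie-corrected Kruskal-Wallis formula. The upper-tail chi-square probability uses a closed form that is valid only for 4 degrees of freedom, which is an assumption specific to this example; the variable names are illustrative.

```python
import math
from collections import Counter

# Airflow improvement, transcribed from the data table.
improvement = {
    "A": [15, 10, 16, 14, 8],
    "B": [25, 41, 37, 44, 26],
    "C": [21, 6, 9, 15, 14],
    "D": [16, 7, 24, 22, 15],
    "E": [24, 15, 39, 34, 30],
}

pooled = sorted(y for ys in improvement.values() for y in ys)
n_total = len(pooled)

# Average rank of each distinct value (tied values share the mean of their ranks).
first_pos = {}
for pos, y in enumerate(pooled, start=1):
    first_pos.setdefault(y, pos)
counts = Counter(pooled)
avg_rank = {y: first_pos[y] + (counts[y] - 1) / 2 for y in counts}

rank_sums = {g: sum(avg_rank[y] for y in ys) for g, ys in improvement.items()}

# Kruskal-Wallis statistic, then the correction for ties.
h = (12 / (n_total * (n_total + 1))
     * sum(r ** 2 / len(improvement[g]) for g, r in rank_sums.items())
     - 3 * (n_total + 1))
tie_correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
h_corrected = h / tie_correction

df = len(improvement) - 1
# Upper-tail chi-square probability; this closed form holds for df = 4 only.
p_value = math.exp(-h_corrected / 2) * (1 + h_corrected / 2)

print(rank_sums, round(h_corrected, 4), df, round(p_value, 4))
```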

## 3. Forensic dental X-rays

The extent to which X-rays can penetrate tooth enamel has been suggested as a suitable mechanism for differentiating between females and males in forensic medicine (e.g., think about shows like ‘CSI’ and parts of ‘NCIS’). The table below gives spectropenetration gradients for one tooth from each of eight females and eight males.

| Gender | Y = spectropenetration gradient | Mean | Std. dev. |
|---|---|---|---|
| Female | 4.8 5.3 3.7 4.1 5.6 4.0 3.6 5.0 | 4.5125 | 0.7605 |
| Male | 4.9 5.4 5.0 5.5 5.4 6.6 6.3 4.3 | 5.4250 | 0.7440 |

Note that a high reading reflects a fast drop-off in X-ray penetration, with less penetration by X-rays.

(a) Explain why the teeth have been sampled from eight different people of each sex, and not eight teeth from one female and eight from one male.

(b) Given that the researcher could afford to test n = 16 subjects, explain the advantages of choosing eight from each group.

(c) SAS output from an ANOVA is on pages 9 and 10. Write a report, following the guidelines on page 1.

(d) Explain why there is no point doing a Tukey test with this data.


One-Way Analysis of Variance

Results: X-ray Penetration Gradient

The ANOVA Procedure

Class Level Information

| Class | Levels | Values |
|---|---|---|
| Gender | 2 | Female Male |

Number of Observations Read 16

Number of Observations Used 16

Dependent Variable: Xray_grad

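| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 1 | 3.33062500 | 3.33062500 | 5.88 | 0.0294 |
| Error | 14 | 7.92375000 | 0.56598214 | | |
| Corrected Total | 15 | 11.25437500 | | | |

| R-Square | Coeff Var | Root MSE | Xray_grad Mean |
|---|---|---|---|
| 0.295940 | 15.14099 | 0.752318 | 4.968750 |

| Source | DF | Anova SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Gender | 1 | 3.33062500 | 3.33062500 | 5.88 | 0.0294 |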
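
With only two groups, the one-way ANOVA is equivalent to a pooled two-sample t test: the Error Mean Square is the pooled variance, and the F value is the square of the t statistic. The Python sketch below illustrates this using only the group means and standard deviations reported for the X-ray data; the variable names are illustrative.

```python
import math

n = 8  # teeth per group
mean_f, sd_f = 4.5125, 0.76052144   # Female mean and standard deviation
mean_m, sd_m = 5.4250, 0.74402381   # Male mean and standard deviation

# Pooled variance; with equal group sizes this is the average of the two
# sample variances, and it should equal the ANOVA Error Mean Square.
pooled_var = ((n - 1) * sd_f ** 2 + (n - 1) * sd_m ** 2) / (2 * n - 2)

# Pooled two-sample t statistic on 2n - 2 = 14 degrees of freedom.
t_stat = (mean_m - mean_f) / math.sqrt(pooled_var * (1 / n + 1 / n))

# Squaring t should give the ANOVA F value.
f_from_t = t_stat ** 2

print(round(pooled_var, 6), round(t_stat, 3), round(f_from_t, 2))
```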

The ANOVA Procedure

X-ray Penetration Gradient

Levene’s Test for Homogeneity of Xray_grad Variance

ANOVA of Squared Deviations from Group Means

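| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Gender | 1 | 0.00189 | 0.00189 | 0.01 | 0.9305 |
| Error | 14 | 3.3504 | 0.2393 | | |

| Level of Gender | N | Xray_grad Mean | Xray_grad Std Dev |
|---|---|---|---|
| Female | 8 | 4.51250000 | 0.76052144 |
| Male | 8 | 5.42500000 | 0.74402381 |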

## 4. Personality types

In psychology, there are tests to classify people into one of many personality types. An experiment is run to find the extent of the influence of personality type on the subject’s score in a certain test. A random sample of four personality types is taken, and within each type a random sample of ten subjects is taken. Each subject is given the test, and the score Y is recorded, with data as follows:

| Type | Test Score, Y |
|---|---|
| T1 | 50 52 44 49 60 51 40 41 54 39 |
| T2 | 63 45 48 49 65 55 47 58 57 56 |
| T3 | 50 52 47 48 44 56 55 39 51 53 |
| T4 | 39 38 51 50 53 53 59 41 45 48 |

(a) Explain why this is a random effects design, rather than a fixed effects design.

(b) Some SAS output is given on pages 12 and 13. Note that the boxplots do not include estimates of group means, since any differences in population means are not the focus of this investigation.

Present a report and your conclusions. Include in your report comments on whether the relevant assumptions seem satisfied. Give your estimated components of variance, plus the percentage of the total variance of Y that is due to personality, along with the percentage unexplained.

Do you think personality type is important in determining the score on this particular test?

SAS Output for Personality Type Example

Box Plot

One-Way Analysis of Variance

Results

The ANOVA Procedure

Class Level Information

| Class | Levels | Values |
|---|---|---|
| PersType | 4 | T1 T2 T3 T4 |

Number of Observations Read 40

Number of Observations Used 40

Dependent Variable: Score

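| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 3 | 279.675000 | 93.225000 | 2.23 | 0.1017 |
| Error | 36 | 1506.700000 | 41.852778 | | |
| Corrected Total | 39 | 1786.375000 | | | |

| R-Square | Coeff Var | Root MSE | Score Mean |
|---|---|---|---|
| 0.156560 | 12.97117 | 6.469372 | 49.87500 |

| Source | DF | Anova SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| PersType | 3 | 279.6750000 | 93.2250000 | 2.23 | 0.1017 |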
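
For a balanced one-way random effects model, the components of variance are commonly estimated by the method of moments: the Error Mean Square estimates the within-type variance, and the difference between the two mean squares, divided by the number of subjects per type, estimates the between-type variance. The Python sketch below applies those standard formulas to the mean squares in the table. It is an illustration of the arithmetic under that assumption, not a prescribed method from the Assignment Guidelines, and the variable names are illustrative.

```python
# Mean squares copied from the personality type ANOVA output.
ms_between = 93.225     # PersType (Model) Mean Square
ms_error = 41.852778    # Error Mean Square
n_per_type = 10         # subjects sampled within each personality type

# Method-of-moments estimates for a balanced one-way random effects model.
var_error = ms_error
var_type = max((ms_between - ms_error) / n_per_type, 0.0)  # truncated at zero

var_total = var_type + var_error
pct_type = 100 * var_type / var_total   # percentage attributed to personality type
pct_error = 100 - pct_type              # percentage left unexplained

print(round(var_type, 4), round(var_error, 4), round(pct_type, 2), round(pct_error, 2))
```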


Levene’s Test for Homogeneity of Score Variance

ANOVA of Squared Deviations from Group Means

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| PersType | 3 | 2400.7 | 800.2 | 0.49 | 0.6926 |
| Error | 36 | 59004.7 | 1639.0 | | |

## 5. Phytoremediation

Phytoremediation (New Scientist, 20 Dec 1997, p.26) is a process by which plants are used to remove toxic metals from the soil. For example, sunflowers were used around Chernobyl, where there was radioactive contamination from a nuclear power station accident.

Certain plants take up toxic metals (e.g. zinc, cadmium, uranium) and accumulate them in their vacuoles as protection against chewing insects and infection.

Suppose that four species of plant were tested, at lower and higher soil pH, for their uptake of zinc, Y, measured in parts per million (ppm) of dry plant weight at the end of the trial.

Uptake of zinc, Y, (ppm):

| Plant Name | Soil pH 5.5 (acid) | Soil pH 7 (neutral) |
|---|---|---|
| Lettuce | 250 470 330 | 400 310 430 |
| Martin red fescue | 2850 2380 3130 | 1070 960 1300 |
| Alpine pennycress | 6340 4280 5170 | 2880 4330 3050 |
| Bladder campion | 3690 4750 5100 | 2360 1990 2140 |

(a) What kind of design is this? Give the model equation, including an interaction term.

(b) SAS analysis of the data using the model from part (a) was tried on both raw data Y and transformed data log Y. Diagnostic graphs from both analyses are given on pages 15 and 16. Explain, with reasons, whether it is better to analyse Y or log Y.

(c) Further SAS output is given on pages 17 to 19. Present a report and your conclusions, following the usual guidelines. Use a 5% significance level.

SAS Output for Phytoremediation Example


Linear Models

The GLM Procedure

Class Level Information

| Class | Levels | Values |
|---|---|---|
| pH | 2 | acid neutral |
| Plant | 4 | AlpineP BladderC Lettuce MartinRF |

Number of Observations Read 24

Number of Observations Used 24

Dependent Variable: logZinc

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 7 | 24.03027190 | 3.43289599 | 92.71 | <.0001 |
| Error | 16 | 0.59245521 | 0.03702845 | | |
| Corrected Total | 23 | 24.62272711 | | | |

| R-Square | Coeff Var | Root MSE | logZinc Mean |
|---|---|---|---|
| 0.975939 | 2.589699 | 0.192428 | 7.430507 |

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| pH | 1 | 1.46932364 | 1.46932364 | 39.68 | <.0001 |
| Plant | 3 | 21.65807393 | 7.21935798 | 194.97 | <.0001 |
| pH*Plant | 3 | 0.90287433 | 0.30095811 | 8.13 | 0.0016 |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| pH | 1 | 1.46932364 | 1.46932364 | 39.68 | <.0001 |
| Plant | 3 | 21.65807393 | 7.21935798 | 194.97 | <.0001 |
| pH*Plant | 3 | 0.90287433 | 0.30095811 | 8.13 | 0.0016 |
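
Because the design is balanced (three replicates in every pH by plant cell), the Type I and Type III sums of squares coincide, and they can be recomputed with the textbook formulas for a two-factor layout with interaction. The Python sketch below does this on the log-transformed zinc data. It assumes that `logZinc` is the natural logarithm, which is consistent with the reported logZinc Mean of 7.430507; the helper names `mean` and `pooled` are illustrative.

```python
import math

# Zinc uptake (ppm) transcribed from the data table: (plant, pH) -> replicates.
zinc = {
    ("Lettuce", "acid"): [250, 470, 330],
    ("Lettuce", "neutral"): [400, 310, 430],
    ("Martin red fescue", "acid"): [2850, 2380, 3130],
    ("Martin red fescue", "neutral"): [1070, 960, 1300],
    ("Alpine pennycress", "acid"): [6340, 4280, 5170],
    ("Alpine pennycress", "neutral"): [2880, 4330, 3050],
    ("Bladder campion", "acid"): [3690, 4750, 5100],
    ("Bladder campion", "neutral"): [2360, 1990, 2140],
}
# Assumption: logZinc is the natural log of the zinc uptake.
log_zinc = {cell: [math.log(y) for y in ys] for cell, ys in zinc.items()}


def mean(xs):
    return sum(xs) / len(xs)


def pooled(keep):
    """All log values from cells whose (plant, pH) key satisfies keep."""
    return [y for cell, ys in log_zinc.items() if keep(cell) for y in ys]


plants = sorted({plant for plant, _ in log_zinc})
phs = sorted({ph for _, ph in log_zinc})
reps = 3
n_total = reps * len(plants) * len(phs)

grand = mean(pooled(lambda cell: True))
plant_mean = {a: mean(pooled(lambda cell, a=a: cell[0] == a)) for a in plants}
ph_mean = {b: mean(pooled(lambda cell, b=b: cell[1] == b)) for b in phs}
cell_mean = {cell: mean(ys) for cell, ys in log_zinc.items()}

# Sums of squares for a balanced two-factor design with interaction.
ss_ph = reps * len(plants) * sum((ph_mean[b] - grand) ** 2 for b in phs)
ss_plant = reps * len(phs) * sum((plant_mean[a] - grand) ** 2 for a in plants)
ss_inter = reps * sum(
    (cell_mean[(a, b)] - plant_mean[a] - ph_mean[b] + grand) ** 2
    for a in plants for b in phs
)
ss_error = sum((y - cell_mean[cell]) ** 2 for cell, ys in log_zinc.items() for y in ys)

ms_error = ss_error / (n_total - len(log_zinc))
f_ph = ss_ph / (len(phs) - 1) / ms_error
f_plant = ss_plant / (len(plants) - 1) / ms_error
f_inter = ss_inter / ((len(phs) - 1) * (len(plants) - 1)) / ms_error

print(round(grand, 6), round(ss_ph, 6), round(ss_plant, 6), round(ss_inter, 6), round(ss_error, 6))
print(round(f_ph, 2), round(f_plant, 2), round(f_inter, 2))
```

The p-values are not computed here, since that would need the F distribution.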