Advanced experimental design and analysis 1

Sampling, Regression, Experimental Design and Analysis for
Environmental Scientists, Biologists, and Resource Managers

C. J. Schwarz
Department of Statistics and Actuarial Science, Simon Fraser University
cschwarz@stat.sfu.ca
December 2, 2011

Contents
100 Logistic Regression - Advanced Topics
100.1 Introduction
100.2 Sacrificial pseudo-replication
100.3 Example: Fox-proofing mice colonies
100.3.1 Using the simple proportions as data
100.3.2 Logistic regression using overdispersion
100.3.3 GLIMM modeling the random effect of colony
100.4 Example: Over-dispersed Seeds Germination Data


Chapter 100

Logistic Regression - Advanced Topics
100.1 Introduction
The previous chapters on chi-square tests, logistic regression, and logistic ANOVA considered only the
simplest experimental designs, where the data were collected under a completely randomized design (CRD), i.e.
every observation is independent of every other observation, with complete randomization over experimental
units and treatments.
It is possible to extend logistic regression and logistic ANOVA to more complex experimental designs.
My course notes for the graduate course Stat-805 (http://www.stat.sfu.ca/~cschwarz/Stat-805)
have some details on these more advanced topics.
It is only recently that software has become readily available to analyze these types of experiments. The
illustrations below will use Proc GLIMMIX, available in SAS v.9.1.3 or higher.
In this chapter some variations from the simple CRD will be discussed.

100.2 Sacrificial pseudo-replication
In many experiments, the experimental unit is a collection of individuals, but measurements take place on
the individual.
Hurlbert (1984) cites the example of an experiment to investigate the effect of fox predation upon the
sex ratio of mice. Four colonies of mice are established. Two of the colonies are randomly chosen and a
fox-proof fence is erected around the plots. The other two colonies serve as controls without any fencing.

Here are the data (Table 6 of Hurlbert (1984)):

Treatment   Colony   % Males   Number males   Number females
Foxes       A1       63%        22             13
Foxes       A2       56%         9              7
No foxes    B1       60%        15             10
No foxes    B2       43%        97            130


This data has the characteristics of a chi-square test or logistic ANOVA. The factor (type of fencing) is
categorical. The response, the sex of the mouse, is also categorical. Many researchers would simply pool
over the replicates to give the pooled table:

Treatment   Colony    % Males   Number males   Number females
Foxes       A1 + A2   61%        31             20
No foxes    B1 + B2   44%       112            140


If a χ2 test is applied to the pooled data, the p-value is less than 5% indicating there is evidence that the
sex ratio is not independent of the presence of foxes.
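The (invalid) pooled test can be reproduced by hand as a check on the arithmetic. The following is a minimal sketch in Python, not part of the SAS programs used in these notes; it assumes a Pearson chi-square test without continuity correction and uses the closed form for the upper tail of a chi-square distribution with 1 degree of freedom.

```python
import math

# Pooled 2x2 table from the text: rows = treatment, columns = (males, females)
pooled = {"foxes": (31, 20), "no foxes": (112, 140)}

def pearson_chi_square(table):
    """Pearson chi-square statistic (no continuity correction) for a two-way table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

chi2 = pearson_chi_square(pooled)
# For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.3f}")
```

The statistic is roughly 4.5 with a p-value a little over 3%, consistent with the statement that the pooled p-value is below 5%.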
This “pooled analysis” is INCORRECT. According to Hurlbert (1984), the major problem is that the individual
units (the mice) are treated as independent objects when, in fact, they are not. Experimenters often
pool experimental units from disparate sets of observations in order to do simple chi-square tests or logistic
ANOVA. He specifically labels this pooling as sacrificial pseudo-replication.
Hurlbert (1984) identifies at least 4 reasons why the pooling is not valid:
• non-independence of observations. The 35 mice caught in A1 can be regarded as 35 observations
all subject to a common cause, as can the 16 mice in A2, as each group was subject to a common
influence in its patch. Consequently, the pooled mice are NOT independent; they represent two sets
of interdependent or correlated observations. The pooled data set violates the fundamental assumption
of independent observations.
• throws away some information. The pooling throws out the information on the variability among
replicate plots. Without such information there is no proper way to assess the significance of the differences between treatments. Note that in previous cases of ordinary pseudo-replication (e.g. multiple
fish within a tank), this information is also discarded but is not needed - what is needed is the variation among tanks, not among fish. In the latter case, averaging over the pseudo-replicates causes no
problems.
• confusion of experimental and observational units. If one carries out a test on the pooled data,
one is implicitly redefining the experimental unit to be individual mice and not the field plots. The
enclosures (treatments) are applied at the plot level and not the mouse level. This is similar to the
problem of multiple fish within a tank that is subject to a treatment.

© 2012 Carl James Schwarz

• unequal weighting. Pooling weights the replicate plots differentially. For example, suppose that one
enclosure had 1000 mice with 90% being male, and a second enclosure had 10 mice with 10% being
male. The pooled data would have 1000 + 10 mice with 900 + 1 being male, for an overall male
percentage of about 89%. Had the two enclosures been given equal weight, the average male percentage would be
(90%+10%)/2=50%. In the above example, the number of mice captured in the plots varies from 16
to over 200; the plot with over 200 mice essentially drives the results.
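The effect of unequal weighting can be checked with a few lines of arithmetic. This is a small illustrative sketch in Python (not part of the SAS analysis); it uses the hypothetical enclosures from the bullet above and then repeats the comparison for the two "no foxes" colonies in Hurlbert's table.

```python
# Hypothetical enclosures from the text: (total mice, proportion male)
enclosures = [(1000, 0.90), (10, 0.10)]

total_mice = sum(n for n, _ in enclosures)
total_males = sum(n * p for n, p in enclosures)
pooled_prop = total_males / total_mice                      # large enclosure dominates
equal_weight_prop = sum(p for _, p in enclosures) / len(enclosures)

# Same comparison for the two "no foxes" colonies B1 and B2: (males, females)
no_fox = [(15, 10), (97, 130)]
pooled_no_fox = sum(m for m, _ in no_fox) / sum(m + f for m, f in no_fox)
mean_no_fox = sum(m / (m + f) for m, f in no_fox) / len(no_fox)

print(f"hypothetical: pooled {pooled_prop:.3f}, equal weight {equal_weight_prop:.3f}")
print(f"no foxes:     pooled {pooled_no_fox:.3f}, equal weight {mean_no_fox:.3f}")
```

In both cases the pooled proportion sits close to the value from the largest plot, while the equally weighted mean treats each plot as one replicate.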
There are multiple ways to analyze these data that avoid the problems that render the pooled analysis
invalid. Unfortunately, JMP 7.0 does NOT have the ability to properly analyze this type of data. SAS will
be used to illustrate the various options in the sections that follow.

100.3 Example: Fox-proofing mice colonies
Hurlbert (1984) cites the example of an experiment to investigate the effect of fox predation upon the sex
ratio of mice. Four colonies of mice are established. Two of the colonies are randomly chosen and a fox-proof
fence is erected around the plots. The other two colonies serve as controls without any fencing.
Here are the data (Table 6 of Hurlbert (1984)):

Treatment   Colony   % Males   Number males   Number females
Foxes       A1       63%        22             13
Foxes       A2       56%         9              7
No foxes    B1       60%        15             10
No foxes    B2       43%        97            130


We begin by reading in the data:

/* one record per colony x sex; count = number of mice */
data mice;
   length treatment $10.;
   input colony $ treatment $ sex $ count;
   datalines;
a1 foxes m 22
a1 foxes f 13
a2 foxes m 9
a2 foxes f 7
b1 no.foxes m 15
b1 no.foxes f 10
b2 no.foxes m 97
b2 no.foxes f 130
;;;;


100.3.1 Using the simple proportions as data
Hurlbert (1984) suggests the proper way to analyze the above experiment is to essentially compute a single
number for each plot and then do a two-sample t-test on the percentages. [This is equivalent to the ordinary
averaging process that takes place in ordinary pseudo-replication or sub-sampling.]
We can have SAS compute the proportion of males directly using Proc Transpose:

/* transpose the data to compute the proportions */
proc sort data=mice;
   by colony treatment;
run;

proc transpose data=mice out=trans_mice;
   by colony treatment;
   var count;
   id sex;
run;

/* compute the proportion of males in each colony */
data trans_mice;
   set trans_mice;
   p_males = m/(m+f);
   drop _name_;
   format m f 5.0 p_males 7.3;
run;

This gives:

colony   treatment     m     f   p_males
a1       foxes        22    13   0.629
a2       foxes         9     7   0.563
b1       no.foxes     15    10   0.600
b2       no.foxes     97   130   0.427


Proc Ttest is then used to analyze the data:

proc ttest data=trans_mice ci=none;
   title2 'Simple ttest on the proportion of males';
   class treatment;
   var p_males;
   ods output Statistics=ttest_statistics;
   ods output TTests=ttest_tests;
run;

The simple summary statistics are:

                                              Std      Lower Limit   Upper Limit
Variable   treatment     N     Mean      Error    of Mean       of Mean
p_males    foxes         2     0.5955    0.0330    0.1758        1.0153
p_males    no.foxes      2     0.5137    0.0863   -0.5834        1.6108
p_males    Diff (1-2)    _     0.0819    0.0924   -0.3159        0.4796

The results from a simple t-test conducted in SAS are:

Variable   Method          Variances   t Value   DF       Pr > |t|
p_males    Pooled          Equal       0.89      2        0.4692
p_males    Satterthwaite   Unequal     0.89      1.2866   0.5102


The estimated difference in the proportion of males between colonies that are subject to fox predation and colonies
not subject to fox predation is .082 (SE .092), with p-values of .47 (pooled t-test) and .51 (unpooled t-test).
As the p-values are quite large, there is NO evidence of a predation effect.
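The pooled-variance t-test on the four colony proportions is simple enough to verify by hand. The sketch below is in Python rather than SAS and is only a cross-check; the p-value uses a closed form that holds for a t distribution with exactly 2 degrees of freedom, which happens to be the case here.

```python
import math
from statistics import mean, variance

# (males, females) per colony, from Table 6 of Hurlbert (1984) as given above
foxes = [(22, 13), (9, 7)]
no_foxes = [(15, 10), (97, 130)]

def prop_males(colonies):
    """Proportion of males in each colony."""
    return [m / (m + f) for m, f in colonies]

p1, p2 = prop_males(foxes), prop_males(no_foxes)
n1, n2 = len(p1), len(p2)
diff = mean(p1) - mean(p2)

# Pooled-variance two-sample t-test on the colony proportions
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * variance(p1) + (n2 - 1) * variance(p2)) / df
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t = diff / se

# Valid only because df = 2 here:
# two-sided p-value = 1 - |t| / sqrt(2 + t^2)
p_value = 1 - abs(t) / math.sqrt(2 + t * t)

print(f"diff = {diff:.4f}, SE = {se:.4f}, t = {t:.2f}, df = {df}, p = {p_value:.4f}")
```

The difference, standard error, t-value and p-value agree with the pooled line of the Proc Ttest output to the precision shown there.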
With only two replicates (the colonies), this experiment is likely to have very poor power to detect
anything but gross differences.
The above analysis is not entirely satisfactory. The proportions of males have different variabilities because
they are based on different numbers of mice. As well, there may be overdispersion among colonies
under the same treatment, i.e. the variation in the proportion of males among the two colonies
under the same treatment may be larger than expected from binomial sampling alone.

100.3.2 Logistic regression using overdispersion
Another “approximate” method to deal with the potential overdispersion among the colonies within the same
treatment group (the colony effect) is to use a standard logistic regression but use the goodness-of-fit test to
estimate an overdispersion effect. This overdispersion is then used to adjust the standard errors of estimates
and the test statistics for hypothesis tests. Please consult the chapter on Logistic Regression for more details.
Proc Genmod is then used to analyze the data:


proc genmod data=mice descending;
   title2 'Generalized Linear Model allowing for overdispersion';
   class treatment colony;
   model sex = treatment /
         dist=binomial link=logit dscale aggregate=colony type3;
   freq count;
   lsmeans treatment / diff cl;
   ods output ParameterEstimates=GenModParameterEstimates;
   ods output Type3=GenModType3;
   ods output Diffs=GenModLSmeansDiff;
run;

The Dscale option on the model statement indicates that the deviance goodness-of-fit test is used to estimate
the overdispersion factor.
The parameter estimates table is:

                                         Standard   Lower 95%    Upper 95%    Wald
Parameter   Level1     DF   Estimate    Error      Conf Limit   Conf Limit   Chi-Square   Pr > ChiSq
Intercept              1    -0.2231     0.1527     -0.5225      0.0762       2.13         0.1441
treatment   foxes      1     0.6614     0.3778     -0.0791      1.4019       3.06         0.0800
treatment   no.foxes   0     0.0000     0.0000      0.0000      0.0000       .            .
Scale                  0     1.2049     0.0000      1.2049      1.2049       _            _

The bottom line (Scale) estimates the overdispersion factor as 1.20. This implies that standard errors are
inflated by a factor of 1.20 and chi-square test statistics for effect tests are deflated by a factor of
1.20 squared (about 1.45).
The test for a treatment effect:

Source      Num DF   Den DF   F Value   Pr > F   Chi-Square   Pr > ChiSq   Method
treatment   1        2        3.14      0.2185   3.14         0.0765       LR

gives a p-value of .0765 (chi-square test) or .2185 (F-test), again indicating no evidence of an effect.

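The Scale value of 1.2049 and the inflated standard error of 0.3778 in the Proc Genmod output can be reproduced from the colony counts. The sketch below is in Python rather than SAS and rests on one assumption: that the deviance-based scale is the square root of the deviance (computed over the four colonies) divided by its degrees of freedom, here 4 colonies minus 2 fitted parameters.

```python
import math

# (males, females) per colony, grouped by treatment
data = {
    "foxes": [(22, 13), (9, 7)],
    "no.foxes": [(15, 10), (97, 130)],
}

deviance = 0.0
n_colonies = 0
pooled = {}
for trt, colonies in data.items():
    males = sum(m for m, _ in colonies)
    females = sum(f for _, f in colonies)
    pooled[trt] = (males, females)
    p_hat = males / (males + females)    # fitted proportion of males for this treatment
    for m, f in colonies:
        n = m + f
        deviance += 2 * (m * math.log(m / (n * p_hat))
                         + f * math.log(f / (n * (1 - p_hat))))
        n_colonies += 1

df = n_colonies - len(data)              # 4 colonies - 2 fitted parameters
scale = math.sqrt(deviance / df)

# Treatment effect (log odds ratio) and its ordinary binomial standard error
log_or = (math.log(pooled["foxes"][0] / pooled["foxes"][1])
          - math.log(pooled["no.foxes"][0] / pooled["no.foxes"][1]))
se_binomial = math.sqrt(sum(1 / c for cell in pooled.values() for c in cell))
se_scaled = scale * se_binomial          # standard error after the overdispersion adjustment

print(f"deviance = {deviance:.3f} on {df} df, scale = {scale:.4f}")
print(f"log odds ratio = {log_or:.4f}, SE = {se_binomial:.4f} -> {se_scaled:.4f}")
```

Under this assumption the standard error is multiplied by the scale itself, so a chi-square statistic is divided by the square of the scale.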
The estimated treatment effect (on the logit scale) is:

Effect      treatment   _treatment   Est      SE       z Value   Pr > |z|   Alpha   Lower      Upper
treatment   foxes       no.foxes     0.6614   0.3778   1.75      0.0800     0.05    -0.07912   1.4019


The estimate of .66 implies that the odds ratio for being male, comparing colonies with foxes to colonies
without foxes, is exp(.66) = 1.94, but the 95% confidence interval for the odds ratio is from exp(−.0791) =
.92 to exp(1.40) = 4.06, which includes the value of 1 (indicating no difference in the odds of males).
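The back-transformation from the logit scale to the odds-ratio scale is just exponentiation of the estimate and of each confidence limit. A short Python sketch of the arithmetic follows; the inputs are the values printed in the Proc Genmod output, and 1.96 is used as the normal multiplier for a Wald 95% interval.

```python
import math

# Logit-scale treatment effect and its (overdispersion-adjusted) standard error
estimate, se = 0.6614, 0.3778

# Wald 95% confidence limits on the logit scale
lower = estimate - 1.96 * se
upper = estimate + 1.96 * se

# Exponentiate to move to the odds-ratio scale
odds_ratio = math.exp(estimate)
ci = (math.exp(lower), math.exp(upper))
includes_one = ci[0] < 1 < ci[1]         # True means no evidence of a difference in odds

print(f"logit scale: {estimate:.4f} ({lower:.4f}, {upper:.4f})")
print(f"odds ratio:  {odds_ratio:.2f} ({ci[0]:.2f}, {ci[1]:.2f}), includes 1: {includes_one}")
```

An interval that contains 0 on the logit scale always maps to an interval that contains 1 on the odds-ratio scale, so the two statements of "no evidence of an effect" are equivalent.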
The use of a simple overdispersion factor is not completely satisfactory. It assumes a single correction
factor for all of the estimates and again does not properly account for the differing numbers of mice in each colony.

100.3.3 GLIMM modeling the random effect of colony
A more “refined” analysis is now available using Generalized Linear Mixed Models (GLIMM) which have
been implemented in SAS.
GLIMM allow the specification of random effects in much the same way as in advanced ANOVA models.
This is a very general treatment and now allows us to analyze data from very complex experimental designs.
The model is specified as:
logit(p_males) = Treatment   Colony(Treatment)(R)
where Colony(Treatment)(R) is the random effect of the experimental units (the colonies nested within
treatments). A logistic-type model is used.
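One way to read this model is as a recipe for generating data: each colony gets its own random shift on the logit scale, and the mice within that colony are then independent given the shift. The Python sketch below simulates from such a model. The function name, the parameter values (intercept, treatment effect, colony standard deviation) and the seed are illustrative assumptions, not estimates taken from the SAS output.

```python
import math
import random

def inv_logit(x):
    """Convert a logit to a probability."""
    return 1 / (1 + math.exp(-x))

def simulate_colonies(sizes, beta0, beta_trt, sd_colony, seed=1):
    """Simulate the number of males per colony.

    sizes maps treatment -> list of colony sizes.
    Returns a list of (treatment, n, males) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for trt, colony_sizes in sizes.items():
        fixed = beta0 + (beta_trt if trt == "foxes" else 0.0)
        for n in colony_sizes:
            u = rng.gauss(0.0, sd_colony)     # random colony effect on the logit scale
            p = inv_logit(fixed + u)          # colony-specific probability of a male
            males = sum(rng.random() < p for _ in range(n))
            rows.append((trt, n, males))
    return rows

# Colony sizes as in the fox example; all other values are hypothetical
sizes = {"foxes": [35, 16], "no.foxes": [25, 227]}
sim = simulate_colonies(sizes, beta0=-0.2, beta_trt=0.5, sd_colony=0.25)
for trt, n, males in sim:
    print(f"{trt:9s} n = {n:3d}  males = {males:3d}  proportion = {males / n:.3f}")
```

Because the colony effect is shared by every mouse in a colony, the simulated mice are correlated within a colony, which is exactly the feature that the pooled analysis ignores.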
This is specified in SAS as:

proc glimmix data=mice;
   title2 'Glimmix analysis';
   class treatment colony;
   model sex(event='m') = treatment /
         distribution=binary link=logit ddfm=kr;
   random colony(treatment);
   freq count;
   lsmeans treatment / cl ilink;
   lsmestimate treatment "trt effect" 1 -1 / cl;
   ods output CovParms=GlimmixCovParms;
   ods output Tests3=GlimmixTests3;
   ods output LSMestimates=GlimmixLSMestimates;
run;

The output from GLIMMIX (SAS 9.3.1) follows. First is an estimate of the variability among colonies
(on the logit scale):

Parameter           Estimate   Standard Error
colony(treatment)   0.06892    0.1675


Next is the test of the overall treatment effect:

Effect      Num DF   Den DF   F Value   Pr > F
treatment   1        1.847    1.22      0.3919


The p-value is .39; again there is no evidence of a predation effect on the proportion of males in the colonies.
Finally, an estimate of the treatment effect:

                                     Standard
Effect      Label        Estimate   Error      DF      t Value   Pr > |t|   Alpha   Lower     Upper
treatment   trt effect   0.5269     0.4763     1.847   1.11      0.3919     0.05    -1.6940   2.7478


Some caution is required. The estimate of .53 (SE .48) is the difference in the logit of the proportion of
males between colonies with foxes and colonies without foxes. If you take exp(.53) = 1.69, this is the
estimated odds ratio of males to females comparing colonies with predators to colonies without predators.
The 95% confidence interval for the odds ratio is exp(−1.6940) = .184 to exp(2.7478) = 15.61, which includes
the value of 1 (indicating no effect).
Consult the chapter on logistic regression for an explanation of odds and odds-ratios.

100.4 Example: Over-dispersed Seeds Germination Data
These data are from the SAS manual.
In a seed germination test, seeds of two cultivars were planted in pots of two soil conditions. The following
data contain the observed number of seeds that germinated for various combinations of cultivar
and soil condition. Variable n represents the number of seeds planted in a pot, and r represents the number
germinated. CULT and SOIL are indicator variables, representing the cultivar and soil condition,
respectively.

Pot    n    r   Cult   Soil
  1   16    8   0      0
  2   51   26   0      0
  3   45   23   0      0
  4   39   10   0      0
  5   36    9   0      0
  6   81   23   1      0
  7   30   10   1      0
  8   39   17   1      0
  9   28    8   1      0
 10   62   23   1      0
 11   51   32   0      1
 12   72   55   0      1
 13   41   22   0      1
 14   12    3   0      1
 15   13   10   0      1
 16   79   46   1      1
 17   30   15   1      1
 18   51   32   1      1
 19   74   53   1      1
 20   56   12   1      1


The SAS program that analyzed this data is available in the file germination.sas in the Sample Program
Library at: http://www.stat.sfu.ca/~cschwarz/Stat-650/Notes/MyPrograms.
Notice that the experimental unit is the pot (i.e. soil and cult were applied at the pot level), but the
observational unit (what is actually measured) is the individual seed. The response variable for each individual
seed is either yes or no, depending on whether or not it germinated.
First, how big is the pot random effect? One way to estimate this would be to compare the variation
of $\hat{p}$ among pots within the same soil-cultivar combination with the theoretical variation based on binomial
sampling within each pot. In order to account for the differing sample sizes in each pot, we will compute a
“standardized normal” variable for pot i within soil-cultivar combination j as:

$$z_{ij} = \frac{\hat{p}_{ij} - p_j}{\sqrt{\hat{p}_{ij}(1 - \hat{p}_{ij})/n_{ij}}}$$

where $p_j = \sum_i r_{ij} / \sum_i n_{ij}$ is the average germination rate for the soil-cultivar combination j.
If the additional pot-to-pot random variation was negligible, then Z should have an approximate standard
normal distribution with a variance of 1. The actual variance of Z was found to be 4.5, indicating that the
pot-to-pot variation in $\hat{p}$ was about 4× larger than expected from simple binomial variation.
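The variance of Z can be recomputed directly from the table of pots. The Python sketch below is a cross-check rather than the author's SAS program (germination.sas); it assumes that each pot's standardized value is its observed proportion minus the pooled germination rate of its soil-cultivar combination, divided by the binomial standard error based on the pot's own proportion, and that the sample variance (divisor 19) is wanted.

```python
import math
from statistics import variance

# (n, r, cult, soil) for pots 1-20, copied from the table above
pots = [
    (16, 8, 0, 0), (51, 26, 0, 0), (45, 23, 0, 0), (39, 10, 0, 0), (36, 9, 0, 0),
    (81, 23, 1, 0), (30, 10, 1, 0), (39, 17, 1, 0), (28, 8, 1, 0), (62, 23, 1, 0),
    (51, 32, 0, 1), (72, 55, 0, 1), (41, 22, 0, 1), (12, 3, 0, 1), (13, 10, 0, 1),
    (79, 46, 1, 1), (30, 15, 1, 1), (51, 32, 1, 1), (74, 53, 1, 1), (56, 12, 1, 1),
]

# Pooled germination rate for each cultivar-soil combination
totals = {}
for n, r, cult, soil in pots:
    tn, tr = totals.get((cult, soil), (0, 0))
    totals[(cult, soil)] = (tn + n, tr + r)
p_combo = {key: tr / tn for key, (tn, tr) in totals.items()}

# Standardized deviation of each pot from its combination's pooled rate
z = []
for n, r, cult, soil in pots:
    p_hat = r / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # binomial SE using the pot's own p_hat
    z.append((p_hat - p_combo[(cult, soil)]) / se)

var_z = variance(z)                           # sample variance, divisor n - 1
print(f"variance of z = {var_z:.2f}")
```

Under these assumptions the variance comes out close to the 4.5 quoted in the text, far above the value of 1 expected under pure binomial sampling.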
Because of this extra-binomial variation, it is not proper to simply “ignore” the pot and pool over the
five pots for each cultivar-soil combination. This would be an example of sacrificial pseudo-replication as
outlined by Hurlbert (1984). As you will see below, the pot-to-pot variation in the proportion that germinate is
more than can be explained by simple binomial variation, i.e. there is a large random effect of pots that
must be incorporated.
A naive analysis could proceed by finding the proportion of seeds that germinated in each pot (e.g. for
pot 1, $\hat{p}$ = 8/16 = 0.50) and then doing a two-factor CRD analysis on these proportions using the model:
$\hat{p}$ = Soil   Cult   Soil*Cult
This is not satisfactory because the number of seeds in each pot (n) varies considerably from pot to pot, and
hence the variance of $\hat{p}$ also varies (see footnote 1). A weighted analysis could be performed which would
partially solve this problem.
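To see what weighting by n does to the fitted means, the cell and marginal means can be computed by hand. This is a Python sketch, not the Proc Mixed analysis itself. It assumes that, for a model containing both main effects and the interaction, the weighted fit reproduces the n-weighted mean of the pot proportions in each cell, and that a marginal mean for a soil condition is the simple average of its two cell means.

```python
# (n, r, cult, soil) for pots 1-20, copied from the table above
pots = [
    (16, 8, 0, 0), (51, 26, 0, 0), (45, 23, 0, 0), (39, 10, 0, 0), (36, 9, 0, 0),
    (81, 23, 1, 0), (30, 10, 1, 0), (39, 17, 1, 0), (28, 8, 1, 0), (62, 23, 1, 0),
    (51, 32, 0, 1), (72, 55, 0, 1), (41, 22, 0, 1), (12, 3, 0, 1), (13, 10, 0, 1),
    (79, 46, 1, 1), (30, 15, 1, 1), (51, 32, 1, 1), (74, 53, 1, 1), (56, 12, 1, 1),
]

# Group the pot proportions by (cult, soil) cell, keeping n as the weight
cells = {}
for n, r, cult, soil in pots:
    cells.setdefault((cult, soil), []).append((n, r / n))

# n-weighted mean of the pot proportions within each cell
cell_mean = {
    key: sum(n * p for n, p in obs) / sum(n for n, _ in obs)
    for key, obs in cells.items()
}

# Marginal mean for each soil condition: simple average over the two cultivars
soil_mean = {
    soil: (cell_mean[(0, soil)] + cell_mean[(1, soil)]) / 2 for soil in (0, 1)
}

for key in sorted(cell_mean):
    print(f"cult = {key[0]}, soil = {key[1]}: weighted cell mean = {cell_mean[key]:.4f}")
for soil in (0, 1):
    print(f"soil = {soil}: marginal mean = {soil_mean[soil]:.4f}")
```

With weights equal to n, each cell mean is simply the total germinated divided by the total planted in that cell; the soil 0 marginal mean computed this way agrees with the 0.3720 reported by Proc Mixed.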
This naive analysis could be done using the Proc Mixed code:

proc mixed data=seeds;
   title2 'naive analysis on the proportions in each plot - using n as weight';
   class cult soil;
   model phat = cult soil cult*soil / ddfm=kr;
   lsmeans soil cult soil*cult / diff adjust=tukey;
   weight n;
   ods output CovParms=NaiveMixedCovparms;
   ods output Tests3=NaiveMixedTests3;
   ods output LSmeans=NaiveMixedLSMeans;
run;

This gives an estimate of the residual variance of:

Cov Parm   Estimate
Residual   1.0057


The residual variance is a combination of pot-to-pot variance and the variability of the $\hat{p}$ in each pot.
As a rough guess, the average germination rate is around 0.5 with an average sample size of around 45.
This would give a binomial variance of .5(1 − .5)/45 ≈ .0056. Hence the pot-to-pot variance is about
.026 − .0056 ≈ .020, which is about 4× that of the binomial variance, as we saw earlier.
Footnote 1: The binomial standard error of each $\hat{p}$ for a pot would be found as $\sqrt{p(1 - p)/n}$,
i.e. a variance of $p(1 - p)/n$.

The tests of the main effects and interaction are:

Effect      Num DF   Den DF   F Value   Pr > F
cult        1        16        1.57     0.2287
soil        1        16       10.86     0.0046
cult*soil   1        16        0.05     0.8177


Hence the naive analysis finds no evidence of an interaction effect of soil and cultivar, no evidence of a
main effect of cultivar, but strong evidence of a main effect of soil upon the germination rate.
Here are the estimates of the marginal means:
Effect

cult

soil

Estimate

Standard
Error

DF

t
Value

Pr
>
|t|

soil

0

0.3720

0.04891

16

7.61