6 classical hypothesis testing theory

Classical Hypothesis
Testing Theory

Adapted from Alexander Senf

Review
• 5 steps of classical hypothesis testing (Ch. 3)
1. Declare null hypothesis H0 and alternate hypothesis H1
2. Fix a threshold α for Type I error (1% or 5%)
• Type I error (α): reject H0 when it is true
• Type II error (β): accept H0 when it is false
3. Determine a test statistic
• a quantity calculated from the data

7/31/2008

Review
4. Determine what observed values of the test statistic should lead to rejection of H0
• Significance point K (determined by α)
5. Test to see if observed data is more extreme than significance point K
• If it is, reject H0
• Otherwise, accept H0

Overview of Ch. 9
– Simple Fixed-Sample-Size Tests
– Composite Fixed-Sample-Size Tests
– The -2 log λ Approximation
– The Analysis of Variance (ANOVA)
– Multivariate Methods
– ANOVA: the Repeated Measures Case
– Bootstrap Methods: the Two-sample t-test
– Sequential Analysis

Simple Fixed-Sample-Size Tests

The Issue
• In the simplest case, everything is
specified
– Probability distribution of H0 and H1
• Including all parameters

– α (and K)
– But: β is left unspecified

• It is desirable to have a procedure that
minimizes β given a fixed α

– This would maximize the power of the test
• 1-β, the probability of rejecting H0 when H1 is true

Most Powerful Procedure
• Neyman-Pearson Lemma
– States that the likelihood-ratio (LR) test is the most powerful test for a given α
– The LR is defined as:

$$LR = \frac{f_1(X_1)\, f_1(X_2) \cdots f_1(X_n)}{f_0(X_1)\, f_0(X_2) \cdots f_0(X_n)}$$

– where
• f0, f1 are completely specified density functions for H0, H1
• X1, X2, …, Xn are iid random variables

Neyman-Pearson Lemma
– H0 is rejected when LR ≥ K
– With a constant K chosen such that:
P(LR ≥ K when H0 is true) = α
– Let’s look at an example using the
Neyman-Pearson Lemma!
– Then we will prove it.


Example
• Basketball players seem to
be taller than average

– Use this observation to
formulate our hypothesis H1:
• “Tallness is a factor in the
recruitment of KU basketball
players”

– The null hypothesis, H0, could be:
• “No, the players on KU’s team are just of average height compared to the population in the U.S.”
• “The average height of the team and of the population in general is the same”

Example
• Setup:
– Average height of males in the US: 5’9½”
– Average height of KU players in 2008: 6’4½”
• Assumption: both populations are normally distributed, centered on their respective averages (μ0 = 69.5 in, μ1 = 76.5 in), with σ = 2

$$f_0(x) = \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x-69.5)^2}{8}}, \qquad f_1(x) = \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x-76.5)^2}{8}}$$

• Sample size: 3
– Choose α: 5%

Example
• The two populations:
[Figure: densities f0 and f1 plotted against height (inches); vertical axis p]

Example
– Our test statistic is the Likelihood Ratio, LR

$$\Lambda(x) = \frac{f_1(x_1)\, f_1(x_2)\, f_1(x_3)}{f_0(x_1)\, f_0(x_2)\, f_0(x_3)} = \frac{\prod_{i=1}^{3} \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x_i-76.5)^2}{8}}}{\prod_{i=1}^{3} \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x_i-69.5)^2}{8}}} = e^{\frac{1}{8}\sum_{i=1}^{3}\left[(x_i-69.5)^2-(x_i-76.5)^2\right]}$$

– Now we need to determine a significance point K at which we can reject H0, given α = 5%
• P(Λ(x) ≥ K | H0 is true) = 0.05, determine K
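One way to pin down K is to note that the exponent of Λ is linear in the sample sum, so Λ ≥ K is equivalent to a threshold on x1 + x2 + x3, which under H0 is normal with mean 3 × 69.5 and variance 3σ². The Python sketch below follows that observation under the setup's assumptions (σ = 2, n = 3, α = 0.05). It is an illustrative alternative, not the calculation carried out on the slides, so its K need not coincide with theirs; all function names are made up for this example.

```python
import math
from statistics import NormalDist

MU0, MU1, SIGMA, N, ALPHA = 69.5, 76.5, 2.0, 3, 0.05

def log_lr(xs):
    """Log of the likelihood ratio for two normal densities with common sigma."""
    return sum((x - MU0) ** 2 - (x - MU1) ** 2 for x in xs) / (2 * SIGMA ** 2)

def sum_threshold(alpha=ALPHA):
    """Sample sum c with P(sum >= c | H0) = alpha."""
    # Under H0 the sum of N iid N(MU0, SIGMA^2) values is N(N*MU0, N*SIGMA^2).
    null_sum = NormalDist(mu=N * MU0, sigma=SIGMA * math.sqrt(N))
    return null_sum.inv_cdf(1 - alpha)

def significance_point(alpha=ALPHA):
    """K with P(LR >= K | H0) = alpha, using monotonicity of LR in the sum."""
    c = sum_threshold(alpha)
    # The log LR depends on the sample only through its sum,
    # so any sample with sum c (here c/N repeated N times) gives K.
    return math.exp(log_lr([c / N] * N))

threshold = sum_threshold()
k_exact = significance_point()
print("reject H0 when the sample sum is at least", round(threshold, 3))
print("corresponding K:", k_exact)
```

Working with the sum (or with log Λ) avoids evaluating a three-dimensional integral and keeps the numbers in a comfortable floating-point range.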

Example
– So we just need to solve for K’ and calculate K:

$$\int_{K_1'}^{\infty}\int_{K_2'}^{\infty}\int_{K_3'}^{\infty} f_0(x_1)\, f_0(x_2)\, f_0(x_3)\,dx_3\,dx_2\,dx_1 = 0.05$$

• How to solve this? Well, we only need one set of values to calculate K, so let’s pick two and solve for the third:

$$\int_{68}^{\infty}\int_{71}^{\infty}\int_{K_3'}^{\infty} f_0(x_1)\, f_0(x_2)\, f_0(x_3)\,dx_3\,dx_2\,dx_1 = 0.05$$

• We get one result: K3’ = 71.0803

Example
– Then we can just plug it in to Λ and calculate K:

$$\begin{aligned}
K &= e^{\frac{1}{8}\sum_{i=1}^{3}\left[(K_i'-69.5)^2-(K_i'-76.5)^2\right]} \\
  &= e^{\frac{1}{8}\left[(68-69.5)^2-(68-76.5)^2+(71-69.5)^2-(71-76.5)^2+(71.0803-69.5)^2-(71.0803-76.5)^2\right]} \\
  &= 1.663 \times 10^{-7}
\end{aligned}$$

Example
– With the significance point K = 1.663 × 10^-7 we can now test our hypothesis based on observations:
• E.g.: Sasha = 83 in, Darrell = 81 in, Sherron = 71 in

$$\Lambda(X = \{83, 81, 71\}) = e^{\frac{1}{8}\sum_{i=1}^{3}\left[(X_i-69.5)^2-(X_i-76.5)^2\right]}, \qquad \Lambda(83, 81, 71) = 1.446 \times 10^{12}$$

• 1.446 × 10^12 > 1.663 × 10^-7
• Therefore we reject H0: the data support our hypothesis that tallness is a factor in the recruitment of KU basketball players.
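The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch that assumes the setup given earlier (μ0 = 69.5, μ1 = 76.5, σ = 2) and takes the significance point K from the slides as given; the function name is chosen for illustration only.

```python
import math

MU0, MU1, SIGMA = 69.5, 76.5, 2.0
K_SLIDES = 1.663e-7  # significance point quoted in the text

def likelihood_ratio(xs, mu0=MU0, mu1=MU1, sigma=SIGMA):
    """Likelihood ratio f1/f0 for iid normal observations with common sigma."""
    exponent = sum((x - mu0) ** 2 - (x - mu1) ** 2 for x in xs) / (2 * sigma ** 2)
    return math.exp(exponent)

observed = [83, 81, 71]               # heights in inches
lr_observed = likelihood_ratio(observed)
reject_h0 = lr_observed >= K_SLIDES   # decision rule: reject H0 when LR >= K

# Plugging the three K' values back in reproduces the quoted K.
k_check = likelihood_ratio([68, 71, 71.0803])

print(f"LR = {lr_observed:.3e}, K = {k_check:.3e}, reject H0: {reject_h0}")
```

For these data the exponent is exactly 28, so the likelihood ratio is e^28.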

Neyman-Pearson Proof
• Let A define the region in the joint range of X1, X2, …, Xn such that LR ≥ K. A is the critical region.

$$\int_A L(H_0) = \int \cdots \int_A f_0(u_1)\, f_0(u_2) \cdots f_0(u_n)\,du_1\,du_2 \cdots du_n = \alpha$$

– If A is the only critical region of size α we are done
– Let’s assume another critical region of size α, defined by B:

$$\int_B L(H_0) = \int \cdots \int_B f_0(u_1)\, f_0(u_2) \cdots f_0(u_n)\,du_1\,du_2 \cdots du_n = \alpha$$

Proof
– H0 is rejected if the observed vector (x1, x2, …, xn) is in A (first test) or in B (second test).
– Let A and B overlap in region C
– Power of the test: the probability of rejecting H0 when H1 is true
• The power of this test using A is:

$$\int_A L(H_1) = \int \cdots \int_A f_1(u_1)\, f_1(u_2) \cdots f_1(u_n)\,du_1\,du_2 \cdots du_n$$

Proof
– Define: Δ = ∫_A L(H1) − ∫_B L(H1)
• The power of the test using A minus the power using B

$$\begin{aligned}
\Delta &= \int_A f_1(u_1) \cdots f_1(u_n)\,du_1 \cdots du_n - \int_B f_1(u_1) \cdots f_1(u_n)\,du_1 \cdots du_n \\
       &= \int_{A \setminus C} f_1(u_1) \cdots f_1(u_n)\,du_1 \cdots du_n - \int_{B \setminus C} f_1(u_1) \cdots f_1(u_n)\,du_1 \cdots du_n
\end{aligned}$$

• Where A\C is the set of points in A but not in C
• And B\C contains points in B but not in C

Proof
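– So, in A\C we have:

$$\frac{f_1(u_1) \cdots f_1(u_n)}{f_0(u_1) \cdots f_0(u_n)} \ge K \quad\Longrightarrow\quad f_1(u_1) \cdots f_1(u_n) \ge K f_0(u_1) \cdots f_0(u_n)$$

– While in B\C we have:

$$f_1(u_1) \cdots f_1(u_n) \le K f_0(u_1) \cdots f_0(u_n)$$

Why? Points of B\C lie outside A, and A contains every point where LR ≥ K.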

Proof
– Thus

$$\begin{aligned}
\Delta &\ge \int_{A \setminus C} K f_0(u_1) \cdots f_0(u_n)\,du_1 \cdots du_n - \int_{B \setminus C} K f_0(u_1) \cdots f_0(u_n)\,du_1 \cdots du_n \\
       &= \int_{A} K f_0(u_1) \cdots f_0(u_n)\,du_1 \cdots du_n - \int_{B} K f_0(u_1) \cdots f_0(u_n)\,du_1 \cdots du_n \\
       &= K\alpha - K\alpha = 0
\end{aligned}$$

– Which implies that the power of the test using A is greater than or equal to the power using B.

Composite Fixed-Sample-Size Tests

Not Identically Distributed
• In most cases, random variables are not identically distributed, at least not in H1
– This affects the likelihood function, L
– For example, H1 in the two-sample t-test is:

$$L = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{1i}-\mu_1)^2}{2\sigma^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{2i}-\mu_2)^2}{2\sigma^2}}$$

– Where μ1 and μ2 are different

Composite
– Further, the hypotheses being tested do
not specify all parameters
– They are composite

– This chapter only outlines aspects of
composite test theory relevant to the
material in this book.


Parameter Spaces
– The set of values the parameters of interest
can take
– Null hypothesis: parameters in some region ω
– Alternate hypothesis: parameters in Ω
– ω is usually a subspace of Ω
• Nested hypothesis case
– Null hypothesis nested within alternate hypothesis
– This book focuses on this case

• “if the alternate hypothesis can explain the data
significantly better we can reject the null
hypothesis”

λ Ratio
• Optimality theory for composite tests suggests this as a desirable test statistic:

$$\lambda = \frac{L_{\max}(\omega)}{L_{\max}(\Omega)}$$

• Lmax(ω): maximum likelihood when parameters are confined to the region ω
• Lmax(Ω): maximum likelihood when parameters are confined to the region Ω, defined by H1
• H0 is rejected when λ is sufficiently small (how small is determined by the chosen Type I error α)

Example: t-tests
• The next slides calculate the λ-ratio for the two-sample t-test (with the likelihood)

$$L = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{1i}-\mu_1)^2}{2\sigma^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{2i}-\mu_2)^2}{2\sigma^2}}$$

– t-tests later generalize to ANOVA and T² tests

Equal Variance Two-Sided t-test
• Setup
– Random variables X11, …, X1m in group 1 are Normally and Independently Distributed, NID(μ1, σ²)
– Random variables X21, …, X2n in group 2 are NID(μ2, σ²)
– X1i and X2j are independent for all i and j
– Null hypothesis H0: μ1 = μ2 (= μ, unspecified)
– Alternate hypothesis H1: μ1 and μ2 both unspecified

Equal Variance Two-Sided t-test
• Setup (continued)
– σ² is unknown and unspecified in H0 and H1
• Is assumed to be the same in both distributions
– Region ω is: $\omega = \{\mu_1 = \mu_2,\ 0 < \sigma^2 < +\infty\}$
– Region Ω is: $\Omega = \{-\infty < \mu_1 < +\infty,\ -\infty < \mu_2 < +\infty,\ 0 < \sigma^2 < +\infty\}$

Equal Variance Two-Sided t-test
• Derivation
– H0: writing μ for the common mean when μ1 = μ2, the maximum of the likelihood over ω is at

$$\hat{\mu} = \bar{X} = \frac{X_{11} + X_{12} + \cdots + X_{1m} + X_{21} + X_{22} + \cdots + X_{2n}}{m+n}$$

– And the maximum likelihood estimate of the (common) variance σ² is

$$\hat{\sigma}_0^2 = \frac{\sum_{i=1}^{m}(X_{1i}-\bar{X})^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X})^2}{m+n}$$

Equal Variance Two-Sided t-test
– Inserting both into the likelihood function, L

$$L_{\max}(\omega) = \frac{1}{(2\pi\hat{\sigma}_0^2)^{\frac{m+n}{2}}}\, e^{-\frac{m+n}{2}}$$

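Equal Variance Two-Sided t-test
– Do the same thing for region Ω

$$\hat{\mu}_1 = \bar{X}_1 = \frac{X_{11} + X_{12} + \cdots + X_{1m}}{m}, \qquad \hat{\mu}_2 = \bar{X}_2 = \frac{X_{21} + X_{22} + \cdots + X_{2n}}{n}$$

$$\hat{\sigma}_1^2 = \frac{\sum_{i=1}^{m}(X_{1i}-\bar{X}_1)^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X}_2)^2}{m+n}$$

– Which produces this likelihood function, L

$$L_{\max}(\Omega) = \frac{1}{(2\pi\hat{\sigma}_1^2)^{\frac{m+n}{2}}}\, e^{-\frac{m+n}{2}}$$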

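Equal Variance Two-Sided t-test
– The test statistic λ is then

$$\lambda = \frac{L_{\max}(\omega)}{L_{\max}(\Omega)} = \frac{(2\pi\hat{\sigma}_0^2)^{-\frac{m+n}{2}}\, e^{-\frac{m+n}{2}}}{(2\pi\hat{\sigma}_1^2)^{-\frac{m+n}{2}}\, e^{-\frac{m+n}{2}}} = \left(\frac{\hat{\sigma}_1^2}{\hat{\sigma}_0^2}\right)^{\frac{m+n}{2}}$$

It’s the same function, just with different variances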

Equal Variance Two-Sided t-test
– We can then use the algebraic identity

$$\sum_{i=1}^{m}(X_{1i}-\bar{X})^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X})^2 = \sum_{i=1}^{m}(X_{1i}-\bar{X}_1)^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X}_2)^2 + \frac{mn}{m+n}(\bar{X}_1-\bar{X}_2)^2$$

– To show that

$$\lambda = \left(\frac{1}{1 + \frac{t^2}{m+n-2}}\right)^{\frac{m+n}{2}}$$

– Where T is (from Ch. 3)

$$T = \frac{(\bar{X}_1-\bar{X}_2)\sqrt{mn}}{S\sqrt{m+n}}$$

Equal Variance Two-Sided t-test
– t is the observed value of T
– S is defined in Ch. 3 as

$$S^2 = \frac{\sum_{i=1}^{m}(X_{1i}-\bar{X}_1)^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X}_2)^2}{m+n-2}$$

We can plot λ as a function of t (e.g. m+n = 10):
[Figure: λ plotted against t]
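The relationship between λ and t can be checked numerically. The Python sketch below computes λ once directly from the two variance estimates and once from t, assuming the equal-variance setup described above. The sample values are invented purely for illustration (chosen so that m + n = 10), and the function names are not from the source.

```python
import math

def pooled_t(x1, x2):
    """Two-sample t statistic with pooled variance S^2 on m + n - 2 df."""
    m, n = len(x1), len(x2)
    mean1, mean2 = sum(x1) / m, sum(x2) / n
    within = sum((x - mean1) ** 2 for x in x1) + sum((x - mean2) ** 2 for x in x2)
    s = math.sqrt(within / (m + n - 2))
    return (mean1 - mean2) * math.sqrt(m * n) / (s * math.sqrt(m + n))

def lambda_direct(x1, x2):
    """lambda = (sigma1_hat^2 / sigma0_hat^2)^((m+n)/2) from the two ML variances."""
    m, n = len(x1), len(x2)
    grand = (sum(x1) + sum(x2)) / (m + n)
    mean1, mean2 = sum(x1) / m, sum(x2) / n
    var0 = sum((x - grand) ** 2 for x in x1 + x2) / (m + n)
    var1 = (sum((x - mean1) ** 2 for x in x1)
            + sum((x - mean2) ** 2 for x in x2)) / (m + n)
    return (var1 / var0) ** ((m + n) / 2)

def lambda_from_t(t, m, n):
    """lambda expressed as a decreasing function of t^2."""
    return (1 + t ** 2 / (m + n - 2)) ** (-(m + n) / 2)

# Hypothetical data, for illustration only.
group_a = [5.1, 4.9, 6.2, 5.8]
group_b = [4.2, 4.8, 3.9, 4.4, 5.0, 4.1]

t_value = pooled_t(group_a, group_b)
lam_a = lambda_direct(group_a, group_b)
lam_b = lambda_from_t(t_value, len(group_a), len(group_b))
print(t_value, lam_a, lam_b)
```

Both routes give the same λ up to rounding, which is why |t| can stand in for λ as the test statistic.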

Equal Variance Two-Sided t-test
– So, by the monotonicity argument, we can use t² or |t| instead of λ as test statistic
– Small values of λ correspond to large values of |t|
– Sufficiently large |t| lead to rejection of H0
– The H0 distribution of t is known
• t-distribution with m+n-2 degrees of freedom
– Significance points are widely available
• Once α has been chosen, values of |t| sufficiently large to reject H0 can be determined

Equal Variance Two-Sided t-test
[Figure: table of t-distribution significance points; source: http://www.socr.ucla.edu/Applets.dir/T-table.html]

Equal Variance One-Sided t-test
• Similar to Two-Sided t-test case
– Different region Ω for H1:
• Means μ1 and μ2 are not simply different, but one is larger than the other: μ1 ≥ μ2

$$\Omega = \{\mu_1 \ge \mu_2,\ 0 < \sigma^2 < +\infty\}$$

• If $\bar{x}_1 \ge \bar{x}_2$ then the maximum likelihood estimates are the same as for the two-sided case

Equal Variance One-Sided t-test
• If $\bar{x}_1 < \bar{x}_2$ then the unconstrained maximum of the likelihood is outside of Ω
• The unique unconstrained maximum is at $(\bar{x}_1, \bar{x}_2)$, implying that the maximum in Ω occurs at a boundary point of Ω
• At this point the estimates of μ1 and μ2 are equal (both are $\bar{x}$)
• At this point the likelihood ratio is 1 and H0 is not rejected
• Result: H0 is rejected in favor of H1 (μ1 ≥ μ2) only for sufficiently large positive values of t

Example - Revised
• This scenario fits with our original
example:
– H1 is that the average height of KU
basketball players is bigger than for the
general population
– One-sided test
– We could assume that we don’t know the
averages for H0 and H1
– We actually don’t know σ (I just guessed 2 in
the original example)

Example - Revised
• Updated example:
– Observation in group 1 (KU): X1 = {83, 81, 71}
– Observation in group 2: X2 = {65, 72, 70}
– Pick significance point for t from a table: tα = 2.132
• t-distribution, m+n-2 = 4 degrees of freedom, α = 0.05
– Calculate t with our observations

$$t = \frac{(78.3 - 69)\sqrt{9}}{5.2122\sqrt{6}} = \frac{27.9}{12.7673} = 2.185$$

– t > tα, so we can reject H0!
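The calculation above can be reproduced with a short Python sketch. It assumes the pooled-variance t statistic from the equal-variance derivation and takes the table value tα = 2.132 as given; the helper name is illustrative. With unrounded group means the statistic comes out slightly above the slide's 2.185, which uses 78.3 for the first mean.

```python
import math

def pooled_t_statistic(x1, x2):
    """Equal-variance two-sample t statistic, (mean1 - mean2) relative to pooled S."""
    m, n = len(x1), len(x2)
    mean1, mean2 = sum(x1) / m, sum(x2) / n
    within = sum((x - mean1) ** 2 for x in x1) + sum((x - mean2) ** 2 for x in x2)
    s = math.sqrt(within / (m + n - 2))          # pooled standard deviation S
    return (mean1 - mean2) * math.sqrt(m * n) / (s * math.sqrt(m + n))

ku_players = [83, 81, 71]        # group 1 observations from the text
population = [65, 72, 70]        # group 2 observations from the text
T_ALPHA = 2.132                  # one-sided 5% point, 4 degrees of freedom (from the table)

t_observed = pooled_t_statistic(ku_players, population)
reject_h0 = t_observed > T_ALPHA  # one-sided test: only large positive t rejects H0
print(round(t_observed, 3), reject_h0)
```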

Comments
• Problems that might arise in other cases
– The λ-ratio might not reduce to a function of a
well-known test statistic, such as t
– There might not be a unique H0 distribution of λ
– Fortunately, the t statistic is a pivotal quantity
• Independent of the parameters not prescribed by H0
– e.g. μ, σ

– For many testing procedures this property does
not hold

Unequal Variance Two-Sided t-test
• Identical to Equal Variance Two-Sided t-test
– Except: variances in group 1 and group 2 are no longer assumed to be identical
• Group 1: NID(μ1, σ1²)
• Group 2: NID(μ2, σ2²)
• With σ1² and σ2² unknown and not assumed identical
• Region ω = {μ1 = μ2, 0 < σ1², σ2² < +∞}
• Ω places no constraints on the values of μ1, μ2, σ1², and σ2²

Unequal Variance Two-Sided
t-test
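– The likelihood function of (X11, X12, …, X1m, X21, X22, …, X2n) then becomes

$$\prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{(x_{1i}-\mu_1)^2}{2\sigma_1^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{(x_{2i}-\mu_2)^2}{2\sigma_2^2}}$$

– Under H0 (μ1 = μ2 = μ), this becomes:

$$\prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{(x_{1i}-\mu)^2}{2\sigma_1^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{(x_{2i}-\mu)^2}{2\sigma_2^2}}$$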

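Unequal Variance Two-Sided t-test
– Maximum likelihood estimates $\hat{\mu}$, $\hat{\sigma}_1^2$ and $\hat{\sigma}_2^2$ satisfy the simultaneous equations:

$$\frac{\sum_{i}(x_{1i}-\hat{\mu})}{\hat{\sigma}_1^2} + \frac{\sum_{i}(x_{2i}-\hat{\mu})}{\hat{\sigma}_2^2} = 0$$

$$\hat{\sigma}_1^2 = \frac{\sum_{i}(x_{1i}-\hat{\mu})^2}{m}$$

$$\hat{\sigma}_2^2 = \frac{\sum_{i}(x_{2i}-\hat{\mu})^2}{n}$$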

Unequal Variance Two-Sided t-test
– This leads to a cubic equation in $\hat{\mu}$
– Neither the λ ratio nor any monotonic function of it has a known probability distribution when H0 is true!
– This does not lead to any useful test statistic
• The t-statistic may be used as a reasonably close approximation
• However, its H0 distribution is still unknown, as it depends on the unknown ratio σ1²/σ2²
• In practice, a heuristic is often used (see Ch. 3.5)

The -2 log λ Approximation


The -2 log λ Approximation
• Used when the λ-ratio procedure does not lead to a test statistic whose H0 distribution is known
– Example: Unequal Variance Two-Sided t-test
• Various approximations can be used
– But only if certain regularity assumptions and restrictions hold true

The -2 log λ Approximation
• Best known approximation:
– If H0 is true, -2 log λ has an asymptotic chi-square distribution,
• with degrees of freedom equal to the difference between the numbers of parameters left unspecified by H1 and by H0.
• λ is the likelihood ratio
• “asymptotic” = “as the sample size → ∞”
– Provides an asymptotically valid testing procedure
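As a rough illustration of how the approximation is used, the sketch below evaluates -2 log λ for the equal-variance two-sample case, where λ is a known function of t, and converts it to an approximate p-value. Here H1 leaves three parameters unspecified (μ1, μ2, σ²) and H0 leaves two (μ, σ²), so one degree of freedom is assumed. The function names are illustrative, and since the result is asymptotic it should be read with caution for small samples.

```python
import math

def minus_two_log_lambda(t, m, n):
    """-2 log(lambda) for the equal-variance case, using lambda as a function of t."""
    # lambda = (1 + t^2 / (m + n - 2)) ** (-(m + n) / 2)
    return (m + n) * math.log(1 + t ** 2 / (m + n - 2))

def chi2_upper_tail_1df(x):
    """P(chi-square with 1 df > x), which equals erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))

def approximate_p_value(t, m, n):
    """Asymptotic p-value of the two-sided test based on -2 log(lambda)."""
    return chi2_upper_tail_1df(minus_two_log_lambda(t, m, n))

# Sanity check: 3.8415 is the familiar 5% point of chi-square with 1 df.
print(chi2_upper_tail_1df(3.841458820694124))
```

The one-degree-of-freedom tail is available in closed form through `math.erfc`, which keeps the sketch free of third-party dependencies; other degrees of freedom would need a chi-square routine from a statistics library.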

The -2 log λ Approximation
– Restrictions:
• Parameters must be real numbers that can take
on values in some interval
• The maximum likelihood estimator is found at a
turning point of the function
– i.e. a “real” maximum, not at a boundary point

• H0 is nested in H1 (as in all previous slides)

– These restrictions are important in the
proof
• I skip the proof…

The -2 log λ Approximation
• Instead:
– Our original basketball example, revised
again:
• Let’s drop our last assumption, that the variance
in the population at large is the same as in the
group of KU basketball players.
• All we have left now are our observations and
the hypothesis that μ1 > μ2
– Where μ1 is the average height of Basketball players

• Observation in group 1 (KU): X1 = {83, 81, 71}
• Observation in group 2: X2 = {65, 72, 70}

Example – Revised Again
– Using the Unequal Variance One-Sided t-Test
– We get:
[Figure: test output, not preserved in the source]

The Analysis of Variance (ANOVA)

The Analysis of Variance (ANOVA)
• Probably the most frequently used hypothesis testing procedure in statistics
• This section
– Derives the Sum of Squares
– Gives an outline of the ANOVA procedure
– Introduces one-way ANOVA as a generalization of the two-sample t-test
– Two-way and multi-way ANOVA
– Further generalizations of ANOVA

Sum of Squares
• New variables (from Ch. 3)
– The two-sample t-test tests for equality of the means of two groups.
– We could express the observations as:

$$X_{ij} = \mu_i + E_{ij}, \qquad i = 1, 2$$

– Where the Eij are assumed to be NID(0, σ²)
– H0 is μ1 = μ2

Sum of Squares
– This can also be written as:

$$X_{ij} = \mu + \alpha_i + E_{ij}, \qquad i = 1, 2$$

• μ could be seen as overall mean
• αi as deviation from μ in group i
– This model is overparameterized
• Uses more parameters than necessary
• Necessitates the requirement $m\alpha_1 + n\alpha_2 = 0$
• (always assumed imposed)

Sum of Squares
– We are deriving a test procedure similar to the two-sample two-sided t-test
– Using |t| as test statistic
• Absolute value of the T statistic
– This is equivalent to using t²
• Because it’s a monotonic function of |t|
– The square of the t statistic (from Ch. 3)

$$T = \frac{(\bar{X}_1-\bar{X}_2)\sqrt{mn}}{S\sqrt{m+n}}$$

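Sum of Squares
– …can, after algebraic manipulations, be written as F

$$F = (m+n-2)\,\frac{B}{W}$$

– where

$$\bar{X}_1 = \frac{\sum_{j=1}^{m} X_{1j}}{m}, \qquad \bar{X}_2 = \frac{\sum_{j=1}^{n} X_{2j}}{n}, \qquad \bar{X} = \frac{m\bar{X}_1 + n\bar{X}_2}{m+n}$$

$$B = \frac{mn}{m+n}(\bar{X}_1-\bar{X}_2)^2 = m(\bar{X}_1-\bar{X})^2 + n(\bar{X}_2-\bar{X})^2$$

$$W = \sum_{j=1}^{m}(X_{1j}-\bar{X}_1)^2 + \sum_{j=1}^{n}(X_{2j}-\bar{X}_2)^2$$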

Sum of Squares
– B: between (among) group sum of squares
– W: within group sum of squares
– B + W: total sum of squares
• Can be shown to be:

$$\sum_{i=1}^{m}(X_{1i}-\bar{X})^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X})^2$$

– Total number of degrees of freedom: m + n – 1
• Between groups: 1
• Within groups: m + n - 2

Sum of Squares
– This gives us the F statistic

$$F = (m+n-2)\,\frac{B}{W}$$

– Our goal is to test the significance of the difference between the means of two groups
• B measures the difference
– The difference must be measured relative to the variance within the groups
• W measures that
– The larger F is, the more significant the difference

The ANOVA Procedure
• Subdivide observed total sum of
squares into several components
– In our case, B and W

• Pick appropriate significance point for
a chosen Type I error α from an F table
• Compare the observed components to
test our hypothesis

F-Statistic
• Significance points depend on degrees of freedom in B and W
– In our case, 1 and (m + n – 2)
[Figure: table of F-distribution significance points; source: http://www.ento.vt.edu/~sharov/PopEcol/tables/f005.html]

Comments
• The two-group case readily generalizes to any number of groups.
• ANOVAs can be classified in various ways, e.g.
– fixed effects models
– mixed effects models
– random effects models
– The difference is discussed later
– For now we consider fixed effects models
• Parameter αi is fixed, but unknown, in group i: $X_{ij} = \mu + \alpha_i + E_{ij}$

Comments
• Terminology
– Although ANOVA contains the word ‘variance’
– What we actually test for is equality in means between the groups
• The different mean assumptions affect the variance, though
• ANOVAs are special cases of regression models from Ch. 8

One-Way ANOVA
• One-Way fixed-effect ANOVA
• Setup and derivation
– Like the two-sample t-test, but for g groups
– Observations (ni observations in group i, i = 1, 2, …, g): $X_{i1}, X_{i2}, \ldots, X_{in_i}$
– Using the overparameterized model for X:

$$X_{ij} = \mu + \alpha_i + E_{ij}, \qquad j = 1, 2, \ldots, n_i, \quad i = 1, 2, \ldots, g$$

– Eij assumed NID(0, σ²), Σ ni αi = 0, αi fixed in group i

One-Way ANOVA
– Null Hypothesis H0 is: α1 = α2 = … = αg = 0
– Total sum of squares is

$$\sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij}-\bar{X})^2$$

– This is subdivided into B and W

$$B = \sum_{i=1}^{g} n_i(\bar{X}_i-\bar{X})^2, \qquad W = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_i)^2$$

– with

$$\bar{X}_i = \frac{\sum_{j=1}^{n_i} X_{ij}}{n_i}, \qquad \bar{X} = \frac{\sum_{i=1}^{g}\sum_{j=1}^{n_i} X_{ij}}{N}, \qquad N = \sum_{i=1}^{g} n_i$$

One-Way ANOVA
– Total degrees of freedom: N – 1
• Subdivided into dfB = g – 1 and dfW = N - g
– This gives us our test statistic F

$$F = \frac{B}{W} \cdot \frac{N-g}{g-1}$$

– We can now look in the F-table for these degrees of freedom to pick the significance point for F
– And calculate B and W from the observed data
– And accept or reject H0
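The one-way procedure can be sketched in a few lines of Python. The function below computes B, W, the degrees of freedom and F for any number of groups, following the definitions above; its name and the dictionary layout are illustrative choices, and the significance point still has to be looked up in an F table. It is applied to the two basketball samples used earlier in these slides.

```python
def one_way_anova(groups):
    """Return B, W, their degrees of freedom and F for a list of samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group sizes times squared mean deviations.
    between = sum(len(g) * (mean - grand_mean) ** 2
                  for g, mean in zip(groups, group_means))
    # Within-group sum of squares: deviations from each group's own mean.
    within = sum((x - mean) ** 2
                 for g, mean in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (between / within) * (df_within / df_between)
    return {"B": between, "W": within,
            "df_B": df_between, "df_W": df_within, "F": f_stat}

result = one_way_anova([[83, 81, 71], [65, 72, 70]])
print(result)
```

For two groups the F value equals the square of the pooled two-sample t statistic, which is a convenient cross-check.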

Example
• Revisiting the Basketball example
– Looking at it as a One-Way ANOVA analysis
• Observation in group 1 (KU): X1 = {83, 81, 71}
• Observation in group 2: X2 = {65, 72, 70}
– Total Sum of Squares:

$$(73.66-83)^2 + (73.66-81)^2 + (73.66-71)^2 + (73.66-65)^2 + (73.66-72)^2 + (73.66-70)^2 = 239.3336$$

– B (between groups sum of squares)

$$B = \sum_{i=1}^{g} n_i(\bar{X}_i-\bar{X})^2 = 3(78.33-73.66)^2 + 3(69-73.66)^2 = 130.57$$

Example
– W (within groups sum of squares)

$$W = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_i)^2 = \left((83-78.33)^2 + (81-78.33)^2 + (71-78.33)^2\right) + \left((65-69)^2 + (72-69)^2 + (70-69)^2\right) = 108.667$$

– Degrees of freedom
• Total: N-1 = 5
• dfB = g – 1 = 2 - 1 = 1
• dfW = N – g = 6 – 2 = 4

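Example
– Table lookup for df 1 and 4 and α = 0.05:
– Critical value: F = 7.71
– Calculate F from our data:

$$F = \frac{B}{W} \cdot \frac{N-g}{g-1} = \frac{130.57}{108.667} \cdot \frac{6-2}{2-1} = 4.806$$

– So… 4.806 < 7.71
– With ANOVA we actually accept H0!
• The F test corresponds to the two-sided t-test: F = t² and 7.71 = 2.776², the two-sided 5% point for t with 4 degrees of freedom, whereas the earlier t-test was one-sided (tα = 2.132)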

Same Example – with Excel
• Screenshots:
[Figure: Excel screenshots, not preserved in the source]

Excel
• Offers most of these tests, built-in

Two-Way ANOVA
• Two-Way Fixed Effects ANOVA
• Overview only (in the scope of this book)
• More complicated setup; example:
– Expression levels of one gene in lung cancer
patients
– a different risk classes
• E.g.: ultrahigh, very high, intermediate, low

– b different age groups
– n individuals for each risk/age combination

Two-Way ANOVA
– Expression levels (our observations): Xijk
• i is the risk class (i = 1, 2, …, a)
• j indicates the age group
• k corresponds to the individual in each group ( k = 1,
…, n)
– Each group is a possible risk/age combination

• The number of individuals in each group is the
same, n
• This is a “balanced” design
• Theory for unbalanced designs is more complicated
and not covered in this book

Two-Way ANOVA
– The Xijk can be arranged in a table:

                  Risk category (i)
Age group (j)     1    2    3    4
1                 n    n    n    n
2                 n    n    n    n
3                 n    n    n    n
4                 n    n    n    n
5                 n    n    n    n

• Each entry n is the number of individuals in this risk/age group (aka “cell”)
• This is a two-way table

Two-Way ANOVA
– The model adopted for each Xijk is

$$X_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + E_{ijk}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, b, \quad k = 1, 2, \ldots, n$$

• Where Eijk are NID(0, σ²)
• The mean of Xijk is μ + αi + βj + δij
• αi is a fixed parameter, additive for risk class i
• βj is a fixed parameter, additive for age group j
• δij is a fixed risk/age interaction parameter
– Should be added if a possible group/group interaction exists

Two-Way ANOVA
– These constraints are imposed
• Σi αi = Σj βj = 0
• Σi δij = 0 for all j
• Σj δij = 0 for all i
– The total sum of squares is then subdivided into four components:
• Risk class sum of squares
• Age group sum of squares
• Interaction sum of squares
• Within cells (“residual” or “error”) sum of squares

Two-Way ANOVA
– Associated with each sum of squares
• Corresponding degrees of freedom
• Hence also a corresponding mean square
– Sum of squares divided by degrees of freedom

– The mean squares are then compared using
F ratios to test for significance of various
effects
• First – test for a significant risk/age interaction
• F-ratio used is ratio of interaction mean square
and within-cells mean square

Two-Way ANOVA
• If such an interaction is found, it may not be reasonable to test for significant risk or age differences
• Example, μ in two risk classes, two age groups:

– No evidence of interaction

          Risk 1   Risk 2
Age 1        4       12
Age 2        7       15

– Example of interaction

          Risk 1   Risk 2
Age 1        4       15
Age 2       11        6
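For a 2 × 2 layout, the presence of interaction in the cell means can be read off a single contrast: the risk difference in age group 1 minus the risk difference in age group 2. The Python sketch below applies this to the two tables of means above. It only inspects the stated means and is not a significance test; the function name and the row/column layout are illustrative.

```python
def interaction_contrast(cell_means):
    """Difference of risk differences for a 2x2 table laid out as [age][risk]."""
    (m11, m12), (m21, m22) = cell_means
    return (m11 - m12) - (m21 - m22)

additive_table = [[4, 12], [7, 15]]      # age rows, risk columns
interacting_table = [[4, 15], [11, 6]]

print(interaction_contrast(additive_table))     # zero: effects are additive
print(interaction_contrast(interacting_table))  # non-zero: risk effect depends on age
```

In the first table moving from risk class 1 to 2 adds 8 in both age groups, so the contrast vanishes; in the second the risk effect changes sign between age groups.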

Multi-Way ANOVA
• One-way and two-way fixed effects ANOVAs can be extended to multi-way ANOVAs
• Gets complicated
• Example: three-way ANOVA model:

$$X_{ijkm} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_{ij} + \varepsilon_{ik} + \zeta_{jk} + \eta_{ijk} + E_{ijkm}$$

Further generalizations of ANOVA
• The 2^m factorial design
– A particular form of the one-way ANOVA
• Interactions between main effects
– m “factors” taken at two “levels”
• E.g. (1) Gender, (2) Tissue (lung, kidney), and (3) status (affected, not affected)
– 2^m possible combinations of levels/groups
– Can test for main effects and interactions
– Need replicated experiments
• n replications for each of the 2^m experiments

Further generalizations of ANOVA
– Example, m = 3, denoted by A, B, C
• 8 groups, {abc, ab, ac, bc, a, b, c, 1}
• Write totals of n observations Tabc, Tab, …, T1
• The total between sum of squares can be subdivided into seven individual sums of squares
– Three main effects (A, B, C)
– Three pairwise interactions (AB, AC, BC)
– One triple-wise interaction (ABC)
– Example: Sum of squares for A, and for BC, respectively

$$\frac{(T_{abc} + T_{ab} + T_{ac} + T_{a} - T_{bc} - T_{b} - T_{c} - T_{1})^2}{8n}$$

$$\frac{(T_{abc} - T_{ab} - T_{ac} + T_{a} + T_{bc} - T_{b} - T_{c} + T_{1})^2}{8n}$$

Further generalizations of ANOVA
– If m ≥ 5 the number of groups becomes large
– Then the total number of observations, n·2^m, is large
– It is possible to reduce the number of observations by a process …
• Confounding
– Interaction ABC probably very small and not interesting
– So, prefer a model without ABC, reduce data
– There are ANOVA designs for that

Further generalizations of ANOVA
• Fractional Replication
– Related to confounding
– Sometimes two groups cannot be distinguished from each other, then they are aliases
• E.g. A and BC
– This reduces the number of experiments and the amount of data needed
– Ch. 13 talks more about this in the context of microarrays

Random/Mixed Effect Models
• So far: fixed effect models
– E.g. Risk class, age group fixed in previous example
• Multiple experiments would use same categories
• But: what if we took experimental data on several random days?
• The days themselves have no meaning, but a “between days” sum of squares must be extracted
– What if the days turn out to be important?
– If we fail to test for it, the significance of our procedure is diminished.
– Days are a random category, unlike risk and age!

Random/Mixed Effect Models
• Mixed Effect Models
– If some categories are fixed and some are random
– Symbols used:
• Greek letters for fixed effects
• Uppercase Roman letters for random effects
• Example: two-way mixed effect model with
– Risk class a and days d and n values collected each day, the appropriate model is written:

$$X_{ikl} = \mu + \alpha_i + D_l + G_{il} + E_{ikl}$$

Random/Mixed Effect Models
• Random effect models have no fixed categories
• The details of the ANOVA analysis depend on which effects are random and which are fixed
• In a microarray context (more in Ch. 13)
– There tend to be several fixed and several random effects, which complicates the analysis
– Many interactions simply assumed zero

Multivariate Methods
ANOVA: the Repeated Measures Case
Bootstrap Methods: the Two-sample t-test
All skipped …

Sequential Analysis

Sequential Analysis
• Sequential Probability Ratio
– Sample size not known in advance
– Depends on outcomes of successive observations
– Some of this theory is in BLAST
• Basic Local Alignment Search Tool
– The book focuses on discrete random variables

Sequential Analysis
– Consider:
• Random variable Y with distribution P(y;ξ)
• Tests usually relate to the value of parameter ξ
• H0: ξ is ξ0
• H1: ξ is ξ1
• We can choose a value for the Type I error α
• And a value for the Type II error β
• Sampling then continues while

$$A < \frac{P(y_1;\xi_1)\, P(y_2;\xi_1) \cdots P(y_n;\xi_1)}{P(y_1;\xi_0)\, P(y_2;\xi_0) \cdots P(y_n;\xi_0)} < B$$

Sequential Analysis
– A and B are chosen to correspond to an α and β
– Sampling continues until the ratio is less than A (accept H0) or greater than B (reject H0)
– Because these are discrete variables, boundary overshoot usually occurs
• We don’t expect to exactly get values α and β
– Desired values for α and β are approximately achieved by using

$$A = \frac{\beta}{1-\alpha}, \qquad B = \frac{1-\beta}{\alpha}$$
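The two stopping boundaries follow directly from the chosen error rates. A minimal Python sketch, with an illustrative function name, computes A and B and their logarithms, which are the bounds used once the ratio is rewritten as a sum of log terms:

```python
import math

def sprt_boundaries(alpha, beta):
    """Approximate boundaries A = beta/(1-alpha) and B = (1-beta)/alpha."""
    lower = beta / (1 - alpha)
    upper = (1 - beta) / alpha
    return lower, upper

A, B = sprt_boundaries(alpha=0.01, beta=0.01)
log_A, log_B = math.log(A), math.log(B)
print(A, B, log_A, log_B)
```

With α = β = 0.01 this gives A = 1/99 and B = 99, so the log boundaries are symmetric around zero.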

Sequential Analysis
– It is also convenient to take logarithms, which gives us:

$$\log\frac{\beta}{1-\alpha} < \sum_i \log\frac{P(y_i;\xi_1)}{P(y_i;\xi_0)} < \log\frac{1-\beta}{\alpha}$$

– Using

$$S_{1,0}(y) = \log\frac{P(y;\xi_1)}{P(y;\xi_0)}$$

– We can write

$$\log\frac{\beta}{1-\alpha} < \sum_i S_{1,0}(y_i) < \log\frac{1-\beta}{\alpha}$$

Sequential Analysis
• Example: sequence matching
– H0: p0 = 0.25 (probability of a match is 0.25)
– H1: p1 = 0.35 (probability of a match is 0.35)
– Type I error α and Type II error β both chosen to be 0.01
– Yi: 1 if there is a match at position i, otherwise 0
– Sampling continues while

$$\log\frac{1}{99} < \sum_i S_{1,0}(Y_i) < \log 99$$

– with

$$S_{1,0}(Y_i) = \log\frac{(0.35)^{Y_i}(0.65)^{1-Y_i}}{(0.25)^{Y_i}(0.75)^{1-Y_i}}$$

Sequential Analysis
– S can be seen as the support offered by Yi for H1
– The inequality can be re-written as

$$-9.581 < \sum_i (Y_i - 0.2984) < 9.581$$

– This is actually a random walk with step sizes 0.7016 for a match and -0.2984 for a mismatch
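The random-walk view lends itself to simulation. The Python sketch below recomputes the rescaling constants from p0, p1, α and β and then runs one sequential test on simulated match indicators. The simulated data, the seed and the `max_steps` safety cap are assumptions added for illustration and are not part of the original example.

```python
import math
import random

P0, P1 = 0.25, 0.35
ALPHA = BETA = 0.01

def support(y):
    """S_{1,0}(y): log-likelihood-ratio contribution of one position."""
    return math.log(P1 / P0) if y == 1 else math.log((1 - P1) / (1 - P0))

LOWER = math.log(BETA / (1 - ALPHA))   # accept H0 at or below this sum
UPPER = math.log((1 - BETA) / ALPHA)   # reject H0 at or above this sum

# Rescaling: S(y) = SCALE * (y - OFFSET), so the walk steps are 1 - OFFSET and -OFFSET.
SCALE = support(1) - support(0)
OFFSET = -support(0) / SCALE
BOUND = UPPER / SCALE

def run_sequential_test(p_true, rng, max_steps=100_000):
    """Sample match indicators until the cumulative support leaves (LOWER, UPPER)."""
    total = 0.0
    for step in range(1, max_steps + 1):
        y = 1 if rng.random() < p_true else 0
        total += support(y)
        if total <= LOWER:
            return "accept H0", step
        if total >= UPPER:
            return "reject H0", step
    return "undecided", max_steps

decision, steps = run_sequential_test(p_true=0.35, rng=random.Random(0))
print(round(OFFSET, 4), round(BOUND, 3), decision, steps)
```

Passing an explicit `random.Random` instance keeps runs reproducible without touching the global random state.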

Sequential Analysis
• Power Function for a Sequential Test
– Suppose the true value of the parameter of interest is ξ
– We wish to know the probability that H1 is accepted, given ξ
– This probability is the power Ρ(ξ) of the test

$$\mathrm{P}(\xi) \approx \frac{1 - \left(\frac{\beta}{1-\alpha}\right)^{\theta^*}}{\left(\frac{1-\beta}{\alpha}\right)^{\theta^*} - \left(\frac{\beta}{1-\alpha}\right)^{\theta^*}}$$

Sequential Analysis
– Where θ* is the unique non-zero solution to θ in

$$\sum_{y \in R} P(y;\xi)\left(\frac{P(y;\xi_1)}{P(y;\xi_0)}\right)^{\theta} = 1$$

– R is the range of values of Y
– Equivalently, θ* is the unique non-zero solution to θ in

$$\sum_{y \in R} P(y;\xi)\, e^{\theta S_{1,0}(y)} = 1$$

– Where S is defined as before
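For a binary Y, as in the sequence-matching example (p0 = 0.25, p1 = 0.35, α = β = 0.01), θ* can be found numerically. The following Python sketch uses plain bisection on the defining equation and then evaluates the approximate power function. The bracketing strategy and function names are illustrative assumptions, and the sketch does not handle the zero-drift case where no non-zero root exists.

```python
import math

P0, P1 = 0.25, 0.35
ALPHA = BETA = 0.01

def g(theta, p):
    """Sum over y in {0, 1} of P(y; p) * (P(y; p1) / P(y; p0))**theta, minus 1."""
    return p * (P1 / P0) ** theta + (1 - p) * ((1 - P1) / (1 - P0)) ** theta - 1

def theta_star(p, tol=1e-12):
    """Non-zero root of g(., p), located by bisection away from theta = 0."""
    drift = p * math.log(P1 / P0) + (1 - p) * math.log((1 - P1) / (1 - P0))
    if abs(drift) < 1e-12:
        raise ValueError("zero drift: no non-zero root")
    # g is convex with g(0) = 0, so the other root is positive when the drift is negative.
    sign = 1.0 if drift < 0 else -1.0
    near, far = sign * 1e-6, sign * 1.0
    while g(far, p) <= 0:          # expand until the root is bracketed
        far *= 2
    while abs(far - near) > tol:
        mid = (near + far) / 2
        if g(mid, p) > 0:
            far = mid
        else:
            near = mid
    return (near + far) / 2

def power(p):
    """Approximate probability of accepting H1 when the true parameter is p."""
    th = theta_star(p)
    low = (BETA / (1 - ALPHA)) ** th
    high = ((1 - BETA) / ALPHA) ** th
    return (1 - low) / (high - low)

print(theta_star(P0), theta_star(P1), power(P0), power(P1))
```

At ξ = ξ0 the root is θ* = 1 and the power reduces to α; at ξ = ξ1 the root is θ* = -1 and the power reduces to 1 - β, which gives a quick consistency check on the formula.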

Sequential Analysis
– This is very similar to Ch. 7 – Random
Walks
– The parameter θ* is the same as in Ch. 7
– And it will be the same in Ch 10 – BLAST
– < skipping the random walk part >


Sequential Analysis
• Mean Sample Size
– The (random) number of observations
until one or the other hypothesis is
accepted
– Find approximation by ignoring
boundary overshoot
– Essentially identical method used to find
the mean number of steps until the
random walk stops

Sequential Analysis
– Two expressions are calculated for ΣiS1,0(Yi)
• One involves the mean sample size
• By equating both expressions, solve for the mean sample size

$$\sum_i S_{1,0}(y_i) \approx (1-\mathrm{P}(\xi))\log\frac{\beta}{1-\alpha} + \mathrm{P}(\xi)\log\frac{1-\beta}{\alpha}$$

$$E(S_{1,0}(Y_i)) = E\left[\log\frac{P(Y_i;\xi_1)}{P(Y_i;\xi_0)}\right] = \sum_{y \in R} P(y;\xi)\log\frac{P(y;\xi_1)}{P(y;\xi_0)}$$

Sequential Analysis
– So, the mean sample size is:

$$\frac{(1-\mathrm{P}(\xi))\log\frac{\beta}{1-\alpha} + \mathrm{P}(\xi)\log\frac{1-\beta}{\alpha}}{\sum_{y \in R} P(y;\xi)\log\frac{P(y;\xi_1)}{P(y;\xi_0)}}$$

– Both numerator and denominator depend on ξ; the numerator does so through Ρ(ξ), and so also through θ*
– A generalization applies if the true distribution Q(y) of Y differs from those specified by H0 and H1 (relevant to BLAST)

$$\frac{(1-\mathrm{P}(\xi))\log\frac{\beta}{1-\alpha} + \mathrm{P}(\xi)\log\frac{1-\beta}{\alpha}}{\sum_{y \in R} Q(y)\log\frac{P(y;\xi_1)}{P(y;\xi_0)}}$$

Sequential Analysis
• Example
– Same sequence matching example as before
• H0: p0 = 0.25 (probability of a match is 0.25)
• H1: p1 = 0.35 (probability of a match is 0.35)
• Type I error α and Type II error β both chosen to be 0.01
– Mean sample size equation is:

$$\frac{9.190\,\mathrm{P}(p) - 4.595}{p\log\frac{7}{5} + (1-p)\log\frac{13}{15}}$$

– Mean sample size when H0 is true: 194
– Mean sample size when H1 is true: 182
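The two mean sample sizes can be reproduced from the formula with a short Python sketch. It ignores boundary overshoot, as the text does, and plugs in the approximate power values Ρ = α under H0 and Ρ = 1 - β under H1; the function names are illustrative.

```python
import math

P0, P1 = 0.25, 0.35
ALPHA = BETA = 0.01

def mean_step(p):
    """E[S_{1,0}(Y)] when the true match probability is p."""
    return p * math.log(P1 / P0) + (1 - p) * math.log((1 - P1) / (1 - P0))

def mean_sample_size(p, power):
    """Approximate mean number of observations, ignoring boundary overshoot."""
    numerator = ((1 - power) * math.log(BETA / (1 - ALPHA))
                 + power * math.log((1 - BETA) / ALPHA))
    return numerator / mean_step(p)

under_h0 = mean_sample_size(P0, power=ALPHA)      # H1 accepted with probability alpha
under_h1 = mean_sample_size(P1, power=1 - BETA)   # H1 accepted with probability 1 - beta
print(round(under_h0), round(under_h1))
```

Under H0 both the numerator and the mean step are negative, so the ratio is positive in both cases.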

Sequential Analysis
• Boundary Overshoot
– So far we assumed no boundary overshoot
– In practice, there will almost always be, though
• Exact Type I and Type II errors different from α and β

– Random walk theory can be used to assess how
significant the effects of boundary overshoot are
– It can be shown that the sum of Type I and Type II
errors is always less than α + β (also individually)
– BLAST deals with this in a novel way -> see Ch.
10

