Sesi 9. Hypothesis Testing: Categorical Data
Lecture 10
Hypothesis testing:
Categorical Data Analysis
Biostatistics I: 2017-18
1
[email protected] Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Learni ng Obj ecti ves
1.
Comparison of binomial proportion using Z
and 2 Test.
2.
Explain 2 Test for Independence of 2
variables
3.
Explain The Fisher’s test for independence
4.
McNemar’s tests for correlated data
5.
Kappa Statistic
6.
Use of Computer Program
Biostatistics I: 2017-18
[email protected]
2
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Data Types
Data
Quantitative
Discrete
Continuous
Biostatistics I: 2017-18
[email protected]
Qualitative
3
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Qual i tati ve Data
1.
2.
3.
4.
Qualitative Random Variables Yield
Responses That Can Be Put In Categories.
Example: Sex (Male, Female)
Measurement or Count Reflect # in
Category
Nominal (no order) or Ordinal Scale (order)
Data can be collected as continuous but
recoded to categorical data. Example
(Systolic Blood Pressure - Hypotension,
Normal tension, hypertension )
Biostatistics I: 2017-18
[email protected]
4
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
[email protected]
2 Test
5
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fferences i n
Two Proporti ons
Biostatistics I: 2017-18
6
[email protected] Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypotheses for
Two Proporti ons
Biostatistics I: 2017-18
[email protected]
7
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
Biostatistics I: 2017-18
[email protected]
8
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
p1 - p 2 = 0
p1 - p2 0
Biostatistics I: 2017-18
[email protected]
9
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
[email protected]
10
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
[email protected]
p1 - p2 0
p1 - p2 > 0
11
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
[email protected]
p1 - p2 0
p1 - p2 > 0
12
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fference i n Two
Proporti ons
1. Assumptions
Populations Are Independent
Populations Follow Binomial Distribution
Normal Approximation Can Be Used for large
samples (All Expected Counts 5)
2.
Z-Test Statistic for Two Proportions
Z
pˆ 1 pˆ 2 p1 p2
1 1
ˆp 1 pˆ
n1 n2
X1 X 2
where pˆ
n1 n2
Biostatistics I: 2017-18
[email protected]
13
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Sampl e Di stri buti on for Di fference
Between Proporti ons
12 22
X1 X 2 ~ N 1 2 ;
n
n
1
2
p1 p2
p1 1 p1 p2 1 p2
N p1 p2 ;
n1
n2
1 1
N 0; pq
n1 n2
x1 x2
p
,
n1 n2
under H 0 : p1 p2
Biostatistics I: 2017-18
[email protected]
14
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Thi nki ng Chal l enge
MA
You’re an epidemiologist for the
Department of Health in your
country. You’re studying the
prevalence of disease X in two
provinces (MA and CA). In MA, 74
of 1500 people surveyed were
diseased and in CA, 129 of 1500
were diseased. At .05 level, does
MA have a lower prevalence rate?
Biostatistics I: 2017-18
[email protected]
CA
15
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Biostatistics I: 2017-18
[email protected]
16
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0:
Ha:
=
nMA =
Test Statistic:
nCA =
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
[email protected]
17
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
=
nMA =
nCA =
Critical Value(s):
Test Statistic:
Decision:
Conclusion:
Biostatistics I: 2017-18
[email protected]
18
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
[email protected]
19
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
[email protected]
20
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons Sol uti on*
pˆ MA
74
X MA
.0493
nMA 1500
pˆ CA
X CA 129
.0860
nCA 1500
X MA X CA
74 129
pˆ
.0677
nMA nCA 1500 1500
Z
.0493 .0860 0
1
1
.0677 1 .0677
1500 1500
4.00
Biostatistics I: 2017-18
[email protected]
21
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
[email protected]
22
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
[email protected]
23
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
-1.645 0
Z
Conclusion:
There is evidence MA
is less than CA
Biostatistics I: 2017-18
[email protected]
24
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Between 2 Categori cal
Vari abl es
Biostatistics I: 2017-18
25
[email protected] Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
[email protected]
2 Test
26
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
1.
2.
Shows If a Relationship Exists
Between 2 Qualitative Variables, but
does Not Show Causality
Assumptions
• Multinomial Experiment
• All Expected Counts 5
3.
Uses Two-Way Contingency Table
Biostatistics I: 2017-18
[email protected]
27
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1.
Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Biostatistics I: 2017-18
[email protected]
28
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1. Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Levels of variable 2
Disease
Status
Disease
No disease
Total
Residence
Urban
Rural
63
15
78
49
33
82
Total
112
48
160
Levels of variable 1
Biostatistics I: 2017-18
[email protected]
29
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
Biostatistics I: 2017-18
[email protected]
30
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
Observed count
n Eˆn
2
2
allcells
ij
ij
Eˆnij
Biostatistics I: 2017-18
[email protected]
Expected
count
31
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
n Eˆn
Observed count
2
2
allcells
ij
ij
Eˆnij
Expected
count
Rows Columns
Degrees of Freedom: (r - 1)(c - 1)
Biostatistics I: 2017-18
[email protected]
32
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence Expected
2
Counts
1.
2.
3.
Statistical Independence Means Joint
Probability Equals Product of
Marginal Probabilities
Compute Marginal Probabilities &
Multiply for Joint Probability
Expected Count Is Sample Size Times
Joint Probability
Biostatistics I: 2017-18
[email protected]
33
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Biostatistics I: 2017-18
[email protected]
34
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
Biostatistics I: 2017-18
[email protected]
Total
35
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
[email protected]
36
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
[email protected]
37
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
112 78
Expected count = 160·
160 160
= 54.6
Biostatistics I: 2017-18
[email protected]
Total
38
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Biostatistics I: 2017-18
[email protected]
39
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
aRow totalf aColumn totalf
Sample size
Biostatistics I: 2017-18
[email protected]
40
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
112x78
160
Disease
Status
aRow totalf aColumn totalf
Sample size
Residence
112x82
160
Urban
Rural
Obs. Exp. Obs. Exp. Total
Disease
63
54.6
49
57.4
112
No Disease
15
23.4
33
24.6
48
Total
78
78
82
82
48x78
160Biostatistics I: 2017-18
[email protected]
160
48x82
160
41
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Exampl e on HIV
You randomly sample 286 sexually active
individuals and collect information on their HIV
status and History of STDs. At the .05 level, is
there evidence of a relationship?
HIV
STDs Hx
No
Yes
Total
No
84
48
132
Biostatistics I: 2017-18
[email protected]
Yes
32
122
154
Total
116
170
286
42
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
Biostatistics I: 2017-18
[email protected]
43
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0:
Ha:
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
[email protected]
44
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
[email protected]
45
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
[email protected]
46
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
[email protected]
47
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
E(nij) 5 in all
cells
116x132
286
STDs HX
HIV
154x116
286
No
Yes
Obs. Exp. Obs. Exp. Total
No
84
53.5
32
62.5
116
Yes
48
78.5
122
91.5
170
132
132
154
154
286
Total
170x132
286
Biostatistics I: 2017-18
[email protected]
170x154
286
48
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
2
all cells
c h
c h
nij E nij
E n
ij
a f
a f
n11 E n11
E n
11
84 53.5
53.5
2
2
2
a f
a f
n12 E n12
E n
2
12
32 62.5
62.5
2
122 91.5
91.5
Biostatistics I: 2017-18
[email protected]
a f
a f
n22 E n22
E n
2
22
2
54.29
49
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
2 = 54.29
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
[email protected]
50
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
2
Biostatistics I: 2017-18
[email protected]
51
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
2
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
There is evidence of a
relationship
Biostatistics I: 2017-18
[email protected]
52
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
STATA
tabi 30 18 38 \ 13 7 22, chi2
|
col
row |
1
2
3 |
Total
-----------+---------------------------------+---------1 |
30
18
38 |
86
2 |
13
7
22 |
42
-----------+---------------------------------+---------Total |
43
25
60 |
128
Pearson chi2(2) = 0.7967 Pr = 0.671
Biostatistics I: 2017-18
[email protected]
53
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS CODES
Data dis;
input STDs HIV count;
cards;
1 1 84
1 2 32
2 1 48
2 2 122
;
run;
Proc freq data=dis order=data;
weight Count;
tables STDs*HIV/chisq;
run;
Biostatistics I: 2017-18
[email protected]
54
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS OUTPUT
Statistics for Table of STDs by HIV
Statistic
DF
Value
Prob
------------------------------------------------------Chi-Square
1
54.1502
S
Hypothesis testing:
Categorical Data Analysis
Biostatistics I: 2017-18
1
[email protected] Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Learni ng Obj ecti ves
1.
Comparison of binomial proportion using Z
and 2 Test.
2.
Explain 2 Test for Independence of 2
variables
3.
Explain The Fisher’s test for independence
4.
McNemar’s tests for correlated data
5.
Kappa Statistic
6.
Use of Computer Program
Biostatistics I: 2017-18
[email protected]
2
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Data Types
Data
Quantitative
Discrete
Continuous
Biostatistics I: 2017-18
[email protected]
Qualitative
3
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Qual i tati ve Data
1.
2.
3.
4.
Qualitative Random Variables Yield
Responses That Can Be Put In Categories.
Example: Sex (Male, Female)
Measurement or Count Reflect # in
Category
Nominal (no order) or Ordinal Scale (order)
Data can be collected as continuous but
recoded to categorical data. Example
(Systolic Blood Pressure - Hypotension,
Normal tension, hypertension )
Biostatistics I: 2017-18
[email protected]
4
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
[email protected]
2 Test
5
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fferences i n
Two Proporti ons
Biostatistics I: 2017-18
6
[email protected] Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypotheses for
Two Proporti ons
Biostatistics I: 2017-18
[email protected]
7
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
Biostatistics I: 2017-18
[email protected]
8
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
p1 - p 2 = 0
p1 - p2 0
Biostatistics I: 2017-18
[email protected]
9
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
[email protected]
10
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
[email protected]
p1 - p2 0
p1 - p2 > 0
11
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
[email protected]
p1 - p2 0
p1 - p2 > 0
12
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fference i n Two
Proporti ons
1. Assumptions
Populations Are Independent
Populations Follow Binomial Distribution
Normal Approximation Can Be Used for large
samples (All Expected Counts 5)
2.
Z-Test Statistic for Two Proportions
Z
pˆ 1 pˆ 2 p1 p2
1 1
ˆp 1 pˆ
n1 n2
X1 X 2
where pˆ
n1 n2
Biostatistics I: 2017-18
[email protected]
13
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Sampl e Di stri buti on for Di fference
Between Proporti ons
12 22
X1 X 2 ~ N 1 2 ;
n
n
1
2
p1 p2
p1 1 p1 p2 1 p2
N p1 p2 ;
n1
n2
1 1
N 0; pq
n1 n2
x1 x2
p
,
n1 n2
under H 0 : p1 p2
Biostatistics I: 2017-18
[email protected]
14
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Thi nki ng Chal l enge
MA
You’re an epidemiologist for the
Department of Health in your
country. You’re studying the
prevalence of disease X in two
provinces (MA and CA). In MA, 74
of 1500 people surveyed were
diseased and in CA, 129 of 1500
were diseased. At .05 level, does
MA have a lower prevalence rate?
Biostatistics I: 2017-18
[email protected]
CA
15
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Biostatistics I: 2017-18
[email protected]
16
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0:
Ha:
=
nMA =
Test Statistic:
nCA =
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
[email protected]
17
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
=
nMA =
nCA =
Critical Value(s):
Test Statistic:
Decision:
Conclusion:
Biostatistics I: 2017-18
[email protected]
18
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
[email protected]
19
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
[email protected]
20
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons Sol uti on*
pˆ MA
74
X MA
.0493
nMA 1500
pˆ CA
X CA 129
.0860
nCA 1500
X MA X CA
74 129
pˆ
.0677
nMA nCA 1500 1500
Z
.0493 .0860 0
1
1
.0677 1 .0677
1500 1500
4.00
Biostatistics I: 2017-18
[email protected]
21
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
[email protected]
22
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
[email protected]
23
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
-1.645 0
Z
Conclusion:
There is evidence MA
is less than CA
Biostatistics I: 2017-18
[email protected]
24
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Between 2 Categori cal
Vari abl es
Biostatistics I: 2017-18
25
[email protected] Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
[email protected]
2 Test
26
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
1.
2.
Shows If a Relationship Exists
Between 2 Qualitative Variables, but
does Not Show Causality
Assumptions
• Multinomial Experiment
• All Expected Counts 5
3.
Uses Two-Way Contingency Table
Biostatistics I: 2017-18
[email protected]
27
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1.
Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Biostatistics I: 2017-18
[email protected]
28
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1. Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Levels of variable 2
Disease
Status
Disease
No disease
Total
Residence
Urban
Rural
63
15
78
49
33
82
Total
112
48
160
Levels of variable 1
Biostatistics I: 2017-18
[email protected]
29
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
Biostatistics I: 2017-18
[email protected]
30
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
Observed count
n Eˆn
2
2
allcells
ij
ij
Eˆnij
Biostatistics I: 2017-18
[email protected]
Expected
count
31
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
n Eˆn
Observed count
2
2
allcells
ij
ij
Eˆnij
Expected
count
Rows Columns
Degrees of Freedom: (r - 1)(c - 1)
Biostatistics I: 2017-18
[email protected]
32
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence Expected
2
Counts
1.
2.
3.
Statistical Independence Means Joint
Probability Equals Product of
Marginal Probabilities
Compute Marginal Probabilities &
Multiply for Joint Probability
Expected Count Is Sample Size Times
Joint Probability
Biostatistics I: 2017-18
[email protected]
33
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Biostatistics I: 2017-18
[email protected]
34
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
Biostatistics I: 2017-18
[email protected]
Total
35
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
[email protected]
36
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
[email protected]
37
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
112 78
Expected count = 160·
160 160
= 54.6
Biostatistics I: 2017-18
[email protected]
Total
38
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Biostatistics I: 2017-18
[email protected]
39
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
aRow totalf aColumn totalf
Sample size
Biostatistics I: 2017-18
[email protected]
40
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
112x78
160
Disease
Status
aRow totalf aColumn totalf
Sample size
Residence
112x82
160
Urban
Rural
Obs. Exp. Obs. Exp. Total
Disease
63
54.6
49
57.4
112
No Disease
15
23.4
33
24.6
48
Total
78
78
82
82
48x78
160Biostatistics I: 2017-18
[email protected]
160
48x82
160
41
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Exampl e on HIV
You randomly sample 286 sexually active
individuals and collect information on their HIV
status and History of STDs. At the .05 level, is
there evidence of a relationship?
HIV
STDs Hx
No
Yes
Total
No
84
48
132
Biostatistics I: 2017-18
[email protected]
Yes
32
122
154
Total
116
170
286
42
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
Biostatistics I: 2017-18
[email protected]
43
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0:
Ha:
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
[email protected]
44
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
[email protected]
45
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
[email protected]
46
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
[email protected]
47
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
E(nij) 5 in all
cells
116x132
286
STDs HX
HIV
154x116
286
No
Yes
Obs. Exp. Obs. Exp. Total
No
84
53.5
32
62.5
116
Yes
48
78.5
122
91.5
170
132
132
154
154
286
Total
170x132
286
Biostatistics I: 2017-18
[email protected]
170x154
286
48
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
2
all cells
c h
c h
nij E nij
E n
ij
a f
a f
n11 E n11
E n
11
84 53.5
53.5
2
2
2
a f
a f
n12 E n12
E n
2
12
32 62.5
62.5
2
122 91.5
91.5
Biostatistics I: 2017-18
[email protected]
a f
a f
n22 E n22
E n
2
22
2
54.29
49
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
2 = 54.29
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
[email protected]
50
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
2
Biostatistics I: 2017-18
[email protected]
51
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
2
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
There is evidence of a
relationship
Biostatistics I: 2017-18
[email protected]
52
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
STATA
tabi 30 18 38 \ 13 7 22, chi2
|
col
row |
1
2
3 |
Total
-----------+---------------------------------+---------1 |
30
18
38 |
86
2 |
13
7
22 |
42
-----------+---------------------------------+---------Total |
43
25
60 |
128
Pearson chi2(2) = 0.7967 Pr = 0.671
Biostatistics I: 2017-18
[email protected]
53
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS CODES
Data dis;
input STDs HIV count;
cards;
1 1 84
1 2 32
2 1 48
2 2 122
;
run;
Proc freq data=dis order=data;
weight Count;
tables STDs*HIV/chisq;
run;
Biostatistics I: 2017-18
[email protected]
54
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS OUTPUT
Statistics for Table of STDs by HIV
Statistic
DF
Value
Prob
------------------------------------------------------Chi-Square
1
54.1502
S