Sesi 9. Hypothesis Testing: Categorical Data
Lecture 10
Hypothesis testing:
Categorical Data Analysis
Biostatistics I: 2017-18
1
sawilopo@yahoo.com Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Learni ng Obj ecti ves
1.
Comparison of binomial proportion using Z
and 2 Test.
2.
Explain 2 Test for Independence of 2
variables
3.
Explain The Fisher’s test for independence
4.
McNemar’s tests for correlated data
5.
Kappa Statistic
6.
Use of Computer Program
Biostatistics I: 2017-18
sawilopo@yahoo.com
2
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Data Types
Data
Quantitative
Discrete
Continuous
Biostatistics I: 2017-18
sawilopo@yahoo.com
Qualitative
3
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Qual i tati ve Data
1.
2.
3.
4.
Qualitative Random Variables Yield
Responses That Can Be Put In Categories.
Example: Sex (Male, Female)
Measurement or Count Reflect # in
Category
Nominal (no order) or Ordinal Scale (order)
Data can be collected as continuous but
recoded to categorical data. Example
(Systolic Blood Pressure - Hypotension,
Normal tension, hypertension )
Biostatistics I: 2017-18
sawilopo@yahoo.com
4
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
sawilopo@yahoo.com
2 Test
5
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fferences i n
Two Proporti ons
Biostatistics I: 2017-18
6
sawilopo@yahoo.com Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypotheses for
Two Proporti ons
Biostatistics I: 2017-18
sawilopo@yahoo.com
7
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
Biostatistics I: 2017-18
sawilopo@yahoo.com
8
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
p1 - p 2 = 0
p1 - p2 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
9
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
10
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
p1 - p2 0
p1 - p2 > 0
11
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
p1 - p2 0
p1 - p2 > 0
12
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fference i n Two
Proporti ons
1. Assumptions
Populations Are Independent
Populations Follow Binomial Distribution
Normal Approximation Can Be Used for large
samples (All Expected Counts 5)
2.
Z-Test Statistic for Two Proportions
Z
pˆ 1 pˆ 2 p1 p2
1 1
ˆp 1 pˆ
n1 n2
X1 X 2
where pˆ
n1 n2
Biostatistics I: 2017-18
sawilopo@yahoo.com
13
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Sampl e Di stri buti on for Di fference
Between Proporti ons
12 22
X1 X 2 ~ N 1 2 ;
n
n
1
2
p1 p2
p1 1 p1 p2 1 p2
N p1 p2 ;
n1
n2
1 1
N 0; pq
n1 n2
x1 x2
p
,
n1 n2
under H 0 : p1 p2
Biostatistics I: 2017-18
sawilopo@yahoo.com
14
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Thi nki ng Chal l enge
MA
You’re an epidemiologist for the
Department of Health in your
country. You’re studying the
prevalence of disease X in two
provinces (MA and CA). In MA, 74
of 1500 people surveyed were
diseased and in CA, 129 of 1500
were diseased. At .05 level, does
MA have a lower prevalence rate?
Biostatistics I: 2017-18
sawilopo@yahoo.com
CA
15
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Biostatistics I: 2017-18
sawilopo@yahoo.com
16
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0:
Ha:
=
nMA =
Test Statistic:
nCA =
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
sawilopo@yahoo.com
17
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
=
nMA =
nCA =
Critical Value(s):
Test Statistic:
Decision:
Conclusion:
Biostatistics I: 2017-18
sawilopo@yahoo.com
18
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
sawilopo@yahoo.com
19
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
sawilopo@yahoo.com
20
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons Sol uti on*
pˆ MA
74
X MA
.0493
nMA 1500
pˆ CA
X CA 129
.0860
nCA 1500
X MA X CA
74 129
pˆ
.0677
nMA nCA 1500 1500
Z
.0493 .0860 0
1
1
.0677 1 .0677
1500 1500
4.00
Biostatistics I: 2017-18
sawilopo@yahoo.com
21
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
sawilopo@yahoo.com
22
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
sawilopo@yahoo.com
23
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
-1.645 0
Z
Conclusion:
There is evidence MA
is less than CA
Biostatistics I: 2017-18
sawilopo@yahoo.com
24
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Between 2 Categori cal
Vari abl es
Biostatistics I: 2017-18
25
sawilopo@yahoo.com Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
sawilopo@yahoo.com
2 Test
26
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
1.
2.
Shows If a Relationship Exists
Between 2 Qualitative Variables, but
does Not Show Causality
Assumptions
• Multinomial Experiment
• All Expected Counts 5
3.
Uses Two-Way Contingency Table
Biostatistics I: 2017-18
sawilopo@yahoo.com
27
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1.
Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Biostatistics I: 2017-18
sawilopo@yahoo.com
28
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1. Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Levels of variable 2
Disease
Status
Disease
No disease
Total
Residence
Urban
Rural
63
15
78
49
33
82
Total
112
48
160
Levels of variable 1
Biostatistics I: 2017-18
sawilopo@yahoo.com
29
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
Biostatistics I: 2017-18
sawilopo@yahoo.com
30
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
Observed count
n Eˆn
2
2
allcells
ij
ij
Eˆnij
Biostatistics I: 2017-18
sawilopo@yahoo.com
Expected
count
31
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
n Eˆn
Observed count
2
2
allcells
ij
ij
Eˆnij
Expected
count
Rows Columns
Degrees of Freedom: (r - 1)(c - 1)
Biostatistics I: 2017-18
sawilopo@yahoo.com
32
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence Expected
2
Counts
1.
2.
3.
Statistical Independence Means Joint
Probability Equals Product of
Marginal Probabilities
Compute Marginal Probabilities &
Multiply for Joint Probability
Expected Count Is Sample Size Times
Joint Probability
Biostatistics I: 2017-18
sawilopo@yahoo.com
33
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Biostatistics I: 2017-18
sawilopo@yahoo.com
34
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
Biostatistics I: 2017-18
sawilopo@yahoo.com
Total
35
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
sawilopo@yahoo.com
36
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
sawilopo@yahoo.com
37
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
112 78
Expected count = 160·
160 160
= 54.6
Biostatistics I: 2017-18
sawilopo@yahoo.com
Total
38
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Biostatistics I: 2017-18
sawilopo@yahoo.com
39
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
aRow totalf aColumn totalf
Sample size
Biostatistics I: 2017-18
sawilopo@yahoo.com
40
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
112x78
160
Disease
Status
aRow totalf aColumn totalf
Sample size
Residence
112x82
160
Urban
Rural
Obs. Exp. Obs. Exp. Total
Disease
63
54.6
49
57.4
112
No Disease
15
23.4
33
24.6
48
Total
78
78
82
82
48x78
160Biostatistics I: 2017-18
sawilopo@yahoo.com
160
48x82
160
41
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Exampl e on HIV
You randomly sample 286 sexually active
individuals and collect information on their HIV
status and History of STDs. At the .05 level, is
there evidence of a relationship?
HIV
STDs Hx
No
Yes
Total
No
84
48
132
Biostatistics I: 2017-18
sawilopo@yahoo.com
Yes
32
122
154
Total
116
170
286
42
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
Biostatistics I: 2017-18
sawilopo@yahoo.com
43
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0:
Ha:
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
44
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
45
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
46
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
47
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
E(nij) 5 in all
cells
116x132
286
STDs HX
HIV
154x116
286
No
Yes
Obs. Exp. Obs. Exp. Total
No
84
53.5
32
62.5
116
Yes
48
78.5
122
91.5
170
132
132
154
154
286
Total
170x132
286
Biostatistics I: 2017-18
sawilopo@yahoo.com
170x154
286
48
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
2
all cells
c h
c h
nij E nij
E n
ij
a f
a f
n11 E n11
E n
11
84 53.5
53.5
2
2
2
a f
a f
n12 E n12
E n
2
12
32 62.5
62.5
2
122 91.5
91.5
Biostatistics I: 2017-18
sawilopo@yahoo.com
a f
a f
n22 E n22
E n
2
22
2
54.29
49
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
2 = 54.29
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
50
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
51
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
2
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
There is evidence of a
relationship
Biostatistics I: 2017-18
sawilopo@yahoo.com
52
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
STATA
tabi 30 18 38 \ 13 7 22, chi2
|
col
row |
1
2
3 |
Total
-----------+---------------------------------+---------1 |
30
18
38 |
86
2 |
13
7
22 |
42
-----------+---------------------------------+---------Total |
43
25
60 |
128
Pearson chi2(2) = 0.7967 Pr = 0.671
Biostatistics I: 2017-18
sawilopo@yahoo.com
53
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS CODES
Data dis;
input STDs HIV count;
cards;
1 1 84
1 2 32
2 1 48
2 2 122
;
run;
Proc freq data=dis order=data;
weight Count;
tables STDs*HIV/chisq;
run;
Biostatistics I: 2017-18
sawilopo@yahoo.com
54
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS OUTPUT
Statistics for Table of STDs by HIV
Statistic
DF
Value
Prob
------------------------------------------------------Chi-Square
1
54.1502
S
Hypothesis testing:
Categorical Data Analysis
Biostatistics I: 2017-18
1
sawilopo@yahoo.com Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Learni ng Obj ecti ves
1.
Comparison of binomial proportion using Z
and 2 Test.
2.
Explain 2 Test for Independence of 2
variables
3.
Explain The Fisher’s test for independence
4.
McNemar’s tests for correlated data
5.
Kappa Statistic
6.
Use of Computer Program
Biostatistics I: 2017-18
sawilopo@yahoo.com
2
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Data Types
Data
Quantitative
Discrete
Continuous
Biostatistics I: 2017-18
sawilopo@yahoo.com
Qualitative
3
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Qual i tati ve Data
1.
2.
3.
4.
Qualitative Random Variables Yield
Responses That Can Be Put In Categories.
Example: Sex (Male, Female)
Measurement or Count Reflect # in
Category
Nominal (no order) or Ordinal Scale (order)
Data can be collected as continuous but
recoded to categorical data. Example
(Systolic Blood Pressure - Hypotension,
Normal tension, hypertension )
Biostatistics I: 2017-18
sawilopo@yahoo.com
4
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
sawilopo@yahoo.com
2 Test
5
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fferences i n
Two Proporti ons
Biostatistics I: 2017-18
6
sawilopo@yahoo.com Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypotheses for
Two Proporti ons
Biostatistics I: 2017-18
sawilopo@yahoo.com
7
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
Biostatistics I: 2017-18
sawilopo@yahoo.com
8
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
Ha
p1 - p 2 = 0
p1 - p2 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
9
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
10
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
p1 - p2 0
p1 - p2 > 0
11
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Hypotheses for
Two Proporti ons
Research Questions
Hypothesis No Difference Pop 1 Pop 2 Pop 1 Pop 2
Any Difference Pop 1 < Pop 2 Pop 1 > Pop 2
H0
p1 - p 2 = 0
Ha
p1 - p2 0
p1 - p2 0
p1 - p 2 < 0
Biostatistics I: 2017-18
sawilopo@yahoo.com
p1 - p2 0
p1 - p2 > 0
12
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Di fference i n Two
Proporti ons
1. Assumptions
Populations Are Independent
Populations Follow Binomial Distribution
Normal Approximation Can Be Used for large
samples (All Expected Counts 5)
2.
Z-Test Statistic for Two Proportions
Z
pˆ 1 pˆ 2 p1 p2
1 1
ˆp 1 pˆ
n1 n2
X1 X 2
where pˆ
n1 n2
Biostatistics I: 2017-18
sawilopo@yahoo.com
13
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Sampl e Di stri buti on for Di fference
Between Proporti ons
12 22
X1 X 2 ~ N 1 2 ;
n
n
1
2
p1 p2
p1 1 p1 p2 1 p2
N p1 p2 ;
n1
n2
1 1
N 0; pq
n1 n2
x1 x2
p
,
n1 n2
under H 0 : p1 p2
Biostatistics I: 2017-18
sawilopo@yahoo.com
14
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Thi nki ng Chal l enge
MA
You’re an epidemiologist for the
Department of Health in your
country. You’re studying the
prevalence of disease X in two
provinces (MA and CA). In MA, 74
of 1500 people surveyed were
diseased and in CA, 129 of 1500
were diseased. At .05 level, does
MA have a lower prevalence rate?
Biostatistics I: 2017-18
sawilopo@yahoo.com
CA
15
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Biostatistics I: 2017-18
sawilopo@yahoo.com
16
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0:
Ha:
=
nMA =
Test Statistic:
nCA =
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
sawilopo@yahoo.com
17
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
=
nMA =
nCA =
Critical Value(s):
Test Statistic:
Decision:
Conclusion:
Biostatistics I: 2017-18
sawilopo@yahoo.com
18
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Conclusion:
Biostatistics I: 2017-18
sawilopo@yahoo.com
19
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
sawilopo@yahoo.com
20
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons Sol uti on*
pˆ MA
74
X MA
.0493
nMA 1500
pˆ CA
X CA 129
.0860
nCA 1500
X MA X CA
74 129
pˆ
.0677
nMA nCA 1500 1500
Z
.0493 .0860 0
1
1
.0677 1 .0677
1500 1500
4.00
Biostatistics I: 2017-18
sawilopo@yahoo.com
21
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
sawilopo@yahoo.com
22
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
Conclusion:
-1.645 0
Z
Biostatistics I: 2017-18
sawilopo@yahoo.com
23
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Z Test for Two Proporti ons
Sol uti on*
Test Statistic:
H0: pMA - pCA = 0
Z = -4.00
Ha: pMA - pCA < 0
= .05
nMA = 1500 nCA = 1500
Critical Value(s):
Decision:
Reject
Reject at = .05
.05
-1.645 0
Z
Conclusion:
There is evidence MA
is less than CA
Biostatistics I: 2017-18
sawilopo@yahoo.com
24
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Between 2 Categori cal
Vari abl es
Biostatistics I: 2017-18
25
sawilopo@yahoo.com Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatistics, Epidemiology and Population Heatlth
Hypothesi s Tests
Qual i tati ve Data
Qualitative
Data
1 pop.
Proportion
2 or more
pop.
Independence
2 pop.
Z Test
Z Test
2 Test
Biostatistics I: 2017-18
sawilopo@yahoo.com
2 Test
26
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
1.
2.
Shows If a Relationship Exists
Between 2 Qualitative Variables, but
does Not Show Causality
Assumptions
• Multinomial Experiment
• All Expected Counts 5
3.
Uses Two-Way Contingency Table
Biostatistics I: 2017-18
sawilopo@yahoo.com
27
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1.
Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Biostatistics I: 2017-18
sawilopo@yahoo.com
28
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Conti ngency Tabl e
1. Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
Levels of variable 2
Disease
Status
Disease
No disease
Total
Residence
Urban
Rural
63
15
78
49
33
82
Total
112
48
160
Levels of variable 1
Biostatistics I: 2017-18
sawilopo@yahoo.com
29
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
Biostatistics I: 2017-18
sawilopo@yahoo.com
30
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
Observed count
n Eˆn
2
2
allcells
ij
ij
Eˆnij
Biostatistics I: 2017-18
sawilopo@yahoo.com
Expected
count
31
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Hypotheses & Stati sti c
1.
Hypotheses
H0: Variables Are Independent
Ha: Variables Are Related (Dependent)
2.
Test Statistic
n Eˆn
Observed count
2
2
allcells
ij
ij
Eˆnij
Expected
count
Rows Columns
Degrees of Freedom: (r - 1)(c - 1)
Biostatistics I: 2017-18
sawilopo@yahoo.com
32
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence Expected
2
Counts
1.
2.
3.
Statistical Independence Means Joint
Probability Equals Product of
Marginal Probabilities
Compute Marginal Probabilities &
Multiply for Joint Probability
Expected Count Is Sample Size Times
Joint Probability
Biostatistics I: 2017-18
sawilopo@yahoo.com
33
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Biostatistics I: 2017-18
sawilopo@yahoo.com
34
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
Biostatistics I: 2017-18
sawilopo@yahoo.com
Total
35
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
sawilopo@yahoo.com
36
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Total
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
Biostatistics I: 2017-18
sawilopo@yahoo.com
37
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Exampl e
112 78
Joint probability =
160 160
Marginal probability = 112
160
Residence
Urban Rural
Obs. Obs.
Disease
Status
Disease
63
49
112
No Disease
15
33
48
78
82
160
Total
78
Marginal probability =
160
112 78
Expected count = 160·
160 160
= 54.6
Biostatistics I: 2017-18
sawilopo@yahoo.com
Total
38
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Biostatistics I: 2017-18
sawilopo@yahoo.com
39
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
aRow totalf aColumn totalf
Sample size
Biostatistics I: 2017-18
sawilopo@yahoo.com
40
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Expected Count Cal cul ati on
Expected count =
112x78
160
Disease
Status
aRow totalf aColumn totalf
Sample size
Residence
112x82
160
Urban
Rural
Obs. Exp. Obs. Exp. Total
Disease
63
54.6
49
57.4
112
No Disease
15
23.4
33
24.6
48
Total
78
78
82
82
48x78
160Biostatistics I: 2017-18
sawilopo@yahoo.com
160
48x82
160
41
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Exampl e on HIV
You randomly sample 286 sexually active
individuals and collect information on their HIV
status and History of STDs. At the .05 level, is
there evidence of a relationship?
HIV
STDs Hx
No
Yes
Total
No
84
48
132
Biostatistics I: 2017-18
sawilopo@yahoo.com
Yes
32
122
154
Total
116
170
286
42
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
Biostatistics I: 2017-18
sawilopo@yahoo.com
43
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0:
Ha:
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
44
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
=
df =
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
45
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
Conclusion:
0
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
46
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
47
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
E(nij) 5 in all
cells
116x132
286
STDs HX
HIV
154x116
286
No
Yes
Obs. Exp. Obs. Exp. Total
No
84
53.5
32
62.5
116
Yes
48
78.5
122
91.5
170
132
132
154
154
286
Total
170x132
286
Biostatistics I: 2017-18
sawilopo@yahoo.com
170x154
286
48
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
2
all cells
c h
c h
nij E nij
E n
ij
a f
a f
n11 E n11
E n
11
84 53.5
53.5
2
2
2
a f
a f
n12 E n12
E n
2
12
32 62.5
62.5
2
122 91.5
91.5
Biostatistics I: 2017-18
sawilopo@yahoo.com
a f
a f
n22 E n22
E n
2
22
2
54.29
49
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Test Statistic:
2 = 54.29
Decision:
Reject
= .05
0
3.841
Conclusion:
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
50
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
2
Biostatistics I: 2017-18
sawilopo@yahoo.com
51
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
Sol uti on
H0: No Relationship
Ha: Relationship
= .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s):
Reject
= .05
0
3.841
2
Test Statistic:
2 = 54.29
Decision:
Reject at = .05
Conclusion:
There is evidence of a
relationship
Biostatistics I: 2017-18
sawilopo@yahoo.com
52
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
STATA
tabi 30 18 38 \ 13 7 22, chi2
|
col
row |
1
2
3 |
Total
-----------+---------------------------------+---------1 |
30
18
38 |
86
2 |
13
7
22 |
42
-----------+---------------------------------+---------Total |
43
25
60 |
128
Pearson chi2(2) = 0.7967 Pr = 0.671
Biostatistics I: 2017-18
sawilopo@yahoo.com
53
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS CODES
Data dis;
input STDs HIV count;
cards;
1 1 84
1 2 32
2 1 48
2 2 122
;
run;
Proc freq data=dis order=data;
weight Count;
tables STDs*HIV/chisq;
run;
Biostatistics I: 2017-18
sawilopo@yahoo.com
54
Universitas Gadjah Mada, Faculty of Medicine, Department of Biostatatics, Epidemiology and Population Health
Test of Independence
2
SAS OUTPUT
Statistics for Table of STDs by HIV
Statistic
DF
Value
Prob
------------------------------------------------------Chi-Square
1
54.1502
S