Categorical Data Analysis Course Material
Contingency Tables
1. Explain the χ² Test of Independence
2. Measures of Association
Contingency Tables
• Tables representing all combinations
of levels of explanatory and response
variables
• Numbers in table represent Counts
of the number of cases in each cell
• Row and column totals are called
Marginal counts
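As a minimal sketch of these definitions, the Python snippet below tallies a contingency table from paired observations and derives the marginal counts. The observations and the names `pairs`, `cells`, `row_totals`, and `col_totals` are hypothetical, made up purely for illustration.

```python
from collections import Counter

# Hypothetical paired observations: (explanatory level, response level).
pairs = [
    ("exposed", "present"), ("exposed", "absent"), ("exposed", "present"),
    ("unexposed", "absent"), ("unexposed", "absent"), ("unexposed", "present"),
    ("exposed", "present"), ("unexposed", "absent"),
]

# Cell counts: number of cases for each combination of levels.
cells = Counter(pairs)

# Marginal counts: row totals, column totals, and the grand total.
row_totals = Counter(row for row, _ in pairs)
col_totals = Counter(col for _, col in pairs)
grand_total = sum(cells.values())

for row in sorted(row_totals):
    counts = [cells[(row, col)] for col in sorted(col_totals)]
    print(row, counts, "row total:", row_totals[row])
print("column totals:", [col_totals[c] for c in sorted(col_totals)], "n =", grand_total)
```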
2x2 Tables
• Each variable has 2 levels
– Explanatory Variable – Groups (Typically
based on demographics, exposure)
– Response Variable – Outcome (Typically
presence or absence of a characteristic)
2x2 Tables - Notation
                Outcome Present   Outcome Absent   Group Total
Group 1         n11               n12              n1.
Group 2         n21               n22              n2.
Outcome Total   n.1               n.2              n..
χ² Test of Independence
• 1. Shows If a Relationship Exists
Between 2 Qualitative Variables
– One Sample Is Drawn
– Does Not Show Causality
• 2. Assumptions
– Multinomial Experiment
– All Expected Counts ≥ 5
• 3. Uses Two-Way Contingency Table
χ² Test of Independence: Contingency Table
• 1. Shows # Observations From 1
Sample Jointly in 2 Qualitative
Variables
[Figure: two-way table with axes labeled "Levels of variable 1" and "Levels of variable 2"]
χ² Test of Independence: Hypotheses & Statistic
• 1. Hypotheses
– H0: Variables Are Independent
– Ha: Variables Are Related (Dependent)
• 2. Test Statistic
$$\chi^2 = \sum_{\text{all cells}} \frac{\left[ n_{ij} - E(n_{ij}) \right]^2}{E(n_{ij})}$$
– n_ij: Observed count
– E(n_ij): Expected count
• Degrees of Freedom: (r - 1)(c - 1), where r = number of Rows and c = number of Columns
χ² Test of Independence: Expected Counts
• 1. Statistical Independence Means
Joint Probability Equals Product of
Marginal Probabilities
• 2. Compute Marginal Probabilities &
Multiply for Joint Probability
• 3. Expected Count Is Sample Size
Times Joint Probability
Expected Count Example
                      Location
House Style   Urban (Obs.)   Rural (Obs.)   Total
Split-Level        63             49         112
Ranch              15             33          48
Total              78             82         160
Marginal probability (Split-Level) = 112/160
Marginal probability (Urban) = 78/160
Joint probability = (112/160)(78/160)
Expected count = 160 · (112/160)(78/160) = 54.6
Expected Count Calculation
Expected count = (Row total)(Column total) / Sample size
                      Location
House Style   Urban (Exp.)   Rural (Exp.)
Split-Level   112·78/160     112·82/160
Ranch          48·78/160      48·82/160
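The shortcut formula can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is an assumption, and the observed counts are those of the house-style example (rows Split-Level and Ranch, columns Urban and Rural).

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Observed counts from the house-style example
# (rows: Split-Level, Ranch; columns: Urban, Rural).
observed = [[63, 49], [15, 33]]
expected = expected_counts(observed)
for row in expected:
    print([round(x, 1) for x in row])
```

The first cell reproduces the 54.6 obtained from the marginal probabilities, and each row and column of expected counts sums to the corresponding observed total.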
χ² Test of Independence: Example
• You’re a marketing research analyst. You
ask a random sample of 286 consumers if
they purchase Diet Pepsi or Diet Coke. At
the .05 level, is there evidence of a
relationship?
                Diet Pepsi
Diet Coke    No    Yes   Total
No           84     32    116
Yes          48    122    170
Total       132    154    286
χ² Test of Independence: Solution
E(nij) ≥ 5 in all cells
                Diet Pepsi
Diet Coke    No (Exp.)              Yes (Exp.)
No           116·132/286 = 53.5     116·154/286 = 62.5
Yes          170·132/286 = 78.5     170·154/286 = 91.5
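χ² Test of Independence: Solution
$$\chi^2 = \sum_{\text{all cells}} \frac{\left[ n_{ij} - E(n_{ij}) \right]^2}{E(n_{ij})} = \frac{\left[ n_{11} - E(n_{11}) \right]^2}{E(n_{11})} + \frac{\left[ n_{12} - E(n_{12}) \right]^2}{E(n_{12})} + \cdots + \frac{\left[ n_{22} - E(n_{22}) \right]^2}{E(n_{22})}$$
$$= \frac{(84 - 53.5)^2}{53.5} + \frac{(32 - 62.5)^2}{62.5} + \cdots + \frac{(122 - 91.5)^2}{91.5} = 54.29$$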
χ² Test of Independence: Solution
• H0: No Relationship
• Ha: Relationship
• α = .05
• df = (2 - 1)(2 - 1) = 1
• Critical Value(s): χ² = 3.841 (reject H0 if χ² > 3.841)
Test Statistic: χ² = 54.29
Decision: Reject at α = .05
Conclusion: There is evidence of a relationship
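The whole solution can be retraced with a short Python sketch. The function name `chi_square_statistic` is an assumption for illustration, and the critical value 3.841 is simply copied from the slide rather than computed. Note that with unrounded expected counts the statistic comes out near 54.15; the slide's 54.29 results from rounding the expected counts to one decimal first. The decision is the same either way.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Rows: Diet Coke (No, Yes); columns: Diet Pepsi (No, Yes).
observed = [[84, 32], [48, 122]]
stat, df = chi_square_statistic(observed)

critical_value = 3.841  # tabled value for df = 1, alpha = .05 (taken from the slide)
reject = stat > critical_value
print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject}")
```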
Siskel and Ebert
           |              Ebert
    Siskel |       Con        Mix        Pro |     Total
-----------+---------------------------------+----------
       Con |        24          8         13 |        45
       Mix |         8         13         11 |        32
       Pro |        10          9         64 |        83
-----------+---------------------------------+----------
     Total |        42         30         88 |       160
Siskel and Ebert (observed counts, with expected counts beneath)
           |              Ebert
    Siskel |       Con        Mix        Pro |     Total
-----------+---------------------------------+----------
       Con |        24          8         13 |        45
           |      11.8        8.4       24.8 |      45.0
-----------+---------------------------------+----------
       Mix |         8         13         11 |        32
           |       8.4        6.0       17.6 |      32.0
-----------+---------------------------------+----------
       Pro |        10          9         64 |        83
           |      21.8       15.6       45.6 |      83.0
-----------+---------------------------------+----------
     Total |        42         30         88 |       160
           |      42.0       30.0       88.0 |     160.0

Pearson chi2(4) = 45.3569   p < 0.001
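As a rough check on the software output, the sketch below recomputes the Pearson statistic for the 3x3 table and its p-value. The helper names are assumptions. The p-value uses the closed-form upper-tail probability of a chi-square variable, which holds only for even degrees of freedom; that restriction is a shortcut chosen here to avoid external libraries, not part of the slide.

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for chi-square X; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Rows: Siskel (Con, Mix, Pro); columns: Ebert (Con, Mix, Pro).
ratings = [[24, 8, 13], [8, 13, 11], [10, 9, 64]]
stat, df = pearson_chi2(ratings)
p_value = chi2_sf_even_df(stat, df)
print(f"Pearson chi2({df}) = {stat:.4f}, p < 0.001: {p_value < 0.001}")
```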
Yates' Statistic
• Method of testing for association for
2x2 tables when sample size is
moderate (total observations
between 6 – 25)
$$\chi^2 = \sum_i \sum_j \frac{\left( \left| O_{ij} - e_{ij} \right| - 0.5 \right)^2}{e_{ij}}$$
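A minimal sketch of the corrected statistic follows. The function name `yates_chi_square` and the 2x2 counts (20 observations in total) are hypothetical, chosen only so the sample size falls in the range quoted on the slide.

```python
def yates_chi_square(table):
    """Continuity-corrected chi-square for a 2x2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each absolute deviation by 0.5 before squaring.
            stat += (abs(observed - expected) - 0.5) ** 2 / expected
    return stat

# Hypothetical 2x2 table with 20 observations.
small_table = [[7, 3], [2, 8]]
corrected = yates_chi_square(small_table)
print(f"Yates-corrected chi-square = {corrected:.4f}")
```

Because each deviation is reduced before squaring, the corrected statistic is smaller than the uncorrected Pearson statistic for the same table, which makes the test more conservative.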
Measures of association
– Relative Risk
– Odds Ratio
– Absolute Risk
Relative Risk
• Ratio of the probability that the outcome
characteristic is present for one group,
relative to the other
• Sample proportions with characteristic
from groups 1 and 2:
$$\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$$
Relative Risk
• Estimated Relative Risk:
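$$RR = \frac{\hat{\pi}_1}{\hat{\pi}_2}$$
95% Confidence Interval for Population Relative Risk:
$$\left( RR \cdot e^{-1.96\sqrt{v}} \;,\; RR \cdot e^{1.96\sqrt{v}} \right)$$
$$e \approx 2.71828 \qquad v = \frac{1 - \hat{\pi}_1}{n_{11}} + \frac{1 - \hat{\pi}_2}{n_{21}}$$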
Relative Risk
• Interpretation
– Conclude that the probability that the outcome
is present is higher (in the population) for group
1 if the entire interval is above 1
– Conclude that the probability that the outcome
is present is lower (in the population) for group 1
if the entire interval is below 1
– Do not conclude that the probability of the
outcome differs for the two groups if the interval
contains 1
Example - Coccidioidomycosis and
TNF-antagonists
• Research Question: Is there a risk of developing Coccidioidomycosis
associated with arthritis therapy?
• Groups: Patients receiving tumor necrosis factor (TNF) antagonists
versus Patients not receiving TNF antagonists (all patients arthritic)
          COC   No COC   Total
TNF         7      240     247
Other       4      734     738
Total      11      974     985
Source: Bergstrom, et al (2004)
Example - Coccidioidomycosis and
TNF-antagonists
• Group 1: Patients on TNF
• Group 2: Patients not on TNF
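$$\hat{\pi}_1 = \frac{7}{247} = .0283 \qquad \hat{\pi}_2 = \frac{4}{738} = .0054$$
$$RR = \frac{\hat{\pi}_1}{\hat{\pi}_2} = \frac{.0283}{.0054} = 5.24 \qquad v = \frac{1 - .0283}{7} + \frac{1 - .0054}{4} = .3874$$
$$95\%\ CI: \left( 5.24\, e^{-1.96\sqrt{.3874}} \;,\; 5.24\, e^{1.96\sqrt{.3874}} \right) = (1.55 \;,\; 17.76)$$
Entire CI above 1 ⇒ Conclude higher risk if on TNF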
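The relative risk and its interval can be reproduced with a short sketch. The function name `relative_risk_ci` and its argument names are assumptions. Because the code keeps the proportions unrounded, it gives roughly RR = 5.23 with an interval near (1.54, 17.71), slightly different from the slide's figures, which start from proportions rounded to four decimals.

```python
import math

def relative_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Estimated relative risk with a large-sample confidence interval."""
    p1 = n11 / n1_total          # sample proportion with outcome, group 1
    p2 = n21 / n2_total          # sample proportion with outcome, group 2
    rr = p1 / p2
    v = (1 - p1) / n11 + (1 - p2) / n21   # estimated variance of ln(RR)
    margin = z * math.sqrt(v)
    return rr, rr * math.exp(-margin), rr * math.exp(margin)

# Group 1: 7 COC cases among 247 on TNF; group 2: 4 cases among 738 not on TNF.
rr, lower, upper = relative_risk_ci(7, 247, 4, 738)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("interval entirely above 1:", lower > 1)
```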
Odds Ratio
• Odds of an event is the probability it occurs
divided by the probability it does not occur
• Odds ratio is the odds of the event for group 1
divided by the odds of the event for group 2
• Sample odds of the outcome for each group:
$$odds_1 = \frac{n_{11} / n_{1.}}{n_{12} / n_{1.}} = \frac{n_{11}}{n_{12}} \qquad odds_2 = \frac{n_{21}}{n_{22}}$$
Odds Ratio
• Estimated Odds Ratio:
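$$OR = \frac{odds_1}{odds_2} = \frac{n_{11} / n_{12}}{n_{21} / n_{22}} = \frac{n_{11} n_{22}}{n_{12} n_{21}}$$
95% Confidence Interval for Population Odds Ratio
$$\left( OR \cdot e^{-1.96\sqrt{v}} \;,\; OR \cdot e^{1.96\sqrt{v}} \right)$$
$$e \approx 2.71828 \qquad v = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}$$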
Odds Ratio
• Interpretation
– Conclude that the probability that the outcome
is present is higher (in the population) for group
1 if the entire interval is above 1
– Conclude that the probability that the outcome
is present is lower (in the population) for group 1
if the entire interval is below 1
– Do not conclude that the probability of the
outcome differs for the two groups if the interval
contains 1
Example - NSAIDs and GBM
• Case-Control Study (Retrospective)
– Cases: 137 Self-Reporting Patients with Glioblastoma
Multiforme (GBM)
– Controls: 401 Population-Based Individuals matched to
cases wrt demographic factors
                  GBM Present   GBM Absent   Total
NSAID User             32           138        170
NSAID Non-User        105           263        368
Total                 137           401        538
Source: Sivak-Sears, et al (2004)
Example - NSAIDs and GBM
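$$OR = \frac{32(263)}{138(105)} = \frac{8416}{14490} = 0.58$$
$$v = \frac{1}{32} + \frac{1}{138} + \frac{1}{105} + \frac{1}{263} = 0.0518$$
$$95\%\ CI: \left( 0.58\, e^{-1.96\sqrt{0.0518}} \;,\; 0.58\, e^{1.96\sqrt{0.0518}} \right) = (0.37 \;,\; 0.91)$$
Interval is entirely below 1, NSAID use appears
to be lower among cases than controls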
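The same calculation, sketched in Python under the assumption of a helper named `odds_ratio_ci` that takes the four cell counts of the 2x2 table:

```python
import math

def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Estimated odds ratio with a large-sample confidence interval."""
    odds_ratio = (n11 * n22) / (n12 * n21)
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22   # estimated variance of ln(OR)
    margin = z * math.sqrt(v)
    return odds_ratio, odds_ratio * math.exp(-margin), odds_ratio * math.exp(margin)

# Rows: NSAID user, non-user; columns: GBM present, GBM absent.
odds_ratio, lower, upper = odds_ratio_ci(32, 138, 105, 263)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("interval entirely below 1:", upper < 1)
```

The odds ratio is the natural measure here because a case-control design fixes the numbers of cases and controls, so the outcome proportions needed for a relative risk cannot be estimated directly.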
Absolute Risk
• Difference between the proportions of individuals
with the outcome characteristic in the 2 groups
• Sample proportions with characteristic
from groups 1 and 2:
$$\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$$
Absolute Risk
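• Estimated Absolute Risk:
$$AR = \hat{\pi}_1 - \hat{\pi}_2$$
95% Confidence Interval for Population Absolute Risk
$$AR \pm 1.96 \sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_{1.}} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_{2.}}}$$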
Absolute Risk
• Interpretation
– Conclude that the probability that the outcome
is present is higher (in the population) for group
1 if the entire interval is positive
– Conclude that the probability that the outcome
is present is lower (in the population) for group 1
if the entire interval is negative
– Do not conclude that the probability of the
outcome differs for the two groups if the interval
contains 0
Example - Coccidioidomycosis and
TNF-antagonists
• Group 1: Patients on TNF
• Group 2: Patients not on TNF
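$$\hat{\pi}_1 = \frac{7}{247} = .0283 \qquad \hat{\pi}_2 = \frac{4}{738} = .0054$$
$$AR = \hat{\pi}_1 - \hat{\pi}_2 = .0283 - .0054 = .0229$$
$$95\%\ CI: .0229 \pm 1.96 \sqrt{\frac{.0283(.9717)}{247} + \frac{.0054(.9946)}{738}} = .0229 \pm .0213 = (0.0016 \;,\; 0.0442)$$
Interval is entirely positive, TNF is
associated with higher risk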
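A brief sketch of the absolute risk interval, assuming a helper named `absolute_risk_ci`. With unrounded proportions the bounds come out at about 0.0016 and 0.044, matching .0229 ± .0213.

```python
import math

def absolute_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Difference in sample proportions with a large-sample confidence interval."""
    p1 = n11 / n1_total
    p2 = n21 / n2_total
    diff = p1 - p2
    std_error = math.sqrt(p1 * (1 - p1) / n1_total + p2 * (1 - p2) / n2_total)
    margin = z * std_error
    return diff, diff - margin, diff + margin

# Group 1: 7 of 247 patients on TNF; group 2: 4 of 738 patients not on TNF.
diff, lower, upper = absolute_risk_ci(7, 247, 4, 738)
print(f"AR = {diff:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
print("interval entirely positive:", lower > 0)
```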
Ordinal Explanatory and Response
Variables
• Pearson’s Chi-square test can be used to
test associations among ordinal variables,
but more powerful methods exist
• When theories exist that the association is
directional (positive or negative), measures
exist to describe and test for these specific
alternatives from independence:
– Gamma
– Kendall’s τb
Concordant and Discordant Pairs
• Concordant Pairs - Pairs of individuals where
one individual scores “higher” on both ordered
variables than the other individual
• Discordant Pairs - Pairs of individuals where
one individual scores “higher” on one ordered
variable and the other individual scores
“lower” on the other
• C = # Concordant Pairs D = # Discordant
Pairs
– Under Positive association, expect C > D
– Under Negative association, expect C < D
– Under No association, expect C ≈ D
Example - Alcohol Use and Sick
Days
• Alcohol Risk (Without Risk, Hardly any Risk,
Some to Considerable Risk)
• Sick Days (0, 1-6, 7+)
• Concordant Pairs - Pairs of respondents
where one scores higher on both alcohol
risk and sick days than the other
• Discordant Pairs - Pairs of respondents
where one scores higher on alcohol risk and
the other scores higher on sick days
Source: Hermansson, et al (2003)
Example - Alcohol Use and Sick
Days
ALCOHOL * SICKDAYS Crosstabulation
Count
                                      SICKDAYS
ALCOHOL                  0 days   1-6 days   7+ days   Total
Without Risk               347       113       145       605
Hardly any Risk            154        63        56       273
Some-Considerable Risk      52        25        34       111
Total                      553       201       235       989
• Concordant Pairs: Each individual in a given cell
is concordant with each individual in cells
“Southeast” of theirs
• Discordant Pairs: Each individual in a given cell is
discordant with each individual in cells “Southwest”
of theirs
Example - Alcohol Use and Sick
Days
C = 347(63 + 56 + 25 + 34) + 113(56 + 34) + 154(25 + 34) + 63(34) = 83164
D = 145(154 + 63 + 52 + 25) + 113(154 + 52) + 56(52 + 25) + 63(52) = 73496
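The "Southeast" and "Southwest" counting rule can be sketched as nested loops over the table. The function name `concordant_discordant` is an assumption. The table is laid out with alcohol risk in rows and sick days in columns, both in increasing order.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair cell (i, j) with every cell in a lower row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:      # "Southeast": higher on both variables
                        concordant += table[i][j] * table[k][m]
                    elif m < j:    # "Southwest": higher on one, lower on the other
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant

# Rows: Without Risk, Hardly any Risk, Some-Considerable Risk.
# Columns: 0 days, 1-6 days, 7+ days.
alcohol_by_sickdays = [[347, 113, 145], [154, 63, 56], [52, 25, 34]]
C, D = concordant_discordant(alcohol_by_sickdays)
print("C =", C, "D =", D)
```

Pairs that tie on either variable (same row or same column) are counted in neither total.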
Measures of Association
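• Goodman and Kruskal’s Gamma:
$$\hat{\gamma} = \frac{C - D}{C + D} \qquad -1 \le \hat{\gamma} \le 1$$
• Kendall’s τb:
$$\hat{\tau}_b = \frac{2(C - D)}{\sqrt{\left( n^2 - \sum_i n_{i.}^2 \right) \left( n^2 - \sum_j n_{.j}^2 \right)}}$$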
When there’s no association between the ordinal variables,
the population based values of these measures are 0.
Statistical software packages provide these tests.
Example - Alcohol Use and Sick
Days
$$\hat{\gamma} = \frac{C - D}{C + D} = \frac{83164 - 73496}{83164 + 73496} = 0.0617$$
Symmetric Measures
                                        Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Ordinal by Ordinal   Kendall's tau-b    .035    .030                   1.187          .235
                     Gamma              .062    .052                   1.187          .235
N of Valid Cases                        989
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
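The two values in the software output can be checked by hand. The sketch below assumes a helper named `gamma_and_tau_b` and uses the tau-b form 2(C - D) divided by the square root of (n² - Σ row totals²)(n² - Σ column totals²). C, D, and the marginal totals are those of the alcohol and sick-days table.

```python
import math

def gamma_and_tau_b(concordant, discordant, row_totals, col_totals):
    """Goodman and Kruskal's gamma and Kendall's tau-b from pair counts and margins."""
    n = sum(row_totals)
    gamma = (concordant - discordant) / (concordant + discordant)
    row_term = n ** 2 - sum(t ** 2 for t in row_totals)
    col_term = n ** 2 - sum(t ** 2 for t in col_totals)
    tau_b = 2 * (concordant - discordant) / math.sqrt(row_term * col_term)
    return gamma, tau_b

gamma, tau_b = gamma_and_tau_b(
    concordant=83164,
    discordant=73496,
    row_totals=[605, 273, 111],   # alcohol risk margins
    col_totals=[553, 201, 235],   # sick-days margins
)
print(f"gamma = {gamma:.3f}, tau-b = {tau_b:.3f}")
```

Tau-b is smaller in magnitude than gamma here because its denominator also accounts for pairs tied on only one variable, which gamma ignores.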