Chapter 10 - Hypothesis Testing: Two-Sample Tests
Statistics for Managers Using Microsoft® Excel 5th Edition
Learning Objectives
In this chapter, you learn how to use hypothesis testing for comparing the difference between:
The means of two independent populations
The means of two related populations
The proportions of two independent populations
The variances of two independent populations
Two-Sample Tests Overview
Two-Sample Tests
Examples:
Independent Population Means: Group 1 vs. Group 2
Means, Related Populations: Same group before vs. after treatment
Independent Population Proportions: Proportion 1 vs. Proportion 2
Independent Population Variances: Variance 1 vs. Variance 2
Two-Sample Tests
Independent Population Means: $\sigma_1$ and $\sigma_2$ known; $\sigma_1$ and $\sigma_2$ unknown
Goal: Test hypothesis or form a confidence interval for the difference between two population means, $\mu_1 - \mu_2$
The point estimate for the difference is the difference between sample means: $\bar{X}_1 - \bar{X}_2$
Two-Sample Tests Independent Populations
Different data sources
Independent: Sample selected from one population has no effect on the sample selected from the other population
Use the difference between 2 sample means
Use Z test, pooled variance t test, or separate-variance t test
Two-Sample Tests Independent Populations
Independent Population Means
$\sigma_1$ and $\sigma_2$ known: Use a Z test statistic
$\sigma_1$ and $\sigma_2$ unknown: Use S to estimate unknown σ, use a t test statistic
Two-Sample Tests Independent Populations
Assumptions:
Samples are randomly and independently drawn
Population distributions are normal
Two-Sample Tests Independent Populations
When $\sigma_1$ and $\sigma_2$ are known and both populations are normal, the test statistic is a Z-value and the standard error of $\bar{X}_1 - \bar{X}_2$ is
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Two-Sample Tests Independent Populations
The test statistic is:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
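The Z statistic for two independent means with known population standard deviations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `two_sample_z_test` and every input value below are hypothetical and do not come from the slides; the critical value is taken from the standard normal distribution for a two-tail test.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2,
                      hypothesized_diff=0.0, alpha=0.05):
    """Two-tail Z test for mu1 - mu2 when both population sigmas are known."""
    # Standard error of (X1-bar - X2-bar).
    std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z_stat = ((mean1 - mean2) - hypothesized_diff) / std_error
    # Upper critical value Z_{alpha/2} of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    reject = abs(z_stat) > z_crit
    return z_stat, z_crit, reject


# Hypothetical inputs, chosen only to exercise the formula.
z_stat, z_crit, reject = two_sample_z_test(
    mean1=50.0, mean2=48.0, sigma1=4.0, sigma2=5.0, n1=40, n2=50)
print(f"Z = {z_stat:.3f}, critical values = +/-{z_crit:.3f}, reject H0: {reject}")
```

For a one-tail test, the same statistic would be compared with a single critical value at `alpha` rather than `alpha / 2`.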
Two-Sample Tests Independent Populations
Two Independent Populations, Comparing Means
Lower-tail test: $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$, i.e., $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$
Upper-tail test: $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, i.e., $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$
Two-tail test: $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, i.e., $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$
Two-Sample Tests Independent Populations
Two Independent Populations, Comparing Means
Lower-tail test: $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$. Reject $H_0$ if $Z < -Z_{\alpha}$
Upper-tail test: $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$. Reject $H_0$ if $Z > Z_{\alpha}$
Two-tail test: $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$. Reject $H_0$ if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
Two-Sample Tests Independent Populations
Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown but assumed equal
Two-Sample Tests Independent Populations
Forming interval estimates:
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ
The test statistic is a t value with $(n_1 + n_2 - 2)$ degrees of freedom
Two-Sample Tests Independent Populations
The pooled standard deviation is:
$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}}$$
Two-Sample Tests Independent Populations
The test statistic is:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Where t has $(n_1 + n_2 - 2)$ d.f., and
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
Two-Sample Tests Independent Populations
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in average yield (α = 0.05)?
Two-Sample Tests Independent Populations
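The test statistic is:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$$
where
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$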
Two-Sample Tests Independent Populations
$H_0: \mu_1 - \mu_2 = 0$, i.e., ($\mu_1 = \mu_2$)
$H_1: \mu_1 - \mu_2 \ne 0$, i.e., ($\mu_1 \ne \mu_2$)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154
Test Statistic: t = 2.040
Rejection regions: t < -2.0154 or t > 2.0154 (area .025 in each tail)
Decision: Reject $H_0$ at α = 0.05
Conclusion: There is evidence of a difference in the means.
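The NYSE/NASDAQ calculation above can be reproduced with a short Python sketch. The function name `pooled_variance_t` is hypothetical; the summary statistics and the critical value 2.0154 are the ones quoted in the example (the Python standard library has no t distribution, so the critical value is entered by hand rather than computed).

```python
from math import sqrt


def pooled_variance_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t_stat = ((mean1 - mean2) - hypothesized_diff) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t_stat, df


# NYSE is sample 1, NASDAQ is sample 2 (summary statistics from the example).
sp2, t_stat, df = pooled_variance_t(3.27, 2.53, 1.30, 1.16, 21, 25)

# Two-tail critical value for alpha = 0.05 and 44 d.f., as quoted in the example.
t_crit = 2.0154
reject = abs(t_stat) > t_crit
print(f"Sp^2 = {sp2:.4f}, t = {t_stat:.3f}, d.f. = {df}, reject H0: {reject}")
```

The statistic only just exceeds the critical value, so the decision here is sensitive to rounding in the summary statistics.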
Independent Populations Unequal Variance
If you cannot assume population variances are equal, the pooled-variance t test is inappropriate
Instead, use a separate-variance t test, which includes the two separate sample variances in the computation of the test statistic
The computations are complicated and are best performed using Excel
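The slides do not give the separate-variance formula, so the sketch below is an assumption: it uses the commonly taught form in which each sample variance is divided by its own sample size and the degrees of freedom come from the Welch-Satterthwaite approximation. The function name `separate_variance_t` is hypothetical, and the inputs are the NYSE/NASDAQ summary statistics from the earlier example.

```python
from math import sqrt


def separate_variance_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Return (t statistic, approximate degrees of freedom).

    Assumes the Welch-Satterthwaite form of the separate-variance test.
    """
    v1 = s1 ** 2 / n1  # variance contribution of sample 1
    v2 = s2 ** 2 / n2  # variance contribution of sample 2
    t_stat = ((mean1 - mean2) - hypothesized_diff) / sqrt(v1 + v2)
    # Approximate d.f.; generally not an integer.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


t_stat, df = separate_variance_t(3.27, 2.53, 1.30, 1.16, 21, 25)
print(f"t = {t_stat:.3f}, approximate d.f. = {df:.1f}")
```

The critical value would then be read from a t table (or from Excel) at the approximate degrees of freedom, typically rounded down.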
Two-Sample Tests Independent Populations
The confidence interval for $\mu_1 - \mu_2$ ($\sigma_1$ and $\sigma_2$ known) is:
$$(\bar{X}_1 - \bar{X}_2) \pm Z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
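A minimal sketch of this interval in Python follows. The function name `z_interval_for_difference` and all numeric inputs are hypothetical and serve only to show how the pieces of the formula fit together.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_for_difference(mean1, mean2, sigma1, sigma2, n1, n2,
                              confidence=0.95):
    """Confidence interval for mu1 - mu2 with known population sigmas."""
    # Z value that leaves (1 - confidence) / 2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hypothetical inputs, not taken from the slides.
lower, upper = z_interval_for_difference(50.0, 48.0, 4.0, 5.0, 40, 50)
print(f"95% interval for mu1 - mu2: ({lower:.4f}, {upper:.4f})")
```

An interval that excludes 0 corresponds to rejecting $H_0: \mu_1 - \mu_2 = 0$ in the two-tail test at the matching significance level.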
Two-Sample Tests Independent Populations
The confidence interval for $\mu_1 - \mu_2$ ($\sigma_1$ and $\sigma_2$ unknown) is:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Where
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
Two-Sample Tests Related Populations
Tests Means of 2 Related Populations
Paired or matched samples
Repeated measures (before/after)
Use difference between paired values:
$D = X_1 - X_2$
Eliminates variation among subjects
Assumptions:
Both populations are normally distributed
Two-Sample Tests Related Populations
The ith paired difference is $D_i$, where
$$D_i = X_{1i} - X_{2i}$$
The point estimate for the population mean paired difference is $\bar{D}$:
$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$
Suppose the population standard deviation of the difference scores, $\sigma_D$, is known.
Two-Sample Tests Related Populations
The test statistic for the mean difference is a Z value:
$$Z = \frac{\bar{D} - \mu_D}{\sigma_D / \sqrt{n}}$$
Where
$\mu_D$ = hypothesized mean difference
$\sigma_D$ = population standard deviation of differences
n = the sample size (number of pairs)
Two-Sample Tests Related Populations
If $\sigma_D$ is unknown, you can estimate the unknown population standard deviation with a sample standard deviation:
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
Two-Sample Tests Related Populations
The test statistic for $\bar{D}$ is now a t statistic:
$$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$
Where t has n - 1 d.f. and $S_D$ is:
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
Two-Sample Tests Related Populations
Lower-tail test: $H_0: \mu_D \ge 0$, $H_1: \mu_D < 0$. Reject $H_0$ if $t < -t_{\alpha}$
Upper-tail test: $H_0: \mu_D \le 0$, $H_1: \mu_D > 0$. Reject $H_0$ if $t > t_{\alpha}$
Two-tail test: $H_0: \mu_D = 0$, $H_1: \mu_D \ne 0$. Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
Two-Sample Tests Related Populations Example
Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:
Salesperson    Number of Complaints        Difference, D_i
               Before (1)    After (2)     (2 - 1)
C.B.           6             4             -2
T.F.           20            6             -14
M.H.           3             2             -1
R.K.           0             0             0
M.O.           4             0             -4

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n} = -4.2$$
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Two-Sample Tests Related Populations Example
Has the training made a difference in the number of complaints (at the α = 0.01 level)?
$H_0: \mu_D = 0$
$H_1: \mu_D \ne 0$
Critical Value = ± 4.604, d.f. = n - 1 = 4
Test Statistic:
$$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$
Two-Sample Tests Related Populations Example
Rejection regions: t < -4.604 or t > 4.604 (area α/2 in each tail)
Decision: Do not reject $H_0$ (t statistic is not in the reject region)
Conclusion: There is no evidence of a significant change in the number of complaints
Two-Sample Tests Related Populations
The confidence interval for $\mu_D$ ($\sigma_D$ known) is:
$$\bar{D} \pm Z\frac{\sigma_D}{\sqrt{n}}$$
Where n = the sample size (number of pairs in the paired sample)
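As a rough numerical check of the complaints example, the paired t statistic can be recomputed from the five difference scores. This is a sketch under stated assumptions: the variable names are hypothetical, the critical value 4.604 is the one quoted in the example, and the difference of 0 for R.K. is inferred from the reported mean difference of -4.2 with n = 5 rather than read directly from the table.

```python
from math import sqrt
from statistics import mean, stdev

# Difference scores D_i = After - Before for C.B., T.F., M.H., R.K., M.O.
# R.K.'s value of 0 is an inference from the reported D-bar = -4.2 (n = 5).
diffs = [-2, -14, -1, 0, -4]

n = len(diffs)
d_bar = mean(diffs)   # point estimate of the mean paired difference
s_d = stdev(diffs)    # sample standard deviation (n - 1 in the denominator)
t_stat = (d_bar - 0) / (s_d / sqrt(n))  # hypothesized mean difference is 0

# Two-tail critical value for alpha = 0.01 and 4 d.f., as quoted in the example.
t_crit = 4.604
reject = abs(t_stat) > t_crit
print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject}")
```

Working from the differences alone is enough here because the paired test depends only on the $D_i$ values, not on the individual before and after counts.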
Two-Sample Tests Related Populations
The confidence interval for $\mu_D$ ($\sigma_D$ unknown) is:
$$\bar{D} \pm t_{n-1}\frac{S_D}{\sqrt{n}}$$
where
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
Two Population Proportions
Goal: Test a hypothesis or form a confidence interval for the difference between two independent population proportions, $\pi_1 - \pi_2$
Assumptions:
$n_1\pi_1 \ge 5$, $n_1(1 - \pi_1) \ge 5$
$n_2\pi_2 \ge 5$, $n_2(1 - \pi_2) \ge 5$
The point estimate for the difference is $p_1 - p_2$
Two Population Proportions
Since you begin by assuming the null hypothesis is true, you assume $\pi_1 = \pi_2$ and pool the two sample (p) estimates.
The pooled estimate for the overall proportion is:
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$
where $X_1$ and $X_2$ are the number of successes in samples 1 and 2
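Two Population Proportions
The test statistic for $p_1 - p_2$ is a Z statistic:
$$Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \quad p_1 = \frac{X_1}{n_1}, \quad p_2 = \frac{X_2}{n_2}$$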
Two Population Proportions
Hypothesis for Population Proportions
Lower-tail test: $H_0: \pi_1 \ge \pi_2$, $H_1: \pi_1 < \pi_2$, i.e., $H_0: \pi_1 - \pi_2 \ge 0$, $H_1: \pi_1 - \pi_2 < 0$
Upper-tail test: $H_0: \pi_1 \le \pi_2$, $H_1: \pi_1 > \pi_2$, i.e., $H_0: \pi_1 - \pi_2 \le 0$, $H_1: \pi_1 - \pi_2 > 0$
Two-tail test: $H_0: \pi_1 = \pi_2$, $H_1: \pi_1 \ne \pi_2$, i.e., $H_0: \pi_1 - \pi_2 = 0$, $H_1: \pi_1 - \pi_2 \ne 0$
Two Population Proportions
Hypothesis for Population Proportions
Lower-tail test: $H_0: \pi_1 - \pi_2 \ge 0$, $H_1: \pi_1 - \pi_2 < 0$. Reject $H_0$ if $Z < -Z_{\alpha}$
Upper-tail test: $H_0: \pi_1 - \pi_2 \le 0$, $H_1: \pi_1 - \pi_2 > 0$. Reject $H_0$ if $Z > Z_{\alpha}$
Two-tail test: $H_0: \pi_1 - \pi_2 = 0$, $H_1: \pi_1 - \pi_2 \ne 0$. Reject $H_0$ if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
Two Independent Population Proportions: Example
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
In a random sample of 72 men, 36 indicated they would vote Yes and, in a sample of 50 women, 31 indicated they would vote Yes
Test at the .05 level of significance
Two Independent Population Proportions: Example
$H_0: \pi_1 - \pi_2 = 0$ (the two proportions are equal)
$H_1: \pi_1 - \pi_2 \ne 0$ (there is a significant difference between proportions)
The sample proportions are:
Men: $p_1$ = 36/72 = .50
Women: $p_2$ = 31/50 = .62
The pooled estimate for the overall proportion is:
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549$$
Two Independent Population Proportions: Example
The test statistic for $\pi_1 - \pi_2$ is:
$$Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\frac{1}{72} + \frac{1}{50}\right)}} = -1.31$$
Critical Values = ±1.96 for α = .05
Rejection regions: Z < -1.96 or Z > 1.96 (area .025 in each tail)
Decision: Do not reject $H_0$
Conclusion: There is no evidence of a significant difference in proportions who will vote yes between men and women.
Two Independent Population Proportions
The confidence interval for $\pi_1 - \pi_2$ is:
$$(p_1 - p_2) \pm Z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
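The Proposition A example and the interval for the difference in proportions can be sketched together in Python. The function names `two_proportion_z` and `proportion_diff_interval` are hypothetical; the counts are those given in the example, and the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: pi1 - pi2 = 0 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate under H0
    z_stat = (p1 - p2) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return p1, p2, p_bar, z_stat


def proportion_diff_interval(p1, n1, p2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 (unpooled standard error)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


# Men are sample 1 (36 of 72), women are sample 2 (31 of 50).
p1, p2, p_bar, z_stat = two_proportion_z(36, 72, 31, 50)
z_crit = NormalDist().inv_cdf(0.975)  # two-tail critical value for alpha = .05
reject = abs(z_stat) > z_crit
lower, upper = proportion_diff_interval(p1, 72, p2, 50)
print(f"p-bar = {p_bar:.3f}, Z = {z_stat:.2f}, reject H0: {reject}")
print(f"95% interval for pi1 - pi2: ({lower:.4f}, {upper:.4f})")
```

Note the design difference between the two functions: the test pools the samples because it assumes $\pi_1 = \pi_2$ under the null hypothesis, while the interval keeps the two sample proportions separate.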
Testing Population Variances
Purpose: To determine if two independent populations have the same variability.
Two-tail test: $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$
Lower-tail test: $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$
Upper-tail test: $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$
Testing Population Variances
The F test statistic is:
$$F = \frac{S_1^2}{S_2^2}$$
$S_1^2$ = Variance of Sample 1
$n_1 - 1$ = numerator degrees of freedom
$S_2^2$ = Variance of Sample 2
$n_2 - 1$ = denominator degrees of freedom
Testing Population Variances
The F critical value is found from the F table
There are two appropriate degrees of freedom: numerator and denominator.
In the F table,
numerator degrees of freedom determine the column
denominator degrees of freedom determine the row
Testing Population Variances
Lower-tail test: $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$. Reject $H_0$ if $F < F_L$
Upper-tail test: $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$. Reject $H_0$ if $F > F_U$
Testing Population Variances
Two-tail test
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \ne \sigma_2^2$
The rejection region for a two-tail test is:
$$F = \frac{S_1^2}{S_2^2} > F_U \quad \text{or} \quad F = \frac{S_1^2}{S_2^2} < F_L$$
(area α/2 in each tail; do not reject $H_0$ if $F_L \le F \le F_U$)
Testing Population Variances
To find the critical F values:
1. Find $F_U$ from the F table for $n_1 - 1$ numerator and $n_2 - 1$ denominator degrees of freedom.
2. Find $F_L$ using the formula:
$$F_L = \frac{1}{F_{U*}}$$
Where $F_{U*}$ is from the F table with $n_2 - 1$ numerator and $n_1 - 1$ denominator degrees of freedom (i.e., switch the d.f. from $F_U$)
Testing Population Variances
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number      21      25
Mean        3.27    2.53
Std dev     1.30    1.16

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.05 level?
Testing Population Variances
Form the hypothesis test:
$H_0: \sigma_1^2 - \sigma_2^2 = 0$ (there is no difference between variances)
$H_1: \sigma_1^2 - \sigma_2^2 \ne 0$ (there is a difference between variances)
$F_U$:
Numerator: $n_1 - 1$ = 21 – 1 = 20 d.f.
Denominator: $n_2 - 1$ = 25 – 1 = 24 d.f.
$F_U = F_{.025, 20, 24}$ = 2.33
$F_L$:
Numerator: $n_2 - 1$ = 25 – 1 = 24 d.f.
Denominator: $n_1 - 1$ = 21 – 1 = 20 d.f.
$F_L = 1/F_{.025, 24, 20}$ = 0.41
Testing Population Variances
The test statistic is:
$$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$
Rejection regions: F < $F_L$ = 0.41 or F > $F_U$ = 2.33 (area α/2 = .025 in each tail)
F = 1.256 is not in the rejection region, so we do not reject $H_0$
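The F statistic and the two-tail decision for this example can be checked with a small Python sketch. The function name `f_statistic` is hypothetical, and the critical values 2.33 and 0.41 are entered as quoted in the example because the Python standard library does not provide an F distribution.

```python
def f_statistic(s1, s2):
    """Ratio of sample variances, with sample 1 in the numerator."""
    return s1 ** 2 / s2 ** 2


# NYSE is sample 1 (std dev 1.30), NASDAQ is sample 2 (std dev 1.16).
f_stat = f_statistic(1.30, 1.16)

# Critical values as quoted in the example (from the F table).
f_upper = 2.33  # F_.025 with 20 numerator and 24 denominator d.f.
f_lower = 0.41  # 1 / F_.025 with 24 numerator and 20 denominator d.f.

reject = f_stat < f_lower or f_stat > f_upper
print(f"F = {f_stat:.3f}, reject H0: {reject}")
```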
Conclusion: There is insufficient evidence of a difference in variances at α = .05
Chapter Summary
In this chapter, we have
Compared two independent samples
Performed Z test for the differences in two means
Performed pooled variance t test for the differences in two means
Performed separate-variance t test
Formed confidence intervals for the differences between two means
Compared two related samples (paired samples)
Performed paired sample Z and t tests for the mean difference
Formed confidence intervals for the paired difference
Chapter Summary
In this chapter, we have
Compared two population proportions
Performed Z-test for two population proportions
Formed confidence intervals for the difference between two population proportions
Performed F tests for the difference between two population variances
Used the F table to find F critical values