Chapter 10 - Hypothesis Testing: Two-Sample Tests

  Statistics for Managers Using Microsoft® Excel 5th Edition

Learning Objectives

  In this chapter, you learn how to use hypothesis testing for comparing the difference between:

  - The means of two independent populations
  - The means of two related populations
  - The proportions of two independent populations
  - The variances of two independent populations

Two-Sample Tests Overview

Two-Sample Tests

  Test                                   Example
  Independent population means           Group 1 vs. Group 2
  Means, related populations             Same group before vs. after treatment
  Independent population proportions     Proportion 1 vs. Proportion 2
  Independent population variances       Variance 1 vs. Variance 2

Two-Sample Tests

  Independent Population Means

  Goal: Test a hypothesis or form a confidence interval for the difference between two population means, $\mu_1 - \mu_2$.

  The point estimate for the difference is the difference between the sample means: $\bar{X}_1 - \bar{X}_2$.

  Two cases are considered: $\sigma_1$ and $\sigma_2$ known, and $\sigma_1$ and $\sigma_2$ unknown.

Two-Sample Tests Independent Populations

  - Different data sources
  - Independent: the sample selected from one population has no effect on the sample selected from the other population
  - Use the difference between the 2 sample means
  - Use a Z test, a pooled-variance t test, or a separate-variance t test

  $\sigma_1$ and $\sigma_2$ known: use a Z test statistic.

  $\sigma_1$ and $\sigma_2$ unknown: use S to estimate the unknown $\sigma$, and use a t test statistic.

Assumptions ($\sigma_1$ and $\sigma_2$ known):

  - Samples are randomly and independently drawn
  - Population distributions are normal

  When $\sigma_1$ and $\sigma_2$ are known and both populations are normal, the test statistic is a Z value and the standard error of $\bar{X}_1 - \bar{X}_2$ is

  $$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

  The test statistic is:

  $$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
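  The Z statistic for two independent means with known population standard deviations can be sketched in a few lines of Python. This is only an illustration: the function name `two_sample_z` and all input values are hypothetical and do not come from the chapter, and the critical value is taken from the standard normal distribution in Python's `statistics` module.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Z statistic for mu1 - mu2 when sigma1 and sigma2 are known."""
    # Standard error of (X1bar - X2bar)
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2 - hypothesized_diff) / std_error


# Hypothetical inputs, chosen only to exercise the formula
z = two_sample_z(mean1=52.0, mean2=50.0, sigma1=4.0, sigma2=3.0, n1=40, n2=30)

# Two-tail test at alpha = 0.05: reject H0 if |Z| > Z_{alpha/2}
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"Z = {z:.3f}, critical value = +/-{z_crit:.3f}, reject H0: {reject_h0}")
```

  For a one-tail test, the comparison would use `NormalDist().inv_cdf(1 - alpha)` instead and check only the relevant tail.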

  Two Independent Populations, Comparing Means

  Lower-tail test:
    $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$
    i.e., $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$
    Reject $H_0$ if $Z < -Z_{\alpha}$

  Upper-tail test:
    $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$
    i.e., $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$
    Reject $H_0$ if $Z > Z_{\alpha}$

  Two-tail test:
    $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$
    i.e., $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$
    Reject $H_0$ if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$

Assumptions ($\sigma_1$ and $\sigma_2$ unknown, assumed equal):

  - Samples are randomly and independently drawn
  - Populations are normally distributed
  - Population variances are unknown but assumed equal

Forming interval estimates:

  - The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate $\sigma$
  - The test statistic is a t value with $(n_1 + n_2 - 2)$ degrees of freedom

  The pooled standard deviation is:

  $$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}}$$

  The test statistic is:

  $$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

  where t has $(n_1 + n_2 - 2)$ d.f., and

  $$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

  You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                    NYSE    NASDAQ
  Number            21      25
  Sample mean       3.27    2.53
  Sample std dev    1.30    1.16

  Assuming both populations are approximately normal with equal variances, is there a difference in average yield ($\alpha = 0.05$)?

  The pooled variance is:

  $$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$

  The test statistic is:

  $$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$$

  $H_0: \mu_1 - \mu_2 = 0$, i.e. ($\mu_1 = \mu_2$)

  $H_1: \mu_1 - \mu_2 \ne 0$, i.e. ($\mu_1 \ne \mu_2$)

  $\alpha = 0.05$

  df = 21 + 25 - 2 = 44

  Critical values: $t = \pm 2.0154$ (.025 in each tail)

  Test statistic: $t = 2.040$

  Decision: Reject $H_0$ at $\alpha = 0.05$ (2.040 lies in the upper rejection region, beyond 2.0154)

  Conclusion: There is evidence of a difference in the means.
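  The pooled-variance calculation above can be checked with a short Python sketch. The function names are hypothetical helpers written for this illustration. The critical value 2.0154 is the one quoted in the example for 44 degrees of freedom, not something the code derives.

```python
from math import sqrt


def pooled_variance(s1, s2, n1, n2):
    """Pooled variance S_p^2 from two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))


def pooled_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic and its degrees of freedom."""
    sp2 = pooled_variance(s1, s2, n1, n2)
    std_error = sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2 - hypothesized_diff) / std_error
    return t_stat, n1 + n2 - 2


# NYSE (sample 1) vs. NASDAQ (sample 2) dividend yield data from the example
sp2 = pooled_variance(s1=1.30, s2=1.16, n1=21, n2=25)
t_stat, df = pooled_t(mean1=3.27, mean2=2.53, s1=1.30, s2=1.16, n1=21, n2=25)

T_CRIT = 2.0154  # two-tail critical value for alpha = 0.05, 44 d.f. (from the example)
reject_h0 = abs(t_stat) > T_CRIT

print(f"S_p^2 = {sp2:.4f}, t = {t_stat:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```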

Independent Populations Unequal Variance

  If you cannot assume population variances are equal, the pooled-variance t test is inappropriate.

  Instead, use a separate-variance t test, which includes the two separate sample variances in the computation of the test statistic.

  The computations are complicated and are best performed using Excel.
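  The chapter leaves the separate-variance computation to Excel and does not state the formula. As a rough sketch, the code below uses the standard Welch form of the statistic with the Satterthwaite approximation for the degrees of freedom. This is an assumption about which separate-variance method is meant. The function name is hypothetical, and the inputs are the NYSE/NASDAQ figures from the pooled-variance example, reused purely for illustration.

```python
from math import sqrt


def separate_variance_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Welch separate-variance t statistic with Satterthwaite d.f. (assumed method)."""
    v1 = s1**2 / n1  # variance of the first sample mean
    v2 = s2**2 / n2  # variance of the second sample mean
    t_stat = (mean1 - mean2 - hypothesized_diff) / sqrt(v1 + v2)
    # Approximate degrees of freedom; generally not an integer
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df


t_stat, df = separate_variance_t(mean1=3.27, mean2=2.53, s1=1.30, s2=1.16, n1=21, n2=25)
print(f"t = {t_stat:.3f}, approximate d.f. = {df:.1f}")
```

  The resulting statistic would be compared with a t critical value for the approximate degrees of freedom, which still has to come from a table or from software.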

  The confidence interval for $\mu_1 - \mu_2$ ($\sigma_1$ and $\sigma_2$ known) is:

  $$(\bar{X}_1 - \bar{X}_2) \pm Z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

  The confidence interval for $\mu_1 - \mu_2$ ($\sigma_1$ and $\sigma_2$ unknown, assumed equal) is:

  $$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

  where

  $$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

Two-Sample Tests Related Populations

  Tests Means of 2 Related Populations

  - Paired or matched samples
  - Repeated measures (before/after)
  - Use the difference between paired values: $D = X_1 - X_2$
  - Eliminates variation among subjects

  Assumptions:

  - Both populations are normally distributed

  The ith paired difference is $D_i$, where

  $$D_i = X_{1i} - X_{2i}$$

  The point estimate for the population mean paired difference is $\bar{D}$:

  $$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

  Suppose the population standard deviation of the difference scores, $\sigma_D$, is known. The test statistic for the mean difference is a Z value:

  $$Z = \frac{\bar{D} - \mu_D}{\sigma_D / \sqrt{n}}$$

  where
    $\mu_D$ = hypothesized mean difference
    $\sigma_D$ = population standard deviation of differences
    n = the sample size (number of pairs)

  If $\sigma_D$ is unknown, you can estimate the unknown population standard deviation with a sample standard deviation:

  $$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

  The test statistic for $\bar{D}$ is now a t statistic:

  $$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

  where t has n - 1 d.f. and $S_D$ is the sample standard deviation of the differences given above.

  Lower-tail test:
    $H_0: \mu_D \ge 0$, $H_1: \mu_D < 0$
    Reject $H_0$ if $t < -t_{\alpha}$

  Upper-tail test:
    $H_0: \mu_D \le 0$, $H_1: \mu_D > 0$
    Reject $H_0$ if $t > t_{\alpha}$

  Two-tail test:
    $H_0: \mu_D = 0$, $H_1: \mu_D \ne 0$
    Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$

Two-Sample Tests Related Populations Example

  Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

  Salesperson    Number of Complaints        Difference, D_i
                 Before (1)    After (2)     (2 - 1)
  C.B.           6             4             -2
  T.F.           20            6             -14
  M.H.           3             2             -1
  R.K.           0             0             0
  M.O.           4             0             -4

  $$\bar{D} = \frac{\sum D_i}{n} = -4.2$$

  $$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$

  Has the training made a difference in the number of complaints (at the $\alpha = 0.01$ level)?

  $H_0: \mu_D = 0$

  $H_1: \mu_D \ne 0$

  d.f. = n - 1 = 4

  Critical value = ± 4.604

  Test statistic:

  $$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

  Decision: Do not reject $H_0$ (the t statistic, -1.66, is not in the rejection region, which lies below -4.604 and above 4.604)

  Conclusion: There is no evidence of a significant change in the number of complaints.

  The confidence interval for $\mu_D$ ($\sigma_D$ known) is:

  $$\bar{D} \pm Z\frac{\sigma_D}{\sqrt{n}}$$

  where n = the sample size (number of pairs in the paired sample)
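  The paired t test from the complaints example can be reproduced with a small Python sketch. Two things here are assumptions rather than quotations from the slides. First, the list of differences takes R.K.'s difference to be 0, which is the value implied by the reported mean difference of -4.2 for five salespeople. Second, the variable names are hypothetical. The critical value 4.604 is the one quoted in the example.

```python
from math import sqrt
from statistics import mean, stdev

# Differences D_i = After - Before for the five salespeople.
# Assumption: R.K.'s difference is 0, as implied by the reported mean of -4.2.
differences = [-2, -14, -1, 0, -4]

n = len(differences)
d_bar = mean(differences)      # point estimate of the mean paired difference
s_d = stdev(differences)       # sample standard deviation (n - 1 denominator)
t_stat = (d_bar - 0) / (s_d / sqrt(n))

T_CRIT = 4.604  # two-tail critical value for alpha = 0.01, 4 d.f. (from the example)
reject_h0 = abs(t_stat) > T_CRIT

print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```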

  The confidence interval for $\mu_D$ ($\sigma_D$ unknown) is:

  $$\bar{D} \pm t_{n-1}\frac{S_D}{\sqrt{n}}$$

  where

  $$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

Two Population Proportions

  Goal: Test a hypothesis or form a confidence interval for the difference between two independent population proportions, $\pi_1 - \pi_2$

  Assumptions:

    $n_1\pi_1 \ge 5$, $n_1(1 - \pi_1) \ge 5$
    $n_2\pi_2 \ge 5$, $n_2(1 - \pi_2) \ge 5$

  The point estimate for the difference is $p_1 - p_2$.

  Since you begin by assuming the null hypothesis is true, you assume $\pi_1 = \pi_2$ and pool the two sample (p) estimates.

  The pooled estimate for the overall proportion is:

  $$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

  where $X_1$ and $X_2$ are the number of successes in samples 1 and 2

  The test statistic for $p_1 - p_2$ is a Z statistic:

  $$Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

  where

  $$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \quad p_1 = \frac{X_1}{n_1}, \quad p_2 = \frac{X_2}{n_2}$$

  Hypothesis for Population Proportions

  Lower-tail test:
    $H_0: \pi_1 \ge \pi_2$, $H_1: \pi_1 < \pi_2$
    i.e., $H_0: \pi_1 - \pi_2 \ge 0$, $H_1: \pi_1 - \pi_2 < 0$
    Reject $H_0$ if $Z < -Z_{\alpha}$

  Upper-tail test:
    $H_0: \pi_1 \le \pi_2$, $H_1: \pi_1 > \pi_2$
    i.e., $H_0: \pi_1 - \pi_2 \le 0$, $H_1: \pi_1 - \pi_2 > 0$
    Reject $H_0$ if $Z > Z_{\alpha}$

  Two-tail test:
    $H_0: \pi_1 = \pi_2$, $H_1: \pi_1 \ne \pi_2$
    i.e., $H_0: \pi_1 - \pi_2 = 0$, $H_1: \pi_1 - \pi_2 \ne 0$
    Reject $H_0$ if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$

Two Independent Population Proportions: Example

  Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

  In a random sample of 72 men, 36 indicated they would vote Yes and, in a sample of 50 women, 31 indicated they would vote Yes.

  Test at the .05 level of significance.

  $H_0: \pi_1 - \pi_2 = 0$ (the two proportions are equal)

  $H_1: \pi_1 - \pi_2 \ne 0$ (there is a significant difference between proportions)

  The sample proportions are:

    Men: $p_1 = 36/72 = .50$
    Women: $p_2 = 31/50 = .62$

  The pooled estimate for the overall proportion is:

  $$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549$$

  The test statistic for $\pi_1 - \pi_2$ is:

  $$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\frac{1}{72} + \frac{1}{50}\right)}} = -1.31$$

  Critical values = ±1.96 for $\alpha = .05$ (.025 in each tail)

  Decision: Do not reject $H_0$

  Conclusion: There is no evidence of a significant difference between men and women in the proportion who will vote Yes.
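  As a cross-check of the Proposition A example, the pooled Z test for two proportions can be sketched in Python. The function name `two_proportion_z` is a hypothetical helper for this illustration, and the critical value is computed from the standard normal distribution in the `statistics` module rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: pi1 - pi2 = 0, plus the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error, p_pooled


# Men: 36 of 72 vote Yes; women: 31 of 50 vote Yes (data from the example)
z, p_pooled = two_proportion_z(x1=36, n1=72, x2=31, n2=50)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"pooled p = {p_pooled:.3f}, Z = {z:.2f}, critical = +/-{z_crit:.2f}, reject H0: {reject_h0}")
```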

Two Independent Population Proportions

  The confidence interval for $\pi_1 - \pi_2$ is:

  $$(p_1 - p_2) \pm Z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

Testing Population Variances

  Purpose: To determine if two independent populations have the same variability.

  Two-tail test:    $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$
  Lower-tail test:  $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$
  Upper-tail test:  $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$

  The F test statistic is:

  $$F = \frac{S_1^2}{S_2^2}$$

  where
    $S_1^2$ = variance of sample 1
    $n_1 - 1$ = numerator degrees of freedom
    $S_2^2$ = variance of sample 2
    $n_2 - 1$ = denominator degrees of freedom

  The F critical value is found from the F table.

  - There are two appropriate degrees of freedom: numerator and denominator.
  - In the F table, numerator degrees of freedom determine the column and denominator degrees of freedom determine the row.

  Lower-tail test:
    $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$
    Reject $H_0$ if $F < F_L$

  Upper-tail test:
    $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$
    Reject $H_0$ if $F > F_U$

  Two-tail test:
    $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$
    The rejection region for a two-tail test, with $\alpha/2$ in each tail, is:

    $$F = \frac{S_1^2}{S_2^2} > F_U \quad \text{or} \quad F = \frac{S_1^2}{S_2^2} < F_L$$

  To find the critical F values:

  1. Find $F_U$ from the F table for $n_1 - 1$ numerator and $n_2 - 1$ denominator degrees of freedom.

  2. Find $F_L$ using the formula:

  $$F_L = \frac{1}{F_{U*}}$$

  where $F_{U*}$ is from the F table with $n_2 - 1$ numerator and $n_1 - 1$ denominator degrees of freedom (i.e., switch the d.f. from $F_U$)

  You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

               NYSE    NASDAQ
  Number       21      25
  Mean         3.27    2.53
  Std dev      1.30    1.16

  Is there a difference in the variances between the NYSE & NASDAQ at the $\alpha = 0.05$ level?
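  One way to sketch this comparison in Python is shown below. The function name is hypothetical, and the two critical values are not computed by the code: they are the F table values quoted in this chapter's solution for 20 and 24 degrees of freedom with .025 in each tail (F_L = 0.41, F_U = 2.33).

```python
def f_statistic(s1, s2):
    """F statistic as the ratio of sample variances, S1^2 / S2^2."""
    return s1**2 / s2**2


# Sample standard deviations: NYSE (sample 1) and NASDAQ (sample 2)
f_stat = f_statistic(s1=1.30, s2=1.16)

# Two-tail critical values taken from the chapter's F table lookup
F_L, F_U = 0.41, 2.33
reject_h0 = f_stat < F_L or f_stat > F_U

print(f"F = {f_stat:.3f}, rejection region: F < {F_L} or F > {F_U}, reject H0: {reject_h0}")
```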

  Form the hypothesis test:

    $H_0: \sigma_1^2 - \sigma_2^2 = 0$ (there is no difference between variances)
    $H_1: \sigma_1^2 - \sigma_2^2 \ne 0$ (there is a difference between variances)

  $F_U$:
    Numerator: $n_1 - 1 = 21 - 1 = 20$ d.f.
    Denominator: $n_2 - 1 = 25 - 1 = 24$ d.f.
    $F_U = F_{.025, 20, 24} = 2.33$

  $F_L$:
    Numerator: $n_2 - 1 = 25 - 1 = 24$ d.f.
    Denominator: $n_1 - 1 = 21 - 1 = 20$ d.f.
    $F_L = 1/F_{.025, 24, 20} = 0.41$

  The test statistic is:

  $$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

  The rejection regions are $F < F_L = 0.41$ and $F > F_U = 2.33$, with $\alpha/2 = .025$ in each tail.

  F = 1.256 is not in the rejection region, so we do not reject $H_0$.

  Conclusion: There is insufficient evidence of a difference in variances at $\alpha = .05$

Chapter Summary

  In this chapter, we have

  - Compared two independent samples
      - Performed Z test for the differences in two means
      - Performed pooled variance t test for the differences in two means
      - Performed separate-variance t test
      - Formed confidence intervals for the differences between two means
  - Compared two related samples (paired samples)
      - Performed paired sample Z and t tests for the mean difference
      - Formed confidence intervals for the paired difference
  - Compared two population proportions
      - Formed confidence intervals for the difference between two population proportions
      - Performed Z-test for two population proportions
  - Performed F tests for the difference between two population variances
  - Used the F table to find F critical values