Week 10 Hypothesis Testing on Two Samples UMN Lecturer Team

  RSO@9/9/2012 Probabilistic and Statistics

  Objectives

  • Last week, we studied hypothesis testing of a single mean against a particular value
  • Today, we learn about hypothesis testing on the difference between two means

  Why we need two samples

  • The experiment is carried out on two different samples:
    • – An engineer wants to test aluminum quality under two different temperatures.
    • – A UMN student wants to compare the results of an information system implementation between two different groups of users: Marketing and Finance

  Independent samples

  Two samples are independent when they are taken from two different populations and the members of the first sample are in no way related to the members of the other sample. Example:

  • A student wants to test the quality of light bulbs from two brands → two samples from different populations → to find the differences between the two populations
  • A student takes a group of Papua students and a group of Java students and tests whether they have the same consumption behavior → two samples from different populations → to find the similarities between them

  How do we know that our samples are different from each other?

  • The difference between two samples is shown by the difference between their means
  • The sample means must be (approximately) normally distributed
  • This holds when both sample sizes are at least 30
  • In that case, the shape of the populations does not matter
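The sample-size condition above can be checked with a small simulation. This is only a sketch: the two exponential populations (means 2 and 1), the 2000 repetitions, and the function name `simulate_mean_differences` are assumptions chosen for illustration, not part of the lecture.

```python
import random
import statistics

def simulate_mean_differences(n1=30, n2=30, reps=2000, seed=0):
    """Draw repeated pairs of samples from two skewed (exponential)
    populations and return the differences of their sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Assumed populations: exponential with mean 2 and with mean 1.
        s1 = [rng.expovariate(0.5) for _ in range(n1)]
        s2 = [rng.expovariate(1.0) for _ in range(n2)]
        diffs.append(statistics.mean(s1) - statistics.mean(s2))
    return diffs

diffs = simulate_mean_differences()
center = statistics.mean(diffs)    # expected to be near 2 - 1 = 1
spread = statistics.stdev(diffs)   # expected to be near sqrt(4/30 + 1/30)
print(f"mean of differences: {center:.3f}, standard deviation: {spread:.3f}")
```

Even though both populations are strongly skewed, the differences of the sample means cluster symmetrically around the true difference, which is what allows the z formula to be used with n ≥ 30.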

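  Central limit theorem

  Formula for independent samples and known population standard deviations (10.1):

  $$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$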
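For two independent samples with known population standard deviations, the z statistic is the difference of the sample means minus the hypothesized difference, divided by sqrt(σ1²/n1 + σ2²/n2). A minimal sketch follows; the function name `two_sample_z` and the numbers in the call are hypothetical, chosen only to show the arithmetic.

```python
import math

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference of two means, known sigmas."""
    # Standard error of (xbar1 - xbar2) for independent samples.
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / se

# Hypothetical summary statistics, for illustration only.
z = two_sample_z(xbar1=10.0, xbar2=8.0, sigma1=3.0, sigma2=4.0, n1=36, n2=64)
print(round(z, 3))
```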

  Example

  • We want to test whether the mean salary of advertising managers differs from the mean salary of auditing managers
  • Here is the data we have:

  Steps of hypothesis testing

  • Step 1: Hypothesize
  • Step 2: Choose a proper statistical test
  • Step 3: Specify the value of α
  • Step 4: Make the decision region
  • Step 5: Gather sample data
  • Step 6: Analyze the sample data
  • Step 7: Statistical conclusion
  • Step 8: Make a business decision

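  Step-1

  • Hypothesize:
    • – When we don't care how large the difference between them is, use $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \neq \mu_2$
    • – When we want to know how large the difference between them is, use $H_0: \mu_1 - \mu_2 = \delta$ and $H_1: \mu_1 - \mu_2 \neq \delta$, where $\delta$ is the hypothesized difference

  • Let's take them one at a time, starting with the first pair:
  • $H_0: \mu_1 = \mu_2$
  • $H_1: \mu_1 \neq \mu_2$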

  Step-2: Choose a proper statistical test

  Differences between population means are tested.

  z distribution is used when:

  • data is normally distributed
  • σ is known
  • Sample statistic = $\bar{x}_1 - \bar{x}_2$

  t distribution is used when:

  • data is normally distributed
  • σ is unknown
  • Sample statistic = $\bar{x}_1 - \bar{x}_2$

  Step-3 Specify the value of α

  When it is not mentioned, use α = 5%

  Step 4: making decision region

  • Because it is a two-tailed test, we use α/2 = 0.025 → z = ±1.96
  • Decision region:
    • – Rejection region: Z < -1.96 or Z > 1.96
    • – Non-rejection region: -1.96 ≤ Z ≤ 1.96
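The critical value ±1.96 comes from the standard normal distribution with α/2 = 0.025 in each tail. A short sketch of that lookup with Python's standard library follows; the helper names are hypothetical.

```python
from statistics import NormalDist

def two_tailed_z_critical(alpha):
    """Positive critical z that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def in_rejection_region(z, alpha=0.05):
    """True when z falls outside [-z_crit, +z_crit]."""
    return abs(z) > two_tailed_z_critical(alpha)

z_crit = two_tailed_z_critical(0.05)
print(round(z_crit, 2))  # 1.96
```

Using `inv_cdf` replaces the z-table lookup; the table in a textbook gives the same value rounded to two decimals.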

  Step 5: gather sample data

  • Compute any information about the sample we have:

  Step 6: analyze the sample data

  • Compute the test statistic
  • Under the null hypothesis we assume $\mu_1 = \mu_2$, so $\mu_1 - \mu_2 = 0$ in the z formula

  Step 7: Statistical Conclusion

  • Rejection region: Z < -1.96 or Z > 1.96
  • z = 2.35
  • → z > 1.96
  • → z is in the rejection region
  • → the null hypothesis is rejected
  • → $H_a: \mu_1 \neq \mu_2$ is accepted
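The test above can be retraced numerically. Only the two sample means (70.700 and 62.187) and the result z = 2.35 appear in these notes; the sample sizes and population standard deviations below are ASSUMED values, picked because they are consistent with the reported z, and should be replaced with the real data table.

```python
import math

# From the notes: the two sample means (in the units used on the slide).
xbar_adv, xbar_aud = 70.700, 62.187

# ASSUMED for illustration only; the data table is missing from the notes.
n_adv, sigma_adv = 32, 16.253
n_aud, sigma_aud = 34, 12.900

# Standard error of the difference, then z under H0: mu1 - mu2 = 0.
se = math.sqrt(sigma_adv**2 / n_adv + sigma_aud**2 / n_aud)
z = (xbar_adv - xbar_aud - 0.0) / se

# Two-tailed decision at alpha = 5% (critical value 1.96).
reject_h0 = abs(z) > 1.96
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```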

  Step 8: Making business decision

  • $\bar{x}_1 = \$70.700$ (for advertising managers) and $\bar{x}_2 = \$62.187$ (for auditing managers)
  • It is true that the mean salary of advertising managers differs from the mean salary of auditing managers
  • Because $\bar{x}_1 > \bar{x}_2$, it can be concluded that:
  • Advertising managers' mean salary is higher than auditing managers'

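  Confidence interval to estimate $\mu_1 - \mu_2$

  How big is the difference? We can use a confidence interval to estimate how big the difference is, with the following formula:

  $$(\bar{x}_1 - \bar{x}_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$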
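A confidence interval for µ1 − µ2 with known population standard deviations is the observed difference plus or minus z times the standard error. The sketch below assumes that standard form; the function name and the numbers in the call are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both sigmas are known."""
    # Critical z that leaves (1 - confidence)/2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Hypothetical summary statistics, for illustration only.
low, high = z_confidence_interval(10.0, 8.0, 3.0, 4.0, 36, 64, confidence=0.95)
print(f"{low:.3f} <= mu1 - mu2 <= {high:.3f}")
```

If the interval does not contain 0, the two-tailed test at the matching α rejects the null hypothesis of equal means.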

  Example Answer

  • For CI = 98% → α = 2% → $P(-z_1 < Z < z_1) = 98\%$ → $P(0 < Z < z_1) = 49\%$ → $z_1 = 2.33$
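The chain from a 98% confidence level to z = 2.33 can be reproduced without a z-table. A small sketch using the standard library:

```python
from statistics import NormalDist

confidence = 0.98
alpha = 1 - confidence            # 2% split over the two tails
area_zero_to_z = confidence / 2   # P(0 < Z < z1) = 49%

# P(Z < z1) = 0.5 + 0.49 = 0.99, so invert the standard normal CDF there.
z1 = NormalDist().inv_cdf(0.5 + area_zero_to_z)
print(round(z1, 2))  # 2.33
```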

  Self Study Homework

  • For Monday class:
    • – Do 10.4 if your NIM ends with an even number
    • – Do 10.5 if your NIM ends with an odd number

  • For Thursday class:
    • – Do 10.6 if your NIM ends with 0, 3, 6, or 9
    • – Do 10.7 if your NIM ends with 1, 4, or 7
    • – Do 10.8 if your NIM ends with 2, 5, or 8

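  Estimating differences between two means when their variances are unknown

  • We don't know $\sigma_1$ and $\sigma_2$. So what?
  • We can assume that $\sigma_1 = \sigma_2$.
  • By assuming this, we can use this formula:

  $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad df = n_1 + n_2 - 2$$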
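Under the equal-variance assumption, the two sample variances are pooled and the t statistic has n1 + n2 − 2 degrees of freedom. The sketch below uses the summary values that appear in the training example later in these notes (means 47.73 and 56.50, sizes 15 and 12); reading 19.495 and 18.273 as sample variances is an assumption based on how they are used in the notes' arithmetic. The function name is hypothetical.

```python
import math

def pooled_t(xbar1, xbar2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = (var1 * (n1 - 1) + var2 * (n2 - 1)) / df
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / se, df

# Method A vs. method B, summary values as read from the notes.
t, df = pooled_t(47.73, 56.50, 19.495, 18.273, 15, 12)
print(f"t = {t:.2f}, df = {df}")
```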

  Example

  A company runs a training program. The training is delivered using two different methods: method A and method B. The HR manager wants to know whether there is a significant difference between the two methods. The table below shows the training results:

  Step-1

  • Hypothesize: $H_0: \mu_1 = \mu_2$ and $H_a: \mu_1 \neq \mu_2$

  Step-2: Choose a proper statistical test

  Differences between population means are tested.

  z distribution is used when:

  • data is normally distributed
  • σ is known
  • Sample statistic = $\bar{x}_1 - \bar{x}_2$

  t distribution is used when:

  • data is normally distributed
  • σ is unknown
  • Sample statistic = $\bar{x}_1 - \bar{x}_2$

  Step-3 Specify the value of α

  When it is not mentioned, use α = 5%

  Step 4: making decision region

  • Because it is a two-tailed test, we use α/2 = 0.025 and df = $n_1 + n_2 - 2$ = 15 + 12 - 2 = 25
  • $t_{0.025,25} = \pm 2.060$
  • Decision region:
    • – Rejection region: t < -2.060 or t > 2.060
    • – Non-rejection region: -2.060 ≤ t ≤ 2.060
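The decision region above reduces to one comparison of the test statistic against the table value. A minimal sketch, with a hypothetical helper name; the critical values are the table values quoted in these notes rather than computed ones, because the Python standard library has no t distribution.

```python
def two_tailed_decision(statistic, critical_value):
    """Compare a test statistic with a positive two-tailed critical value."""
    if abs(statistic) > critical_value:
        return "reject H0"
    return "do not reject H0"

n1, n2 = 15, 12
df = n1 + n2 - 2        # 25 degrees of freedom
t_critical = 2.060      # table value t_{0.025,25} quoted in the notes

decision = two_tailed_decision(-5.20, t_critical)
print(df, decision)
```

The same helper covers the z tests and the dependent-samples test: only the statistic and the table value change.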

  Step 5: gather sample data

  • Compute any information about the sample we have:

  Step 6: analyze the sample data

  • Compute the test statistic
  • Under the null hypothesis we assume $\mu_1 = \mu_2$, so $\mu_1 - \mu_2 = 0$ in the t formula

  Step 7: Statistical Conclusion

  • Rejection region: t < -2.060 or t > 2.060
  • t = -5.20
  • → t < -2.060
  • → t is in the rejection region
  • → the null hypothesis is rejected
  • → $H_a$ is accepted

  Step 8: Making business decision

  • The means differ significantly: $\mu_1 \neq \mu_2$
  • $\bar{x}_1$ = 47.73 (method A) and $\bar{x}_2$ = 56.5 (method B)
  • Because $\bar{x}_1 < \bar{x}_2$, it can be concluded that:
  • Method B is more effective than method A

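  How big is the difference?

  • When both data sets are normally distributed
  • Both σ are unknown
  • $\sigma_1 = \sigma_2$ is assumed
  • Use this formula:

  $$(\bar{x}_1 - \bar{x}_2) - t\sqrt{\frac{s_1^2(n_1-1) + s_2^2(n_2-1)}{n_1+n_2-2}}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t\sqrt{\frac{s_1^2(n_1-1) + s_2^2(n_2-1)}{n_1+n_2-2}}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$$

  where $t = t_{\alpha/2,\,n_1+n_2-2}$ is the table value.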
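With unknown but equal variances, the interval for µ1 − µ2 is the observed difference plus or minus the table value t times the pooled standard error. The sketch below applies this to the training summary values from these notes; the 95% level, and therefore the table value 2.060 for df = 25, is an assumption carried over from the test, since the notes do not state a confidence level. The function name is hypothetical.

```python
import math

def pooled_t_interval(xbar1, xbar2, var1, var2, n1, n2, t_critical):
    """Confidence interval for mu1 - mu2 using the pooled variance."""
    pooled_var = (var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2)
    margin = t_critical * math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# t_critical is the positive table value, not the observed t statistic.
low, high = pooled_t_interval(47.73, 56.50, 19.495, 18.273, 15, 12,
                              t_critical=2.060)
print(f"{low:.2f} <= mu1 - mu2 <= {high:.2f}")
```

Both limits are negative, which agrees with the test result that method B scores higher than method A.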

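  Example

  • For the previous problem, we can estimate the CI. The multiplier is the positive table value from Step 4, $t_{0.025,25} = 2.060$ (95% confidence):

  $$(47.73 - 56.50) - 2.060\sqrt{\frac{(19.495)(15-1) + (18.273)(12-1)}{15+12-2}}\sqrt{\frac{1}{15}+\frac{1}{12}} \le \mu_1 - \mu_2 \le (47.73 - 56.50) + 2.060\sqrt{\frac{(19.495)(15-1) + (18.273)(12-1)}{15+12-2}}\sqrt{\frac{1}{15}+\frac{1}{12}}$$

  $$-12.24 \le \mu_1 - \mu_2 \le -5.30$$

  Self Study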

  Homework

  • For Monday class:
    • – Do 10.15 if your NIM ends with an even number
    • – Do 10.16 if your NIM ends with an odd number

  • For Thursday class:
    • – Do 10.18 if your NIM ends with 0, 3, 6, or 9
    • – Do 10.19 if your NIM ends with 1, 4, or 7
    • – Do 10.20 if your NIM ends with 2, 5, or 8

  When samples are not independent

  • Samples are not independent when:
    • – The same people or objects are measured before and after an experiment
    • – Twins, family members, spouses, or siblings are placed in the two groups

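  Hypothesis testing for dependent samples

  • The approach for dependent samples is different from that for independent samples
  • Make pairs of related members
  • Calculate their differences

  Sample a    Sample b    Difference
  a_1         b_1         d_1 = b_1 - a_1
  a_2         b_2         d_2 = b_2 - a_2
  a_3         b_3         d_3 = b_3 - a_3
  a_4         b_4         d_4 = b_4 - a_4
  a_5         b_5         d_5 = b_5 - a_5

  Formula

  $$t = \frac{\bar{d} - D}{s_d / \sqrt{n}}, \qquad df = n - 1$$

  $$\bar{d} = \frac{\sum d}{n}, \qquad s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}}$$

  where n is the number of pairs and D is the hypothesized mean difference in the population.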
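The pairing procedure can be sketched as follows: compute d = b − a for each pair, then t = (d̄ − D) / (s_d / √n) with n − 1 degrees of freedom, where s_d uses the n − 1 denominator. The before/after values below are hypothetical, since the example data table is not available in these notes, and the function name is an assumption.

```python
import math
import statistics

def paired_t(before, after, hypothesized_mean_diff=0.0):
    """t statistic and degrees of freedom for dependent (paired) samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Differences d_i = b_i - a_i for each matched pair.
    d = [b - a for a, b in zip(before, after)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (n - 1 denominator)
    t = (d_bar - hypothesized_mean_diff) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 11, 12, 14, 13]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```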

  Example

  A stock market investor wants to know whether there is a significant difference between the P/E (price-to-earnings) ratios of two consecutive years. Nine companies are selected at random, no prior information is available, and α = 1% is assumed. The data are normally distributed.

  Step 1 to step 3

  • Step 1: $H_0: D = 0$ and $H_a: D \neq 0$
  • Step 2: t test for dependent samples
  • Step 3: α = 1%

  Step 4: making decision region

  • Because it is a two-tailed test, we use α/2 = 0.005 and df = n - 1 = 9 - 1 = 8
  • $t_{0.005,8} = \pm 3.355$
  • Decision region:
    • – Rejection region: t < -3.355 or t > 3.355
    • – Non-rejection region: -3.355 ≤ t ≤ 3.355

  Step 5: gather sample data

  • Compute any information about the sample we have:

  Step 6: analyze the sample data

  • Compute the test statistic

  Step 7: Statistical Conclusion

  • Rejection region: t < -3.355 or t > 3.355
  • t = -0.70
  • → -3.355 ≤ t ≤ 3.355
  • → t is in the non-rejection region
  • → the null hypothesis $H_0: D = 0$ is not rejected

  Step 8: Making business decision

  There is no significant difference in the average P/E ratio between year 1 and year 2.

  If there is a significant difference, how big is it?
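The notes do not show a formula for this question. A commonly used interval for the mean paired difference is d̄ ± t · s_d / √n with df = n − 1, and the sketch below assumes that form. The data are hypothetical, the function name is an assumption, and the critical value 2.776 is the standard table value t_{0.025,4} for five pairs.

```python
import math
import statistics

def paired_interval(before, after, t_critical):
    """Confidence interval for the mean of the paired differences."""
    d = [b - a for a, b in zip(before, after)]
    n = len(d)
    d_bar = statistics.mean(d)
    margin = t_critical * statistics.stdev(d) / math.sqrt(n)
    return d_bar - margin, d_bar + margin

# Hypothetical measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 11, 12, 14, 13]

# t_{0.025,4} = 2.776 taken from a t table (95% confidence, df = 4).
low, high = paired_interval(before, after, t_critical=2.776)
print(f"{low:.3f} <= D <= {high:.3f}")
```

Here the interval contains 0, which matches a two-tailed test that does not reject D = 0 at the same α.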

  Self Study Homework

  • For Monday class:
    • – Do 10.22 if your NIM ends with an even number
    • – Do 10.23 if your NIM ends with an odd number

  • For Thursday class:
    • – Do 10.26 if your NIM ends with 0, 3, 6, or 9
    • – Do 10.27 if your NIM ends with 1, 4, or 7
    • – Do 10.28 if your NIM ends with 2, 5, or 8