19 The fuel efficiency (mpg) of any particular new vehicle under specified driving con-

Example 8.19 The fuel efficiency (mpg) of any particular new vehicle under specified driving con-

ditions may not be identical to the EPA figure that appears on the vehicle’s sticker. Suppose that four different vehicles of a particular type are to be selected and driven over a certain course, after which the fuel efficiency of each one is to be determined. Let m denote the true average fuel efficiency under these conditions.

Consider testing H 0 : m 5 20 versus H 0 : m . 20 using the one-sample t test based

on the resulting sample. Since the test is based on n2153 degrees of freedom, the P-value for an upper-tailed test is the area under the t curve with 3 df to the right of the calculated t.

Let’s first suppose that the null hypothesis is true. We asked Minitab to generate 10,000 different samples, each containing 4 observations, from a normal population distribution with mean value m 5 20 and standard deviation s52 . The first sample and resulting summary quantities were

336

CHAPTER 8 Tests of Hypotheses Based on a Single Sample

1 x 5 20.830, x 2 5 22.232, x 3 5 20.276, x 4 5 17.718 20.264 2 20

x 5 20.264

s 5 1.8864 t 5

5 .2799

.1.8864 14

The P-value is the area under the 3-df t curve to the right of .2799, which according to Minitab is .3989. Using a significance level of .05, the null hypothesis would of course not be rejected. The values of t for the next four samples were 21.7591, .6082 , 2.7020 , and 3.1053, with corresponding P-values .912, .293, .733, and .0265.

Figure 8.11(a) shows a histogram of the 10,000 P-values from this simulation experiment. About 4.5 of these P-values are in the first class interval from

0 to .05. Thus when using a significance level of .05, the null hypothesis is rejected in roughly 4.5 of these 10,000 tests. If we continued to generate samples and carry out the test for each sample at significance level .05, in the long run 5 of

the P-values would be in the first class interval. This is because when H 0 is true

and a test with significance level .05 is used, by definition the probability of reject-

ing H 0 is .05. Looking at the histogram, it appears that the distribution of P-values is rela-

tively flat. In fact, it can be shown that when H 0 is true, the probability distribution

of the P-value is a uniform distribution on the interval from 0 to 1. That is, the den- sity curve is completely flat on this interval, and thus must have a height of 1 if the total area under the curve is to be 1. Since the area under such a curve to the left of

.05 is (.05)(1) 5 .05 , we again have that the probability of rejecting H 0 when it is

true that it is .05, the chosen significance level.

Now consider what happens when H 0 is false because

. We again had

Minitab generate 10,000 different samples of size 4 (each from a normal distribution with m 5 21 and s52 ), calculate t 5 (x 2 20)(s24) for each one, and then determine the P-value. The first such sample resulted in x 5 20.6411, s 5 .49637, t 5 2.5832 , P-value 5 .0408 . Figure 8.11(b) gives a histogram of the resulting P-values. The shape of this histogram is quite different from that of Figure 8.11(a)— there is a much greater tendency for the P-value to be small (closer to 0) when

m 5 21 than when m 5 20 . Again H 0 is rejected at significance level .05 whenever

the P-value is at most .05 (in the first class interval). Unfortunately, this is the case for only about 19 of the P-values. So only about 19 of the 10,000 tests correctly reject the null hypothesis; for the other 81, a type II error is committed. The diffi- culty is that the sample size is quite small and 21 is not very different from the value asserted by the null hypothesis.

Figure 8.11(c) illustrates what happens to the P-value when H 0 is false because

m 5 22 (still with n54 and s52 ). The histogram is even more concentrated toward values close to 0 than was the case when m 5 21 . In general, as m moves further to the right of the null value 20, the distribution of the P-value will become more and more concentrated on values close to 0. Even here a bit fewer than 50 of the P-values are smaller than .05. So it is still slightly more likely than not that the null hypothesis is incorrectly not rejected. Only for values of m much larger than 20 (e.g., at least 24 or 25) is it highly likely that the P-value will be smaller than .05 and thus give the correct conclusion.

The big idea of this example is that because the value of any test statistic is random, the P-value will also be a random variable and thus have a distribution. The further the actual value of the parameter is from the value specified by the null hypothesis, the more the distribution of the P-value will be concentrated on values

close to 0 and the greater the chance that the test will correctly reject H 0 (corre-

sponding to smaller b).

Figure 8.11 P-value simulation results for Example 8.19

■

EXERCISES Section 8.4 (47–62)

47. For which of the given P-values would the null hypothesis

50. Newly purchased tires of a certain type are supposed to be

be rejected when performing a level .05 test?

filled to a pressure of 30 lbin 2 . Let m denote the true

average pressure. Find the P-value associated with each

d. .047

e. .148

given z statistic value for testing H 0 : m 5 30 versus

48. Pairs of P-values and significance levels, a, are given. For

H a : m 2 30 .

each pair, state whether the observed P-value would lead to

a. 2.10

b. 21.75 c. 2.55 d. 1.41 e. 25.3

rejection of H 0 at the given significance level.

51. Give as much information as you can about the P-value of a

a. P-value 5 .084, a 5 .05

t test in each of the following situations:

b. P-value 5 .003, a 5 .001

a. Upper-tailed test,

df 5 8, t 5 2.0

c. P-value 5 .498, a 5 .05

b. Lower-tailed test,

df 5 11, t 5 22.4

d. P-value 5 .084, a 5 .10

c. Two-tailed test,

df 5 15, t 5 21.6

e. P-value 5 .039, a 5 .01

d. Upper-tailed test,

df 5 19, t 5 2.4

f. P-value 5 .218, a 5 .10

e. Upper-tailed test,

df 5 5, t 5 5.0

49. Let m denote the mean reaction time to a certain stimulus.

f. Two-tailed test,

df 5 40, t 5 24.8

For a large-sample z test of H 0 :m55 versus H a :m.5 ,

52. The paint used to make lines on roads must reflect enough

find the P-value associated with each of the given values of

light to be clearly visible at night. Let m denote the true

the z test statistic.

average reflectometer reading for a new type of paint under

consideration. A test of H 0 : m 5 20 versus H a : m . 20 will

CHAPTER 8 Tests of Hypotheses Based on a Single Sample

be based on a random sample of size n from a normal pop-

b. If the true percentage of “early yields” is actually 50

ulation distribution. What conclusion is appropriate in each

(so that the theoretical point is the median of the yield

of the following situations?

distribution) and a level .01 test is used, what is the prob-

a. n 5 15, t 5 3.2, a 5 .05

ability that the company concludes a modification of the

b. n 5 9, t 5 1.8, a 5 .01

process is necessary?

c. n 5 24, t 5 2.2

57. The article “Heavy Drinking and Polydrug Use Among

53. Let m denote true average serum receptor concentration for

College Students” (J. of Drug Issues, 2008: 445–466) stated

all pregnant women. The average for all women is known to

that 51 of the 462 college students in a sample had a lifetime

be 5.63. The article “Serum Transferrin Receptor for the

abstinence from alcohol. Does this provide strong evidence

Detection of Iron Deficiency in Pregnancy” (Amer. J. of

for concluding that more than 10 of the population sam-

Clinical Nutr., 1991: 1077–1081) reports that

pled had completely abstained from alcohol use? Test the

P-value . .10 for a test of H 0 : m 5 5.63 versus

appropriate hypotheses using the P-value method. [Note:

H a : m 2 5.63 based on n 5 176 pregnant women. Using a

The article used more advanced statistical methods to study

significance level of .01, what would you conclude?

the use of various drugs among students characterized as

54. The article “Analysis of Reserve and Regular Bottlings: Why

light, moderate, and heavy drinkers.]

Pay for a Difference Only the Critics Claim to Notice?”

58. A random sample of soil specimens was obtained, and the

(Chance, Summer 2005, pp. 9–15) reported on an experiment

amount of organic matter () in the soil was determined for

to investigate whether wine tasters could distinguish between

each specimen, resulting in the accompanying data (from

more expensive reserve wines and their regular counterparts.

“Engineering Properties of Soil,” Soil Science, 1998: 93–102).

Wine was presented to tasters in four containers labeled A, B,

C, and D, with two of these containing the reserve wine and

the other two the regular wine. Each taster randomly selected

three of the containers, tasted the selected wines, and indi-

cated which of the three heshe believed was different from

the other two. Of the n 5 855 tasting trials, 346 resulted in

The values of the sample mean, sample standard deviation,

correct distinctions (either the one reserve that differed from

and (estimated) standard error of the mean are 2.481, 1.616,

the two regular wines or the one regular wine that differed

and .295, respectively. Does this data suggest that the true

from the two reserves). Does this provide compelling evi-

average percentage of organic matter in such soil is some-

dence for concluding that tasters of this type have some abil-

thing other than 3? Carry out a test of the appropriate

ity to distinguish between reserve and regular wines? State

hypotheses at significance level .10 by first determining the

and test the relevant hypotheses using the P-value approach.

P-value. Would your conclusion be different if a 5 .05 had

Are you particularly impressed with the ability of tasters to

been used? [Note: A normal probability plot of the data

distinguish between the two types of wine?

shows an acceptable pattern in light of the reasonably large

55. An aspirin manufacturer fills bottles by weight rather than

sample size.]

by count. Since each bottle should contain 100 tablets, the

59. The accompanying data on cube compressive strength

average weight per tablet should be 5 grains. Each of

(MPa) of concrete specimens appeared in the article

100 tablets taken from a very large lot is weighed, resulting

“Experimental Study of Recycled Rubber-Filled High-

in a sample average weight per tablet of 4.87 grains and a

Strength Concrete” (Magazine of Concrete Res., 2009:

sample standard deviation of .35 grain. Does this informa-

549–556):

tion provide strong evidence for concluding that the company is not filling its bottles as advertised? Test the

appropriate hypotheses using a 5 .01 by first computing

the P-value and then comparing it to the specified

a. Is it plausible that the compressive strength for this type

significance level.

of concrete is normally distributed?

56. Because of variability in the manufacturing process, the

b. Suppose the concrete will be used for a particular appli-

actual yielding point of a sample of mild steel subjected to

cation unless there is strong evidence that true average

increasing stress will usually differ from the theoretical

strength is less than 100 MPa. Should the concrete be

yielding point. Let p denote the true proportion of samples

used? Carry out a test of appropriate hypotheses using

that yield before their theoretical yielding point. If on the

the P-value method.

basis of a sample it can be concluded that more than 20 of

60. A certain pen has been designed so that true average writing

all specimens yield before the theoretical point, the produc-

lifetime under controlled conditions (involving the use of a

tion process will have to be modified.

writing machine) is at least 10 hours. A random sample of

a. If 15 of 60 specimens yield before the theoretical point,

18 pens is selected, the writing lifetime of each is deter-

what is the P-value when the appropriate test is used, and

mined, and a normal probability plot of the resulting data

what would you advise the company to do?

supports the use of a one-sample t test.

8.5 Some Comments on Selecting a Test

a. What hypotheses should be tested if the investigators

recalibration necessary? Carry out a test of the relevant

believe a priori that the design specification has been

hypotheses using the P-value approach with a 5 .05 .

satisfied?

62. The relative conductivity of a semiconductor device is

b. What conclusion is appropriate if the hypotheses of part

determined by the amount of impurity “doped” into the

(a) are tested, t 5 22.3 , and a 5 .05 ?

device during its manufacture. A silicon diode to be used for

c. What conclusion is appropriate if the hypotheses of part

a specific purpose requires an average cut-on voltage of .60

(a) are tested, t 5 21.8 , and a 5 .01 ?

V, and if this is not achieved, the amount of impurity must

d. What should be concluded if the hypotheses of part (a)

be adjusted. A sample of diodes was selected and the cut-on

are tested and t 5 23.6 ?

voltage was determined. The accompanying SAS output

61. A spectrophotometer used for measuring CO concentration

resulted from a request to test the appropriate hypotheses.

[ppm (parts per million) by volume] is checked for accuracy

T by taking readings on a manufactured gas (called span gas) Prob. . T u

in which the CO concentration is very precisely controlled at 70 ppm. If the readings suggest that the spectrophotome-

[Note: SAS explicitly tests H 0 :m50 , so to test H 0 : m 5 .60,

ter is not working properly, it will have to be recalibrated.

the null value .60 must be subtracted from each x i ; the reported

Assume that if it is properly calibrated, measured concen-

mean is then the average of the (x i 2 .60) values. Also, SAS’s

tration for span gas samples is normally distributed. On the

P-value is always for a two-tailed test.] What would be con-

basis of the six readings—85, 77, 82, 68, 72, and 69—is

cluded for a significance level of .01? .05? .10?

19 The fuel efficiency (mpg) of any particular new vehicle under specified driving con-

Example 8.19 The fuel efficiency (mpg) of any particular new vehicle under specified driving con-

Parts

Dokumen yang terkait

Analisis Komparasi Internet Financial Local Government Reporting Pada Website Resmi Kabupaten dan Kota di Jawa Timur The Comparison Analysis of Internet Financial Local Government Reporting on Official Website of Regency and City in East Java

STUDI AREA TRAFFIC CONTROL SYSTEM (ATCS) PADA PERSIMPANGAN DI KOTA MALANG (JALAN A. YANI – L. A. SUCIPTO – BOROBUDUR)

ANTARA IDEALISME DAN KENYATAAN: KEBIJAKAN PENDIDIKAN TIONGHOA PERANAKAN DI SURABAYA PADA MASA PENDUDUKAN JEPANG TAHUN 1942-1945 Between Idealism and Reality: Education Policy of Chinese in Surabaya in the Japanese Era at 1942-1945)

FENOLOGI KEDELAI BERDASARKAN KRITERIA FEHR-CAVINESS PADA DELAPAN PERSILANGAN SERTA EMPAT TETUA KEDELAI (Glycine max. L. Merrill)

Improving the Eighth Year Students' Tense Achievement and Active Participation by Giving Positive Reinforcement at SMPN 1 Silo in the 2013/2014 Academic Year

Improving the VIII-B Students' listening comprehension ability through note taking and partial dictation techniques at SMPN 3 Jember in the 2006/2007 Academic Year -

The Correlation between students vocabulary master and reading comprehension

Improping student's reading comprehension of descriptive text through textual teaching and learning (CTL)

The correlation between listening skill and pronunciation accuracy : a case study in the firt year of smk vocation higt school pupita bangsa ciputat school year 2005-2006

Transmission of Greek and Arabic Veteri

Dukungan

Links

19 The fuel efficiency (mpg) of any particular new vehicle under specified driving con-

Example 8.19 The fuel efficiency (mpg) of any particular new vehicle under specified driving con-

Parts

Dokumen yang terkait

Analisis Komparasi Internet Financial Local Government Reporting Pada Website Resmi Kabupaten dan Kota di Jawa Timur The Comparison Analysis of Internet Financial Local Government Reporting on Official Website of Regency and City in East Java

STUDI AREA TRAFFIC CONTROL SYSTEM (ATCS) PADA PERSIMPANGAN DI KOTA MALANG (JALAN A. YANI – L. A. SUCIPTO – BOROBUDUR)

ANTARA IDEALISME DAN KENYATAAN: KEBIJAKAN PENDIDIKAN TIONGHOA PERANAKAN DI SURABAYA PADA MASA PENDUDUKAN JEPANG TAHUN 1942-1945 Between Idealism and Reality: Education Policy of Chinese in Surabaya in the Japanese Era at 1942-1945)

FENOLOGI KEDELAI BERDASARKAN KRITERIA FEHR-CAVINESS PADA DELAPAN PERSILANGAN SERTA EMPAT TETUA KEDELAI (Glycine max. L. Merrill)

Improving the Eighth Year Students' Tense Achievement and Active Participation by Giving Positive Reinforcement at SMPN 1 Silo in the 2013/2014 Academic Year

Improving the VIII-B Students' listening comprehension ability through note taking and partial dictation techniques at SMPN 3 Jember in the 2006/2007 Academic Year -

The Correlation between students vocabulary master and reading comprehension

Improping student's reading comprehension of descriptive text through textual teaching and learning (CTL)

The correlation between listening skill and pronunciation accuracy : a case study in the firt year of smk vocation higt school pupita bangsa ciputat school year 2005-2006

Transmission of Greek and Arabic Veteri

Dokumen yang Anda mencari sudah siap untuk unduhkan