10 The true average voltage drop from collector to emitter of insulated gate bipolar
Example 8.10 The true average voltage drop from collector to emitter of insulated gate bipolar
transistors of a certain type is supposed to be at most 2.5 volts. An investigator selects a sample of n 5 10 such transistors and uses the resulting voltages as a basis
for testing H 0 : m 5 2.5 versus H a : m . 2.5 using a t test with significance level
a 5 .05 . If the standard deviation of the voltage distribution is s 5 .100 , how
likely is it that H 0 will not be rejected when in fact m 5 2.6 ? With
d 5 u 2.5 2 2.6 u .100 5 1.0 , the point on the b curve at 9 df for a one-tailed test with a 5 .05 above 1.0 has a height of approximately .1, so b < .1 . The investiga- tor might think that this is too large a value of b for such a substantial departure from
H 0 and may wish to have b 5 .05 for this alternative value of m. Since d 5 1.0 , the point (d, b) 5 (1.0, .05) must be located. This point is very close to the 14 df curve, so using n 5 15 will give both a 5 .05 and b 5 .05 when the value of m is 2.6 and s 5 .10 . A larger value of s would give a larger b for this alternative, and an alter- native value of m closer to 2.5 would also result in an increased value of b.
■ Most of the widely used statistical software packages are capable of calculat-
ing type II error probabilities. They generally work in terms of power, which is sim- ply 12b . A small value of b (close to 0) is equivalent to large power (near 1). A powerful test is one that has high power and therefore good ability to detect when the null hypothesis is false.
As an example, we asked Minitab to determine the power of the upper-tailed test in Example 8.10 for the three sample sizes 5, 10, and 15 when a 5 .05, s 5 .10, and the value of m is actually 2.6 rather than the null value 2.5—a “difference” of.
CHAPTER 8 Tests of Hypotheses Based on a Single Sample
2.6 2.5 .1. We also asked the software to determine the necessary sample size for
a power of .9 ( b 5 .1 ) and also .95. Here is the resulting output:
Power and Sample Size
Testing mean null (versus . null) Calculating power for mean null difference Alpha 0.05 Assumed standard deviation 0.1
Sample Target Actual
Difference
Size
Power Power
The power for the sample size n 5 10 is a bit smaller than .9. So if we insist that the power be at least .9, a sample size of 11 is required and the actual power for that n is roughly .92. The software says that for a target power of .95, a sample size of n 5 13 is required, whereas eyeballing our b curves gave 15. When available, this type of software is more reliable than the curves. Finally, Minitab now also provides power curves for the specified sample sizes, as shown in Figure 8.6. Such curves show how the power increases for each sample size as the actual value of m moves further and further away from the null value.
Power Curves for 1-Sample t Test
Sample Size 5
Assumptions Alpha 0.05
Figure 8.6 Power curves from Minitab for the t test of Example 8.10
EXERCISES Section 8.2 (15–36)
15. Let the test statistic Z have a standard normal distribution
a. H a :m.m 0 , rejection region z 1.88
when H 0 is true. Give the significance level for each of the
b. H a :m,m 0 , rejection region z 22.75
following situations:
c. H a :m2m 0 , rejection region z 2.88 or z 22.88
8.2 Tests About a Population Mean
16. Let the test statistic T have a t distribution when H 0 is true. Give
out to see whether this is the case. What conclusion is
the significance level for each of the following situations:
appropriate in each of the following situations?
a. H a :m.m 0 , df 5 15 , rejection region t 3.733
a. n 5 13, t 5 1.6, a 5 .05
b. H a :m,m 0 , n 5 24 , rejection region t 22.500
b. n 5 13, t 5 21.6, a 5 .05
c. H a :m2m 0 , n 5 31 , rejection region t 1.697 or
c. n 5 25, t 5 22.6, a 5 .01
t 21.697
d. n 5 25, t 5 23.9
17. Answer the following questions for the tire problem in
22. The article “The Foreman’s View of Quality Control”
Example 8.7.
(Quality Engr., 1990: 257–280) described an investigation
a. If x 5 30,960 and a level a 5 .01 test is used, what is
into the coating weights for large pipes resulting from a
the decision?
galvanized coating process. Production standards call for
b. If a level .01 test is used, what is b(30,500)?
a true average weight of 200 lb per pipe. The accompany-
c. If a level .01 test is used and it is also required that
ing descriptive summary and boxplot are from Minitab.
b(30,500) 5 .05 , what sample size n is necessary?
d. If x 5 30,960 , what is the smallest a at which H can be
Variable N Mean Median TrMean StDev SEMean 0 ctg wt
30 206.73 206.00 206.81 6.35 rejected (based on 1.16 n 5 16 )?
18. Reconsider the paint-drying situation of Example 8.2, in
which drying time for a test specimen is normally distrib-
uted with s59 . The hypotheses H 0 : m 5 75 versus
H a : m , 75 are to be tested using a random sample of n 5 25 observations.
a. How many standard deviations (of ) below the null X value is
b. If x 5 72.3 , what is the conclusion using a 5 .01 ?
Coating weight
c. What is a for the test procedure that rejects H when
a. What does the boxplot suggest about the status of the
d. For the test procedure of part (c), what is b(70)?
specification for true average coating weight?
e. If the test procedure of part (c) is used, what n is neces-
b. A normal probability plot of the data was quite straight. Use
sary to ensure that b(70) 5 .01 ?
the descriptive output to test the appropriate hypotheses.
f. If a level .01 test is used with n 5 100 , what is the prob- ability of a type I error when m 5 76 ?
23. Exercise 36 in Chapter 1 gave n 5 26 observations on escape time (sec) for oil workers in a simulated exercise,
19. The melting point of each of 16 samples of a certain brand
from which the sample mean and sample standard deviation
of hydrogenated vegetable oil was determined, resulting in
are 370.69 and 24.36, respectively. Suppose the investiga-
x 5 94.32 . Assume that the distribution of the melting point
tors had believed a priori that true average escape time
is normal with s 5 1.20 .
would be at most 6 min. Does the data contradict this prior
a. Test H 0 : m 5 95 versus H a : m 2 95 using a two-tailed
belief? Assuming normality, test the appropriate hypotheses
level .01 test.
using a significance level of .05.
b. If a level .01 test is used, what is b(94), the probability of a type II error when m 5 94 ?
24. Reconsider the sample observations on stabilized viscosity
c. What value of n is necessary to ensure that b(94) 5 .1
of asphalt specimens introduced in Exercise 46 in Chapter 1
when ? a 5 .01
(2781, 2900, 3013, 2856, and 2888). Suppose that for a par- ticular application it is required that true average viscosity
20. Lightbulbs of a certain type are advertised as having an
be 3000. Does this requirement appear to have been satis-
average lifetime of 750 hours. The price of these bulbs is
fied? State and test the appropriate hypotheses.
very favorable, so a potential customer has decided to go ahead with a purchase arrangement unless it can be conclu-
25. The desired percentage of SiO 2 in a certain type of alumi-
sively demonstrated that the true average lifetime is smaller
nous cement is 5.5. To test whether the true average per-
than what is advertised. A random sample of 50 bulbs was
centage is 5.5 for a particular production facility, 16
selected, the lifetime of each bulb determined, and the
independently obtained samples are analyzed. Suppose that
appropriate hypotheses were tested using Minitab, resulting
the percentage of SiO 2 in a sample is normally distributed
in the accompanying output.
with and s 5 .3 that . x 5 5.25
a. Does this indicate conclusively that the true average per-
Variable N
Mean StDev SEMean
Z P-Value
lifetime 50 738.44 38.20
centage differs from 5.5? Carry out the analysis using the sequence of steps suggested in the text.
What conclusion would be appropriate for a significance
b. If the true average percentage is m 5 5.6 and a level
level of .05? A significance level of .01? What significance
a 5 .01 test based on n 5 16 is used, what is the prob-
level and conclusion would you recommend?
ability of detecting this departure from H 0 ?
21. The true average diameter of ball bearings of a certain type
c. What value of n is required to satisfy a 5 .01 and
is supposed to be .5 in. A one-sample t test will be carried
b(5.6) 5 .01 ?
CHAPTER 8 Tests of Hypotheses Based on a Single Sample
26. To obtain information on the corrosion-resistance properties
A normal probability plot of the data shows a reasonably lin-
of a certain type of steel conduit, 45 specimens are buried in
ear pattern, so it is plausible that the population distribution of
soil for a 2-year period. The maximum penetration (in mils)
repair time is at least approximately normal. The sample mean
for each specimen is then measured, yielding a sample aver-
and standard deviation are 249.7 and 145.1, respectively.
age penetration of x 5 52.7 and a sample standard deviation
a. Is there compelling evidence for concluding that true
of s 5 4.8 . The conduits were manufactured with the specifi-
average repair time exceeds 200 min? Carry out a test of
cation that true average penetration be at most 50 mils. They
hypotheses using a significance level of .05.
will be used unless it can be demonstrated conclusively that
b. Using s 5 150 , what is the type II error probability of
the specification has not been met. What would you conclude?
the test used in (a) when true average repair time is actu-
27. Automatic identification of the boundaries of significant struc-
ally 300 min? That is, what is b(300)?
tures within a medical image is an area of ongoing research.
30. Have you ever been frustrated because you could not get a
The paper “Automatic Segmentation of Medical Images Using
container of some sort to release the last bit of its contents?
Image Registration: Diagnostic and Simulation Applications”
The article “Shake, Rattle, and Squeeze: How Much Is Left
(J. of Medical Engr. and Tech., 2005: 53–63) discussed a new
in That Container?” (Consumer Reports, May 2009: 8)
technique for such identification. A measure of the accuracy of
reported on an investigation of this issue for various con-
the automatic region is the average linear displacement (ALD).
sumer products. Suppose five 6.0 oz tubes of toothpaste of a
The paper gave the following ALD observations for a sample
particular brand are randomly selected and squeezed until no
of 49 kidneys (units of pixel dimensions).
more toothpaste will come out. Then each tube is cut open 1.38 0.44 1.09 0.75 0.66 1.28 0.51 and the amount remaining is weighed, resulting in the fol-
0.39 0.70 0.46 0.54 0.83 0.58 0.64 lowing data (consistent with what the cited article reported): 1.30 0.57 0.43 0.62 1.00 1.05 0.82 .53, .65, .46, .50, .37. Does it appear that the true average 1.10 0.65 0.99 0.56 0.64 0.45 amount left is less than 10 of the advertised net contents?
0.82 1.06 0.41 0.58 0.66 0.54 0.83 a. Check the validity of any assumptions necessary for test-
1.11 0.34 1.25 0.38 1.44 1.28 0.51 ing the appropriate hypotheses.
b. Carry out a test of the appropriate hypotheses using a
a. Summarizedescribe the data.
significance level of .05. Would your conclusion change
b. Is it plausible that ALD is at least approximately nor-
if a significance level of .01 had been used?
mally distributed? Must normality be assumed prior to
c. Describe in context type I and II errors, and say which
calculating a CI for true average ALD or testing hypothe-
error might have been made in reaching a conclusion.
ses about true average ALD? Explain.
31. A well-designed and safe workplace can contribute greatly to
c. The authors commented that in most cases the ALD is
increased productivity. It is especially important that workers
better than or of the order of 1.0. Does the data in fact
not be asked to perform tasks, such as lifting, that exceed their
provide strong evidence for concluding that true average
capabilities. The accompanying data on maximum weight of
ALD under these circumstances is less than 1.0? Carry
lift (MAWL, in kg) for a frequency of four liftsmin was
out an appropriate test of hypotheses.
reported in the article “The Effects of Speed, Frequency, and
d. Calculate an upper confidence bound for true average ALD
Load on Measured Hand Forces for a Floor-to-Knuckle
using a confidence level of 95, and interpret this bound.
Lifting Task” (Ergonomics, 1992: 833–843); subjects were
28. Minor surgery on horses under field conditions requires a
randomly selected from the population of healthy males ages
reliable short-term anesthetic producing good muscle relax-
18–30. Assuming that MAWL is normally distributed, does
ation, minimal cardiovascular and respiratory changes, and
the data suggest that the population mean MAWL exceeds
a quick, smooth recovery with minimal aftereffects so that
25? Carry out a test using a significance level of .05.
horses can be left unattended. The article “A Field Trial of Ketamine Anesthesia in the Horse” (Equine Vet. J., 1984:
176–179) reports that for a sample of n 5 73 horses to
32. The recommended daily dietary allowance for zinc among
which ketamine was administered under certain conditions,
males older than age 50 years is 15 mgday. The article
the sample average lateral recumbency (lying-down) time
“Nutrient Intakes and Dietary Patterns of Older Americans:
was 18.86 min and the standard deviation was 8.6 min. Does
A National Study” (J. of Gerontology, 1992: M145–150)
this data suggest that true average lateral recumbency time
reports the following summary data on intake for a sample
under these conditions is less than 20 min? Test the appro-
of males age 65–74 years: n 5 115 , x 5 11.3 , and
priate hypotheses at level of significance .10.
s 5 6.43 . Does this data indicate that average daily zinc
29. The article “Uncertainty Estimation in Railway Track Life-
intake in the population of all males ages 65–74 falls below
Cycle Cost” (J. of Rail and Rapid Transit, 2009) presented
the recommended allowance?
the following data on time to repair (min) a rail break in the
33. Reconsider the accompanying sample data on expense ratio
high rail on a curved track of a certain railway line.
() for large-cap growth mutual funds first introduced in
Exercise 1.53.
8.3 Tests Concerning a Population Proportion
A normal probability plot shows a reasonably linear pattern.
a. Does this data suggest that the population mean reading
a. Is there compelling evidence for concluding that the pop-
under these conditions differs from 100? State and test
ulation mean expense ratio exceeds 1? Carry out a test
the appropriate hypotheses using
a 5 .05 .
of the relevant hypotheses using a significance level of .01.
b. Suppose that prior to the experiment a value of s 5 7.5
b. Referring back to (a), describe in context type I and II
had been assumed. How many determinations would
errors and say which error you might have made in
then have been appropriate to obtain b 5 .10 for the
reaching your conclusion. The source from which the
alternative ? m 5 95
data was obtained reported that m 5 1.33 for the popu-
.0 lation of all 762 such funds. So did you actually commit , when the population distribution
35. Show that for any
is normal and s is known, the two-tailed test satisfies
an error in reaching your conclusion?
b(m 0 2 ) 5 b(m 0 1 , so that b(mr) is symmetric
c. Supposing that
s 5 .5 , determine and interpret the power
about m 0 .
of the test in (a) for the actual value of m stated in (b).
36. For a fixed alternative value m , show that b(mr) S 0 as
34. A sample of 12 radon detectors of a certain type was
nS` for either a one-tailed or a two-tailed z test in the
selected, and each was exposed to 100 pCiL of radon. The
case of a normal population distribution with known s.
resulting readings were as follows: