f. b. g. A rejection region, the set of all test statistic values for which H

Explain your reasoning. [Hint: Think about the consequences of a type I and type II error for each possibility.]

5. Before agreeing to purchase a large order of polyethylene sheaths for a particular type of high-pressure oil-filled submarine power cable, a company wants to see conclusive evidence that the true standard deviation of sheath thickness is less than .05 mm. What hypotheses should be tested, and why? In this context, what are the type I and type II errors?

6. Many older homes have electrical systems that use fuses rather than circuit breakers. A manufacturer of 40-amp fuses wants to make sure that the mean amperage at which its fuses burn out is in fact 40. If the mean amperage is lower than 40, customers will complain because the fuses require replacement too often. If the mean amperage is higher than 40, the manufacturer might be liable for damage to an electrical system due to fuse malfunction. To verify the amperage of the fuses, a sample of fuses is to be selected and inspected. If a hypothesis test were to be performed on the resulting data, what null and alternative hypotheses would be of interest to the manufacturer? Describe type I and type II errors in the context of this problem situation.

7. Water samples are taken from water used for cooling as it is being discharged from a power plant into a river. It has been determined that as long as the mean temperature of the discharged water is at most 150°F, there will be no negative effects on the river's ecosystem. To investigate whether the plant is in compliance with regulations that prohibit a mean discharge water temperature above 150°, 50 water samples will be taken at randomly selected times and the temperature of each sample recorded. The resulting data will be used to test the hypotheses $H_0: \mu = 150°$ versus $H_a: \mu > 150°$. In the context of this situation, describe type I and type II errors. Which type of error would you consider more serious? Explain.

8. A regular type of laminate is currently being used by a manufacturer of circuit boards. A special laminate has been developed to reduce warpage. The regular laminate will be used on one sample of specimens and the special laminate on another sample, and the amount of warpage will then be determined for each specimen. The manufacturer will then switch to the special laminate only if it can be demonstrated that the true average amount of warpage for that laminate is less than for the regular laminate. State the relevant hypotheses, and describe the type I and type II errors in the context of this situation.

9. Two different companies have applied to provide cable television service in a certain region. Let p denote the proportion of all potential subscribers who favor the first company over the second. Consider testing $H_0: p = .5$ versus $H_a: p \ne .5$ based on a random sample of 25 individuals. Let X denote the number in the sample who favor the first company and x represent the observed value of X.
a. Which of the following rejection regions is most appropriate and why? $R_1 = \{x: x \le 7 \text{ or } x \ge 18\}$, $R_2 = \{x: x \le 8\}$, $R_3 = \{x: x \ge 17\}$
b. In the context of this problem situation, describe what the type I and type II errors are.
c. What is the probability distribution of the test statistic X when $H_0$ is true? Use it to compute the probability of a type I error.
d. Compute the probability of a type II error for the selected region when $p = .3$, again when $p = .4$, and also for both $p = .6$ and $p = .7$.
e. Using the selected region, what would you conclude if 6 of the 25 queried favored company 1?

10. A mixture of pulverized fuel ash and Portland cement to be used for grouting should have a compressive strength of more than 1300 KN/m$^2$. The mixture will not be used unless experimental evidence indicates conclusively that the strength specification has been met. Suppose compressive strength for specimens of this mixture is normally distributed with $\sigma = 60$. Let $\mu$ denote the true average compressive strength.
a. What are the appropriate null and alternative hypotheses?
b. Let $\bar{X}$ denote the sample average compressive strength for $n = 10$ randomly selected specimens. Consider the test procedure with test statistic $\bar{X}$ and rejection region $\bar{x} \ge 1331.26$. What is the probability distribution of the test statistic when $H_0$ is true? What is the probability of a type I error for the test procedure?
c. What is the probability distribution of the test statistic when $\mu = 1350$? Using the test procedure of part (b), what is the probability that the mixture will be judged unsatisfactory when in fact $\mu = 1350$ (a type II error)?
d. How would you change the test procedure of part (b) to obtain a test with significance level .05? What impact would this change have on the error probability of part (c)?
e. Consider the standardized test statistic $Z = (\bar{X} - 1300)/(\sigma/\sqrt{n}) = (\bar{X} - 1300)/13.42$. What are the values of Z corresponding to the rejection region of part (b)?

11. The calibration of a scale is to be checked by weighing a 10-kg test specimen 25 times. Suppose that the results of different weighings are independent of one another and that the weight on each trial is normally distributed with $\sigma = .200$ kg. Let $\mu$ denote the true average weight reading on the scale.
a. What hypotheses should be tested?
b. Suppose the scale is to be recalibrated if either $\bar{x} \ge 10.1032$ or $\bar{x} \le 9.8968$. What is the probability that recalibration is carried out when it is actually unnecessary?
c. What is the probability that recalibration is judged unnecessary when in fact $\mu = 10.1$? When $\mu = 9.8$?
d. Let $z = (\bar{x} - 10)/(\sigma/\sqrt{n})$. For what value c is the rejection region of part (b) equivalent to the "two-tailed" region of either $z \ge c$ or $z \le -c$?
e. If the sample size were only 10 rather than 25, how should the procedure of part (d) be altered so that $\alpha = .05$?
f. Using the test of part (e), what would you conclude from the following sample data?
9.981 10.006 9.857 10.107 9.888 9.728 10.439 10.214 10.190 9.793
g. Reexpress the test procedure of part (b) in terms of the standardized test statistic $Z = (\bar{X} - 10)/(\sigma/\sqrt{n})$.

12. A new design for the braking system on a certain type of car has been proposed. For the current system, the true average braking distance at 40 mph under specified conditions is known to be 120 ft. It is proposed that the new design be implemented only if sample data strongly indicates a reduction in true average braking distance for the new design.
a. Define the parameter of interest and state the relevant hypotheses.
b. Suppose braking distance for the new system is normally distributed with $\sigma = 10$. Let $\bar{X}$ denote the sample average braking distance for a random sample of 36 observations. Which of the following three rejection regions is appropriate: $R_1 = \{\bar{x}: \bar{x} \ge 124.80\}$, $R_2 = \{\bar{x}: \bar{x} \le 115.20\}$, $R_3 = \{\bar{x}: \text{either } \bar{x} \ge 125.13 \text{ or } \bar{x} \le 114.87\}$?
c. What is the significance level for the appropriate region of part (b)? How would you change the region to obtain a test with $\alpha = .001$?
d. What is the probability that the new design is not implemented when its true average braking distance is actually 115 ft and the appropriate region from part (b) is used?
e. Let $Z = (\bar{X} - 120)/(\sigma/\sqrt{n})$. What is the significance level for the rejection region $\{z: z \le -2.33\}$? For the region $\{z: z \le -2.88\}$?

13. Let $X_1, \ldots, X_n$ denote a random sample from a normal population distribution with a known value of $\sigma$.
a. For testing the hypotheses $H_0: \mu = \mu_0$ versus $H_a: \mu > \mu_0$ (where $\mu_0$ is a fixed number), show that the test with test statistic $\bar{X}$ and rejection region $\bar{x} \ge \mu_0 + 2.33\sigma/\sqrt{n}$ has significance level .01.
b. Suppose the procedure of part (a) is used to test $H_0: \mu \le \mu_0$ versus $H_a: \mu > \mu_0$. If $\mu_0 = 100$, $n = 25$, and $\sigma = 5$, what is the probability of committing a type I error when $\mu = 99$? When $\mu = 98$? In general, what can be said about the probability of a type I error when the actual value of $\mu$ is less than $\mu_0$? Verify your assertion.

14. Reconsider the situation of Exercise 11 and suppose the rejection region is $\{\bar{x}: \bar{x} \ge 10.1004 \text{ or } \bar{x} \le 9.8940\} = \{z: z \ge 2.51 \text{ or } z \le -2.65\}$.
a. What is $\alpha$ for this procedure?
b. What is $\beta$ when $\mu = 10.1$? When $\mu = 9.9$? Is this desirable?

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

8.2 Tests About a Population Mean

The general discussion in Chapter 7 of confidence intervals for a population mean $\mu$ focused on three different cases. We now develop test procedures for these cases.

Case I: A Normal Population with Known $\sigma$

Although the assumption that the value of $\sigma$ is known is rarely met in practice, this case provides a good starting point because of the ease with which general procedures and their properties can be developed. The null hypothesis in all three cases will state that $\mu$ has a particular numerical value, the null value, which we will denote by $\mu_0$. Let $X_1, \ldots, X_n$ represent a random sample of size n from the normal population. Then the sample mean $\bar{X}$ has a normal distribution with expected value $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. When $H_0$ is true, $\mu_{\bar{X}} = \mu_0$. Consider now the statistic Z obtained by standardizing $\bar{X}$ under the assumption that $H_0$ is true:

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

Substitution of the computed sample mean $\bar{x}$ gives z, the distance between $\bar{x}$ and $\mu_0$ expressed in "standard deviation units." For example, if the null hypothesis is $H_0: \mu = 100$, $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 10/\sqrt{25} = 2.0$, and $\bar{x} = 103$, then the test statistic value is $z = (103 - 100)/2.0 = 1.5$. That is, the observed value of $\bar{x}$ is 1.5 standard deviations (of $\bar{X}$) larger than what we expect it to be when $H_0$ is true. The statistic Z is a natural measure of the distance between $\bar{X}$, the estimator of $\mu$, and its expected value when $H_0$ is true. If this distance is too great in a direction consistent with $H_a$, the null hypothesis should be rejected.

Suppose first that the alternative hypothesis has the form $H_a: \mu > \mu_0$. Then an $\bar{x}$ value less than $\mu_0$ certainly does not provide support for $H_a$. Such an $\bar{x}$ corresponds to a negative value of z (since $\bar{x} - \mu_0$ is negative and the divisor $\sigma/\sqrt{n}$ is positive). Similarly, an $\bar{x}$ value that exceeds $\mu_0$ by only a small amount (corresponding to z, which is positive but small) does not suggest that $H_0$ should be rejected in favor of $H_a$. The rejection of $H_0$ is appropriate only when $\bar{x}$ considerably exceeds $\mu_0$, that is, when the z value is positive and large. In summary, the appropriate rejection region, based on the test statistic Z rather than $\bar{X}$, has the form $z \ge c$.

As discussed in Section 8.1, the cutoff value c should be chosen to control the probability of a type I error at the desired level $\alpha$. This is easily accomplished because the distribution of the test statistic Z when $H_0$ is true is the standard normal distribution (that's why $\mu_0$ was subtracted in standardizing). The required cutoff c is the z critical value that captures upper-tail area $\alpha$ under the z curve.
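The standardization and the choice of cutoff described above can be sketched in a few lines of Python. This is only an illustration, not part of the text's procedure: the function names `z_statistic` and `upper_tail_cutoff` are hypothetical, and the standard normal quantile is taken from the standard library's `statistics.NormalDist` rather than from Appendix Table A.3.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma, n):
    """Standardize the sample mean under H0: (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))


def upper_tail_cutoff(alpha):
    """z critical value that captures upper-tail area alpha under the z curve."""
    return NormalDist().inv_cdf(1 - alpha)


# Values from the text's illustration: H0: mu = 100, sigma = 10, n = 25, xbar = 103.
z = z_statistic(xbar=103, mu0=100, sigma=10, n=25)

# Cutoff for an upper-tailed test; alpha = .05 is chosen here only as an example.
c = upper_tail_cutoff(0.05)

print(f"z = {z:.2f}, cutoff c = {c:.3f}, reject H0: {z >= c}")
```

With these inputs the statistic is 1.5, which falls short of the cutoff for $\alpha = .05$, so $H_0$ would not be rejected.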
As an example, let $c = 1.645$, the value that captures tail area .05 ($z_{.05} = 1.645$). Then,

$$\alpha = P(\text{type I error}) = P(H_0 \text{ is rejected when } H_0 \text{ is true}) = P(Z \ge 1.645 \text{ when } Z \sim N(0,1)) = 1 - \Phi(1.645) = .05$$

More generally, the rejection region $z \ge z_\alpha$ has type I error probability $\alpha$. The test procedure is upper-tailed because the rejection region consists only of large values of the test statistic.

Analogous reasoning for the alternative hypothesis $H_a: \mu < \mu_0$ suggests a rejection region of the form $z \le c$, where c is a suitably chosen negative number ($\bar{x}$ is far below $\mu_0$ if and only if z is quite negative). Because Z has a standard normal distribution when $H_0$ is true, taking $c = -z_\alpha$ yields $P(\text{type I error}) = \alpha$. This is a lower-tailed test. For example, $z_{.10} = 1.28$ implies that the rejection region $z \le -1.28$ specifies a test with significance level .10.

Finally, when the alternative hypothesis is $H_a: \mu \ne \mu_0$, $H_0$ should be rejected if $\bar{x}$ is too far to either side of $\mu_0$. This is equivalent to rejecting $H_0$ either if $z \ge c$ or if $z \le -c$. Suppose we desire $\alpha = .05$. Then,

$$.05 = P(Z \ge c \text{ or } Z \le -c \text{ when } Z \text{ has a standard normal distribution}) = \Phi(-c) + 1 - \Phi(c) = 2[1 - \Phi(c)]$$

Thus c is such that $1 - \Phi(c)$, the area under the z curve to the right of c, is .025 (and not .05). From Section 4.3 or Appendix Table A.3, $c = 1.96$, and the rejection region is $z \ge 1.96$ or $z \le -1.96$. For any $\alpha$, the two-tailed rejection region $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ has type I error probability $\alpha$ (since area $\alpha/2$ is captured under each of the two tails of the z curve). Again, the key reason for using the standardized test statistic Z is that because Z has a known distribution when $H_0$ is true (standard normal), a rejection region with desired type I error probability is easily obtained by using an appropriate critical value.

The test procedure for case I is summarized in the accompanying box, and the corresponding rejection regions are illustrated in Figure 8.2.
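The three rejection regions just derived can be collected into one small decision rule. The sketch below is an illustration under stated assumptions: the names `rejection_region` and `reject_h0` and the labels `"greater"`, `"less"`, and `"two-sided"` are hypothetical choices, and critical values come from `statistics.NormalDist` instead of a printed table, so they carry more decimal places than 1.645, 1.28, or 1.96.

```python
from statistics import NormalDist


def rejection_region(alpha, alternative):
    """Return (lower_cutoff, upper_cutoff) for a level-alpha z test.

    A cutoff of None means that tail is not part of the rejection region.
    """
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: mu > mu0, reject when z >= z_alpha
        return (None, std_normal.inv_cdf(1 - alpha))
    if alternative == "less":         # Ha: mu < mu0, reject when z <= -z_alpha
        return (std_normal.inv_cdf(alpha), None)
    if alternative == "two-sided":    # Ha: mu != mu0, area alpha/2 in each tail
        c = std_normal.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError(f"unknown alternative: {alternative!r}")


def reject_h0(z, alpha, alternative):
    """True when the observed z falls in the level-alpha rejection region."""
    lower, upper = rejection_region(alpha, alternative)
    in_lower_tail = lower is not None and z <= lower
    in_upper_tail = upper is not None and z >= upper
    return in_lower_tail or in_upper_tail


print(rejection_region(0.05, "two-sided"))
print(reject_h0(1.5, 0.05, "greater"))
```

Returning the cutoffs separately from the decision keeps the region itself inspectable, which mirrors the recommendation to state the rejection region before computing the test statistic value.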
Null hypothesis: $H_0: \mu = \mu_0$
Test statistic value: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Alternative Hypothesis      Rejection Region for Level $\alpha$ Test
$H_a: \mu > \mu_0$          $z \ge z_\alpha$ (upper-tailed test)
$H_a: \mu < \mu_0$          $z \le -z_\alpha$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed test)

Figure 8.2 Rejection regions for z tests: (a) upper-tailed test; (b) lower-tailed test; (c) two-tailed test. Each panel shows the z curve (the probability distribution of the test statistic Z when $H_0$ is true); the total shaded area equals $\alpha = P(\text{type I error})$, with area $\alpha/2$ in each tail for the two-tailed test.

Use of the following sequence of steps is recommended when testing hypotheses about a parameter.

1. Identify the parameter of interest and describe it in the context of the problem situation.
2. Determine the null value and state the null hypothesis.
3. State the appropriate alternative hypothesis.
4. Give the formula for the computed value of the test statistic (substituting the null value and the known values of any other parameters, but not those of any sample-based quantities).
5. State the rejection region for the selected significance level $\alpha$.
6. Compute any necessary sample quantities, substitute into the formula for the test statistic value, and compute that value.
7. Decide whether $H_0$ should be rejected, and state this conclusion in the problem context.

The formulation of hypotheses (Steps 2 and 3) should be done before examining the data.

Example 8.6

A manufacturer of sprinkler systems used for fire protection in office buildings claims that the true average system-activation temperature is 130°. A sample of $n = 9$ systems, when tested, yields a sample average activation temperature of 131.08°F. If the distribution of activation temperatures is normal with standard deviation 1.5°F, does the data contradict the manufacturer's claim at significance level $\alpha = .01$?

1. Parameter of interest: $\mu$ = true average activation temperature.
2. Null hypothesis: $H_0: \mu = 130$ (null value $= \mu_0 = 130$).
3. Alternative hypothesis: $H_a: \mu \ne 130$ (a departure from the claimed value in either direction is of concern).
4. Test statistic value:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 130}{1.5/\sqrt{n}}$$
5. Rejection region: The form of $H_a$ implies use of a two-tailed test with rejection region either $z \ge z_{.005}$ or $z \le -z_{.005}$. From Section 4.3 or Appendix Table A.3, $z_{.005} = 2.58$, so we reject $H_0$ if either $z \ge 2.58$ or $z \le -2.58$.
6. Substituting $n = 9$ and $\bar{x} = 131.08$,
$$z = \frac{131.08 - 130}{1.5/\sqrt{9}} = \frac{1.08}{.5} = 2.16$$
That is, the observed sample mean is a bit more than 2 standard deviations above what would have been expected were $H_0$ true.
7. The computed value $z = 2.16$ does not fall in the rejection region ($-2.58 < 2.16 < 2.58$), so $H_0$ cannot be rejected at significance level .01. The data does not give strong support to the claim that the true average differs from the design value of 130. ■

$\beta$ and Sample Size Determination

The z tests for case I are among the few in statistics for which there are simple formulas available for $\beta$, the probability of a type II error. Consider first the upper-tailed test with rejection region $z \ge z_\alpha$. This is equivalent to $\bar{x} \ge \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$, so $H_0$ will not be rejected if $\bar{x} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$. Now let $\mu'$ denote a particular value of $\mu$ that exceeds the null value $\mu_0$. Then,

$$\begin{aligned}
\beta(\mu') &= P(H_0 \text{ is not rejected when } \mu = \mu') \\
&= P(\bar{X} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n} \text{ when } \mu = \mu') \\
&= P\left(\frac{\bar{X} - \mu'}{\sigma/\sqrt{n}} < z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}} \text{ when } \mu = \mu'\right) \\
&= \Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)
\end{aligned}$$

As $\mu'$ increases, $\mu_0 - \mu'$ becomes more negative, so $\beta(\mu')$ will be small when $\mu'$ greatly exceeds $\mu_0$ (because the value at which $\Phi$ is evaluated will then be quite negative). Error probabilities for the lower-tailed and two-tailed tests are derived in an analogous manner.

If $\sigma$ is large, the probability of a type II error can be large at an alternative value $\mu'$ that is of particular concern to an investigator. Suppose we fix $\alpha$ and also specify $\beta$ for such an alternative value. In the sprinkler example, company officials might view $\mu' = 132$ as a very substantial departure from $H_0: \mu = 130$ and therefore wish $\beta(132) = .10$ in addition to $\alpha = .01$. More generally, consider the two restrictions $P(\text{type I error}) = \alpha$ and $\beta(\mu') = \beta$ for specified $\alpha$, $\mu'$, and $\beta$. Then for an upper-tailed test, the sample size n should be chosen to satisfy

$$\Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) = \beta$$

This implies that

$$-z_\beta = \text{z critical value that captures lower-tail area } \beta = z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}$$

It is easy to solve this equation for the desired n. A parallel argument yields the necessary sample size for lower- and two-tailed tests as summarized in the next box.
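The upper-tailed formulas just derived can be checked numerically with a short Python sketch. Everything here is illustrative rather than prescribed by the text: the function names `beta_upper_tailed` and `sample_size_one_tailed` are hypothetical, the inputs ($\mu_0 = 100$, $\mu' = 105$, $\sigma = 10$, $n = 25$, $\alpha = .05$, target $\beta = .10$) are made-up values, and quantiles come from `statistics.NormalDist` rather than Appendix Table A.3.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def beta_upper_tailed(mu0, mu_alt, sigma, n, alpha):
    """Type II error probability at mu_alt for a level-alpha upper-tailed z test.

    Implements beta(mu') = Phi(z_alpha + (mu0 - mu') / (sigma / sqrt(n))).
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_alpha + (mu0 - mu_alt) / (sigma / sqrt(n)))


def sample_size_one_tailed(mu0, mu_alt, sigma, alpha, beta):
    """Smallest integer n for which a level-alpha one-tailed test has beta(mu') <= beta.

    Solves -z_beta = z_alpha + (mu0 - mu') / (sigma / sqrt(n)) for n and rounds up.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    n_exact = (sigma * (z_alpha + z_beta) / (mu0 - mu_alt)) ** 2
    return ceil(n_exact)


# Illustrative (assumed) inputs, not taken from the text.
b = beta_upper_tailed(mu0=100, mu_alt=105, sigma=10, n=25, alpha=0.05)
n_needed = sample_size_one_tailed(mu0=100, mu_alt=105, sigma=10, alpha=0.05, beta=0.10)

print(f"beta(105) with n = 25: {b:.4f}")
print(f"n needed for beta(105) = .10: {n_needed}")
```

Rounding up rather than to the nearest integer is a deliberate choice in this sketch: a larger n can only lower $\beta(\mu')$, so the ceiling guarantees the type II error requirement is met.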
Alternative Hypothesis      Type II Error Probability $\beta(\mu')$ for a Level $\alpha$ Test
$H_a: \mu > \mu_0$          $\Phi\left(z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$
$H_a: \mu < \mu_0$          $1 - \Phi\left(-z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$
$H_a: \mu \ne \mu_0$        $\Phi\left(z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) - \Phi\left(-z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

where $\Phi(z)$ = the standard normal cdf.

The sample size n for which a level $\alpha$ test also has $\beta(\mu') = \beta$ at the alternative value $\mu'$ is

$$n = \begin{cases} \left[\dfrac{\sigma(z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a one-tailed (upper or lower) test} \\[2ex] \left[\dfrac{\sigma(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a two-tailed test (an approximate solution)} \end{cases}$$

Example 8.7

Let $\mu$ denote the true average tread life of a certain type of tire. Consider testing $H_0: \mu = 30{,}000$ versus $H_a: \mu > 30{,}000$ based on a sample of size $n = 16$ from a normal population distribution with $\sigma = 1500$. A test with $\alpha = .01$ requires $z_\alpha = z_{.01} = 2.33$. The probability of making a type II error when $\mu = 31{,}000$ is

$$\beta(31{,}000) = \Phi\left(2.33 + \frac{30{,}000 - 31{,}000}{1500/\sqrt{16}}\right) = \Phi(-.34) = .3669$$

Since $z_{.1} = 1.28$, the requirement that the level .01 test also have $\beta(31{,}000) = .1$ necessitates

$$n = \left[\frac{1500(2.33 + 1.28)}{30{,}000 - 31{,}000}\right]^2 = (-5.42)^2 = 29.32$$

The sample size must be an integer, so $n = 30$ tires should be used. ■

Case II: Large-Sample Tests

When the sample size is large, the z tests for case I are easily modified to yield valid test procedures without requiring either a normal population distribution or known $\sigma$. The key result was used in Chapter 7 to justify large-sample confidence intervals: A large n implies that the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. Substitution of the null value $\mu_0$ in place of $\mu$ yields the test statistic

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

which has approximately a standard normal distribution when $H_0$ is true. The use of rejection regions given previously for case I (e.g., $z \ge z_\alpha$ when the alternative hypothesis is $H_a: \mu > \mu_0$) then results in test procedures for which the significance level is approximately (rather than exactly) $\alpha$. The rule of thumb $n > 40$ will again be used to characterize a large sample size.

Example 8.8

A dynamic cone penetrometer (DCP) is used for measuring material resistance to penetration (mm/blow) as a cone is driven into pavement or subgrade. Suppose that for a particular application it is required that the true average DCP value for a certain type of pavement be less than 30. The pavement will not be used unless there is conclusive evidence that the specification has been met. Let's state and test the appropriate hypotheses using the following data ("Probabilistic Model for the Analysis of Dynamic Cone Penetrometer Test Values in Pavement Structure Evaluation," J. of Testing and Evaluation, 1999: 7–14):

14.1 14.5 15.5 16.0 16.0 16.7 16.9 17.1 17.5 17.8
17.8 18.1 18.2 18.3 18.3 19.0 19.2 19.4 20.0 20.0
20.8 20.8 21.0 21.5 23.5 27.5 27.5 28.0 28.3 30.0
30.0 31.6 31.7 31.7 32.5 33.5 33.9 35.0 35.0 35.0
36.7 40.0 40.0 41.3 41.7 47.5 50.0 51.0 51.8 54.4
55.0 57.0

Figure 8.3 shows a descriptive summary obtained from Minitab. The sample mean DCP is less than 30. However, there is a substantial amount of variation in the data (sample coefficient of variation $= s/\bar{x} = .4265$), so the fact that the mean is less than the design specification cutoff may be a consequence just of sampling variability. Notice that the histogram does not resemble at all a normal curve (and a normal probability plot does not exhibit a linear pattern), but the large-sample z tests do not require a normal population distribution.

Descriptive Statistics, Variable: DCP
Anderson-Darling Normality Test: A-Squared 1.902, P-Value 0.000
Mean 28.7615
StDev 12.2647
Variance 150.423
Skewness 0.808264
Kurtosis –3.9E–01
N 52
Minimum 14.1000
1st Quartile 18.2250
Median 27.5000
3rd Quartile 35.0000
Maximum 57.0000
95% Confidence Interval for Mu: 25.3470 to 32.1761
95% Confidence Interval for Sigma: 10.2784 to 15.2098
95% Confidence Interval for Median: 20.0000 to 31.7000

Figure 8.3 Minitab descriptive summary for the DCP data of Example 8.8

1. $\mu$ = true average DCP value

2. $H_0: \mu = 30$

3. $H_a: \mu < 30$ (so the pavement will not be used unless the null hypothesis is rejected)

4. $z = \dfrac{\bar{x} - 30}{s/\sqrt{n}}$

5.

A test with significance level .05 rejects H when a