Introduction Lyman Ott Michael Longnecker
3.
Problem: What effect does nitrogen fertilizer have on wheat production? For a study of the effects of nitrogen fertilizer on wheat production, a total of 15 fields were available to the researcher. She randomly assigned three fields to each of the five nitrogen rates under investigation. The same variety of wheat was planted in all 15 fields. The fields were culti- vated in the same manner until harvest, and the number of pounds of wheat per acre was then recorded for each of the 15 fields. The experi- menter wanted to determine the optimal level of nitrogen to apply to any wheat field, but, of course, she was limited to running experiments on a limited number of fields. After determining the amount of nitrogen that yielded the largest production of wheat in the study fields, the experimenter then concluded that similar results would hold for wheat fields possessing characteristics somewhat the same as the study fields. Is the experimenter justified in reaching this conclusion? 4. Problem: Determining public opinion toward a question, issue, product, or candidate. Similar applications of statistics are brought to mind by the frequent use of the New York TimesCBS News, Washington Post ABC News, CNN, Harris, and Gallup polls. How can these poll- sters determine the opinions of more than 195 million Americans who are of voting age? They certainly do not contact every potential voter in the United States. Rather, they sample the opinions of a small number of potential voters, perhaps as few as 1,500, to estimate the reaction of every person of voting age in the country. The amazing result of this process is that if the selection of the voters is done in an unbiased way and voters are asked unambiguous, nonleading questions, the fraction of those persons contacted who hold a particular opinion will closely match the fraction in the total population holding that opinion at a particular time. We will supply convincing supportive evidence of this assertion in subsequent chapters. These problems illustrate the four-step process in Learning from Data. First, there was a problem or question to be addressed. Next, for each problem a study or experiment was proposed to collect meaningful data to answer the problem. The quality assurance department had to decide both how many bulbs needed to be tested and how to select the sample of 1,000 bulbs from the total production of bulbs to obtain valid results. The polling groups must decide how many voters to sample and how to select these individuals in order to obtain information that is representative of the population of all voters. Similarly, it was necessary to care- fully plan how many participants in the weight-gain study were needed and how they were to be selected from the list of all such participants. Furthermore, what variables should the researchers have measured on each participant? Was it neces- sary to know each participant’s age, sex, physical fitness, and other health-related variables, or was weight the only important variable? The results of the study may not be relevant to the general population if many of the participants in the study had a particular health condition. In the wheat experiment, it was important to measure both the soil characteristics of the fields and the environmental condi- tions, such as temperature and rainfall, to obtain results that could be generalized to fields not included in the study. The design of a study or experiment is crucial to obtaining results that can be generalized beyond the study. Finally, having collected, summarized, and analyzed the data, it is important to report the results in unambiguous terms to interested people. For the lightbulb example, management and technical staff would need to know the quality of their production batches. Based on this information, they could determine whether adjustments in the process are necessary. Therefore, the results of the statistical analyses cannot be presented in ambiguous terms; decisions must be made from a well-defined knowledge base. The results of the weight-gain study would be of vital interest to physicians who have patients participating in the smoking-cessation program. If a significant increase in weight was recorded for those individuals who had quit smoking, physicians may have to recommend diets so that the former smokers would not go from one health problem smoking to another elevated blood pressure due to being overweight. It is crucial that a careful description of the participants—that is, age, sex, and other health-related information—be in- cluded in the report. In the wheat study, the experiment would provide farmers with information that would allow them to economically select the optimum amount of nitrogen required for their fields. Therefore, the report must contain information concerning the amount of moisture and types of soils present on the study fields. Otherwise, the conclusions about optimal wheat production may not pertain to farmers growing wheat under considerably different conditions. To infer validly that the results of a study are applicable to a larger group than just the participants in the study, we must carefully define the population see Definition 1.1 to which inferences are sought and design a study in which the sample see Definition 1.2 has been appropriately selected from the designated population. We will discuss these issues in Chapter 2. population sample DEFINITION 1.1 A population is the set of all measurements of interest to the sample collec- tor. See Figure 1.2. DEFINITION 1.2 A sample is any subset of measurements selected from the population. See Figure 1.2. 1.2 Why Study Statistics? We can think of many reasons for taking an introductory course in statistics. One reason is that you need to know how to evaluate published numerical facts. Every person is exposed to manufacturers’ claims for products; to the results of sociolog- ical, consumer, and political polls; and to the published results of scientific re- search. Many of these results are inferences based on sampling. Some inferences are valid; others are invalid. Some are based on samples of adequate size; others are not. Yet all these published results bear the ring of truth. Some people partic- ularly statisticians say that statistics can be made to support almost anything. Others say it is easy to lie with statistics. Both statements are true. It is easy, purposely or unwittingly, to distort the truth by using statistics when presenting the results of sampling to the uninformed. It is thus crucial that you become an informed and critical reader of data-based reports and articles. A second reason for studying statistics is that your profession or employment may require you to interpret the results of sampling surveys or experimentation or to employ statistical methods of analysis to make inferences in your work. For example, practicing physicians receive large amounts of advertising describing the benefits of new drugs. These advertisements frequently display the numerical results of experiments that compare a new drug with an older one. Do such data really imply that the new drug is more effective, or is the observed difference in results due simply to random variation in the experimental measurements? Recent trends in the conduct of court trials indicate an increasing use of probability and statistical inference in evaluating the quality of evidence. The use of statistics in the social, biological, and physical sciences is essential because all these sciences make use of observations of natural phenomena, through sample surveys or experimentation, to develop and test new theories. Statistical methods are employed in business when sample data are used to forecast sales and profit. In addition, they are used in engineering and manufacturing to monitor product qual- ity. The sampling of accounts is a useful tool to assist accountants in conducting au- dits. Thus, statistics plays an important role in almost all areas of science, business, and industry; persons employed in these areas need to know the basic concepts, strengths, and limitations of statistics. The article “What Educated Citizens Should Know About Statistics and Probability,” by J. Utts, in The American Statistician, May 2003, contains a number FIGURE 1.2 Population and sample Set of all measurements: the population Set of measurements selected from the population: the sample of statistical ideas that need to be understood by users of statistical methodology in order to avoid confusion in the use of their research findings. Misunderstandings of statistical results can lead to major errors by government policymakers, medical workers, and consumers of this information. The article selected a number of top- ics for discussion. We will summarize some of the findings in the article. A com- plete discussion of all these topics will be given throughout the book. 1. One of the most frequent misinterpretations of statistical findings is when a statistically significant relationship is established between two variables and it is then concluded that a change in the explanatory variable causes a change in the response variable. As will be discussed in the book, this conclusion can be reached only under very restrictive constraints on the experimental setting. Utts examined a recent Newsweek article discussing the relationship between the strength of religious beliefs and physical healing. Utts’ article discussed the problems in reaching the conclusion that the stronger a patient’s reli- gious beliefs, the more likely patients would be cured of their ailment. Utts shows that there are numerous other factors involved in a patient’s health, and the conclusion that religious beliefs cause a cure can not be validly reached. 2. A common confusion in many studies is the difference between statisti- cally significant findings in a study and practically significant findings. This problem often occurs when large data sets are involved in a study or experiment. This type of problem will be discussed in detail through- out the book. We will use a number of examples that will illustrate how this type of confusion can be avoided by the researcher when reporting the findings of their experimental results. Utts’ article illustrated this problem with a discussion of a study that found a statistically significant difference in the average heights of military recruits born in the spring and in the fall. There were 507,125 recruits in the study and the differ- ence in average height was about 1 兾4 inch. So, even though there may be a difference in the actual average height of recruits in the spring and the fall, the difference is so small 1 兾4 inch that it is of no practical importance.3.
The size of the sample also may be a determining factor in studies in which statistical significance is not found. A study may not have selected a sample size large enough to discover a difference between the several populations under study. In many government-sponsored studies, the researchers do not receive funding unless they are able to demonstrate that the sample sizes selected for their study are of an appropriate size to detect specified differences in populations if in fact they exist. Methods to determine appropriate sample sizes will be pro- vided in the chapters on hypotheses testing and experimental design. 4. Surveys are ubiquitous, especially during the years in which national elections are held. In fact, market surveys are nearly as widespread as political polls. There are many sources of bias that can creep into the most reliable of surveys. The manner in which people are selected for inclusion in the survey, the way in which questions are phrased, and even the manner in which questions are posed to the subject may affect the conclusions obtained from the survey. We will discuss these issues in Chapter 2.Parts
» Introduction 2 Why Study Statistics? 6
» Some Current Applications of Statistics 8 A Note to the Student 12
» Summary 13 Exercises 13 Lyman Ott Michael Longnecker
» Introduction and Abstract of Research Study 16 Observational Studies 18
» Sampling Designs for Surveys 24 Experimental Studies 30
» Designs for Experimental Studies 35 Research Study: Exit Polls versus Election Results 46
» Summary 47 Exercises 48 Lyman Ott Michael Longnecker
» Introduction and Abstract of Research Study 56 Calculators, Computers, and Software Systems 61
» Describing Data on a Single Variable: Measures of Variability 85 The Boxplot 97
» Summary and Key Formulas 116 Introduction and Abstract of Research Study 140
» Finding the Probability of an Event 144 Basic Event Relations and Probability Laws 146
» Conditional Probability and Independence 149 Bayes’ Formula 152
» Variables: Discrete and Continuous 155 Probability Distributions for Discrete Random Variables 157
» A Continuous Probability Distribution: The Normal Distribution 171 Random Sampling 178
» Sampling Distributions 181 Normal Approximation to the Binomial 191
» Minitab Instructions 201 Summary and Key Formulas 203
» Exercises 203 Introduction and Abstract of Research Study 222
» Estimation of m 225 Choosing the Sample Size for Estimating m 230
» A Statistical Test for m 232 Research Study: Effects of Oil Spill on Plant Growth 325
» Summary and Key Formulas 330 Introduction and Abstract of Research Study 360
» Summary and Key Formulas 386 Introduction and Abstract of Research Study 402
» Checking on the AOV Conditions 416 An Alternative Analysis: Transformations of the Data 421
» Summary and Key Formulas 436 Introduction and Abstract of Research Study 451
» Measuring Strength of Relation 528 Odds and Odds Ratios 530
» Summary and Key Formulas 545 Introduction and Abstract of Research Study 572
» Estimating Model Parameters 581 Inferences about Regression Parameters 590
» Predicting New y Values Using Regression 594 Examining Lack of Fit in Linear Regression 598
» The Inverse Regression Problem Calibration 605 Correlation 608
» Research Study: Two Methods for Detecting E. coli 616 Summary and Key Formulas 621
» Introduction and Abstract of Research Study 664 The General Linear Model 674
» Estimating Multiple Regression Coefficients 675 Inferences in Multiple Regression 683
» Testing a Subset of Regression Coefficients 691 Introduction and Abstract of Research Study 878
» The Extrapolation Problem 1023 Introduction and Abstract of Research Study 1091
» Exercises 1160 Lyman Ott Michael Longnecker
» Introduction Lyman Ott Michael Longnecker
» Why Study Statistics? Lyman Ott Michael Longnecker
» Some Current Applications of Statistics
» Introduction and Abstract of Research Study
» Observational Studies Lyman Ott Michael Longnecker
» Sampling Designs for Surveys
» Experimental Studies Lyman Ott Michael Longnecker
» Designs for Experimental Studies
» Research Study: Exit Polls versus Election Results
» Summary Lyman Ott Michael Longnecker
» Exercises Lyman Ott Michael Longnecker
» A prospective study is conducted to study the relationship between incidence of
» A study was conducted to examine the possible relationship between coronary disease
» A hospital introduces a new screening procedure to identify patients suffering from
» A high school mathematics teacher is convinced that a new software program will
» Do you think the two types of surveys will yield similar results on the percentage of
» What types of biases may be introduced into each of the surveys? Edu.
» Each name is randomly assigned a number. The names with numbers 1 through 1,000
» The Environmental Protection Agency EPA is required to inspect landfills in the
» factors b. factor levels blocks d. experimental unit measurement unit f. replications treatments
» A horticulturalist is measuring the vitamin C concentration in oranges in an orchard
» A medical specialist wants to compare two different treatments T
» In the design described in b make the following change. Within each hospital, the
» An experiment is planned to compare three types of schools—public, private-
» measurement unit f. replications
» treatments Lyman Ott Michael Longnecker
» The 48 treatments comprised 3, 4, and 4 levels of fertilizers N, P, and K, respectively,
» Ten different software packages were randomly assigned to 30 graduate students. The
» Four different glazes are applied to clay pots at two different thicknesses. The kiln
» There are two possible experimental designs. Design A would use a random sample
» When asked how the experiment is going, the researcher replies that one recipe smelled
» What information should be collected from the workers? Bus.
» How would you obtain a list of private doctors and medical facilities so that a sample
» What are some possible sources for determining the population growth and health
» How could you sample the population of health care facilities and types of private
» Calculators, Computers, and Software Systems
» Describing Data on a Single Variable: Graphical Methods
» Describing Data on a Single Variable:
» The Boxplot Lyman Ott Michael Longnecker
» Summarizing Data from More Than One Variable:
» Research Study: Controlling for Student Background
» Construct a pie chart for these dat b. Construct a bar chart for these dat
» If one of these 25 days were selected at random, what would be the chance probability
» Would you describe per capita income as being fairly homogenous across the
» Construct separate relative frequency histograms for the survival times of both the
» Compare the two histograms. Does the new therapy appear to generate a longer
» Plot the defense expenditures time-series data and describe any trends across the time
» Plot the four separate time series and describe any trends in the separate time
» Do the trends appear to imply a narrowing in the differences between male and
» Construct a relative frequency histogram plot for the homeownership data given in the
» How could Congress use the information in these plots for writing tax laws that allow
» Describing Data on a Single Variable: Measures of Central Tendency
» What does the relationship among the three measures of center indicate about the
» Which of the three measures would you recommend as the most appropriate repre-
» Using your results from parts a and b, comment on the relative sensitivity of the
» Which measure, mean or median, would you recommend as the most appropriate
» What measure or measures best summarize the center of these distributions?
» Describing Data on a Single Variable: Measures of Variability
» Verify that the mean years’ experience is 5 years. Does this value appear to ade-
» Verify that c. Calculate the sample variance and standard deviation for the experience data.
» Calculate the coefficient of variation CV for both the racer’s age and their years of
» Estimate the standard deviations for both the racer’s age and their years of experience
» Might another measure of variability be better to compare luxury and budget hotel
» Generate a time-series plot of the mercury concentrations and place lines for both
» Select the most appropriate measure of center for the mercury concentrations. Compare
» Compare the variability in mercury concentrations at the two sites. Use the CV in
» When comparing the center and variability of the two sites, should the years
» From these graphs, determine the median, lower quartile, and upper quartile for the
» Comment on the similarities and differences in the distributions of daily costs for the
» Compare the mean and median for the 3 years of dat Which value, mean or median,
» Compare the degree of variability in homeownership rates over the 3 years. Soc.
» Use this side-by-side boxplot to discuss changes in the median homeownership rate
» Use this side-by-side boxplot to discuss changes in the variation in these rates over the
» Construct the intervals Lyman Ott Michael Longnecker
» Why do you think the Empirical Rule and your percentages do not match well? Edu.
» Data analysts often find it easier to work with mound-shaped relative frequency his-
» Refer to the data of Exercise 3.49. a. Compute the sample median and the mode.
» Refer to the data of Exercise 3.49. a. Compute the interquartile range.
» Find the 20th percentile for the homeownership percentage and interpret this value
» Congress wants to designate those states that have the highest homeownership percent-
» Similarly identify those states that fall into the upper 10th percentile of homeownership
» Can the combined median be calculated from the medians for each number of members? Gov.
» The DJIA is a summary of data. Does the DJIA provide information about a popula-
» Interpret the values 62, 32.8, 41.3, and 7.8 in the upper left cell of the cross tabulation.
» Does there appear to be a difference in the relationships between the seizure counts
» Describe the type of apparent differences, if any, that you found in a.
» Predict the effect of removing the patient with ID 207 from the data set on the size of
» Using a computer program, compute the correlations with patient ID 207 removed
» Is there support for the same conclusions for the math scores as obtained for the read-
» If the conclusions are different, why do you suppose this has happened? Med.
» Why is it not possible to conclude that large relative values for minority and
» List several variables related to the teachers and students in the schools which may be
» Construct a scatterplot of the number of AIDS cases versus the number of tuberculo-
» Compute the correlation between the number of AIDS cases and the number of
» Why do you think there may be a correlation between these two diseases? Med.
» Compute the correlation between the number of syphilis cases and the number of
» Identify the states having number of tuberculosis cases that are above the 90th
» Compute the correlation coefficient for this data set. Is there a strong or weak rela-
» Finding the Probability of an Event
» Basic Event Relations and Probability Laws
» b. Lyman Ott Michael Longnecker
» Conditional Probability and Independence
» Bayes’ Formula Lyman Ott Michael Longnecker
» Variables: Discrete and Continuous
» Probability Distributions for Discrete Random Variables
» Two Discrete Random Variables:
» Probability Distributions for Continuous
» A Continuous Probability Distribution:
» Random Sampling Lyman Ott Michael Longnecker
» Sampling Distributions Lyman Ott Michael Longnecker
» Normal Approximation to the Binomial
» Evaluating Whether or Not a Population
» Research Study: Inferences about Performance-
» Minitab Instructions Lyman Ott Michael Longnecker
» The National Angus Association has stated that there is a 60
» The quality control section of a large chemical manufacturing company has under-
» A new blend of coffee is being contemplated for release by the marketing division of a
» The probability that a customer will receive a package the day after it was sent by a
» The sportscaster in College Station, Texas, states that the probability that the Aggies
» Let a two-digit number represent an individual running of the screening test. Which
» If we generate 2,000 sets of 20 two-digit numbers, how can the outcomes of this simula-
» The state consumers affairs office provided the following information on the frequency of
» What is the probability that a randomly selected car will need some repairs?
» Suppose you purchase a lottery ticket. What is the probability that your 3-digit number
» Which of the probability approaches subjective, classical, or relative frequency did
» In Exercise 4.10, assume that each one of the outcomes has probability 1
» A: Observing exactly 1 head b. B: Observing 1 or more heads
» C: Observing no heads 4.12 For Exercise 4.11:
» A: Observe a 6 b. B: Observe an odd number
» C: Observe a number greater than 3 d. D: Observe an even number and a number greater than 2
» Complement of A b. Either A or B
» A volunteer blood donor walks into a Red Cross Blood office. What is the probability
» It is brown. b. It is red or green.
» It is not blue. d. It is both red and brown. b.
» Refer to Exercise 4.11. a. Are the events A and B independent? Why or why not?
» Which pairs of the events A B, B C, and A C are mutually exclusive? Justify
» Describe in words the event T
» Compute the probability of the occurrence of the event T
» What is the probability that a professional selected at random would accept the
» What is the probability that a professional selected at random is part of a two-
» Refer to Exercise 4.23 a. Are events A and B independent?
» Find Find PA, PB, and PC. b. Find , , and .
» Find , , Lyman Ott Michael Longnecker
» Suppose two customers are chosen at random from the list of all customers. What is
» Find the probability that a customer selected at random will pay two consecutive
» Find the probability that a customer selected at random will pay neither of two
» Find the probability that a customer chosen at random will pay exactly one month in full.
» In Example 4.4, compute the probability that the test incorrectly identifies the defects D
» Find the probability that a patient truly did not have appendicitis given that the radio-
» The thickness of ice 20 feet from the shoreline in Lake Superior during a random day
» Is the number of cars running a red light during a given light cycle a discrete or
» Is the time between the light turning red and the last car passing through the inter-
» Are the number of students in class responding Strongly agree a continuous or discrete
» Are the percent of students in class responding Strongly agree a continuous or discrete
» Construct a graph of Py. b. Find .
» Find . d. Find . Lyman Ott Michael Longnecker
» Suppose the fire station must call for additional equipment from a neighboring city
» A biologist randomly selects 10 portions of water, each equal to .1 cm
» All 10 automobiles failed the inspection. b. Exactly 6 of the 10 failed the inspection.
» Six or more failed the inspection. d. All 10 passed the inspection.
» Two are rated as outstanding. b. Two or more are rated as outstanding.
» Py ⫽ 1 given m ⫽ 3.0 b. Py ⬎ 1 given m ⫽ 2.5
» No cars arrive. b. More than one car arrives.
» Write an expression for the probability that there are less than six sales, do not com-
» What assumptions are needed to write the expression in part a?
» Use a computer program if available to compute the exact probability that less than
» z ⫽ 0 and z ⫽ 1.6 b. z ⫽ 0 and z ⫽ 2.3
» Repeat Exercise 4.53 for these values: a. z ⫽ .7 and z ⫽ 1.7
» z ⫽ ⫺1.2 and z ⫽ 0 4.55 Repeat Exercise 4.53 for these values:
» z ⫽ ⫺1.29 and z ⫽ 0 b. z ⫽ ⫺.77 and z ⫽ 1.2
» Repeat Exercise 4.53 for these values: a. z ⫽ ⫺1.35 and z ⫽ ⫺.21
» z ⫽ ⫺.37 and z ⫽ ⫺1.20 4.57 Find the probability that z is greater than 1.75.
» Find the probability that z is less than 1.14. 4.59 Find a value for z, say z
» Find a value for z, say z , such that Pz ⬎ z
» Find a value for z, say z , such that P⫺z
» Py ⬎ 100 b. Py ⬎ 105 Lyman Ott Michael Longnecker
» P100 ⬍ y ⬍ 108 Lyman Ott Michael Longnecker
» P304 ⬍ y ⬍ 665 d. k such that P500 ⫺ k ⬍ y ⬍ 500 ⫹ k ⫽ .60
» Convert y ⬎ 85 to the z-score equivalent. c. Find Py ⬍ 115 and Py ⬎ 85.
» Find the value of z for these areas. a. an area .025 to the right of z
» an area .05 to the left of z
» Find the probability of observing a value of z greater than these values. a. 1.96
» 2.21 c. ⫺2.86 Lyman Ott Michael Longnecker
» ⫺0.73 Lyman Ott Michael Longnecker
» What is the probability that the elapsed time between submission and reimbursement
» If you had a travel voucher submitted more than 55 days ago, what might you
» Greater than 600 b. Greater than 700
» Less than 450 d. Between 450 and 600
» Py ⬍ 200 b. Py ⬎ 100 Lyman Ott Michael Longnecker
» Using either a random number table or a computer program, generate a second ran-
» Give several reasons why you need to generate a different set of random numbers for
» Refer to Exercise 4.77. Describe the sampling distribution for the sample sum . Is it un-
» What fraction of the patients scored between 800 and 1,100? b. Less than 800?
» Greater than 1,200? So Lyman Ott Michael Longnecker
» Use the Empirical Rule to describe the distribution of y, the number of patients
» If the facility was built with a 160-patient capacity, what fraction of the weeks might
» What proportion of the population spends more than 7 hours per day watching
» In a 1998 study of television viewing, a random sample of 500 adults reported that the
» If the EPA mandates that a nitrogen oxide level of 2.7 gm cannot be exceeded, what
» At most, 25 of Polluters exceed what nitrogen oxide level value that is, find the
» The company producing the Polluter must reduce the nitrogen oxide level so that
» If a patient’s systolic readings during a given day have a normal distribution with
» If five measurements are taken at various times during the day, what is the probability
» How many measurements would be required so that the probability is at most 1 of
» The expected number of errors b. The probability of observing fewer than four errors
» The probability of observing more than two errors
» Compute the exact probabilities and corresponding normal approximations for y
» The normal approximation can be improved slightly by taking Py
» Compute the exact probabilities and corresponding normal approximations with the
» Let y be a binomial random variable with n
» Calculate P4 Lyman Ott Michael Longnecker
» Use a normal approximation without the continuity correction to calculate the
» Refer to Exercise 4.89. Use the continuity correction to compute the probability P4
» Does it appear that the 45 data values appear to be a random sample from a normal
» Compute the correlation coefficient and p-value to assess whether the data appear to
» Use the 45 sample means to determine whether the sampling distribution of the
» Compute the correlation coefficient and p-value to assess whether the 45 means
» Use a normal quantile plot to assess whether the data appear to fit a normal
» Compute the correlation coefficient and p-value for the normal quantile plot.
» Find the probability of selecting a 1-foot-square sample of material at random that on
» Describe the sampling distribution for based on random samples of 15 1-foot sections.
» What percentage of the males in this age bracket could be expected to have a serum
» Estimation of M Lyman Ott Michael Longnecker
» Choosing the Sample Size for Estimating M
» Choosing the Sample Size for Testing M
» The Level of Significance of a Statistical Test
» Inferences about M for a Normal Population,
» Inferences about M When Population Is Nonnormal
» Research Study: Percent Calories from Fat
» What characteristics of the nurses other than dietary intake might be important in
» Calculate a 95 confidence interval for the mean caffeine content m of the coffee pro-
» Explain to the CEO of the company in nonstatistical language, the interpretation of
» What would happen to the width of the confidence intervals if the level of confidence
» What would happen to the width of the confidence intervals if the number of samples
» If the level of confidence remains at 95 for the 720 confidence intervals in a given
» If the number of samples is increased from 50 to 100 each hour, how many of the
» If the number of samples remains at 50 each hour but the level of confidence is in-
» Construct a 99 confidence interval for the mean gross profit margin of m of all small
» The city manager reads the report and states that the confidence interval for m con-
» If the level of confidence remains at 99 but the tolerable width of the interval is .4,
» If the level of confidence decreases to 95 but the specified width of the interval
» If the level of confidence increases to 99.5 but the specified width of the interval
» If the level of confidence is increased to 99 with the average rent estimated to
» Suppose the budget for the project will not support both increasing the level of confi-
» Using a ⫽ .05, what conclusions can you make about the hypotheses based on the
» Refer to Exercise 5.18. Sketch the power curve for rejecting H : m ⱖ 28 by determining
» Suppose we keep a ⫽ .05 but change to n ⫽ 20. Without actually recalculating the
» How many of the 100 tests of hypotheses resulted in your reaching the decision to
» Suppose you were to conduct 100 tests of hypotheses and in each of these tests the
» What type of error are you making if you incorrectly reject H ?
» What proportion of the 100 tests of hypotheses resulted in the correct decision, that
» In part a, you were estimating the power of the test when m
» Based on your calculation in b how many of the 100 tests of hypotheses would you
» Did decreasing a from .05 to .01 increase or decrease the power of the test? Explain
» Refer to Exercise 5.23. Compute the power of the test PWRm
» Refer to Exercise 5.26. Suppose a random sample of 100 students is selected yielding
» Is there sufficient evidence a ⫽ .05 in the data that the mean lead concentration
» Based on your answer in c, is the sample size large enough for the test procedures to
» Place a 95 confidence interval on the mean reading time for all incoming freshmen
» Plot the reading time using a normal probability plot or boxplot. Do the data appear
» What are some weak points in this study relative to evaluating the potential of the
» Place a 99 confidence interval on the average number of miles driven, m, prior to
» Is there significant evidence a ⫽ .01 that the manufacturer’s claim is false? What is
» Is there a contradiction between the interval estimate of m and the conclusion
» Test the research hypothesis that the mean oxygen level is less than 5 ppm. What is
» Assuming the eighteen 2-week periods are fairly typical of the volumes throughout
» Construct a 90 confidence interval for the mean change in mileage. On the basis of
» Refer to Exercise 5.45. a. Calculate the probability of a Type II error for several values of m
» Suggest some changes in the way in which this study in Exercise 5.45 was conducted.
» Use a computer program to obtain 1,000 bootstrap samples from the 20 comprehen-
» Use a computer program to obtain 1,000 bootstrap samples from the 15 tire wear
» Use a computer program to obtain 1,000 bootstrap samples from the 8 oxygen levels.
» Use a computer program to obtain 1,000 bootstrap samples from the 18 recycle vol-
» Compare the p-value from part a to the p-value obtained in Exercise 5.44. x
» Use Table 4 in the Appendix to obtain L
» Use the large-sample approximation to determine L
» Graph the data using a boxplot or normal probability plot and determine whether the
» Based on your answer to part a, is the mean or the median cost per household a
» Place a 95 confidence interval on the amount spent on health care by the typical
» Does the typical worker spend more than 400 per year on health care needs? Use
» Is there sufficient evidence that a blood-alcohol level of .1 causes any increase in
» Which summary of reaction time differences seems more appropriate, the mean or
» Is there sufficient evidence that the median difference in reaction time is greater than
» What other factors about the drivers are important in attempting to decide whether
» For both fund A and fund B, estimate the mean and median annual rate of return and
» Which of the parameters, the mean or median, do you think best represents the
» Is there sufficient evidence that the mean annual rate of return for the two mutual
» Suppose the population has a distribution that is highly skewed to the right. The
» When testing hypotheses about the mean or median of a highly skewed population, the
» When testing hypotheses about the mean or median of a lightly skewed population,
» Construct a 95 confidence on the mean time to handle a complaint after imple-
» Is there sufficient evidence that the incentive plan has reduced the mean time to handle
» Is there sufficient evidence that the mean mercury concentration has increased since
» Assuming that the standard deviation of the mercury concentration is .32 mgm
» If the mean number of days to birth beyond the due date was 13 days prior to the in-
» What factors may be important in explaining why the doctors’ projected due dates
» Using a graphical display, determine whether the data appear to be a random sample
» Estimate the mean dissolution rate for the batch of tablets, for both a point estimate
» Is there significant evidence that the batch of pills has a mean dissolution rate less
» Calculate the probability of a Type II error if the true dissolution rate is 19.6 mg. Bus.
» Estimate the typical amount of ore produced by the mine using both a point estimate
» Is there significant evidence that on a typical day the mine produces more than
» Inferences about Lyman Ott Michael Longnecker
» A Nonparametric Alternative: Lyman Ott Michael Longnecker
» Inferences about M Lyman Ott Michael Longnecker
» Choosing Sample Sizes for Inferences about
» Research Study: Effects of Oil Spill on Plant Growth
» Describe a method for randomly selecting the tracts where flora density measurements
» State several hypotheses that may be of interest to the researchers. H
» H : m Lyman Ott Michael Longnecker
» Refer to the data of Exercise 6.3. a. Give the level of significance for your test.
» Place a 95 confidence interval on m
» Do the data provide sufficient evidence that rats exposed to a 5°C environment have a
» Do the data provide sufficient evidence that successful companies have a lower percent-
» How large is the difference between the percentage of returns for successful and unsuc-
» Identify the value of the pooled-variance t statistic the usual t test based on the equal
» Is there significant evidence that there is a difference in the distribution of SB2M for
» Discuss the implications of your findings in part c on the evaluation of the influence
» What has a greater effect, if any, on the level of significance of the t test, skewness or
» What has a greater effect, if any, on the level of significance of the Wilcoxon rank sum
» What has a greater effect, if any, on the power of the Wilcoxon rank sum test, skewness
» For what type of population distributions would you recommend using the Wilcoxon H H H
» Consider the data given here. Pair
» Conduct a paired t test of H
» Using a testing procedure related to the binomial distribution, test the
» Give the level of significance for your test. b. Place a 95 confidence interval on m
» Plot the pairs of observations in a scatterplot with the 1982 values on the horizontal
» Compute the correlation coefficient between the pair of observations.
» Answer the questions posed in Exercise 6.11 parts a and b using a paired data
» Consider the data given in Exercise 6.23. a. Conduct a Wilcoxon signed-rank test of H
» Compare your conclusions here to those given in Exercise 6.23. Does it matter which
» Refer to the data of Exercise 6.31. a. Give the level of significance for your test.
» Place a 95 confidence interval on the median difference, M.
» Use the level and power values for the paired t test and Wilcoxon signed-rank test given in
» Which type of deviations from a normal distribution, skewness or heavy-tailedness,
» For small sample sizes, n ⱕ 20, does the actual level of the Wilcoxon signed-rank test
» Suppose a boxplot of the differences in the pairs from a paired data set has many out-
» Suppose the sample sizes are the same for both groups. What sample size is needed
» Suppose the user group will have twice as many patients as the placebo group. What
» How many chemical plant cooling towers need to be measured if we want a probabil-
» What assumptions did you make in part a in order to compute the sample size? Env.
» Is there sufficient evidence to support the conjecture that ozone exposure increases
» Estimate the size of the increase in lung capacity after exposure to ozone using a 95
» After completion of the study, the researcher claimed that ozone causes increased
» Estimate the size of the difference in mean noise level between the two types of jets
» How would you select the jets for inclusion in this study? Ag.
» An entomologist is investigating which of two fumigants, F
» Estimate the size of the difference in the mean number of parasites between the two
» After combining the data from the two depths, does there appear to be a difference in
» Estimate the size of the difference in the mean population abundance at the two
» Refer to Exercise 6.46. Answer the following questions using the combined data for both depths.
» Use the Wilcoxon rank sum test to assess whether there is a difference in population
» Discuss any differences in the conclusions obtained using the t-procedures and the
» Plot the four data sets using side-by-side boxplots to demonstrate the effect of depth
» Separately for each depth, evaluate differences between the sites within and outside
» Discuss the veracity of the following statement: “The oil spill did not adversely affect the
» A possible criticism of the study is that the six sites outside the oil trajectory were not
» What are some possible problems with using the before and after oil spill data in
» Compare the mean drop in blood pressure for the high-dose group and the control
» Estimate the size of the difference in the mean drop for the high-dose and control
» Do the conditions required for the statistical techniques used in a and b appear to
» Estimate the size of the difference in the mean drop for the low-dose and control
» Estimate the size of the difference in the mean drop for the low-dose and high-dose
» If we tested each of the three sets of hypotheses at the .05 level, estimate the experiment-
» Suggest a procedure by which we could be ensured that the experiment-wide Type I
» Can the licensing board conclude that the mean score of nurses who receive a BS in
» The mean test scores are considered to have a meaningful difference only if they differ
» State the null and alternative hypotheses in
» Estimate the size of the difference in campaign expenditures for female and male
» What is the level of significance of the test for a change in mean pH after reclamation
» The land office assessed a fine on the mining company because the t test indicated a
» A Type I error? b. A Type II error?
» Both a Type I and a Type II error? d. Neither a Type I nor a Type II error?
» Refer to Exercise 6.60. Suppose we wish to test the research hypothesis that m
» Do the data support the conjecture that progabide reduces the mean number of
» Determine the sample size so that we are 95 confident that the estimate of the
» Estimation and Tests for a Population Variance
» Estimation and Tests for Comparing
» For the E. coli research study, answer the following. a. What are the populations of interest?
» What are some factors other than the type of detection method HEC versus HGMF
» Describe a method for randomly assigning the E. coli samples to the two devices for
» State several hypotheses that may be of interest to the researchers.
» Find Py ⬎ 52.62. d. Find Py ⬍ 10.52.
» Find Py ⬎ 34.38. e. Find P10.52 ⬍ y ⬍ 34.38.
» For a chi-square distribution with df ⫽ 80, compare the actual values given in Table 7
» Suppose that y has a chi-square distribution with df ⫽ 277. Find approximate values
» If the process yields jars having a normal distribution with a mean of 32.30 ounces and
» Does the plot suggest any violation of the conditions necessary to use the chi-square
» Place bounds on the p-value of the test. Engin.
» Does the boxplot suggest any violation of the conditions necessary to use the chi-square
» Estimate the standard deviation in the speeds of the vehicles on the interstate using a
» Do the data indicate at the 5 level that the standard deviation in vehicle speeds
Show more