Experimental Studies Lyman Ott Michael Longnecker

experiments. In experimental studies, the researcher controls the crucial factors by one of two methods.

Method 1: The subjects in the experiment are randomly assigned to the treatments. For example, ten rats are randomly assigned to each of the four dose levels of an experimental drug under investigation.

Method 2: Subjects are randomly selected from different populations of interest. For example, 50 male and 50 female dogs are randomly selected from animal shelters in large and small cities and tested for the presence of heartworms.

In Method 1, the researcher randomly selects experimental units from a homogeneous population of experimental units and then has complete control over the assignment of the units to the various treatments. In Method 2, the researcher has control over the random sampling from the treatment populations but not over the assignment of the experimental units to the treatments.

In experimental studies, it is crucial that the scientist follows a systematic plan established prior to running the experiment. The plan includes how all randomization is conducted, either the assignment of experimental units to treatments or the selection of units from the treatment populations. There may be extraneous factors present that may affect the experimental units. These factors may be present as subtle differences in the experimental units or slight differences in the surrounding environment while the experiment is conducted. The randomization process ensures that, on the average, any large differences observed in the responses of the experimental units in different treatment groups can be attributed to the differences in the groups and not to factors that were not controlled during the experiment. The plan should also cover many other aspects of how to conduct the experiment. Some of the items that should be included in such a plan are listed here:

1. The research objectives of the experiment
2. The selection of the factors that will be varied (the treatments)
3. The identification of extraneous factors that may be present in the experimental units or in the environment of the experimental setting (the blocking factors)
4. The characteristics to be measured on the experimental units (the response variable)
5. The method of randomization, either randomly selecting from treatment populations or the random assignment of experimental units to treatments
6. The procedures to be used in recording the responses from the experimental units
7. The selection of the number of experimental units assigned to each treatment (may require designating the level of significance and power of tests or the precision and reliability of confidence intervals)
8. A complete listing of available resources and materials

Terminology

A designed experiment is an investigation in which a specified framework is provided in order to observe, measure, and evaluate groups with respect to a designated response. The researcher controls the elements of the framework during the experiment in order to obtain data from which statistical inferences can provide valid comparisons of the groups of interest.

There are two types of variables in an experimental study. Controlled variables, called factors, are selected by the researchers for comparison. Response variables are measurements or observations that are recorded but not controlled by the researcher. The controlled variables form the comparison groups defined by the research hypothesis.

The treatments in an experimental study are the conditions constructed from the factors. The factors are selected by examining the questions raised by the research hypothesis. In some experiments, there may be only a single factor, and hence the treatments and the levels of the factor would be the same. In most cases, we will have several factors, and the treatments are formed by combining levels of the factors. This type of treatment design is called a factorial treatment design.
We will illustrate these ideas in the following example.

EXAMPLE 2.4

A researcher is studying the conditions under which commercially raised shrimp reach maximum weight gain. Three water temperatures (25°, 30°, 35°) and four water salinity levels (10, 20, 30, 40) were selected for study. Shrimp were raised in containers with specified water temperatures and salinity levels. The weight gain of the shrimp in each container was recorded after a 6-week study period. There are many other factors that may affect weight gain, such as density of shrimp in the containers, variety of shrimp, size of shrimp, type of feeding, and so on. The experiment was conducted as follows: 24 containers were available for the study. A specific variety and size of shrimp was selected for study. The density of shrimp in the container was fixed at a given amount. One of the three water temperatures and one of the four salinity levels were randomly assigned to each of the 24 containers. All other identifiable conditions were specified to be maintained at the same level for all 24 containers for the duration of the study. In reality, there will be some variation in the levels of these variables. After 6 weeks in the tanks, the shrimp were harvested and weighed. Identify the response variable, factors, and treatments in this example.

Solution

The response variable is weight of the shrimp at the end of the 6-week study. There are two factors: water temperature at three levels (25°, 30°, and 35°) and water salinity at four levels (10, 20, 30, and 40). We can thus create 3 × 4 = 12 treatments from the combination of levels of the two factors. These factor-level combinations representing the 12 treatments are shown here:

(25°, 10)  (25°, 20)  (25°, 30)  (25°, 40)
(30°, 10)  (30°, 20)  (30°, 30)  (30°, 40)
(35°, 10)  (35°, 20)  (35°, 30)  (35°, 40)

Following proper experimental procedures, 2 of the 24 containers would be randomly assigned to each of the 12 treatments.
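The construction of the 12 treatments and the random assignment of containers in Example 2.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name assign_containers, the container numbering 1 to 24, and the shuffle-based randomization are our own choices and are not a procedure given in the example.

```python
import itertools
import random

temperatures = [25, 30, 35]    # water temperature levels from Example 2.4
salinities = [10, 20, 30, 40]  # water salinity levels from Example 2.4

# Factorial treatment design: every combination of the factor levels.
treatments = list(itertools.product(temperatures, salinities))


def assign_containers(treatments, n_containers=24, seed=None):
    """Randomly assign containers to treatments with equal replication.

    Hypothetical helper: containers are simply numbered 1..n_containers.
    """
    reps, remainder = divmod(n_containers, len(treatments))
    if remainder:
        raise ValueError("containers must divide evenly among treatments")
    # One slot per (treatment, replicate); shuffling the slots randomizes
    # which container receives which treatment.
    slots = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)
    rng.shuffle(slots)
    return {container: slot for container, slot in enumerate(slots, start=1)}


assignment = assign_containers(treatments, seed=1)
for container, (temp, sal) in sorted(assignment.items()):
    print(f"container {container:2d}: temperature {temp}, salinity {sal}")
```

Fixing the seed makes a particular assignment reproducible for record keeping. In an actual study the randomization would be carried out once, before the experiment begins, and written into the plan.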
In other circumstances, there may be a large number of factors, and hence the number of treatments may be so large that only a subset of all possible treatments would be examined in the experiment. For example, suppose we were investigating the effect of the following factors on the yield per acre of soybeans:

Factor 1: Five Varieties of Soybeans
Factor 2: Three Planting Densities
Factor 3: Four Levels of Fertilization
Factor 4: Six Locations within Texas
Factor 5: Three Irrigation Rates

From the five factors, we can form 5 × 3 × 4 × 6 × 3 = 1,080 distinct treatments. This would make for a very large and expensive experiment. In this type of situation, a subset of the 1,080 possible treatments would be selected for studying the relationship between the five factors and the yield of soybeans. This type of experiment has a fractional factorial treatment structure since only a fraction of the possible treatments are actually used in the experiment. A great deal of care must be taken in selecting which treatments should be used in the experiment so as to be able to answer as many of the researcher’s questions as possible.

A special treatment is called the control treatment. This treatment is the benchmark to which the effectiveness of the remaining treatments is compared. There are three situations in which a control treatment is particularly necessary. First, the conditions under which the experiments are conducted may prevent generally effective treatments from demonstrating their effectiveness. In this case, the control treatment consisting of no treatment may help to demonstrate that the experimental conditions are keeping the treatments from demonstrating the differences in their effectiveness. For example, an experiment is conducted to determine the most effective level of nitrogen in a garden growing tomatoes.
If the soil used in the study has a high level of fertility prior to adding nitrogen to the soil, all levels of nitrogen will appear to be equally effective. However, if a treatment consisting of adding no nitrogen (the control) is used in the study, the high fertility of the soil will be revealed since the control treatment will be just as effective as the nitrogen-added treatments.

A second type of control is the standard method treatment to which all other treatments are compared. In this situation, several new procedures are proposed to replace an already existing well-established procedure. A third type of control is the placebo control. In this situation, a response may be obtained from the subject just by the manipulation of the subject during the experiment. A person may demonstrate a temporary reduction in pain level just by visiting with the physician and having a treatment prescribed. Thus, in evaluating several different methods of reducing pain level in patients, a treatment with no active ingredients, the placebo, is given to a set of patients without the patients’ knowledge. The treatments with active ingredients are then compared to the placebo to determine their true effectiveness.

The experimental unit is the physical entity to which the treatment is randomly assigned or the subject that is randomly selected from one of the treatment populations. For the shrimp study of Example 2.4, the experimental unit is the container. Consider another experiment in which a researcher is testing various dose levels (treatments) of a new drug on laboratory rats. If the researcher randomly assigned a single dose of the drug to each rat, then the experimental unit would be the individual rat. Once the treatment is assigned to an experimental unit, a single replication of the treatment has occurred. In general, we will randomly assign several experimental units to each treatment.
We will thus obtain several independent observations on any particular treatment and hence will have several replications of the treatments. In Example 2.4, we had two replications of each treatment.

Distinct from the experimental unit is the measurement unit. This is the physical entity upon which a measurement is taken. In many experiments, the experimental and measurement units are identical. In Example 2.4, the measurement unit is the container, the same as the experimental unit. However, if the individual shrimp were weighed as opposed to obtaining the total weight of all the shrimp in each container, the experimental unit would be the container, but the measurement unit would be the individual shrimp.

EXAMPLE 2.5

Consider the following experiment. Four types of protective coatings for frying pans are to be evaluated. Five frying pans are randomly assigned to each of the four coatings. The abrasion resistance of the coating is measured at three locations on each of the 20 pans. Identify the following items for this study: experimental design, treatments, replications, experimental unit, measurement unit, and total number of measurements.

Solution

Experimental design: Completely randomized design.
Treatments: Four types of protective coatings.
Replication: There are five frying pans (replications) for each treatment.
Experimental unit: Frying pan, because the coatings (treatments) are randomly assigned to the frying pans.
Measurement unit: Particular locations on the frying pan.
Total number of measurements: 4 × 5 × 3 = 60 measurements in this experiment.

The experimental unit is the frying pan since each coating (the treatment) was randomly assigned to frying pans. The measurement unit is a location on the frying pan.
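The bookkeeping in Example 2.5 can be checked by enumerating every measurement explicitly. The sketch below is only an illustration: the coating labels and the (coating, pan, location) record layout are hypothetical names introduced here, not part of the example.

```python
import itertools

coatings = ["coating 1", "coating 2", "coating 3", "coating 4"]  # hypothetical labels
pans_per_coating = 5
locations_per_pan = 3

# One record per measurement: (coating, pan within coating, location on pan).
measurements = list(
    itertools.product(
        coatings,
        range(1, pans_per_coating + 1),
        range(1, locations_per_pan + 1),
    )
)

# An experimental unit is a pan, identified by its coating and pan number.
experimental_units = {(coating, pan) for coating, pan, _ in measurements}

# Replications = number of experimental units per treatment,
# not the number of measurements per treatment.
replications = {
    coating: sum(1 for unit in experimental_units if unit[0] == coating)
    for coating in coatings
}

print(len(experimental_units))  # 20 experimental units (pans)
print(replications)             # 5 replications for each coating
print(len(measurements))        # 60 measurements (locations)
```

Counting replications from the set of pans rather than from the list of measurements keeps the two kinds of unit separate: each coating has 15 measurements but only 5 replications.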
The term experimental error is used to describe the variation in the responses among experimental units that are assigned the same treatment and are observed under the same experimental conditions. The reasons that the experimental error is not zero include (a) the natural differences in the experimental units prior to their receiving the treatment, (b) the variation in the devices that record the measurements, (c) the variation in setting the treatment conditions, and (d) the effect on the response variable of all extraneous factors other than the treatment factors.

EXAMPLE 2.6

Refer to the previously discussed laboratory experiment in which the researcher randomly assigns a single dose of the drug to each of 10 rats and then measures the level of drug in the rat’s bloodstream after 2 hours. For this experiment the experimental unit and measurement unit are the same: the rat. Identify the four possible sources of experimental error for this study. (See (a) to (d) in the last paragraph before this example.)

Solution

We can address these sources as follows:

(a) Natural differences in experimental units prior to receiving the treatment. There will be slight physiological differences among rats, so two rats receiving the exact same dose level (treatment) will have slightly different blood levels 2 hours after receiving the treatment.

(b) Variation in the devices used to record the measurements. There will be differences in the responses due to the method by which the quantity of the drug in the rat is determined by the laboratory technician. If several determinations of drug level were made in the blood of the same rat, there may be differences in the amount of drug found due to equipment variation, technician variation, or conditions in the laboratory.

(c) Variation in setting the treatment conditions. If there is more than one replication per treatment, the treatment may not be exactly the same from one rat to another.
Suppose, for example, that we had ten replications of each dose (treatment). It is highly unlikely that each of the ten rats receives exactly the same dose of drug specified by the treatment. There could be slightly different amounts of the drug in the syringes, and slightly different amounts could be injected and enter the bloodstreams.

(d) The effect on the response (blood level) of all extraneous factors other than the treatment factors. Presumably, the rats are all placed in cages and given the same amount of food and water prior to determining the amount of drug in their blood. However, the temperature, humidity, external stimulation, and other conditions may be somewhat different in the ten cages. This may have an effect on the responses of the ten rats.

Thus, these differences and variation in the external conditions within the laboratory during the experiment all contribute to the size of the experimental error in the experiment.

EXAMPLE 2.7

Refer to Example 2.4. Suppose that each treatment is assigned to two containers and that 40 shrimp are placed in each container. After 6 weeks, the individual shrimp are weighed. Identify the experimental units, measurement units, factors, treatments, number of replications, and possible sources of experimental error.

Solution

This is a factorial treatment design with two factors: temperature and salinity level. The treatments are constructed by selecting a temperature and salinity level to be assigned to a particular container. We would have a total of 3 × 4 = 12 possible treatments for this experiment. The 12 treatments are

(25°, 10)  (25°, 20)  (25°, 30)  (25°, 40)
(30°, 10)  (30°, 20)  (30°, 30)  (30°, 40)
(35°, 10)  (35°, 20)  (35°, 30)  (35°, 40)

We next randomly assign two containers to each of the 12 treatments. This results in two replications of each treatment. The experimental unit is the container since the individual containers are randomly assigned to a treatment.
Forty shrimp are placed in each container, and after 6 weeks the weights of the individual shrimp are recorded. The measurement unit is the individual shrimp since this is the physical entity upon which an observation is made. Thus, in this experiment the experimental and measurement units are different. Several possible sources of experimental error include the difference in the weights of the shrimp prior to being placed in the container, how accurately the temperature and salinity levels are maintained over the 6-week study period, how accurately the shrimp are weighed at the conclusion of the study, the consistency of the amount of food fed to the shrimp (was each shrimp given exactly the same quantity of food over the 6 weeks?), and the variation in any other conditions that may affect shrimp growth.
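The distinction in Example 2.7 between 960 measurement units and 24 experimental units can be made concrete with a small sketch. Everything numerical here is an assumption for illustration: the shrimp weights are synthetic values drawn from an arbitrary normal distribution, and summarizing each container by its mean weight is one common way to work at the level of the experimental unit, not a step prescribed by the example.

```python
import random
import statistics

rng = random.Random(0)  # synthetic data only; not results from any study

treatments = [(temp, sal) for temp in (25, 30, 35) for sal in (10, 20, 30, 40)]

# weights[(treatment, container)] -> 40 individual shrimp weights.
# The mean 20.0 and standard deviation 2.0 are arbitrary placeholder values.
weights = {}
for treatment in treatments:
    for container in (1, 2):
        weights[(treatment, container)] = [rng.gauss(20.0, 2.0) for _ in range(40)]

# Measurement units: individual shrimp.
n_measurements = sum(len(w) for w in weights.values())

# Experimental units: containers, each summarized here by its mean weight.
container_means = {key: statistics.mean(w) for key, w in weights.items()}

# Replications per treatment are counted in containers, not in shrimp.
reps = {}
for treatment, _container in container_means:
    reps[treatment] = reps.get(treatment, 0) + 1

print(n_measurements)       # 12 treatments x 2 containers x 40 shrimp = 960
print(len(container_means)) # 24 experimental units
print(set(reps.values()))   # {2}: two replications of every treatment
```

Although 80 shrimp are weighed for each treatment, only two containers received that treatment independently, so the sketch reports two replications rather than 80.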

2.5 Designs for Experimental Studies

The subject of designs for experimental studies cannot be done justice at the beginning of a statistical methods course; entire courses at the undergraduate and graduate levels are needed to get a comprehensive understanding of the methods and concepts of experimental design. Even so, we will attempt to give you a brief overview of the subject because much data requiring summarization and analysis arise from experimental studies involving one of a number of designs. We will work by way of examples.

A consumer testing agency decides to evaluate the wear characteristics of four major brands of tires. For this study, the agency selects four cars of a standard car model and four tires of each brand. The tires will be placed on the cars and then driven 30,000 miles on a 2-mile racetrack. The decrease in tread thickness over the 30,000 miles is the variable of interest in this study. Four different drivers will drive the cars, but the drivers are professional drivers with comparable training and experience. The weather conditions, smoothness of track, and the maintenance of the four cars will be essentially the same for all four brands over the study period. All extraneous factors that may affect the tires are nearly the same for all four brands. Thus, the testing agency feels confident that if there is a difference in wear characteristics between the brands at the end of the study, then this is truly a difference in the four brands and not a difference due to the manner in which the study was conducted.

The testing agency is interested in recording other factors, such as the cost of the tires, the length of warranty offered by the manufacturer, whether the tires go out of balance during the study, and the evenness of wear across the width of the tires. In this example, we will only consider tread wear. There should be a recorded tread wear for each of the sixteen tires, four tires for each brand.
The methods presented in Chapters 8 and 15 could be used to summarize and analyze the sample tread wear data in order to make comparisons (inferences) among the four tire brands. One possible inference of interest could be the selection of the brand having minimum tread wear. Can the best-performing tire brand in the sample data be expected to provide the best tread wear if the same study is repeated? Are the results of the study applicable to the driving habits of the typical motorist?

Experimental Designs

There are many ways in which the tires can be assigned to the four cars. We will consider one running of the experiment in which we have four tires of each of the four brands. First, we need to decide how to assign the tires to the cars. We could randomly assign a single brand to each car, but this would result in a design in which the unit of measurement is the total loss of tread for all four tires on the car and not the individual tire loss. Thus, we must randomly assign the sixteen tires to the four cars. In Chapter 15, we will demonstrate how this randomization is conducted. One possible arrangement of the tires on the cars is shown in Table 2.2.

In general, a completely randomized design is used when we are interested in comparing t “treatments” (in our case, t = 4; the treatments are the brands of tire). For each of the treatments, we obtain a sample of observations. The sample sizes could be different for the individual treatments. For example, we could test 20 tires from Brands A, B, and C but only 12 tires from Brand D. The sample of observations from a treatment is assumed to be the result of a simple random sample of observations

TABLE 2.2 Completely randomized design of tire wear

Car 1     Car 2     Car 3     Car 4
Brand B   Brand A   Brand A   Brand D
Brand B   Brand A   Brand B   Brand D
Brand B   Brand C   Brand C   Brand D
Brand C   Brand C   Brand A   Brand D
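An arrangement like the one in Table 2.2 can be generated by shuffling the sixteen tires and dealing them out four to a car. The sketch below is one simple way to do this and is not necessarily the procedure demonstrated in Chapter 15. The function name completely_randomized_design and its parameters are hypothetical.

```python
import random


def completely_randomized_design(brands=("A", "B", "C", "D"),
                                 tires_per_brand=4, n_cars=4, seed=None):
    """Randomly assign all tires to cars with no restriction on brand.

    Hypothetical helper: returns a dict mapping car label -> list of brands.
    """
    tires = [brand for brand in brands for _ in range(tires_per_brand)]
    per_car, remainder = divmod(len(tires), n_cars)
    if remainder:
        raise ValueError("tires must divide evenly among cars")
    rng = random.Random(seed)
    rng.shuffle(tires)  # every ordering of the tires is equally likely
    return {
        f"Car {i + 1}": tires[i * per_car:(i + 1) * per_car]
        for i in range(n_cars)
    }


layout = completely_randomized_design(seed=2)
for car, tires in layout.items():
    print(car, tires)
```

Because the shuffle places no restriction on which brands share a car, a single car can end up carrying all four tires of one brand, as Car 4 does in Table 2.2.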