CHAPTER 2 DESCRIBING PATTERNS IN DATA

2.1 INTRODUCTION

In Chapter 1, we stated that numerical information (data) is often required for gaining new knowledge, effectively improving business processes, and, in general, making better decisions. We also suggested that some amount of variability in the data is unavoidable even though measurements are made under identical or very nearly identical conditions. This chapter is concerned with methods for describing and summarizing data to highlight any important features or patterns they may contain. The goal is to make the information in data obvious. Because there are several types of data, a particular procedure for displaying and summarizing one kind of data may not be appropriate for another. However, there are general categories of methods that work, to a greater or lesser extent, for everything. These categories include tables, plots, and numerical summaries. The practice of examining data with a collection of relatively simple tables, plots, and numbers is called exploratory data analysis.

2.2 ENUMERATIVE VERSUS ANALYTIC STUDIES

We said in Chapter 1 that the population often represents the target of the numerical inquiry, although the population may be difficult to define or simply unavailable for study. In general, however, we learn about the population by sampling from it. To illustrate situations where defining the appropriate population is difficult and generalizations beyond the sample observations must be carefully interpreted, we must distinguish between enumerative and analytic studies (a distinction emphasized by Dr. Deming and others).
In an enumerative study, interest centers on the identifiable, unchanging, generally finite collection of units from which the sample was selected. For example, we may be interested in the 1996 per capita cost of health care for all U.S. companies with at least 100 employees. To get some idea of what the average 1996 per capita cost might be, a sample of these firms is selected and health care costs determined. The per capita costs for the units (firms) in the sample are the sample observations. The per capita costs for the entire collection of units (firms) are the population observations. The population numbers include the sample numbers and, if time and resources allow, a complete enumeration of the population is possible. A list, or similar mechanism, for identifying the entire set of relevant sampling units is called a frame. In the example on health care cost, a frame is a list of all U.S. firms with at least 100 employees as of December 31, 1996.
Enumerative investigations are typically concerned with making generalizations (inferences) from the sample data to the complete collection of units in the frame. Along these lines, enumerative studies have two distinguishing characteristics. First, the frame (entire collection of units) does not change. This ordinarily means that enumerative studies pertain to an environment existing at a particular point in time. Second, a 100% sample of the frame provides the complete answer to the question posed.
An internal audit conducted to determine the extent to which long-distance telephone calls are business related is an enumerative study. The frame is a list of all long-distance calls made by the several hundred employees of a particular firm for the previous month. A sample of employees is selected, and their long-distance calls are audited. The results will be used to determine the amount the employer paid for nonbusiness-related calls. Perhaps the audit will suggest an investigation of all the items in the frame.
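
The two characteristics of an enumerative study can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the frame contents, the unit names, and the cost range are invented and are not data from the text, and `build_frame` and `sample_mean` are hypothetical helper names. The sketch draws a simple random sample from a fixed frame to estimate the frame average, and then shows that a 100% sample reproduces that average.

```python
import random
from statistics import mean


def build_frame(n_units, rng):
    """Build a hypothetical frame: one entry per unit, mapped to a made-up cost."""
    return {f"firm_{i:03d}": round(rng.uniform(2000.0, 6000.0), 2)
            for i in range(n_units)}


def sample_mean(frame, sample_size, rng):
    """Draw a simple random sample (without replacement) and average it."""
    chosen = rng.sample(sorted(frame), sample_size)
    return mean(frame[unit] for unit in chosen)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
frame = build_frame(500, rng)  # the frame is fixed: it does not change

population_mean = mean(frame.values())             # complete enumeration
estimate = sample_mean(frame, 50, rng)             # 10% sample: an estimate
full_answer = sample_mean(frame, len(frame), rng)  # 100% sample

print(f"population mean : {population_mean:.2f}")
print(f"10% sample mean : {estimate:.2f}")
print(f"100% sample mean: {full_answer:.2f}")
```

The 10% sample mean generally differs somewhat from the population mean, while the 100% sample mean agrees with it, which is the sense in which complete enumeration of the frame answers the question posed.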


Product acceptance sampling is another good example of an enumerative study. A shipment of parts from a supplier is accepted or rejected depending on the number of defective parts in a sample of parts from the shipment. The frame is the aggregate collection of parts in the shipment, say, a truckload, and interest centers on the number of defective parts in the truckload.
An analytic study is a study that is not enumerative. Analytic studies generally take place over time and are concerned with processes or cause-and-effect systems. The most effective analytic studies involve a plan for collecting the data. The objective is to improve future practices or products. Analytic studies often involve comparisons. Will this material or that material lead to more durable products? Will this method of training or that method of training lead to more productive employees? Will this type of service or that type of service lead to a higher retention of customers?
In analytic studies, we are interested in drawing conclusions about a process or product that often does not exist at the beginning of the study. We are no longer dealing with a collection of identifiable units and, consequently, there is no relevant frame (population) from which to sample. Instead, we are typically dealing with observations derived from a current process or product, and we must predict what will happen at some future time if, for example, certain actions are taken.
Consider a public opinion poll of registered voters held before an election. If interest centers on the proportion of people voting for, say, the Republican candidate on election day, the pre-election-day poll is an analytic study. Even a 100% sample of the registered voters will not allow us to predict the outcome of the election with certainty.
Between the time of the poll and election day, some voters will change their minds, additional eligible people may register, some voters will not vote for one reason or another, and these “stay-at-homes” may well differ in their voting preferences from those who do vote. We want to draw conclusions about a future process (election-day voting) from information on a current process (pre-election voting indications) that might be quite different. However, an exit poll to determine the proportion of voters who have voted for a particular candidate is an enumerative study, since a 100% sample provides perfect information (provided all voters tell the truth).
New products are frequently test-marketed before full-scale production occurs. Consumer responses to a prototype product are used to fine-tune the product before full production or, perhaps, to abandon it altogether. Full evaluation of all test-marketed products still may not tell us about the process of interest, namely the process associated with producing the final product. Studies involving prototypes or trial products are analytic.
The vast majority of numerical studies in business are analytic, and we have to be careful about making statements or taking actions based on observations from current processes. If a process is stable and unchanging (in a state of “statistical control”) and remains so, current data may be used to reach conclusions about future performance of the process. However, the validity of extrapolating from current conditions should always be thoroughly examined. Enumerative studies, too, must be conducted with care. The validity of inferences from enumerative studies depends, in part, on how well the frame represents the target population.
The issues raised either explicitly or implicitly in this section (collecting appropriate data, summarizing numerical information, monitoring processes, generalizing beyond the data or time period, reaching valid conclusions, and so forth) will be considered as we progress through this book.
After-the-fact analysis cannot compensate for a poorly planned investigation and, as we shall see, the planning process is different for analytic than for enumerative studies.
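
The contrast between the pre-election poll and the exit poll in this section can be made concrete with a small simulation. This is only a sketch under loudly stated assumptions: the switching probability, the turnout rates, and the function name `simulate_election` are all hypothetical and chosen to make the effect visible, not estimates of any real electorate. It canvasses every registered voter before the election (a 100% sample), lets some voters change their minds and some stay home, and then compares the result with a 100% exit poll.

```python
import random


def simulate_election(n_voters, rng, p_favor=0.5, p_switch=0.05,
                      turnout_if_favor=0.5, turnout_otherwise=0.8):
    """Simulate pre-election preferences and election-day ballots.

    All probabilities are hypothetical values chosen for illustration.
    """
    # Pre-election canvass of every registered voter (a 100% sample).
    pre_election = [rng.random() < p_favor for _ in range(n_voters)]

    ballots = []
    for favors in pre_election:
        if rng.random() < p_switch:      # some voters change their minds
            favors = not favors
        turnout = turnout_if_favor if favors else turnout_otherwise
        if rng.random() < turnout:       # "stay-at-homes" cast no ballot
            ballots.append(favors)

    pre_share = sum(pre_election) / n_voters
    election_share = sum(ballots) / len(ballots)
    return pre_share, election_share, ballots


rng = random.Random(1)  # fixed seed so the sketch is repeatable
pre_share, election_share, ballots = simulate_election(10_000, rng)

# A 100% exit poll enumerates the ballots actually cast (truthful answers assumed).
exit_poll_share = sum(ballots) / len(ballots)

print(f"pre-election share (100% canvass): {pre_share:.3f}")
print(f"election-day share               : {election_share:.3f}")
print(f"exit-poll share (100% sample)    : {exit_poll_share:.3f}")
```

Under these assumed rates, the complete pre-election canvass misses the election-day share because the process changes between the two dates, while the complete exit poll matches it exactly. The first is an analytic problem; the second is an enumerative one.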

2.3 VARIABLES AND DATA