[Map labels: Kaunas, 422,000; Niemen R., 10,000; Berezina R.; Smolensk; Moscow, 100,000. Temperature scale (°C) with readings from –9° to –30° for dates from Oct. 9 to Dec. 6.]
2.8 STATISTICS IN CONTEXT
Figure 2.18 The Demise of Napoleon’s Army in Russia, 1812–1813 (based on a display by Charles Minard)
The quality control department of a Midwest manufacturer of microwave ovens is required by the U.S. government to monitor the amount of radiation emitted when the doors of the ovens are closed. Observations of the radiation passing through the closed doors of 42 ovens were obtained and are given in Table 2.8.
Data courtesy of J. D. Cryer.
CHAPTER 2 DESCRIBING PATTERNS IN DATA
A copy of Minard’s original map and additional discussion are contained in Tufte, E. R., The Visual Display of Quantitative Information. Cheshire, Conn.: Graphics Press, 1983.
Napoleon’s Grand Army to capture Russia. A simplified version of the original graphic appears in Figure 2.18.
The 422,000 troops that entered Russia near Kaunas are shown as a wide shaded river flowing toward Moscow, and the retreating army as a small black stream.
The width of the band indicates the size of the army at each location on the map. Napoleon had to provide 422,000 soldiers in Poland to field 100,000 troops in
Moscow.
Even the simplified version (Figure 2.18) of the original graphic dramatically conveys the losses that left the army with 10,000 returning members. The temperature scale at the bottom of the graph, pertaining to the retreat from Moscow, helps to explain the loss of life, including the incident where thousands died trying to cross the Berezina River in subzero temperatures.
Another informative graph is presented in Figure 2.19. In this graph, countries of the world are scaled to approximate size according to 1992 stock market capitalization.
We see that most of the world’s money resides in the United States, Japan, and, to a lesser extent, the United Kingdom.
Although we do not intend to pursue graphics, the impact of these examples should motivate you to think creatively when displaying data.
Figure 2.19 The World According to Stock Market Capitalization
[Map in which countries are scaled to approximate size according to 1992 stock market capitalization, showing where the world’s money is. Sources: IFC and FT-Actuaries World Indices.]
Panel 2.1
[Boxplot of RADTNDC, axis ticks at 0.00, 0.10, 0.20, 0.30, 0.40]

Stem-and-leaf of RADTNDC   N = 42
Leaf Unit = 0.010

   6   0  112223
  17   0  55555788899
 (11)  1  00000000012
  14   1  55588
   9   2  000
   6   2
   6   3  0000
   2   3
   2   4  00

              N     MEAN   MEDIAN    STDEV
RADTNDC      42   0.1283   0.1000   0.1003

            MIN      MAX       Q1       Q3
RADTNDC  0.0100   0.4000   0.0500   0.1800

TABLE 2.8 Radiation Through Closed Doors of Microwave Ovens (mw/cm²)
To determine the chance of exceeding a prespecified tolerance level, a pattern of variation for the amounts of radiation was required. Can we regard the observations here as being normally distributed?
Panel 2.1 shows the stem-and-leaf diagram and the boxplot of the radiation data in Table 2.8. We call the radiation variable RADTNDC (RADiaTioN Door Closed) in the Minitab plots. It is clear from these plots that the radiation data are skewed to the right, with several ovens having relatively large values of .30 and .40.
To bring the large radiation measurements more in line with the remaining observations, we consider a reexpression or transformation of the data. The objective here is to create a set of data that can reasonably be described by a normal distribution.
One transformation that brings large positive values relatively closer to the remaining values is the square root transformation. For example, the numbers $\sqrt{9} = 3$ and $\sqrt{100} = 10$ are closer together than the original numbers 9 and 100. If we apply the square root transformation twice, that is, take the fourth root, $\sqrt{\sqrt{x}} = x^{1/4}$, of each observation, we get the results shown in Panel 2.2. We label the reexpressed data FROOTDC (Fourth ROOT Door Closed) in the plots.
It is evident from the stem-and-leaf diagram and boxplot in Panel 2.2 that the transformed observations are reasonably symmetric and, we would argue, nearly normal.
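As a quick numerical check of the reexpression idea, the short Python sketch below applies the square root once to 9 and 100, and twice to the smallest and largest readings in Table 2.8 (.01 and .40). The function name `fourth_root` is ours and is used only for illustration.

```python
import math


def fourth_root(x):
    """Apply the square root transformation twice: x -> x ** (1/4)."""
    return math.sqrt(math.sqrt(x))


# The two numbers used in the text.
original = [9, 100]
square_roots = [math.sqrt(v) for v in original]

# The ratio of the larger to the smaller value shrinks after the
# transformation, which is what pulls large values toward the rest.
ratio_before = original[1] / original[0]
ratio_after = square_roots[1] / square_roots[0]
print(f"ratio before: {ratio_before:.2f}, ratio after: {ratio_after:.2f}")

# Smallest and largest radiation readings in Table 2.8.
froot_low = fourth_root(0.01)
froot_high = fourth_root(0.40)
print(f"fourth roots: {froot_low:.4f} and {froot_high:.4f}")
```

The two fourth roots agree with the MIN and MAX values reported for FROOTDC in Panel 2.2.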
Table 2.8 data, reading across the rows:
.15  .09  .18  .10  .05  .12  .18
.05  .08  .10  .07  .02  .01  .10
.10  .10  .02  .10  .01  .40  .10
.05  .03  .05  .15  .10  .15  .09
.08  .18  .10  .20  .11  .30  .02
.20  .20  .30  .30  .40  .30  .05
Panel 2.1 Plots of Microwave Radiation Data (Door Closed)
Panel 2.2
[Boxplot of FROOTDC, axis ticks at 0.36, 0.48, 0.60, 0.72]

Stem-and-leaf of FROOTDC   N = 42
Leaf Unit = 0.010

   2   3  11
   5   3  777
   6   4  1
  11   4  77777
  17   5  133344
 (11)  5  66666666678
  14   6  222
  11   6  55666
   6   7  4444
   2   7  99

              N     MEAN   MEDIAN    STDEV
FROOTDC      42   0.5643   0.5623   0.1198

            MIN      MAX       Q1       Q3
FROOTDC  0.3162   0.7953   0.4729   0.6514
The radiation data are measurements of a particular characteristic associated with the manufacture of microwave ovens. The extent to which these measurements depict the behavior of radiation readings for yet-to-be-manufactured ovens depends on the stability of the manufacturing process. As we explain in Chapter 13, if the process is “in control,” that is, if the causes of variation in the radiation measurements remain the same or constant (common causes), then we would expect the current observations to tell us something about future values. If something unusual occurs, or the manufacturing process changes in some fundamental way (for example, there is a new supplier of raw materials or parts, or a new method of assembly), then the current radiation measurements may have little to say about future emissions through closed doors. The validity of any generalizations beyond the data in hand depends very much on the crucial assumption that the future is much like the past.
We are generally concerned with the radiation emitted through the closed doors of all ovens that have been or will be produced. Consequently, a study of the radiation measurements to see whether the microwave ovens meet government standards is an analytic study. A 100% sample of all the ovens currently available will not provide perfect information about the performance of future ovens.
Suppose the data in Table 2.8 are recorded in the order in which the ovens were manufactured. That is, the first observation in the first row is the radiation measurement for the oldest oven, the first oven of the group produced. The second observation in the first row is the radiation measurement for the second oldest oven, and so forth. Thus, as we read across the rows in the table, we encounter more recent observations. A time-ordered plot of the transformed radiation measurements is given in Figure 2.20.
Panel 2.2 Plots of the Transformed Microwave Radiation Data (Door Closed)
[Figure 2.20: individual values plotted against observation number, with horizontal lines at LCL = .2567, $\bar{x}$ = .5643, and UCL = .8718]
There are three horizontal lines in the figure. The middle line is at the value of the mean of the observations. The upper line is at 3 standard deviations above the mean. This line is called the upper control limit (UCL). The lower line is located at 3 standard deviations below the mean. This line is called the lower control limit (LCL). Because of their location with respect to the mean, the UCL and LCL are sometimes called the three-sigma ($3\sigma$) limits.
A chart like the one in Figure 2.20 is called a control chart. A control chart is simply a time-ordered or time series plot with upper and lower control limits drawn on each side of the mean of the observations. Control charts are used to display variability and to discover how much variability in the observations is due to random or common cause variation, and how much is due to unique events or special causes. We discuss control charts in Chapter 13. At this point, we use the control chart simply to display the time-ordered radiation measurements relative to their mean and 3 standard deviation limits. Let’s interpret what we see.
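The construction of such a chart can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the text's own procedure: the readings are typed in as printed in Table 2.8, the function name `individuals_limits` is ours, and the standard deviation is estimated from the average moving range of consecutive observations divided by the constant 1.128, one common range-based convention. The text says only that a range-based estimate is used, so the numbers printed here need not match those in Figure 2.20 exactly.

```python
import statistics

# Readings as printed in Table 2.8, read across the rows
# (assumed to be the order of manufacture).
radiation = [
    .15, .09, .18, .10, .05, .12, .18,
    .05, .08, .10, .07, .02, .01, .10,
    .10, .10, .02, .10, .01, .40, .10,
    .05, .03, .05, .15, .10, .15, .09,
    .08, .18, .10, .20, .11, .30, .02,
    .20, .20, .30, .30, .40, .30, .05,
]


def individuals_limits(values, d2=1.128):
    """Return (LCL, center, UCL) for a chart of individual values.

    Assumption: sigma is estimated as the average moving range of
    consecutive observations divided by d2 (1.128 for ranges of two).
    """
    center = statistics.fmean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_hat = statistics.fmean(moving_ranges) / d2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


# Fourth-root reexpression, then the three horizontal lines of the chart.
transformed = [x ** 0.25 for x in radiation]
lcl, center, ucl = individuals_limits(transformed)

# Observation numbers (1-based) that fall outside the limits, if any.
outside = [i + 1 for i, v in enumerate(transformed) if not lcl <= v <= ucl]
print(f"LCL = {lcl:.4f}, mean = {center:.4f}, UCL = {ucl:.4f}")
print("observations outside the limits:", outside)
```

Plotting `transformed` against observation number with horizontal lines at the three returned values gives a chart of the kind described above.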
We see that all the transformed radiation measurements are within 3 standard deviations of the mean. However, there is some tendency for the radiation values for the older ovens to be below the mean and the values for the more recently manufactured ovens to be above the mean. This could indicate some change in the manufacturing process leading to higher levels of emitted radiation. The evidence is inconclusive and additional monitoring may be required, but the slight upward drift in the data illustrates the importance of looking at observations in the time order in which they were produced.
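One simple way to look for such a drift numerically is to count how many transformed values lie above the overall mean in the older half and in the newer half of the series. The sketch below does this for the readings as printed in Table 2.8, assuming they are in order of manufacture. The split into two halves is our choice for illustration and is not a formal test.

```python
import statistics

# Readings as printed in Table 2.8, read across the rows
# (assumed to be the order of manufacture).
radiation = [
    .15, .09, .18, .10, .05, .12, .18,
    .05, .08, .10, .07, .02, .01, .10,
    .10, .10, .02, .10, .01, .40, .10,
    .05, .03, .05, .15, .10, .15, .09,
    .08, .18, .10, .20, .11, .30, .02,
    .20, .20, .30, .30, .40, .30, .05,
]

transformed = [x ** 0.25 for x in radiation]
center = statistics.fmean(transformed)

# Split the time-ordered series into an older and a newer half.
half = len(transformed) // 2
older, newer = transformed[:half], transformed[half:]

# Count the values above the overall mean in each half.
above_first = sum(v > center for v in older)
above_second = sum(v > center for v in newer)
print(f"above the mean: {above_first} of {len(older)} older ovens, "
      f"{above_second} of {len(newer)} newer ovens")
```

A noticeably larger count in the newer half is consistent with the upward drift described above, although, as the text notes, the evidence is not conclusive.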
If the process is stable (in control), we would expect the observations in the control chart to vary about the centerline (mean), within the $3\sigma$ limits, with no specific pattern of variation. A few observations outside the $3\sigma$ limits or a long sequence of observations above or below the mean suggest that the process is not stable. That is, the causes of variability in the numbers are not constant or common over time. A change has occurred. As we discuss in Chapter 13, once a change is detected, we search for the reason for the change.
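The second signal, a long sequence of observations on one side of the mean, is easy to check by machine. The following Python sketch finds the length of the longest such sequence. The function name `longest_run_one_side` and the short series are hypothetical, made up only to show the mechanics, and the text does not say how long a run must be before it counts as "long."

```python
def longest_run_one_side(values, center):
    """Length of the longest run of consecutive values on one side of center.

    A value exactly equal to center belongs to neither side and ends the run.
    """
    longest = 0
    current = 0
    previous_side = 0
    for v in values:
        # +1 if above the center line, -1 if below, 0 if exactly on it.
        side = (v > center) - (v < center)
        if side != 0 and side == previous_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0
        previous_side = side
        longest = max(longest, current)
    return longest


# Hypothetical series: three values below 3, four above, then one below.
example = [1, 2, 1, 5, 6, 7, 8, 2]
run_length = longest_run_one_side(example, center=3)
print("longest run on one side of the center line:", run_length)
```

Applied to a time-ordered series and its mean, a large value of `run_length` would point to the kind of instability described above.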
Sometimes subgroup means rather than the individual observations are plotted in control charts. This often produces a clearer picture of the measured characteristic.
Let’s look at the measurements in Figure 2.20. Suppose we collect the observations into subgroups of size 3. The rationale might be that the first three observations are a
The standard deviation used in the construction of the control limits is an estimate of $\sigma$ based on the range and produces a slightly different number than the sample standard deviation $s$.
Figure 2.20 A Time-Ordered Plot of the Transformed Radiation Measurements
[Control chart of sample means plotted against sample number, with horizontal lines at LCL = .3635, center line = .5643, and UCL = .7650]
2.9 CHAPTER SUMMARY