FIGURE 5.14 PDFs of two t distributions (df = 2 and df = 5) and a standard normal distribution
Properties of Student's t Distribution

1. There are many different t distributions. We specify a particular one by a parameter called the degrees of freedom (df). See Figure 5.14.
2. The t distribution is symmetrical about 0 and hence has mean equal to 0, the same as the z distribution.
3. The t distribution has variance df/(df − 2), and hence is more variable than the z distribution, which has variance equal to 1. See Figure 5.14.
4. As the df increases, the t distribution approaches the z distribution. (Note that as df increases, the variance df/(df − 2) approaches 1.)
5. Thus, with

   t = (ȳ − μ) / (s/√n)

   we conclude that t has a t distribution with df = n − 1, and, as n increases, the distribution of t approaches the distribution of z.
The phrase "degrees of freedom" sounds mysterious now, but the idea will eventually become second nature to you. The technical definition requires advanced mathematics, which we will avoid; on a less technical level, the basic idea is that degrees of freedom are pieces of information for estimating σ using s. The standard deviation s for a sample of n measurements is based on the deviations yᵢ − ȳ. Because Σ(yᵢ − ȳ) = 0 always, if n − 1 of the deviations are known, the last (nth) is fixed mathematically to make the sum equal 0. It is therefore noninformative. Thus, in a sample of n measurements there are n − 1 pieces of information (degrees of freedom) about σ. A second method of explaining degrees of freedom is to recall that σ measures the dispersion of the population values about μ, so prior to estimating σ we must first estimate μ. Hence, the number of pieces of information (degrees of freedom) in the data that can be used to estimate σ is n − 1, the number of original data values minus the number of parameters estimated prior to estimating σ.
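To see the "pieces of information" idea numerically, here is a minimal sketch in plain Python (the data values are invented for illustration):

```python
# Deviations from the sample mean always sum to zero, so the nth
# deviation is determined by the other n - 1 and is noninformative.
y = [4.0, 7.0, 10.0]                  # n = 3 hypothetical measurements
ybar = sum(y) / len(y)                # 7.0
dev = [yi - ybar for yi in y]         # [-3.0, 0.0, 3.0]

print(sum(dev))                       # 0.0, always
print(-sum(dev[:-1]) == dev[-1])      # True: the last deviation is forced
```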
Because of the symmetry of t, only upper-tail percentage points (probabilities or areas) of the distribution of t have been tabulated; these appear in Table 2
in the Appendix. The degrees of freedom (df) are listed along the left column of the page. An entry in the table specifies a value of t, say t_α, such that an area α lies to its right. See Figure 5.15. Various values of α appear across the top of Table 2 in the Appendix. Thus, for example, with df = 7, the value of t with an area .05 to its right is 1.895, found in the α = .05 column and df = 7 row. Since the t distribution approaches the z distribution as df approaches ∞, the values in the last row of Table 2 are the same as z_α. Thus, we can quickly determine z_α by using values in the last row of Table 2 in the Appendix.

We can use the t distribution to make inferences about a population mean μ. The one-sample test concerning μ is summarized next. The only difference between the z test discussed earlier in this chapter and the test given here is that s replaces σ. The t test rather than the z test should be used any time σ is unknown and the distribution of y-values is mound-shaped.
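In software, the Table 2 lookups are one-liners. A minimal sketch using scipy.stats (assuming SciPy is installed):

```python
from scipy.stats import norm, t

# t-value with area .05 to its right for df = 7 (Table 2 gives 1.895)
print(t.ppf(1 - 0.05, df=7))          # 1.8946...

# As df grows, the t percentiles approach the corresponding z_alpha
print(t.ppf(1 - 0.05, df=100000))     # ~1.645
print(norm.ppf(1 - 0.05))             # 1.6449, the z_alpha value
```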
FIGURE 5.15 Illustration of the area α to the right of t_α tabulated in Table 2 in the Appendix for the t distribution
Summary of a Statistical Test for μ with a Normal Population Distribution (σ Unknown)

Hypotheses:
Case 1. H₀: μ ≤ μ₀ vs. Hₐ: μ > μ₀ (right-tailed test)
Case 2. H₀: μ ≥ μ₀ vs. Hₐ: μ < μ₀ (left-tailed test)
Case 3. H₀: μ = μ₀ vs. Hₐ: μ ≠ μ₀ (two-tailed test)

T.S.: t = (ȳ − μ₀) / (s/√n)

R.R.: For a probability α of a Type I error and df = n − 1,
Case 1. Reject H₀ if t ≥ t_α.
Case 2. Reject H₀ if t ≤ −t_α.
Case 3. Reject H₀ if |t| ≥ t_{α/2}.

Level of significance (p-value):
Case 1. p-value = P(t ≥ computed t)
Case 2. p-value = P(t ≤ computed t)
Case 3. p-value = 2P(t ≥ |computed t|)
Recall that α denotes the area in the tail of the t distribution. For a one-tailed test with the probability of a Type I error equal to α, we locate the rejection region using the value from Table 2 in the Appendix, for the specified α and df = n − 1. However, for a two-tailed test we would use the t-value from Table 2 corresponding to α/2 and df = n − 1.

Thus, for a one-tailed test we reject the null hypothesis if the computed value of t is greater than the t-value from Table 2 in the Appendix, with the specified α and df = n − 1. Similarly, for a two-tailed test we reject the null hypothesis if |t| is greater than the t-value from Table 2 with α/2 and df = n − 1.
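The boxed test translates directly into code. The sketch below is our own illustration in Python with scipy.stats (the function name and argument names are not from the text):

```python
import math
from scipy.stats import t

def one_sample_t_test(data, mu0, alternative="greater"):
    """One-sample t test of H0 against the stated alternative."""
    n = len(data)
    ybar = sum(data) / n
    s = math.sqrt(sum((y - ybar) ** 2 for y in data) / (n - 1))
    tstat = (ybar - mu0) / (s / math.sqrt(n))
    df = n - 1
    if alternative == "greater":        # Case 1: Ha: mu > mu0
        p = 1 - t.cdf(tstat, df)
    elif alternative == "less":         # Case 2: Ha: mu < mu0
        p = t.cdf(tstat, df)
    else:                               # Case 3: Ha: mu != mu0
        p = 2 * (1 - t.cdf(abs(tstat), df))
    return tstat, df, p
```

For the data of Example 5.15 (next), this returns roughly t = 2.2 with a one-tailed p-value of about .029.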
EXAMPLE 5.15
A massive multistate outbreak of food-borne illness was attributed to Salmonella enteritidis. Epidemiologists determined that the source of the illness was ice cream. They sampled nine production runs from the company that had produced the ice cream to determine the level of Salmonella enteritidis in the ice cream. These levels (MPN/g) are as follows:

.593  .142  .329  .691  .231  .793  .519  .392  .418

Use these data to determine whether the average level of Salmonella enteritidis in the ice cream is greater than .3 MPN/g, a level that is considered to be very dangerous. Set α = .01.
Solution
The null and research hypotheses for this example are

H₀: μ ≤ .3 vs. Hₐ: μ > .3

Because the sample size is small, we need to examine whether the data appear to have been randomly sampled from a normal distribution. Figure 5.16 is a normal probability plot of the data values. All nine points fall nearly on the straight line. We conclude that the normality condition appears to be satisfied. Before setting up the rejection region and computing the value of the test statistic, we must first compute the sample mean and standard deviation. You can verify that

ȳ = .456 and s = .2128
FIGURE 5.16 Normal probability plot for Salmonella data
The rejection region with α = .01 is

R.R.: Reject H₀ if t > 2.896,

where, from Table 2 in the Appendix, the value of t_.01 with df = 9 − 1 = 8 is 2.896. The computed value of t is

t = (ȳ − μ₀) / (s/√n) = (.456 − .3) / (.2128/√9) = 2.20

The observed value of t is not greater than 2.896, so we have insufficient evidence to indicate that the average level of Salmonella enteritidis in the ice cream is greater than .3 MPN/g. The level of significance of the test is given by

p-value = P(t > computed t) = P(t > 2.20)

The t tables have only a few areas (α) for each value of df. The best we can do is bound the p-value. From Table 2 with df = 8, t_.05 = 1.860 and t_.025 = 2.306. Because the computed t = 2.20, .025 < p-value < .05. However, with α = .01 < .025 < p-value, we can still conclude that p-value > α, and hence fail to reject H₀. The output from Minitab given here shows that the p-value = .029.

T-Test of the Mean

Test of mu <= 0.3000 vs mu > 0.3000

Variable   N   Mean     StDev    SE Mean   T      P
Sal. Lev   9   0.4564   0.2128   0.0709    2.21   0.029

T Confidence Intervals

Variable   N   Mean     StDev    SE Mean   95.0 % CI
Sal. Lev   9   0.4564   0.2128   0.0709    (0.2928, 0.6201)
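As a check on the hand calculation and the Minitab output, SciPy's one-sample t routine gives the same results (a sketch; the alternative keyword assumes SciPy 1.6 or later):

```python
from scipy.stats import ttest_1samp

levels = [.593, .142, .329, .691, .231, .793, .519, .392, .418]
result = ttest_1samp(levels, popmean=0.3, alternative="greater")
print(result.statistic, result.pvalue)   # t ~ 2.21, p-value ~ .029
```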
As we commented previously, in order to state that the level of Salmonella enteritidis is less than or equal to .3, we need to calculate the probability of Type II error for some crucial values of μ in Hₐ. These calculations are somewhat more complex than the calculations for the z test. We will use a set of graphs to determine β(μₐ). The value of β(μₐ) depends on three quantities: df = n − 1, α, and the distance d from μₐ to μ₀ in σ units,

d = |μₐ − μ₀| / σ

Thus, to determine β(μₐ), we must specify α and μₐ and provide an estimate of σ. Then with the calculated d and df = n − 1, we locate β(μₐ) on the graph. Table 3 in the Appendix provides graphs of β(μₐ) for α = .01 and .05 for both one-sided and two-sided hypotheses for a variety of values for d and df.
EXAMPLE 5.16
Refer to Example 5.15. We have n = 9, α = .01, and a one-sided test. Thus, df = 8, and if we estimate σ ≈ .25, we can compute the values of d corresponding to selected values of μₐ. The values of β(μₐ) can then be determined using the graphs in Table 3 in the Appendix. Figure 5.17 is the necessary graph for this example. To illustrate the calculations, let μₐ = .45. Then

d = |μₐ − μ₀| / σ = |.45 − .3| / .25 = .6
FIGURE 5.17 Probability of Type II error curves, α = .01, one-sided (difference d on the horizontal axis, probability of Type II error on the vertical axis; curves labeled by df, with d marked for μₐ = .45 and μₐ = .55)
TABLE 5.5 Probability of Type II errors

μₐ       .35   .4    .45   .5    .55   .6    .65   .7    .75   .8
d        .2    .4    .6    .8    1.0   1.2   1.4   1.6   1.8   2.0
β(μₐ)    .97   .91   .79   .63   .43   .26   .13   .05   .02   .00
We draw a vertical line from d = .6 on the horizontal axis to the line labeled 8, our df. We then locate the value on the vertical axis at the height of the intersection, .79. Thus, β(.45) = .79. Similarly, to determine β(.55), first compute d = 1.0, draw a vertical line from d = 1.0 to the line labeled 8, and locate .43 on the vertical axis. Thus, β(.55) = .43. Table 5.5 contains values of β(μₐ) for several values of μₐ. Because the values of β(μₐ) are large for values of μₐ that are considerably larger than μ₀ = .3 (for example, β(.6) = .26), we will not state that μ is less than or equal to .3, but will only state that the data fail to support the contention that μ is larger than .3.
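For readers without access to the Table 3 charts, β(μₐ) for the one-tailed t test can be computed from the noncentral t distribution with noncentrality parameter d√n; this is the standard result such charts are based on, although the text reads the values graphically. A sketch with scipy.stats:

```python
import math
from scipy.stats import nct, t

def beta_one_tailed(d, n, alpha):
    """Type II error probability of the one-tailed t test at shift d."""
    df = n - 1
    crit = t.ppf(1 - alpha, df)                  # rejection cutoff t_alpha
    return nct.cdf(crit, df, d * math.sqrt(n))   # P(fail to reject | mu_a)

# Reproduce Table 5.5 for Example 5.16 (n = 9, alpha = .01, sigma ~ .25)
for mu_a in (.45, .55, .6):
    d = abs(mu_a - 0.3) / 0.25
    print(mu_a, round(beta_one_tailed(d, 9, 0.01), 2))  # ~.79, ~.43, ~.26
```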
In addition to being able to run a statistical test for μ when σ is unknown, we can construct a confidence interval using t. The confidence interval for μ with σ unknown is identical to the corresponding confidence interval for μ when σ is known, with z replaced by t and σ replaced by s.
100(1 − α)% Confidence Interval for μ, σ Unknown

ȳ ± t_{α/2} s/√n

Note: df = n − 1 and the confidence coefficient is (1 − α).
EXAMPLE 5.17
An airline wants to evaluate the depth perception of its pilots over the age of 50. A random sample of n = 14 airline pilots over the age of 50 are asked to judge the distance between two markers placed 20 feet apart at the opposite end of the laboratory. The sample data listed here are the pilots' errors (recorded in feet) in judging the distance.

2.7  2.4  1.9  2.6  2.4  1.9  2.3  2.2  2.5  2.3  1.8  2.5  2.0  2.2

Use the sample data to place a 95% confidence interval on μ, the average error in depth perception for the company's pilots over the age of 50.
Solution
Before setting up a 95% confidence interval on μ, we must first assess the normality assumption by plotting the data in a normal probability plot or a boxplot. Figure 5.18 is a boxplot of the 14 data values. The median line is near the center of the box, the right and left whiskers are approximately the same length, and there are no outliers. The data appear to be a sample from a normal distribution. Thus, it is appropriate to construct the confidence interval based on the t distribution. You can verify that

ȳ = 2.26 and s = .28
FIGURE 5.18 Boxplot of distance, with 95% t confidence interval for the mean
Referring to Table 2 in the Appendix, the t-value corresponding to α = .025 and df = 13 is 2.160. Hence, the 95% confidence interval for μ is

ȳ ± t_{α/2} s/√n, or 2.26 ± 2.160(.28/√14)

which is the interval 2.26 ± .16, or 2.10 to 2.42. Thus, we are 95% confident that the average error in the pilots' judgment of the distance is between 2.10 and 2.42 feet.
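The same interval can be produced in one call; a sketch with scipy.stats, using the interval method of the t distribution:

```python
import math
from scipy.stats import t

errors = [2.7, 2.4, 1.9, 2.6, 2.4, 1.9, 2.3,
          2.2, 2.5, 2.3, 1.8, 2.5, 2.0, 2.2]
n = len(errors)
ybar = sum(errors) / n
s = math.sqrt(sum((y - ybar) ** 2 for y in errors) / (n - 1))

low, high = t.interval(0.95, df=n - 1, loc=ybar, scale=s / math.sqrt(n))
print(round(low, 2), round(high, 2))   # approximately (2.10, 2.42)
```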
In this section, we have made the formal mathematical assumption that the population is normally distributed. In practice, no population has exactly a normal
distribution. How does nonnormality of the population distribution affect inferences based on the t distribution?

There are two issues to consider when populations are assumed to be nonnormal. First, what kind of nonnormality is assumed? Second, what possible effects
do these specific forms of nonnormality have on the t-distribution procedures? The most important deviations from normality are skewed distributions and
heavy-tailed distributions. Heavy-tailed distributions are roughly symmetric but have outliers relative to a normal distribution. Figure 5.19 displays four such distributions: Figure 5.19(a) is the standard normal distribution, Figure 5.19(b) is a heavy-tailed distribution (a t distribution with df = 3), Figure 5.19(c) is a distribution mildly skewed to the right, and Figure 5.19(d) is a distribution heavily skewed
to the right.
To evaluate the effect of nonnormality as exhibited by skewness or heavy tails, we will consider whether the t-distribution procedures are still approximately correct for these forms of nonnormality and whether there are other more efficient procedures. For example, even if a test procedure for μ based on the t distribution gave nearly correct results for, say, a heavy-tailed population distribution, it might be possible to obtain a test procedure with a more accurate probability of Type I error and greater power if we test hypotheses about the population median in place of the population μ. Also, in the case of heavy-tailed or highly skewed population distributions, the median rather than μ is a more appropriate representation of the population center.
The question of approximate correctness of t procedures has been studied extensively. In general, probabilities specified by the t procedures, particularly the
confidence level for confidence intervals and the Type I error for statistical tests, have been found to be fairly accurate, even when the population distribution is
heavy-tailed. However, when the population is very heavy-tailed, as is the case in Figure 5.19(b), the tests of hypotheses tend to have probability of Type I errors
smaller than the specified level, which leads to a test having much lower power and hence greater chances of committing Type II errors. Skewness, particularly
with small sample sizes, can have an even greater effect on the probability of both Type I and Type II errors. When we are sampling from a population distribution
that is normal, the sampling distribution of a t statistic is symmetric. However,
FIGURE 5.19 (a) Density of the standard normal distribution. (b) Density of a heavy-tailed distribution. (c) Density of a lightly skewed distribution. (d) Density of a highly skewed distribution.
when we are sampling from a population distribution that is highly skewed, the sampling distribution of a t statistic is skewed, not symmetric. Although the degree
of skewness decreases as the sample size increases, there is no procedure for determining the sample size at which the sampling distribution of the t statistic
becomes symmetric.
As a consequence, the level of a nominal α = .05 test may actually have a level of .01 or less when the sample size is less than 20 and the population distribution looks like that of Figure 5.19(b), (c), or (d). Furthermore, the power of the test will be considerably less than when the population distribution is a normal distribution, thus causing an increase in the probability of Type II errors. A simulation study of the effect of skewness and heavy-tailedness on the level and power of the t test yielded the results given in Table 5.6. The values in the table are the power values for a level α = .05 t test of H₀: μ ≤ μ₀ versus Hₐ: μ > μ₀. The power values are calculated for shifts of size d = |μₐ − μ₀|/σ for values of d = 0, .2, .6, .8. Three different sample sizes were used: n = 10, 15, and 20. When d = 0, the level of the
test is given for each type of population distribution. We want to compare these values to .05. The values when d > 0 are compared to the corresponding values when sampling from a normal population. We observe that when sampling from the lightly skewed distribution and the heavy-tailed distribution, the levels are somewhat less than .05, with values nearly equal to .05 when using n = 20. However, when sampling from a heavily skewed distribution, even with n = 20 the level is only .011. The power values for the heavy-tailed and heavily skewed populations are considerably less than the corresponding values when sampling from a normal distribution. Thus, the test is much less likely to correctly detect that the alternative hypothesis Hₐ is true. This reduced power is present even when n = 20. When sampling from a lightly skewed population distribution, the power values
are very nearly the same as the values for the normal distribution. Because the t procedures have reduced power when sampling from skewed
populations with small sample sizes, procedures have been developed that are not as affected by the skewness or extreme heavy-tailedness of the population distribution. These procedures are called robust methods of estimation and inference. Two robust procedures, the sign test and the Wilcoxon signed rank test, will be considered in Section 5.9 and Chapter 6, respectively. They are both more efficient than the t test when the population distribution is very nonnormal in shape. Also, they maintain the selected α level of the test, unlike the t test, which, when applied to very nonnormal data, has a true α value much different from the selected α value. The same comments can be made with respect to confidence intervals for the mean. When the population distribution is highly skewed, the coverage probability of a nominal 100(1 − α)% confidence interval is considerably less than 100(1 − α)%.
TABLE 5.6 Level and power values for the t test

                          n = 10                   n = 15                   n = 20
                          Shift d                  Shift d                  Shift d
Population Distribution   0     .2    .6    .8     0     .2    .6    .8     0     .2    .6    .8
Normal                    .05   .145  .543  .754   .05   .182  .714  .903   .05   .217  .827  .964
Heavy tailed              .035  .104  .371  .510   .049  .115  .456  .648   .045  .163  .554  .736
Light skewness            .025  .079  .437  .672   .037  .129  .614  .864   .041  .159  .762  .935
Heavy skewness            .007  .055  .277  .463   .006  .078  .515  .733   .011  .104  .658  .873
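Simulation results like those in Table 5.6 are easy to approximate. The sketch below estimates the actual level of a nominal α = .05 one-tailed t test under one right-skewed population; the choice of an exponential population is our own stand-in, since the exact distributions behind Table 5.6 are not specified here:

```python
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(12345)

def estimated_level(n, alpha=0.05, reps=50_000):
    """Monte Carlo level of the t test of H0: mu <= 1 for Exponential(1) data."""
    crit = t_dist.ppf(1 - alpha, n - 1)
    y = rng.exponential(scale=1.0, size=(reps, n))   # true mean is 1, so d = 0
    tstat = (y.mean(axis=1) - 1.0) / (y.std(axis=1, ddof=1) / np.sqrt(n))
    return (tstat >= crit).mean()

for n in (10, 15, 20):
    print(n, estimated_level(n))   # noticeably below the nominal .05
```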
So what is a nonexpert to do? First, examine the data through graphs. A boxplot or normal probability plot will reveal any gross skewness or extreme
outliers. If the plots do not reveal extreme skewness or many outliers, the nominal t-distribution probabilities should be reasonably correct. Thus, the level and power
calculations for tests of hypotheses and the coverage probability of confidence intervals should be reasonably accurate. If the plots reveal severe skewness
or heavy-tailedness, the test procedures and confidence intervals based on the t-distribution will be highly suspect. In these situations, we have two alternatives.
First, it may be more appropriate to consider inferences about the population median rather than the population mean. When the data are highly skewed or very heavy-tailed, the median is a more appropriate measure of the center of the population than is the mean. In Section 5.9, we will develop tests of hypotheses and confidence intervals for the population median. These procedures will avoid the problems encountered by the t-based procedures discussed in this section when the population distribution is highly skewed or heavy-tailed. However, in some situations, the researcher may be required to provide inferences about the mean, or the median may not be an appropriate alternative to the mean as a summary of the population. In Section 5.8, we will discuss a technique based on bootstrap methods for obtaining an approximate confidence interval for the population mean.
5.8 Inferences about μ When Population Is Nonnormal and n Is Small: Bootstrap Methods
The statistical techniques in the previous sections for constructing a confidence interval or a test of hypotheses for μ required that the population have a normal distribution or that the sample size be reasonably large. In those situations where neither of these requirements can be met, an alternative approach using bootstrap
methods can be employed. This technique was introduced by Efron in the article, “Bootstrap Methods: Another Look at the Jackknife,” Annals of Statistics, 7,
pp. 1–26. The bootstrap is a technique by which an approximation to the sampling distribution of a statistic can be obtained when the population distribution is
unknown. In Section 5.7, inferences about μ were based on the fact that the statistic

t = (ȳ − μ) / (s/√n)

had a t distribution. We used the t-tables (Table 2 in the Appendix) to obtain appropriate percentiles and p-values for confidence intervals and tests of hypotheses. However, it was required that the population from which the sample was randomly selected have a normal distribution or that the sample size n be reasonably large. The bootstrap will provide a means for obtaining percentiles of (ȳ − μ)/(s/√n) when the population distribution is nonnormal and/or the sample size is relatively small.
The bootstrap technique utilizes data-based simulations for statistical inference. The central idea of the bootstrap is to resample from the original data set, thus producing a large number of replicate data sets from which the sampling distribution of a statistic can be approximated. Suppose we have a sample y₁, y₂, . . . , yₙ from a population and we want to construct a confidence interval or test a set of hypotheses about the population mean μ. We realize, either from prior experience with this population or by examining a normal quantile plot, that the population has a nonnormal distribution. Thus, we are fairly certain that the sampling distribution of

t = (ȳ − μ) / (s/√n)

is not the t distribution, so it would not be appropriate to use the t-tables to obtain percentiles. Also, the sample size n is relatively small, so we are not too sure about applying the Central Limit Theorem and using the z-tables to obtain percentiles to construct confidence intervals or to test hypotheses. The bootstrap technique consists of the following steps:
1. Select a random sample y₁, y₂, . . . , yₙ of size n from the population and compute the sample mean ȳ and sample standard deviation s.
2. Select a random sample of size n, with replacement, from y₁, y₂, . . . , yₙ, yielding y₁*, y₂*, . . . , yₙ*.
3. Compute the mean ȳ* and standard deviation s* of y₁*, y₂*, . . . , yₙ*.
4. Compute the value of the statistic

   t̂ = (ȳ* − ȳ) / (s*/√n)

5. Repeat Steps 2–4 a large number of times B to obtain t̂₁, t̂₂, . . . , t̂_B. Use these values to obtain an approximation to the sampling distribution of (ȳ − μ)/(s/√n).

Suppose we have n = 20 and we select B = 1,000 bootstrap samples. The steps in obtaining the bootstrap approximation to the sampling distribution of (ȳ − μ)/(s/√n) are depicted here.

Obtain random sample y₁, y₂, . . . , y₂₀ from the population, and compute ȳ and s.

First bootstrap sample: y₁*, y₂*, . . . , y₂₀* yields ȳ₁*, s₁*, and t̂₁ = (ȳ₁* − ȳ)/(s₁*/√20)
Second bootstrap sample: y₁*, y₂*, . . . , y₂₀* yields ȳ₂*, s₂*, and t̂₂ = (ȳ₂* − ȳ)/(s₂*/√20)
. . .
Bth bootstrap sample: y₁*, y₂*, . . . , y₂₀* yields ȳ_B*, s_B*, and t̂_B = (ȳ_B* − ȳ)/(s_B*/√20)

We then use the B values of t̂, namely t̂₁, t̂₂, . . . , t̂_B, to obtain the approximate percentiles. For example, suppose we want to construct a 95% confidence interval for μ and B = 1,000. We need the lower and upper .025 percentiles, t̂_.025 and t̂_.975. Thus, we would take the (1,000)(.025) = 25th smallest value of t̂ as t̂_.025 and the (1,000)(1 − .025) = 975th smallest value of t̂ as t̂_.975. The approximate 95% confidence interval for μ would then be

(ȳ + t̂_.025 · s/√n, ȳ + t̂_.975 · s/√n)
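A minimal sketch of these five steps in Python with numpy (the function name, B, and the seed are our own; the endpoints follow the interval form displayed above, and note that the reflected form ȳ − t̂_.975·s/√n, ȳ − t̂_.025·s/√n is also common in the bootstrap literature):

```python
import numpy as np

def bootstrap_t_ci(y, conf=0.95, B=1000, seed=42):
    """Approximate bootstrap confidence interval for the population mean."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar, s = y.mean(), y.std(ddof=1)                 # Step 1

    t_hat = np.empty(B)
    for b in range(B):                                # Steps 2-4, repeated B times
        ystar = rng.choice(y, size=n, replace=True)   # resample with replacement
        t_hat[b] = (ystar.mean() - ybar) / (ystar.std(ddof=1) / np.sqrt(n))

    a = (1 - conf) / 2                                # Step 5: percentiles of t-hat
    t_lo, t_hi = np.quantile(t_hat, [a, 1 - a])
    se = s / np.sqrt(n)
    return ybar + t_lo * se, ybar + t_hi * se

# Example usage, e.g., with the cotinine data of Example 5.18 below:
# print(bootstrap_t_ci([29, 30, 53, 75, 89, 34, 21, 12, 58, 84,
#                       92, 117, 115, 119, 109, 115, 134, 253, 289, 287]))
```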
EXAMPLE 5.18
Secondhand smoke is of great concern, especially when it involves young children. Breathing secondhand smoke can be harmful to children's health, contributing to health problems such as asthma, Sudden Infant Death Syndrome (SIDS), bronchitis and pneumonia, and ear infections. The developing lungs of young children are severely affected by exposure to secondhand smoke. The Child Protective Services (CPS) in a city is concerned about the level of exposure to secondhand smoke for children placed by their agency in foster parents' care. A method of determining level of exposure is to determine the urinary concentration of cotinine, a metabolite of nicotine. Unexposed children will typically have mean cotinine levels of 75 or less. A random sample of 20 children suspected of being exposed to secondhand smoke yielded the following urinary concentrations of cotinine:

29, 30, 53, 75, 89, 34, 21, 12, 58, 84, 92, 117, 115, 119, 109, 115, 134, 253, 289, 287