Example 5.19 Suppose that material strength for a randomly selected specimen of a particular type has a Weibull distribution with parameter values $\alpha = 2$ (shape) and $\beta = 5$ (scale). The corresponding density curve is shown in Figure 5.6. Formulas from Section 4.5 give
$$\mu = E(X) = 4.4311 \qquad \tilde{\mu} = 4.1628 \qquad \sigma^2 = V(X) = 5.365 \qquad \sigma = 2.316$$
The mean exceeds the median because of the distribution’s positive skew.
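These four values can be checked numerically. The sketch below assumes the standard Weibull parameterization, in which the mean is $\beta\,\Gamma(1 + 1/\alpha)$, the median is $\beta(\ln 2)^{1/\alpha}$, and the variance is $\beta^2[\Gamma(1 + 2/\alpha) - \Gamma(1 + 1/\alpha)^2]$. The function name `weibull_summary` is illustrative only and does not come from the text.

```python
import math

def weibull_summary(shape, scale):
    """Return (mean, median, variance, sd) of a Weibull(shape, scale) distribution."""
    g1 = math.gamma(1 + 1 / shape)   # Gamma(1 + 1/alpha)
    g2 = math.gamma(1 + 2 / shape)   # Gamma(1 + 2/alpha)
    mean = scale * g1
    median = scale * math.log(2) ** (1 / shape)
    variance = scale ** 2 * (g2 - g1 ** 2)
    return mean, median, variance, math.sqrt(variance)

# Parameter values from Example 5.19: alpha = 2 (shape), beta = 5 (scale)
mean, median, variance, sd = weibull_summary(shape=2, scale=5)
print(f"mean = {mean:.4f}, median = {median:.4f}, "
      f"variance = {variance:.3f}, sd = {sd:.3f}")
```

Rounded as in the print statement, the results agree with the values quoted above.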
5.3 Statistics and Their Distributions
Figure 5.6 The Weibull density curve for Example 5.19 (vertical axis: f(x))
We used statistical software to generate six different samples, each with $n = 10$, from this distribution (material strengths for six different groups of ten specimens each). The results appear in Table 5.1, followed by the values of the sample mean, sample median, and sample standard deviation for each sample. Notice first that the ten observations in any particular sample are all different from those in any other sample. Second, the six values of the sample mean are all different from one another, as are the six values of the sample median and the six values of the sample standard deviation. The same is true of the sample 10% trimmed means, sample fourth spreads, and so on.
Table 5.1 Samples from the Weibull Distribution of Example 5.19
Furthermore, the value of the sample mean from any particular sample can be regarded as a point estimate (“point” because it is a single number, corresponding to a single point on the number line) of the population mean $\mu$, whose value is known to be 4.4311. None of the estimates from these six samples is identical to what is being estimated. The estimates from the second and sixth samples are much too large, whereas the fifth sample gives a substantial underestimate. Similarly, the sample standard deviation gives a point estimate of the population standard deviation. All six of the resulting estimates are in error by at least a small amount.
In summary, the values of the individual sample observations vary from sample to sample, and so, in general, does the value of any quantity computed from sample data. As a result, the value of a sample characteristic used as an estimate of the corresponding population characteristic will virtually never coincide with what is being estimated.
■
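The sample-to-sample variation described in Example 5.19 can be reproduced with a short simulation. The sketch below is not the software used for Table 5.1 and will not reproduce its numbers. It assumes inverse-transform sampling (if $U$ is uniform on $[0, 1)$, then $\beta(-\ln(1 - U))^{1/\alpha}$ has a Weibull distribution) and an arbitrary seed. The names `weibull_sample` and `summaries` are illustrative.

```python
import math
import random
import statistics

def weibull_sample(n, shape, scale, rng):
    """Draw n Weibull(shape, scale) observations by inverse-transform sampling."""
    # 1 - U lies in (0, 1], so the logarithm is always defined.
    return [scale * (-math.log(1.0 - rng.random())) ** (1.0 / shape)
            for _ in range(n)]

rng = random.Random(2024)  # arbitrary seed (an assumption), for reproducibility only

summaries = []
for i in range(1, 7):  # six samples, each with n = 10
    sample = weibull_sample(10, shape=2, scale=5, rng=rng)
    row = (statistics.mean(sample),
           statistics.median(sample),
           statistics.stdev(sample))  # stdev uses the n - 1 divisor
    summaries.append(row)
    print(f"sample {i}: mean = {row[0]:.3f}, "
          f"median = {row[1]:.3f}, sd = {row[2]:.3f}")
```

Each run with a different seed gives six different sets of summary values, none of which can be expected to equal the population values 4.4311, 4.1628, and 2.316 exactly.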
CHAPTER 5 Joint Probability Distributions and Random Samples
A statistic is any quantity whose value can be calculated from sample data. Prior to obtaining data, there is uncertainty as to what value of any particular statistic will result. Therefore, a statistic is a random variable and will be denoted by an uppercase letter; a lowercase letter is used to represent the calculated or observed value of the statistic.
Thus the sample mean, regarded as a statistic (before a sample has been selected or an experiment carried out), is denoted by $\bar{X}$; the calculated value of this statistic is $\bar{x}$. Similarly, $S$ represents the sample standard deviation thought of as a statistic, and its computed value is $s$. If samples of two different types of bricks are selected and the individual compressive strengths are denoted by $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$, respectively, then the statistic $\bar{X} - \bar{Y}$, the difference between the two sample mean compressive strengths, is often of great interest.
Any statistic, being a random variable, has a probability distribution. In particular, the sample mean $\bar{X}$ has a probability distribution. Suppose, for example, that $n = 2$ components are randomly selected and the number of breakdowns while under warranty is determined for each one. Possible values for the sample mean number of breakdowns $\bar{X}$ are 0 (if $X_1 = X_2 = 0$), .5 (if either $X_1 = 0$ and $X_2 = 1$ or $X_1 = 1$ and $X_2 = 0$), 1, 1.5, . . . . The probability distribution of $\bar{X}$ specifies $P(\bar{X} = 0)$, $P(\bar{X} = .5)$, and so on, from which other probabilities such as $P(1 \le \bar{X} \le 3)$ and $P(\bar{X} \ge 2.5)$ can be calculated. Similarly, if for a sample of size $n = 2$, the only possible values of the sample variance are 0, 12.5, and 50 (which is the case if $X_1$ and $X_2$ can each take on only the values 40, 45, or 50), then the probability distribution of $S^2$ gives $P(S^2 = 0)$, $P(S^2 = 12.5)$, and $P(S^2 = 50)$. The probability distribution of a statistic is sometimes referred to as its sampling distribution to emphasize that it describes how the statistic varies in value across all samples that might be selected.
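The sampling distribution of $S^2$ in the last illustration can be obtained by enumerating every possible sample $(x_1, x_2)$. The text does not give the probabilities of the values 40, 45, and 50, so the sketch below assumes, purely for illustration, that each has probability 1/3 and that $X_1$ and $X_2$ are independent. The names `pmf`, `sample_variance`, and `sampling_dist` are likewise illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

values = [40, 45, 50]
# Hypothetical pmf: equal probabilities are an assumption, not taken from the text.
pmf = {v: Fraction(1, 3) for v in values}

def sample_variance(xs):
    """Sample variance with the n - 1 divisor, computed exactly."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

# Accumulate P(S^2 = s2) over all ordered pairs (x1, x2), assuming independence.
sampling_dist = defaultdict(Fraction)
for x1, x2 in product(values, repeat=2):
    s2 = sample_variance([x1, x2])
    sampling_dist[s2] += pmf[x1] * pmf[x2]

for s2 in sorted(sampling_dist):
    print(f"P(S^2 = {float(s2)}) = {sampling_dist[s2]}")
```

The enumeration confirms that 0, 12.5, and 50 are the only possible values of $S^2$. Under the assumed equal probabilities, they occur with probabilities 1/3, 4/9, and 2/9, respectively; a different pmf changes these probabilities but not the set of possible values.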