Review of Statistics and Probability Concepts

3.4 Review of Statistics and Probability Concepts

3.4.1 Statistical Descriptors

Statistical uncertainties can be described by the mean, x̄, the standard deviation, σ, and the coefficient of variation, COV. These terms are defined in the following paragraphs.

The mean value, x̄, of a given set of data, x = (x_1, x_2, ..., x_N), is calculated as:

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$ (Eq. 3-7)

where

N = Number of data values (dim).

The mean value is also called the expected value or the average of the data set. The standard deviation, σ, is a measure of the dispersion of the data, expressed in the same units as the data, x_i, and is defined as:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$ (Eq. 3-8)

The coefficient of variation, COV, is a dimensionless measure of the variability of the data. The COV is defined as the standard deviation (σ) divided by the mean value (x̄):

$$\mathrm{COV} = \frac{\sigma}{\bar{x}}$$ (Eq. 3-9)

The COV expresses the magnitude of the variability as a percentage or fraction of the mean value.
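The three descriptors can be computed directly from their definitions. The following is a minimal sketch, assuming the sample form of the standard deviation (division by N − 1) given in Eq. 3-8; the function name `describe` and the data values are hypothetical and serve only as an illustration.

```python
import math

def describe(data):
    """Return (mean, sample standard deviation, COV) per Eqs. 3-7 to 3-9."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    mean = sum(data) / n                               # Eq. 3-7
    sum_sq = sum((x - mean) ** 2 for x in data)
    std = math.sqrt(sum_sq / (n - 1))                  # Eq. 3-8
    cov = std / mean                                   # Eq. 3-9
    return mean, std, cov

# Hypothetical values, for illustration only (not taken from any table here).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std, cov = describe(sample)
print(f"mean = {mean:.3f}, std = {std:.3f}, COV = {cov:.3f}")
```

Because the COV divides by the mean, it is only meaningful when the mean is not close to zero.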

Knowledge of these statistical descriptions is needed when compiling the load and resistance statistics required for calibration by reliability analyses. An example is provided below.

Example Problem 3-2:

This example illustrates the calculation of the mean, x̄, standard deviation, σ, and coefficient of variation, COV, using Eqs. 3-7, 3-8 and 3-9, respectively. The bias factor is also defined and calculated. Analysis of the data shows that a larger sample size is needed so that a few outlying points do not distort the distribution of the data.

Several case histories of good quality static pile load tests in sands were compiled, together with the boring information available from these sites. Using the SPT blow counts from the borings, the driven pile axial soil resistance for each case was predicted using Meyerhof's (1976) procedure.


Table 3-2 summarizes the results of the predicted resistances, the measured resistances and the ratio of the measured to the predicted resistances.

Table 3-2
Summary of Measured and Predicted Driven Pile Axial Soil Resistance Using Meyerhof Procedure

Measured Resistance (tons) | Predicted Resistance (tons) | Measured Resistance / Predicted Resistance

The mean, x̄, standard deviation, σ, and coefficient of variation, COV, of the ratio of measured to predicted resistances are 1.22, 0.66 and 0.54, respectively. Because the mean ratio of measured to predicted resistance is 1.22, Meyerhof's method tends, on average, to underpredict the pile resistance in this example. This difference between what is predicted and what is measured is referred to as the "bias." The bias factor, λ, of Meyerhof's SPT method is defined as the ratio of the measured resistance to the predicted resistance, as shown in Eq. 3-10:

$$\lambda = \frac{R_m}{R_n}$$ (Eq. 3-10)

where

R_m = Measured nominal resistance
R_n = Predicted nominal resistance

For the example given in Table 3-2, the average bias factor, λ, is 1.22 with a COV of 0.54. The COV is relatively high: one standard deviation represents a variation of 54 percent from the mean, compared to typical values of less than 20 percent. The high variation is caused by the large bias in the last two data entries in Table 3-2. If these two data points are dropped, the values of x̄, σ, and COV change to 1.06, 0.23 and 0.22, respectively. These values are more reasonable. Because the data set is relatively small, reducing the number of data points from 24 to 22 by removing the two outlying points changes the statistics dramatically. The database of load tests should be large enough, and should contain data of high enough quality, that the statistics derived from it are representative of the loads and prediction practice.
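The sensitivity of a small data set to outlying points can be sketched as follows. This is an illustration only: the resistances below are hypothetical and are not the Table 3-2 entries, and the helper name `bias_statistics` is an assumption. The bias factor of each case follows Eq. 3-10, and the standard deviation is the sample form (N − 1).

```python
import statistics

def bias_statistics(measured, predicted):
    """Return (mean bias, sample std, COV) of lambda = R_m / R_n."""
    ratios = [rm / rn for rm, rn in zip(measured, predicted)]
    mean = statistics.mean(ratios)
    std = statistics.stdev(ratios)      # sample standard deviation (N - 1)
    return mean, std, std / mean

# Hypothetical resistances in tons, for illustration only (not Table 3-2).
measured = [100.0, 90.0, 120.0, 110.0, 95.0, 300.0]
predicted = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]

all_cases = bias_statistics(measured, predicted)
trimmed = bias_statistics(measured[:-1], predicted[:-1])  # drop the last case

for label, (mean, std, cov) in (("all cases", all_cases), ("trimmed", trimmed)):
    print(f"{label}: mean bias = {mean:.2f}, std = {std:.2f}, COV = {cov:.2f}")
```

In this made-up set, a single case with a ratio of 3.0 raises both the mean bias and the COV noticeably, which mirrors the effect described above for the two outlying entries.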

3.4.2 Probability Density Functions

A histogram showing the frequency of occurrence of the ratios of measured to predicted driven pile axial resistance from Table 3-2 is presented in Figure 3-1. The histogram was constructed by counting the number of ratios in each of the equal intervals of 0.25 (e.g., in the interval from 0.51 to 0.75 there are 2 values and in the interval from 0.76 to 1.00 there are 10 values). Figure 3-1 shows that the distribution of the ratio of measured to predicted resistance is not symmetrical about the mean value of 1.22. Further, the histogram shows how extreme the value near 4.0 is, which brings its validity into question.
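The counting procedure behind such a histogram can be sketched as below. The sketch assumes ratios reported to two decimals and intervals that include their upper bound (0.51 to 0.75, 0.76 to 1.00, and so on); the function name `histogram` and the ratios are hypothetical and are not the Table 3-2 values.

```python
from collections import Counter

def histogram(ratios, width=25):
    """Count ratios in intervals of `width` hundredths, e.g. 0.51-0.75.

    Ratios are converted to integer hundredths so that values on an
    interval boundary (such as 0.75) are not misplaced by floating-point
    error.
    """
    counts = Counter()
    for r in ratios:
        h = round(r * 100)                      # ratio in hundredths
        k = max(1, (h + width - 1) // width)    # 1-based interval index
        counts[k] += 1
    table = {}
    for k in sorted(counts):
        upper = k * width
        lower = upper - width + 1
        table[f"{lower / 100:.2f}-{upper / 100:.2f}"] = counts[k]
    return table

# Hypothetical ratios, for illustration only (not the Table 3-2 values).
ratios = [0.62, 0.75, 0.80, 0.95, 1.00, 1.10, 1.30, 3.90]
table = histogram(ratios)
for interval, count in table.items():
    print(f"{interval}: {'#' * count} ({count})")
```

Intervals with no values are simply omitted from the result, so an isolated bar far from the others, like the 3.76 to 4.00 interval here, stands out in the printed output.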

Figure 3-1